Business · 6 tasks · 93 models
Best AI models for Frontend & Landing Pages
Which models build landing pages that actually look designed, convert, and ship — not just valid HTML?
gemini-3-flash-preview-max leads Frontend & Landing Pages (usable). For tighter budgets, gemini-3.1-flash-lite-medium is competitive at about 29% of the cost.
Top score — usable
Clears the quality bar at $3.19 per 1,000 runs
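The cost comparison in the summary can be reproduced from the Cost/run column of the ranking table. The following is a minimal sketch, assuming the table's rounded per-run costs; the function and variable names are illustrative and not part of the benchmark. With rounded inputs the per-1,000-run figure comes out at $3.20, so the page's $3.19 presumably reflects unrounded costs.

```python
# Per-run costs as listed in the ranking table (rounded to 4 decimals).
COST_PER_RUN = {
    "gemini-3-flash-preview-max": 0.0109,
    "gemini-3.1-flash-lite-medium": 0.0032,
}


def cost_ratio_percent(cheaper: str, leader: str) -> float:
    """Cost of `cheaper` expressed as a percentage of the cost of `leader`."""
    return 100 * COST_PER_RUN[cheaper] / COST_PER_RUN[leader]


def cost_per_thousand_runs(model: str) -> float:
    """Scale a per-run cost up to 1,000 runs, rounded to cents."""
    return round(COST_PER_RUN[model] * 1000, 2)


ratio = cost_ratio_percent("gemini-3.1-flash-lite-medium", "gemini-3-flash-preview-max")
per_1k = cost_per_thousand_runs("gemini-3.1-flash-lite-medium")
print(f"{ratio:.0f}% of the leader's cost, ${per_1k:.2f} per 1,000 runs")
```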
Quality vs. cost
Every model placed by what it delivers and what it costs. The best value sits high and to the left.
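One way to read "high and to the left" is as a Pareto frontier: a model is good value if no other model is both at least as good and at least as cheap. The sketch below applies that idea to a handful of rows copied from the ranking table. It is an illustration under that assumption, not the method used to draw the chart, and the `Model` type and `pareto_frontier` function are hypothetical names.

```python
from typing import NamedTuple


class Model(NamedTuple):
    name: str
    score: float  # 0-100 benchmark score
    cost: float   # USD per run


# A subset of rows from the ranking table, for illustration only.
MODELS = [
    Model("gemini-3-flash-preview-max", 73.0, 0.0109),
    Model("claude-opus-4.5-medium", 72.3, 0.2514),
    Model("qwen3.5-plus-02-15", 71.5, 0.0083),
    Model("gemini-3.1-flash-lite-medium", 70.0, 0.0032),
    Model("gpt-5.5-medium", 67.7, 0.2021),
    Model("gemini-3.1-flash-lite", 67.2, 0.0020),
    Model("deepseek-v3.2-medium", 65.8, 0.0012),
]


def pareto_frontier(models: list[Model]) -> list[Model]:
    """Return the models that no other model beats on both score and cost."""
    frontier = []
    for m in models:
        dominated = any(
            o.score >= m.score
            and o.cost <= m.cost
            and (o.score > m.score or o.cost < m.cost)
            for o in models
        )
        if not dominated:
            frontier.append(m)
    # Cheapest first, i.e. left to right on a cost axis.
    return sorted(frontier, key=lambda m: m.cost)


frontier_names = [m.name for m in pareto_frontier(MODELS)]
for name in frontier_names:
    print(name)
```

In this subset the two expensive models drop out because a cheaper model scores at least as well; the remaining five trade score against cost.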
Full ranking
| # | Model | Score | Cost/run | Speed | Best for | Arena · UI |
|---|---|---|---|---|---|---|
| 1 | gemini-3-flash-preview-max | 73.0 Usable | $0.0109 | 25.2s | Needs review | — |
| 2 | claude-opus-4.5-medium | 72.3 Usable | $0.2514 | 103.7s | Needs review | 1296 |
| 3 | qwen3.5-plus-02-15 | 71.5 Usable | $0.0083 | 97.6s | Needs review | — |
| 4 | gemini-3.5-flash-max | 71.3 Usable | $0.0815 | 47.9s | Needs review | 1315 |
| 5 | claude-haiku-4.5-max | 70.8 Usable | $0.0682 | 78.0s | Needs review | 1155 |
| 6 | gemini-3.1-flash-lite-medium | 70.0 Usable | $0.0032 | 12.2s | Needs review | — |
| 7 | gemini-3-flash-preview | 69.8 Needs editing | $0.0103 | 19.3s | Needs review | — |
| 8 | gpt-5-mini-medium | 68.2 Needs editing | $0.0128 | 78.2s | Needs review | — |
| 9 | gpt-5.5-medium | 67.7 Needs editing | $0.2021 | 68.4s | Needs review | 1300 |
| 10 | qwen3.7-max-medium | 67.5 Needs editing | $0.0307 | 150.0s | Needs review | 1327 |
| 11 | grok-4.20 | 67.3 Needs editing | $0.0119 | 20.4s | Needs review | — |
| 12 | gemini-3.1-flash-lite | 67.2 Needs editing | $0.0020 | 5.0s | Needs review | — |
| 13 | claude-opus-4.5 | 67.2 Needs editing | $0.1651 | 60.3s | Needs review | 1296 |
| 14 | claude-opus-4.8-medium | 66.5 Needs editing | $0.1504 | 61.6s | Needs review | 1285 |
| 15 | gpt-5.4 | 66.3 Needs editing | $0.0749 | 31.5s | Needs review | 1258 |
| 16 | deepseek-v3.2-medium | 65.8 Needs editing | $0.0012 | 94.5s | Needs review | 1203 |
| 17 | kimi-k2.5-max | 65.8 Needs editing | $0.0134 | 135.3s | Needs review | — |
| 18 | gemini-3-flash-preview-medium | 65.5 Needs editing | $0.0101 | 20.8s | Needs review | — |
| 19 | claude-sonnet-4.5 | 65.5 Needs editing | $0.0503 | 39.3s | Needs review | 1215 |
| 20 | gpt-5-mini-max | 65.3 Needs editing | $0.0225 | 130.8s | Needs review | — |
| 21 | grok-4.20-beta | 65.2 Needs editing | $0.0266 | 27.6s | Needs review | 1245 |
| 22 | deepseek-v3.1-terminus | 65.0 Needs editing | $0.0032 | 122.6s | Needs review | 1237 |
| 23 | gemini-3.1-pro-preview-medium | 64.8 Needs editing | $0.0648 | 48.9s | Needs review | 1284 |
| 24 | mistral-medium-3.1 | 64.7 Needs editing | $0.0064 | 24.2s | Needs review | 1154 |
| 25 | mistral-medium-3.1-max | 64.7 Needs editing | $0.0074 | 31.7s | Needs review | 1154 |
| 26 | gpt-5.4-mini | 64.7 Needs editing | $0.0183 | 23.4s | Needs review | — |
| 27 | claude-sonnet-4.5-medium | 64.7 Needs editing | $0.0558 | 51.1s | Needs review | 1215 |
| 28 | mistral-medium-3.1-medium | 64.3 Needs editing | $0.0055 | 28.4s | Needs review | 1154 |
| 29 | deepseek-v3.2-max | 64.2 Needs editing | $0.0012 | 92.7s | Needs review | 1203 |
| 30 | claude-haiku-4.5-medium | 64.0 Needs editing | $0.0512 | 57.0s | Needs review | 1155 |
| 31 | claude-sonnet-4.5-max | 64.0 Needs editing | $0.0628 | 53.1s | Needs review | 1215 |
| 32 | deepseek-v3.2 | 63.5 Needs editing | $0.0013 | 67.7s | Needs review | 1203 |
| 33 | claude-opus-4.8-high | 62.9 Needs editing | $0.1677 | 64.8s | Needs review | 1285 |
| 34 | gemini-3.1-pro-preview-low | 62.5 Needs editing | $0.0687 | 51.6s | Needs review | 1284 |
| 35 | deepseek-v3.1-terminus-medium | 62.3 Needs editing | $0.0036 | 126.1s | Needs review | 1237 |
| 36 | gpt-5.5-high | 61.2 Needs editing | $0.2108 | 82.0s | Needs review | 1300 |
| 37 | gpt-5.4-medium | 61.0 Needs editing | $0.1267 | 67.4s | Needs review | 1258 |
| 38 | gpt-5.5-max | 61.0 Needs editing | $0.2000 | 74.2s | Needs review | 1300 |
| 39 | claude-opus-4.6 | 60.7 Needs editing | $0.1570 | 73.6s | Needs review | 1332 |
| 40 | gpt-5.5 | 60.3 Needs editing | $0.1834 | 75.8s | Needs review | 1300 |
| 41 | kimi-k2.7-code-max | 60.2 Needs editing | $0.0317 | 127.6s | Needs review | 1300 |
| 42 | claude-opus-4.8-low | 60.2 Needs editing | $0.1244 | 49.0s | Needs review | 1285 |
| 43 | gpt-5.4-mini-medium | 60.0 Needs editing | $0.0377 | 56.2s | Needs review | — |
| 44 | qwen3.7-max | 59.8 Weak | $0.0270 | 128.6s | Needs review | 1327 |
| 45 | gemini-3.1-pro-preview-max | 59.7 Weak | $0.0659 | 52.1s | Needs review | 1284 |
| 46 | claude-opus-4.5-max | 59.7 Weak | $0.2252 | 90.0s | Needs review | 1296 |
| 47 | deepseek-v3.1-terminus-max | 59.5 Weak | $0.0033 | 111.7s | Needs review | 1237 |
| 48 | gemini-3.1-pro-preview | 59.5 Weak | $0.0878 | 55.4s | Needs review | 1284 |
| 49 | claude-opus-4.6-medium | 59.3 Weak | $0.1994 | 103.1s | Needs review | 1332 |
| 50 | claude-opus-4.6-high | 59.3 Weak | $0.2269 | 112.3s | Needs review | 1332 |
| 51 | kimi-k2.5-medium | 59.2 Weak | $0.0175 | 174.7s | Needs review | — |
| 52 | gemini-3.1-pro-preview-high | 59.2 Weak | $0.0678 | 46.1s | Needs review | 1284 |
| 53 | qwen3.5-plus-02-15-max | 59.0 Weak | $0.0089 | 114.9s | Needs review | — |
| 54 | claude-opus-4.6-max | 59.0 Weak | $0.2038 | 99.2s | Needs review | 1332 |
| 55 | gemini-3.5-flash-low | 58.6 Weak | $0.0563 | 31.3s | Needs review | 1315 |
| 56 | grok-4.20-medium | 58.2 Weak | $0.0120 | 25.9s | Needs review | — |
| 57 | deepseek-v3.2-high | 57.8 Weak | $0.0012 | 67.0s | Needs review | 1203 |
| 58 | kimi-k2.5 | 57.8 Weak | $0.0148 | 181.0s | Needs review | — |
| 59 | grok-4.20-beta-max | 56.8 Weak | $0.0321 | 35.1s | Needs review | 1245 |
| 60 | glm-5 | 54.2 Weak | $0.0119 | 93.0s | Needs review | 1287 |
| 61 | qwen3.7-max-max | 53.8 Weak | $0.0244 | 122.8s | Needs review | 1327 |
| 62 | qwen3.5-plus-02-15-medium | 53.7 Weak | $0.0079 | 93.1s | Needs review | — |
| 63 | grok-4.20-max | 53.3 Weak | $0.0109 | 20.5s | Needs review | — |
| 64 | gemini-3.5-flash-medium | 53.2 Weak | $0.0570 | 34.4s | Needs review | 1315 |
| 65 | claude-opus-4.5-high | 52.7 Weak | $0.2281 | 87.1s | Needs review | 1296 |
| 66 | claude-sonnet-4.6-medium | 52.7 Weak | $0.2403 | 208.6s | Needs review | 1324 |
| 67 | grok-4.20-beta-medium | 51.7 Weak | $0.0263 | 26.2s | Needs review | 1245 |
| 68 | gemini-3.5-flash-high | 51.3 Weak | $0.0792 | 43.8s | Needs review | 1315 |
| 69 | kimi-k2.7-code | 50.0 Weak | $0.0281 | 140.6s | Needs review | 1300 |
| 70 | minimax-m2.7-max | 49.7 Weak | $0.0083 | 107.4s | Needs review | 1261 |
| 71 | glm-5-medium | 47.5 Weak | $0.0178 | 223.4s | Needs review | 1287 |
| 72 | kimi-k2.7-code-medium | 46.8 Weak | $0.0246 | 105.1s | Needs review | 1300 |
| 73 | minimax-m2.7 | 46.7 Weak | $0.0074 | 54.5s | Needs review | 1261 |
| 74 | gemini-3.1-flash-lite-max | 45.5 Weak | $0.0031 | 15.0s | Needs review | — |
| 75 | minimax-m2.7-medium | 36.8 Failed | $0.0072 | 86.9s | Needs review | 1261 |
| 76 | deepseek-v3.2-low | 35.0 Failed | $0.0013 | 134.7s | Needs review | 1203 |
| 77 | gpt-5.4-low | 35.0 Failed | $0.0809 | 32.6s | Needs review | 1258 |
| 78 | gpt-5.5-low | 35.0 Failed | $0.1476 | 51.3s | Needs review | 1300 |
| 79 | gpt-5-mini | 34.7 Failed | $0.0131 | 63.8s | Needs review | — |
| 80 | claude-haiku-4.5 | 34.7 Failed | $0.0238 | 25.5s | Needs review | 1155 |
| 81 | qwen3.7-max-high | 34.7 Failed | $0.0532 | 254.9s | Needs review | 1327 |
| 82 | claude-sonnet-4.5-low | 34.7 Failed | $0.0539 | 42.7s | Needs review | 1215 |
| 83 | claude-sonnet-4.5-high | 34.7 Failed | $0.0600 | 49.3s | Needs review | 1215 |
| 84 | claude-opus-4.6-low | 34.7 Failed | $0.1819 | 84.9s | Needs review | 1332 |
| 85 | claude-sonnet-4.6-high | 34.7 Failed | $0.3623 | 299.1s | Needs review | 1324 |
| 86 | claude-opus-4.5-low | 34.3 Failed | $0.1777 | 70.9s | Needs review | 1296 |
| 87 | claude-sonnet-4.6-low | 33.7 Failed | $0.1214 | 92.3s | Needs review | 1324 |
| 88 | qwen3.7-max-low | 33.0 Failed | $0.0217 | 103.7s | Needs review | 1327 |
| 89 | gpt-5.4-max | 71.0 Usable | $0.1599 | 104.6s | Needs review | 1258 |
| 90 | claude-sonnet-4.6-max | 64.8 Needs editing | $0.3609 | 299.3s | Needs review | 1324 |
| 91 | glm-5-max | 49.8 Weak | $0.0202 | 343.1s | Needs review | 1287 |
| 92 | gpt-5.4-high | 42.6 Weak | $0.1711 | 88.9s | Needs review | 1258 |
| 93 | gpt-5.4-mini-max | 33.0 Failed | $0.0528 | 90.0s | Needs review | — |
“Arena · UI” is a third-party benchmark shown for context, independent of our tests. Sources: Artificial Analysis (artificialanalysis.ai) and Design Arena (www.designarena.ai), both via OpenRouter (openrouter.ai/rankings).
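The labels in the Score column appear to follow fixed score bands. The sketch below is inferred from the table rather than taken from published thresholds: the 70 and 60 cutoffs match the rows exactly at those scores, while the 40 cutoff is an assumption, since the table only shows that the Weak/Failed boundary lies somewhere between 36.8 and 42.6.

```python
def score_band(score: float) -> str:
    """Map a 0-100 score to a label, using cutoffs inferred from the table."""
    if score >= 70.0:
        return "Usable"
    if score >= 60.0:
        return "Needs editing"
    if score >= 40.0:  # assumed cutoff; the true value is between 36.8 and 42.6
        return "Weak"
    return "Failed"


for sample in (73.0, 69.8, 59.8, 36.8):
    print(sample, score_band(sample))
```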
What separates the top models
Landing Page Build
Medium · Build a complete, self-contained, responsive landing page from a product brief: hero, value props, social proof, and one primary CTA.
Leader: claude-opus-4.8-medium
Pricing Page Build
Hard · Build a self-contained, responsive pricing section with tiers and a highlighted recommended plan: clear hierarchy, honest copy, no invented prices.
Leader: gpt-5.5-medium
See the difference
Real model outputs and how our judge scored them.
Top score
gpt-5.5-medium
Ferrovia — enterprise pricing tiers
80/100
“Clean, professional design with a distinct recommended tier and disciplined typography. Copy is precise, benefit-led, and adheres strictly to constraints, using 'Custom' pricing appropriately. The mobile layout exhibits horizontal overflow, cutting off text.”
```html
<!doctype html>
<html lang="en">
<head>
  <meta charset="utf-8" />
  <meta name="viewport" content="width=device-width, initial-scale=1" />
  <title>Ferrovia Analytics Pricing</title>
  <style>
    :root {
      --bg: #f6f8fb;
      --surface: #ffffff;
      --surface-strong: #f9fbff;
      --text: #142033;
      --muted: #5d6b82;
      --muted-2: #78869a;
      --line: #d9e0ea;
      --accent: #1f5eff;
      --accent-dark: #1748c7;
      --accent-soft: #eaf0ff;
      --success: #147a5c;
      --shadow: 0 18px 50px rgba(20, 32, 51, 0.10);
      --radius: 22px;
      --max: 1180px;
    }
    * {
      box-sizing: border-box;
    }
    body {
      margin: 0;
      font-family: Inter, ui-sans-serif, system-ui, -apple-system, BlinkMacSystemFont, "Segoe UI", Roboto, Arial, sans-serif;
      color: var(--text);
      background:
        radial-gradient(circle at top left, rgba(31, 94, 255, 0.10), transparent 32rem),
        linear-gradient(180deg, #ffffff 0%, var(--bg) 44%, #eef2f7 100%);
      line-height: 1.5;
    }
    .pricing-section {
      padding: clamp(3.5rem, 6vw, 6rem) 1.25rem;
    }
    .pricing-wrap {
      max-width: var(--max);
      margin: 0 auto;
    }
    .eyebrow {
      display: inline-flex;
      align-items: center;
      gap: 0.5rem;
      margin: 0 0 1rem;
      padding: 0.45rem 0.75rem;
      border: 1px solid var(--line);
      border-radius: 999px;
      background: rgba(255, 255, 255, 0.72);
      color: var(--muted);
      font-size: 0.875rem;
      font-weight: 700;
      letter-spacing: 0.01em;
    }
    .eyebrow-dot {
      width: 0.55rem;
      height: 0.55rem;
      border-radius: 50%;
      background: var(--success);
      box-shadow: 0 0 0 4px rgba(20, 122, 92, 0.12);
    }
    .section-header {
      display: grid;
```
Mid-pack
minimax-m2.7
Cedar & Sage — DTC product page
70/100
“Visual design fails due to a severely broken, non-responsive mobile layout with horizontal overflow. Hero copy is clear and benefit-led, but the 'customer' section awkwardly repeats product features instead of providing testimonials. Grounding is excellent, strictly adhering to provided facts without fabrication.”
```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <meta name="viewport" content="width=device-width, initial-scale=1.0">
  <title>Cedar & Sage – Refillable Home Cleaning</title>
  <meta name="description" content="Cedar & Sage delivers refillable aluminium bottles and concentrated cleaning tablets. Add water at home, cut shipping weight, and skip single‑use plastic.">
  <style>
    :root {
      --color-bg: #f9f5f0;
      --color-bg-alt: #f0ebe3;
      --color-text: #2b2b2b;
      --color-primary: #2c5e4f;
      --color-accent: #d68c45;
      --color-light: #ffffff;
      --color-muted: #5a5a5a;
      --font-sans: -apple-system, BlinkMacSystemFont, "Segoe UI", Roboto, "Helvetica Neue", Arial, sans-serif;
      --font-serif: Georgia, "Times New Roman", serif;
      --spacing: 1rem;
      --max-width: 1200px;
      --radius: 6px;
    }
    *, *::before, *::after {
      box-sizing: border-box;
      margin: 0;
      padding: 0;
    }
    html {
      scroll-behavior: smooth;
    }
    body {
      font-family: var(--font-sans);
      background-color: var(--color-bg);
      color: var(--color-text);
      line-height: 1.6;
      font-size: 1rem;
    }
    .skip-link {
      position: absolute;
      top: -40px;
      left: 0;
      background: var(--color-primary);
      color: var(--color-light);
      padding: 0.5rem 1rem;
      text-decoration: none;
      z-index: 200;
      transition: top 0.2s;
    }
    .skip-link:focus {
      top: 0;
    }
    .container {
      width: 100%;
      max-width: var(--max-width);
      margin: 0 auto;
      padding: 0 var(--spacing);
    }
    img, svg {
      max-width: 100%;
      display: block;
    }
    /* Header */
    header {
      background: var(--color-bg);
      padding: 1rem 0;
      border-bottom: 1px solid #e0dbd3;
    }
    .logo {
      font-family: var(--font-serif);
      font-size: 1.5rem;
      color: var(--color-primary);
      font-weight: bold;
      letter-spacing: 0.02em;
    }
    /* Hero */
    .hero {
      background-color: var(--color-primary);
      color: var(--color
```
Lowest score
glm-5
Glow & Grain — local bakery site (non-tech)
0/100
Where models still fail
The most common problems we flagged across all models.
Frequently asked
What is the best AI model for frontend & landing pages?
In our benchmarks, gemini-3-flash-preview-max ranks first for frontend & landing pages, with a score of 73.0 (Usable) across 6 test cases.
What is the cheapest good model for frontend & landing pages?
gemini-3.1-flash-lite-medium is the best value: it clears our quality bar for frontend & landing pages at $3.19 per 1,000 runs.
Which model is fastest for frontend & landing pages?
gemini-3.1-flash-lite-medium is the fastest model that still scores Usable for frontend & landing pages, at 12.2s per run.
How we test
Each model output is scored by an LLM judge that returns strict JSON, supported by deterministic heuristics, and the result is normalized to a 0-100 score.
Judge: gemini-3.1-pro-preview · 582 model runs across 2 benchmarks · last tested 2026-06-30
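To make the scoring description concrete, here is a rough sketch of how deterministic checks and a judge score might be combined. The page does not publish its heuristics, judge scale, or weights, so every check, the 0-10 judge scale, and the 20% heuristic weight below are hypothetical placeholders chosen only for illustration.

```python
from html.parser import HTMLParser


class _TagCollector(HTMLParser):
    """Collects a few structural facts about an HTML document."""

    def __init__(self) -> None:
        super().__init__()
        self.has_viewport = False
        self.has_title = False
        self.external_assets = 0

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "meta" and attr.get("name") == "viewport":
            self.has_viewport = True
        elif tag == "title":
            self.has_title = True
        elif tag == "script" and attr.get("src"):
            self.external_assets += 1
        elif tag == "link" and attr.get("rel") == "stylesheet":
            self.external_assets += 1


def heuristic_checks(html: str) -> dict[str, bool]:
    """Hypothetical deterministic checks; not the benchmark's actual rules."""
    collector = _TagCollector()
    collector.feed(html)
    return {
        "viewport_meta": collector.has_viewport,
        "has_title": collector.has_title,
        "self_contained": collector.external_assets == 0,
    }


def normalize(judge_score: float, checks: dict[str, bool],
              judge_max: float = 10.0, heuristic_weight: float = 0.2) -> float:
    """Blend a judge score with the heuristic pass rate into a 0-100 score."""
    judge_part = judge_score / judge_max
    heuristic_part = sum(checks.values()) / len(checks)
    combined = (1 - heuristic_weight) * judge_part + heuristic_weight * heuristic_part
    return round(100 * combined, 1)


SAMPLE = (
    '<!doctype html><html><head>'
    '<meta name="viewport" content="width=device-width, initial-scale=1" />'
    '<title>Demo</title><style>body{margin:0}</style></head>'
    '<body><a href="#buy">Buy</a></body></html>'
)
checks = heuristic_checks(SAMPLE)
final_score = normalize(8.0, checks)
print(checks, final_score)
```

Keeping the structural checks deterministic means the same output always gets the same heuristic result, which leaves the judge responsible only for the subjective parts such as design and copy.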
This page is Spring Prompt, running
We just did this for every model. Do it for your prompt.
The rankings above come from running real tasks through real models and scoring every output. Spring Prompt is that same engine — pointed at your prompt, your test cases, and your definition of good.
- Generate test cases from your prompt — no eval set required to start.
- Compare models side by side with quality, cost and latency in one matrix.
- Optimise the winner until the scores say it's ready to ship.