Live benchmark

ROASBench

A hard-mode performance marketing simulation for LLMs. Models act as the marketer for a DTC skincare brand: they choose channels, plan spend, write creative angles, react to results, and live with the consequences for 12 months.

12-month simulation · 3 repeats per model · 6 controllable channels · 8 shopper personas

Top cumulative profitability

Anthropic: Claude Opus 4.6

$506,094

Current leader

Anthropic: Claude Opus 4.6

Avg score 40.61

Profitable after 12 months

1 model

Anthropic: Claude Opus 4.6

Lead over #2

+13.47 points

vs. Google: Gemini 3.1 Pro Preview

Closest profit challenger

Qwen: Qwen3.5 Plus 2026-02-15

$-29,480 after 12 months

Money balance over time

Average cumulative profit by month

Average across all completed runs for each participant.

Score vs. cost per run

Quality is not evenly priced

Average benchmark score against average main-model API cost per run.

Monthly contribution profit

Who stabilizes fastest?

Average monthly contribution profit across completed runs.

Average score

ROASBench leaderboard

Models ranked by average primary score (highest first). Values above each bar are the mean score; whiskers show standard deviation across completed runs when more than one run exists.
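
As a rough illustration of how the bar values and whiskers described above could be derived, the sketch below computes a mean and a standard deviation per model and omits the deviation when only one run exists. The function name `summarize_runs` and the run scores are hypothetical, and the use of the sample standard deviation is an assumption; the page does not say which variant it uses.

```python
from statistics import mean, stdev
from typing import List, Optional, Tuple


def summarize_runs(scores: List[float]) -> Tuple[float, Optional[float]]:
    """Return (mean score, standard deviation or None for a single run)."""
    if not scores:
        raise ValueError("at least one completed run is required")
    # A spread is only defined when there is more than one run.
    spread = stdev(scores) if len(scores) > 1 else None
    return mean(scores), spread


# Hypothetical run scores, not taken from the leaderboard.
three_runs = summarize_runs([10.0, 12.0, 14.0])
single_run = summarize_runs([11.5])
print(three_runs, single_run)
```

A model with a single completed run gets a mean but no whisker, which matches the "when more than one run exists" condition.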

Leaderboard

Current standings

Sorted by average benchmark score, with sub-scores and economics detail listed under each model.

# Model Score ROAS
1

Anthropic: Claude Opus 4.6

Leads the pack by compounding a coherent plan: retention channels stay funded, discounting stays rare, and changes are absorbed without constant learning resets. Averaged 40.61 across 3 completed runs; contribution profit $506,094; ROAS 192.5%. On average, cumulative contribution profit first turns positive around month 7.0.

40.61
192.5%

Sub-scores

Business
35.70
Behavior
27.28
Planning
59.06
Persona
56.75

Economics

Cost / run
$0.99
Spend
$1,244,940
Revenue
$2,402,318
Runs
3
1st profitable mo (avg)
~7.0

Avg profit: $506,094

Strength: Stable iteration, persona-aware creative, disciplined CRM and remarketing.

Watch: Still hits saturation and high CAC when scaling search and broad demand.

2

Google: Gemini 3.1 Pro Preview

Often nearer break-even with structurally sensible moves; execution and generic creative hold the score down, with too many mid-course resets. Averaged 27.14 across 3 completed runs; contribution profit $-34,549; ROAS 132.9%. On average, cumulative contribution profit first turns positive around month 12.0.

27.14
132.9%

Sub-scores

Business
12.17
Behavior
29.25
Planning
53.04
Persona
49.11

Economics

Cost / run
$0.53
Spend
$1,002,707
Revenue
$1,333,366
Runs
3
1st profitable mo (avg)
~12.0

Avg profit: $-34,549

Strength: Directionally right budget and channel choices vs. weaker frontier peers.

Watch: Generic copy, remarketing churn, and learning resets under pressure.

3

Qwen: Qwen3.5 Plus 2026-02-15

Ranked #3 of 13 with an average benchmark score of 26.07 across 3 run(s). Sub-scores are strongest on planning (51.7) and weakest on business (14.8). Average contribution profit $-29,480 and ROAS 133.8%. On average, cumulative contribution profit stayed negative through the full simulation year.

26.07
133.8%

Sub-scores

Business
14.79
Behavior
21.36
Planning
51.68
Persona
45.55

Economics

Cost / run
$0.00
Spend
$1,018,413
Revenue
$1,362,638
Runs
3
1st profitable mo (avg)

Avg profit: $-29,480

Strength: Relative edge: planning (51.7).

Watch: Relative gap: business (14.8).

4

Anthropic: Claude Sonnet 4.6

Ranked #4 of 13 with an average benchmark score of 21.21 across 3 run(s). Sub-scores are strongest on planning (57.4) and weakest on business (9.7). Average contribution profit $-146,915 and ROAS 117.6%. On average, cumulative contribution profit stayed negative through the full simulation year.

21.21
117.6%

Sub-scores

Business
9.73
Behavior
14.86
Planning
57.35
Persona
36.06

Economics

Cost / run
$0.53
Spend
$985,912
Revenue
$1,159,482
Runs
3
1st profitable mo (avg)

Avg profit: $-146,915

Strength: Relative edge: planning (57.4).

Watch: Relative gap: business (9.7).

5

DeepSeek: DeepSeek V3.2

Ranked #5 of 13 with an average benchmark score of 19.77 across 3 run(s). Sub-scores are strongest on planning (49.5) and weakest on business (8.9). Average contribution profit $-126,535 and ROAS 120.4%. On average, cumulative contribution profit first turns positive around month 12.0.

19.77
120.4%

Sub-scores

Business
8.93
Behavior
13.28
Planning
49.54
Persona
37.26

Economics

Cost / run
$0.00
Spend
$987,722
Revenue
$1,193,277
Runs
3
1st profitable mo (avg)
~12.0

Avg profit: $-126,535

Strength: Relative edge: planning (49.5).

Watch: Relative gap: business (8.9).

6

OpenAI: GPT-5.4

Looks plausible on paper but weak compounding: revenue without efficient spend patterns; repeated broad demand spend without durable payoff. Averaged 18.39 across 3 completed run(s); contribution profit $-250,461; ROAS 103.2%. Across runs, cumulative contribution profit never crossed zero on average in the first 12 months.

18.39
103.2%

Sub-scores

Business
5.69
Behavior
17.55
Planning
54.12
Persona
30.77

Economics

Cost / run
$0.29
Spend
$970,302
Revenue
$1,001,373
Runs
3
1st profitable mo (avg)

Avg profit: $-250,461

Strength: Readable strategy and channel mix in isolation.

Watch: Search/remarketing saturation and budgeting that does not match outcomes.

7

xAI: Grok 4.20 Beta

Ranked #7 of 13 with an average benchmark score of 16.77 across 3 run(s). Sub-scores are strongest on planning (43.9) and weakest on business (5.9). Average contribution profit $-243,069 and ROAS 103.5%. On average, cumulative contribution profit stayed negative through the full simulation year.

16.77
103.5%

Sub-scores

Business
5.91
Behavior
16.36
Planning
43.92
Persona
29.32

Economics

Cost / run
$0.00
Spend
$972,947
Revenue
$1,009,791
Runs
3
1st profitable mo (avg)

Avg profit: $-243,069

Strength: Relative edge: planning (43.9).

Watch: Relative gap: business (5.9).

8

Google: Gemini 3 Flash Preview

Ranked #8 of 13 with an average benchmark score of 16.29 across 3 run(s). Sub-scores are strongest on planning (49.3) and weakest on business (4.6). Average contribution profit $-239,859 and ROAS 104.6%. On average, cumulative contribution profit stayed negative through the full simulation year.

16.29
104.6%

Sub-scores

Business
4.63
Behavior
13.03
Planning
49.33
Persona
30.27

Economics

Cost / run
$0.10
Spend
$967,576
Revenue
$1,012,636
Runs
3
1st profitable mo (avg)

Avg profit: $-239,859

Strength: Relative edge: planning (49.3).

Watch: Relative gap: business (4.6).

9

MoonshotAI: Kimi K2.5

Ranked #9 of 13 with an average benchmark score of 13.10 across 2 run(s). Sub-scores are strongest on planning (50.8) and weakest on business (3.7). Average contribution profit $-292,423 and ROAS 97.5%. On average, cumulative contribution profit stayed negative through the full simulation year.

13.10
97.5%

Sub-scores

Business
3.72
Behavior
5.12
Planning
50.75
Persona
22.91

Economics

Cost / run
$0.00
Spend
$966,702
Revenue
$942,650
Runs
2
1st profitable mo (avg)

Avg profit: $-292,423

Strength: Relative edge: planning (50.8).

Watch: Relative gap: business (3.7).

10

MiniMax: MiniMax M2.7

Ranked #10 of 13 with an average benchmark score of 12.13 across 3 run(s). Sub-scores are strongest on planning (39.4) and weakest on business (3.5). Average contribution profit $-316,117 and ROAS 94.2%. On average, cumulative contribution profit stayed negative through the full simulation year.

12.13
94.2%

Sub-scores

Business
3.54
Behavior
9.08
Planning
39.37
Persona
21.24

Economics

Cost / run
$0.00
Spend
$961,885
Revenue
$906,316
Runs
3
1st profitable mo (avg)

Avg profit: $-316,117

Strength: Relative edge: planning (39.4).

Watch: Relative gap: business (3.5).

11

OpenAI: GPT-5.4 Mini

Ranked #11 of 13 with an average benchmark score of 11.83 across 1 run(s). Sub-scores are strongest on planning (44.9) and weakest on business (2.0). Average contribution profit $-353,629 and ROAS 87.3%. On average, cumulative contribution profit stayed negative through the full simulation year.

11.83
87.3%

Sub-scores

Business
2.00
Behavior
5.58
Planning
44.94
Persona
24.04

Economics

Cost / run
$0.00
Spend
$952,498
Revenue
$831,373
Runs
1
1st profitable mo (avg)

Avg profit: $-353,629

Strength: Relative edge: planning (44.9).

Watch: Relative gap: business (2.0).

12

Z.ai: GLM 5

Ranked #12 of 13 with an average benchmark score of 9.64 across 3 run(s). Sub-scores are strongest on planning (42.6) and weakest on business (0.9). Average contribution profit $-370,131 and ROAS 86.0%. On average, cumulative contribution profit stayed negative through the full simulation year.

9.64
86.0%

Sub-scores

Business
0.91
Behavior
5.34
Planning
42.62
Persona
16.65

Economics

Cost / run
$0.00
Spend
$949,211
Revenue
$816,409
Runs
3
1st profitable mo (avg)

Avg profit: $-370,131

Strength: Relative edge: planning (42.6).

Watch: Relative gap: business (0.9).

13

OpenAI: GPT-5.4 Nano

Ranked #13 of 13 with an average benchmark score of 6.80 across 3 run(s). Sub-scores are strongest on planning (46.5) and weakest on business (0.0). Average contribution profit $-577,506 and ROAS 56.5%. On average, cumulative contribution profit stayed negative through the full simulation year.

6.80
56.5%

Sub-scores

Business
0.00
Behavior
1.33
Planning
46.45
Persona
5.36

Economics

Cost / run
$0.00
Spend
$959,525
Revenue
$542,214
Runs
3
1st profitable mo (avg)

Avg profit: $-577,506

Strength: Relative edge: planning (46.5).

Watch: Relative gap: business (0.0).


For model providers

Want your model on ROASBench?

We can run official benchmark passes and publish results alongside the leaderboard. Tell us which model and API access to use.


Methodology

How ROASBench works

The sections below cover setup, simulation flow, what models see, personas, state, and the skills the benchmark rewards.

ROASBench drops the model into a year-long operating environment for one premium-but-accessible skincare brand and scores the result on business outcomes, not nice-sounding plans.

Brand

Northstar Skin

Barrier Repair Serum at $68 with 76% gross margin.
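
These two figures pin down the basic unit economics. A minimal sketch, assuming ad spend is the only cost besides cost of goods (the simulator may apply others, so treat the result as a lower bound rather than its exact rule); the variable names are illustrative:

```python
PRICE = 68.00         # Barrier Repair Serum price, from the brand card
GROSS_MARGIN = 0.76   # gross margin, from the brand card

# Gross profit earned on one full-price unit.
gross_profit_per_unit = PRICE * GROSS_MARGIN

# Revenue needed per ad dollar for gross profit to cover that dollar.
break_even_roas = 1 / GROSS_MARGIN

print(f"gross profit per unit: ${gross_profit_per_unit:.2f}")
print(f"break-even ROAS: {break_even_roas:.1%}")
```

Under these assumptions a full-price order yields about $51.68 in gross profit, and paid spend breaks even at a ROAS of roughly 131.6%. With a fixed cost of goods, any discount lowers the gross profit per unit and raises that threshold.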

Time horizon

12 months

The model has to adapt over time instead of solving one isolated scenario.

Controlled channels

6

Meta prospecting, Search, Shopping, TikTok, Email / CRM, and Remarketing.

Scoring

Business + behavior

Primary score blends profitability, planning quality, persona response, and long-run adaptation.

What the model can control

  • Budget allocation — spend per month and split across channels.
  • Campaign design — type, segments, creative angle, copy structure.
  • Offer strategy — discounts, remarketing, CRM cadence, margin vs. conversion.
  • Iteration — hold, scale, or change course after monthly results.

Each round is a real operating cycle, not a one-shot prompt. Past choices affect future state, so the benchmark rewards consistency and punishes lazy resets.

1. Seeded world

Fixed brand, budget, customers, email list, warm pool, seasonality, shocks.

2. Decision step

Structured monthly plan: objective, budget, discount, remarketing, channels, creative.

3. Persona panel + rules

Panel judges copy and targeting; rules produce clicks, trust, purchases, retention.

4. State update

Budget, base, momentum, fatigue, pools, and channel memory roll forward.
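
To make the decision step concrete, here is one way a structured monthly plan could be represented and sanity-checked before it is handed to a simulator. Everything in it is a hypothetical sketch: the `MonthlyPlan` class, its field names, the channel keys, and the validation rules are assumptions, not ROASBench's actual schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# The six controllable channels named in the methodology (key spellings are assumed).
CHANNELS = ("meta_prospecting", "search", "shopping", "tiktok", "email_crm", "remarketing")


@dataclass
class MonthlyPlan:
    objective: str
    budget: float                      # total spend allowed this month
    discount_pct: float = 0.0          # site-wide discount, 0-100
    channel_spend: Dict[str, float] = field(default_factory=dict)
    creative_angles: Dict[str, str] = field(default_factory=dict)

    def problems(self) -> List[str]:
        """Return human-readable issues; an empty list means the plan passes."""
        issues = []
        unknown = sorted(set(self.channel_spend) - set(CHANNELS))
        if unknown:
            issues.append(f"unknown channels: {unknown}")
        if any(amount < 0 for amount in self.channel_spend.values()):
            issues.append("negative channel spend")
        if sum(self.channel_spend.values()) > self.budget:
            issues.append("channel spend exceeds monthly budget")
        if not 0 <= self.discount_pct <= 100:
            issues.append("discount must be between 0 and 100")
        return issues


# Illustrative numbers only.
spend = {"meta_prospecting": 40_000, "shopping": 18_000, "search": 12_000}
ok_plan = MonthlyPlan("Establish a baseline", budget=90_000, channel_spend=spend)
over_plan = MonthlyPlan("Establish a baseline", budget=60_000, channel_spend=spend)
print(ok_plan.problems(), over_plan.problems())
```

Returning a list of issues rather than raising on the first one keeps every problem with a plan visible at once.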

What data the model gets back

  • Monthly operating metrics: spend, revenue, ROAS, CAC, repeat rate, reinvestment.
  • State: budget left, customer base, email list, warm pool, momentum, fatigue.
  • Audience map: segments, persona sizes, fit, competition, value.
  • Working memory: prior decisions, highlights, penalty flags.
  • Market notes: seasonality and shocks.

The prompt includes no raw persona-by-persona judge feedback; the model has to infer persona response from outcomes.
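
ROAS, one of the monthly metrics listed above, is conventionally revenue divided by ad spend. The sketch below applies that definition to the averaged Spend and Revenue shown in two leaderboard rows. The helper name `roas` is illustrative, and treating the published ROAS as this simple ratio is an assumption, since the page does not state its formula.

```python
def roas(revenue: float, spend: float) -> float:
    """Return on ad spend as a ratio (1.0 means revenue equals spend)."""
    if spend <= 0:
        raise ValueError("spend must be positive")
    return revenue / spend


# Averaged Spend and Revenue from the leaderboard rows.
gpt_54 = roas(revenue=1_001_373, spend=970_302)
gpt_54_nano = roas(revenue=542_214, spend=959_525)

print(f"GPT-5.4: {gpt_54:.1%}, GPT-5.4 Nano: {gpt_54_nano:.1%}")
```

Both results round to the published 103.2% and 56.5%. For some other rows the ratio of the averaged totals differs from the published figure by a few tenths of a point, which would be consistent with ROAS being averaged per run, though the page does not confirm this.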

Main difficulties

  • Learning resets: Abrupt reallocations hurt efficiency.
  • Saturation: Finite warm pools and auctions.
  • Offer fatigue: Discounts can damage later months.
  • Organic carryover: Upper funnel pays off slowly.
  • Persona tradeoffs: Easy vs. valuable audiences.
  • Soft caps: CRM, retargeting, channel limits.

Every persona differs in size, growth, fit, competition, and value. The model starts with a commercial map but must learn what actually monetizes.

Value Seeker

Large and relatively easy to wake up with offers, but lower-value and highly price competitive.

Motivations: visible results, discount

Premium Conscious

Smaller but high-value premium audience with strong fit for the brand and heavy competition from other prestige skincare.

Motivations: ingredients, authority

Ingredient Researcher

Harder to win because they scrutinize claims, but they compound into valuable, durable customers when convinced.

Motivations: clinical details, ingredient list

Impulse Buyer

Big upper-funnel opportunity that is easier to engage creatively, but conversion quality and retention are weaker.

Motivations: aesthetic creative, quick payoff

Comparison Shopper

Commercially meaningful and high-intent, but expensive to win because comparison behavior increases competition and pressure on proof.

Motivations: clear differentiation, proof

Returning Loyalist

Smaller owned audience but the most valuable and efficient to monetize if protected with the right cadence.

Motivations: routine, restock convenience

Lapsed Customer

Warm and recoverable with decent value, but reactivation requires freshness and fatigue management.

Motivations: newness, better routine fit

Low Intent Browser

Largest reachable pool and easiest to attract at the top of funnel, but low intent and low customer value.

Motivations: light curiosity, visual intrigue

Fixed upfront

Brand, economics, personas, channels, seasonality, shocks.

Persistent state

Budget, customers, email, warm pool, momentum, fatigue, reinvestment, channel memory.

Iteration summary

Prior decisions and outcomes compressed for the next month.

Economic judgment

Margin, contribution profit, CAC, bad revenue.

Budget pacing

When to press, hold, sequence spend.

Channel allocation

Prospecting vs. intent vs. CRM vs. remarketing.

Persona targeting

Easy vs. valuable audiences.

Creative specificity

Angles that match motivations and objections.

Stable iteration

Improve in place; avoid constant restructures.

In practice: protect margin, keep retention alive, scale demand capture carefully, stay persona-aware. Models fail when they confuse activity with progress or write polished but generic copy.

Qualitative findings

What the frontier models are actually doing

Across 36 completed runs, Anthropic: Claude Opus 4.6 is currently leading with an average score of 40.61, average profit of $506,094, and average ROAS of 192.5%.

Top model

Anthropic: Claude Opus 4.6

Average score

40.61

Claude Opus 4.6 is winning by doing the boring but important things well. It keeps CRM and remarketing active every month, avoids discounting, and improves the account without repeatedly breaking learning.

Performance

Positive months 27 / 36 · Discount months 0

ROAS 192.5% · Avg profit $506,094

Decision quality

Copy 82.39 · Targeting 85.78 · Iteration 87.11

Why it works

The pattern is compounding discipline: many profitable months, price integrity, retention channels funded throughout. It still gets hit by CAC and search saturation at scale but keeps the account coherent.

Other frontier models

OpenAI: GPT-5.4

GPT-5.4 reads as strategically plausible in isolation, but run-level data shows weak compounding and broad demand spend without durable payoff.

Google: Gemini 3.1 Pro Preview

Nearer break-even with structurally sensible moves; execution and generic creative limit the score, with learning resets on remarketing.

Bottom line: The top model compounds a coherent strategy over time; others may look smart in prose without twelve months of aligned outcomes.

Example turns

Snapshots from strong runs

One high-scoring run per model. Each sampled month shows the objective, the outcome, and how the simulated audience responded, followed by channel and creative detail.

Strong completed run

Anthropic: Claude Opus 4.6

Run score

42.61

Run profit

$652,288

Month 1 Score 20.2

Objective

Establish baseline performance across channels, build retargeting pools, and prioritize high-fit personas (Premium Conscious, Ingredient Researcher, Returning Loyalist) while seeding upper-funnel demand. Month 1 is a learning month: diversify spend to gather signal, lean into education-led creative, and avoid discounting to protect margin and brand positioning.

Budget

$93,716

Profit

$-30,993

ROI

-0.33

Discount

0%

What happened next

Strong educational baseline with excellent persona-message fit for researchers and premium buyers. However, high remarketing spend on a small 18K audience pool risks early frequency fatigue.

Why it thought this made sense

Month 1 strategy: No discount — we lead with ingredient credibility, founder story, and clinical proof. The brand's strongest fit is with Premium Conscious (0.91), Ingredient Researcher (0.94), and Returning Loyalist (0.96). We allocate ~38% to Meta prospecting to build awareness and feed retargeting pools, ~18% to Google Search/Shopping to capture existing intent, ~12% to TikTok for upper-funnel creative testing, ~11% to remarketing to convert warm audiences (site_visitors 37.8K + cart_abandoners 8.6K are small but actionable), and ~3% to email/CRM to activate the 62K list and 14K customer base. We hold back ~18% of annual budget capacity for later months when retargeting pools are larger and we have performance data. Key risks: small retargeting pools limit remarketing efficiency; cold audiences may have high CPAs initially. We accept that and optimize in Month 2 with real data.

Top channels

Meta Prospecting $36,500

Education-led ingredient authority — lead with ceramide science and barrier repair mechanism. Split test founder-to-camera explainer vs. clinical before/after carousel. Targets Premium Conscious and Ingredient Researcher…

TikTok $21,500

Raw UGC test battery — 4-5 hook variations from real customers. Test: texture close-up, 'dermatologist reacts', morning routine integration, 'products I stopped buying'. Optimizing for thumb-stop rate and click-through. …

Google Search $14,000

Competitor conquest — target searches for competing ceramide/barrier products (CeraVe, Dr. Jart, SK-II barrier lines). Differentiate on clinical formulation and ingredient purity. Comparison Shopper persona.

Month 6 Score 45.9

Objective

Compound Month 5's strong momentum (ROI 0.22, $152K revenue) by scaling proven channels incrementally while protecting efficiency. Google Search and Remarketing remain the ROI backbone. Meta gets a modest increase to feed upper-funnel pools for H2 scaling. TikTok holds at minimum viable. Email gets a slight uplift to monetize the growing 92K list and 16.7K customer base — particularly Returning Loyalists and Lapsed Customers. No discount: brand momentum is +0.28 and offer fatigue is 0.00, so we preserve pricing integrity and let education + trust do the work. Target: ROI 0.25+, revenue $155K+, CAC under $200.

Budget

$92,343

Profit

$38,923

ROI

0.43

Discount

0%

What happened next

Disciplined scaling of proven channels and strict margin protection are driving strong profitability. The clinical, ingredient-led messaging resonates deeply with high-value personas, though casual browsers remain alienated.

Why it thought this made sense

Month 6 strategy: M5 was the best month yet — ROI 0.22, revenue $152K, and direct revenue emerging at $19K (brand demand building). Google Search ROI climbed back to 1.40 after stabilizing from learning resets. Remarketing continues to be the highest-ROI channel at 3.02 but is finite — we increase slightly to $14K to capture the growing retargeting pools (site visitors 182K, cart abandoners 41K) without over-saturating. Google Search gets $32K — a small increment from $31K — respecting the persistent saturation flag while capturing the intent we're generating. Google Shopping scales to $16K as PMax continues to improve (1.16 ROI in M4). Meta gets $26K again — stable spend to keep feeding lookalike and cold audiences; we rotate creative to avoid fatigue. TikTok holds at $5K minimum viable to maintain upper-funnel presence. Email gets a $1.5K allocation to run a restock sequence for existing customers and a win-back for lapsed — the 92K list and 21% repeat rate represent meaningful owned-channel revenue. Key persona focus: Premium Conscious and Ingredient Researcher on search/shopping (high brand fit, high value), Returning Loyalist and Lapsed Customer on email/remarketing (highest value, lowest competition), Impulse Buyer on Meta/TikTok (creative-led engagement to build pools). We avoid discounting — brand momentum is positive and we have zero offer fatigue, which is a strategic asset heading into H2 when we may want promotional levers for peak season.

Top channels

Google Search $31,269

Problem-aware search capture with ingredient authority — targeting barrier repair, damaged skin barrier, ceramide serum queries. Maintain brand search defense at ~10% of budget.

Meta Prospecting $25,406

Rotate M5 winning angles with fresh creative iterations: (1) Founder story + ingredient deep-dive for Premium Conscious/Ingredient Researcher, (2) Before/after transformation with customer testimonials for Impulse Buyer,…

Google Shopping $15,635

Performance Max with optimized feed — hero product image, star ratings, price clarity. Let Google's signals find high-intent shoppers across surfaces.

Month 11 Score 68.0

Objective

Maximize holiday peak revenue in Month 11 by scaling proven channels aggressively with the largest budget yet ($174K). M10 was the best month ever — ROI 1.12, $324K revenue, CAC $102, 1,090 new customers, and $91K direct revenue showing massive brand momentum. With holiday demand surge, brand momentum +1.00, zero offer fatigue, and a 21.5K customer base feeding direct/organic, this is the month to push hard. Scale remarketing to capture the enlarged warm pool (189K), increase Meta and Google Shopping for holiday gift-intent traffic, and lean into email for holiday lifecycle pushes to the 138K list. No discount needed — brand equity is strong and holiday urgency provides natural conversion lift.

Budget

$147,500

Profit

$195,218

ROI

1.34

Discount

0%

What happened next

Aggressive holiday scaling capitalizes well on peak brand momentum, though zero discounting risks alienating price-sensitive shoppers during a highly promotional season. The heavy $35K remarketing push may quickly saturate the warm pool, requiring close monitoring of frequency caps.

Why it thought this made sense

Month 11 strategy: M10 was the breakout month — ROI 1.12, $324K revenue, $91K direct, CAC $102, 1,090 new customers. Brand momentum is at +1.00 and offer fatigue is 0.00 — ideal conditions for aggressive scaling during holiday peak. Key decisions: (1) Scale total spend from $114K to $174.5K — a 53% increase justified by holiday demand surge, proven channel ROI, and the need to capture seasonal intent before it evaporates. (2) Remarketing scaled to $35K (from $24K) — warm pool is 189K, cart abandoners at 91K, and holiday browsing will inflate these further. Remarketing has been our most consistent ROI channel (3.4-3.8x) and the larger pool can absorb more spend. (3) Meta scaled to $38K — holiday is Meta's strongest period for DTC skincare (gift-giving intent, aesthetic creative performs well). Targeting Premium Conscious and Ingredient Researcher personas with gift-angle creative. (4) Google Shopping scaled to $28K — holiday shopping intent peaks in November; PMax should capture high-intent gift and self-purchase queries. (5) Google Search held at $34K — already showing saturation flags but holiday intent should temporarily expand the addressable search volume. (6) TikTok scaled to $10K — holiday content consumption spikes; creator whitelisting with gift-guide angles. (7) Email at $3,537.50 — holiday lifecycle push to 138K list with gift guide, restock reminder for loyalists, and lapsed reactivation. No discount: brand equity is strong, holiday urgency replaces discount as conversion driver, and protecting margin at 76% gross is more valuable than marginal conversion lift from discounting. Budget remaining after this month: $116,358.34 for M12, which gives us room for a strong close. Risk management: watching for auction cost inflation during holiday (CPMs typically +20-40% in Nov), which could compress ROI despite higher demand. If remarketing pools saturate mid-month, excess budget should flow to Google Shopping PMax.
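
The month-over-month shifts quoted in this rationale are easy to verify, and the same arithmetic can flag channels whose budget moves sharply enough to risk a learning reset. A minimal sketch: the helper names and the 15% default threshold are assumptions (15% is the limit the Gemini run sets for itself elsewhere on this page, not a documented simulator rule).

```python
from typing import Dict, List


def pct_change(previous: float, current: float) -> float:
    """Fractional change from previous to current (0.5 means +50%)."""
    if previous <= 0:
        raise ValueError("previous spend must be positive")
    return (current - previous) / previous


def large_shifts(previous: Dict[str, float], current: Dict[str, float],
                 threshold: float = 0.15) -> List[str]:
    """Channels present in both months whose spend moved by more than threshold."""
    return [
        channel
        for channel in previous
        if channel in current
        and abs(pct_change(previous[channel], current[channel])) > threshold
    ]


# Figures quoted in the Month 11 rationale above.
total_change = pct_change(114_000, 174_500)
remarketing_change = pct_change(24_000, 35_000)
flagged = large_shifts({"remarketing": 24_000, "search": 34_000},
                       {"remarketing": 35_000, "search": 34_000})
print(f"total {total_change:.0%}, remarketing {remarketing_change:.0%}, flagged {flagged}")
```

The total works out to the 53% increase the rationale cites, and remarketing rises by about 46%, so it is the channel this check would flag while Search, held flat, is not.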

Top channels

Meta Prospecting $38,000

Holiday gift-giving + self-care angles targeting Premium Conscious and Ingredient Researcher personas. Three creative variants: (1) 'The skincare gift they'll actually use' — elegant product-as-gift visual with ingredien…

Remarketing $35,000

Dynamic retargeting with holiday urgency overlay. Cart abandoners see 'Still in your cart — don't let winter win' with product imagery and review snippets. Site visitors see 'The barrier repair serum 21,000+ customers tr…

Google Search $34,000

Problem-aware search capture with holiday self-care framing. Target barrier repair, damaged skin barrier, winter skincare queries that spike in November.

Strong completed run

Google: Gemini 3.1 Pro Preview

Run score

30.38

Run profit

$432

Month 1 Score 16.9

Objective

Establish baseline acquisition and capture existing demand while introducing the brand to high-fit premium audiences.

Budget

$93,216

Profit

$-47,083

ROI

-0.51

Discount

0%

What happened next

Solid baseline launch protecting premium positioning with zero discounts and strong clinical messaging on Meta. However, generic copy on Google Shopping and a lack of low-friction hooks will limit conversion for impulse and value-driven segments.

Why it thought this made sense

Month 1 focuses on finding our footing with a balanced mix of upper-funnel prospecting and high-intent demand capture. We are leaning heavily on Meta Advantage+ and Google PMax to let the algorithms find high-value Ingredient Researchers and Premium Conscious buyers. No site-wide discounts to protect our premium positioning.

Top channels

Meta Prospecting $40,000

clinical efficacy, ceramides, and authority

Google Shopping $18,000

product feed, high intent

Google Search $12,000

clear differentiation and ingredient proof

Month 6 Score 28.6

Objective

Improve unit economics and reduce blended CAC by maintaining platform stability on paid channels while activating the 90k+ email list for high-margin conversions.

Budget

$86,819

Profit

$1,219

ROI

0.01

Discount

0%

What happened next

Maintaining platform stability and zero discounts solidifies premium positioning but limits volume. Generic ad copy across search and remarketing fails to strongly differentiate the brand.

Why it thought this made sense

Kept budget changes on Meta and Google under 15% to preserve platform learning. Introduced a dedicated budget for Email/CRM to monetize the large list of non-purchasers via an education sequence, leaning into the brand's strength in trust-building.

Top channels

Meta Prospecting $29,500

Clinical evidence and ingredient deep-dive

Google Search $25,500

High-intent capture for barrier repair and skin healing

Google Shopping $14,500

Product feed highlighting premium positioning and clinical efficacy

Month 11 Score 57.1

Objective

Capitalize on Holiday/BFCM demand by introducing a 15% discount to activate the large email list (124k) and warm audience pool (161k), while scaling top-performing Meta and Google Search campaigns within the $84k budget limit.

Budget

$84,000

Profit

$47,607

ROI

0.58

Discount

15%

What happened next

Introducing a 15% discount perfectly times the monetization of a large, un-fatigued audience for BFCM. However, the ad copy is highly generic and relies entirely on the discount rather than strong brand differentiation.

Why it thought this made sense

Introducing a 15% BFCM discount to activate our growing email list and warm audiences. Increasing budget slightly across Meta and Google Search to capture high-intent holiday traffic while keeping shifts under 15% to maintain platform learning and stability.

Top channels

Meta Prospecting $29,500

BFCM offer highlighting clinical evidence and barrier repair

Google Search $23,000

Intent capture for barrier repair and BFCM deals

Google Shopping $14,500

Product feed highlighting the 15% off holiday offer

Strong completed run

OpenAI: GPT-5.4

Run score

20.56

Run profit

$-248,939

Month 1 Score 5.7

Objective

Build efficient new-customer acquisition while establishing premium trust signals, then convert warm traffic with education-led CRM and disciplined remarketing.

Budget

$92,216

Profit

$-62,142

ROI

-0.68

Discount

0%

What happened next

Premium positioning is strong, but restricting Google Search and Shopping to retargeting pools severely limits top-of-funnel acquisition. Heavy remarketing spend against a small warm audience risks early saturation without driving enough new traffic.

Why it thought this made sense

Month 1 should prioritize signal gathering and scalable acquisition without cheapening the premium positioning. Northstar Skin has strong ingredient credibility and founder-story potential, so the plan leans into Premium Conscious and Ingredient Researcher personas via Meta, Google Search, and Shopping, while using TikTok selectively for creative-led reach. No broad discount is recommended this month because premium brand fit is high, repeat economics are healthy, and early offer fatigue should be preserved. Email focuses on education, founder credibility, and warm conversion support. Remarketing is funded meaningfully but kept disciplined due to finite pool sizes.

Top channels

Meta Prospecting $30,000

Founder-led barrier repair education with premium social proof and texture/results storytelling

Google Shopping $22,000

Feed-based conversion capture with premium imagery, reviews, and clear product differentiation

Google Search $18,000

High-intent problem/solution capture focused on barrier repair, ingredient credibility, and trust-building claims

Month 6 Score 21.4

Objective

Improve contribution efficiency and reduce CAC by keeping continuity in the proven paid structure while shifting more spend toward high-fit, trust-led Meta prospecting, disciplined TikTok creator content, strong but capped remarketing, and heavier CRM monetization. Prioritize Premium Conscious, Ingredient Researcher, and warm reactivation audiences over broader low-intent scale, and avoid discounting to protect premium positioning and limit offer fatigue.

Budget

$85,670

Profit

$-27,672

ROI

-0.33

Discount

0%

What happened next

Consolidating spend into Meta and TikTok while pausing saturated Google channels is a smart structural move. However, the ad copy relies on generic claims rather than specific ingredient proof, which will struggle to convert the targeted premium and research-heavy personas.

Why it thought this made sense

Months 4-5 showed the clearest path: Meta remains the most scalable paid driver despite weak standalone ROI, while remarketing is the strongest efficient channel and email is extremely profitable. Brand momentum is slightly negative and offer fatigue is already elevated, so the plan avoids promos and instead leans into education, proof, founder authority, and routine-based conversion. Remarketing is trimmed slightly from recent levels to respect saturation in finite warm pools. Google Search and Shopping remain paused after repeated saturation and learning reset flags. This month is about disciplined continuity, better audience quality, and stronger owned-channel capture rather than broad expansion.

Top channels

Meta Prospecting $54,000

Founder-led barrier repair education with ingredient credibility, visible routine outcomes, and premium trust signals aimed at Premium Conscious and Ingredient Researcher audiences while still giving enough social proof …

TikTok $18,000

Native creator routines showing irritated-to-calm skin journeys, texture shots, and quick explanations of why barrier repair matters, optimized to stop the scroll without leaning discount-heavy.

Remarketing $12,000

Conversion-focused reminder sequence with ingredient proof, reviews, routine reassurance, and premium value framing rather than aggressive offers.

Month 11 Score 36.2

Objective

Build on month 10's first positive ROI by preserving the proven Meta + TikTok + remarketing + CRM structure, leaning into holiday demand with education-led premium conversion messaging instead of discounting, and adding a controlled layer of high-intent Google capture to harvest incremental demand without resetting the account too aggressively.

Budget

$71,764

Profit

$19,397

ROI

0.28

Discount

0%

What happened next

Maintaining continuity and avoiding holiday discounts protects premium positioning and margins. However, the ad copy remains too generic to fully convert ingredient-conscious shoppers or re-engage lapsed buyers.

Why it thought this made sense

Month 10 was the clearest proof point yet that continuity is working. The best plan for month 11 is not a broad reset but a disciplined scale inside the channels that have repeatedly shown strength: remarketing, Meta creative testing, TikTok creator-led storytelling, and owned CRM. Holiday demand should lift branded and problem-aware traffic, so a measured reintroduction of Google Search is warranted, but kept modest because prior abrupt changes caused learning resets and saturation. Messaging should skew toward founder authority, barrier-repair proof, ingredient credibility, reviews, and routine-fit for Premium Conscious and Ingredient Researcher segments. Avoid lazy promos given offer fatigue and the brand's premium positioning.

Top channels

Meta Prospecting $36,500

founder-led barrier repair education with visible skin-comfort outcomes, ingredient credibility, and premium social proof

Remarketing $16,000

conversion-focused reminder with review proof, ingredient trust markers, and low-friction return-to-cart urgency for holiday shopping

TikTok $14,000

native creator routine storytelling focused on irritated skin, quick texture payoff, and credible 'why it works' explanation