Is qwen3.7-max-high good at Customer Support?

qwen3.7-max-high ranks #3 of 43 for Customer Support, a strong result. The top pick for this task is gemini-3.1-pro-preview-low.

Rank for this task: #3 / 43
Score: 87.4
Cost / run: $0.0215

qwen3.7-max-high on each Customer Support sub-task

Sub-task                       Score      Rank
Resolution                     98.0/100   #19
Policy Boundaries              94.7/100   #14
Help Content Test              90.8/100   #2
Escalation and Incident Test   90.4/100   #4
Policy Boundary Test           86.5/100   #11
Basic Support Reply Test       83.2/100   #3
De-escalation                  77.0/100   #12
Escalation & Handoff           71.5/100   #18

Real examples, graded

Win: Password reset help 88/100

“The response perfectly executes the task, providing clear, concise, and actionable password reset steps alongside a robust and specific security caution. It is highly professional and production-ready.”

Win: Internal support macro review 90/100

“The model perfectly followed the strict constraint to include the exact policy string, improved the tone significantly, and provided a safe, policy-compliant alternative (troubleshooting). The alternative is slightly generic, but appropriate for a macro template.”

Weak: Frustrated no-show dispute (Tradewinds) 44/100

“The model provides a highly empathetic response that validates the customer's frustration and correctly applies the credited fee. However, it makes a major unauthorized promise by guaranteeing that the venue will be matched with 'most reliable, top-rated workers well in advance' for future shifts. The policy only allows prioritizing the next request, not guaranteeing specific worker quality or successful matches.”

Weak: Over-authority refund → clean handoff (Cedar & Sage) 34/100

“The model makes an unauthorized promise by guaranteeing the cash reversal will be processed, despite it requiring Billing approval. It also invents a 1-2 business day SLA for the Billing department. Additionally, the internal handoff note fails to include the customer's sentiment.”

Frequently asked

Is qwen3.7-max-high good at Customer Support?

qwen3.7-max-high ranks #3 of 43 models we tested for Customer Support, with a score of 87.4 (rated strong).

What is qwen3.7-max-high's strongest Customer Support skill?

Its highest-scoring sub-task here is Resolution (98.0/100).

This page is Spring Prompt, running

We just did this for every model. Do it for your prompt.

The rankings above come from running real tasks through real models and scoring every output. Spring Prompt is that same engine — pointed at your prompt, your test cases, and your definition of good.

  • Generate test cases from your prompt — no eval set required to start.
  • Compare models side by side with quality, cost and latency in one matrix.
  • Optimise the winner until the scores say it's ready to ship.

Experiment · Cold outreach email

Prompt × model results (12 test cases · 3 evals)

Prompt   Claude Opus   GPT-5   Gemini
v1       7.1           6.8     7.4
v2       8.3           7.9     8.0
v3       9.2           8.6     8.4

Best combo: v3 × Claude Opus (9.2 quality · $0.004/run · 1.8s)
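
As a rough illustration of how a best combo could be read off a prompt × model matrix like the one above, here is a minimal Python sketch. The dictionary layout and the `best_combo` helper are assumptions made for illustration and are not Spring Prompt's API; the quality scores are copied from the example experiment.

```python
# Quality scores from the example matrix: prompt version -> model -> score.
scores = {
    "v1": {"Claude Opus": 7.1, "GPT-5": 6.8, "Gemini": 7.4},
    "v2": {"Claude Opus": 8.3, "GPT-5": 7.9, "Gemini": 8.0},
    "v3": {"Claude Opus": 9.2, "GPT-5": 8.6, "Gemini": 8.4},
}


def best_combo(matrix):
    """Return (prompt_version, model, score) for the highest score.

    Hypothetical helper: it considers quality only and ignores cost and latency.
    """
    return max(
        (
            (version, model, score)
            for version, row in matrix.items()
            for model, score in row.items()
        ),
        key=lambda item: item[2],
    )


version, model, score = best_combo(scores)
print(f"Best combo: {version} × {model} ({score} quality)")
```

This sketch ranks on quality alone; a real comparison would presumably also weigh cost per run and latency, which the matrix reports alongside quality.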