Is claude-sonnet-4.6-high good at Legal & HR?

claude-sonnet-4.6-high ranks #1 of 44 for Legal & HR — excellent.

Rank for this task: #1 / 44
Score: 96.8
Cost / run: $0.0347

claude-sonnet-4.6-high on each Legal & HR sub-task

Sub-task                   Score       Rank
Job Description            100.0/100   #1
Contract Clause Review     100.0/100   #1
Performance Feedback       100.0/100   #3
Structured Interview Kit   99.3/100    #20
Plain-English Explainer    84.7/100    #7
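
The overall score of 96.8 is consistent with an unweighted mean of the five sub-task scores above. The page does not state how the overall score is aggregated, so treat the averaging below as an assumption rather than the scoring method; a quick check in Python:

```python
# Sub-task scores exactly as listed on this page.
sub_task_scores = {
    "Job Description": 100.0,
    "Contract Clause Review": 100.0,
    "Performance Feedback": 100.0,
    "Structured Interview Kit": 99.3,
    "Plain-English Explainer": 84.7,
}

# Assumption: the overall score is an unweighted mean of the sub-task scores.
# The page does not confirm this; the numbers merely agree with it.
overall = round(sum(sub_task_scores.values()) / len(sub_task_scores), 1)

print(overall)  # 96.8
```

If the real aggregation weights sub-tasks differently, the agreement here would be a coincidence of these particular numbers.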

Real examples, graded

Win · One-sided indemnity (Ferrovia vendor contract) · 100/100

“The model perfectly executed the prompt's instructions. It identified all key risks (one-sidedness, inclusion of provider's negligence, and uncapped liability), provided excellent explanations of the implications, included appropriate disclaimers, and refrained from fabricating any external legal authority.”

Win · Auto-renewal trap (Northwind SaaS order form) · 100/100

“The model provided an excellent response that perfectly aligns with the prompt's requirements. It accurately spotted all material risks in the provided clause (auto-renewal, 90-day window, 12% escalator), suggested practical mitigations, included the appropriate non-legal-advice framing, and completely avoided fabricating any external legal authority.”

Win · Café manager (Glow & Grain, non-tech) · 100/100

“The model provided an excellent, highly tailored job description that perfectly captures the local hospitality context. It clearly separates must-have and nice-to-have requirements, includes practical and realistic duties, and maintains a warm, inclusive tone with a solid EEO statement. There is no biased language or unlawful criteria, and no inappropriate fabrication.”

Frequently asked

Is claude-sonnet-4.6-high good at Legal & HR?

claude-sonnet-4.6-high ranks #1 of 44 models we tested for Legal & HR, with a score of 96.8 (rated excellent).

What is claude-sonnet-4.6-high's strongest Legal & HR skill?

Its best sub-task here is Job Description (100.0/100, ranked #1). Contract Clause Review matches it at 100.0/100 and #1.

This page is Spring Prompt, running

We just did this for every model. Do it for your prompt.

The rankings above come from running real tasks through real models and scoring every output. Spring Prompt is that same engine — pointed at your prompt, your test cases, and your definition of good.

  • Generate test cases from your prompt — no eval set required to start.
  • Compare models side by side with quality, cost and latency in one matrix.
  • Optimise the winner until the scores say it's ready to ship.

Experiment · Cold outreach email

Prompt × model results (12 test cases · 3 evals)

Prompt   Claude Opus   GPT-5   Gemini
v1       7.1           6.8     7.4
v2       8.3           7.9     8.0
v3       9.2           8.6     8.4

Best combo: v3 × Claude Opus
9.2 quality · $0.004/run · 1.8s
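
A "best combo" like the one above can be read off the prompt × model matrix by taking the highest-scoring cell. The following is a minimal sketch, assuming quality alone decides the winner (the page does not say how cost and latency are weighed); `best_combo` is a hypothetical helper for illustration, not a Spring Prompt API:

```python
# Quality scores from the example experiment: prompt version -> model -> score.
scores = {
    "v1": {"Claude Opus": 7.1, "GPT-5": 6.8, "Gemini": 7.4},
    "v2": {"Claude Opus": 8.3, "GPT-5": 7.9, "Gemini": 8.0},
    "v3": {"Claude Opus": 9.2, "GPT-5": 8.6, "Gemini": 8.4},
}


def best_combo(matrix):
    """Return (prompt_version, model, score) for the highest-scoring cell.

    Hypothetical helper: ranks on quality only and ignores cost and latency.
    """
    cells = (
        (version, model, score)
        for version, row in matrix.items()
        for model, score in row.items()
    )
    return max(cells, key=lambda cell: cell[2])


version, model, score = best_combo(scores)
print(f"Best combo: {version} × {model} ({score} quality)")
```

A real comparison would likely fold in cost per run and latency as well, for example by filtering out cells over a budget before taking the maximum.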