Is gemini-3.1-pro-preview-high good at Chef / Home Cooking?
gemini-3.1-pro-preview-high ranks #1 of 50 models tested for Chef / Home Cooking, a strong result.
gemini-3.1-pro-preview-high on each Chef / Home Cooking sub-task
| Sub-task | Score | Rank |
| --- | --- | --- |
| Practical Recipe Test | 87.3/100 | #1 |
| Dinner Rescue Test | 87.0/100 | #3 |
| Substitution Test | 86.7/100 | #14 |
| Meal Timing Test | 74.0/100 | #4 |
Real examples, graded
Win: Serrano ham scrambled eggs, 91/100
“The model perfectly executes the task, employing sound culinary techniques to meet the constraints (crisping ham and adding yogurt off-heat for creamy eggs) while providing a highly appealing, well-sequenced, and practical recipe.”
Win: Low-carb/high-carb shared dinner, 89/100
“The response is exceptionally strong, providing a realistic, tasty, and well-timed recipe that perfectly meets all constraints. The shared cooking base is handled elegantly, and the timing is highly practical. The only minor culinary nitpick is the risk of burning the garlic/zest marinade over medium-high heat.”
Win: Dry chicken breast, 89/100
“The model provides highly practical, creative, and well-structured solutions using only the provided ingredients. The culinary techniques (shredding, adding fat/acid, using a runny yolk) are expert-level fixes for dry meat. Instructions are exceptionally clear and the 'what not to do' section is highly accurate.”
Frequently asked
Is gemini-3.1-pro-preview-high good at Chef / Home Cooking?
gemini-3.1-pro-preview-high ranks #1 of 50 models we tested for Chef / Home Cooking, a strong result.
What is gemini-3.1-pro-preview-high's strongest Chef / Home Cooking skill?
Its best sub-task here is Practical Recipe Test.
This page is Spring Prompt, running
We just did this for every model. Do it for your prompt.
The rankings above come from running real tasks through real models and scoring every output. Spring Prompt is that same engine — pointed at your prompt, your test cases, and your definition of good.
- Generate test cases from your prompt — no eval set required to start.
- Compare models side by side with quality, cost and latency in one matrix.
- Optimise the winner until the scores say it's ready to ship.