Claude Sonnet 4.6 and Gemini 3.1 Pro Preview are both strong shortlist candidates, but they usually appeal to different buyers once context posture and ecosystem preference are explicit.
Go with Gemini 3.1 Pro Preview when Google alignment and aggressive input pricing are high on the checklist. Go with Claude Sonnet 4.6 when you want Anthropic’s premium production profile and a direct alternative to the Google path.
Biggest tradeoff
The choice between these two tends to pivot on stack alignment and pricing more than on a clean, universal quality gap. Gemini can look friendlier at the front of the cost curve, while Claude often earns consideration through its Anthropic positioning and existing workflow familiarity.
Quick Decision Cards
These cards call out the most useful early distinctions without hiding the fact that different public fields may point to different winners.
Highest reasoning score from the currently public benchmark fields.
Best coding posture from AA Coding Index, LiveCodeBench, or SWE Bench when present.
Lowest currently published input-token price.
Largest resolved context window from the public detail dataset.
Use-Case Framing
Best for buyers deciding between Google and Anthropic before they ever get to an OpenAI branch.
Best when long-context analysis, price posture, and platform preference all influence the shortlist.
Best for teams that want an indexable alternative to the usual OpenAI-centered comparison pages.
Full Matrix
Missing values stay visible as N/A, and softly tinted cells mark the leading value in each comparable row so the matrix scans faster.
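As a rough illustration of that row-level highlighting, the sketch below resolves the leading cell in a comparable row: N/A and non-numeric cells are skipped, and rows where lower is better (prices, TTFT) flip the comparison. The function name and the lower_is_better flag are illustrative assumptions, not the page's actual implementation.

```python
from typing import Optional

def leading_index(cells: list[str], lower_is_better: bool = False) -> Optional[int]:
    """Return the index of the leading cell in a comparable row, or None.

    N/A and non-numeric cells are skipped; ties go to the first leader so
    exactly one cell gets the tinted highlight.
    """
    parsed = []
    for i, cell in enumerate(cells):
        text = cell.strip().rstrip("%").lstrip("$").replace(",", "")
        if not text or text.upper() == "N/A":
            continue
        try:
            parsed.append((i, float(text)))
        except ValueError:
            continue  # label/tier rows never get a highlight
    if len(parsed) < 2:
        return None  # nothing to compare against
    pick = min if lower_is_better else max
    return pick(parsed, key=lambda item: item[1])[0]

# Example rows from the matrix below:
print(leading_index(["79.9%", "94.1%"]))                        # GPQA -> column 1
print(leading_index(["$3.00", "$2.00"], lower_is_better=True))  # input price -> column 1
print(leading_index(["N/A", "N/A"]))                            # -> None, no highlight
```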
Overview
Decision-first fields that summarize fit before the deeper benchmark matrix.
| Field | Anthropic: Claude Sonnet 4.6 | Google: Gemini 3.1 Pro Preview |
|---|---|---|
| Creator | Anthropic | Google |
| Overall profile | Selective fit | Selective fit |
| Best for | Long-context research / Multimodal | Long-context research / Agent workflows |
| Vision support | Yes | Yes |
| New in 2026 | Yes | Yes |
Intelligence / Reasoning
Broad reasoning quality, knowledge depth, and flagship benchmark posture.
| Field | Anthropic: Claude Sonnet 4.6 | Google: Gemini 3.1 Pro Preview |
|---|---|---|
| Reasoning label | Situational | Situational |
| Intelligence score | 44 | 57 |
| Intelligence Index | 44.4 | 57.2 |
| AA Intelligence Index | 44.4 | 57.2 |
| MMLU Pro | N/A | N/A |
| GPQA | 79.9% | 94.1% |
| HLE | 13.2% | 44.7% |
| Arena ELO | N/A | N/A |
Coding
Signals that matter for code generation, refactors, debugging, and software tasks.
| Field | Anthropic: Claude Sonnet 4.6 | Google: Gemini 3.1 Pro Preview |
|---|---|---|
| Coding score | 46 | 56 |
| AA Coding Index | 46.4 | 55.5 |
| LiveCodeBench | N/A | N/A |
| LiveBench | N/A | N/A |
| SWE Bench | N/A | N/A |
| SciCode | 46.9% | 58.9% |
Math
Published math-oriented signals, including both summary indexes and narrower benchmark cuts.
| Field | Anthropic: Claude Sonnet 4.6 | Google: Gemini 3.1 Pro Preview |
|---|---|---|
| Math score | N/A | N/A |
| AA Math Index | N/A | N/A |
| Math 500 | N/A | N/A |
| AIME | N/A | N/A |
| AIME 25 | N/A | N/A |
Agent / Tool Use
Signals that better reflect tool loops, long-running tasks, and agent-style workflows.
| Field | Anthropic: Claude Sonnet 4.6 | Google: Gemini 3.1 Pro Preview |
|---|---|---|
| Agent score | 56 | 75 |
| IFBench | 41.2% | 77.1% |
| TAU2 | 79.5% | 95.6% |
| TerminalBench Hard | 46.2% | 53.8% |
| LCR | 57.7% | 72.7% |
Latency / Speed
Interactive responsiveness (time to first token, or TTFT) and throughput signals from the public detail dataset.
| Field | Anthropic: Claude Sonnet 4.6 | Google: Gemini 3.1 Pro Preview |
|---|---|---|
| Latency tier | Balanced | Heavy |
| Speed label | Situational | Limited |
| Speed score | 57 | 40 |
| Tokens per second | 50 | 113 |
| TTFT | 1.13s | 22.16s |
| AA Tokens per second | 53 | 115 |
| AA TTFT | 0.97s | 20.66s |
| First answer token | 0.97s | 20.66s |
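Read the TTFT and tokens-per-second rows together: total response time is roughly the wait for the first token plus the streaming time for the rest. Below is a minimal sketch of that arithmetic using the AA figures above; the function name is an illustrative assumption.

```python
def estimated_response_seconds(ttft_s: float, tokens_per_second: float,
                               output_tokens: int) -> float:
    """Rough wall-clock estimate: wait for the first token, then stream the rest."""
    return ttft_s + output_tokens / tokens_per_second

# AA figures from the table above, for a 1,000-token completion.
claude = estimated_response_seconds(0.97, 53, 1000)    # ~19.8 s
gemini = estimated_response_seconds(20.66, 115, 1000)  # ~29.4 s
print(f"Claude Sonnet 4.6:      {claude:.1f} s")
print(f"Gemini 3.1 Pro Preview: {gemini:.1f} s")
```

Under these figures the crossover sits around 1,900 output tokens: below that, Claude's sub-second first token wins on wall clock; above it, Gemini's higher throughput pulls ahead.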
Pricing
Published token pricing (USD per 1M tokens) plus the lower-level OpenRouter and Artificial Analysis cost fields.
| Field | Anthropic: Claude Sonnet 4.6 | Google: Gemini 3.1 Pro Preview |
|---|---|---|
| Price tier | Mid-range | Mid-range |
| Price label | Competitive | Competitive |
| Price score | 62 | 62 |
| Input price | $3.00 | $2.00 |
| Output price | $15.00 | $12.00 |
| AA input price | $3.00 | $2.00 |
| AA output price | $15.00 | $12.00 |
| AA blended 3:1 | $6.00 | $4.50 |
| OR prompt price | $3.0000 | $2.0000 |
| OR completion price | $15.0000 | $12.0000 |
| OR request price | N/A | N/A |
| OR image price | N/A | $0.0000 |
| OR audio price | N/A | $0.0000 |
| OR web search price | $0.0100 | N/A |
| OR cache read price | $0.0000 | $0.0000 |
| OR cache write price | $0.0000 | $0.0000 |
| OR internal reasoning price | N/A | $0.0000 |
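The AA blended 3:1 row is consistent with a 3:1 input-to-output weighted average of the per-million-token prices; the sketch below reproduces both table values, with the helper name being an illustrative assumption.

```python
def blended_price_3to1(input_per_m: float, output_per_m: float) -> float:
    """Blended $/1M tokens assuming a 3:1 input:output token mix."""
    return (3 * input_per_m + output_per_m) / 4

print(blended_price_3to1(3.00, 15.00))  # 6.0  -> Claude Sonnet 4.6 row
print(blended_price_3to1(2.00, 12.00))  # 4.5  -> Gemini 3.1 Pro Preview row
```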
Context
Window size and completion limits relevant to long-context tasks and workspace planning.
| Field | Anthropic: Claude Sonnet 4.6 | Google: Gemini 3.1 Pro Preview |
|---|---|---|
| Context tier | Large | Large |
| Context label | Above average | Above average |
| Context score | 100 | 100 |
| Primary context window | 1000K Tokens | 1049K Tokens |
| OpenRouter context length | 1000K Tokens | 1049K Tokens |
| Top provider context | 1000K Tokens | 1049K Tokens |
| Max completion tokens | 128000 | 65536 |
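For workspace planning, the context rows reduce to two checks: the requested completion must stay under the max-completion cap, and prompt plus completion must fit inside the window. A minimal sketch using the resolved limits above; the function name and the numeric limits (1000K and 1049K read literally) are illustrative assumptions.

```python
def fits_context(prompt_tokens: int, requested_output: int,
                 context_window: int, max_completion: int) -> bool:
    """True if the request fits both the context window and the completion cap."""
    return (requested_output <= max_completion
            and prompt_tokens + requested_output <= context_window)

# Resolved limits from the table above (1000K / 1049K read literally).
CLAUDE = {"context_window": 1_000_000, "max_completion": 128_000}
GEMINI = {"context_window": 1_049_000, "max_completion": 65_536}

# A 950K-token corpus plus the requested completion has to clear both checks.
print(fits_context(950_000, 100_000, **CLAUDE))  # False: 1,050K exceeds the 1000K window
print(fits_context(950_000, 60_000, **GEMINI))   # True: inside the window and the 65,536 cap
```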
Modality / Vision
Modalities stay visible near the decision surface so multimodal support is easy to compare.
| Field | Anthropic: Claude Sonnet 4.6 | Google: Gemini 3.1 Pro Preview |
|---|---|---|
| Vision support | Yes | Yes |
| Modalities | text, image -> text | text, image, file, audio, video -> text |
| OpenRouter modality | text+image->text | text+image+file+audio+video->text |
| OR input modalities | text, image | audio, file, image, text, video |
| OR output modalities | text | text |
Provider Internals