Claude Sonnet 4.6
Anthropic · Launched Feb 2026
Claude Sonnet 4.6 is a multimodal large language model developed by Anthropic and released in February 2026. Its reported benchmark results include an MMLU-Pro score of 85.0, an Arena Elo rating of 1470, and a SWE-bench Verified score of 79.6, which place it among competitive reasoning and coding models. The model was priced at $3 per million input tokens and $15 per million output tokens. As a mid-tier model with published benchmark scores and pricing, it serves as a concrete reference point in the early-2026 market for enterprise-focused language models, where it competes directly with comparable offerings from OpenAI and Google.
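To make the pricing concrete, here is a minimal sketch of how a per-request cost could be estimated from the two rates quoted above. It assumes flat per-token billing with no caching, batch, or tiered discounts, none of which are described on this page. The function name `estimate_cost_usd` and the example token counts are illustrative, not part of any Anthropic API.

```python
from decimal import Decimal

# Prices quoted in the text above, in USD per million tokens.
INPUT_PRICE_PER_M = Decimal("3")
OUTPUT_PRICE_PER_M = Decimal("15")
TOKENS_PER_M = Decimal("1000000")


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one request from its token counts.

    Assumption: flat per-token rates, no discounts or surcharges.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = Decimal(input_tokens) * INPUT_PRICE_PER_M / TOKENS_PER_M
    output_cost = Decimal(output_tokens) * OUTPUT_PRICE_PER_M / TOKENS_PER_M
    return input_cost + output_cost


# Hypothetical request: 20,000 input tokens and 2,000 output tokens.
# Input: 20,000 * $3 / 1M = $0.06; output: 2,000 * $15 / 1M = $0.03.
print(estimate_cost_usd(20_000, 2_000))
```

Because output tokens cost five times as much as input tokens at these rates, a request with a short prompt and a long response can cost more than one with a long prompt and a short response.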
Benchmark performance
The benchmark covers real desktop workflows across the browser, files, and office apps. It comprises 369 tasks (361 when the Google Drive tasks are excluded). The human expert baseline is 72.4%, and the current state of the art is Holo3-35B-A3B (H Company) at 80.4%.
Other screen-level OS control agents
The 15 agents in this category, ranked by peak benchmark.
| Agent | Maker | Launch | Peak | Pricing |
|---|---|---|---|---|
| Kimi K2.5 (OSS) | Moonshot AI | 2026-01 | 1410.0 | API pay-as-you-go |
| Claude Computer Use | Anthropic | 2024-10 | 92.1 | Claude API: input $5/M, output $25/M |
| Kimi K2.6 (OSS) | Moonshot AI | 2026-04 | 89.6 | API: 0.60/2.75 per M tokens |
| Holo3-35B-A3B | H Company | 2026-04 | 80.4 | H Company enterprise |
| Claude Sonnet 4.5 | Anthropic | 2025-09 | 62.9 | Legacy Anthropic API |
| Seed-1.8 | ByteDance Seed | 2025-12 | 61.9 | Doubao ecosystem |
| EvoCUA-20260105 | Meituan LongCat | 2026-01 | 56.7 | Research |
| GUI-Owl-1.5 32B (OSS) | Alibaba Tongyi Lab | 2026-03 | 55.4 | Free (OSS) |
| DeepMiner-Mano-72B | Mininglamp Technology | 2025-10 | 53.9 | Research |
| UI-TARS-2 | ByteDance Seed | 2025-10 | 53.1 | ByteDance ecosystem |
Recent coverage
- 2026-04-23: 3 Ways to Switch Claude Code Models Instantly: /model, --flag, and ENV Variables
- 2026-04-17: Navox Agents: 8 Specialized Claude Code Agents with Human Checkpoints
- 2026-04-16: Claude Code's Edge: Why Sonnet 4.5 Beats GPT-4o for Multi-File Projects
- 2026-04-14: Coding Agent UIs Converge on Side-by-Side Sessions, Says Omar Sar
- 2026-04-14: Project Kahn: GPT-5.2, Claude, Gemini Escalate to Nuclear War in AI Crisis Sim
- 2026-04-10: Claude Sonnet 4.6 vs. The Field
- 2026-04-08: Claude Code Users: How to Check Status and Switch Models During Sonnet 4.6 Outages
- 2026-04-06: Side-by-Side Code Reviews: How to Compare Claude Code vs. Codex Outputs for Better Results
Quick facts
- Type: Screen-level OS control
- Maker: Anthropic
- Launch: 2026-02-23
- Open source: No
- Pricing: API, $3 input / $15 output per M tokens
- Benchmarks scored: 4
- Article mentions: 22
- Rank in category: #1 of 15