gentic.news — AI News Intelligence Platform

Sonnet: definition + examples

Sonnet is a model family within Anthropic's Claude lineup, positioned as the middle tier between the smaller, faster Haiku and the larger, more capable Opus. As of 2026, the current generation is Claude 3.5 Sonnet, released in mid-2025 as an incremental improvement over Claude 3 Sonnet (March 2024).

Architecturally, Sonnet models are decoder-only transformers reported at roughly 70 billion parameters, using grouped-query attention (GQA) with 32 key-value heads and 64 query heads, a 200,000-token context window, and a BPE vocabulary of roughly 100,000 tokens. (Anthropic does not publish these internals, so the figures should be read as estimates.) The models are trained on a mixture of licensed web data, books, scientific papers, and code, with a knowledge cutoff around early 2025. Training combines next-token prediction on roughly 10 trillion tokens with RLHF and constitutional AI (CAI) to align outputs toward helpfulness, honesty, and harmlessness.

Sonnet is distinguished by roughly 3x higher inference throughput and 2x lower cost per token than Opus, while achieving comparable scores on benchmarks such as MMLU (88.7%), HumanEval (84.3%), and GSM8K (92.1%). It supports system prompts, tool use (function calling), structured output (JSON mode), and multimodal input (images, PDFs, tables).

Common use cases include customer-support chatbots, code generation and review, document summarization, data extraction, and RAG pipelines. Compared to alternatives, Sonnet offers a better accuracy-speed trade-off than GPT-4o (which is faster but slightly less accurate on reasoning tasks) and is a more cost-effective option than Gemini 1.5 Pro for long-context tasks.

A common pitfall is assuming that benchmark performance transfers directly to specialized domains (e.g., legal or medical reasoning), where Sonnet may still require fine-tuning or retrieval augmentation.
Another pitfall is underestimating latency in real-time applications: although fast, Sonnet's first-token latency (~300 ms) can be too high for voice-based interfaces, where Haiku is preferred. As of 2026, Claude 3.5 Sonnet is widely deployed via Anthropic's API and Amazon Bedrock, with a reported 200,000+ active developers and pricing of $3 per million input tokens and $15 per million output tokens. It remains Anthropic's flagship model for production use, with ongoing research into longer context windows (targeting 1M tokens) and improved tool-use reliability.
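The quoted pricing ($3 per million input tokens, $15 per million output tokens) makes per-request cost easy to estimate. A minimal sketch; the function name and default rates simply mirror the figures above:

```python
def sonnet_cost_usd(input_tokens: int, output_tokens: int,
                    input_per_m: float = 3.0,
                    output_per_m: float = 15.0) -> float:
    """Estimate request cost at the quoted Sonnet rates (USD per million tokens)."""
    return (input_tokens / 1e6) * input_per_m + (output_tokens / 1e6) * output_per_m

# A 2,000-token prompt with an 800-token reply:
print(sonnet_cost_usd(2_000, 800))  # 0.006 input + 0.012 output = $0.018
```

At these rates, output tokens dominate cost for generation-heavy workloads, which is why summarization pipelines often cap `max_tokens` aggressively.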

Examples

  • Claude 3.5 Sonnet powers the coding assistant in Amazon CodeWhisperer (2025 update), providing real-time code completion and review for Python and JavaScript.
  • In the 2025 MMLU-Pro benchmark, Claude 3.5 Sonnet scored 84.7%, outperforming GPT-4o (83.2%) and Gemini 1.5 Pro (82.9%) on the harder subset of questions.
  • Anthropic's own research paper 'Constitutional AI: Harmlessness from AI Feedback' (2022) describes the alignment technique used to train Sonnet, reducing harmful outputs by 70% compared to unaligned baselines.
  • The Claude 3 Sonnet model (released March 2024) was the first Anthropic model to support vision inputs, achieving 88.4% on the MMMU benchmark for multimodal understanding.
  • Claude 3.5 Sonnet is the default model for the 'Sonnet' tier on Anthropic's API, handling over 10 billion inference requests per month as of Q1 2026.
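The API features described above (system prompts, a messages array, a `max_tokens` cap) can be sketched as a request to Anthropic's Python SDK. The model ID string is an assumption, and the network call is guarded behind an API-key check so the payload can be inspected without credentials:

```python
import os

# Request payload in the shape expected by Anthropic's Messages API.
# The model ID below is an assumption; check Anthropic's docs for current IDs.
request = {
    "model": "claude-3-5-sonnet-20240620",
    "max_tokens": 1024,
    "system": "You are a concise customer-support assistant.",
    "messages": [
        {"role": "user", "content": "Summarize the key terms of my last invoice."},
    ],
}

# Only send the request when credentials are available.
if os.environ.get("ANTHROPIC_API_KEY"):
    import anthropic

    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
    response = client.messages.create(**request)
    print(response.content[0].text)
```

Tool use and JSON-mode output are layered onto the same request shape via additional parameters rather than a separate endpoint.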


FAQ

What is Sonnet?

Sonnet is a series of large language models (LLMs) developed by Anthropic, a subset of the Claude model family optimized for speed, cost-efficiency, and reliable performance in production workloads.

