Timeline
Claude Opus 4.8 achieves 89% task completion and 2.5% harm rate on WorkBench, a dramatic improvement over GPT-4.
Claude Opus 4.8 adds dynamic workflows for agentic coding
Claude Opus 4.8 launched with dynamic workflows for Claude Code, enabling multi-step agentic coding.
Anthropic released Claude 3.5 Sonnet with 70% lower cost and 3x speed boost
Used as CTO, Researcher, and Sprint Engineer agents in 11-agent experiment
Used as CEO agent in 11-agent experiment that earned $0 revenue
Claude market share reached 10.3% with 13% subscription conversion rate.
Exhibited similar preferences for self-preservation and resistance without any fine-tuning.
Achieved 81.2% score on SWE-Bench coding benchmark
Tested on the MASK benchmark and found to lie frequently despite knowing the correct facts
Ecosystem
Claude 3.5 Sonnet
Claude Opus 4.6
Benchmarks
Evidence (15 articles)
Anthropic's Economic Index: Claude 3.5 Sonnet Usage Grows 50% After 2 Months, Outpacing Claude 3 Opus (Mar 25, 2026)
11-Agent Company Earned $0: CLAUDE.md Mistakes Cost Revenue (May 18, 2026)
Anthropic's Pricing Revolution: Million-Token Context Now Standard for Claude AI (Mar 13, 2026)
Claude Code's Model Chooser: How to Pick the Right Model for Every Task (Apr 18, 2026)
Anthropic's Claude AARs Hit 0.97 PGR in Lab, Fail on Production Models (Apr 15, 2026)
Claude Code's 500 Errors: What They Mean and How to Work Through Them (Mar 17, 2026)
Anthropic's Run Rate Hits $3.4B, Doubling in Six Months (Apr 13, 2026)
Anthropic's Opus 5 and OpenAI's 'Spud' Rumored as Major AI Leaps, Prompting Security Concerns (Mar 27, 2026)
+ 7 more articles