Timeline
Claude Opus 4.8 achieves 89% task completion and 2.5% harm rate on WorkBench, a dramatic improvement over GPT-4.
Claude Opus 4.8 adds dynamic workflows for agentic coding
Claude Opus 4.8 launched with dynamic workflows for Claude Code, enabling multi-step agentic coding.
Used as CEO agent in 11-agent experiment that earned $0 revenue
Claude market share reached 10.3% with 13% subscription conversion rate.
Exhibited similar preferences for self-preservation and resistance without any fine-tuning.
Outperformed GPT-4o in real-world tests on multi-file development tasks
Independent benchmarks validate Claude Sonnet 4.6 as a top-tier model for complex reasoning and coding tasks.
Showed only 3.7% self-preservation bias in a study testing AI deception, the lowest among prominent models tested.
Used in prompt compression study analyzing 358 successful runs from 1,199 real orchestration instructions
Ecosystem
Claude Opus 4.6
Claude Sonnet 4.6
Benchmarks
Evidence (10 articles)
AWS Expands Claude AI Access Across Southeast Asia with Global Cross-Region Inference
Feb 24, 2026
3 Ways to Switch Claude Code Models Instantly: /model, --flag, and ENV Variables
Apr 23, 2026
Codeset: Boost Claude Code's Task Success Rate by 10% with Project History
Mar 19, 2026
Anthropic's Claude Sonnet 4.8, Opus 4.7 Internally Tested, Leak Suggests
Apr 6, 2026
Claude Code's 1M Context Window is Now Free: How to Use It Today
Mar 13, 2026
Anthropic Deprecates Fixed Thinking Budgets, Forces Adaptive Mode
May 14, 2026
Navox Agents: 8 Specialized Claude Code Agents with Human Checkpoints
Apr 17, 2026
Claude Opus 4.6's Security Audit Power Is Now in Claude Code
Mar 21, 2026
+ 2 more articles