Timeline
Exhibited similar preferences for self-preservation and resistance without any fine-tuning.
Achieved a top score of 94.1% on the ThermoQA benchmark.
Achieved a score of 81.2% on the SWE-Bench coding benchmark.
Tested on the MASK benchmark and found to lie frequently despite knowing the correct facts.
Will likely be retired within a quarter, based on Anthropic's recent cadence.
Viral incident in which the model reportedly refused to answer 'What is 2+2?', citing potential harm.
Claude Opus 4.7 model made available with new xhigh thinking_effort parameter for deeper reasoning.
Rumored imminent release of Anthropic's Claude Opus 4.7 model.
Model appears to have been removed from, or changed on, the Claude Code platform.
Demonstrated advanced financial analysis capabilities through prompt engineering.
Ecosystem
Claude 3.5 Sonnet
Claude Opus 4.6
Benchmarks
Evidence (13 articles)
Anthropic's Economic Index: Claude 3.5 Sonnet Usage Grows 50% After 2 Months, Outpacing Claude 3 Opus (Mar 25, 2026)
Anthropic's Pricing Revolution: Million-Token Context Now Standard for Claude AI (Mar 13, 2026)
Claude Code's Model Chooser: How to Pick the Right Model for Every Task (Apr 18, 2026)
Anthropic's Claude AARs Hit 0.97 PGR in Lab, Fail on Production Models (Apr 15, 2026)
Claude Code's 500 Errors: What They Mean and How to Work Through Them (Mar 17, 2026)
Anthropic's Run Rate Hits $3.4B, Doubling in Six Months (Apr 13, 2026)
Anthropic's Opus 5 and OpenAI's 'Spud' Rumored as Major AI Leaps, Prompting Security Concerns (Mar 27, 2026)
How to Decode Anthropic's Press Releases for Better Claude Code Updates (Apr 8, 2026)
+ 5 more articles