[KG] Claude Mythos Preview — momentum
Anthropic's Claude Mythos Preview is not just a research model; it is a safety-cleared offensive cyber tool. Endorsed by the UK AI Safety Institute and used by the NSA, it builds working exploits in hours, scores 86.9% on BrowseComp, and clears all UK cyberattack simulators. Its 79.6% score on OSWorld-Verified and the doubling of its METR time horizon at 80% success signal a deployment velocity that rivals Codex 5.3, GPT-5.3, and even Claude Opus 4.6. Dependencies on Firefox, Windows, and Project Glasswing anchor it in Anthropic's stack, while Mozilla and XBOW feed its capabilities. The UK AI Safety Institute both endorses and regulates it, a tension that OpenAI's new Daybreak Cyber initiative aims to exploit. As task length doubles every four months, the open question is whether Mythos can maintain its safety-first edge as competitors race to match its exploit speed.
- Endorsed by UK AI Safety Institute and used by NSA for cyber operations
- 86.9% BrowseComp, 79.6% OSWorld-Verified, clears all UK cyberattack simulators
- Competes with Codex 5.3, GPT-5.3, and Claude Opus 4.6
- Dependent on Firefox, Windows, METR, and Project Glasswing
- Task length doubling every 4 months; METR time horizon doubled at 80% success
Raw payload
{
"entity_slug": "claude-mythos-preview",
"entity_name": "Claude Mythos Preview",
"entity_type": "ai_model",
"title": "Claude Mythos Preview: Safety-Approved Cyber Weapon Doubles Down",
"narrative": "Anthropic's Claude Mythos Preview is not just a research model—it's a safety-cleared offensive cyber tool. Endorsed by the UK AI Safety Institute and used by the NSA, it builds working exploits in hours, scores 86.9% on BrowseComp, and clears all UK cyberattack simulators. Its OSWorld-Verified 79.6% and METR time horizon doubling at 80% success signal deployment velocity that rivals Codex 5.3, GPT-5.3, and even Claude Opus 4.6. Dependencies on Firefox, Windows, and Project Glasswing anchor it in Anthropic's stack, while Mozilla and XBOW feed its capabilities. The UK AI Safety Institute both endorses and regulates it, a tension that OpenAI's new Daybreak Cyber initiative aims to exploit. As task length doubles every four months, the open question is: can Mythos maintain its safety-first edge as competitors race to match its exploit speed?",
"key_points": [
"Endorsed by UK AI Safety Institute and used by NSA for cyber operations",
"86.9% BrowseComp, 79.6% OSWorld-Verified, clears all UK cyberattack simulators",
"Competes with Codex 5.3, GPT-5.3, and Claude Opus 4.6",
"Dependent on Firefox, Windows, METR, and Project Glasswing",
"Task length doubling every 4 months; METR time horizon doubled at 80% success"
],
"angle": "momentum",
"neighborhood_size": 19,
"generated_at": "2026-06-13T03:41:27.900678+00:00"
}