
Anthropic: Claude Authors 80%+ of Code, Task Length Doubling Every 4 Months
Anthropic reports Claude authors 80%+ of code; task-length capability doubles every 4 months. Mythos Preview works 16+ hours autonomously.
Thursday, June 4, 2026
8 stories covered by gentic.news intelligence

Anthropic reports Claude authors 80%+ of code; task-length capability doubles every 4 months. Mythos Preview works 16+ hours autonomously.
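
As a rough illustration of what a 4-month doubling period implies, the sketch below computes the growth multiple over a few horizons. It assumes the doubling rate stays constant, which the report does not claim; the function name `growth_factor` is ours, for illustration only.

```python
def growth_factor(months: float, doubling_months: float = 4.0) -> float:
    """Multiplicative growth after `months`, assuming a fixed doubling period."""
    return 2.0 ** (months / doubling_months)


# Assumption: the 4-month doubling period holds unchanged over each horizon.
for months in (4, 8, 12, 24):
    print(f"{months:>2} months: {growth_factor(months):g}x")
```

Under that assumption, a 4-month doubling works out to 8x per year and 64x over two years.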

DeepSeek raising ~$6.9B at $48-55B valuation in first external funding round, as it tops Ramp's US business spending index with enterprises switching from OpenAI/Anthropic.

DeepMind catalogues 6 attack types where hidden web content hijacks AI agents up to 86% of the time, reframing safety from model alignment to environment trust.

Spirit AI tops RoboArena benchmark at GTC Taipei 2026, beating Nvidia and Physical Intelligence, marking China's rise in embodied AI.

Ontology-grounded AI agent testing achieves 48.3% regulatory coverage vs. 33.1% baseline in 1800-scenario pilot. Coverage advantage over RAG not robust after Bonferroni correction.
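
For readers unfamiliar with the Bonferroni correction mentioned above, here is a minimal sketch of how it works: with m comparisons, each p-value is tested against alpha / m rather than alpha. The p-values below are hypothetical and are not taken from the pilot; the helper name `bonferroni_significant` is also ours.

```python
def bonferroni_significant(p_values, alpha=0.05):
    """Return, per p-value, whether it clears the Bonferroni threshold alpha / m."""
    m = len(p_values)
    if m == 0:
        return []
    threshold = alpha / m  # stricter per-test threshold for m comparisons
    return [p <= threshold for p in p_values]


# Hypothetical p-values for three comparisons (NOT from the study).
p_values = [0.001, 0.03, 0.2]
print(bonferroni_significant(p_values))
```

Here p = 0.03 would pass an uncorrected 0.05 threshold but fails the corrected one (0.05 / 3), which is the sense in which a result can be "not robust" after correction.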

OpenAI says it sees early signs of recursive self-improvement in current AI systems, warning governance institutions are not ready. No evidence was provided.

Anthropic's internal RSI memo, flagged by Ethan Mollick, outlines concrete timelines for AI systems reaching dangerous capability thresholds within 12-24 months.

Dynamic workflows generate harnesses on the fly for agent orchestrators, enabling branching and verified tasks across coding agents like Claude Code and Codex.