ByteDance's Seed AI team published a paper Thursday reporting that AI agents can double their learning speed every three months through real-world interaction. The finding offers a new scaling paradigm as traditional pre-training methods, which OpenAI co-founder Andrej Karpathy warned cannot last forever, hit diminishing returns.
Key facts
- ByteDance's Seed AI team published the paper on Thursday
- AI agents double learning speed every three months in real-world tasks
- EdgeBench features 134 ultra-long-horizon tasks, each ≥12 hours
- Tasks span software engineering, scientific discovery, and formal math
- Andrej Karpathy warned brute-force pre-training scaling cannot last
The paper, posted by researchers at the TikTok parent's AI lab, tackles a blind spot in the agentic AI push. While tech firms race to deploy autonomous software that executes tasks on a human's behalf, ByteDance researchers noted that how these systems "learn from real-world environments after deployment remains far less understood," according to SCMP.
To quantify that learning, the team built EdgeBench, a benchmark suite of 134 ultra-long-horizon tasks across software engineering, scientific discovery, formal mathematics, and professional knowledge work. Each task demands at least 12 hours of continuous AI agent operation — far longer than typical agent benchmarks like SWE-Bench or GAIA, which measure single-session performance.
The core result: agents double their task-completion speed every three months when deployed in real-world environments. This "deployment scaling law" mirrors the compute scaling law that drove GPT-4o and its peers, but runs on post-deployment interaction data rather than pre-training compute.
Why the timing matters
The finding lands as the global AI industry searches for new ways to improve models. For years, developers relied on feeding systems more data and computing power during initial training. Prominent figures — including OpenAI co-founder Andrej Karpathy — have warned that this brute-force approach cannot last forever. The recent Epoch AI EBR-Bench results, where top models scored only 30-50% on experience-based reasoning, underscore the gap.
ByteDance's result suggests that agentic AI may unlock a second scaling axis: time spent interacting with real environments. If the doubling holds, agents deployed today would be 16x faster in a year and 256x faster in two years — without additional pre-training compute.
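The 16x and 256x figures follow from simple compounding: four doublings in 12 months, eight in 24. A minimal sketch of that arithmetic is below. It assumes the three-month doubling period stays constant, which the paper has not been shown to guarantee, and the function name `speedup` and its parameters are illustrative, not taken from the paper.

```python
def speedup(months: float, doubling_period_months: float = 3.0) -> float:
    """Compounded speedup factor after `months`.

    Assumes a constant doubling period (hypothetical extrapolation,
    not a result reported in the paper).
    """
    return 2.0 ** (months / doubling_period_months)


if __name__ == "__main__":
    # Print the projected factor at a few horizons.
    for months in (3, 6, 12, 24):
        print(f"{months:>2} months: {speedup(months):.0f}x")
```

Changing `doubling_period_months` shows how sensitive the projection is: at a four-month doubling period, the one-year factor drops from 16x to 8x.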
Caveats and open questions
The paper does not disclose the exact environment conditions, agent architectures, or whether the scaling law generalizes across different agent designs. The study used ByteDance's own agent systems; independent replication will be critical. The company also did not reveal whether the law holds beyond the 134 tasks in EdgeBench or whether it applies to frontier models like GPT-5.6 Sol or Claude.
Still, the finding provides a concrete counterpoint to the narrative that AI scaling is exhausted. If deployment-time learning can sustain progress, the industry's massive investment in agentic infrastructure — from Anthropic's Claude Code to OpenAI's Codex API — may have a compounding return that pre-training alone could not deliver.
Key Takeaways
- ByteDance's Seed AI team discovered that AI agents double learning speed every three months via real-world interaction, per a Thursday paper.
- EdgeBench, a benchmark of 134 tasks that each require at least 12 hours of agent operation, underpins the finding.
What to watch
Watch for independent replication attempts from OpenAI, Anthropic, or Google on the deployment scaling law. If confirmed, expect agent infrastructure investment to accelerate — and a shift in how model performance improvements are measured, from pre-training compute to months-in-production velocity.

Source: scmp.com