gentic.news — AI News Intelligence Platform

ByteDance Finds AI Agents Double Learning Speed Every 3 Months

ByteDance's Seed AI team found that AI agents double their learning speed every three months through real-world interaction, according to a paper released Thursday. The finding rests on EdgeBench, a benchmark of 134 tasks that each require at least 12 hours of agent operation.

Source: scmp.com, via scmp_tech (corroborated)
What scaling law did ByteDance discover for AI agents?

ByteDance's Seed AI team found that AI agents can double their learning speed every three months through real-world interaction, per a paper released Thursday. The finding offers a new scaling path as traditional pre-training methods face diminishing returns.

TL;DR

  • ByteDance discovers new scaling law for AI agents
  • Learning speed doubles every three months in real-world tasks
  • EdgeBench benchmark features 134 ultra-long-horizon tasks

ByteDance's Seed AI team published a paper Thursday revealing that AI agents can double their learning speed every three months through real-world interaction. The finding offers a new scaling paradigm as traditional pre-training methods, which OpenAI co-founder Andrej Karpathy warned cannot last forever, hit diminishing returns.

Key facts

  • ByteDance's Seed AI team published the paper on Thursday
  • AI agents double learning speed every three months in real-world tasks
  • EdgeBench features 134 ultra-long-horizon tasks, each ≥12 hours
  • Tasks span software engineering, scientific discovery, and formal math
  • Andrej Karpathy warned brute-force pre-training scaling cannot last

The paper, posted by researchers at the TikTok parent's AI lab, tackles a blind spot in the agentic AI push. While tech firms race to deploy autonomous software that executes tasks on a human's behalf, ByteDance researchers noted that how these systems "learn from real-world environments after deployment remains far less understood," according to SCMP.

To quantify that learning, the team built EdgeBench, a benchmark suite of 134 ultra-long-horizon tasks across software engineering, scientific discovery, formal mathematics, and professional knowledge work. Each task demands at least 12 hours of continuous AI agent operation — far longer than typical agent benchmarks like SWE-Bench or GAIA, which measure single-session performance.

The core result: agents double their task-completion speed every three months when deployed in real-world environments. This "deployment scaling law" mirrors the compute scaling law that drove GPT-4o and its peers, but runs on post-deployment interaction data rather than pre-training compute.

Why the timing matters

The finding lands as the global AI industry searches for new ways to improve models. For years, developers relied on feeding systems more data and computing power during initial training. Prominent figures — including OpenAI co-founder Andrej Karpathy — have warned that this brute-force approach cannot last forever. The recent Epoch AI EBR-Bench results, where top models scored only 30-50% on experience-based reasoning, underscore the gap.

ByteDance's result suggests that agentic AI may unlock a second scaling axis: time spent interacting with real environments. If the doubling holds, agents deployed today would be 16x faster in a year and 256x faster in two years — without additional pre-training compute.
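The multiples quoted above are plain compounding: a three-month doubling period means four doublings per year. A minimal Python sketch of that arithmetic follows, assuming the doubling period stays constant over the whole horizon. The function name `speedup_after` is illustrative and does not come from the paper.

```python
def speedup_after(months: float, doubling_period_months: float = 3.0) -> float:
    """Multiplicative speedup after `months` of deployment.

    Assumption: the three-month doubling period reported in the article
    holds unchanged for the entire horizon, which is not established.
    """
    doublings = months / doubling_period_months
    return 2.0 ** doublings


# Print the speedup at one quarter, one year, and two years.
for horizon in (3, 12, 24):
    print(f"{horizon:>2} months: {speedup_after(horizon):.0f}x")
```

Under that constant-doubling assumption, the one-year and two-year values match the 16x and 256x figures in the text.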

Caveats and open questions

The paper does not disclose the exact environment conditions, agent architectures, or whether the scaling law generalizes across different agent designs. The study used ByteDance's own agent systems; independent replication will be critical. The company also did not reveal whether the law holds beyond the 134 tasks in EdgeBench or whether it applies to frontier models like GPT-5.6 Sol or Claude.

Still, the finding provides a concrete counterpoint to the narrative that AI scaling is exhausted. If deployment-time learning can sustain progress, the industry's massive investment in agentic infrastructure — from Anthropic's Claude Code to OpenAI's Codex API — may have a compounding return that pre-training alone could not deliver.

What to watch

Watch for independent replication attempts from OpenAI, Anthropic, or Google on the deployment scaling law. If confirmed, expect agent infrastructure investment to accelerate — and a shift in how model performance improvements are measured, from pre-training compute to months-in-production velocity.

ByteDance, the Chinese tech giant behind viral app TikTok, is also at the forefront of AI research in China. Photo: Reuters



AI-assisted reporting. Generated by gentic.news from 2 verified sources, fact-checked against the Living Graph of 4,300+ entities. Edited by Ala SMITH.


AI Analysis

ByteDance's deployment scaling law is the most concrete counterargument yet to the 'scaling is dead' thesis that has dogged AI stocks since mid-2025. While pre-training compute scaling has shown diminishing returns (GPT-5.6 Sol's gains over GPT-4o were modest relative to compute spent), the agentic scaling axis operates on a fundamentally different resource: post-deployment interaction time.

The 3-month doubling rate is striking but should be treated with caution. The paper does not control for environment complexity, agent architecture, or whether the law holds across different task distributions. If the doubling is partly an artifact of early-stage learning on simple tasks, it may slow as agents saturate. Conversely, if it holds on EdgeBench's 12-hour tasks, it suggests meaningful long-horizon reasoning improvement.

The strategic implication: companies with the most deployed agents, namely ByteDance (TikTok), OpenAI, and Anthropic, may compound their advantage through deployment data moats. Pre-training compute is a commodity anyone can buy; deployment-time learning data is proprietary and cumulative.
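
The saturation caveat can be made concrete with a toy comparison. The sketch below contrasts unbounded three-month doubling with a logistic curve that starts at nearly the same rate and then levels off at a ceiling. The logistic form, the 50x ceiling, and all function names are illustrative assumptions, not results or methods from the paper.

```python
import math

DOUBLING_MONTHS = 3.0
RATE = math.log(2) / DOUBLING_MONTHS  # continuous growth rate per month


def exponential_speedup(months: float) -> float:
    """Speedup if the three-month doubling never slows (assumption)."""
    return math.exp(RATE * months)


def saturating_speedup(months: float, ceiling: float = 50.0) -> float:
    """Logistic speedup that starts at 1x and levels off at `ceiling`.

    Hypothetical model: `ceiling` is an invented value, chosen only to
    show how early doubling can fade as gains saturate.
    """
    return ceiling / (1.0 + (ceiling - 1.0) * math.exp(-RATE * months))


# Compare the two curves at a few deployment horizons (in months).
for m in (0, 3, 12, 24, 48):
    print(f"{m:>2} mo  exponential {exponential_speedup(m):>9.1f}x  "
          f"saturating {saturating_speedup(m):>5.1f}x")
```

In this toy model the two curves are hard to tell apart over the first few months and diverge sharply later, which is why a doubling rate measured early in deployment cannot by itself distinguish the two scenarios.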