Anthropic published 'Code execution with MCP' in November 2025, reframing the MCP vs CLI debate from first principles. Its proposed fix moves tool definitions out of model context and into runtime code, which cut token consumption by 98.7% in Anthropic's example workflow.
Key facts
- 98.7% token reduction on a 150K-token workflow
- 300M MCP SDK downloads in 2026, up from 100M at the start of the year
- Cloudflare collapsed 1.17M schema tokens to 1K
- 5-server MCP setup burns 55K tokens before work
- Code Mode uses Bash + typed module imports
The Token Problem
For most of 2025, AI engineers argued over MCP vs CLI. Skeptics had real numbers: Playwright MCP eats 13.7K tokens, Chrome DevTools MCP eats 18K, and a 5-server setup burns 55K tokens before any work [According to @akshay_pachaar]. Defenders pushed back: CLIs break on multi-tenant apps, lack typed contracts, and waste agent turns parsing text on unfamiliar APIs.
Both sides missed the real issue. The problem was never the protocol—it was dumping every tool's full description into model context at session start. A single workflow could balloon to 150K tokens, most of which the model never needed.
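To make the upfront cost concrete, here is a minimal arithmetic sketch using only the figures quoted above. The 200K-token context window is an assumption chosen for illustration, and the function names are hypothetical.

```typescript
// Schema sizes quoted in the text above.
const quotedSchemaTokens: Record<string, number> = {
  "Playwright MCP": 13700,
  "Chrome DevTools MCP": 18000,
};

const FIVE_SERVER_SETUP_TOKENS = 55000; // quoted total for a 5-server setup
const ASSUMED_CONTEXT_WINDOW = 200000; // assumption: window size picked for illustration

// Sum the schema tokens that get loaded before the first request is handled.
function upfrontTokens(schemas: Record<string, number>): number {
  return Object.keys(schemas).reduce((sum, key) => sum + schemas[key], 0);
}

// Fraction of the context window consumed before any work happens.
function shareOfWindow(tokens: number, windowSize: number): number {
  return tokens / windowSize;
}

const twoServerCost = upfrontTokens(quotedSchemaTokens);
const fiveServerShare = shareOfWindow(FIVE_SERVER_SETUP_TOKENS, ASSUMED_CONTEXT_WINDOW);

console.log(`Two servers alone: ${twoServerCost} tokens`);
console.log(`Five-server setup: ${(fiveServerShare * 100).toFixed(1)}% of the assumed window`);
```

Under that assumed window, more than a quarter of the budget is gone before the agent reads its first instruction.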
Code Mode: The Runtime Fix
Anthropic's Code Mode flips the model's job. Instead of calling tools through its context, the model writes code that calls tools through a runtime. Tools live in the runtime; the model only sees what it imports.
In Anthropic's example, a Google Drive transcript flows into a Salesforce CRM update. The old way loaded both tool schemas and piped the entire transcript through the model twice. The new way: ten lines of TypeScript that import what they need. Same task, about 2K tokens instead of 150K: a 98.7% drop.
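A rough sketch of what such a script might look like follows. It is not Anthropic's code: the `gdrive` and `salesforce` objects are in-file stubs standing in for imported tool modules, and every function name, field, and ID is a hypothetical chosen for illustration. The point it tries to show is that the transcript moves between tools inside the runtime, and only a short summary goes back to the model.

```typescript
// Hypothetical stand-ins for tool wrappers the agent would normally import.
interface DriveDocument { documentId: string; content: string }
interface SalesforceUpdate { objectType: string; recordId: string; data: Record<string, string> }

const gdrive = {
  getDocument(args: { documentId: string }): DriveDocument {
    // Stub: build a long fake transcript (2000 repeats of a 20-character phrase).
    const content = new Array(2001).join("Discussed Q4 goals. ");
    return { documentId: args.documentId, content };
  },
};

const savedRecords: SalesforceUpdate[] = [];
const salesforce = {
  updateRecord(update: SalesforceUpdate): { ok: boolean } {
    savedRecords.push(update); // Stub: record the write instead of calling a real CRM.
    return { ok: true };
  },
};

// The kind of short script the model would write.
function syncTranscript(documentId: string, recordId: string): string {
  const transcript = gdrive.getDocument({ documentId }).content;
  salesforce.updateRecord({
    objectType: "SalesMeeting",
    recordId,
    data: { Notes: transcript },
  });
  // Only this one-line summary is returned to the model, not the transcript.
  return `Synced ${transcript.length} characters to ${recordId}`;
}

// Token reduction for the figures quoted above: 150K before, 2K after.
function reductionPercent(before: number, after: number): string {
  return ((1 - after / before) * 100).toFixed(1);
}

const summary = syncTranscript("abc123", "record-001");
console.log(summary);
console.log(`${reductionPercent(150000, 2000)}% fewer tokens`);
```

The last helper reproduces the headline figure: going from 150K to 2K tokens is a reduction of (1 - 2/150), or roughly 98.7%.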
Cloudflare pushed the idea to its limit. They collapsed their entire 2,500-endpoint API from 1.17M tokens of schemas down to 1K tokens by exposing just two functions: search and execute. The agent writes code that searches the catalog, then executes only what matches.
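The search-and-execute idea can be sketched in a few lines. This is an illustration under stated assumptions, not Cloudflare's implementation: the three-entry catalog, the endpoint IDs, and the signatures of `search` and `execute` are all hypothetical. What matters is the shape: the catalog stays in the runtime, and the model only needs to know two functions.

```typescript
// Hypothetical endpoint catalog; a real one would hold thousands of entries.
interface Endpoint {
  id: string;
  description: string;
  run: (params: Record<string, string>) => string;
}

const catalog: Endpoint[] = [
  { id: "dns.listRecords", description: "List DNS records for a zone", run: (p) => `records for ${p.zone}` },
  { id: "dns.createRecord", description: "Create a DNS record in a zone", run: (p) => `created ${p.name} in ${p.zone}` },
  { id: "cache.purge", description: "Purge cached files for a zone", run: (p) => `purged cache for ${p.zone}` },
];

// Function 1: return only the entries whose description matches the query.
function search(query: string): { id: string; description: string }[] {
  const needle = query.toLowerCase();
  return catalog
    .filter((e) => e.description.toLowerCase().indexOf(needle) !== -1)
    .map((e) => ({ id: e.id, description: e.description }));
}

// Function 2: run one endpoint by ID.
function execute(id: string, params: Record<string, string>): string {
  const matches = catalog.filter((e) => e.id === id);
  if (matches.length === 0) throw new Error(`Unknown endpoint: ${id}`);
  return matches[0].run(params);
}

// What agent-written code might look like: search first, then run one match.
const hits = search("purge");
const result = execute(hits[0].id, { zone: "example.com" });
console.log(hits.length, result);
```

However large the catalog grows, the model's view stays the same size: two function signatures plus whatever a given search returns.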
What Actually Changed
"MCP is dead" was the wrong takeaway. Anthropic reported 300M MCP SDK downloads in 2026, up from 100M at the start of the year [According to @akshay_pachaar]. The protocol is the fastest-growing piece of agent infrastructure right now.
What died was loading every tool upfront—that was always a bad idea. Code Mode mixes two primitives: Bash for binaries like git or curl, and typed module imports for proprietary APIs where type signatures load only when the agent actually imports the tool. MCP's typed contracts plus CLI's lazy loading, in one runtime.
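The typed-import half of that idea can be sketched as a lazy registry. This is a simplified model, not how any particular runtime works: the registry, the tool names, and `importTool` are hypothetical, and the Bash half is left out. It shows a signature entering the model's view only once agent code asks for that tool.

```typescript
// Hypothetical tool module: a type signature (what the model would read)
// plus an implementation that stays in the runtime.
interface ToolModule {
  signature: string;
  call: (input: string) => string;
}

// Factories are cheap to register; nothing is loaded until one is invoked.
const registry: Record<string, () => ToolModule> = {
  crm: () => ({
    signature: "updateRecord(id: string): string",
    call: (id) => `updated ${id}`,
  }),
  billing: () => ({
    signature: "createInvoice(customer: string): string",
    call: (customer) => `invoice for ${customer}`,
  }),
};

const loaded: Record<string, ToolModule> = {};

// Load a tool's definition only when agent code asks for it.
function importTool(toolName: string): ToolModule {
  if (!Object.prototype.hasOwnProperty.call(registry, toolName)) {
    throw new Error(`No such tool: ${toolName}`);
  }
  if (!Object.prototype.hasOwnProperty.call(loaded, toolName)) {
    loaded[toolName] = registry[toolName]();
  }
  return loaded[toolName];
}

// Signatures the model has actually seen so far.
function signaturesInContext(): string[] {
  return Object.keys(loaded).map((key) => loaded[key].signature);
}

const seenBefore = signaturesInContext().length;
const crm = importTool("crm");
const output = crm.call("acct-42");
const seenAfter = signaturesInContext();
console.log(seenBefore, output, seenAfter);
```

In this sketch the billing tool is registered but never imported, so its signature never counts against the model's context.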
The rule for building agents in 2026: tool definitions belong in code, not in context. The model writes a few lines that call them. The runtime does the rest.
What to watch
Watch for Anthropic's next MCP SDK release—likely Q2 2026—to see if Code Mode becomes the default runtime pattern. Also track Cloudflare's API token benchmarks and whether other API providers adopt the search-and-execute pattern.








