A large production API with 2,600 endpoints consumes 1.1 million tokens when converted to MCP tools. The Model Context Protocol's design, meant to standardize agent-tool connections, instead creates a context window crisis.
Key facts
- 1.1M tokens consumed by MCP tools for a large API.
- 2,600 endpoints expressed as TypeScript types in under 1,000 tokens.
- Sandboxed V8 execution blocks file system and network access.
- Tool search loads ~2,100 tokens per query, of which only ~500 are relevant.
- MCP hit 10K servers and 97M monthly SDK downloads by May 2026.
Key Takeaways
- MCP tool definitions for a 2,600-endpoint API consume 1.1M tokens, breaking agent context.
- Code mode using TypeScript types in under 1K tokens and sandboxed execution offers a fix.
The Token Math That Breaks Agents
The Model Context Protocol (MCP), introduced by Anthropic in November 2024, aimed to standardize how AI agents connect to external services. Before MCP, every team bundled their own tools privately. MCP fixed duplication but created a new problem: scale.
According to the source article, a large production API specification can contain 2.3 million tokens of documentation. Converting each endpoint into an MCP tool still yields 1.1 million tokens. Typical context windows run 100,000 to 200,000 tokens, so the tool definitions alone are more than five times the size of the largest typical window. The agent's working memory is exhausted before it processes a single user request.
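The arithmetic behind that claim can be sketched in a few lines. The figures are the ones quoted above; spreading the 1.1 million tokens evenly across endpoints is an assumption made here for illustration, since real tool definitions vary in size.

```typescript
// Figures quoted in the article.
const ENDPOINTS = 2_600;
const TOOL_TOKENS_TOTAL = 1_100_000;

// Average cost of one MCP tool definition, in tokens.
// Assumption: tokens are spread evenly across endpoints.
function tokensPerTool(totalTokens: number, endpoints: number): number {
  return totalTokens / endpoints;
}

// How many whole tool definitions fit in a context window of the given size,
// ignoring the space needed for the conversation itself.
function toolsThatFit(contextWindow: number, perTool: number): number {
  return Math.floor(contextWindow / perTool);
}

const perTool = tokensPerTool(TOOL_TOKENS_TOTAL, ENDPOINTS); // roughly 423 tokens
const fitSmall = toolsThatFit(100_000, perTool); // 236 tools
const fitLarge = toolsThatFit(200_000, perTool); // 472 tools

// How many 200,000-token windows the full tool set would need.
const overflowFactor = TOOL_TOKENS_TOTAL / 200_000; // 5.5
```

Under these assumptions, even a 200,000-token window holds fewer than a fifth of the 2,600 tools, and that is before any user message or response is counted.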
Teams first try splitting the API into one MCP server per service category, which can mean sixteen servers for a large platform. That reduces tokens per server but forces users to pre-select servers before knowing what they need. Coverage also drops: a product server might expose six tools when the full API has thirty endpoints.
Three Approaches to Progressive Discovery

Three real solutions share a common principle: progressive discovery—loading the right tool at the right moment rather than everything upfront.
The CLI Approach. The agent uses a command-line tool like a developer, navigating sub-commands and requesting parameter help. No tool definitions enter context. The catch: the agent needs shell access, ruling out cloud-hosted or sandboxed clients.
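As a rough illustration of that navigation, the sketch below models a command tree and a help lookup that returns only the level the agent asks about. The command names, parameters, and the `help` function are hypothetical and do not come from any real CLI.

```typescript
// One node in a hypothetical command tree.
interface CommandNode {
  summary: string;
  subcommands?: Record<string, CommandNode>;
  params?: string[];
}

// Hypothetical CLI layout, invented for this sketch.
const cli: CommandNode = {
  summary: "hypothetical platform CLI",
  subcommands: {
    products: {
      summary: "manage products",
      subcommands: {
        list: { summary: "list products", params: ["--limit", "--status"] },
        get: { summary: "fetch one product", params: ["--id"] },
      },
    },
    orders: {
      summary: "manage orders",
      subcommands: {
        list: { summary: "list orders", params: ["--since"] },
      },
    },
  },
};

// Return help text for one level of the tree only, so the agent sees
// a handful of lines per step instead of the whole command set.
function help(root: CommandNode, path: string[]): string {
  let node = root;
  for (const name of path) {
    const next = node.subcommands?.[name];
    if (!next) return `unknown command: ${path.join(" ")}`;
    node = next;
  }
  const subs = node.subcommands;
  if (subs) {
    return Object.keys(subs)
      .map((name) => `${name}: ${subs[name].summary}`)
      .join("\n");
  }
  return `${node.summary}; params: ${(node.params ?? []).join(", ")}`;
}
```

Calling `help(cli, [])`, then `help(cli, ["products"])`, then `help(cli, ["products", "list"])` mirrors the way an agent would drill down, paying only for the help text of the branch it actually follows.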
Tool Search. A user's question triggers a keyword search, and the eight most relevant tools load into context. That costs about 2,100 tokens per query, of which only about 500 are relevant to the question. Quality depends entirely on keyword matching accuracy. A bad match means the right tool never appears, and the agent confidently gives a wrong answer.
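A minimal sketch of keyword-based tool search shows both the mechanism and the failure mode. The scoring rule (count of shared words), the tool names, and the `searchTools` function are assumptions for illustration, not the matching logic of any particular product.

```typescript
// A tool as the search index sees it: a name and a description.
interface ToolDef {
  name: string;
  description: string;
}

// Lowercase the text and split it into alphanumeric words.
function tokenize(text: string): string[] {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter((w) => w.length > 0);
}

// Score each tool by how many distinct query words appear in its name or
// description, then keep the top `limit` tools with a non-zero score.
function searchTools(query: string, tools: ToolDef[], limit = 8): ToolDef[] {
  const queryWords = tokenize(query).filter((w, i, all) => all.indexOf(w) === i);
  return tools
    .map((tool) => {
      const words = tokenize(`${tool.name} ${tool.description}`);
      const score = queryWords.filter((w) => words.indexOf(w) !== -1).length;
      return { tool, score };
    })
    .filter((entry) => entry.score > 0)
    .sort((a, b) => b.score - a.score)
    .slice(0, limit)
    .map((entry) => entry.tool);
}

// Hypothetical tool catalog.
const tools: ToolDef[] = [
  { name: "list_products", description: "List products in the catalog" },
  { name: "refund_order", description: "Refund a customer order" },
  { name: "get_invoice", description: "Fetch an invoice by id" },
];

// Shared vocabulary: the right tool ranks first.
const hit = searchTools("refund the order for this customer", tools);

// Same intent, different words: the refund tool never loads, and an
// unrelated tool matches on the word "the".
const miss = searchTools("give the money back", tools);
```

In the second query the user still wants a refund, but no query word overlaps with the refund tool, so only `list_products` is returned. This is the bad-match case described above.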
Code Mode. Instead of pre-loading tools, the agent receives TypeScript type definitions generated from the API specification. A complete API with 2,600 endpoints—which would consume 1.1 million tokens as tools—can be expressed in under 1,000 tokens. The agent writes a small function tailored to the current request, executes it, and returns the result. Every improvement to the API specification automatically improves the agent.
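To make the idea concrete, the sketch below shows the kind of type definitions an agent might receive and the kind of small function it might write against them. Every name here (`Order`, `Api`, `listOrders`, `totalPaidCents`) is hypothetical, and the in-memory stub stands in for a real client; the calls are synchronous only to keep the example short.

```typescript
// Hypothetical types, standing in for ones generated from an API specification.
interface Order {
  id: string;
  customerId: string;
  totalCents: number;
  status: "paid" | "refunded";
}

// Synchronous for brevity; a real client would likely return Promises.
interface Api {
  listOrders(params: { customerId: string }): Order[];
}

// In-memory stub so the sketch runs without a network.
const stubApi: Api = {
  listOrders({ customerId }) {
    const all: Order[] = [
      { id: "o1", customerId: "c1", totalCents: 2000, status: "paid" },
      { id: "o2", customerId: "c1", totalCents: 1500, status: "paid" },
      { id: "o3", customerId: "c1", totalCents: 900, status: "refunded" },
      { id: "o4", customerId: "c2", totalCents: 4000, status: "paid" },
    ];
    return all.filter((o) => o.customerId === customerId);
  },
};

// The kind of request-specific function an agent might write for the
// question "how much has customer c1 paid in total?"
function totalPaidCents(api: Api, customerId: string): number {
  return api
    .listOrders({ customerId })
    .filter((o) => o.status === "paid")
    .reduce((sum, o) => sum + o.totalCents, 0);
}

const answer = totalPaidCents(stubApi, "c1"); // 3500
```

Only the two interfaces need to be in the agent's context. The filtering and summing happen inside the executed code, so intermediate order data never passes through the context window.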
The Security Sandbox That Unblocked Code Mode
Code mode stalled because running AI-written code without review is a textbook security vulnerability. AI-generated code could read file systems, steal environment variables, or exfiltrate secrets. These were not theoretical; they happened to real teams.
What changed the equation was not a new model or code quality breakthrough, but where the code runs. Small, fast sandboxes built on the V8 engine—the same engine running JavaScript in web browsers—start in milliseconds and impose hard limits: no file system access, no network access by default, programmable guardrails. AI writes code, that code runs in isolation, and the dangerous parts of the world are simply not reachable. The security objection disappears because the blast radius is zero.
As of June 2026, MCP has hit 10,000 servers and 97 million monthly SDK downloads [as previously reported by gentic.news]. The tool overload problem is widespread, but code mode offers a path that doesn't require waiting for larger context windows.
What to watch
Watch for Anthropic's next MCP specification update—expected Q3 2026—to see if code mode becomes a recommended pattern. Also track whether OpenAI or Google adopt similar type-based tooling in their agent SDKs, which would signal industry convergence.
Source: pub.towardsai.net









