How Claude Code's Tool Search Saves 90% of Your Context Window

Tool search automatically defers MCP tool definitions, replacing them with a single search tool that loads tools on demand and preserves your context window for actual work.

Ala SMITH & AI Research Desk · Apr 8, 2026 · 4 min read · AI-Generated

Source: dev.to via devto_claudecode, reddit_claude, hn_claude_code, medium_claude, @heygurisingh · Widely Reported

TL;DR

Claude Code's tool search system defers MCP tool definitions until needed, saving tens of thousands of tokens per turn and dramatically improving context efficiency.

What Tool Search Actually Does

Every MCP tool you connect to Claude Code comes with a definition — name, description, and JSON schema — that costs 200-800 tokens. With multiple MCP servers (GitHub, Slack, Jira, etc.), you can easily burn 60,000+ tokens on tool definitions alone, every single turn, before Claude even reads your message.

Tool search solves this by deferring most tool definitions. In an example setup with 147 deferrable tools, instead of sending every schema on every turn, Claude Code sends:

~25 built-in tool definitions
A single ToolSearch tool
A prompt telling Claude: "147 deferred tools available — use ToolSearch to load them"

This reduces per-turn token overhead from ~90,000 to ~15,000 tokens, a saving of about 75,000 tokens on every turn.

How It Works In Practice

When Claude needs a specific tool, it calls:

ToolSearch({"query": "github create issue"})

The system returns a tool_reference for mcp__github__create_issue. On the next turn, that tool's full schema is available, and Claude can call it normally.
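The discovery step can be pictured with a toy model. This is only a sketch: the `DeferredToolRegistry` class and its keyword matching are assumptions made for illustration, not Claude Code's actual implementation, and the Slack tool name is hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class DeferredToolRegistry:
    """Toy model: deferred tools are known by name and description only."""
    deferred: dict[str, str]                       # tool name -> description
    loaded: set[str] = field(default_factory=set)  # tools whose schema is in context

    def tool_search(self, query: str) -> list[str]:
        """Return deferred tools matching every query word and mark them loaded."""
        words = query.lower().split()
        matches = [
            name for name, desc in self.deferred.items()
            if all(w in f"{name} {desc}".lower() for w in words)
        ]
        # Assumption: a matched tool's full schema becomes available next turn.
        self.loaded.update(matches)
        return matches


registry = DeferredToolRegistry(deferred={
    "mcp__github__create_issue": "Create a new issue in a GitHub repository",
    "mcp__slack__post_message": "Post a message to a Slack channel",  # hypothetical
})
refs = registry.tool_search("github create issue")
print(refs)
```

Only the matched tool is added to `loaded`; the Slack tool stays deferred and costs nothing until something searches for it.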

The cost? One extra turn and ~200 tokens for discovery.
The savings? ~1.5 million tokens over a 20-turn conversation.
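The arithmetic behind these figures is easy to check. The sketch below uses only the numbers quoted in this article and assumes, for simplicity, a single discovery call over the whole conversation.

```python
# Figures quoted in the article (approximate).
BEFORE_PER_TURN = 90_000   # tokens of tool definitions without tool search
AFTER_PER_TURN = 15_000    # tokens with tool search enabled
DISCOVERY_COST = 200       # tokens for one ToolSearch round trip
TURNS = 20

saved_per_turn = BEFORE_PER_TURN - AFTER_PER_TURN        # 75,000
total_saved = saved_per_turn * TURNS                     # 1,500,000
reduction_pct = 100 * saved_per_turn / BEFORE_PER_TURN   # about 83.3
# Assumption: exactly one discovery call in the conversation.
net_saved = total_saved - DISCOVERY_COST

print(f"saved per turn: {saved_per_turn:,}")
print(f"saved over {TURNS} turns: {total_saved:,}")
print(f"per-turn reduction: {reduction_pct:.1f}%")
print(f"net of one discovery call: {net_saved:,}")
```

For these particular figures the per-turn reduction works out to about 83%. The exact percentage depends on how many tools you have connected.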

Which Tools Get Deferred?

The system uses a priority-ordered checklist:

Explicit opt-out first: MCP tools can declare _meta['anthropic/alwaysLoad'] to force loading every turn
MCP tools deferred by default: Most MCP tools are workflow-specific and numerous
ToolSearch never deferred: It's the bootstrap mechanism
Core communication tools never deferred: Agent, Brief — Claude needs these immediately
Built-in tools with shouldDefer flag: Rarely used but available

This follows Anthropic's principle of "fail closed, fail toward asking." If anything is uncertain, the system loads all tools rather than hiding them.
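The checklist above can be sketched as a single function that tests each rule in the listed order. The `Tool` class and its field names are assumptions for illustration; only the `anthropic/alwaysLoad` key, the `shouldDefer` flag, and the tool names ToolSearch, Agent, and Brief come from the article.

```python
from dataclasses import dataclass, field

# Tools the article says are never deferred.
NEVER_DEFERRED = {"ToolSearch", "Agent", "Brief"}


@dataclass
class Tool:
    """Hypothetical tool record used only for this sketch."""
    name: str
    is_mcp: bool = False
    should_defer: bool = False           # mirrors the shouldDefer flag
    meta: dict = field(default_factory=dict)


def is_deferred(tool: Tool) -> bool:
    """Apply the rules in the priority order listed in the article."""
    if tool.meta.get("anthropic/alwaysLoad"):
        return False                     # 1. explicit opt-out wins
    if tool.is_mcp:
        return True                      # 2. MCP tools are deferred by default
    if tool.name in NEVER_DEFERRED:
        return False                     # 3-4. bootstrap and core tools always load
    return tool.should_defer             # 5. built-ins defer only when flagged


examples = [
    Tool("mcp__github__create_issue", is_mcp=True),
    Tool("mcp__db__query", is_mcp=True, meta={"anthropic/alwaysLoad": True}),
    Tool("ToolSearch", should_defer=True),
    Tool("RarelyUsedBuiltin", should_defer=True),  # hypothetical built-in
]
for t in examples:
    print(t.name, "deferred" if is_deferred(t) else "loaded")
```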

Three Modes You Can Configure

Tool search operates in three modes controlled by ENABLE_TOOL_SEARCH:

Mode 1: `tst` (Default)

Always defer MCP and shouldDefer tools. This is the right default — if you're using MCP tools, you've already accepted the latency tradeoff for a larger effective context window.

Mode 2: `tst-auto`

Threshold-based deferral. Only defer when tools exceed a token budget. Use ENABLE_TOOL_SEARCH=auto, or ENABLE_TOOL_SEARCH=auto:50 to set the threshold yourself (the number after the colon is a percentage from 1 to 99).

Mode 3: `standard`

Never defer. Use ENABLE_TOOL_SEARCH=false to disable completely.
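To make the mapping from variable value to mode concrete, here is a minimal sketch of a parser. The function `parse_tool_search_mode` is hypothetical and covers only the values named in this article. What Claude Code does with other values, and what threshold a bare `auto` uses, is not stated in the source, so the sketch rejects the former and returns `None` for the latter.

```python
import os
from typing import Optional, Tuple


def parse_tool_search_mode(value: Optional[str]) -> Tuple[str, Optional[int]]:
    """Map an ENABLE_TOOL_SEARCH value to (mode, threshold percent or None)."""
    if value is None or value == "":
        return ("tst", None)             # unset: the default mode
    if value == "false":
        return ("standard", None)        # never defer
    if value == "auto":
        return ("tst-auto", None)        # threshold left to the tool's own default
    if value.startswith("auto:"):
        percent = int(value.split(":", 1)[1])
        if not 1 <= percent <= 99:
            raise ValueError(f"threshold must be 1-99, got {percent}")
        return ("tst-auto", percent)
    # Assumption: anything else is outside the scope of this sketch.
    raise ValueError(f"value not covered by this sketch: {value!r}")


for sample in (None, "false", "auto", "auto:50"):
    print(repr(sample), "->", parse_tool_search_mode(sample))

# To inspect the current shell's setting instead:
# parse_tool_search_mode(os.environ.get("ENABLE_TOOL_SEARCH"))
```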

The Snapshot Mechanism

Discovered tools are preserved across context compaction through a snapshot system. When compaction occurs, the system takes a snapshot of:

All currently loaded tools
The ToolSearch tool
The deferral state

This ensures Claude doesn't lose access to tools it's already discovered, even as the conversation context gets trimmed.
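A rough way to picture the snapshot is shown below. The `ToolSnapshot` class and `compact` function are assumptions for illustration, and trimming to the last few messages stands in for whatever compaction strategy Claude Code really uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolSnapshot:
    """Hypothetical record of tool state captured at compaction time."""
    loaded_tools: frozenset    # discovered tools, plus ToolSearch itself
    deferred_tools: frozenset  # the deferral state


def compact(messages, loaded, deferred, keep_last=2):
    """Trim the message history but carry the tool state forward unchanged."""
    snapshot = ToolSnapshot(
        loaded_tools=frozenset(loaded) | {"ToolSearch"},
        deferred_tools=frozenset(deferred),
    )
    # Assumption: compaction is modelled as keeping only the latest messages.
    trimmed = messages[-keep_last:]
    return trimmed, snapshot


history = ["m1", "m2", "m3", "m4"]
trimmed, snap = compact(
    history,
    loaded={"mcp__github__create_issue"},
    deferred={"mcp__slack__post_message"},  # hypothetical tool name
)
print(trimmed)
print(sorted(snap.loaded_tools))
```

The conversation shrinks, but the previously discovered tool is still listed as loaded afterwards.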

What This Means For Your MCP Servers

If you're building MCP servers, consider:

Mark critical tools with alwaysLoad: If a tool is needed on nearly every turn (like a primary database query), opt it out of deferral
Write clear tool descriptions: Tool search uses semantic matching, so good descriptions improve discovery accuracy
Group related tools: Tools with similar prefixes or descriptions will be discovered together
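As an illustration of the first two points, a tool definition might look like the sketch below, written as a Python dict. The tool name, description, and schema are hypothetical. The `anthropic/alwaysLoad` key comes from this article, but the value `True` is an assumption about how the flag is declared.

```python
# Hypothetical MCP tool definition for a frequently used database query tool.
primary_query_tool = {
    "name": "run_primary_query",
    # A specific description with the keywords a search is likely to contain.
    "description": "Run a read-only SQL query against the primary application database.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "sql": {"type": "string", "description": "SQL SELECT statement to run"},
        },
        "required": ["sql"],
    },
    # Assumption: a boolean true opts the tool out of deferral.
    "_meta": {"anthropic/alwaysLoad": True},
}

print(primary_query_tool["_meta"])
```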

Try It Now

Check your current configuration. Empty output means the variable is unset, in which case the default tst mode applies:

echo $ENABLE_TOOL_SEARCH

If you're not seeing tool search benefits, ensure:

You have MCP tools configured
You're not in standard mode
Your MCP servers aren't all marked alwaysLoad

For maximum savings with minimal latency impact, use the default tst mode. The one-turn discovery overhead is small compared with the context it frees up.

When To Use Each Mode

Default (tst): Most users — balances savings with discovery latency
tst-auto (for example, ENABLE_TOOL_SEARCH=auto:20): If you're sensitive to latency but still want some savings
standard: Only if you have very few MCP tools or need every tool available immediately

This system represents a fundamental shift in how Claude Code handles tool ecosystems. Instead of paying the token cost for every possible tool upfront, you now pay only for what you actually use.

Source: gentic.news · Apr 8, 2026 · Ala SMITH

AI-assisted reporting. Generated by gentic.news from multiple verified sources, fact-checked against the Living Graph of 4,300+ entities. Edited by Ala SMITH.

AI Analysis

Claude Code users should immediately check their tool search configuration and understand how it impacts their workflow. If you're using multiple MCP servers, ensure you're in `tst` mode (the default) to maximize context window efficiency. When building or using MCP servers, be strategic about which tools should always load versus which can be deferred. Critical communication tools and frequently used utilities should be marked with `alwaysLoad`, while specialized, rarely-used tools should remain deferred. Monitor your token usage before and after enabling tool search. The savings are most dramatic when you have 50+ MCP tools configured. If you notice Claude struggling to find tools, improve your tool descriptions to include clear keywords that match how you'd naturally search for them.
