gentic.news — AI News Intelligence Platform


[Figure: bar chart comparing MCP server performance versus indexed context, showing higher token usage and task failure]

Glean benchmark: Off-the-shelf MCP costs 30% more tokens than indexed context

Glean's benchmark finds that off-the-shelf MCP in Claude Cowork fails 2.5x more tasks and uses 30% more tokens than a properly indexed context layer.

3h ago · 3 min read · AI-Generated
How much more expensive are off-the-shelf MCP servers compared to a properly indexed context layer in Claude Cowork?

Glean's benchmark of MCP servers in Claude Cowork found that off-the-shelf MCP fails 2.5x more tasks and uses 30% more tokens than a properly indexed context layer, per @hasantoxr.

TL;DR

Glean benchmarked MCP servers vs indexed context in Claude Cowork · Off-the-shelf MCP fails 2.5x more often · MCP burns 30% more tokens per task

Key facts

  • Off-the-shelf MCP fails 2.5x more tasks than indexed context in Claude Cowork
  • Off-the-shelf MCP burns 30% more tokens per task
  • User reported cutting Claude token bill by 30% using Glean's approach
  • Glean's benchmark is the first public comparison of MCP servers inside Claude Cowork
  • Methodology details (task set, trials) were not disclosed

A new benchmark from Glean, shared by @hasantoxr, provides the first real-world comparison of MCP server performance inside Claude Cowork. The data shows that off-the-shelf MCP servers — the ones most teams are wiring up today — fail 2.5x more often and consume 30% more tokens per task than Glean's indexed context layer [According to @hasantoxr].

Why this matters more than the press release suggests

This is not just a vendor comparison. It reveals a structural inefficiency in the current MCP ecosystem. Most teams wire up MCP servers naively — dumping full tool outputs into the context window without indexing or retrieval. Glean's benchmark suggests that approach wastes tokens and degrades reliability. The 30% token savings translates directly to cost: a user reported cutting their Claude token bill by 30% using Glean's method [Per @hasantoxr].

How the benchmark works

Glean's test measures task completion rate and token consumption across two setups: off-the-shelf MCP servers (the default wiring most developers use) versus Glean's indexed context layer, which pre-processes and retrieves only relevant context. The indexed layer reduced failures by 2.5x and cut token usage by 30% [Per the tweet thread].
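The two headline numbers are simple ratios of per-setup aggregates. Since Glean did not disclose its task set or trial counts, the sketch below uses invented figures chosen only to reproduce the reported 2.5x and 30% results:

```python
def failure_ratio(failures_a, trials_a, failures_b, trials_b):
    """Ratio of failure rates: setup A relative to setup B."""
    return (failures_a / trials_a) / (failures_b / trials_b)

def token_overhead(tokens_a, tokens_b):
    """Fractional extra tokens used per task by setup A vs. setup B."""
    return tokens_a / tokens_b - 1

# Invented trial data consistent with the reported results:
# off-the-shelf MCP fails 25/100 tasks at 13,000 tokens/task;
# indexed context fails 10/100 tasks at 10,000 tokens/task.
print(failure_ratio(25, 100, 10, 100))   # 2.5x more failures
print(token_overhead(13_000, 10_000))    # ~0.3, i.e. 30% more tokens
```

Without the actual task set, any such numbers are illustrative only; different failure and token distributions could produce the same two ratios.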

Who this affects

This matters for any team running Claude Cowork at scale — especially those building custom MCP integrations for enterprise workflows. The token cost differential directly impacts operating margins for heavy Claude users. Teams that invest in proper context indexing (whether via Glean or a custom solution) will see immediate cost and reliability improvements.
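To see how a 30% token overhead shows up in a bill, here is a back-of-the-envelope sketch. Every figure below except the overhead itself (workload, per-task tokens, price) is invented, and it assumes a flat per-token price; real Claude pricing varies by model and bills input and output tokens separately:

```python
TOKENS_PER_TASK = 10_000   # indexed-context tokens per task (invented)
TASKS_PER_MONTH = 50_000   # monthly workload (invented)
PRICE_PER_MTOK = 3.00      # dollars per million tokens (invented flat rate)
MCP_OVERHEAD = 0.30        # 30% more tokens per task (reported)

indexed_cost = TOKENS_PER_TASK * TASKS_PER_MONTH / 1_000_000 * PRICE_PER_MTOK
naive_cost = indexed_cost * (1 + MCP_OVERHEAD)
print(f"indexed context: ${indexed_cost:,.0f}/mo")
print(f"naive MCP:       ${naive_cost:,.0f}/mo")
```

Under these assumed numbers the overhead alone is a few hundred dollars a month; at enterprise scale the same 30% compounds linearly with workload.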

Limitations

Glean's benchmark is not independent — it compares its own product against an unspecified baseline of 'off-the-shelf MCP.' The exact task set, number of trials, and token measurement methodology were not disclosed [According to the source]. The 30% figure may not generalize to all MCP configurations or all task types.

What to watch

Watch for independent replication of this benchmark, ideally from a neutral party like LMSYS or Artifact. If the 30% token savings holds across diverse task sets, expect a wave of teams migrating from naive MCP wiring to indexed context layers — and a potential pricing response from MCP server providers.

Sources cited in this article

  1. @hasantoxr (post on X)

AI-assisted reporting. Generated by gentic.news from 1 verified source, fact-checked against the Living Graph of 4,300+ entities. Edited by Ala Smith.


AI Analysis

This benchmark exposes a hidden tax on the current MCP ecosystem. Most teams wire MCP servers naively, dumping all tool outputs into the context window without any retrieval or indexing. That approach wastes tokens and degrades reliability. Glean's data suggests the cost is material: 30% token overhead and 2.5x more failures.

The interesting question is whether this is a Glean-specific advantage or a general principle. Their indexed context layer is proprietary, but the underlying idea, retrieve only relevant context rather than dumping everything, is well-known in information retrieval. Any team could build a similar layer using vector search or keyword retrieval.

What's missing is an independent benchmark. Glean is comparing its own product against an unspecified baseline. The 30% figure may be real but may not generalize. Still, the structural observation is correct: naive MCP wiring is wasteful. Teams that invest in proper context indexing will see real savings.
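That retrieval idea fits in a few lines. The sketch below uses naive keyword-overlap scoring purely for illustration; Glean's layer is proprietary, and the function names and sample tool output here are invented (a production system would use vector search and proper tokenization):

```python
def score(query: str, chunk: str) -> int:
    """Count query words that appear in a chunk (crude relevance signal)."""
    query_words = set(query.lower().split())
    return sum(1 for word in chunk.lower().split() if word in query_words)

def select_context(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Keep only the k most relevant chunks instead of dumping everything."""
    ranked = sorted(chunks, key=lambda c: score(query, c), reverse=True)
    return ranked[:k]

# Hypothetical tool output: only the relevant line enters the context window.
tool_output = [
    "deploy failed: missing env var DATABASE_URL",
    "build succeeded in 42s",
    "lint passed with 0 warnings",
]
print(select_context("why did the deploy fail", tool_output, k=1))
```

Even this toy filter shows the structural point: forwarding one relevant chunk instead of the full tool output is where the token savings come from, regardless of how the relevance score is computed.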

