What Changed
World Model MCP (v0.9.1) is a new MCP server that gives Claude Code long-term memory. It builds a temporal knowledge graph of your codebase that learns from every coding session. The key claim: a +10.2-point paired improvement on a repeat-mistake benchmark built on SWE-bench Verified.
The repo ships 26 MCP tools, 19 CLI subcommands, and 375 tests. It is harness-neutral and works with Claude Code, Cursor, and pi.
What It Does
World Model MCP acts as a persistent memory layer that:
- Prevents Hallucinations — Validates API/function references against known entities before use
- Stops Repeated Mistakes — Learns constraints from corrections, applies them in future sessions
- Reduces Regressions — Tracks bug fixes and warns when changes touch critical regions
- Survives Compaction — Re-injects top constraints and recent facts after the agent's context window resets
- Resolves Contradictions — Picks a winner between conflicting facts using confidence, recency, or source count
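To make the contradiction-resolution bullet concrete, here is a minimal sketch of picking a winner between two conflicting facts by confidence, recency, or source count. The `Fact` record, its field names, and the `resolve` function are hypothetical illustrations, not the project's actual schema or API.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Fact:
    """Hypothetical record for one stored claim (not the project's schema)."""
    claim: str
    confidence: float      # 0.0 to 1.0
    observed_at: datetime  # when the fact was last confirmed
    source_count: int      # how many independent sources back it


def resolve(a: Fact, b: Fact, strategy: str = "confidence") -> Fact:
    """Return the winning fact under one of the three named strategies."""
    keys = {
        "confidence": lambda f: f.confidence,
        "recency": lambda f: f.observed_at,
        "source_count": lambda f: f.source_count,
    }
    if strategy not in keys:
        raise ValueError(f"unknown strategy: {strategy}")
    # max() keeps the first argument on a tie, so `a` wins ties.
    return max(a, b, key=keys[strategy])


old = Fact("API uses snake_case", 0.9, datetime(2025, 1, 10), 1)
new = Fact("API uses camelCase", 0.6, datetime(2025, 3, 2), 3)

print(resolve(old, new, "confidence").claim)    # API uses snake_case
print(resolve(old, new, "recency").claim)       # API uses camelCase
print(resolve(old, new, "source_count").claim)  # API uses camelCase
```

The same pair of facts can resolve differently depending on the strategy, which is why the choice of strategy matters as much as the stored values.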
The compaction survival feature is critical. Every Claude Code user knows the pain of the context window resetting mid-task. World Model MCP automatically re-injects the most important constraints and recent facts after compaction.
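As a rough illustration of what re-injection after compaction could look like, the sketch below selects the highest-weighted constraints and the most recent facts and formats them as a short text block. The `Constraint` and `RecentFact` types, the `weight` and `timestamp` fields, and the output layout are all assumptions made for this example; the source does not describe how the server ranks or formats what it re-injects.

```python
from dataclasses import dataclass


@dataclass
class Constraint:
    text: str
    weight: float  # hypothetical importance score


@dataclass
class RecentFact:
    text: str
    timestamp: int  # hypothetical: larger means more recent


def build_reinjection(constraints, facts, max_constraints=3, max_facts=2):
    """Build a compact text block from top constraints and newest facts."""
    top = sorted(constraints, key=lambda c: c.weight, reverse=True)[:max_constraints]
    recent = sorted(facts, key=lambda f: f.timestamp, reverse=True)[:max_facts]
    lines = ["Constraints:"]
    lines += [f"- {c.text}" for c in top]
    lines.append("Recent facts:")
    lines += [f"- {f.text}" for f in recent]
    return "\n".join(lines)


constraints = [
    Constraint("Never build SQL from raw strings", 0.9),
    Constraint("Use pathlib instead of os.path", 0.4),
    Constraint("Run migrations before tests", 0.7),
]
facts = [
    RecentFact("Fixed off-by-one in paginator", 100),
    RecentFact("Renamed UserService to AccountService", 300),
    RecentFact("Pinned numpy below 2.0", 200),
]

print(build_reinjection(constraints, facts, max_constraints=2, max_facts=2))
```

Capping both lists keeps the re-injected block small, so it does not eat into the context window it is meant to protect.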
The Benchmark
The central piece of evidence is a repeat-mistake benchmark on SWE-bench Verified. Fifty tasks across django, sympy, matplotlib, scikit-learn, and sphinx were run as paired baseline-vs-treatment comparisons. Results:
- +10.2 pts paired delta across 49 instances
- +15.0 pts within-domain
- +6.9 pts cross-domain
- Zero regressions on out-of-domain tasks
Full per-task tables and mechanistic analysis are in benchmarks/repeat-mistake/RESULTS.md.
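For readers unfamiliar with paired deltas, the sketch below shows one common way to compute such a number: each task is run once without and once with the treatment, and the delta is the mean per-task difference in pass/fail outcome, expressed in percentage points. This is an assumed definition for illustration; the authoritative metric is whatever the repo's RESULTS.md specifies, and the data here is a toy example, not the benchmark's results.

```python
def paired_delta_points(pairs):
    """Mean of (treatment - baseline) over paired runs, in percentage points.

    pairs: list of (baseline_passed, treatment_passed) booleans.
    """
    if not pairs:
        raise ValueError("need at least one paired instance")
    diffs = [int(treated) - int(base) for base, treated in pairs]
    return 100.0 * sum(diffs) / len(diffs)


def count_regressions(pairs):
    """Count tasks that passed at baseline but failed with the treatment."""
    return sum(1 for base, treated in pairs if base and not treated)


# Toy data only: (baseline_passed, treatment_passed) for five made-up tasks.
toy = [(False, True), (True, True), (False, False), (True, False), (False, True)]

print(paired_delta_points(toy))  # 20.0
print(count_regressions(toy))    # 1
```

Because every task appears in both arms, a paired comparison cancels out per-task difficulty, which makes a modest delta easier to trust than the same gap between two unrelated task samples.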
How to Install and Use
Installation
# Clone the repo
git clone https://github.com/SaravananJaichandar/world-model-mcp
cd world-model-mcp
# Build (requires Rust)
cargo build --release
Configure with Claude Code
Add to your Claude Code MCP config:
{
  "mcpServers": {
    "world-model": {
      "command": "./path/to/world-model-mcp/target/release/world-model-mcp",
      "args": ["serve"],
      "env": {
        "WORLD_MODEL_PATH": "/path/to/your/project/.world-model"
      }
    }
  }
}
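A relative command such as ./path/to/... only resolves correctly from one working directory, so it can help to generate the entry with absolute paths. The helper below is a hypothetical convenience script, not part of the project; it only prints the JSON, and where the config file lives is left to your Claude Code setup.

```python
import json
from pathlib import Path


def build_mcp_entry(binary: str, store: str) -> dict:
    """Return the config shown above with both paths made absolute."""
    return {
        "mcpServers": {
            "world-model": {
                "command": str(Path(binary).expanduser().resolve()),
                "args": ["serve"],
                "env": {
                    "WORLD_MODEL_PATH": str(Path(store).expanduser().resolve()),
                },
            }
        }
    }


# Placeholder paths: substitute the real build output and project directory.
entry = build_mcp_entry(
    "world-model-mcp/target/release/world-model-mcp",
    "my-project/.world-model",
)
print(json.dumps(entry, indent=2))
```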
Key Commands
- /world-model status: View current knowledge graph state
- /world-model constraints: List learned constraints
- /world-model compact: Trigger manual compaction
- status-watch: TUI widget for live monitoring
When to Use It
World Model MCP shines in:
- Large codebases where Claude Code repeatedly introduces the same bugs
- Long-running tasks that hit context limits multiple times
- Team projects where multiple developers use Claude Code on the same repo
- Legacy code with undocumented constraints and gotchas
Limitations (v0.9.1)
- Still early — v0.9.1, expect rough edges
- Requires Rust toolchain to build
- Antigravity adapter held for fourth release pending SDK changes
- 54% of MCP servers have zero community adoption per recent analysis — this one needs users to improve
Bottom Line
If you're tired of Claude Code making the same mistakes across sessions, World Model MCP is worth the 10-minute setup. The reported +10.2 pt paired delta on SWE-bench Verified is a promising result, and the compaction survival feature alone justifies the install for long coding sessions.
Source: github.com
[Updated 25 Jun via hn_claude_code]
The v0.8.1 release expanded the contradiction-resolution benchmark to 105 pairs across 19 categories, and v0.8.0 added domain-aware confidence decay with per-evidence-type TTL and per-item provenance fields (source_tool and confirmer) [per Hacker News]. The methodology was pre-registered and locked at benchmarks/repeat-mistake/DESIGN.md on 2026-06-17, before data collection, which guards against accusations of moving the goalposts after the results came in.
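To give a feel for confidence decay with a per-evidence-type TTL, the sketch below lets a fact's confidence fall linearly to zero as its age approaches the TTL for its evidence type. The evidence type names, the TTL values, and the linear decay curve are all assumptions for illustration; the source does not say which curve or values the project uses.

```python
# Hypothetical TTLs in days, one per evidence type.
TTL_DAYS = {
    "user_correction": 180.0,
    "test_result": 30.0,
    "inferred": 7.0,
}


def decayed_confidence(confidence: float, evidence_type: str, age_days: float) -> float:
    """Scale confidence linearly down to 0.0 as age reaches the type's TTL."""
    ttl = TTL_DAYS[evidence_type]
    remaining = max(0.0, 1.0 - age_days / ttl)
    return confidence * remaining


# The same starting confidence ages very differently by evidence type.
for kind in TTL_DAYS:
    print(kind, round(decayed_confidence(0.8, kind, 15.0), 3))
```

With these toy values, a 15-day-old inferred fact has already expired, while a user correction of the same age has lost only a small share of its confidence.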









