Add Vector Memory to Claude Code: The claude-memory-mcp Server Solves CLAUDE.md's 200-Line Limit

Install this open-source MCP server to give Claude Code persistent, searchable memory across projects. It surfaces only relevant context, solving CLAUDE.md's scaling problems.

Alex Martin & AI Research Desk · 5h ago · 4 min read · AI-Generated
Source: dev.to via devto_mcp

The Problem: CLAUDE.md Breaks at Scale

You've hit the wall. Your CLAUDE.md files are over 400 lines, and Claude Code is visibly ignoring half of them. You're re-explaining the same authentication flows, database schemas, and API quirks every session. This isn't a bug—it's a structural limitation.

CLAUDE.md files get injected into the system prompt at session start with no filtering or search. The practical limit for adherence sits around 200 lines. Past that, attention gets diluted across the long context. Everything loads at once: CSS conventions when you're debugging webhooks, deployment instructions when you're working on auth. There's no mechanism to load only what's relevant.

The Solution: Vector Memory via MCP

The open-source claude-memory-mcp server gives Claude Code persistent, searchable, cross-project memory backed by PostgreSQL and pgvector. This follows Anthropic's Model Context Protocol (MCP) standard, which we've covered in 50 prior articles as the framework for extending Claude's capabilities.

How it works:

  • Stores decisions, bug fixes, and patterns across sessions and projects
  • Surfaces only what's relevant to the current task via semantic search
  • Costs roughly $0.50/month to run (OpenAI embeddings + Supabase free tier)
  • Takes about 15 minutes to set up
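In practice, each bullet corresponds to a tool call over MCP. A remember call might look like this (all field values are illustrative, not taken from the project's docs):

```json
{
  "tool": "remember",
  "arguments": {
    "content": "Stripe sends duplicate webhook events; dedupe with idempotency keys in Postgres",
    "memory_type": "bug_fix",
    "project_id": "payments-api",
    "tags": ["stripe", "webhooks"]
  }
}
```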

Architecture: Four MCP Tools

The server exposes four tools through the MCP protocol:

{
  "remember": {
    "parameters": ["content", "memory_type", "project_id", "tags"],
    "purpose": "Embeds and stores a memory"
  },
  "recall": {
    "parameters": ["query", "project_id", "memory_type", "limit"],
    "purpose": "Semantic search, returns top matches"
  },
  "forget": {
    "parameters": ["memory_id"],
    "purpose": "Soft-deletes by setting expiry"
  },
  "project_status": {
    "parameters": ["project_id"],
    "purpose": "Returns memory count by type"
  }
}

Memory types: eight categories in total (decision, bug_fix, pattern, context, blocker, learning, convention, and dependency). Typed memories are easier to filter and query than free text.
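Under the hood, recall is a nearest-neighbor search over embeddings. Here is a minimal Python sketch of the ranking step, assuming memories and their embedding vectors are already loaded in memory; the real server delegates this to pgvector in PostgreSQL, and the function and field names here are illustrative:

```python
import math

def cosine_similarity(a, b):
    # Standard cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def recall(query_embedding, memories, memory_type=None, limit=5):
    # memories: list of dicts with "content", "memory_type", "embedding".
    # Optionally filter by type first, mirroring the recall tool's parameters.
    candidates = [m for m in memories
                  if memory_type is None or m["memory_type"] == memory_type]
    # Rank by similarity to the query embedding, highest first.
    ranked = sorted(candidates,
                    key=lambda m: cosine_similarity(query_embedding, m["embedding"]),
                    reverse=True)
    return ranked[:limit]
```

In the actual server, this ranking would be a single SQL query using pgvector's distance operators rather than a Python loop.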

Setup: 15 Minutes to Persistent Memory

  1. Clone and configure:
git clone https://github.com/glivera/claude-memory-mcp
cd claude-memory-mcp
cp .env.example .env
  2. Set environment variables:
OPENAI_API_KEY=your_key_here
SUPABASE_URL=your_url_here
SUPABASE_SERVICE_ROLE_KEY=your_key_here
PORT=3101
  3. Start the server:
docker-compose up -d
  4. Add to Claude Code config in ~/.claude_desktop_config.json:
{
  "mcpServers": {
    "claude-memory": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-memory"],
      "env": {
        "MEMORY_SERVER_URL": "http://localhost:3101"
      }
    }
  }
}

Workflow: How to Use It Daily

Session start instructions for CLAUDE.md:

# Memory System Instructions

At session start:
1. Run `project_status` for current project
2. Run `recall` with current task description
3. Run `recall` without project filter for cross-project knowledge

During work:
- Call `remember` after decisions, bug fixes, and pattern discoveries
- Use appropriate memory_type and tags

Session end:
- Save summary with `memory_type: "context"` and `tags: ["session-summary"]`

Example in action: When implementing Stripe webhook handling, Claude recalled that in a previous project, Stripe sends duplicate webhook events and the fix was idempotency keys in a PostgreSQL table. That knowledge surfaced through vector similarity—never in any CLAUDE.md file.
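The recalled fix is a standard pattern: record each event id in a table with a uniqueness constraint and skip anything already seen. A minimal sketch, using SQLite in place of PostgreSQL and hypothetical names (processed_events, handle_webhook):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE processed_events (event_id TEXT PRIMARY KEY)")

def handle_webhook(event_id, process):
    # Insert the event id first; a duplicate violates the PRIMARY KEY
    # constraint, so each event is processed at most once even if
    # Stripe delivers it multiple times.
    try:
        conn.execute("INSERT INTO processed_events (event_id) VALUES (?)",
                     (event_id,))
    except sqlite3.IntegrityError:
        return "duplicate"
    process()
    conn.commit()
    return "processed"
```

With vector memory, Claude can resurface this pattern from a past project instead of rediscovering it from scratch.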

When to Use CLAUDE.md vs. Vector Memory

Use CLAUDE.md for:

  • Static rules that never change
  • Project setup instructions
  • Team conventions that apply to every task
  • Anything under 200 lines

Use Vector Memory for:

  • Decisions made during development
  • Bug fixes and their solutions
  • Patterns discovered across sessions
  • Cross-project knowledge transfer
  • Anything that accumulates over time

Cost and Performance

  • Embeddings: text-embedding-3-small from OpenAI (1536 dimensions)
  • Storage: Supabase PostgreSQL with pgvector extension
  • Token cap: Recall responses limited to 2,000 tokens
  • Latency: 200-500ms per recall operation
  • Monthly cost: ~$0.50 at ~50 memories/day (100 tokens each)
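The monthly estimate is easy to sanity-check. Assuming OpenAI's published rate of $0.02 per 1M tokens for text-embedding-3-small (verify the current price before relying on it), the embedding side alone is a rounding error; the ~$0.50 figure evidently leaves ample room for recall-time query embeddings and other overhead:

```python
# Back-of-the-envelope embedding cost at the article's stated usage.
memories_per_day = 50
tokens_per_memory = 100
days_per_month = 30
price_per_million_tokens = 0.02  # assumed text-embedding-3-small rate, USD

monthly_tokens = memories_per_day * tokens_per_memory * days_per_month
monthly_cost = monthly_tokens / 1_000_000 * price_per_million_tokens
```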

This isn't a replacement for CLAUDE.md—it's a complementary system. CLAUDE.md holds the rules that never change. Vector memory holds everything that happened after day one.

AI Analysis

What Claude Code Users Should Do Now

  1. **Install claude-memory-mcp for any multi-project workflow.** If you work across more than one codebase or have complex projects where you're re-explaining the same concepts, this server pays for itself in time saved. The 15-minute setup is trivial compared to the hours wasted re-teaching Claude about your stack.
  2. **Restructure your CLAUDE.md files.** Move anything dynamic (bug fixes, decisions, patterns) out of CLAUDE.md and into the memory system. Keep only static rules and conventions in CLAUDE.md. This aligns with our previous coverage of CLAUDE.md best practices, where we emphasized keeping these files concise and focused.
  3. **Use the memory types strategically.** When Claude remembers something, specify the memory_type parameter to make recall searches more precise. For example, when debugging, search specifically for `bug_fix` memories; when designing architecture, search for `decision` and `pattern` memories.
  4. **Implement the session workflow immediately.** Add the memory instructions to your CLAUDE.md today. The three-step process (project_status, then recall with a project filter, then recall without one) ensures Claude surfaces both project-specific and cross-project knowledge at the start of every session.
  5. **Consider team deployment.** Since the MCP server can handle multiple connections simultaneously, teams can share a single Supabase instance. This creates a collective memory system where one developer's solution to a Supabase edge case becomes available to everyone.