
Expose pgvector as an MCP Server: From Hardcoded RAG to Reusable Tool Server

Wrap pgvector search in FastMCP to create a reusable MCP server. Any LLM client—including Claude Code—can then query your vector database without hardcoded integrations.

Source: dev.to (via devto_mcp, medium_claude, openai_codex_changelog_gn; multi-source)
How do I expose my pgvector database as an MCP server for Claude Code?

Use FastMCP to expose your pgvector search functions as MCP tools, optionally alongside resources and prompts, then run the server with `python mcp_server/server.py`. Any MCP client can connect and query your vector database.

TL;DR

Turn your pgvector search functions into an MCP server so any LLM client—Claude Code, Claude Desktop, Gemini—can query them.


What Changed — pgvector Search Becomes an MCP Server


You've built a RAG system. Your pgvector database is full of embeddings. Your search functions work perfectly—but only inside your Python script. No other tool can touch them.

MCP (Model Context Protocol) breaks that wall. Instead of hardcoding search_documents() inside a single script, you expose it as a standalone server that any LLM client can connect to. Claude Desktop, Claude Code, Gemini agents, or any future MCP-compatible client can all use your vector search without per-client integration code; each one only needs the server's launch command or URL in its configuration.

The source article walks through building exactly this: taking a pgvector-backed search system and wrapping it in FastMCP. The result is a reusable tool server that any MCP client can discover and call.

What It Means For You — Concrete Impact on Claude Code Usage

If you use Claude Code, this is immediately useful. Instead of:

  • Copy-pasting search results into Claude Code
  • Writing custom scripts to query your vector database
  • Maintaining separate integrations for each tool

You run one MCP server, and Claude Code connects to it over MCP. You can then ask Claude Code: "Find documents about transformer architectures" and it decides to call your search_documents tool, which runs the pgvector query and returns the matches.

This is the pattern: write once, connect everywhere.
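To make "connects over MCP" concrete: MCP messages are JSON-RPC 2.0, and a tool invocation uses the `tools/call` method. The sketch below only builds the request a client would send for the example question. It is a simplified illustration; the helper name `make_tool_call` is hypothetical, and a real client library also handles the initialization handshake and transport framing.

```python
import json


def make_tool_call(request_id: int, tool: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(message)


# The request an MCP client would send for "Find documents about transformer architectures".
request = make_tool_call(
    1,
    "search_documents",
    {"query": "transformer architectures", "top_k": 3},
)
print(request)
```

Because every client sends this same message shape, the server does not need to know which client is on the other end.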

Try It Now — Build Your pgvector MCP Server

1. Install FastMCP and the other dependencies

pip install fastmcp psycopg2-binary google-genai python-dotenv
pip freeze > requirements.txt

2. Create the server (mcp_server/server.py)

import psycopg2
from google import genai
from google.genai import types as genai_types
from fastmcp import FastMCP
from dotenv import load_dotenv
import os

load_dotenv()

mcp = FastMCP(
    name="pgvector-search",
    instructions="Document search server using pgvector. "
                 "Covers machine learning, Python, and cloud topics.",
)

gemini_client = genai.Client(api_key=os.getenv("GEMINI_API_KEY"))

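# Connection settings come from the .env file loaded above.
conn = psycopg2.connect(
    host=os.getenv("DB_HOST"), port=os.getenv("DB_PORT"),
    dbname=os.getenv("DB_NAME"), user=os.getenv("DB_USER"),
    password=os.getenv("DB_PASSWORD"),
)
# The server only reads, so autocommit avoids leaving the connection idle in an open transaction.
conn.autocommit = True
# A single shared cursor is enough for a local, single-client demo server.
cur = conn.cursor()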

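def get_embedding(text: str) -> list[float]:
    """Embed a search query with Gemini and return a 768-dimensional vector."""
    result = gemini_client.models.embed_content(
        model="gemini-embedding-001",
        contents=text,
        config=genai_types.EmbedContentConfig(
            # Query-side task type; stored documents are typically embedded with RETRIEVAL_DOCUMENT.
            task_type="RETRIEVAL_QUERY",
            # Must match the dimension of the documents.embedding column.
            output_dimensionality=768,
        ),
    )
    return result.embeddings[0].values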

@mcp.tool
def search_documents(query: str, top_k: int = 3) -> list[dict]:
    """
    Search all document categories for a given query.
    Use when the category is unknown or the question spans multiple categories.
    """
    q = get_embedding(query)
    cur.execute("""
        SELECT title, body, category,
               1 - (embedding <=> %s::vector) AS similarity
        FROM documents ORDER BY embedding <=> %s::vector LIMIT %s;
    """, (q, q, top_k))
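    return [
        {"title": r[0], "body": r[1], "category": r[2], "similarity": round(r[3], 4)}
        for r in cur.fetchall()
    ]

# Keep this guard at the bottom of the file, below every tool, resource, and prompt.
# Without it, `python mcp_server/server.py` would define the server and exit immediately.
if __name__ == "__main__":
    mcp.run()  # FastMCP defaults to the stdio transport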
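The SQL in search_documents relies on pgvector's `<=>` operator, which returns cosine distance, so `1 - distance` is cosine similarity. The pure-Python sketch below reproduces that arithmetic on toy two-dimensional vectors to show how results are ranked. The vectors, titles, and function names are made up for illustration; in the real server the database does this work.

```python
import math


def cosine_distance(a: list[float], b: list[float]) -> float:
    """Same quantity pgvector's <=> operator computes: 1 - cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def rank(query: list[float], docs: dict[str, list[float]], top_k: int = 3):
    """Mirror ORDER BY embedding <=> query LIMIT top_k, reporting similarity."""
    scored = [(title, 1.0 - cosine_distance(query, vec)) for title, vec in docs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)  # highest similarity first
    return [(title, round(score, 4)) for title, score in scored[:top_k]]


# Toy embeddings (hypothetical): real ones have 768 dimensions.
docs = {
    "attention": [1.0, 0.0],
    "s3-buckets": [0.0, 1.0],
    "transformers": [1.0, 1.0],
}
results = rank([1.0, 0.0], docs, top_k=2)
print(results)
```

Ordering by ascending distance and ordering by descending similarity give the same ranking, which is why the query sorts on `<=>` directly and only computes `1 - distance` for display.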

3. Run the server and connect Claude Code

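Start the server once by hand as a smoke test:

python mcp_server/server.py

With FastMCP's default stdio transport, the process waits for a client on stdin, and in normal use the client launches it for you. To register it with Claude Code, run `claude mcp add pgvector-search -- python mcp_server/server.py`, or put the following in a project-level .mcp.json. Claude Desktop reads the same mcpServers block from claude_desktop_config.json: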

{
  "mcpServers": {
    "pgvector-search": {
      "command": "python",
      "args": ["mcp_server/server.py"]
    }
  }
}

Now in Claude Code, you can ask: "Search for documents about attention mechanisms" and it calls your pgvector MCP server automatically.
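One practical wrinkle with the configuration above: a relative script path and a bare `python` command depend on the client's working directory and PATH. The sketch below generates the same mcpServers entry with an absolute script path and the interpreter that is currently running, which is usually the virtual environment where fastmcp is installed. The helper name `build_mcp_config` is hypothetical; where you save the output (for example a project-level .mcp.json) depends on your client.

```python
import json
import sys
from pathlib import Path


def build_mcp_config(server_script: str, python: str = sys.executable) -> dict:
    """Return an mcpServers block that launches the server with absolute paths."""
    script = Path(server_script).resolve()  # absolute path, independent of the client's cwd
    return {
        "mcpServers": {
            "pgvector-search": {
                "command": python,       # interpreter that has fastmcp installed
                "args": [str(script)],
            }
        }
    }


config = build_mcp_config("mcp_server/server.py")
print(json.dumps(config, indent=2))
```

Run it from the project root inside the activated virtual environment, then paste the printed JSON into the client's configuration file.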

4. Add Resources and Prompts (Optional)

Resources expose data the LLM can read. Prompts are reusable templates:

@mcp.resource("db://categories")
def get_categories() -> str:
    cur.execute("SELECT DISTINCT category FROM documents ORDER BY category")
    return "\n".join(r[0] for r in cur.fetchall())

@mcp.prompt
def search_prompt(topic: str) -> str:
    return f"Search our document database for information about {topic}. Use the search_documents tool."

Why This Works — Token Economics and Reusability

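The reuse comes from the protocol. MCP standardizes how tools are described, discovered, and called, so a client can list a server's tools and invoke them without knowing anything about pgvector. FastMCP generates each tool's schema from your Python type hints and docstring, so you do not hand-write FunctionDeclaration blocks. This means: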

  • Zero schema maintenance: Change a function signature, the schema updates
  • Any client: Claude Desktop, Claude Code, Gemini, or custom agents all speak MCP
  • No code duplication: One server, many consumers
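To illustrate how a schema can be derived from a signature, here is a standard-library sketch that turns type hints, defaults, and the docstring into a tool description with the `name`, `description`, and `inputSchema` fields MCP uses. It is not FastMCP's implementation, which covers far more types; `describe_tool` and `JSON_TYPES` are hypothetical names, and the function body is a stub.

```python
import inspect
from typing import get_type_hints

# Minimal mapping from Python annotations to JSON Schema type names (assumption: simple types only).
JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def describe_tool(func) -> dict:
    """Build an MCP-style tool description from a function's signature and docstring."""
    hints = get_type_hints(func)
    properties, required = {}, []
    for name, param in inspect.signature(func).parameters.items():
        properties[name] = {"type": JSON_TYPES.get(hints.get(name), "object")}
        if param.default is inspect.Parameter.empty:
            required.append(name)  # no default means the caller must supply it
        else:
            properties[name]["default"] = param.default
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "inputSchema": {"type": "object", "properties": properties, "required": required},
    }


def search_documents(query: str, top_k: int = 3) -> list[dict]:
    """Search all document categories for a given query."""
    return []  # stub: the real tool queries pgvector


schema = describe_tool(search_documents)
print(schema)
```

Renaming a parameter or changing its default changes the generated description the next time the server starts, which is the "zero schema maintenance" point in practice.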

When To Use This

  • You have a pgvector database with embeddings and want Claude Code to query it
  • You're building RAG systems that multiple agents or tools need to access
  • You want to decouple your search logic from your application code
  • You're teaching MCP and want a concrete, working example

The Bigger Picture

For the author of the source article, this project was a first step from classic software engineering into AI engineering, built as a warm-up before tackling more advanced MCP courses from Anthropic and Hugging Face. The pattern scales from pgvector to any data source and from one client to many.

Claude Code users who adopt this pattern stop writing one-off scripts and start building reusable infrastructure. Your vector database becomes a service, not a script dependency.


Source: dev.to


AI-assisted reporting. Generated by gentic.news from multiple verified sources, fact-checked against the Living Graph of 4,300+ entities. Edited by Ala SMITH.


AI Analysis

What Claude Code users should do differently:

1. Stop hardcoding tool integrations. If you have a pgvector database, wrap it as an MCP server. FastMCP keeps the wrapper short, and once the server exists Claude Code can query it directly, with no custom scripts and no copy-paste.

2. Use the `@mcp.tool` decorator pattern for all your search functions. The source article shows how type hints and docstrings generate the schema automatically. If you change a function signature, the MCP schema changes with it, so there is no separate `FunctionDeclaration` JSON to keep in sync by hand.

3. Add Resources and Prompts for richer interactions. Resources let Claude Code read data (like category lists) directly. Prompts give you reusable templates for common queries. Together they let Claude Code discover what the database contains and query it without you specifying every detail.

4. Run the MCP server locally during development, then deploy it. For Claude Code, the simplest path is to register the local server with `claude mcp add` or a project-level `.mcp.json`; Claude Desktop uses `claude_desktop_config.json`. For production, serve it over an HTTP transport and point the client at the remote endpoint.