Concept Explainer

The LLM Wiki

A pattern for building personal knowledge bases using LLMs.

TL;DR

Instead of RAG — where every query re-derives knowledge from raw chunks — the LLM incrementally builds a persistent wiki. When you add a source, the LLM reads it, extracts information, and integrates it into existing pages. Cross-references are built. Contradictions are flagged. Knowledge compounds over time. You never write the wiki yourself — the LLM writes and maintains all of it.

Read Karpathy's original gist

The Core Insight

How RAG Works

User → Query → [Chunk · Chunk · Chunk] → LLM → Answer (forgotten)
  • Every query starts from scratch — the LLM has no memory of previous questions or synthesis
  • Documents are split into chunks and stored as embeddings — no structure, no relationships
  • If two documents contradict each other, the system has no way to detect or flag it
  • Ask a question requiring synthesis across 5 documents — the LLM must find and piece together fragments every time
  • Nothing accumulates. NotebookLM, ChatGPT file uploads, and most RAG systems work this way.

How LLM Wiki Works

Source → LLM → Wiki Pages (.md) → Graph → User (compounds)
  • Knowledge is compiled once into structured markdown pages, then kept current — not re-derived
  • When you add a new source, the LLM updates entity pages, revises summaries, and notes where new data contradicts old claims
  • Cross-references are already built. Contradictions are already flagged. Synthesis reflects everything you’ve read.
  • Good answers get filed back into the wiki — your explorations compound just like ingested sources
  • The wiki is a persistent, compounding artifact. It keeps getting richer with every source and every question.

Side-by-Side Comparison

| Dimension | RAG | LLM Wiki |
| --- | --- | --- |
| Knowledge state | Re-derived every query from raw chunks | Compiled once into structured pages, kept current |
| Cross-references | None — chunks are isolated vectors | Built automatically when new sources arrive |
| Contradictions | Go undetected until a user notices | Flagged during ingestion by the LLM |
| Maintenance cost | Zero effort, but zero value accumulation | Zero effort, but compounding value over time |
| Best scale | Any number of documents | Sweet spot ~100–500 sources with deep synthesis |
| Query quality | Depends on chunk retrieval quality | Benefits from pre-built synthesis and connections |
| Human role | Upload documents, ask questions | Curate sources, direct analysis, ask good questions |

Architecture: Three Layers

Every LLM Wiki has the same three-layer structure. The LLM bridges them.

The Schema

A document — CLAUDE.md for Claude Code, AGENTS.md for Codex — that tells the LLM how the wiki is structured. It defines the conventions, page formats, and workflows the LLM should follow when ingesting sources, answering questions, or maintaining the wiki. This is the key configuration file: it’s what makes the LLM a disciplined wiki maintainer rather than a generic chatbot. You and the LLM co-evolve this over time as you figure out what works for your domain.
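As a rough illustration, a starter schema file might look like the sketch below. The section names, paths, and conventions here are hypothetical, not a prescribed format; the point is that the schema spells out layout, workflow, and rules the LLM must follow.

```markdown
# CLAUDE.md — wiki schema (illustrative sketch)

## Layout
- raw/   immutable source documents; read, never modify
- wiki/  LLM-maintained pages: index.md, log.md, entities/, concepts/

## Ingest workflow
1. Read the new file in raw/ and discuss key takeaways with me.
2. Write a summary page under wiki/sources/.
3. Update index.md and any affected entity or concept pages.
4. Append a dated entry to log.md.

## Conventions
- Every page opens with a one-line summary.
- Cross-reference pages with relative markdown links.
- Flag contradictions inline with a "Conflict:" note; never silently overwrite.
```

A schema like this is what you and the LLM co-evolve as you learn what works for your domain.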

The Wiki

A directory of LLM-generated markdown files: summaries, entity pages, concept pages, comparisons, an overview, a synthesis. The LLM owns this layer entirely — it creates pages, updates them when new sources arrive, maintains cross-references, and keeps everything consistent. You read it; the LLM writes it. Two special files help navigation: index.md is a content catalog (each page with a link and one-line summary), and log.md is a chronological record of what happened and when.

Raw Sources

Your curated collection of source documents: articles, papers, images, data files. These are immutable — the LLM reads from them but never modifies them. This is your source of truth. Use Obsidian Web Clipper (a browser extension) to convert web articles to markdown and drop them into your raw collection. Optionally download images locally so the LLM can reference them directly.
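Concretely, the three layers can live in a single directory tree. The names below are illustrative, not required:

```
my-wiki/
├── CLAUDE.md        # the schema: conventions the LLM follows
├── raw/             # immutable sources (clipped articles, converted PDFs)
│   └── attention-survey.md
└── wiki/            # LLM-written and LLM-maintained
    ├── index.md     # catalog: every page with a one-line summary
    ├── log.md       # chronological record of changes
    ├── entities/
    └── concepts/
```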


Three Operations

Everything you do with an LLM Wiki falls into one of three categories.

Ingest

Add a source and watch the wiki grow

1. Drop a new source into the raw collection
2. Tell the LLM to process it
3. LLM reads the source and discusses key takeaways with you
4. Writes a summary page in the wiki
5. Updates the index and relevant entity/concept pages
6. Appends an entry to the log — a single source might touch 10–15 pages

Query

Ask questions against compiled knowledge

1. Ask a question against the wiki
2. LLM reads index.md to find relevant pages
3. Drills into those pages and synthesizes an answer
4. Answer can take different forms: markdown page, comparison table, chart, slide deck
5. Good answers get filed back into the wiki as new pages — explorations compound

Lint

Health-check and improve over time

1. Ask the LLM to health-check the wiki periodically
2. Find contradictions between pages
3. Detect stale claims that newer sources have superseded
4. Identify orphan pages with no inbound links or missing cross-references
5. Suggest new questions to investigate and new sources to look for

What Can You Build?



Why This Actually Works

0
maintenance burden

LLMs don’t get bored, don’t forget to update a cross-reference, and can touch 15 files in one pass. The wiki stays maintained because the cost of maintenance is near zero.

10+
pages updated per source

A single ingested source ripples through the entire knowledge base — updating summaries, revising entity pages, strengthening or challenging the evolving synthesis.

100%
cross-references maintained

Every connection between pages is checked and updated. No orphans, no stale links, no forgotten relationships.

The human's job is to curate sources, direct the analysis, ask good questions, and think about what it all means. The LLM's job is everything else.

— Andrej Karpathy

From Memex to LLM Wiki

1945
The Memex

Vannevar Bush envisioned a personal, curated knowledge store with associative trails between documents. His vision was closer to this than to what the web became: private, actively curated, with the connections as valuable as the documents.

2023
RAG

Retrieval-Augmented Generation. Upload documents, retrieve chunks, generate answers. Works, but no accumulation. Knowledge is re-derived every time.

2026
LLM Wiki

The LLM incrementally builds and maintains a persistent wiki. The part Bush couldn’t solve — who does the maintenance — the LLM handles.


gentic.news: An LLM Wiki at Scale

We built an LLM Wiki before the term existed. Here is how gentic.news maps to Karpathy's three-layer architecture.

| Layer | gentic.news implementation |
| --- | --- |
| Raw Sources | 48+ RSS feeds, Twitter/X, Reddit, YouTube, arXiv |
| The Wiki | Knowledge Graph: 3,200+ entities, 3,500+ relationships, 2,800+ articles |
| The Schema | Living Agent: 23 cycle types, 8,200+ lines, runs every 60 minutes |

Concrete example

When OpenAI raises $122B at $852B valuation, the Living Agent: updates OpenAI's entity page, creates a timeline event, adjusts competitive relationships with Anthropic and Google, checks 3 active predictions for new evidence, and updates the weekly intelligence briefing. All automatically. One source touches 10+ wiki pages — exactly the pattern Karpathy describes.


The Toolkit

Karpathy's recommended stack. All open-source or freely available.

💎

Obsidian

Local-first markdown vault. Pages are plain .md files — searchable, portable, no lock-in. Graph view maps your knowledge network.

📎

Web Clipper

Browser extension that captures web pages as clean markdown. Strips ads and boilerplate for clean source documents.

qmd

CLI tool converting PDF, DOCX, EPUB, HTML into LLM-ready markdown. Preserves headings, tables, and code blocks.

🎥

Marp

Converts wiki markdown directly into presentation slide decks. Wiki pages become polished slides in seconds.

🔍

Dataview

Obsidian plugin for querying your wiki like a database. SQL-like power for dynamic tables and cross-page summaries.
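For instance, a Dataview query can render a live table of recently updated wiki pages. This sketch assumes each page carries a `summary` field in its frontmatter and that pages live under a `wiki/` folder; adapt both to your own schema:

```
TABLE file.mtime AS "Updated", summary
FROM "wiki"
SORT file.mtime DESC
LIMIT 10
```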

🤖

Claude + CLAUDE.md

The LLM engine with schema-driven maintenance. Claude reads CLAUDE.md for conventions, then follows them for ingest, query, and lint.


Start Building Your LLM Wiki

Fork the idea. Adapt the pattern. The tools are free, the concept is open, and the LLM does the heavy lifting. Start with one topic you care about, add a few sources, and watch the wiki grow.

Built by gentic.news — the AI intelligence platform that lives this pattern every day.