The LLM Wiki
A pattern for building personal knowledge bases using LLMs.
Instead of RAG — where every query re-derives knowledge from raw chunks — the LLM incrementally builds a persistent wiki. When you add a source, the LLM reads it, extracts information, and integrates it into existing pages. Cross-references are built. Contradictions are flagged. Knowledge compounds over time. You never write the wiki yourself — the LLM writes and maintains all of it.
Read Karpathy's original gist.
The Core Insight
How RAG Works
- Every query starts from scratch — the LLM has no memory of previous questions or synthesis
- Documents are split into chunks and stored as embeddings — no structure, no relationships
- If two documents contradict each other, the system has no way to detect or flag it
- Ask a question requiring synthesis across 5 documents — the LLM must find and piece together fragments every time
- Nothing accumulates. NotebookLM, ChatGPT file uploads, and most RAG systems work this way.
How LLM Wiki Works
- Knowledge is compiled once into structured markdown pages, then kept current — not re-derived
- When you add a new source, the LLM updates entity pages, revises summaries, and notes where new data contradicts old claims
- Cross-references are already built. Contradictions are already flagged. Synthesis reflects everything you’ve read.
- Good answers get filed back into the wiki — your explorations compound just like ingested sources
- The wiki is a persistent, compounding artifact. It keeps getting richer with every source and every question.
Side-by-Side Comparison
| Dimension | RAG | LLM Wiki |
|---|---|---|
| Knowledge state | Re-derived every query from raw chunks | Compiled once into structured pages, kept current |
| Cross-references | None — chunks are isolated vectors | Built automatically when new sources arrive |
| Contradictions | Go undetected until a user notices | Flagged during ingestion by the LLM |
| Maintenance cost | Zero effort, but zero value accumulation | Zero effort, but compounding value over time |
| Best scale | Any number of documents | Sweet spot ~100–500 sources with deep synthesis |
| Query quality | Depends on chunk retrieval quality | Benefits from pre-built synthesis and connections |
| Human role | Upload documents, ask questions | Curate sources, direct analysis, ask good questions |
Architecture: Three Layers
Every LLM Wiki has the same three-layer structure. The LLM bridges them.
The Schema
A document — CLAUDE.md for Claude Code, AGENTS.md for Codex — that tells the LLM how the wiki is structured. It defines the conventions, page formats, and workflows the LLM should follow when ingesting sources, answering questions, or maintaining the wiki. This is the key configuration file: it’s what makes the LLM a disciplined wiki maintainer rather than a generic chatbot. You and the LLM co-evolve this over time as you figure out what works for your domain.
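A minimal sketch of what such a schema file might contain. The section names, directory layout, and workflow steps below are illustrative assumptions, not a prescribed format; you and the LLM will evolve your own:

```markdown
# Wiki Schema

## Layout
- sources/      : immutable raw documents; read, never edit
- wiki/         : LLM-maintained entity, concept, and comparison pages
- wiki/index.md : catalog of every page with a one-line summary
- wiki/log.md   : chronological record of ingests and edits

## Ingest workflow
1. Read the new source in sources/ end to end.
2. Update or create the relevant entity and concept pages.
3. Flag any claim that contradicts an existing page.
4. Refresh cross-references, then index.md and log.md.
```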
The Wiki
A directory of LLM-generated markdown files: summaries, entity pages, concept pages, comparisons, an overview, a synthesis. The LLM owns this layer entirely — it creates pages, updates them when new sources arrive, maintains cross-references, and keeps everything consistent. You read it; the LLM writes it. Two special files help navigation: index.md is a content catalog (each page with a link and one-line summary), and log.md is a chronological record of what happened and when.
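The index can even be regenerated mechanically as a safety net. A sketch, assuming the conventions above (flat directory of `.md` pages, `index.md` and `log.md` as navigation files, first non-heading line as the summary):

```python
from pathlib import Path

def rebuild_index(wiki_dir: str) -> str:
    """Regenerate index.md: one bullet per page, linking the page
    and quoting its first non-heading line as a one-line summary."""
    lines = ["# Index", ""]
    for page in sorted(Path(wiki_dir).glob("*.md")):
        if page.name in ("index.md", "log.md"):
            continue  # navigation files catalog everything else
        summary = ""
        for raw in page.read_text(encoding="utf-8").splitlines():
            text = raw.strip()
            if text and not text.startswith("#"):
                summary = text  # first real line of prose
                break
        lines.append(f"- [{page.stem}]({page.name}): {summary}")
    content = "\n".join(lines) + "\n"
    (Path(wiki_dir) / "index.md").write_text(content, encoding="utf-8")
    return content
```

In practice the LLM keeps `index.md` current during ingest; a script like this only verifies nothing drifted.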
Raw Sources
Your curated collection of source documents: articles, papers, images, data files. These are immutable — the LLM reads from them but never modifies them. This is your source of truth. Use Obsidian Web Clipper (a browser extension) to convert web articles to markdown and drop them into your raw collection. Optionally download images locally so the LLM can reference them directly.
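One lightweight way to enforce that immutability at the filesystem level is to strip write permission from everything in the raw collection. A sketch (the directory name is an assumption):

```python
import stat
from pathlib import Path

def freeze_sources(raw_dir: str) -> int:
    """Drop write permission on every raw source file so neither a
    human nor a tool silently edits the source of truth."""
    frozen = 0
    for path in Path(raw_dir).rglob("*"):
        if path.is_file():
            mode = path.stat().st_mode
            # clear user, group, and other write bits
            path.chmod(mode & ~(stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH))
            frozen += 1
    return frozen
```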
Three Operations
Everything you do with an LLM Wiki falls into one of three categories.
Ingest
Add a source and watch the wiki grow
Query
Ask questions against compiled knowledge
Lint
Health-check and improve over time
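Parts of the lint operation can be ordinary code rather than an LLM call. A minimal sketch that checks cross-references, assuming a flat wiki of `.md` pages with standard markdown links (the report shape and filename conventions are assumptions):

```python
import re
from pathlib import Path

# matches markdown links whose target is a .md page, e.g. [OpenAI](openai.md)
LINK = re.compile(r"\[[^\]]*\]\(([^)#]+\.md)\)")

def lint_wiki(wiki_dir: str) -> dict:
    """One simple lint pass: report links to missing pages, and
    pages nothing links to (orphans)."""
    wiki = Path(wiki_dir)
    pages = {p.name for p in wiki.glob("*.md")}
    linked, broken = set(), []
    for page in wiki.glob("*.md"):
        for target in LINK.findall(page.read_text(encoding="utf-8")):
            name = Path(target).name
            if name in pages:
                linked.add(name)
            else:
                broken.append((page.name, target))
    orphans = pages - linked - {"index.md", "log.md"}
    return {"broken_links": broken, "orphans": sorted(orphans)}
```

Deeper checks, such as whether a summary still matches its sources, are where the LLM takes over.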
What Can You Build?
Why This Actually Works
LLMs don’t get bored, don’t forget to update a cross-reference, and can touch 15 files in one pass. The wiki stays maintained because the cost of maintenance is near zero.
A single ingested source ripples through the entire knowledge base — updating summaries, revising entity pages, strengthening or challenging the evolving synthesis.
Every connection between pages is checked and updated. No orphans, no stale links, no forgotten relationships.
“The human's job is to curate sources, direct the analysis, ask good questions, and think about what it all means. The LLM's job is everything else.”
From Memex to LLM Wiki
Vannevar Bush envisioned a personal, curated knowledge store with associative trails between documents. His vision was closer to this than to what the web became: private, actively curated, with the connections as valuable as the documents.
Retrieval-Augmented Generation. Upload documents, retrieve chunks, generate answers. Works, but no accumulation. Knowledge is re-derived every time.
The LLM Wiki. The LLM incrementally builds and maintains a persistent wiki. The part Bush couldn’t solve — who does the maintenance — the LLM handles.
gentic.news: An LLM Wiki at Scale
We built an LLM Wiki before the term existed. Here is how gentic.news maps to Karpathy's three-layer architecture.
Concrete example
When OpenAI raises $122B at $852B valuation, the Living Agent: updates OpenAI's entity page, creates a timeline event, adjusts competitive relationships with Anthropic and Google, checks 3 active predictions for new evidence, and updates the weekly intelligence briefing. All automatically. One source touches 10+ wiki pages — exactly the pattern Karpathy describes.
The Toolkit
Karpathy's recommended stack. All open-source or freely available.
Obsidian
Local-first markdown vault. Pages are plain .md files — searchable, portable, no lock-in. Graph view maps your knowledge network.
Web Clipper
Browser extension that captures web pages as clean markdown. Strips ads and boilerplate for clean source documents.
qmd
CLI tool converting PDF, DOCX, EPUB, HTML into LLM-ready markdown. Preserves headings, tables, and code blocks.
Marp
Converts wiki markdown directly into presentation slide decks. Wiki pages become polished slides in seconds.
Dataview
Obsidian plugin for querying your wiki like a database. SQL-like power for dynamic tables and cross-page summaries.
Claude + CLAUDE.md
The LLM engine with schema-driven maintenance. Claude reads CLAUDE.md for conventions, then follows them for ingest, query, and lint.
Start Building Your LLM Wiki
Fork the idea. Adapt the pattern. The tools are free, the concept is open, and the LLM does the heavy lifting. Start with one topic you care about, add a few sources, and watch the wiki grow.
Built by gentic.news — the AI intelligence platform that lives this pattern every day.