Methodology
How the Atlas is built.
Every number on the Atlas has a rule behind it. This page documents the rules — so you can judge whether to trust the data, and whether to cite it.
1. The unit of analysis is the technique, not the paper
A single modern AI product deploys dozens of research ideas. Attributing “GPT = one paper” is reductive and wrong. The Atlas defines its unit as a technique: a named, bounded contribution with exactly one canonical origin paper and strong community consensus.
Techniques are curated by hand — there are about 50 that matter in the modern era. Variants and refinements are tracked as prior_art links between techniques, not as separate entries.
2. What counts as a “deployment”
A product deploys a technique when there is public, citable evidence that the technique is in the production system. Evidence must come from one of:
- The product's own model card or technical report
- The inventor's official blog describing a specific deployment
- Peer-reviewed analysis confirming the deployment
- Strong community consensus across multiple independent sources
Inference from benchmark behaviour (“model X probably uses technique Y because of pattern Z”) is not sufficient. Speculation is excluded.
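The evidence rule above can be sketched as a simple filter. Everything here is illustrative: the record fields, type labels, and function names are hypothetical, not the Atlas schema.

```python
from dataclasses import dataclass

# Hypothetical labels for the four allowed evidence sources listed above.
ALLOWED_EVIDENCE = {
    "model_card",           # the product's own model card / technical report
    "inventor_blog",        # the inventor's official blog naming the deployment
    "peer_reviewed",        # peer-reviewed analysis confirming the deployment
    "community_consensus",  # multiple independent secondary sources
}

@dataclass
class Deployment:
    product: str
    technique: str
    evidence_type: str

def is_citable(d: Deployment) -> bool:
    # Benchmark inference ("model X probably uses Y") never qualifies.
    return d.evidence_type in ALLOWED_EVIDENCE

print(is_citable(Deployment("example-product", "example-technique", "model_card")))
# True
print(is_citable(Deployment("model-x", "technique-y", "benchmark_inference")))
# False
```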
3. Velocity calculation
velocity_days = deploy_date − technique.origin_date
- origin_date = arXiv v1 submission date (or journal publication where no arXiv exists)
- deploy_date = the product's first_seen date in our knowledge graph, or the product's publicly stated release date, whichever is earlier
- Self-invented deployments (Anthropic ships Constitutional AI → internal velocity) are computed but flagged separately; they answer a different question than external-adoption velocity
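The velocity rule can be written as a short function. This is a minimal sketch assuming dates are available as datetime.date values; the function name and example dates are illustrative only.

```python
from datetime import date

def velocity_days(origin_date: date, first_seen: date, release_date: date) -> int:
    # deploy_date is whichever is earlier: our knowledge graph's first_seen
    # or the product's publicly stated release date.
    deploy_date = min(first_seen, release_date)
    return (deploy_date - origin_date).days

# Hypothetical example: arXiv v1 on 2023-01-10, publicly released 2023-07-01,
# first seen by us two weeks later.
print(velocity_days(date(2023, 1, 10), date(2023, 7, 15), date(2023, 7, 1)))
# 172
```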
4. Confidence tiers
- High — the technique is explicitly named in an authoritative source (model card, technical report, official blog)
- Medium — strong community consensus across multiple secondary sources, but no primary document
- Low — contested or speculative; hidden by default, visible with an opt-in toggle
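The hidden-by-default behaviour of the Low tier can be illustrated with a small filter. The record shape and tier strings are assumptions for the sake of the example, not the actual dataset schema.

```python
records = [
    {"technique": "a", "confidence": "high"},
    {"technique": "b", "confidence": "medium"},
    {"technique": "c", "confidence": "low"},
]

def visible(records, show_low=False):
    # Low-confidence claims are hidden unless the opt-in toggle is on.
    tiers = {"high", "medium"} | ({"low"} if show_low else set())
    return [r for r in records if r["confidence"] in tiers]

print(len(visible(records)))                 # 2
print(len(visible(records, show_low=True)))  # 3
```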
5. Data sources
- arXiv — authoritative paper metadata (submission date, authors, abstract)
- Papers With Code — technique taxonomy + benchmark associations (imported where applicable)
- Semantic Scholar — citation graphs and author affiliation history
- Model cards / technical reports — for the ~30 major commercial products tracked
- Official company blogs — Anthropic, OpenAI, Google DeepMind, Meta AI, Mistral, DeepSeek, Moonshot, xAI, Alibaba, Zhipu, NVIDIA
- Inference library release notes — vLLM, TGI, llama.cpp, Transformers — used for “commercially deployable” milestones where no single product shipped first
6. Known gaps
- Closed-source products under-report. When a major model publishes no technical report (e.g., some Gemini variants), we only track techniques that are independently confirmed.
- Chinese-language papers are under-represented. We rely primarily on arXiv and English-language tech reports.
- Incremental engineering improvements are not techniques. Better dataset filtering, prompt templating, or RLHF hyperparameter tuning are not tracked.
- Multiple origin candidates are collapsed. Where several papers introduce the same idea near-simultaneously, we cite the most-cited canonical entry and list alternates under prior_art. Challenge us if you think we picked wrong.
7. Open dataset
All technique, paper, and deployment records are freely downloadable under Creative Commons Attribution 4.0. API endpoints:
- /api/v1/atlas/techniques — all 49 techniques
- /api/v1/atlas/technique/{slug} — one technique + its deployments
- /api/v1/atlas/product/{slug} — one product's recipe
- /api/v1/atlas/velocity — global statistics
- /api/v1/atlas/graph — full graph for visualization
- /research-frontier/data.json — full dataset snapshot
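The endpoints can be reached with Python's standard library alone. Only the URLs come from this page; the JSON response shape is an assumption, so the fetch call is shown but left commented out.

```python
import json
from urllib.request import urlopen

BASE = "https://gentic.news/api/v1/atlas"

def endpoint(path: str) -> str:
    # Build a full URL for one of the Atlas endpoints listed above.
    return f"{BASE}/{path}"

def fetch(path: str):
    # Fetch and decode one endpoint (response shape is an assumption).
    with urlopen(endpoint(path)) as resp:
        return json.load(resp)

print(endpoint("techniques"))
# https://gentic.news/api/v1/atlas/techniques

# techniques = fetch("techniques")
# detail = fetch("technique/some-slug")  # {slug} is whatever the dataset uses
```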
Suggested citation: gentic.news Deployment Atlas (2026). Version 1.0. https://gentic.news/research-frontier
8. How to challenge a claim
If you think an attribution is wrong, a confidence tier is overconfident, or an important technique is missing: email corrections@gentic.news with the specific claim and your counter-evidence. We version the dataset and publish diffs.