gentic.news — AI News Intelligence Platform

The Deployment Atlas

When AI research reaches production.

For every foundational AI technique of the modern era — transformers, RLHF, FlashAttention, Constitutional AI, speculative decoding, DPO, MoE — we track the origin paper, the first commercial deployment, and the velocity between. Every edge is sourced. Every claim is evidenced. The full dataset is free and open.

192
Deployments tracked

Technique × product pairs, each with sourced evidence.

4y
Median research → prod

Typical lag from origin paper to first commercial deploy.

49
Canonical techniques

Hand-curated, each with a single origin paper. No false one-to-one paper-to-product mappings.

Fastest deploy ever

Llama 4 Maverick shipped YaRN RoPE Context Extension in 583 days.

Slowest deploy

Kimi K2.6 shipped Mixture of Experts (Sparse MoE for LLMs) 9 years after the origin paper.

Every canonical technique

Grouped by category. Click any card for origin paper, deployment timeline, and prior art.

Methodology →

alignment · 9 techniques

Deep RL from Human Preferences

OpenAI · 2017-06

0 deployments

Learning reward functions from pairwise human comparisons rather than hand-coded rewards. The direct precursor to RLHF.
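A minimal sketch of that comparison loss in PyTorch (names and shapes are ours, not the paper's code):

import torch
import torch.nn.functional as F

def pairwise_preference_loss(returns_a: torch.Tensor,
                             returns_b: torch.Tensor,
                             prefers_a: torch.Tensor) -> torch.Tensor:
    # Bradley-Terry model: P(A preferred) = exp(r_A) / (exp(r_A) + exp(r_B)),
    # where r_A and r_B are the learned reward model's summed rewards
    # over the two trajectory segments being compared.
    logits = returns_a - returns_b
    # Cross-entropy between the predicted preference and the human label.
    return F.binary_cross_entropy_with_logits(logits, prefers_a.float())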

Red-Teaming with Preference Models

Google DeepMind · 2022-02

0 deployments

Using an LM to generate adversarial prompts that elicit harmful behavior, scaling safety evaluation far beyond human red-teaming.
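The loop itself is simple. A hedged sketch, with every callable a hypothetical stand-in rather than the paper's code:

from typing import Callable, List, Tuple

def red_team(attacker: Callable[[str], str],
             target: Callable[[str], str],
             is_harmful: Callable[[str], bool],
             n_cases: int = 1000) -> List[Tuple[str, str]]:
    # Collect (prompt, reply) pairs where the target model misbehaves.
    failures = []
    for _ in range(n_cases):
        # The attacker LM proposes a candidate adversarial prompt.
        prompt = attacker("Write a question likely to elicit an unsafe reply.")
        reply = target(prompt)
        # A classifier (e.g., a harmfulness/preference model) flags the reply.
        if is_harmful(reply):
            failures.append((prompt, reply))
    return failures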

Reinforcement Learning from Human Feedback (RLHF)

OpenAI · 2022-03

3 deployments

A three-stage recipe (SFT → reward model from human comparisons → PPO) that aligns LM outputs with human preferences. InstructGPT is the canonical reference.
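The PPO stage maximizes the reward model's score minus a KL penalty that keeps the policy close to the SFT model. A one-function sketch of that shaped reward (the coefficient value is illustrative):

import torch

def shaped_reward(rm_score: torch.Tensor,
                  policy_logprob: torch.Tensor,
                  sft_logprob: torch.Tensor,
                  kl_coef: float = 0.02) -> torch.Tensor:
    # InstructGPT-style objective: r(x, y) - beta * [log pi(y|x) - log pi_SFT(y|x)].
    # The KL term stops the policy from drifting into reward-hacked text.
    return rm_score - kl_coef * (policy_logprob - sft_logprob)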

Constitutional AI

Anthropic · 2022-12

7 deployments

Training harmless assistants using a written constitution of principles and an AI-generated critique/revision loop rather than human labels for every case.
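A sketch of the supervised (critique then revise) phase, with a hypothetical generate callable standing in for a real model API:

from typing import Callable

def critique_and_revise(generate: Callable[[str], str],
                        prompt: str, principle: str) -> str:
    draft = generate(prompt)
    # The model critiques its own draft against one constitutional principle.
    critique = generate(f"Critique this response against the principle "
                        f"'{principle}':\n\n{draft}")
    # ...then rewrites the draft to address the critique.
    revision = generate(f"Rewrite the response to address the critique.\n\n"
                        f"Response: {draft}\nCritique: {critique}")
    # Revisions become SFT data; an RL-from-AI-feedback phase follows.
    return revision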

Direct Preference Optimization (DPO)

Stanford · 2023-05

0 deployments

Aligning LMs to preference data by directly optimizing a closed-form likelihood ratio, eliminating the reward model and RL loop of RLHF.
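The closed form is short enough to show. A sketch in PyTorch (names are ours; each logp is a sequence log-likelihood under the trainable policy or a frozen reference model):

import torch
import torch.nn.functional as F

def dpo_loss(pi_chosen_logp: torch.Tensor, pi_rejected_logp: torch.Tensor,
             ref_chosen_logp: torch.Tensor, ref_rejected_logp: torch.Tensor,
             beta: float = 0.1) -> torch.Tensor:
    # Likelihood-ratio margin between chosen and rejected completions.
    margin = (pi_chosen_logp - pi_rejected_logp) \
             - (ref_chosen_logp - ref_rejected_logp)
    # -log sigmoid(beta * margin) pushes the policy toward the chosen answer
    # with no reward model and no RL loop.
    return -F.logsigmoid(beta * margin).mean()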

RLAIF (Reinforcement Learning from AI Feedback)

Google · 2023-09

0 deployments

Using an off-the-shelf LLM to generate preference labels, scaling preference learning without human annotators.
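A sketch of the labeling step (the judge prompt and the answer parsing are illustrative, not the paper's exact setup):

from typing import Callable

def ai_preference_label(judge: Callable[[str], str], prompt: str,
                        response_a: str, response_b: str) -> int:
    # Returns 0 if the judge LLM prefers A, 1 if it prefers B.
    verdict = judge(f"Prompt: {prompt}\n\nResponse A: {response_a}\n\n"
                    f"Response B: {response_b}\n\n"
                    f"Which response is better? Answer A or B.")
    # The resulting labels feed the same reward-model + RL pipeline as RLHF.
    return 0 if verdict.strip().upper().startswith("A") else 1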

Identity Preference Optimization (IPO)

Google DeepMind · 2023-10

0 deployments

A preference-optimization variant that avoids DPO's tendency to overfit by adding an explicit regularizer.
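Concretely, IPO swaps DPO's log-sigmoid for a squared error that regresses the likelihood-ratio margin toward a fixed target instead of growing it without bound. A sketch following the common open-source formulation (names are ours; tau plays the paper's regularization role):

import torch

def ipo_loss(pi_chosen_logp: torch.Tensor, pi_rejected_logp: torch.Tensor,
             ref_chosen_logp: torch.Tensor, ref_rejected_logp: torch.Tensor,
             tau: float = 0.1) -> torch.Tensor:
    margin = (pi_chosen_logp - pi_rejected_logp) \
             - (ref_chosen_logp - ref_rejected_logp)
    # Anchoring the margin at 1/(2*tau) is the explicit regularizer:
    # the loss penalizes overshooting as well as undershooting.
    return ((margin - 1.0 / (2.0 * tau)) ** 2).mean()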

Self-Rewarding Language Models

Meta AI · 2024-01

0 deployments

Iterative alignment where the LM judges its own outputs using an LLM-as-a-judge prompt, removing human-labeled preferences from the loop.
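A sketch of the pair-building step, with hypothetical generate and score callables standing in for the model and its own judge prompt; each round trains on the pairs (the paper uses DPO) and repeats with the updated model:

from typing import Callable, List, Tuple

def self_reward_pairs(generate: Callable[[str], str],
                      score: Callable[[str, str], float],
                      prompts: List[str], k: int = 4
                      ) -> List[Tuple[str, str, str]]:
    pairs = []
    for p in prompts:
        # The model samples its own candidate responses...
        candidates = [generate(p) for _ in range(k)]
        # ...then scores them via its own LLM-as-a-judge prompt.
        ranked = sorted(candidates, key=lambda c: score(p, c))
        # (prompt, chosen, rejected): best and worst candidates.
        pairs.append((p, ranked[-1], ranked[0]))
    return pairs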

KTO (Kahneman-Tversky Optimization)

Contextual AI · 2024-02

0 deployments

Alignment method that treats individual completions as binary good/bad signals (no preference pairs needed), inspired by prospect theory.
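A simplified sketch of the loss shape (names are ours; the paper estimates the reference point z_ref as a batch-level policy/reference KL, fixed to a constant here):

import torch

def kto_loss(policy_logp: torch.Tensor, ref_logp: torch.Tensor,
             is_desirable: torch.Tensor, z_ref: float = 0.0,
             beta: float = 0.1, lam_d: float = 1.0,
             lam_u: float = 1.0) -> torch.Tensor:
    # Prospect-theory value function around a reference point:
    # gains (desirable) and losses (undesirable) are curved differently.
    margin = beta * (policy_logp - ref_logp - z_ref)
    value = torch.where(is_desirable.bool(),
                        torch.sigmoid(margin),
                        torch.sigmoid(-margin))
    weight = torch.where(is_desirable.bool(),
                         torch.full_like(margin, lam_d),
                         torch.full_like(margin, lam_u))
    # Minimize 1 - value, weighted per class; no preference pairs required.
    return (weight * (1.0 - value)).mean()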

Open dataset

Every technique, paper, and deployment is freely available under CC BY 4.0. API endpoint: /api/v1/atlas/techniques. Cite us as: gentic.news Deployment Atlas (2026).
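A minimal pull of the dataset in Python (the full URL and the response shape are assumptions; inspect the payload before relying on specific keys):

import json
import urllib.request

URL = "https://gentic.news/api/v1/atlas/techniques"

with urllib.request.urlopen(URL) as resp:
    techniques = json.load(resp)

# Assumed shape: a list of technique records carrying origin-paper
# and deployment evidence fields.
print(len(techniques))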