Subgraph Atlas · centered on entity
Reinforcement Learning from Human Feedback (RLHF)
technique · 4 mentions · velocity: stable
A three-stage recipe (SFT → reward model from human comparisons → PPO) that aligns LM outputs with human preferences. InstructGPT is the canonical reference.
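The middle stage, a reward model trained from human comparisons, is commonly fit with a pairwise (Bradley-Terry style) loss: the model is penalized when it scores the human-preferred response below the rejected one. The sketch below shows only that loss on two scalar scores. The function and argument names are illustrative assumptions, not taken from any particular codebase.

```python
import math


def pairwise_preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Return -log(sigmoid(reward_chosen - reward_rejected)).

    Illustrative helper (hypothetical name): the loss is small when the
    preferred response receives the higher score and grows as the
    ordering is violated.
    """
    margin = reward_chosen - reward_rejected
    # -log(sigmoid(m)) equals log(1 + exp(-m)); branch on the sign of the
    # margin so exp() never receives a large positive argument.
    if margin > 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Equal scores mean the model is indifferent, so the loss is log(2).
indifferent = pairwise_preference_loss(0.0, 0.0)
# Agreeing with the human label costs less than disagreeing with it.
agrees = pairwise_preference_loss(2.0, 0.0)
disagrees = pairwise_preference_loss(0.0, 2.0)
```

In a full pipeline the two scores would come from a learned model evaluated on the same prompt with two different responses, and this loss would be averaged over a batch of comparisons.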
Two-hop subgraph: this entity, every entity it directly relates to, and every entity those neighbors relate to.
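The two-hop definition above can be sketched as a small graph routine. This is a minimal illustration, assuming relations are undirected and given as pairs of entity names, and that the subgraph keeps every edge whose endpoints both fall inside the selected node set. The function name and the sample edge list are hypothetical placeholders, not data from this atlas.

```python
from collections import defaultdict


def two_hop_subgraph(edges, center):
    """Return (nodes, edges) within two hops of `center`.

    Assumes `edges` is an iterable of (entity, entity) pairs and treats
    each relation as undirected.
    """
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)

    # Hop 1: everything the center relates to directly.
    first_hop = set(adjacency[center])
    nodes = {center} | first_hop

    # Hop 2: everything those neighbors relate to.
    for neighbor in first_hop:
        nodes |= adjacency[neighbor]

    # Keep only edges with both endpoints inside the selected node set.
    kept = {tuple(sorted((a, b))) for a, b in edges if a in nodes and b in nodes}
    return nodes, kept


# Placeholder graph: D is three hops from A, so it is excluded.
sample_edges = [("A", "B"), ("B", "C"), ("C", "D"), ("A", "E")]
nodes, kept = two_hop_subgraph(sample_edges, "A")
```

Re-centering on a neighbor amounts to calling the same routine with that neighbor as `center`.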
Top connections
OpenAI · company · 569 mentions
GPT-5.3 · ai model · 40 mentions
Claude Opus 4.7 · ai model · 33 mentions
GPT-5.2 Pro · ai model · 13 mentions
DeepSeek-R1 · ai model · 9 mentions
Constitutional AI · technique · 3 mentions
AI Development · research topic · 2 mentions
MAI-Thinking-1 · ai model · 1 mention