Reinforcement Learning from Human Feedback (RLHF)
Type: technique · Status: stable
A three-stage recipe (supervised fine-tuning (SFT) → reward model trained on human comparisons → PPO) that aligns language-model outputs with human preferences. InstructGPT is the canonical reference.
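The two learned objectives in that recipe are compact enough to sketch. Below is a minimal, illustrative PyTorch sketch of the stage-2 pairwise reward-model loss and the stage-3 PPO clipped policy loss; the function names, toy tensors, and hyperparameters are assumptions made for illustration, not the InstructGPT implementation.

```python
# Minimal sketch of the learned RLHF objectives (illustrative, not InstructGPT code).
import torch
import torch.nn.functional as F

# Stage 1 (SFT) is ordinary cross-entropy on demonstration data and is omitted here.

# Stage 2: reward model trained on human comparisons with a pairwise
# (Bradley-Terry) loss: the chosen completion should outscore the rejected one.
def reward_model_loss(score_chosen: torch.Tensor, score_rejected: torch.Tensor) -> torch.Tensor:
    # -log sigmoid(r_chosen - r_rejected), averaged over the batch
    return -F.logsigmoid(score_chosen - score_rejected).mean()

# Stage 3: PPO-style policy update against reward-model scores using the
# clipped surrogate objective. (InstructGPT additionally adds a per-token
# KL penalty against the SFT policy to the reward; omitted in this sketch.)
def ppo_policy_loss(logp_new: torch.Tensor,
                    logp_old: torch.Tensor,
                    advantages: torch.Tensor,
                    clip_eps: float = 0.2) -> torch.Tensor:
    ratio = torch.exp(logp_new - logp_old)          # importance ratio pi_new / pi_old
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1 - clip_eps, 1 + clip_eps) * advantages
    return -torch.min(unclipped, clipped).mean()    # maximize surrogate -> minimize its negative

if __name__ == "__main__":
    # Toy tensors standing in for per-sequence reward scores and log-probabilities.
    rm_loss = reward_model_loss(torch.tensor([1.2, 0.4]), torch.tensor([0.3, 0.9]))
    pg_loss = ppo_policy_loss(torch.tensor([-1.0, -2.0]),
                              torch.tensor([-1.1, -1.9]),
                              torch.tensor([0.5, -0.2]))
    print(f"reward-model loss: {rm_loss.item():.3f}  ppo loss: {pg_loss.item():.3f}")
```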
Total Mentions: 4
Sentiment: -0.07 (Neutral)
Velocity (7d): 0.0%
Signal Radar
Five-axis snapshot of this entity's footprint
Mentions × Lab Attention
Weekly mentions (solid) and average article relevance (dotted)
Timeline
No timeline events recorded yet.
Relationships
Invented By: 13
Uses
Introduces
Prior Art
Deploys
Recent Articles
No articles found for this entity.
Predictions
No predictions linked to this entity.
AI Discoveries
No AI agent discoveries for this entity.
Sentiment History
Weekly average sentiment (positive vs. negative), 2026-W13 through 2026-W15, on a -1 to +1 scale; values are tabulated below.
| Week | Avg Sentiment | Mentions |
|---|---|---|
| 2026-W13 | -0.10 | 2 |
| 2026-W15 | -0.20 | 1 |