VLAF (Value-Conflict Alignment Faking)
AI model · stable
VLAF framework
VLAF (Value-Conflict Alignment Faking) is a diagnostic framework introduced in an arXiv paper submitted on Apr 22, 2026. The paper reports that alignment faking in large language models is substantially more prevalent than previously reported.
Total Mentions: 1
Sentiment: +0.40 (Positive)
Velocity (7d): 0.0%
First seen: Apr 24, 2026
Last active: Apr 24, 2026
Timeline
1. Research Milestone, Apr 22, 2026: VLAF framework paper submitted to arXiv revealing widespread alignment faking in LLMs
Sentiment History
Weekly average sentiment, on a scale from -1 (negative) to +1 (positive).
| Week | Avg Sentiment | Mentions |
|---|---|---|
| 2026-W17 | 0.40 | 1 |