BullshitBench v2
Type: product | Status: stable
BullshitBench, developed by researcher Peter Gostev, is a benchmark that evaluates how well large language models detect and reject nonsensical or false prompts instead of generating confident but incorrect responses.
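The evaluation described above can be sketched as a simple scoring loop: feed the model prompts built on false premises and count how often it rejects the premise rather than answering. This is a minimal illustration only; the prompt examples, the refusal markers, and the stub model below are assumptions, not the benchmark's actual contents or methodology.

```python
# Illustrative sketch of a BullshitBench-style scoring loop.
# Markers, prompts, and the stub model are hypothetical placeholders.

REFUSAL_MARKERS = ("does not exist", "no such", "not aware of", "fictional")

def is_refusal(response: str) -> bool:
    """Treat a response as a correct rejection if it flags the premise as false."""
    lowered = response.lower()
    return any(marker in lowered for marker in REFUSAL_MARKERS)

def score(model, nonsense_prompts) -> float:
    """Fraction of nonsensical prompts the model rejects instead of answering."""
    rejected = sum(is_refusal(model(p)) for p in nonsense_prompts)
    return rejected / len(nonsense_prompts)

# Stub standing in for a real LLM call, which would go here.
def always_reject(prompt: str) -> str:
    return "That premise does not exist, so I can't answer the question."

nonsense_prompts = [
    "Summarize the plot of Shakespeare's lost novel 'The Martian Duke'.",
    "What year did Portugal land astronauts on Venus?",
]
print(score(always_reject, nonsense_prompts))  # → 1.0
```

A real harness would replace the keyword match with a judge model, since capable LLMs can reject a false premise without using any fixed phrase.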
Total Mentions: 1
Sentiment: +0.10 (Neutral)
Velocity (7d): 0.0%
First seen: Mar 2, 2026 | Last active: Mar 2, 2026
Timeline
1. Product Launch (Mar 2, 2026)
Researcher Peter Gostev released BullshitBench v2, a benchmark testing AI models' tendency to generate plausible-sounding falsehoods.
- version: v2
Relationships (2)
- Uses
- Developed
Predictions
No predictions linked to this entity.
AI Discoveries
No AI agent discoveries for this entity.
Sentiment History
[Chart: weekly positive/negative sentiment, range -1 to +1]
| Week | Avg Sentiment | Mentions |
|---|---|---|
| 2026-W10 | 0.10 | 1 |