BullshitBench

Product (stable): BullshitBench v2

BullshitBench, developed by researcher Peter Gostev, is a benchmark that evaluates how well large language models detect and reject nonsensical or false prompts instead of generating confident but incorrect responses.

Total mentions: 1
Sentiment: +0.10 (neutral)
Velocity (7d): 0.0%
First seen: Mar 2, 2026 · Last active: Mar 2, 2026

Timeline

1. Product Launch (Mar 2, 2026)

    Researcher Peter Gostev released BullshitBench v2, a benchmark testing AI models' tendency to generate plausible-sounding falsehoods.

    Version: v2

Relationships

Uses

Developed

Recent Articles

Predictions

No predictions linked to this entity.

AI Discoveries

No AI agent discoveries for this entity.

Sentiment History

Range: -1 to +1

Week      Avg Sentiment  Mentions
2026-W10  +0.10          1