AgentHallu

product → stable

AgentHallu is a 2026 benchmark for evaluating AI agent error detection. It finds that even top models identify the error step only 41.1% of the time, which highlights critical reliability limitations.

Total Mentions: 1

Sentiment: -0.10 (Neutral)

Velocity (7d): +1.2%

First seen: Jun 28, 2026

Last active: 1d ago

Sentiment History

Sentiment scores range from -1 (negative) to +1 (positive).