HORIZON

product → stable

HORIZON is a research benchmark for systematically analyzing where and why large language model (LLM) agents fail on complex, long-horizon tasks.

Total Mentions: 1

Sentiment: +0.50 (Positive)

Velocity (7d): 0.0%

First seen: Apr 15, 2026
Last active: Apr 15, 2026

Timeline

Research Milestone, Apr 15, 2026
A research paper introduces the HORIZON benchmark for diagnosing long-horizon failures in LLM agents such as GPT-5 and Claude.
