SAGE
product → stable
Service Agent Graph-guided Evaluation
SAGE is a multi-agent benchmark for evaluating LLMs in customer service, introduced by researchers to test AI capabilities in complex, interactive scenarios.
Total Mentions: 1
Sentiment: +0.10 (Neutral)
Velocity (7d): 0.0%
First seen: Apr 13, 2026
Last active: Apr 13, 2026
Signal Radar
Five-axis snapshot of this entity's footprint
Mentions × Lab Attention
Weekly mentions (solid) and average article relevance (dotted)
Timeline
- Research Milestone (Apr 10, 2026)
SAGE benchmark published on arXiv, exposing an 'Execution Gap' where LLMs understand user intent but fail to follow correct procedures in customer service.
- models evaluated: 27
- domains: 6
Relationships
Uses: 1
Sentiment History
Range: -1 (negative sentiment) to +1 (positive sentiment)
| Week | Avg Sentiment | Mentions |
|---|---|---|
| 2026-W16 | 0.10 | 1 |