SAGE

product → stable

Service Agent Graph-guided Evaluation

SAGE is a multi-agent benchmark for evaluating LLMs in complex, interactive customer service scenarios.

Total Mentions: 1

Sentiment: +0.10 (Neutral)

Velocity (7d): 0.0%

First seen: Apr 13, 2026
Last active: Apr 13, 2026

Timeline

Research Milestone (Apr 10, 2026)
SAGE benchmark published on arXiv, exposing an 'Execution Gap' where LLMs understand user intent but fail to follow correct procedures in customer service.
models evaluated: 27
domains: 6
