KWBench

product→ stable

Knowledge Work Bench

KWBench is a 223-task benchmark developed by researchers to test whether large language models can identify, without being prompted, the underlying game-theoretic problem in professional scenarios.

Total Mentions: 1

Sentiment: +0.10 (Neutral)

Velocity (7d): 0.0%

First seen: Apr 20, 2026
Last active: Apr 20, 2026

Timeline

Research Milestone: Apr 17, 2026
Researchers introduced KWBench, a 223-task benchmark for testing LLMs' unprompted problem recognition in complex professional scenarios.
performance: 27.9% pass rate for best model
task count: 223
