
AI Research
PRL-Bench: LLMs Score Below 50% on End-to-End Physics Research Tasks
Researchers introduced PRL-Bench, a benchmark built from 100 recent Physical Review Letters papers, testing LLMs on end-to-end physics research. Top m...
arxiv.org·22h ago·3 min read·Widely Reported
research·machine learning·ai agents

