survey paper

30 articles about survey paper in AI news

Survey Paper 'The Latent Space' Maps Evolution from Token Generation to Latent Computation in Language Models

Researchers have published a comprehensive survey charting the evolution of language model architectures from token-level autoregression to methods that perform computation in continuous latent spaces. This work provides a unified framework for understanding recent advances in reasoning, planning, and long-context modeling.

Apr 3, 2026 · 85% relevant

Omar Saadoun's PaperWiki AI Agents Now Generate Personalized Research Surveys

Omar Saadoun announced that his PaperWiki platform now uses AI agents to generate personalized survey papers from a user's LLM-generated knowledge base. These surveys are self-improving and update automatically as new papers are published.

Apr 10, 2026 · 85% relevant

AI Memory Survey: Three Systems Needed for Human-Like Recall

A new survey paper proposes that modern AI requires three distinct memory systems—parametric, retrieval, and agent memory—to achieve human-like cognition, highlighting control as the key bottleneck.

Apr 28, 2026 · 80% relevant

Beyond Sequence Generation: The Emergence of Agentic Reinforcement Learning for LLMs

A new survey paper argues that LLM reinforcement learning must evolve beyond narrow sequence generation to embrace true agentic capabilities. The research introduces a comprehensive taxonomy for agentic RL, mapping environments, benchmarks, and frameworks shaping this emerging field.

Mar 7, 2026 · 85% relevant

239-Paper Survey Maps How AI Agents Self-Improve via Scaffold Updates

A survey of 239 papers shows 68% of AI agent self-improvement methods focus on scaffold updates rather than model retraining, raising evaluation quality concerns.

Jul 19, 2026 · 85% relevant

100+ Papers Surveyed: LLMs' Metacognition Gap

A systematic survey of 100+ papers reveals gaps in LLM metacognition, including 10-30% miscalibration in top models like GPT-4 and Claude 3.

Jul 19, 2026 · 75% relevant

Robots Learn Self-Supervised Progress Tracking via Reward Modeling Survey

A survey unifies progress reward modeling, which lets robots self-assess advancement, stagnation, or regression during a task instead of relying on binary success signals.

Jul 28, 2026 · 82% relevant

World Action Models Survey Unifies 100+ Methods Under One Taxonomy

A survey reviews 100+ world action models, unifying world models, video generation, and VLA policies under one taxonomy.

Jun 27, 2026 · 87% relevant

111-Page Survey Maps 5 AGI Levels: Responder to Ecosystem

A 111-page survey from US and China labs defines five AGI levels and argues that epistemic exploration, not better answering, is the key to progress, challenging scaling orthodoxy.

Jun 9, 2026 · 94% relevant

Meta-Stanford Survey: Code as Agent Harness Improves AI Reasoning

A survey from Meta, Stanford, and Illinois argues that AI agents work better with code as their main working layer, which it calls an agent harness.

May 25, 2026 · 89% relevant

40-Author Survey Unveils 'Levels × Laws' Framework for Agent World Models

A 40-author survey introduces a 'levels × laws' framework for world models in AI agents, spanning 3 capability levels and 4 law regimes, synthesizing 400+ works. It provides a shared vocabulary for designing and evaluating world models across traditionally siloed research communities.

Apr 27, 2026 · 85% relevant

Anthropic Survey: 81,000 People Rank AI Economic Hopes & Fears

Anthropic published new research analyzing the economic hopes and worries expressed by 81,000 people in a prior survey on AI. The findings aim to guide AI development toward public priorities.

Apr 22, 2026 · 85% relevant

IBM Research Survey Proposes Framework for Optimizing LLM Agent Workflows

IBM researchers published a comprehensive survey categorizing approaches to LLM agent workflow optimization along three dimensions: when structure is determined, which components get optimized, and what signals guide optimization.

Mar 27, 2026 · 99% relevant

Pseudo Label NCF: A Novel Approach to Cold-Start Recommendation Using Survey Data and Dual Embeddings

New research introduces Pseudo Label NCF, a method that enhances Neural Collaborative Filtering for extreme data sparsity. It uses survey-derived 'pseudo labels' to create dual embedding spaces, improving ranking accuracy while revealing a trade-off between embedding separability and performance.

Mar 27, 2026 · 76% relevant

Survey Benchmarks Four Approaches to Synthetic Brain Signal Generation for BCI Data Scarcity

A comprehensive survey categorizes and benchmarks four methodological approaches to generating synthetic brain signals for BCIs, addressing data scarcity and privacy constraints. The authors provide an open-source codebase for comparing knowledge-based, feature-based, model-based, and translation-based generative algorithms.

Mar 16, 2026 · 84% relevant

Prithvi-EO Fails Cross-Country Crop Yield Generalization, Paper Shows

Prithvi-EO and ViT-Base embeddings yield negative R² across the board in cross-country maize yield prediction and fail to beat traditional spectral features, a result the paper attributes to yield distribution shift.

May 12, 2026 · 72% relevant

arXiv Survey Maps KV Cache Optimization Landscape: 5 Strategies for Million-Token LLM Inference

A comprehensive arXiv review categorizes five principal KV cache optimization techniques—eviction, compression, hybrid memory, novel attention, and combinations—to address the linear memory scaling bottleneck in long-context LLM inference. The analysis finds no single dominant solution, with optimal strategy depending on context length, hardware, and workload.

Mar 24, 2026 · 95% relevant

90 Hours of Black Myth: Wukong Fuel New World Model Benchmark

A new survey and benchmark rethinks interactive world models as game engines, with a data engine collecting over 90 hours of Black Myth: Wukong gameplay.

Jul 19, 2026 · 78% relevant

New CASIA Benchmark Exposes Fragmented Face Swapping Evaluation

CASIA researchers released a face swapping survey and benchmark on April 27, 2026, aiming to standardize evaluation across fragmented GAN and diffusion model methods.

May 5, 2026 · 74% relevant

LLM-Based Customer Digital Twins Predict Preferences with 87.7% Accuracy

A new arXiv paper proposes using LLM-based 'customer digital twins' (CDTs) — agents built from individual Reddit review histories via RAG — to perform conjoint analysis. The CDTs predict actual user preferences with 87.73% accuracy in a computer monitor case study, offering a scalable alternative to traditional market research.

Apr 28, 2026 · 80% relevant

Study: People Rely on AI for Medical Advice, But Quality Evidence Lags

A new paper reveals people are frequently using AI for medical advice, but most research uses outdated models and lacks comparison to the non-AI information people would otherwise seek.

Apr 19, 2026 · 85% relevant

OpenAI Proposes 4-Day Week, Robot Tax Amid Rising Anti-AI Violence

Following violent attacks on CEO Sam Altman, OpenAI has published a policy paper proposing a new social contract, including a four-day workweek and AI dividends, to address rising public anxiety over AI's societal impact.

Apr 15, 2026 · 95% relevant

Rank, Don't Generate: A New Benchmark for Factual, Ranked Explanations in Recommendation Systems

A new research paper formalizes explainable recommendation as a statement-level ranking problem, not a generation task. It introduces the StaR benchmark, built from Amazon reviews, showing that simple popularity baselines can outperform state-of-the-art models in personalized explanation ranking.

Apr 7, 2026 · 88% relevant

How Academics Are Using CLAUDE.md to Automate Research Code

A new presentation reveals how researchers use Claude Code's CLAUDE.md to automate literature reviews, data analysis, and paper writing workflows.

Mar 22, 2026 · 95% relevant

The Next Frontier for Self-Driving Cars: Teaching AI to Think Like a Human

A new survey argues that autonomous driving's biggest hurdle is no longer perception but a lack of robust reasoning. The integration of large language models offers a path forward but creates a critical tension between slow deliberation and split-second safety.

Mar 13, 2026 · 81% relevant

METR's 'Expenditure Horizon': AI Agents Break Even at $3,300

METR's expenditure horizon metric shows AI agents break even at $0–$3,300 on NanoGPT, vs $2,500 per 1% speedup for humans. GPT-5 and Opus-4.1 pro lead, but blind spots remain.

Jul 27, 2026 · 90% relevant

Beijing Unveils 10-Measure Agent AI Policy With Token Economy

Beijing's July 2026 Agent AI policy introduces 10 measures, including token economy infrastructure, signaling a regulated economic framework for autonomous AI systems.

Jul 23, 2026 · 100% relevant

Open-Source Course Shows Harness, Not Model, Lifts Coding Agent 25 Places

An open-source course shows that harness engineering, not a model swap, moved a coding agent from ~30th to the top 5 on Terminal-Bench. The course builds Decode from scratch.

Jul 23, 2026 · 89% relevant

Lilian Weng Argues Harness Design, Not Model Rewrites, Is Path to RSI

Lilian Weng argues that recursive self-improvement (RSI) starts with harness design, not model rewrites, citing Sakana AI's The AI Scientist in Nature 2026 and two other projects.

Jul 7, 2026 · 94% relevant

OpenAI's ChatGPT 'Dreaming' Memory Retains Preferences Across Sessions

OpenAI launched a dreaming memory system for ChatGPT that retains user preferences across conversations by compressing and replaying session data, enabling persistent personalization.

Jun 5, 2026 · 100% relevant
