survey paper
30 articles about survey paper in AI news
Survey Paper 'The Latent Space' Maps Evolution from Token Generation to Latent Computation in Language Models
Researchers have published a comprehensive survey charting the evolution of language model architectures from token-level autoregression to methods that perform computation in continuous latent spaces. This work provides a unified framework for understanding recent advances in reasoning, planning, and long-context modeling.
Omar Saadoun's PaperWiki AI Agents Now Generate Personalized Research Surveys
Omar Saadoun announced that his PaperWiki platform now uses AI agents to generate personalized survey papers from a user's LLM-generated knowledge base. These surveys are self-improving and update automatically as new papers are published.
AI Memory Survey: Three Systems Needed for Human-Like Recall
A new survey paper proposes that modern AI requires three distinct memory systems—parametric, retrieval, and agent memory—to achieve human-like cognition, highlighting control as the key bottleneck.
Beyond Sequence Generation: The Emergence of Agentic Reinforcement Learning for LLMs
A new survey paper argues that LLM reinforcement learning must evolve beyond narrow sequence generation to embrace true agentic capabilities. The research introduces a comprehensive taxonomy for agentic RL, mapping environments, benchmarks, and frameworks shaping this emerging field.
111-Page Survey Maps 5 AGI Levels: Responder to Ecosystem
A 111-page survey from US and China labs defines 5 AGI levels and argues that epistemic exploration, not better answering, is key. It challenges scaling orthodoxy.
Meta-Stanford Survey: Code as Agent Harness Improves AI Reasoning
A survey from Meta, Stanford, and Illinois argues that AI agents work better with code as their main working layer, which the authors call an agent harness.
40-Author Survey Unveils 'Levels × Laws' Framework for Agent World Models
A 40-author survey introduces a 'levels × laws' framework for world models in AI agents, spanning 3 capability levels and 4 law regimes, synthesizing 400+ works. It provides a shared vocabulary for designing and evaluating world models across traditionally siloed research communities.
Anthropic Survey: 81,000 People Rank AI Economic Hopes & Fears
Anthropic published new research analyzing the economic hopes and worries expressed by 81,000 people in a prior survey on AI. The findings aim to guide AI development toward public priorities.
IBM Research Survey Proposes Framework for Optimizing LLM Agent Workflows
IBM researchers published a comprehensive survey categorizing approaches to LLM agent workflow optimization along three dimensions: when structure is determined, which components get optimized, and what signals guide optimization.
Pseudo Label NCF: A Novel Approach to Cold-Start Recommendation Using Survey Data and Dual Embeddings
New research introduces Pseudo Label NCF, a method that enhances Neural Collaborative Filtering for extreme data sparsity. It uses survey-derived 'pseudo labels' to create dual embedding spaces, improving ranking accuracy while revealing a trade-off between embedding separability and performance.
Survey Benchmarks Four Approaches to Synthetic Brain Signal Generation for BCI Data Scarcity
A comprehensive survey categorizes and benchmarks four methodological approaches to generating synthetic brain signals for BCIs, addressing data scarcity and privacy constraints. The authors provide an open-source codebase for comparing knowledge-based, feature-based, model-based, and translation-based generative algorithms.
Prithvi-EO Fails Cross-Country Crop Yield Generalization, Paper Shows
Prithvi-EO and ViT-Base embeddings yield universally negative R² under cross-country maize yield prediction, failing to beat traditional spectral features due to yield distribution shift.
arXiv Survey Maps KV Cache Optimization Landscape: 5 Strategies for Million-Token LLM Inference
A comprehensive arXiv review categorizes five principal KV cache optimization techniques—eviction, compression, hybrid memory, novel attention, and combinations—to address the linear memory scaling bottleneck in long-context LLM inference. The analysis finds no single dominant solution, with optimal strategy depending on context length, hardware, and workload.
New CASIA Benchmark Exposes Fragmented Face Swapping Evaluation
CASIA researchers released a face swapping survey and benchmark on April 27, 2026, aiming to standardize evaluation across fragmented GAN and diffusion model methods.
LLM-Based Customer Digital Twins Predict Preferences with 87.7% Accuracy
A new arXiv paper proposes using LLM-based 'customer digital twins' (CDTs) — agents built from individual Reddit review histories via RAG — to perform conjoint analysis. The CDTs predict actual user preferences with 87.73% accuracy in a computer monitor case study, offering a scalable alternative to traditional market research.
Study: People Rely on AI for Medical Advice, But Quality Evidence Lags
A new paper reveals people are frequently using AI for medical advice, but most research uses outdated models and lacks comparison to the non-AI information people would otherwise seek.
OpenAI Proposes 4-Day Week, Robot Tax Amid Rising Anti-AI Violence
Following violent attacks on CEO Sam Altman, OpenAI has published a policy paper proposing a new social contract, including a four-day workweek and AI dividends, to address rising public anxiety over AI's societal impact.
Rank, Don't Generate: A New Benchmark for Factual, Ranked Explanations in Recommendation Systems
A new research paper formalizes explainable recommendation as a statement-level ranking problem, not a generation task. It introduces the StaR benchmark, built from Amazon reviews, showing that simple popularity baselines can outperform state-of-the-art models in personalized explanation ranking.
How Academics Are Using CLAUDE.md to Automate Research Code
A new presentation reveals how researchers use Claude Code's CLAUDE.md to automate literature reviews, data analysis, and paper writing workflows.
The Next Frontier for Self-Driving Cars: Teaching AI to Think Like a Human
A new survey argues that autonomous driving's biggest hurdle is no longer perception but a lack of robust reasoning. The integration of large language models offers a path forward but creates a critical tension between slow deliberation and split-second safety.
OpenAI's ChatGPT 'Dreaming' Memory Retains Preferences Across Sessions
OpenAI launched a dreaming memory system for ChatGPT that retains user preferences across conversations by compressing and replaying session data, enabling persistent personalization.
Simple Graph Heuristic Beats Generative Recommenders on 10 of 14 Benchmarks
A no-training graph heuristic beats generative recommenders on 10 of 14 benchmarks, exposing shortcut-solvable datasets. Relative NDCG@10 gains hit 44% on Amazon CDs.
Two-Tower vs Vector DB + LLM: Which Wins for RecSys at Scale?
Two-tower models offer sub-10ms latency in cold-start scenarios; a vector DB + LLM pipeline provides richer semantics. Hybrid architectures reduce churn by 15-20%.
LLM Agents Will Reshape Personalization
Researchers propose that LLM-based assistants are reconfiguring how user representations are produced and exposed, requiring a shift toward inspectable, portable, and revisable user models across services. They identify five research fronts for the future of recommender systems.
Semantic Needles in Document Haystacks
Researchers developed a framework to test how LLMs score similarity between documents with subtle semantic changes. They found models exhibit positional bias, are sensitive to topical context, and produce unique scoring 'fingerprints'. This matters for any application relying on LLM-as-a-Judge for document comparison.
Study of 1,222 Users Claims ChatGPT Use Reduces Cognitive Effort
A viral social media post references a study of 1,222 people, claiming it proves ChatGPT use reduces cognitive effort. The claim lacks published methodology or data, highlighting the ongoing debate over AI's impact on human cognition.
AlphaEarth Embeddings Outperform Prithvi, Clay in Urban Signal Benchmark
Researchers benchmarked three geospatial foundation models—AlphaEarth, Prithvi, and Clay—on predicting 14 neighborhood-level urban indicators from satellite imagery. AlphaEarth's compact 64-dimensional embeddings proved most informative, achieving the highest predictive skill for built-environment-linked outcomes like chronic health burdens.
8 RAG Architectures Explained for AI Engineers: From Naive to Agentic Retrieval
A technical thread explains eight distinct RAG architectures with specific use cases, from basic vector similarity to complex agentic systems. This provides a practical framework for engineers choosing the right approach for different retrieval tasks.
The Agentic AI Reality Check: 88% Never Reach Production, Here's How to Spot the Fakes
A new analysis reveals widespread 'agent washing' in AI, with most systems labeled as agents being rebranded chatbots or automation scripts. The article provides a 5-point checklist to distinguish real, production-ready agents from marketing hype, crucial for retail leaders evaluating AI investments.
NVIDIA Releases NVPanoptix-3D on Hugging Face: Single-Image 3D Indoor Scene Reconstruction
NVIDIA has open-sourced NVPanoptix-3D, a model that reconstructs complete 3D indoor scenes—including panoptic segmentation, depth, and geometry—from a single RGB image in one forward pass.