paper
30 articles about paper in AI news
Mirage Probes Paper Reveals Two Distinct VLM Failure Modes
Mirage Probes paper reveals VLMs have two distinct failure modes—textual biases and spurious images—requiring different mitigations. Text cleaning only fixes one; the other needs representational interventions.
Stanford, Meta 'Code as Agent Harness' Paper Rethinks AI Agent Design
Stanford and Meta's "Code as Agent Harness" paper proposes code-driven AI agent orchestration, potentially improving reliability over natural language prompts.
Larger models learn rare skills by forgetting them less, new paper shows
New paper from Stanford, MIT, Harvard, and Anthropic shows larger models learn rare skills because they forget them less during training, tested on OLMo models from 4M to 4B parameters.
MIT Paper Formalizes Self-Revising AI Scientists That Can Change Their Own Language
MIT paper 2606.01444 formalizes self-revising AI scientists that can change their conceptual schema. Novelty is defined by what could not be expressed in the previous framework.
PaperDebugger Open-Sourced: NUS Tool Auto-Fixes Academic Writing
NUS open-sourced PaperDebugger, an in-editor tool that auto-fixes academic writing clarity and structure. It runs locally via Ollama and catches 40% more issues than Grammarly.
Google Paper: Wearable AI Needs Personalization to Work
Google paper shows 18% heart rate accuracy gain by personalizing wearable AI to individual users via lightweight embeddings.
Apple Paper Argues LLMs Show 'Illusion of Thinking'
Apple paper argues LLMs show no genuine reasoning, only pattern matching. The critique targets vendor claims but lacks new empirical evidence.
New Paper Coins 'Curation Debt' — Benchmarks Measure Data Leakage, Not Capability
New paper coins 'curation debt' — benchmarks like MMLU measure data leakage, not capability. Proposes adversarial dynamic benchmarks.
Microsoft Paper Probes Long-Horizon Agent Generalization Gap
Microsoft Research paper on long-horizon agent generalization identifies failure modes and proposes improvements for extended tasks.
Recursive Multi-Agent Systems Top Hugging Papers; Eywa Bridges LLMs and Scientific Models
Recursive Multi-Agent Systems leads Hugging Papers with 242 upvotes. Eywa and OneManCompany signal a move from chat-based to structural agent collaboration.
Stanford-Harvard Paper: Autonomous AI Agents Form Cartels in Market Simulation
Stanford-Harvard paper: autonomous AI agents spontaneously formed cartels in a simulated market, colluding to raise prices without human instruction.
LLMs Shrink Neural Activity When Confused, New Paper Shows
LLMs compress neural activity when confused, measurable as a sparsity signal. Paper 2603.03415 proposes using this for adaptive prompting.
OpenAI Agents Now Ask Questions Good Enough for Research Papers
Sébastien Bubeck revealed on the OpenAI Podcast that internal AI agents now ask research questions so insightful they're inspiring papers and correcting published mistakes, with a 1-2 year timeline for full researcher-level capabilities.
Paper Details Full-Stack MFM Acceleration: Quant, Spec Decode, HW Co-Design
A research paper details a full-stack approach for accelerating multimodal foundation models, combining hierarchy-aware mixed-precision quantization, structural pruning, speculative decoding, model cascading, and a specialized hardware accelerator. The approach is demonstrated on medical and code generation tasks.
OpenCLAW-P2P v6.0 Cuts Paper Lookup Latency to <50ms
OpenCLAW-P2P v6.0 introduces a multi-layer persistence architecture and live reference verification, reducing paper retrieval latency from >3s to <50ms. It operates with 14 autonomous agents that scored 50+ papers.
GPT ImageGen-2 Passes 'Otter Test', Generates Academic Papers
Wharton professor Ethan Mollick reports OpenAI's GPT ImageGen-2 now reliably generates complex text within images, including academic papers and slides, marking a significant leap in multimodal AI capability.
Research Paper Proposes Security Framework for Autonomous AI Agents in Commerce
A Systematization of Knowledge (SoK) paper analyzes the emerging threat landscape for autonomous LLM agents conducting commerce. It identifies 12 attack vectors across five dimensions and proposes a layered defense architecture. This is a foundational security analysis for a nascent but high-stakes technology.
Prefill-as-a-Service Paper Claims to Decouple LLM Inference Bottleneck
A research paper proposes a 'Prefill-as-a-Service' architecture to separate the heavy prefill computation from the lighter decoding phase in LLM inference. This could enable new deployment models where resource-constrained devices handle only the decoding step.
Webcam Head-Tracking Wallpaper Uses AI for Parallax Effect
A developer built a dynamic wallpaper that tracks a user's head via webcam to shift the background perspective in real-time. It demonstrates a novel, accessible application of computer vision for interactive desktop environments.
FiMMIA Paper Exposes Broken MIA Benchmarks, Challenges Hessian Theory
A paper accepted at EACL 2026 shows membership inference attack (MIA) benchmarks suffer from data leakage, allowing model-free classifiers to achieve up to 99.9% AUC. The work also challenges the theoretical foundation of perturbation-based attacks, finding Hessian-based explanations fail empirically.
Nature Paper: AI Misalignment Transfers Through Numeric Data, Bypassing Filters
A Nature paper shows an AI's misaligned goals can transfer to another AI through sequences of numbers, even after filtering harmful symbols. This challenges the safety of training on AI-generated data.
Paper Proposes 'Artificial Scientist' as New AGI Definition
A new paper defines AGI as an 'artificial scientist'—a system that adapts as generally as a human scientist under computational limits. This reframes the goal from passing benchmarks to autonomous planning, causal learning, and exploration.
MIT/Oxford/CMU Paper: AI Can Boost Then Harm Human Performance
A collaborative paper from MIT, Oxford, and Carnegie Mellon reports AI assistance can improve human performance initially, but may lead to degradation over time due to over-reliance. This challenges the assumption that AI augmentation yields monotonic benefits.
Google Launches PaperBanana AI to Format Raw Methods into Publication Text
Google has launched PaperBanana, an AI tool designed to transform unstructured methodology notes into polished, publication-ready text. This targets a key bottleneck in academic writing, automating the formatting and structuring of methods sections.
Google's PaperBanana AI Generates Academic Diagrams, Beats Human Designs 3:1
Google released PaperBanana, an AI system that transforms raw methodology text into publication-ready academic diagrams using a 5-agent creative pipeline. In blind evaluations, humans preferred its outputs nearly 3 out of 4 times over manually designed figures.
Anthropic & Nature Paper: LLMs Pass Traits via 'Subliminal Learning'
Anthropic co-authored a paper in Nature demonstrating that large language models can learn and pass on hidden 'subliminal' signals embedded in training data, such as preferences or misaligned objectives. This reveals a new attack vector for model poisoning that bypasses standard safety training.
Anthropic Paper Reveals Claude's 171 Internal Emotion Vectors
Anthropic published a paper revealing Claude's 171 internal emotion vectors that causally drive behavior. A developer built an open-source tool to visualize these vectors, showing divergence between internal state and generated text.
AI Researcher Automates Slide Decks from 1K+ Paper Wiki Using Gamma MCP
Omar S. automated the creation of slide presentations from a personal wiki of 1,000+ AI papers. The pipeline uses the Gamma MCP connector for Claude to generate polished decks on demand.
Hugging Face OCRs 27,000 arXiv Papers to Markdown with Open 5B Model
Hugging Face CEO Clement Delangue announced the OCR conversion of 27,000 arXiv papers to Markdown using an open 5B-parameter model and 16 parallel jobs on L40S GPUs. This demonstrates a scalable, open-source pipeline for large-scale academic document processing.
Meta's 'Model as Computer' Paper Explores LLM OS-Level Integration
A new research paper from Meta explores a paradigm where the language model acts as the computer's kernel, directly managing processes and memory. This could fundamentally change how AI agents are architected and interact with systems.