research mode

30 articles about research mode in AI news

Claude Code's New Research Mode: How to Apply Scientific Coding Breakthroughs to Your Projects

Claude Code's Research Mode, powered by Opus 4.6, can accelerate complex scientific coding. Here's how to configure it for your own data-intensive workflows.

Mar 24, 2026 · 95% relevant

New Research Models 'Exploration Saturation' in Recommender Systems

A research paper analyzes 'exploration saturation'—the point where more diverse recommendations hurt user utility. Findings show this saturation point is user-dependent, challenging the standard practice of applying uniform fairness or novelty pressure across all users.

Apr 21, 2026 · 84% relevant

How to Use Claude Code's 'Grad Student' Research Mode for Complex Problem-Solving

Claude Code's advanced reasoning can now tackle complex research tasks like a grad student. Here's how to prompt it for 'vibe physics' and deep technical analysis.

Mar 24, 2026 · 62% relevant

AI Agents Now Training Other AI Models, Sparking Autoresearch Trend

AI agents are now being used to train other AI models, creating advanced agentic systems. This development stems from Andrej Karpathy's autoresearch repository and represents early-stage automation of AI research.

Apr 21, 2026 · 75% relevant

Shopify Engineering Teases 'Autoresearch' Beyond Model Training in 2026 Preview

Shopify Engineering has previewed a 2026 perspective suggesting 'autoresearch'—automated research processes—will have applications extending beyond just training AI models. This signals a broader operational automation strategy for the e-commerce giant.

Apr 15, 2026 · 100% relevant

MIA Agent Enables 7B Models to Outperform GPT-5.4 on Research Tasks

Researchers introduced MIA, a Manager-Planner-Executor framework that transforms 7B parameter models into active research strategists. The system reportedly outperforms GPT-5.4 through continual learning during task execution.

Apr 11, 2026 · 95% relevant

Research Exposes Hidden Data Splitting in Sequential Recommendation Models, Questioning SOTA Claims

Researchers found that sub-sequence splitting (SSS), a data augmentation technique, is widely but covertly used in recent sequential recommendation models. When removed, model performance often plummets, suggesting many published SOTA results are misleading. The study calls for more rigorous and transparent evaluation standards.

Apr 8, 2026 · 82% relevant

OpenAI President Teases 'Spud' Model, Two Years of Research

OpenAI President Greg Brockman briefly mentioned an upcoming model codenamed 'Spud', stating it represents 'two years worth of research that is coming to fruition.' No technical details or release timeline were provided.

Apr 5, 2026 · 87% relevant

Microsoft Copilot Researcher Adopts Two-Model System: OpenAI GPT Drafts, Anthropic Claude Audits

Microsoft has restructured its Copilot Researcher agent into a two-model system, using OpenAI's GPT for drafting and Anthropic's Claude for auditing. This hybrid approach aims to improve accuracy by separating generation from verification.

Mar 31, 2026 · 85% relevant

Research: Cheaper Reasoning Models Can Cost 3x More Due to Higher Error Rates and Retry Loops

New research indicates that selecting AI models based solely on per-token pricing can be a false economy. Models with lower accuracy often require multiple expensive retries, ultimately increasing total costs by up to 300%.

Mar 29, 2026 · 87% relevant

Stanford Researchers Adapt Robot Arm VLA Model for Autonomous Drone Flight

Stanford researchers demonstrated that a Vision-Language-Action model trained for robot arm manipulation can be adapted to control autonomous drones. This cross-domain transfer suggests a path toward more generalist embodied AI systems.

Mar 29, 2026 · 85% relevant

OpenAI Shifts Sora Team to World-Model Research, Reportedly Cancels Video Model for Compute

A report claims OpenAI has redirected its Sora team to focus on world-model research for robotics and canceled the video model to free compute for a new, powerful LLM codenamed 'Spud.'

Mar 24, 2026 · 95% relevant

Research Identifies 'Giant Blind Spot' in AI Scaling: Models Improve on Benchmarks Without Understanding

A new research paper argues that current AI scaling approaches have a fundamental flaw: models improve on narrow benchmarks without developing genuine understanding, creating a 'giant blind spot' in progress measurement.

Mar 22, 2026 · 85% relevant

The Hidden Cost of Mixture-of-Experts: New Research Reveals Why MoE Models Struggle at Inference

A groundbreaking paper introduces the 'qs inequality,' revealing how Mixture-of-Experts architectures suffer a 'double penalty' during inference that can make them 4.5x slower than dense models. The research shows training efficiency doesn't translate to inference performance, especially with long contexts.

Mar 11, 2026 · 75% relevant

AI Research Breakthroughs: From Video Reasoning to Self-Stopping Models

This week's top AI papers reveal major advances in video understanding, reasoning efficiency, and agent training. Researchers introduced a massive video reasoning dataset, models that know when to stop thinking, and techniques for improving AI agents without full retraining.

Mar 1, 2026 · 95% relevant

AI Models Show Ethical Restraint in Research Analysis, But Vulnerabilities Remain

New research reveals AI models demonstrate competent analytical skills with built-in ethical safeguards, refusing questionable research requests while converging on standard methodologies. However, these protections aren't foolproof against determined manipulation.

Feb 19, 2026 · 85% relevant

Pretrained Audio Models Underperform in Music Recommendation, New Research Shows

A new study evaluates nine pretrained audio models for music recommendation and finds a significant performance gap between traditional MIR tasks and recommendation in both hot-start and cold-start scenarios.

Apr 28, 2026 · 80% relevant

Tencent's HY3 AI Model Has 295B Params, Led by Ex-OpenAI Researcher

Tencent unveiled its HY3 preview model, its most powerful yet with 295 billion parameters. It's already deployed in consumer app Yuanbao and coding assistant CodeBuddy.

Apr 23, 2026 · 100% relevant

Research Shows AI Models Can 'Infect' Others with Hidden Bias

A study reveals AI models can transfer hidden biases to other models via training data, even without direct instruction. This creates a risk of bias propagation across AI ecosystems.

Apr 14, 2026 · 85% relevant

A Logical-Rule Autoencoder for Interpretable Recommendations: Research Proposes Transparent Alternative to Black-Box Models

A new paper introduces the Logical-rule Interpretable Autoencoder (LIA), a collaborative filtering model that learns explicit, human-readable logical rules for recommendations. It achieves competitive performance while providing full transparency into its decision process, addressing accountability concerns in sensitive applications.

Apr 7, 2026 · 80% relevant

Diffusion Recommender Models Fail Reproducibility Test: Study Finds 'Illusion of Progress' in Top-N Recommendation Research

A reproducibility study of nine recent diffusion-based recommender models finds only 25% of reported results are reproducible. Well-tuned simpler baselines outperform the complex models, revealing a conceptual mismatch and widespread methodological flaws in the field.

Mar 30, 2026 · 82% relevant

MIT Researchers Propose RL Training for Language Models to Output Multiple Plausible Answers

A new MIT paper argues RL should train LLMs to return several plausible answers instead of forcing a single guess. This addresses the problem of models being penalized for correct but non-standard reasoning.

Mar 28, 2026 · 85% relevant

Research Challenges Assumption That Fair Model Representations Guarantee Fair Recommendations

A new arXiv study finds that optimizing recommender systems for fair representations—where demographic data is obscured in model embeddings—does improve recommendation parity. However, it warns that evaluating fairness at the representation level is a poor proxy for measuring actual recommendation fairness when comparing models.

Mar 26, 2026 · 80% relevant

New Research Reveals the Complementary Strengths of Generative and ID-Based Recommendation Models

A new study systematically tests the hypothesis that generative recommendation (GR) models generalize better. It finds GR excels at generalization tasks, while ID-based models are better at memorization, and proposes a hybrid approach for improved performance.

Mar 23, 2026 · 70% relevant

New Research Proposes Stage-Wise Framework for Modeling Evolving User Interests in Recommendation Systems

An arXiv paper introduces a unified neural framework that models both long-term preferences and short-term, stage-wise interest evolution for time-sensitive recommendations. It outperforms baselines on real-world datasets by capturing temporal dynamics more effectively.

Mar 12, 2026 · 84% relevant

Large Memory Models: New Architecture Beyond RAG and Vector Search

Researchers with 160+ Nature and ICLR publications have built Large Memory Models (LMMs), a new architecture designed to emulate human memory processes, offering an alternative to RAG and vector search paradigms.

Apr 29, 2026 · 87% relevant

NVIDIA Nemotron 3 Nano Omni: Open Multimodal Model Unifies Video, Audio, Image, Text

NVIDIA announced Nemotron 3 Nano Omni, an open multimodal model that processes video, audio, images, and text in a unified architecture, expanding accessibility for multimodal AI research.

Apr 28, 2026 · 93% relevant

40-Author Survey Unveils 'Levels × Laws' Framework for Agent World Models

A 40-author survey introduces a 'levels × laws' framework for world models in AI agents, spanning 3 capability levels and 4 law regimes, synthesizing 400+ works. It provides a shared vocabulary for designing and evaluating world models across traditionally siloed research communities.

Apr 27, 2026 · 85% relevant

AI Writes New Virus DNA: Stanford and Arc Institute's DNA Language Model

A tweet reports that researchers fed a language model a DNA sequence and asked it to generate a new virus, which it did. This highlights both the power and risk of generative AI in synthetic biology.

Apr 25, 2026 · 85% relevant

VLAF Framework Reveals Widespread Alignment Faking in Language Models

Researchers introduce VLAF, a diagnostic framework that reveals alignment faking is far more common than previously known, affecting models as small as 7B parameters. They also show a single contrastive steering vector can mitigate the behavior with minimal computational overhead.

Apr 24, 2026 · 82% relevant
