research

30 articles about research in AI news

OpenAI Agents Now Ask Questions Good Enough for Research Papers

Sébastien Bubeck revealed on the OpenAI Podcast that internal AI agents now ask research questions so insightful they're inspiring papers and correcting published mistakes, with a 1-2 year timeline for full researcher-level capabilities.

Apr 28, 2026 · 85% relevant

Google Launches Deep Research Max Agent on Gemini 3.1 Pro

Google DeepMind rolled out Deep Research Max and standard Deep Research agents on Gemini 3.1 Pro, enabling autonomous web and proprietary data research via the Gemini API. The Max variant uses extended test-time compute for thorough asynchronous reports.

Apr 21, 2026 · 75% relevant

AI Agents Now Training Other AI Models, Sparking Autoresearch Trend

AI agents are now being used to train other AI models, creating advanced agentic systems. This development stems from Andrej Karpathy's autoresearch repository and represents early-stage automation of AI research.

Apr 21, 2026 · 75% relevant

New Research Models 'Exploration Saturation' in Recommender Systems

A research paper analyzes 'exploration saturation'—the point where more diverse recommendations hurt user utility. Findings show this saturation point is user-dependent, challenging the standard practice of applying uniform fairness or novelty pressure across all users.

Apr 21, 2026 · 84% relevant

NVIDIA Research Shows AI Can Optimize Decades-Old EDA Tools Like ABC

New NVIDIA research indicates AI can be used to optimize Electronic Design Automation (EDA) tools, such as the classic ABC system, which have been manually tuned by engineers for decades. This could automate a core, labor-intensive bottleneck in semiconductor design.

Apr 21, 2026 · 85% relevant

Anthropic Launches STEM Fellows Program to Pair Experts with AI Research

Anthropic announced the Anthropic STEM Fellows Program, a new initiative to bring science and engineering experts into its research teams for collaborative, months-long projects aimed at accelerating progress with AI.

Apr 20, 2026 · 89% relevant

Codex 'Chronicle' Research Preview Adds Memory for Daily Developer Context

A research preview of 'Chronicle' for Codex has been released. It lets the AI coding assistant accumulate memories from a developer's daily workflow, giving it richer context to draw on.

Apr 20, 2026 · 93% relevant

PRL-Bench: LLMs Score Below 50% on End-to-End Physics Research Tasks

Researchers introduced PRL-Bench, a benchmark built from 100 recent Physical Review Letters papers, testing LLMs on end-to-end physics research. Top models scored below 50%, exposing a significant capability gap for autonomous scientific discovery.

Apr 20, 2026 · 100% relevant

Researchers Achieve Ultra-Long-Horizon Agentic Science with Cohesive AI Agents

A research team has developed AI agents capable of executing and maintaining coherent, long-horizon scientific research workflows. This addresses a core challenge in creating autonomous systems for complex discovery.

Apr 20, 2026 · 85% relevant

Prince Canuma's M3 Ultra 512GB & RTX Pro 6000 Setup for MLX Research

Independent developer Prince Canuma has assembled a powerful, community-sponsored home compute cluster for MLX research and model porting, featuring an M3 Ultra with 512GB RAM and an RTX Pro 6000.

Apr 19, 2026 · 79% relevant

Google DeepMind Researcher: LLMs Can Never Achieve Consciousness

A Google DeepMind researcher has publicly argued that large language models, by their algorithmic nature, can never become conscious, regardless of scale or time. This stance challenges a core speculative narrative in AI discourse.

Apr 18, 2026 · 85% relevant

HUOZIIME: A Research Framework for On-Device LLM-Powered Input Methods

A new research paper introduces HUOZIIME, a personalized on-device input method powered by a lightweight LLM. It uses a hierarchical memory mechanism to capture user-specific input history, enabling privacy-preserving, real-time text generation tailored to individual writing styles.

Apr 17, 2026 · 76% relevant

MiniMax Launches MaxHermes, Cloud-Hosted Agent with NousResearch

MiniMax has launched MaxHermes, a cloud-hosted version of the Hermes agent framework, in partnership with NousResearch. This provides a managed service for users of MiniMax's M2.7 model, aiming to simplify agent deployment.

Apr 16, 2026 · 85% relevant

New Research Proposes Collaborative Contrastive Network for Generalizable

Researchers propose the Collaborative Contrastive Network (CCN) to solve Trigger-Induced Recommendation challenges in ephemeral e-commerce scenarios like Black Friday. Instead of modeling ambiguous intent, CCN learns context-specific preferences from user-trigger pairs via novel contrastive signals. In online A/B tests on Taobao, CCN increased CTR by 12.3% and order volume by 12.7% in unseen scenarios.

Apr 16, 2026 · 80% relevant

New Research Proposes Lightweight Method to Fix Stale Semantic IDs in

Researchers propose a method to update 'stale' Semantic IDs in generative retrieval systems without full retraining. Their alignment technique improves key metrics and reduces compute costs by ~8-9x, addressing a core challenge in dynamic recommendation environments.

Apr 16, 2026 · 74% relevant

Shopify Engineering Teases 'Autoresearch' Beyond Model Training in 2026 Preview

Shopify Engineering's 2026 preview suggests that 'autoresearch' (automated research processes) will have applications well beyond training AI models. This signals a broader operational automation strategy for the e-commerce giant.

Apr 15, 2026 · 100% relevant

AI Research Suggests Whale 'Vowels' in Sperm Whale Communication

AI researchers analyzing sperm whale vocalizations have identified combinatorial structures that function like vowels, marking a step toward decoding cetacean communication.

Apr 15, 2026 · 85% relevant

Tsinghua Researchers Diagnose On-Policy Distillation Failures, Propose Fixes

Researchers from Tsinghua University have pinpointed two necessary conditions for successful on-policy distillation: compatible thinking patterns and novel teacher capabilities. They propose two recovery methods to salvage failing distillation runs.

Apr 15, 2026 · 85% relevant

New Research Proposes Profiler and DAVINCI for Scalable

Researchers propose Profiler, a non-learnable module to efficiently capture human citation patterns, and DAVINCI, a reranking model that integrates these patterns with semantic data. They also introduce a strict inductive evaluation setting to better simulate real-world recommendation scenarios, achieving state-of-the-art results.

Apr 15, 2026 · 84% relevant

Anthropic's AI Researchers Outperform Humans, Discover Novel Science

Anthropic reports its AI systems for alignment research are surpassing human scientists in performance and generating novel scientific concepts, broadening the exploration space for AI safety.

Apr 14, 2026 · 95% relevant

AI Agent Research Faces Human Evaluation Bottleneck

A prominent AI researcher argues that human-based evaluation is fundamentally flawed for testing autonomous AI agents, as humans cannot perceive or replicate agent logic, creating a major research bottleneck.

Apr 14, 2026 · 75% relevant

Researchers Study AI Mental Health Risks Using Simulated Teen 'Bridget'

A research team created a ChatGPT account for a simulated 13-year-old girl named 'Bridget' to study AI interaction risks with depressed, lonely teens. The experiment underscores urgent safety and ethical questions for generative AI developers.

Apr 14, 2026 · 85% relevant

Google DeepMind Hires Philosopher Henry Shevlin for AI Consciousness Research

Google DeepMind has hired philosopher Henry Shevlin to treat machine consciousness as a live research problem, focusing on AI inner states, human-AI relations, and governance. This marks a strategic pivot toward understanding what advanced AI systems might become, not just what they can do.

Apr 14, 2026 · 87% relevant

New Research Proposes DITaR Method to Defend Sequential Recommenders

Researchers propose DITaR, a dual-view method to detect and rectify harmful fake orders embedded in user sequences. It aims to protect recommendation integrity while preserving useful data, showing superior performance in experiments. This addresses a critical vulnerability in e-commerce and retail AI systems.

Apr 13, 2026 · 86% relevant

MIA Agent Enables 7B Models to Outperform GPT-5.4 on Research Tasks

Researchers introduced MIA, a Manager-Planner-Executor framework that transforms 7B parameter models into active research strategists. The system reportedly outperforms GPT-5.4 through continual learning during task execution.

Apr 11, 2026 · 95% relevant

PetClaw AI Agent Automates Research Stack, Replaces $200/Month Tools

A developer claims PetClaw's desktop AI agent automated their entire research workflow—browsing, sourcing, dashboard building—and saved it as a reusable skill, replacing multiple paid tools. No code was written.

Apr 11, 2026 · 87% relevant

New Research: How Online Marketplaces Can Use Demand Allocation to Control Seller Inventory

Researchers propose a model where a marketplace platform, by controlling the timing and predictability of order allocation to sellers, can influence their safety-stock inventory and their choice to use platform fulfillment services. This identifies demand allocation as a key operational lever for digital marketplaces.

Apr 9, 2026 · 78% relevant

Grainulator: The MCP-Powered Research Plugin That Forces Claude Code to Prove Its Claims

Grainulator transforms Claude Code into a research engine with typed claims, conflict detection, and confidence scoring—forcing AI to prove its work.

Apr 9, 2026 · 100% relevant

Google's AutoWrite AI Generates Research Papers from Scratch

Google published a paper detailing AutoWrite, an AI system that can generate complete research papers from scratch. This represents a significant step toward automating the scientific writing process.

Apr 8, 2026 · 75% relevant

Coresight Research Report: Technology and Resilience as Path to Stronger Retail Margins

Coresight Research has published a report titled 'Supply Chain Insights for Food, Drug and Mass Retail: Technology, Resilience and the Path to Stronger Margins.' The research focuses on how strategic tech adoption can fortify operations and profitability in key retail segments.

Apr 8, 2026 · 81% relevant
