gentic.news — AI News Intelligence Platform

research paper

30 articles about research paper in AI news

Google's AutoWrite AI Generates Research Papers from Scratch

Google published a paper detailing AutoWrite, an AI system that can generate complete research papers from scratch. This represents a significant step toward automating the scientific writing process.

75% relevant

New Research Paper Identifies Multi-Tool Coordination as Critical Failure Point for AI Agents

A new research paper posits that the primary failure mode for AI agents is not in calling individual tools, but in reliably coordinating sequences of many tools over extended tasks. This reframes the core challenge from single-step execution to multi-step orchestration and state management.

85% relevant

Research Paper Proposes Security Framework for Autonomous AI Agents in Commerce

A Systematization of Knowledge (SoK) paper analyzes the emerging threat landscape for autonomous LLM agents conducting commerce. It identifies 12 attack vectors across five dimensions and proposes a layered defense architecture. This is a foundational security analysis for a nascent but high-stakes technology.

100% relevant

HuggingFace Launches Daily Papers SKILL.md for AI Agents to Read, Search, and Fetch Research Papers

HuggingFace released Daily Papers SKILL.md, a tool enabling AI agents to read paper content as markdown, search papers, find linked models/datasets, and fetch papers via API.

85% relevant

Research Paper 'Can AI Agents Agree?' Finds LLM-Based Groups Fail at Simple Coordination

A new study demonstrates that groups of LLM-based AI agents cannot reliably reach consensus on simple decisions, with failure rates increasing with group size. This challenges the common developer assumption that multi-agent systems will naturally converge through discussion.

87% relevant

Prefill-as-a-Service Paper Claims to Decouple LLM Inference Bottleneck

A research paper proposes a 'Prefill-as-a-Service' architecture to separate the heavy prefill computation from the lighter decoding phase in LLM inference. This could enable new deployment models where resource-constrained devices handle only the decoding step.

85% relevant

Meta's 'Model as Computer' Paper Explores LLM OS-Level Integration

A new research paper from Meta explores a paradigm where the language model acts as the computer's kernel, directly managing processes and memory. This could fundamentally change how AI agents are architected and interact with systems.

89% relevant

Alibaba Paper Shows AI Moving Beyond Text, Echoing Pichai's Warnings

Alibaba has published a research paper illustrating AI's progression beyond pure text generation. The work serves as a concrete example of the accelerating, multi-modal capabilities that industry leaders like Google's Sundar Pichai have recently cautioned about.

75% relevant

Anthropic Paper: 'Emotion Concepts and their Function in LLMs' Published

Anthropic has released a new research paper titled 'Emotion Concepts and their Function in LLMs.' The work investigates the role and representation of emotional concepts within large language model architectures.

95% relevant

The Jagged Frontier Paper Finally Published: Documenting AI's Early Productivity Revolution

The landmark 2022 research paper that coined the term 'jagged frontier' and provided early experimental evidence of AI productivity gains has officially been published after a 2.5-year academic review process, validating foundational insights about AI's uneven capabilities.

85% relevant

OpenDev Paper Formalizes the Architecture for Next-Generation Terminal AI Coding Agents

A comprehensive 81-page research paper introduces OpenDev, a systematic framework for building terminal-based AI coding agents. The work details specialized model routing, dual-agent architectures, and safety controls that address reliability challenges in autonomous coding systems.

95% relevant

New Research Models 'Exploration Saturation' in Recommender Systems

A research paper analyzes 'exploration saturation'—the point where more diverse recommendations hurt user utility. Findings show this saturation point is user-dependent, challenging the standard practice of applying uniform fairness or novelty pressure across all users.

84% relevant

HUOZIIME: A Research Framework for On-Device LLM-Powered Input Methods

A new research paper introduces HUOZIIME, a personalized on-device input method powered by a lightweight LLM. It uses a hierarchical memory mechanism to capture user-specific input history, enabling privacy-preserving, real-time text generation tailored to individual writing styles.

76% relevant

Research Suggests Social Reasoning and Logical Thinking Improve AI Agent Team Collaboration

A research paper indicates that incorporating social reasoning and logical thinking capabilities into AI agent teams leads to more effective collaboration. The findings were highlighted in a tweet by AI researcher Rohan Paul.

87% relevant

Research Identifies 'Giant Blind Spot' in AI Scaling: Models Improve on Benchmarks Without Understanding

A new research paper argues that current AI scaling approaches have a fundamental flaw: models improve on narrow benchmarks without developing genuine understanding, creating a 'giant blind spot' in progress measurement.

85% relevant

Beyond Accuracy: Researchers Propose New Framework for Measuring AI Agent Reliability

A new research paper introduces 12 metrics to evaluate AI agent reliability across four dimensions: consistency, robustness, predictability, and safety. The study reveals that despite improving accuracy scores, today's agents remain fundamentally unreliable in practice.

72% relevant

LoopCTR: A New 'Loop Scaling' Paradigm for Efficient

A new research paper introduces LoopCTR, a method for scaling Transformer-based CTR models by recursively reusing shared layers during training. This 'train-multi-loop, infer-zero-loop' approach achieves state-of-the-art performance with lower deployment costs, directly addressing a core industrial constraint in recommendation systems.

92% relevant

A Reference Architecture for Agentic Hybrid Retrieval in Dataset Search

A new research paper presents a reference architecture for 'agentic hybrid retrieval' that orchestrates BM25, dense embeddings, and LLM agents to handle underspecified queries against sparse metadata. It introduces offline metadata augmentation and analyzes two architectural styles for quality attributes like governance and performance.

84% relevant
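
The blurb above does not say how the architecture merges BM25 and dense-embedding results. As an illustration only, the sketch below uses reciprocal rank fusion, a common way to combine ranked lists that is swapped in here and is not confirmed as the paper's method. The function name, the document IDs, and the constant k=60 are assumptions.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[List[str]], k: int = 60) -> List[str]:
    """Merge several ranked lists of document IDs into one ranking.

    Each document earns 1 / (k + rank) from every list it appears in,
    so items ranked well by more than one retriever rise to the top.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by ID for a stable order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical outputs from a lexical (BM25) and a dense retriever.
bm25_hits = ["d1", "d2", "d3"]
dense_hits = ["d3", "d1", "d4"]
fused = reciprocal_rank_fusion([bm25_hits, dense_hits])
```

Rank-based fusion avoids having to put BM25 scores and embedding similarities on a common scale, which is one reason it is a frequent default in hybrid search.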

IPCCF: A New Graph-Based Approach to Disentangle User Intent for Better

A new research paper introduces Intent Propagation Contrastive Collaborative Filtering (IPCCF), a method designed to improve recommendation systems by more accurately disentangling the underlying intents behind user-item interactions. It addresses limitations in existing methods by incorporating broader graph structure and using contrastive learning for direct supervision, showing superior performance in experiments.

84% relevant

TRACE: A Multi-Agent LLM Framework for Sustainable Tourism Recommendations

A new research paper introduces TRACE, a modular LLM-based framework for conversational travel recommendations. It uses specialized agents to elicit sustainability preferences and generate 'greener' alternatives through interactive explanations, aiming to reduce overtourism and carbon-intensive travel.

92% relevant

RoTE: A New Plug-and-Play Module to Sharpen Time-Aware Sequential

A new research paper introduces RoTE, a multi-level temporal embedding module for sequential recommenders. It explicitly models the time spans between user interactions, a factor often overlooked, leading to significant performance gains on standard benchmarks.

82% relevant

DUET: A New LLM-Based Recommender That Generates Paired User-Item Profiles

A new research paper introduces DUET, an interaction-aware profile generator for recommendation systems. Instead of using dense vectors or independent text descriptions, it jointly creates semantically consistent user and item profiles conditioned on their interaction history, optimizing them with reinforcement learning for better performance.

82% relevant

HARPO: A New Agentic Framework for Conversational Recommendation Aims to

A new research paper introduces HARPO, a hierarchical agentic reasoning framework for conversational recommender systems. It reframes recommendation as a structured decision-making process, directly optimizing for interpretable quality dimensions like relevance, diversity, and predicted satisfaction. The approach shows consistent improvements on recommendation-centric metrics across three datasets.

87% relevant

PRAGMA: Revolut's Foundation Model for Banking Event Sequences

A new research paper introduces PRAGMA, a family of foundation models designed specifically for multi-source banking event sequences. The models are trained with masked modeling on a large corpus of financial records to produce general-purpose embeddings that achieve strong performance on downstream tasks like fraud detection with minimal fine-tuning.

74% relevant

Rank, Don't Generate: A New Benchmark for Factual, Ranked Explanations in Recommendation Systems

A new research paper formalizes explainable recommendation as a statement-level ranking problem, not a generation task. It introduces the StaR benchmark, built from Amazon reviews, showing that simple popularity baselines can outperform state-of-the-art models in personalized explanation ranking.

88% relevant

Agent Psychometrics: New Framework Predicts Task-Level Success in Agentic Coding Benchmarks with 0.81 AUC

A new research paper introduces a framework using Item Response Theory and task features to predict success on individual agentic coding tasks, achieving 0.81 AUC. This enables benchmark designers to calibrate difficulty without expensive evaluations.

75% relevant
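
For readers unfamiliar with Item Response Theory, the sketch below shows the standard two-parameter logistic form, in which the chance of success depends on the gap between an agent's ability and a task's difficulty. The blurb does not state which IRT variant the paper uses, and the linear mapping from task features to difficulty is a hypothetical stand-in for whatever the authors fit.

```python
import math
from typing import Sequence


def p_success(ability: float, difficulty: float, discrimination: float = 1.0) -> float:
    """Two-parameter logistic IRT: probability that an agent solves a task."""
    return 1.0 / (1.0 + math.exp(-discrimination * (ability - difficulty)))


def difficulty_from_features(features: Sequence[float],
                             weights: Sequence[float],
                             bias: float = 0.0) -> float:
    """Hypothetical linear estimate of task difficulty from task features."""
    return bias + sum(w * x for w, x in zip(weights, features))


# Illustrative numbers only: two made-up task features and weights.
task_difficulty = difficulty_from_features([1.0, 2.0], [0.5, 0.25])
chance = p_success(ability=1.0, difficulty=task_difficulty)
```

With difficulty estimated from features alone, a benchmark designer could score a new task before running any agent on it, which is the cost saving the blurb describes.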

The Socratic Model: A Hierarchical AI Architecture That Delegates to Specialists

A new research paper proposes a 3B-parameter hierarchical AI system called the Socratic Model. Instead of one monolithic LLM, it uses a lightweight router to classify queries and delegate to specialized expert models, outperforming a generalist baseline on mixed math/logic tasks.

82% relevant
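
The route-then-delegate pattern described above can be sketched in a few lines. Everything here is hypothetical: the paper's router is a lightweight model, whereas this stand-in uses keyword rules, and the expert functions are placeholders that only tag the query.

```python
from typing import Callable, Dict

Expert = Callable[[str], str]

# Placeholder experts; a real system would call specialized models here.
def math_expert(query: str) -> str:
    return f"[math] {query}"

def logic_expert(query: str) -> str:
    return f"[logic] {query}"

def generalist(query: str) -> str:
    return f"[generalist] {query}"

EXPERTS: Dict[str, Expert] = {
    "math": math_expert,
    "logic": logic_expert,
    "general": generalist,
}

MATH_WORDS = {"solve", "sum", "integral", "equation"}
LOGIC_WORDS = {"if", "implies", "therefore", "all", "some"}


def route(query: str) -> str:
    """Keyword stand-in for the router: return the label of the expert to use."""
    cleaned = query.lower().replace("?", " ").replace(",", " ")
    tokens = set(cleaned.split())
    if tokens & MATH_WORDS or any(ch.isdigit() for ch in query):
        return "math"
    if tokens & LOGIC_WORDS:
        return "logic"
    return "general"


def answer(query: str) -> str:
    """Classify the query, then delegate to the chosen expert."""
    return EXPERTS[route(query)](query)
```

Keeping the router separate from the experts means each expert can be replaced or retrained without touching the others.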

LLM Multi-Agent Framework 'Shared Workspace' Proposed to Improve Complex Reasoning via Task Decomposition

A new research paper proposes a multi-agent framework where LLMs split complex reasoning tasks across specialized agents that collaborate via a shared workspace. This approach aims to overcome single-model limitations in planning and tool use.

85% relevant

flexvec: A New SQL Kernel for Programmable Vector Retrieval

A new research paper introduces flexvec, a retrieval kernel that exposes the embedding matrix and score array as a programmable surface via SQL. This enables complex query-time operations, which the paper calls Programmatic Embedding Modulation (PEM), and lets AI agents dynamically manipulate retrieval logic. The kernel achieves sub-100ms performance on million-scale corpora on a CPU.

76% relevant

GenRecEdit: A Model Editing Framework to Fix Cold-Start Collapse in Generative Recommenders

A new research paper proposes GenRecEdit, a training-free model editing framework for generative recommendation systems. It directly injects knowledge of cold-start items, improving their recommendation accuracy to near-original levels while using only ~9.5% of the compute time of a full retrain.

95% relevant