Skip to content
gentic.news — AI News Intelligence Platform
Connecting to the Living Graph…

multi agent

30 articles about multi agent in AI news

SDAR: Self-Distilled RL Stabilizes Multi-Turn LLM Agents, +9.4% on ALFWorld

SDAR gates self-distillation within GRPO to stabilize multi-turn LLM agent training, yielding +9.4% on ALFWorld and gains on WebShop and Search-QA across Qwen2.5 and Qwen3 models.

85% relevant

Multi-Agent LLM Systems Fail to Outperform Single Models, Study Finds

New paper finds multi-agent LLM systems underperform single models by 2.3% on reasoning benchmarks, challenging a core assumption in AI engineering.

89% relevant

Recursive Multi-Agent Systems Top Hugging Papers; Eywa Bridges LLMs and Scientific Models

Recursive Multi-Agent Systems leads Hugging Papers with 242 upvotes. Eywa and OneManCompany signal a move from chat-based to structural agent collaboration.

89% relevant

Multi-User LLM Agents Struggle: Gemini 3 Pro Scores 85.6% on Muses-Bench

A new benchmark reveals LLMs struggle with multi-user scenarios where agents face conflicting instructions. Gemini 3 Pro leads but only achieves 85.6% average, with privacy-utility tradeoffs proving particularly difficult.

92% relevant

Alibaba's VulnSage Generates 146 Zero-Days via Multi-Agent Exploit Workflow

Alibaba researchers published VulnSage, a multi-agent LLM framework that generates functional software exploits. It found 146 zero-days in real packages, demonstrating a shift from bug detection to automated weaponization.

99% relevant

Cobl AI Launches Multi-Agent Platform for Business Document Generation

Cobl, a new startup, has launched a multi-agent AI platform designed to generate business documents like proposals and reports. It enters a competitive space dominated by established players like Notion AI and Microsoft Copilot.

97% relevant

New Research Paper Identifies Multi-Tool Coordination as Critical Failure Point for AI Agents

A new research paper posits that the primary failure mode for AI agents is not in calling individual tools, but in reliably coordinating sequences of many tools over extended tasks. This reframes the core challenge from single-step execution to multi-step orchestration and state management.

85% relevant

arXiv Paper Proposes Federated Multi-Agent System with AI Critics for Network Fault Analysis

A new arXiv paper introduces a collaborative control algorithm for AI agents and critics in a federated multi-agent system, providing convergence guarantees and applying it to network telemetry fault detection. The system maintains agent privacy and scales with O(m) communication overhead for m modalities.

74% relevant

OpenAgents Workspace Enables Real-Time, Multi-Agent AI Collaboration

OpenAgents Workspace allows multiple AI agents to communicate and collaborate in real time. This moves beyond single-agent tools toward a coordinated, multi-agent workflow system.

95% relevant

Open-Source Multi-Agent LLM System for Complex Software Engineering Tasks Released by Academic Consortium

A consortium of researchers from Stony Brook, CMU, Yale, UBC, and Fudan University has open-sourced a multi-agent LLM system specifically architected for complex software engineering. The release aims to provide a collaborative, modular framework for tackling tasks beyond single-agent capabilities.

93% relevant

How RepoWire Turns Your Claude Code Sessions into a Multi-Agent Network

RepoWire orchestrates multiple Claude Code instances to work in parallel, letting you run specialized agents simultaneously for faster, more comprehensive development tasks.

95% relevant

Cline Launches Kanban Platform for Visualizing and Managing Multi-Agent AI Workflows

Cline has launched Cline Kanban, a visual platform for developers to manage and orchestrate multi-agent AI workflows. It aims to address the complexity of coordinating multiple specialized AI agents on a single task.

85% relevant

How to Build a Multi-Agent Dev System: One Developer's 40-Commit Field Report

A developer's two-week field report reveals how CLAUDE.md, knowledge graph corrections, and multi-agent workflows create compounding productivity gains.

92% relevant

LLM Multi-Agent Framework 'Shared Workspace' Proposed to Improve Complex Reasoning via Task Decomposition

A new research paper proposes a multi-agent framework where LLMs split complex reasoning tasks across specialized agents that collaborate via a shared workspace. This approach aims to overcome single-model limitations in planning and tool use.

85% relevant

The Agent Coordination Trap: Why Multi-Agent AI Systems Fail in Production

A technical analysis reveals why multi-agent AI pipelines fail unpredictably in production, with failure probability scaling exponentially with agent count. This exposes critical reliability gaps as luxury brands deploy complex AI workflows.

86% relevant

Anthropic Deploys Multi-Agent Harness to Scale Claude's Frontend Design & Autonomous Software Engineering

Anthropic engineers detail a multi-agent system that orchestrates multiple Claude instances to tackle complex, long-running software tasks like frontend design. The approach aims to overcome single-model context and reasoning limits.

85% relevant

Solving LLM Debate Problems with a Multi-Agent Architecture

A developer details moving from generic prompts to a multi-agent system where two LLMs are forced to refute each other, improving reasoning and output quality. This is a technical exploration of a novel prompting architecture.

78% relevant

PodcastBrain: A Technical Breakdown of a Multi-Agent AI System That Learns User Preferences

A developer built PodcastBrain, an open-source, local AI podcast generator where two distinct agents debate any topic. The system learns user preferences via ratings and adjusts future content, demonstrating a working feedback loop with multi-agent orchestration.

70% relevant

AI Agent Types and Communication Architectures: From Simple Systems to Multi-Agent Ecosystems

A guide to designing scalable AI agent systems, detailing agent types, multi-agent patterns, and communication architectures for real-world enterprise production. This represents the shift from reactive chatbots to autonomous, task-executing AI.

72% relevant

TTal CLI: Orchestrate Multiple Claude Code Agents for Autonomous PR Workflows

TTal is a Go CLI that creates a multi-agent system with persistent manager agents and disposable worker agents, letting you run entire PR cycles from your phone via Telegram.

95% relevant

Multi-Agent Reinforcement Learning for Dynamic Pricing: A Comparative Study of MAPPO and MADDPG

A new arXiv paper benchmarks multi-agent RL algorithms for competitive dynamic pricing. MAPPO achieved the highest, most stable profits, while MADDPG delivered the fairest outcomes. This offers a scalable alternative to independent learning for retail price optimization.

95% relevant

LLMs Score Only 22% Win Rate in Multi-Agent Clue Game, Revealing Deductive Reasoning Gaps

Researchers created a text-based Clue game to test LLM agents' multi-step deductive reasoning. Across 18 games with GPT-4o-mini and Gemini-2.5-Flash agents, only 4 correct wins were achieved, showing fine-tuning on logic puzzles doesn't reliably improve performance.

75% relevant

Multi-Agent Coding Systems Compared: Claude Code, Codex, and Cursor

A hands-on comparison reveals three fundamentally different approaches to multi-agent coding. Claude Code distinguishes between subagents and agent teams, Codex treats it as an engineering problem, and Cursor implements parallel file-system operations.

70% relevant

Multi-Agent AI Systems: Architecture Patterns and Governance for Enterprise Deployment

A technical guide outlines four primary architecture patterns for multi-agent AI systems and proposes a three-layer governance framework. This provides a structured approach for enterprises scaling AI agents across complex operations.

70% relevant

Three Agents, One Mission: A Multi-Agent Architecture for Real-Time Fraud Detection

A technical walkthrough of a multi-agent system built with Mesa and XGBoost for real-time fraud detection. It moves beyond a simple classifier to a complete, observable, and actionable pipeline.

72% relevant

DOVA Framework Introduces Deliberation-First Orchestration for Multi-Agent Research Automation

Researchers propose DOVA, a multi-agent platform that uses explicit meta-reasoning before tool invocation, achieving 40-60% inference cost reduction on simple tasks while maintaining deep reasoning capacity for complex research automation.

100% relevant

Claude's Subagents vs. Agent Teams: A Practical Framework for Multi-Agent System Design

Anthropic's Claude offers two distinct multi-agent models: isolated subagents for parallel tasks and communicating agent teams for complex workflows. The key design principle is to split work by context, not role, and to default to a single agent until complexity is proven necessary.

87% relevant

How Claude-Code-Workflow Orchestrates Multiple CLI Agents for Complex Tasks

Install this CLI tool to coordinate multiple Claude Code agents for complex projects using semantic commands and session management.

98% relevant

Recon: The tmux Dashboard That Finally Makes Multi-Agent Claude Code Workflows Manageable

Recon is a tmux-native TUI dashboard that lets you monitor and manage multiple Claude Code agents from a single interface—perfect for side monitors.

95% relevant

AI Agents Get a Memory Upgrade: New Framework Treats Multi-Agent Memory as Computer Architecture

A new paper proposes treating multi-agent memory systems as a computer architecture problem, introducing a three-layer hierarchy and identifying critical protocol gaps. This approach could significantly improve reasoning, skills, and tool usage in collaborative AI systems.

85% relevant