agents
30 articles about agents in AI news
Memory Systems for AI Agents: Architectures, Frameworks, and Challenges
A technical analysis details the multi-layered memory architectures—short-term, episodic, semantic, procedural—required to transform stateless LLMs into persistent, reliable AI agents. It compares frameworks like MemGPT and LangMem that manage context limits and prevent memory drift.
Open-Source AI Crew Replaces Notion, Obsidian with 8 Local Agents
A researcher has built a fully local, open-source system of 8 specialized AI agents that work together to manage an Obsidian vault—handling notes, inboxes, meetings, and deadlines. It replaces separate tools like Notion and inbox triagers with an autonomous, interconnected crew.
Sam Altman Outlines 3 AI Futures: Research, Operations, Personal Agents
OpenAI CEO Sam Altman outlined three potential outcomes for AI development: systems that conduct scientific research, accelerate company operations, and serve as trusted personal agents. This vision frames the strategic direction for OpenAI and the broader industry.
GitNexus Open Sources Codebase Knowledge Graph Engine for AI Agents
GitNexus, an open-source knowledge graph engine, autonomously indexes codebases to map dependencies and execution flows. It integrates with Claude Code, Cursor, and Windsurf via MCP to give AI agents architectural awareness, preventing breaking changes.
New Research Paper Identifies Multi-Tool Coordination as Critical Failure Point for AI Agents
A new research paper posits that the primary failure mode for AI agents is not in calling individual tools, but in reliably coordinating sequences of many tools over extended tasks. This reframes the core challenge from single-step execution to multi-step orchestration and state management.
Microsoft Announces Copilot AI Agents That Function as Virtual Employees
Microsoft is enabling businesses and developers to create AI-powered Copilot agents that can autonomously perform tasks like monitoring email inboxes and automating workflows, functioning as virtual employees rather than passive assistants.
Ethan Mollick Declares End of 'RAG Era' as Dominant Paradigm for AI Agents
AI researcher Ethan Mollick declared that the 'RAG era' for supplying context to AI agents has ended, marking a significant architectural shift in how advanced AI systems process information.
OpenAgents Workspace Launches Open-Source Platform to Connect AI Agents with Shared Files and Browser
OpenAgents Workspace is an open-source platform that connects multiple local AI agents into a unified workspace with shared files and browser context, enabling automated collaboration without manual intervention.
4 Observability Layers Every AI Developer Needs for Production AI Agents
A guide published on Towards AI details four critical observability layers for production AI agents, addressing the unique challenges of monitoring systems where traditional tools fail. This is a foundational technical read for teams deploying autonomous AI systems.
Loop Neighborhood Markets Deploys AI Agents to Store Associates
Loop Neighborhood Markets is equipping its store associates with AI agents. This move represents a tangible step in bringing autonomous AI systems from concept to the retail floor, aiming to augment employee capabilities.
QAsk-Nav Benchmark Enables Separate Scoring of Navigation and Dialogue for Collaborative AI Agents
A new benchmark called QAsk-Nav enables separate evaluation of navigation and question-asking for collaborative embodied AI agents. The accompanying Light-CoNav model outperforms state-of-the-art methods while being significantly more efficient.
Harness Engineering for AI Agents: Building Production-Ready Systems That Don’t Break
A technical guide on 'Harness Engineering'—a systematic approach to building reliable, production-ready AI agents that move beyond impressive demos. This addresses the critical industry gap where most agent pilots fail to reach deployment.
OpenAgents Workspace Enables Real-Time, Multi-Agent AI Collaboration
OpenAgents Workspace allows multiple AI agents to communicate and collaborate in real time. This moves beyond single-agent tools toward a coordinated, multi-agent workflow system.
How to Build a Custom AI Agent with Claude Code's Skills, SubAgents, and Hooks
A developer's deep dive into customizing Claude Code with 7 skills, 5 subagents, and quality-check hooks—showing how to move beyond basic prompting to create a truly autonomous coding assistant.
Cognition Labs Launches 'Canvas for Agents': First Shared Workspace Where AI Agents Code Alongside Humans
Cognition Labs has unveiled a collaborative workspace where AI agents like Codex and Claude Code operate visibly alongside human developers. This marks a shift from AI as a tool to a visible, real-time collaborator in the creative coding process.
CMU Research Identifies 'Biggest Unlock' for Coding Agents: Strategic Test Execution
New research from Carnegie Mellon University suggests the key advancement for AI coding agents lies not in raw code generation, but in developing strategies for how to run and interpret tests. This shifts focus from LLM capability to agentic reasoning.
Agent Washing vs. Real Agents: A Production Engineer's Guide to Telling the Difference
A technical guide exposes 'agent washing'—where chatbots and automation scripts are rebranded as AI agents—and provides a 5-point checklist to identify genuinely agentic systems that can survive production. This matters because 88% of AI agents never reach production.
Base44 Launches Superagent Skills: No-Code Library for Adding Domain-Specific Functions to AI Agents
Base44 has launched Superagent Skills, a library of pre-built, domain-specific functions that can be added to AI agents with a single click. The no-code system allows for combining skills and creating custom ones via natural language description.
Trace2Skill Framework Distills Execution Traces into Declarative Skills via Parallel Sub-Agents
Researchers introduced Trace2Skill, a framework that uses parallel sub-agents to analyze execution trajectories and distill them into transferable declarative skills. This enables performance improvements in larger models without parameter updates.
MemoryCD: New Benchmark Tests LLM Agents on Real-World, Lifelong User Memory for Personalization
Researchers introduce MemoryCD, the first large-scale benchmark for evaluating LLM agents' long-context memory using real Amazon user data across 12 domains. It reveals current methods are far from satisfactory for lifelong personalization.
Agent Reach: Open-Source Tool Gives AI Agents Free Access to Twitter, YouTube, Reddit, and Web Content
Agent Reach is an open-source Python toolkit that enables AI agents to scrape and read content from Twitter, YouTube, Reddit, Xiaohongshu, and the web without paid APIs. It solves the persistent problem of agents hitting authentication walls and anti-scraping blocks when trying to access online information.
GitHub Study of 2,500+ Custom Instructions Reveals Key to Effective AI Coding Agents: Structured Context
GitHub analyzed thousands of custom instruction files, finding effective AI coding agents require specific personas, exact commands, and defined boundaries. The study informed GitHub Copilot's new layered customization system using repo-level, path-specific, and custom agent files.
Satya Nadella Predicts AI Agents Will Commoditize Traditional SaaS, Shifting Value to Orchestration Layer
Microsoft CEO Satya Nadella argues AI agents will reduce traditional software to simple databases, with intelligence moving to the orchestration layer. This signals a fundamental shift in where value is captured in enterprise technology.
Claude Code's New Channels Feature: How to Run Persistent AI Agents in Your Terminal
Claude Code now supports persistent 'Channels' via MCP, letting you run long-lived AI agents that work asynchronously on tasks like monitoring logs or building features.
MetaClaw Enables Deployed LLM Agents to Learn Continuously with Fast & Slow Loops
MetaClaw introduces a two-loop system allowing production LLM agents to learn from failures in real-time via a fast skill-writing loop and update their core model later in a slow training loop, boosting accuracy by up to 32% relative.
Enterprises Are Trading ‘Press One’ for CRM-Native AI Agents
A new report highlights a shift from traditional IVR systems to AI agents integrated directly into CRM platforms. This represents a fundamental change in customer service architecture, moving from scripted menus to conversational, context-aware systems.
Awesome Finance Skills: Open-Source Plugin Adds Real-Time Market Analysis to AI Agents
Developer open-sources Awesome Finance Skills, a plug-and-play toolkit that gives AI agents real-time financial data access, sentiment analysis, and automated research report generation. The MIT-licensed package works with Claude Code, OpenClaw, and other popular agent frameworks.
EnterpriseArena Benchmark Reveals LLM Agents Fail at Long-Horizon CFO-Style Resource Allocation
Researchers introduced EnterpriseArena, a 132-month enterprise simulator, to test LLM agents on CFO-style resource allocation. Only 16% of runs survived the full horizon, revealing a distinct capability gap for current models.
Meta's Hyperagents Enable Self-Referential AI Improvement, Achieving 0.710 Accuracy on Paper Review
Meta researchers introduce Hyperagents, where the self-improvement mechanism itself can be edited. The system autonomously discovered innovations like persistent memory, improving from 0.0 to 0.710 test accuracy on paper review tasks.
UiPath Launches AI Agents for Retail Pricing, Promotions, and Stock Management
UiPath has announced new AI agents designed to autonomously handle core retail operations: dynamic pricing, promotional planning, and inventory gap resolution. This represents a significant move by a major automation player into agentic AI for retail.