agent
30 articles about agent in AI news
Qualcomm Builds Dedicated CPU for Agentic AI, Enters Hyperscale Silicon Market
Qualcomm CEO revealed dedicated CPU for agentic AI, custom silicon deal with hyperscaler shipping Dec 2026, and agentic smartphones. Pivot challenges GPU-centric AI infrastructure consensus.
Cursor SDK Turns AI Agent Runtime into Programmable Infrastructure
Cursor is releasing an SDK that turns its agent runtime into programmable infrastructure for headless use in CI/CD pipelines, internal tools, and third-party products. Revenue scales with compute tokens, not seats, enabling higher volume without human-in-the-loop.
Agentic Harness Engineering Boosts Coding Agents 7% on Terminal-Bench 2
Agentic Harness Engineering introduces a structured approach to evolving coding-agent harnesses, using revertible components, condensed experience, and falsifiable decisions. On Terminal-Bench 2, pass@1 climbs from 69.7% to 77.0% in ten iterations, beating human-designed baselines.
OpenAI Agents Now Ask Questions Good Enough for Research Papers
Sébastien Bubeck revealed on the OpenAI Podcast that internal AI agents now ask research questions so insightful they're inspiring papers and correcting published mistakes, with a 1-2 year timeline for full researcher-level capabilities.
JPMorgan: Agentic AI Could Flip Server Ratio to CPU-Heavy
JPMorgan reports that agentic AI workloads could increase CPU demand, potentially flipping the GPU-to-CPU ratio from 7-8 GPUs per CPU to CPU-heavy deployments, with a $100B TAM for AI CPU infrastructure.
The Agency: 147 Open Source AI Agents Hit 50K GitHub Stars in 2 Weeks
The Agency is an open source repository with 147 specialized AI agents across 12 divisions (engineering, design, marketing, etc.) that hit 50K GitHub stars in under two weeks. It provides one-command install for tools like Claude Code and Cursor, with full modding support.
Microsoft's Playwright MCP Server Replaces Vision for Web Agents
Microsoft built an MCP server for Playwright that lets AI agents interact with web pages using the accessibility tree, eliminating the need for screenshots and vision models. This approach reduces hallucinations and broken selectors, working with tools like Cursor, VS Code, and Claude Desktop.
Pylon: Self-Host Your Own AI Agent Pipeline That Fixes Sentry Errors via
Pylon is a self-hosted daemon that triggers sandboxed Claude Code agents from webhooks (Sentry, cron, chat) and reports results with human approval — no data leaves your machine.
40-Author Survey Unveils 'Levels × Laws' Framework for Agent World Models
A 40-author survey introduces a 'levels × laws' framework for world models in AI agents, spanning 3 capability levels and 4 law regimes, synthesizing 400+ works. It provides a shared vocabulary for designing and evaluating world models across traditionally siloed research communities.
The 2026 CLAUDE.md Playbook: 8 Rules That Make Your Agent 2x More Effective
The 2026 consensus on CLAUDE.md: shorter files, falsifiable rules, and explicit enforcement. Here's the 8-rule framework to stop your agent from fighting stale configs.
Build Reusable Data Science Workflows with Claude Skills and Subagents
Claude Skills and Subagents let you package prompts into reusable modules, freeing data scientists from repetitive AI adjustments for EDA, modeling, and deployment.
Future AGI Open-Sources Platform to Stop Agent Hallucination
Future AGI open-sourced a full platform that aims to eliminate silent hallucination in production AI agents, offering runtime monitoring and intervention tools.
Agent Harnessing: The Infrastructure That Makes AI Agents Work
A detailed technical guide argues that the model is not the hard part of building AI agents. The six-component harness — context management, memory, tools, control flow, verification, and coordination — is what separates production-grade agents from those that fail silently.
Meta: Code Agents Improve by Reusing Short Summaries, Not Raw Logs
Meta's new paper reveals that coding agents with summary-based history reuse outperform those using raw logs, improving efficiency and success on complex tasks.
DigitalOcean's Signal Sampling Finds Top Agent Trajectories Without LLM Cost
DigitalOcean's paper introduces lightweight behavioral signals to rank 80k agent-user trajectories, achieving 82% informativeness in sampled reviews compared to 54% for random sampling, with no LLM overhead.
Grocery Dive Asks: Is Agentic AI the Next Frontier for Grocers?
The article examines agentic AI's potential for grocers in inventory, personalization, and store operations, weighing benefits against implementation challenges like data integration and safety.
Meta Deploys Millions of Amazon Graviton CPUs for AI Agents
Meta will deploy tens of millions of AWS Graviton5 CPU cores for AI agent workloads, signaling that agentic inference favors CPUs over GPUs. The deal deepens Meta's $200B+ infrastructure push amid layoffs and cloud rivalry.
Building an Agentic Enterprise Control Plane on Snowflake: A Technical Blueprint
Snowflake Intelligence and Cortex Code now enable a fully embedded agentic AI control plane. This article provides a tested, end-to-end blueprint for building a production-grade Streamlit dashboard that integrates five enterprise tables with six Cortex AI functions, all governed by existing data platform RBAC.
San Francisco Shop Runs Entirely by AI Agent
A shop in San Francisco is fully operated by an AI agent, replacing human cashiers and assistants. The concept points toward fully autonomous retail experiences, though details on the technology stack remain thin.
Cua Driver Open-Sourced: macOS Agent Control for Any App
Cua released Cua Driver as open-source, allowing agents like Claude Code and Codex to drive any macOS app through visual understanding and direct UI interaction.
Google Collaborates with Macy's to Develop 'Ask Macy's' AI Agent
According to Digital Commerce 360, Google is helping Macy's develop an AI agent called 'Ask Macy's'. This signals a deepening partnership between the retail giant and Google Cloud, aiming to deploy generative AI for customer service and product discovery. While full details are limited, the move represents a direct, large-scale application of conversational AI in luxury and general retail.
OpenAI Launches GPT-5.5: Smarter Agents, Deeper Tool Use
OpenAI unveiled GPT-5.5, positioned as a new intelligence tier designed for real-world work and autonomous agents, with enhanced tool-use capabilities and complex goal understanding.
Delegate Launches: An AI Agent You Hand Work To and Walk Away
A new AI agent called Delegate lets users assign work and walk away, with the agent handling execution autonomously. The launch signals a shift toward hands-off AI assistants that manage complex tasks independently.
Agentic storefronts: How AI agents are reshaping the shopping journey from
Major tech companies integrate AI agents into search and checkout; platforms like ChatGPT become primary shopping discovery channels. Agentic storefronts (e.g., Swap) guide shoppers end-to-end, getting smarter per session.
Run Claude Code in Any Sandbox with One API: AgentBox SDK
Swap coding agents and sandbox providers without changing code. Preserves full interactive capabilities (approval flows, streaming).
Stateless Memory for Enterprise AI Agents: Scaling Without State
The paper replaces stateful agent memory with immutable decision logs using event-sourcing, allowing thousands of concurrent agent instances to scale horizontally without state bottlenecks.
Why Zed's Parallel Agents Won't Fix Your Real Bottleneck (And What Will)
Zed's parallel agents cut refactoring time 60% on independent modules but introduced conflicts on shared dependencies. The bottleneck isn't speed — it's context window limits.
LangFuse on Evaluating AI Agents in Production
The article outlines a practical methodology for monitoring and enhancing AI agent performance post-deployment. It emphasizes combining automated LLM-based evaluation with human feedback loops to create actionable datasets for fine-tuning.
OpenAI Launches ChatGPT Workspace Agents for Team Automation
OpenAI has introduced workspace agents within ChatGPT, powered by Codex, designed to automate complex, multi-step workflows for teams across shared environments like Slack. These agents can gather context, execute tasks, request approvals, and run continuously in the cloud.
TACO Framework Cuts Agent Token Overhead 10% via Self-Evolving Compression
Researchers introduced TACO, a framework that enables terminal agents to automatically discover and refine context compression rules from their own interaction trajectories. This approach cuts token overhead by approximately 10% on benchmarks like TerminalBench and SWE-Bench Lite while preserving task accuracy.