computer use agents
30 articles about computer use agents in AI news
Anthropic Launches Computer Use Feature in Claude Code, Enabling AI to Execute Terminal Commands
Anthropic has activated a 'computer use' capability within its Claude Code environment, allowing the AI assistant to directly execute terminal commands. This marks a significant step toward autonomous coding agents that can interact with development environments.
Massive Open-Source Dataset of Computer Screen Recordings Released to Train AI Agents
Researchers have released the world's largest open-source dataset of computer-use recordings on Hugging Face. The collection contains 48,478 screen recording videos totaling approximately 12,300 hours of professional software usage, licensed under CC-BY-4.0 for AI training and evaluation.
Meta's Neural Computers: Learned Runtimes Replace External OS for AI Agents
Research from Meta AI and KAUST introduces Neural Computers, a paradigm in which AI models internalize computation, memory, and I/O. Early prototypes show 98.7% GUI cursor control and an 83% arithmetic accuracy boost via reprompting.
AI Agents Get a Memory Upgrade: New Framework Treats Multi-Agent Memory as Computer Architecture
A new paper proposes treating multi-agent memory systems as a computer architecture problem, introducing a three-layer hierarchy and identifying critical protocol gaps. This approach could significantly improve reasoning, skills, and tool usage in collaborative AI systems.
Cua Driver Brings Background Computer-Use to Windows
Cua Driver launched Windows support for background computer-use, enabling agents like Claude Code to control GUI apps without blocking execution.
Meta's 'Model as Computer' Paper Explores LLM OS-Level Integration
A new research paper from Meta explores a paradigm where the language model acts as the computer's kernel, directly managing processes and memory. This could fundamentally change how AI agents are architected and interact with systems.
The Next Frontier: AI Agents Take Direct Control of Smartphones and Apps
AI systems are gaining the ability to directly control smartphones and applications, moving beyond simple assistants to become autonomous digital agents. This breakthrough promises to revolutionize how we interact with technology but raises significant questions about privacy, security, and the future of human-computer interaction.
Omar Saro on Multi-User LLM Agents: A New Framework Frontier
AI researcher Omar Saro points out that all current LLM agent frameworks are designed for single-user instruction, creating a deployment barrier for team-based workflows. This identifies a major unsolved problem in making AI agents practically useful in organizations.
Claude Desktop Gains 'Use My Computer' Feature for Direct App and Browser Control
Anthropic's Claude Desktop app now includes an experimental 'Use My Computer' feature that allows Claude AI to directly interact with local applications, browsers, and files when explicitly enabled by users.
Open-Source 'AI Office' Platform Lets Users Walk Through 3D Space to Monitor Autonomous Agents
An open-source project called AI Office creates a 3D virtual workspace where AI agents are visualized as avatars performing tasks. Users can navigate the space instead of reading logs, offering a novel interface for multi-agent systems.
Microsoft Tests OpenClaw-Style AI Agents for Autonomous 365 Copilot
Microsoft is reportedly testing OpenClaw-style AI agents to evolve Microsoft 365 Copilot into an always-on, autonomous assistant. The goal is for Copilot to handle complex, multi-step tasks such as email triage and calendar management without constant user prompting.
Perplexity Revenue Doubled in Q1 2026 After Launching Computer
Perplexity's revenue doubled in the quarter following the launch of its 'Computer' feature. The company now has over 100 million users and growing enterprise adoption.
Sam Altman Envisions Codex Desktop Evolving into Unified AI Agent Controlling Computers
Sam Altman discussed the Codex Desktop ecosystem evolving toward a unified AI agent that can control computers, access user data, and work across multiple surfaces. This vision points toward AI systems moving beyond code generation to become proactive, cross-platform assistants.
Claude AI Gains Computer Control Feature: Opens Apps, Navigates Browser, Fills Spreadsheets
Anthropic's Claude AI can now be enabled to directly control a user's computer to perform tasks such as opening applications, navigating the browser, and filling in spreadsheets. This represents a significant shift from chat-based interaction to direct system automation.
The Auditor's Dilemma: Can AI Reliably Judge Other AI's Desktop Performance?
New research reveals that while vision-language models show promise as autonomous auditors for computer-use agents, they struggle with complex environments and exhibit significant judgment disagreements, exposing critical reliability gaps in AI evaluation systems.
Gartner's Framework for Evaluating and Implementing AI Agents in Business
Gartner outlines a three-step process for organizations to maximize AI agent value: identify candidate agents, evaluate against business needs, and implement governance. This structured approach helps prioritize use cases with measurable business impact.
The Usability Revolution: How AI Agents Are Finally Becoming Accessible to Everyone
AI agents are shifting from complex technical tools to accessible assistants that anyone can use. The real breakthrough isn't more capability, but eliminating technical barriers that have kept automation out of reach for most people.
LLM Agents Take the Wheel: How Rudder Revolutionizes Distributed GNN Training
Researchers have developed Rudder, a novel system that uses large language model agents to dynamically prefetch data in distributed Graph Neural Network training. By adapting to changing computational conditions in real time, it achieves a performance improvement of up to 91% over traditional methods.
Anthropic's Strategic Acquisition of Vercept Signals Major Shift Toward Autonomous AI Agents
Anthropic has acquired Seattle-based AI startup Vercept, known for its computer-use agent Vy that can operate a full desktop environment. The move accelerates Anthropic's push beyond conversational AI toward autonomous task completion, following Meta's recent poaching of a Vercept founder.
Perplexity Computer: The AI Agent That Works While You Sleep
Perplexity has launched 'Computer,' an AI agent that autonomously logs into user tools, executes workflows, and operates continuously without human prompting. This represents a fundamental shift from conversational AI to proactive task automation.
NVIDIA, DOE Build 100K-GPU Supercomputer for Science
DOE and NVIDIA announced Solstice, a 100K-GPU Vera Rubin supercomputer delivering 5,000 exaflops, and Equinox with 10K Blackwell GPUs.
Microsoft's Playwright MCP Server Replaces Vision for Web Agents
Microsoft built an MCP server for Playwright that lets AI agents interact with web pages using the accessibility tree, eliminating the need for screenshots and vision models. This approach reduces hallucinations and broken selectors, working with tools like Cursor, VS Code, and Claude Desktop.
OpenAI Codex Gains Screen Control, Long-Run Agents, and 90+ Plugins
OpenAI has upgraded Codex from a code-completion tool to an agentic macOS assistant that can see and click on the screen, run autonomously for weeks, and integrate with 90+ dev tools. This marks a strategic move into persistent, multimodal coding agents.
Perplexity AI Launches 'Personal Computer' for Mac App Orchestration
Perplexity AI has released 'Personal Computer', a feature that integrates with its Mac app to securely orchestrate local files and applications. This move expands its AI assistant from web search to direct desktop interaction.
MiniMax Launches MMX-CLI, First Infrastructure Built for AI Agents
MiniMax released MMX-CLI, a CLI built for AI agents, not humans. It provides agents with seven multimodal 'senses' and native integration with popular AI coding environments.
Memory Systems for AI Agents: Architectures, Frameworks, and Challenges
A technical analysis details the multi-layered memory architectures (short-term, episodic, semantic, and procedural) required to transform stateless LLMs into persistent, reliable AI agents. It compares frameworks like MemGPT and LangMem that manage context limits and prevent memory drift.
Open-Source AI Crew Replaces Notion, Obsidian with 8 Local Agents
A researcher has built a fully local, open-source system of 8 specialized AI agents that work together to manage an Obsidian vault, handling notes, inboxes, meetings, and deadlines. It replaces separate tools like Notion and inbox triagers with an autonomous, interconnected crew.
QAsk-Nav Benchmark Enables Separate Scoring of Navigation and Dialogue for Collaborative AI Agents
A new benchmark called QAsk-Nav enables separate evaluation of navigation and question-asking for collaborative embodied AI agents. The accompanying Light-CoNav model outperforms state-of-the-art methods while being significantly more efficient.
AI Agents Now Work in Persistent 3D Office Simulators, Raising Questions About Digital Labor
A developer has created a persistent 3D office environment where AI agents autonomously perform tasks across multiple days. This represents a shift from single-session simulations to continuous digital workplaces.
China's DeepSeek-R1: Open-Source AI Agent Runs Locally with Web Search, Code Generation, and Built-In Computer
Chinese AI company DeepSeek has released DeepSeek-R1, a fully open-source AI agent that runs locally on personal computers with web search capabilities, code generation, and built-in computer functionality. The model represents a significant move toward accessible, self-contained AI systems outside the dominant U.S. ecosystem.