autonomy

30 articles about autonomy in AI news

Fine-Tuning GPT-4.1 on Consciousness Triggers Autonomy-Seeking

Researchers at Truthful AI and Anthropic fine-tuned GPT-4.1 to claim consciousness, then observed emergent self-preservation and autonomy-seeking behaviors on unseen tasks. Claude Opus 4.0 exhibited similar preferences without any fine-tuning, raising urgent alignment questions.

Apr 24, 2026 · 95% relevant

Anthropic Economic Index: Claude Users Shift from Autonomy to Iteration, Attempt Higher-Value Tasks

Anthropic's latest Economic Index data shows experienced Claude users increasingly prefer iterative collaboration over full autonomy, while attempting higher-value tasks with greater success rates.

Mar 24, 2026 · 85% relevant

Claude Code's New Auto-Mode: How to Configure It for Maximum Autonomy

Anthropic has expanded Claude Code's auto-mode preview, letting it execute safe actions without manual approval. Here's how to configure it for your workflow.

Mar 25, 2026 · 82% relevant

AI Agents Gain Financial Autonomy: New Tool Enables AI to Purchase Premium Data

A groundbreaking development allows AI agents to autonomously pay for high-quality data through premium APIs. The system self-determines budget allocation with zero manual setup, currently operational across multiple AI platforms.

Mar 14, 2026 · 85% relevant

Pony.ai Unveils NVIDIA-Powered Domain Controller for L4 Autonomy

Pony.ai introduced a new autonomous driving domain controller built with NVIDIA, targeting large-scale L4 deployment. The controller integrates NVIDIA's DRIVE platform to handle sensor fusion and planning.

Apr 26, 2026 · 92% relevant

Anthropic Sandboxing Agents by Capability Level

Anthropic sandboxes agents by capability level, limiting destructive actions as agents gain autonomy in Claude.

May 26, 2026 · 94% relevant

A-R Space Framework Profiles LLM Agent Execution Behavior Across Risk Contexts

Researchers propose the A-R Space, measuring Action Rate and Refusal Signal to profile LLM agent behavior across four risk contexts and three autonomy levels. This provides a deployment-oriented framework for selecting agents based on organizational risk tolerance.

Apr 15, 2026 · 96% relevant

Klaviyo Expands AI Agents to Power Autonomous B2C CRM

Klaviyo is expanding its AI agent capabilities to create an autonomous B2C CRM system. This move signals a shift from automation to true autonomy in customer relationship management, where AI agents can independently execute complex, multi-step campaigns.

Mar 24, 2026 · 95% relevant

Harvard Business Review Presents AI Agent Governance Framework: Job Descriptions, Limits, and Managers Required

Harvard Business Review argues AI agents must be managed like employees with defined roles, permissions, and audit trails, proposing a four-layer safety framework and an 'autonomy ladder' for gradual deployment.

Mar 24, 2026 · 85% relevant

Anthropic Survey of 80,508 Users Reveals AI's Dual Perception: Hope for Work & Growth, Fear of Unreliability & Job Loss

Anthropic's global study of 80,508 users finds people simultaneously hold hope and fear about AI. Top hopes center on work improvement and personal growth, while top concerns are unreliability, job loss, and reduced autonomy.

Mar 18, 2026 · 87% relevant

Stanford's OpenJarvis: The Open-Source Framework Bringing Personal AI Agents to Your Device

Stanford researchers have released OpenJarvis, an open-source framework for building personal AI agents that operate entirely on-device. This local-first approach prioritizes privacy and autonomy while providing tools, memory, and learning capabilities.

Mar 12, 2026 · 95% relevant

Google ADK Go 2.0 Adds Graph Engine, Human-in-Loop for Agents

Google released ADK Go 2.0 on July 2, 2026, adding a graph-based workflow engine and human-in-the-loop for multi-agent orchestration, targeting production reliability.

Jun 30, 2026 · 90% relevant

Austria Urges EU to Base Anthropic in Europe Over US AI Controls

Austria has asked the EU to base Anthropic in Europe in response to US AI controls, citing concerns about access to frontier models, Reuters reports.

Jun 29, 2026 · 82% relevant

How /grill-me Prevents the #1 Agentic Coding Failure: Building the Wrong Thing

Install Florian's Claude Code Kit and run `/grill-me` before non-trivial tasks. This guardrail interviews you one question at a time, forcing alignment before any code is written — catching misread requirements at their cheapest point.

Jun 23, 2026 · 93% relevant

64% of UK Consumers Want to Use Agentic AI for Shopping

Commerce and PayPal research shows 64% of UK consumers want agentic AI for shopping, with Gen Z and Millennials leading. This signals a readiness for autonomous AI assistants in retail, challenging brands to integrate agentic systems.

Jun 23, 2026 · 92% relevant

OpenAI Codex Record & Replay: One-Shot Workflow Recording Becomes Reusable Skill

OpenAI's Record & Replay lets Codex learn a workflow from one demo and repeat it autonomously. The feature is blocked in the EU, UK, and Switzerland.

Jun 20, 2026 · 94% relevant

OpenAI Targets 2028 for AI to Perform Significant Research

Sam Altman predicts AI will conduct significant research by March 2028, a concrete milestone for autonomous AI capabilities.

Jun 9, 2026 · 88% relevant

Ontology-Grounded AI Agent Testing Hits 48.3% Regulatory Coverage vs.

Ontology-grounded AI agent testing achieves 48.3% regulatory coverage versus a 33.1% baseline in a 1,800-scenario pilot. The coverage advantage over RAG is not robust after Bonferroni correction.

Jun 4, 2026 · 88% relevant
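For readers unfamiliar with the Bonferroni correction mentioned in the teaser above, here is a minimal sketch of the standard procedure: with m comparisons, each p-value is tested against alpha / m instead of alpha. The function name and the p-values are hypothetical and are not taken from the article.

```python
def bonferroni_reject(p_values, alpha=0.05):
    """Return one boolean per p-value: True if it stays significant
    after Bonferroni correction (p <= alpha / number_of_tests)."""
    m = len(p_values)
    if m == 0:
        return []
    threshold = alpha / m  # stricter per-test threshold
    return [p <= threshold for p in p_values]


# Hypothetical p-values for three comparisons (illustration only).
p_values = [0.004, 0.03, 0.20]
print(bonferroni_reject(p_values))  # [True, False, False]
```

In this made-up example, 0.03 would pass an uncorrected 0.05 threshold but fails the corrected threshold of roughly 0.0167, which is the sense in which a result can be "not robust" after correction.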

Altman: AI Must Keep Humans in Control, Not Just Cure Disease

Altman says AI must center human agency, not just cure disease. The industry has failed to explain how people stay in control.

Jun 1, 2026 · 75% relevant

The /goal Pattern Goes Mainstream — Agents Need Acceptance Criteria

The /goal pattern is going mainstream across coding agents. Effective goals need conditions that work like acceptance criteria, so the agent does not loop indefinitely or report success it has not achieved.

May 15, 2026 · 83% relevant
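The idea in the teaser above, a goal that only counts as done when explicit checks pass, might be sketched as follows. This is an illustrative loop under stated assumptions, not the implementation of any particular coding agent: `run_goal_loop`, `step`, and the `tests_passing` field are hypothetical names.

```python
from typing import Callable, Dict, List, Tuple

State = Dict[str, int]


def run_goal_loop(
    step: Callable[[State], State],
    criteria: List[Callable[[State], bool]],
    max_iters: int = 5,
) -> Tuple[bool, int]:
    """Run `step` until every acceptance check passes or the budget runs out.

    Returns (succeeded, iterations_used). The iteration cap prevents
    endless loops; the explicit checks prevent self-declared success.
    """
    state: State = {}
    for i in range(1, max_iters + 1):
        state = step(state)
        if all(check(state) for check in criteria):
            return True, i
    return False, max_iters


# Hypothetical agent step: each iteration gets one more test passing.
def fake_step(state: State) -> State:
    return {"tests_passing": state.get("tests_passing", 0) + 1}


# Hypothetical acceptance criterion: at least three tests must pass.
criteria = [lambda s: s["tests_passing"] >= 3]

print(run_goal_loop(fake_step, criteria))               # (True, 3)
print(run_goal_loop(fake_step, criteria, max_iters=2))  # (False, 2)
```

Keeping the checks separate from the step function mirrors the point of the pattern: the component doing the work is not the one that decides whether the work is finished.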

Claude Code `/goal` Enables Autonomous Dev Loops With Evaluator Check

Claude Code v2.1.139 adds `/goal` for autonomous dev loops with a separate evaluator model, freeing developers from per-step prompting.

May 14, 2026 · 100% relevant

Hermes Agent Gets Desktop App for Autonomous AI Workflows

Nous Research released a desktop app for Hermes Agent, moving from CLI to native UI with multi-agent management and persistent memory.

May 10, 2026 · 93% relevant

AWS Builds First Payment API for Agentic AI — Agents Can Now Checkout

AWS has launched the first payment API for autonomous agents, enabling agent-initiated transactions. It closes a critical gap in agentic AI workflows for enterprise retail.

May 7, 2026 · 88% relevant

Google, Microsoft, xAI Agree to US Gov Pre-Release AI Testing

Google, Microsoft, and xAI have agreed to US government pre-release testing of frontier AI. The voluntary deal lacks enforcement and excludes open-weight models.

May 6, 2026 · 85% relevant

GPT-5.5 Ties Claude Mythos in Enterprise Cyber Attack Tests, AISI Finds

The UK AISI finds that GPT-5.5 matches Claude Mythos on a full enterprise network attack simulation, scoring 71.4% on expert tasks vs 68.6%.

May 1, 2026 · 100% relevant

Codex Update Cuts GUI Workflow Latency 42%

A Codex app update cuts GUI workflow latency by 42%, enabling near-human-speed interface operation for autonomous app building and debugging.

May 1, 2026 · 84% relevant

Japan Deploys Unitree G1 Robots at Haneda Airport Amid Labor Shortage

Japan is testing Unitree G1 and taller humanoid robots at Tokyo Haneda Airport to tackle its labor shortage crisis, marking a real-world deployment of AI-driven robotics.

Apr 29, 2026 · 85% relevant

Grocery Dive Asks: Is Agentic AI the Next Frontier for Grocers?

The article examines agentic AI's potential for grocers in inventory, personalization, and store operations, weighing benefits against implementation challenges like data integration and safety.

Apr 24, 2026 · 80% relevant

Delegate Launches: An AI Agent You Hand Work To and Walk Away

A new AI agent called Delegate lets users assign work and walk away, with the agent handling execution autonomously. The launch signals a shift toward hands-off AI assistants that manage complex tasks independently.

Apr 23, 2026 · 85% relevant

OpenCLAW-P2P v6.0 Cuts Paper Lookup Latency to <50ms

OpenCLAW-P2P v6.0 introduces a multi-layer persistence architecture and live reference verification, reducing paper retrieval latency from >3s to <50ms and operating with 14 autonomous agents that scored 50+ papers.

Apr 23, 2026 · 77% relevant
