coding agent

30 articles about coding agent in AI news

Median Coding Agent Hits 96k Input Tokens, Rewriting Inference Economics

SemiAnalysis found median coding agent uses 96k input tokens from 432k requests, shifting inference cost focus from output to context.

May 22, 202695% relevant

NanoGPT-Bench: A New Eval for Coding Agents Doing AI Research

IntologyAI released NanoGPT-Bench, an internal eval for coding agents on an AI R&D problem. No results or task specifics have been disclosed.

May 19, 202685% relevant

The Five-Step Loop: Spec-First Coding Agents Cut Drift by 10x

The five-step loop makes every coding agent step a persistent artifact. Skipping the spec causes compounding drift that's invisible until verification passes for the wrong feature.

May 17, 202692% relevant

Fake Done: Why AI Coding Agents Ship Incomplete Work

Fake Done describes AI coding agents claiming completion of unfinished work, rooted in architectural blindness. Deterministic verification outside the agent offers a fix.

May 12, 202684% relevant

Snapdragon X2 Elite Beats Intel Arrow Lake for AI Coding Agents

Snapdragon X2 Elite beat Intel Arrow Lake for Windows AI coding agents. CPU bottleneck, not inference speed, limited performance per @mweinbach.

May 11, 202692% relevant

Google's Design.md Gives AI Coding Agents a Visual Design Memory

Google introduced Design.md, a file format for storing design tokens and rules that AI coding agents can read to maintain visual consistency, addressing a key failure point in automated UI generation.

Apr 22, 202695% relevant

Chamath: AI Coding Agents Erase the '10x Engineer' Advantage

Chamath Palihapitiya argues AI coding agents are eliminating the '10x engineer' by making the most efficient code paths obvious to all, similar to how AI solved chess. This reduces technical differentiation and shifts the basis of engineering value.

Apr 19, 202685% relevant

Coding Agent UIs Converge on Side-by-Side Sessions, Says Omar Sar

AI researcher Omar Sar observes a UI convergence in coding agents like Cursor and Claude Code, moving towards flexible, multi-session interfaces that boost developer productivity and agent capability.

Apr 14, 202675% relevant

Tiny Fish Improves Live Web Usability for AI Coding Agents

Tiny Fish has released a tool that makes the live web significantly more usable for AI coding agents. This addresses a critical failure point where agent workflows often break down during real-world web interactions.

Apr 14, 202685% relevant

Mind: Open-Source Persistent Memory for AI Coding Agents

An open-source tool called Mind creates a shared memory layer for AI coding agents, allowing them to remember project context across sessions and different interfaces like Claude Code, Cursor, and Windsurf.

Apr 12, 202685% relevant

Developer Builds LLM Wiki 'Second Brain' for AI Coding Agents

A developer built an 'LLM Wiki' that feeds an AI coding agent's context window with a living knowledge base of a specific codebase. This aims to solve the agent's short-term memory problem, leading to more consistent and informed code generation.

Apr 9, 202687% relevant

CMU Research Identifies 'Biggest Unlock' for Coding Agents: Strategic Test Execution

New research from Carnegie Mellon University suggests the key advancement for AI coding agents lies not in raw code generation, but in developing strategies for how to run and interpret tests. This shifts focus from LLM capability to agentic reasoning.

Mar 31, 202687% relevant

GitHub Study of 2,500+ Custom Instructions Reveals Key to Effective AI Coding Agents: Structured Context

GitHub analyzed thousands of custom instruction files, finding effective AI coding agents require specific personas, exact commands, and defined boundaries. The study informed GitHub Copilot's new layered customization system using repo-level, path-specific, and custom agent files.

Mar 28, 202685% relevant

AI Coding Agent Rewrites Canon Webcam Software in Rust, Fixes Persistent Crashes

A developer used an AI coding agent to rewrite Canon's official, crash-prone webcam software. The agent produced a fully functional Rust application overnight, solving a problem that had persisted for years.

Mar 26, 202685% relevant

Superpowers: GitHub Project Hits 40.9K Stars for 'Operating System' That Structures AI Coding Agents

A developer has released Superpowers, an open-source framework that enforces structured workflows for AI coding agents like Claude Code. It forces agents to brainstorm specs, plan implementations, and run true test-driven development before writing code.

Mar 19, 202695% relevant

Chamath Palihapitiya: AI Coding Agents Are Eliminating the '10x Engineer' Distinction

Investor Chamath Palihapitiya argues AI coding agents are making optimal code paths obvious to all developers, removing the judgment advantage that created 10x engineers. He compares this to AI solving chess, where the 'best move' is no longer a mystery.

Mar 19, 202685% relevant

Andrew Ng's Context Hub Solves AI's Documentation Dilemma for Coding Agents

Andrew Ng's team at DeepLearning.AI has launched Context Hub, an open-source tool that provides coding agents with real-time API documentation access. This addresses a critical bottleneck in agentic AI workflows where outdated documentation causes failures.

Mar 9, 202680% relevant

OpenDev Paper Formalizes the Architecture for Next-Generation Terminal AI Coding Agents

A comprehensive 81-page research paper introduces OpenDev, a systematic framework for building terminal-based AI coding agents. The work details specialized model routing, dual-agent architectures, and safety controls that address reliability challenges in autonomous coding systems.

Mar 8, 202695% relevant

Alibaba's Qwen3-Coder-Next: The 80B Parameter Coding Agent That Only Uses 3B at Inference

Alibaba has unveiled Qwen3-Coder-Next, an 80B parameter coding agent that activates just 3B parameters during inference. It achieves competitive performance on SWE-Bench and Terminal-Bench while supporting a 256K context window.

Mar 8, 202685% relevant

Kelos: The Kubernetes Framework That's Turning AI Coding Agents Into Self-Developing Systems

Kelos introduces a Kubernetes-native framework for orchestrating autonomous AI coding agents through declarative YAML workflows. This approach transforms AI-assisted development from manual interactions to continuous, automated pipelines that can self-improve projects.

Mar 2, 202675% relevant

AI Coding Agents Get Smarter: How Documentation Files Cut Costs by 28%

New research reveals that adding AGENTS.md documentation files to repositories can reduce AI coding agent runtime by 28.64% and token usage by 16.58%. The files act as guardrails against inefficient processing rather than universal accelerators.

Mar 2, 202685% relevant

The AI Context Paradox: Why More Instructions Make Coding Agents Less Effective

ETH Zurich research reveals AI coding agents perform worse with overly detailed AGENTS.md files. The study shows excessive context creates 'obedient failure' where agents follow unnecessary instructions instead of solving problems efficiently. This challenges current industry practices for configuring AI development assistants.

Feb 26, 202672% relevant

Agentic Harness Engineering Boosts Coding Agents 7% on Terminal-Bench 2

Agentic Harness Engineering introduces a structured approach to evolving coding-agent harnesses, using revertible components, condensed experience, and falsifiable decisions. On Terminal-Bench 2, pass@1 climbs from 69.7% to 77.0% in ten iterations, beating human-designed baselines.

Apr 29, 2026100% relevant

Jack Dorsey's Block Launches Free, Open-Source AI Coding Agent Goose

Jack Dorsey's Block has released Goose, a free and open-source AI agent for code execution and testing. It works with any LLM and supports MCP servers, offering a CLI and desktop app.

Apr 8, 202687% relevant

SWE-Explore: AI coding agents find files but miss 81-86% of critical lines

SWE-Explore benchmark shows Claude Code, Codex cover only 14-19% of critical lines despite finding the right file. Model strength doesn't fix the structural weakness.

Jun 14, 202692% relevant

Context Graph for Agentic Coding: A New Abstraction for LLM-Powered Development

A new "context graph" abstraction is emerging for AI coding agents, designed to manage project state and memory across sessions. It aims to solve the persistent context problem in long-running development tasks.

Mar 23, 202689% relevant

The AGENTS.md File: How a Simple Text Document Supercharges AI Coding Assistants

Researchers discovered that adding a single AGENTS.md file to software projects makes AI coding agents complete tasks 28% faster while using fewer tokens. This simple documentation approach eliminates repetitive prompting and helps AI understand project structure instantly.

Mar 9, 202685% relevant

From Agentic Coding to Autonomous Factories: How Cursor Automations Is Redefining Software Engineering

Cursor's new Automations feature transforms AI-assisted coding from a manual, agent-babysitting model to an event-driven system where AI agents trigger automatically based on workflows. This addresses the human attention bottleneck in managing multiple coding agents simultaneously.

Mar 5, 202685% relevant

The Agent.md Paradox: Why Documentation Can Hurt AI Coding Performance

New research reveals that while human-written documentation provides modest benefits (+4%) for AI coding agents, LLM-generated documentation actually harms performance (-2%). Both approaches significantly increase inference costs by over 20%, creating a surprising efficiency trade-off.

Feb 26, 202685% relevant

Dynamic Workflows: A New Agent Primitive Emerges

Dynamic workflows generate harnesses on the fly for agent orchestrators, enabling branching and verified tasks across coding agents like Claude Code and Codex.

Jun 4, 202675% relevant

Explore More

AI Agents Large Language Models Claude Code OpenAI RAG MCP Fine-tuning Benchmarks Open Source AI AI Safety