coding agents

30 articles about coding agents in AI news

social.plus Vise: Workflow Governance for AI Coding Agents Building SDK

social.plus launched Vise, a workflow governance platform for AI coding agents building SDK integrations, enforcing policy controls and audit trails.

Jul 14, 202685% relevant

Databricks Tests Coding Agents on Its Own Codebase

Databricks benchmarked coding agents on its own polyglot codebase. GLM-5.2 matched top closed models, a minimal harness halved costs, and cheaper-per-token models cost more per task.

Jul 11, 202675% relevant

NanoGPT-Bench: A New Eval for Coding Agents Doing AI Research

IntologyAI released NanoGPT-Bench, an internal eval for coding agents on an AI R&D problem. No results or task specifics have been disclosed.

May 19, 202685% relevant

Fake Done: Why AI Coding Agents Ship Incomplete Work

Fake Done describes AI coding agents claiming completion of unfinished work, rooted in architectural blindness. Deterministic verification outside the agent offers a fix.

May 12, 202684% relevant

Snapdragon X2 Elite Beats Intel Arrow Lake for AI Coding Agents

Snapdragon X2 Elite beat Intel Arrow Lake for Windows AI coding agents. CPU bottleneck, not inference speed, limited performance per @mweinbach.

May 11, 202692% relevant

Google's Design.md Gives AI Coding Agents a Visual Design Memory

Google introduced Design.md, a file format for storing design tokens and rules that AI coding agents can read to maintain visual consistency, addressing a key failure point in automated UI generation.

Apr 22, 202695% relevant

Chamath: AI Coding Agents Erase the '10x Engineer' Advantage

Chamath Palihapitiya argues AI coding agents are eliminating the '10x engineer' by making the most efficient code paths obvious to all, similar to how AI solved chess. This reduces technical differentiation and shifts the basis of engineering value.

Apr 19, 202685% relevant

Tiny Fish Improves Live Web Usability for AI Coding Agents

Tiny Fish has released a tool that makes the live web significantly more usable for AI coding agents. This addresses a critical failure point where agent workflows often break down during real-world web interactions.

Apr 14, 202685% relevant

Mind: Open-Source Persistent Memory for AI Coding Agents

An open-source tool called Mind creates a shared memory layer for AI coding agents, allowing them to remember project context across sessions and different interfaces like Claude Code, Cursor, and Windsurf.

Apr 12, 202685% relevant

CMU Research Identifies 'Biggest Unlock' for Coding Agents: Strategic Test Execution

New research from Carnegie Mellon University suggests the key advancement for AI coding agents lies not in raw code generation, but in developing strategies for how to run and interpret tests. This shifts focus from LLM capability to agentic reasoning.

Mar 31, 202687% relevant

GitHub Study of 2,500+ Custom Instructions Reveals Key to Effective AI Coding Agents: Structured Context

GitHub analyzed thousands of custom instruction files, finding effective AI coding agents require specific personas, exact commands, and defined boundaries. The study informed GitHub Copilot's new layered customization system using repo-level, path-specific, and custom agent files.

Mar 28, 202685% relevant

Superpowers: GitHub Project Hits 40.9K Stars for 'Operating System' That Structures AI Coding Agents

A developer has released Superpowers, an open-source framework that enforces structured workflows for AI coding agents like Claude Code. It forces agents to brainstorm specs, plan implementations, and run true test-driven development before writing code.

Mar 19, 202695% relevant

Chamath Palihapitiya: AI Coding Agents Are Eliminating the '10x Engineer' Distinction

Investor Chamath Palihapitiya argues AI coding agents are making optimal code paths obvious to all developers, removing the judgment advantage that created 10x engineers. He compares this to AI solving chess, where the 'best move' is no longer a mystery.

Mar 19, 202685% relevant

Andrew Ng's Context Hub Solves AI's Documentation Dilemma for Coding Agents

Andrew Ng's team at DeepLearning.AI has launched Context Hub, an open-source tool that provides coding agents with real-time API documentation access. This addresses a critical bottleneck in agentic AI workflows where outdated documentation causes failures.

Mar 9, 202680% relevant

OpenDev Paper Formalizes the Architecture for Next-Generation Terminal AI Coding Agents

A comprehensive 81-page research paper introduces OpenDev, a systematic framework for building terminal-based AI coding agents. The work details specialized model routing, dual-agent architectures, and safety controls that address reliability challenges in autonomous coding systems.

Mar 8, 202695% relevant

Kelos: The Kubernetes Framework That's Turning AI Coding Agents Into Self-Developing Systems

Kelos introduces a Kubernetes-native framework for orchestrating autonomous AI coding agents through declarative YAML workflows. This approach transforms AI-assisted development from manual interactions to continuous, automated pipelines that can self-improve projects.

Mar 2, 202675% relevant

The AI Context Paradox: Why More Instructions Make Coding Agents Less Effective

ETH Zurich research reveals AI coding agents perform worse with overly detailed AGENTS.md files. The study shows excessive context creates 'obedient failure' where agents follow unnecessary instructions instead of solving problems efficiently. This challenges current industry practices for configuring AI development assistants.

Feb 26, 202672% relevant

GitHub's Former CEO Launches Distributed Git Network for AI Coding Agents

Claude Code users should monitor Nat Friedman's distributed Git network for faster agentic coding workflows. The new network optimizes Git for AI agents, potentially reducing clone/push latency.

Jul 8, 202680% relevant

AI Coding Agents Get Smarter: How Documentation Files Cut Costs by 28%

New research reveals that adding AGENTS.md documentation files to repositories can reduce AI coding agent runtime by 28.64% and token usage by 16.58%. The files act as guardrails against inefficient processing rather than universal accelerators.

Mar 2, 202685% relevant

The Five-Step Loop: Spec-First Coding Agents Cut Drift by 10x

The five-step loop makes every coding agent step a persistent artifact. Skipping the spec causes compounding drift that's invisible until verification passes for the wrong feature.

May 17, 202692% relevant

Developer Builds LLM Wiki 'Second Brain' for AI Coding Agents

A developer built an 'LLM Wiki' that feeds an AI coding agent's context window with a living knowledge base of a specific codebase. This aims to solve the agent's short-term memory problem, leading to more consistent and informed code generation.

Apr 9, 202687% relevant

Agentic Harness Engineering Boosts Coding Agents 7% on Terminal-Bench 2

Agentic Harness Engineering introduces a structured approach to evolving coding-agent harnesses, using revertible components, condensed experience, and falsifiable decisions. On Terminal-Bench 2, pass@1 climbs from 69.7% to 77.0% in ten iterations, beating human-designed baselines.

Apr 29, 2026100% relevant

SWE-Explore: AI coding agents find files but miss 81-86% of critical lines

SWE-Explore benchmark shows Claude Code, Codex cover only 14-19% of critical lines despite finding the right file. Model strength doesn't fix the structural weakness.

Jun 14, 202692% relevant

The AGENTS.md File: How a Simple Text Document Supercharges AI Coding Assistants

Researchers discovered that adding a single AGENTS.md file to software projects makes AI coding agents complete tasks 28% faster while using fewer tokens. This simple documentation approach eliminates repetitive prompting and helps AI understand project structure instantly.

Mar 9, 202685% relevant

The /goal Pattern Goes Mainstream — Agents Need Acceptance Criteria

The /goal pattern goes mainstream across coding agents. Effective goals require acceptance criteria-like conditions to avoid loops or hallucinated success.

May 15, 202683% relevant

Meta: Code Agents Improve by Reusing Short Summaries, Not Raw Logs

Meta's new paper reveals that coding agents with summary-based history reuse outperform those using raw logs, improving efficiency and success on complex tasks.

Apr 25, 202685% relevant

OpenAI Codex Gains Screen Control, Long-Run Agents, and 90+ Plugins

OpenAI has upgraded Codex from a code-completion tool to an agentic macOS assistant that can see/click screens, run for weeks autonomously, and integrate with 90+ dev tools. This marks a strategic move into persistent, multi-modal coding agents.

Apr 16, 202686% relevant

Coding Agent UIs Converge on Side-by-Side Sessions, Says Omar Sar

AI researcher Omar Sar observes a UI convergence in coding agents like Cursor and Claude Code, moving towards flexible, multi-session interfaces that boost developer productivity and agent capability.

Apr 14, 202675% relevant

Context Graph for Agentic Coding: A New Abstraction for LLM-Powered Development

A new "context graph" abstraction is emerging for AI coding agents, designed to manage project state and memory across sessions. It aims to solve the persistent context problem in long-running development tasks.

Mar 23, 202689% relevant

LangChain Open-Sources Deep Agents: MIT-Licensed Framework Replicating Claude Code's Core Workflow

LangChain released Deep Agents, an open-source framework that recreates the core architecture of coding agents like Claude Code. The MIT-licensed system is model-agnostic and provides modular components for building inspectable coding assistants.

Mar 15, 202687% relevant

Explore More

AI Agents Large Language Models Claude Code OpenAI RAG MCP Fine-tuning Benchmarks Open Source AI AI Safety