gentic.news — AI News Intelligence Platform

agentic coding

30 articles about agentic coding in AI news

Qwen 3.7-Max Agentic Coding Demo Shows Frontier-Level UI Replication

Qwen 3.7-Max generated a macOS-style web OS clone with SVG-coded icons, showing Alibaba nearing frontier agentic coding capability.

100% relevant

AI-2027 Authors Accelerate AGI Timelines, Citing Rapid Progress in Agentic Coding

The AI-2027 forecasting group has accelerated its timeline for when AI could replace human software engineers by 1.5 years, from late 2029 to mid-2028. This revision is based on observed rapid progress in agentic coding systems over the last 3-5 months.

85% relevant

Agent Psychometrics: New Framework Predicts Task-Level Success in Agentic Coding Benchmarks with 0.81 AUC

A new research paper introduces a framework using Item Response Theory and task features to predict success on individual agentic coding tasks, achieving 0.81 AUC. This enables benchmark designers to calibrate difficulty without expensive evaluations.

75% relevant

Claude Opus 4.7 Launches with 3.75MP Vision, Agentic Coding, and New Tokenizer

Anthropic launched Claude Opus 4.7 today with 3x higher vision resolution (3.75MP), self-verifying coding outputs, and stricter instruction following. The update targets enterprise agentic workflows and knowledge work benchmarks.

100% relevant

Context Graph for Agentic Coding: A New Abstraction for LLM-Powered Development

A new "context graph" abstraction is emerging for AI coding agents, designed to manage project state and memory across sessions. It aims to solve the persistent context problem in long-running development tasks.

89% relevant

From Agentic Coding to Autonomous Factories: How Cursor Automations Is Redefining Software Engineering

Cursor's new Automations feature transforms AI-assisted coding from a manual, agent-babysitting model to an event-driven system where AI agents trigger automatically based on workflows. This addresses the human attention bottleneck in managing multiple coding agents simultaneously.

85% relevant

Claude Opus 4.8 Launches Dynamic Workflows for Agentic Code

Claude Opus 4.8 launched with dynamic workflows for Claude Code, enabling multi-step agentic coding. The release addresses quality issues following a ~25% instruction miss rate observed after version 4.6.

100% relevant

Claude Code's /btw Command Enables Side Conversations During Agentic Workflows

Claude Code's new /btw command allows developers to ask follow-up questions while Claude is actively working, maintaining context without interrupting the primary task. This addresses a key workflow friction point in agentic coding.

95% relevant

Cursor AI Unveils New Benchmark for Evaluating AI Coding Assistants

Cursor AI has introduced a novel method for scoring AI models on agentic coding tasks, measuring both intelligence and efficiency. The benchmark reveals how different models perform in real-world development scenarios.

87% relevant

DeepSeek-R1 Reportedly Hits 78.9% on OS-World, Outperforming GPT-5.4 at 1/10th Cost

A new benchmark claim suggests DeepSeek-R1 has achieved 78.9% on the OS-World agentic coding benchmark, reportedly outperforming GPT-5.4 while operating at one-tenth the cost. If verified, this would represent a significant leap in cost-performance for AI coding agents.

95% relevant

Boris Cherny's Claude Code Tips Are Now a Skill. Here Is What the Complete Collection Reveals.

A curated collection of expert Claude Code tips is now available as a shareable 'Skill,' revealing proven workflows for faster, more reliable agentic coding.

95% relevant

Industry Leaders Predict 2026 as Breakthrough Year for AI Agents Across Domains

AI industry leaders predict 2026 as the breakthrough year for AI agents across all domains, following initial successes in agentic coding. NVIDIA's Jensen Huang positions current AI development in the 'era of Agents'.

87% relevant

From Code to Discovery: The Next Frontier of AI Agents in Research

AI researcher Omar Saray predicts a shift from 'agentic coding' to 'agentic research'—where AI systems will autonomously conduct scientific discovery. This evolution promises to accelerate innovation across disciplines.

85% relevant

Agentic Harness Engineering Boosts Coding Agents 7% on Terminal-Bench 2

Agentic Harness Engineering introduces a structured approach to evolving coding-agent harnesses, using revertible components, condensed experience, and falsifiable decisions. On Terminal-Bench 2, pass@1 climbs from 69.7% to 77.0% in ten iterations, beating human-designed baselines.

100% relevant

Claude Opus 4.6 Is Live: How to Use Its Improved Coding & Agentic Features in Claude Code

Claude Opus 4.6 is now available with better coding accuracy and agentic task handling. Here's how to configure Claude Code to use it and what to expect.

95% relevant

Google DeepMind Forms 'Strike Team' to Boost AI Coding, Citing Anthropic Pressure

Google has formed a specialized team within DeepMind to rapidly improve its AI coding capabilities. The move is a direct response to internal assessments that Anthropic's tools are more advanced, with leadership pushing for agentic systems.

100% relevant

CMU Research Identifies 'Biggest Unlock' for Coding Agents: Strategic Test Execution

New research from Carnegie Mellon University suggests the key advancement for AI coding agents lies not in raw code generation, but in developing strategies for how to run and interpret tests. This shifts focus from LLM capability to agentic reasoning.

87% relevant

Andrew Ng's Context Hub Solves AI's Documentation Dilemma for Coding Agents

Andrew Ng's team at DeepLearning.AI has launched Context Hub, an open-source tool that provides coding agents with real-time API documentation access. This addresses a critical bottleneck in agentic AI workflows where outdated documentation causes failures.

80% relevant

MiniMax M3 Sparse Attention: 15.6x Decoding Speedup at 1M Tokens

MiniMax M3's sparse attention achieves a 9.7x prefilling speedup and a 15.6x decoding speedup at 1M tokens, reversing M2's full-attention stance.

100% relevant

Median Coding Agent Hits 96k Input Tokens, Rewriting Inference Economics

SemiAnalysis found that the median coding agent uses 96k input tokens, based on 432k requests, shifting the focus of inference cost from output to context.

95% relevant

Composer 2.5 Scores 62 on Coding Index at $0.07 vs. $4-5 for Rivals

Composer 2.5 scores 62 on the coding index at $0.07 per task, versus $4-5 per task for rivals scoring 65-66. That is roughly 60x cost savings with near-parity performance.

83% relevant

Snapdragon X2 Elite Beats Intel Arrow Lake for AI Coding Agents

Snapdragon X2 Elite beat Intel Arrow Lake for AI coding agents on Windows. According to @mweinbach, a CPU bottleneck, not inference speed, limited performance.

92% relevant

PayPal Cuts LLM Inference Cost 50% with EAGLE3 Speculative Decoding on H100

PayPal engineers applied EAGLE3 speculative decoding to their fine-tuned 8B-parameter commerce agent, achieving up to 49% higher throughput and 33% lower latency. This allowed a single H100 GPU to match the performance of two H100s running NVIDIA NIM, cutting inference hardware cost by 50%.

90% relevant

Google's Design.md Gives AI Coding Agents a Visual Design Memory

Google introduced Design.md, a file format for storing design tokens and rules that AI coding agents can read to maintain visual consistency, addressing a key failure point in automated UI generation.

95% relevant

Qwen3.6-27B: How to Run a 17GB Local Model That Beats 397B MoE on Coding Tasks

Qwen3.6-27B delivers flagship-level coding performance in a 55.6GB model that can be quantized to 16.8GB, making high-quality local coding assistance accessible.

100% relevant

Chamath: AI Coding Agents Erase the '10x Engineer' Advantage

Chamath Palihapitiya argues AI coding agents are eliminating the '10x engineer' by making the most efficient code paths obvious to all, similar to how AI solved chess. This reduces technical differentiation and shifts the basis of engineering value.

85% relevant

Tiny Fish Improves Live Web Usability for AI Coding Agents

Tiny Fish has released a tool that makes the live web significantly more usable for AI coding agents. This addresses a critical failure point where agent workflows often break down during real-world web interactions.

85% relevant

OpenClaw Creator: Agentic Workflows Fail Without Human Taste in Loop

Peter Steinberger, creator of the OpenClaw AI agent framework, argues that the core failure in agentic workflows is removing human judgment too soon. He asserts that strong output requires continuous human vision, steering, and questioning.

75% relevant

OpenMontage: Open-Source Agentic Video Production System Costs $0.69 Per Ad

OpenMontage, an open-source agentic video production system, has been released. It orchestrates 11 pipelines and 49 tools across multiple AI providers to autonomously script, generate assets, edit, and render videos from a plain language prompt.

99% relevant

Anthropic's Agentic Workflows Launch: A Deep Dive on Cost & Capabilities

Anthropic launched Agentic Workflows, a managed service for running persistent AI agents. Although the service is marketed as starting at $0.08/hr, real-world costs are higher once compute, memory, and network fees are included.

82% relevant