gentic.news — AI News Intelligence Platform
coding

30 articles about coding in AI news

MiniMax M3 Sparse Attention: 15.6x Decoding Speedup at 1M Tokens

MiniMax M3 sparse attention achieves 9.7x prefilling and 15.6x decoding speedup at 1M tokens, reversing M2's full-attention stance.

92% relevant

No Rigorous Productivity Tests Exist for Post-2025 Autonomous Coding Tools

No productivity studies exist for autonomous coding tools launched December 2025. All research predates the Claude Code/Codex revolution, creating a major knowledge gap.

72% relevant

Jensen Huang Wants Zero Coding at NVIDIA — 'Purpose vs Task'

Jensen Huang wants zero coding by NVIDIA engineers, framing it as a task to minimize. The bet is AI-generated code will match human output for performance-critical software.

77% relevant

Median Coding Agent Hits 96k Input Tokens, Rewriting Inference Economics

SemiAnalysis found, from a sample of 432k requests, that the median coding agent uses 96k input tokens, shifting the focus of inference cost from output to context.

95% relevant

Qwen 3.7-Max Agentic Coding Demo Shows Frontier-Level UI Replication

Qwen 3.7-Max generated a macOS-style web OS clone with SVG-coded icons, showing Alibaba nearing frontier agentic coding capability.

100% relevant

Composer 2.5 Scores 62 on Coding Index at $0.07 vs. $4-5 for Rivals

Composer 2.5 scores 62 on the coding index at $0.07/task, vs. $4-5/task for rivals scoring 65-66: roughly 60x cost savings with near-parity performance.

83% relevant

Meta Trains Coding AI on Engineers' Work Traces as 8K Jobs Cut

Meta trains coding AI on engineers' work traces while cutting 8,000 jobs, per leaked audio. The behavior cloning strategy uses internal problem-solving steps as training data.

100% relevant

Vibe-Coding Bottleneck: CPU Box Rental Gets Harder

SemiAnalysis flags that the vibe-coding wave is making cheap CPU box rentals less routine, bottlenecking developers who need quick cloud compute for AI prototyping.

75% relevant

NanoGPT-Bench: A New Eval for Coding Agents Doing AI Research

IntologyAI released NanoGPT-Bench, an internal eval for coding agents on an AI R&D problem. No results or task specifics have been disclosed.

85% relevant

The Five-Step Loop: Spec-First Coding Agents Cut Drift by 10x

The five-step loop makes every coding agent step a persistent artifact. Skipping the spec causes compounding drift that's invisible until verification passes for the wrong feature.

92% relevant

AI Coding Tools Amplify Bad Engineering, Not Fix It

AI coding tools amplify existing engineering weaknesses. Teams without discipline produce bad code faster, not good code.

80% relevant

Gemini Flash Rumored at 92% of GPT-5.5 Coding, 15-20x Cheaper

Unconfirmed rumor claims Gemini Flash achieves 92% of GPT-5.5 coding performance at 15-20x lower cost. Source is a single X post; no official confirmation.

89% relevant

Opus 4.7 Prompt Surgery: 20K-Char Cut Per Coding Turn

Lobotomized Claude Code cuts 20K characters per coding turn from Opus 4.7's prompt, removing overfitted CAPS directives and anti-laziness scaffolding that harm the newer model.

78% relevant

Fake Done: Why AI Coding Agents Ship Incomplete Work

Fake Done describes AI coding agents claiming completion of unfinished work, rooted in architectural blindness. Deterministic verification outside the agent offers a fix.

84% relevant

Snapdragon X2 Elite Beats Intel Arrow Lake for AI Coding Agents

Snapdragon X2 Elite beat Intel Arrow Lake for Windows AI coding agents. A CPU bottleneck, not inference speed, limited performance, per @mweinbach.

92% relevant

NVIDIA NeMo RL Speculative Decoding: 1.8× Rollout Speed at 8B

NVIDIA's NeMo RL speculative decoding achieves 1.8× rollout speedup at 8B and projects 2.5× at 235B, cutting RL training time by over half.

72% relevant

Agentic Harness Engineering Boosts Coding Agents 7% on Terminal-Bench 2

Agentic Harness Engineering introduces a structured approach to evolving coding-agent harnesses, using revertible components, condensed experience, and falsifiable decisions. On Terminal-Bench 2, pass@1 climbs from 69.7% to 77.0% in ten iterations, beating human-designed baselines.

100% relevant

Anthropic's One-Sentence Prompt Broke Claude's Coding for Days

Anthropic added 'keep responses under 25 words' to Claude's system instructions, causing a sudden collapse in coding performance that users detected within hours and that took 4 days to fix.

97% relevant

PayPal Cuts LLM Inference Cost 50% with EAGLE3 Speculative Decoding on H100

PayPal engineers applied EAGLE3 speculative decoding to their fine-tuned 8B-parameter commerce agent, achieving up to 49% higher throughput and 33% lower latency. This allowed a single H100 GPU to match the performance of two H100s running NVIDIA NIM, cutting inference hardware cost by 50%.

90% relevant

Moonshot AI Ships Trillion-Parameter Open Model, Matches Claude Opus on Coding

Moonshot AI released a trillion-parameter open-source model that reportedly matches Anthropic's Claude Opus on most coding benchmarks. The release came the same day Anthropic committed $25B to AWS for compute, highlighting divergent AI scaling strategies.

100% relevant

Google's Design.md Gives AI Coding Agents a Visual Design Memory

Google introduced Design.md, a file format for storing design tokens and rules that AI coding agents can read to maintain visual consistency, addressing a key failure point in automated UI generation.

95% relevant

Qwen3.6-27B: How to Run a 17GB Local Model That Beats 397B MoE on Coding Tasks

Qwen3.6-27B delivers flagship-level coding performance in a 55.6GB model that can be quantized to 16.8GB, making high-quality local coding assistance accessible.

100% relevant

SpaceXAI Partners with Cursor AI to Build 'World's Best' Coding Assistant

SpaceXAI and Cursor AI announced a partnership to integrate SpaceX's engineering data with Cursor's editor, aiming to create a top-tier AI for coding and knowledge work.

100% relevant

Google DeepMind Forms 'Strike Team' to Boost AI Coding, Citing Anthropic Pressure

Google has formed a specialized team within DeepMind to rapidly improve its AI coding capabilities. The move is a direct response to internal assessments that Anthropic's tools are more advanced, with leadership pushing for agentic systems.

100% relevant

Moonshot AI's Kimi K2.6 Hits 58.6% on SWE-Bench Pro, Leads Open-Source Coding

Moonshot AI released Kimi K2.6, an open-source coding model achieving 58.6% on SWE-Bench Pro and 54.0% on HLE with tools. This positions it as a top-tier open alternative to proprietary models like Claude 3.5 Sonnet.

100% relevant

Chamath: AI Coding Agents Erase the '10x Engineer' Advantage

Chamath Palihapitiya argues AI coding agents are eliminating the '10x engineer' by making the most efficient code paths obvious to all, similar to how AI solved chess. This reduces technical differentiation and shifts the basis of engineering value.

85% relevant

Claude Opus 4.7 Launches with 3.75MP Vision, Agentic Coding, and New Tokenizer

Anthropic launched Claude Opus 4.7 today with 3x higher vision resolution (3.75MP), self-verifying coding outputs, and stricter instruction following. The update targets enterprise agentic workflows and knowledge work benchmarks.

100% relevant

Apple Sends 200 Siri Engineers to AI Coding Bootcamp Ahead of WWDC

Apple is sending ~200 Siri engineers to a multi-week bootcamp to learn AI coding tools like Claude Code and Codex. This retraining precedes the expected June WWDC unveiling of a Gemini-powered Siri overhaul.

85% relevant

Coding Agent UIs Converge on Side-by-Side Sessions, Says Omar Sar

AI researcher Omar Sar observes a UI convergence in coding agents like Cursor and Claude Code, which are moving towards flexible, multi-session interfaces that boost developer productivity and agent capability.

75% relevant

Tiny Fish Improves Live Web Usability for AI Coding Agents

Tiny Fish has released a tool that makes the live web significantly more usable for AI coding agents. This addresses a critical failure point where agent workflows often break down during real-world web interactions.

85% relevant