code review
30 articles about code review in AI news
Claude Code Routines: Automate Code Reviews
Automate Claude Code tasks like scheduled code reviews or deployment hooks using the new Routines feature, which runs on Anthropic's infrastructure.
Qodo AI Code Review Tool Claims Major Edge Over Anthropic's Claude in Performance and Cost
A new AI-powered code review tool called Qodo reportedly outperforms Anthropic's Claude Code Review by 19% in recall while costing ten times less per review, potentially reshaping the landscape of automated development assistance.
Anthropic's Claude Code Launches Autonomous Code Review, Pushing AI Beyond Simple Generation
Anthropic has launched Code Review in Claude Code, a multi-agent system that automatically analyzes AI-generated code for logic errors and security vulnerabilities. This represents a shift from AI as a coding assistant to an autonomous reviewer capable of complex, multi-step reasoning.
The Silent Revolution: How AI Code Reviewers Are Earning Trust Through Real-World Validation
AI-powered code review systems are undergoing continuous validation through thousands of daily developer actions in open-source repositories. Each time a developer fixes a bug flagged by AI, it serves as an independent vote of confidence in the system's accuracy.
Martian Researchers Unveil Code Review Bench: A Neutral Benchmark for AI Coding Assistants
Researchers from DeepMind, Anthropic, and Meta have launched Code Review Bench, a new benchmark designed to objectively evaluate AI code review capabilities without commercial bias. This collaborative effort aims to establish standardized measurement for how well AI models can analyze, critique, and improve code.
AI Code Review Showdown: New Data Reveals Surprising Performance Gaps
New research provides the first comprehensive data-driven comparison of AI code review tools, revealing significant performance differences between GitHub Copilot and Graphite. The findings challenge assumptions about AI's role in software development workflows.
Beyond the Hype: The New Open Benchmark Putting Every AI Code Review Tool to the Test
A new open benchmarking platform allows developers to test their custom AI code review bots against eight leading commercial tools using real-world data. This transparent approach moves beyond marketing claims to provide objective performance comparisons.
AI Code Review Tools Finally Get Real-World Benchmarks: The End of Vibe-Based Decisions
New benchmarking of 8 AI code review tools using real pull requests provides concrete data to replace subjective comparisons. This marks a shift from brand-driven decisions to evidence-based tool selection in software development.
Cloudflare's New MCP Server Cuts AI Code Review Costs by 70%
A new MCP server from Cloudflare pre-processes code to remove non-essential elements, slashing token consumption for AI-powered development workflows.
Side-by-Side Code Reviews: How to Compare Claude Code vs. Codex Outputs for Better Results
Learn how to compare Claude Code and Codex outputs side-by-side to identify each model's strengths and choose the right tool for specific coding tasks.
How Spec-Driven Development Cuts Claude Code Review Time by 80%
A developer's experiment shows that writing formal, testable specifications in plain English before coding reduces Claude Code hallucinations and eliminates manual verification of every generated line.
Inline Code Review UI for Claude Code Cuts Feedback Loop from Minutes to Seconds
A new VS Code extension lets you annotate Claude Code's changes directly in your editor and send structured feedback back to Claude via the Channels API.
Claude Code's New /review Command: How to Use It Without Breaking Your Budget or Team
Claude Code now has built-in code review. Learn the exact prompts and CLI flags to make it cost-effective and complementary to senior engineers.
Claude Agent Teams UI: The Visual Dashboard That Makes Agent Teams Actually Usable
A free, open-source desktop app that adds a real-time kanban board, cross-team messaging, and hunk-level code review to Claude Code's Agent Teams feature.
Amazon's AI Coding Crisis: How Generative Tools Triggered Major Outages and Forced Emergency Response
Amazon is convening an emergency meeting after AI-assisted coding tools caused four major website outages in one week. The company is implementing manual code reviews and developing AI safeguards to prevent future crashes affecting critical features like checkout.
Use Claude Code to Automate Systematic Literature Reviews
Claude Code can automate systematic literature reviews: scrape papers, extract key themes, and generate structured summaries — all from the terminal.
Codex 'Chronicle' Research Preview Adds Memory for Daily Developer Context
A research preview of 'Chronicle' for Codex has been released. It enables the AI coding assistant to accumulate memories from a developer's daily workflow to improve context.
Claude Code's /ultrareview Command
Claude Code's new /ultrareview command runs multiple AI reviewers in parallel to find and independently verify real bugs, costing $5-20 per run after three free tries.
How to Manage Multiple Claude Code Sessions with Harness and Preview
Two actionable tools to solve the core productivity bottlenecks when running multiple Claude Code agents: session management and review speed.
Code-Review-Graph Cuts Claude Token Usage 8.2x with Local Knowledge Graph
A developer released 'code-review-graph,' an open-source tool that uses Tree-sitter to build a persistent structural map of a codebase. This allows Claude to read only relevant files, cutting average token usage by 8.2x across six real repositories.
Tandem: Add Real-Time Document Review to Claude Code in 3 Commands
Tandem is an MCP server that connects Claude Code to a browser-based editor for real-time, annotated document review, eliminating the back-and-forth of traditional prompting.
Codex-Review Plugin: The Structured Workflow for Claude-Codex Collaboration
A new Claude Code plugin adds five simple commands to create a repeatable, artifact-driven review loop between Claude and OpenAI's Codex, preventing plan drift and lost context.
Stop Reviewing Every Line: 3 Claude Code Workflows That Verify Code For You
How to use CLAUDE.md rules, MCP servers, and targeted prompting to automatically validate Claude Code's output before you review it.
Anthropic's Claude Code Now Acts as Autonomous PR Agent, Fixing CI Failures & Review Comments in Background
Anthropic has transformed Claude Code into a persistent pull request agent that monitors GitHub PRs, reacts to CI failures and reviewer comments, and pushes fixes autonomously while developers are offline. The system runs on Anthropic-managed cloud infrastructure, enabling full repo operations without local compute.
Anthropic Launches Claude Code Auto Mode Preview, a Safety Classifier to Prevent Mass File Deletions
Anthropic is previewing 'auto mode' for Claude Code, a classifier that autonomously executes safe actions while blocking risky ones like mass deletions. The feature, rolling out to Team, Enterprise, and API users, follows high-profile incidents like a recent AWS outage linked to an AI tool.
Claude Code's Built-In Preview MCP: Instant Frontend Previews Without Configuration
Claude Code Desktop now includes a built-in MCP server for instant HTML/CSS/JS previews—no installation or configuration needed.
Claude Mythos Goes GA in Google Cloud Console, Drops Preview Label
Claude Mythos quietly reached general availability (GA) in the Google Cloud console, with its preview label removed. The change signals deeper Anthropic-GCP integration.
Claude Mythos Preview Doubles METR Time Horizon at 80% Success
According to Anthropic, a Claude Mythos Preview snapshot achieves twice the METR time horizon of the next-best model at an 80% success rate. Absolute numbers were not disclosed.
GPT-Image-2 Adds Self-Review Loop for Iterative Image Correction
A new capability in GPT-Image-2 allows the model to review and iteratively correct its own image generations, aiming for higher accuracy before final output.
Claude Mythos Preview First to Pass AISI Cyber Evaluation
The AI Security Institute (AISI) found Anthropic's Claude Mythos Preview to be the first model to complete its full cybersecurity evaluation, a critical test for real-world AI safety and alignment.