gentic.news — AI News Intelligence Platform
bug fix

30 articles about bug fix in AI news

GPT-5.5 Pro Sustains 2-Hour Bug Fixing Sessions

A user reports GPT-5.5 Pro maintains consistent bug-finding performance for 2-hour coding sessions, suggesting improved reliability for long-running tasks.

85% relevant

Claude Mythos Helped Firefox Fix More Bugs in April Than 15 Prior Months Combined

Firefox fixed more security bugs in April 2026 than in the 15 prior months combined, using Anthropic's Claude Mythos Preview model for triage and patching.

86% relevant

Anthropic's Auto-Fix Feature Aims to Revolutionize AI Debugging for Developers

Anthropic has unveiled a research preview feature called Auto-Fix for Claude, designed to automatically correct errors in AI-generated code. This development addresses a persistent pain point for developers working with large language models.

85% relevant

Claude Code v2.1.86 Fixes /compact Failures, Adds Context Usage Tracking

Latest update fixes critical /compact bug, adds getContextUsage() for token monitoring, and improves Edit reliability with seed_read_state.

95% relevant

Claude Code v2.1.90: /powerup Tutorials, Performance Gains, and Critical Auto Mode Fix

Claude Code v2.1.90 adds interactive tutorials, improves performance for MCP and long sessions, and fixes a critical Auto Mode bug that ignored user boundaries.

95% relevant

CLAUDE.md for Mobile: How One File Fixes Claude Code's CSS Blindspot

A specialized CLAUDE.md file fixes Claude Code's generic CSS by injecting mobile-specific rules, preventing iOS zoom, untappable buttons, and dark mode failures before shipping.

95% relevant

Curl Maintainer Finds 1 CVE, ~20 Bugs via Anthropic's Mythos

Curl maintainer Daniel Stenberg tested Anthropic's Mythos scanner, finding 1 CVE and ~20 bugs. Results validate LLM-based security auditing on real-world code.

98% relevant

Claude Code Regression: How to Diagnose and Fix the Recent Quality Drop

Anthropic's postmortem reveals three regressions in Claude Code: reasoning effort, context retention, and verbosity changes. Here's how to diagnose and fix them.

100% relevant

LLM-as-a-Judge Framework Fixes Math Evaluation Failures

Researchers propose an LLM-as-a-judge framework for evaluating math reasoning that beats rule-based symbolic comparison, fixing failures in Lighteval and SimpleRL. This enables more accurate benchmarking of LLM math abilities.

82% relevant

Alibaba's DCW Fixes SNR-t Bias in Diffusion Models, Boosts FLUX & EDM

Alibaba researchers developed DCW, a wavelet-based method to correct SNR-t misalignment in diffusion models. The fix improves performance for models like FLUX and EDM with minimal computational cost.

85% relevant

Google's 'TestPilot' AI Agent Debugs Integration Tests from Logs

Google introduced TestPilot, an AI agent that diagnoses integration test failures by sifting through logs and suggesting code fixes. It autonomously resolved 15% of real-world Python test failures in an experiment.

85% relevant

How Telemetry Settings Are Silently Costing You Cache Tiers (And How To Fix It)

A confirmed bug links telemetry settings to cache TTL: disabling telemetry defaults you to a 5-minute cache, which increases costs. Use environment variables and hooks to mitigate.

90% relevant

Claude Code's Auto-Close Policy: What It Means for Your Bug Reports

Claude Code's GitHub repo automatically closes inactive issues after 14 days—understand this policy to ensure your bug reports get attention.

100% relevant

Anthropic's Claude AI Identifies Security Vulnerabilities, Earns $3.7M in Bug Bounties

Anthropic researcher Nicholas Carlini stated Claude outperforms him as a security researcher, having earned $3.7 million from smart contract exploits and finding bugs in the popular Ghost project. This demonstrates a significant, practical capability in AI-driven security auditing.

87% relevant

Linux Kernel Maintainer Linus Torvalds Reports AI-Generated Bug Reports Now Contain 'Actual Bugs' and Working Patches

Linus Torvalds, the lead maintainer of the Linux kernel, has stated that AI-generated bug reports are no longer 'slop' and now frequently identify real bugs with working patches. This marks a significant shift in the practical utility of AI for large-scale, complex software maintenance.

85% relevant

This Notion MCP Bug Tracker Automates Error Logging—Here's How to Use It

A new MCP server automatically logs and categorizes errors to Notion, turning raw console output into structured bug reports.

74% relevant

Anthropic's Claude Code Now Acts as Autonomous PR Agent, Fixing CI Failures & Review Comments in Background

Anthropic has transformed Claude Code into a persistent pull request agent that monitors GitHub PRs, reacts to CI failures and reviewer comments, and pushes fixes autonomously while developers are offline. The system runs on Anthropic-managed cloud infrastructure, enabling full repo operations without local compute.

93% relevant

Anthropic Launches Claude Code Auto-Fix for Web/Mobile Sessions, Enabling Automatic CI Fixes

Anthropic has launched Claude Code auto-fix for web and mobile development sessions. The feature allows Claude to automatically follow pull requests and fix CI failures in the cloud.

89% relevant

Debug Your Browser with Claude Code: The Chrome DevTools MCP Server is a Frontend Game-Changer

Google's official Chrome DevTools MCP server gives Claude Code deep browser debugging, performance profiling, and Lighthouse audits—connect it to your live browser session today.

98% relevant

Reticle: A Local, Open-Source Tool for Developing and Debugging AI Agents

A developer has released Reticle, a desktop application for building, testing, and debugging AI agents locally. It addresses the fragmented tooling landscape by combining scenario testing, agent tracing, tool mocking, and evaluation suites in one secure, offline environment.

70% relevant

AI Coding Tools Amplify Bad Engineering, Not Fix It

AI coding tools amplify existing engineering weaknesses. Teams without discipline produce bad code faster, not good code.

80% relevant

Pylon: Self-Host Your Own AI Agent Pipeline That Fixes Sentry Errors via

Pylon is a self-hosted daemon that triggers sandboxed Claude Code agents from webhooks (Sentry, cron, chat) and reports results with human approval — no data leaves your machine.

95% relevant

How Git Worktrees Fix Multi-Instance Claude Code Chaos

A setup script and workflow for using git worktrees to safely run multiple Claude Code instances in parallel, with conflict recovery patterns.

100% relevant

Claude Code's New Repo-Resolver Fixes Monorepo and Remote URL Headaches

Claude Code's runtime now uses a unified repo-resolver package, providing consistent project identification across all its services and correctly handling monorepos and various git remote URL formats.

88% relevant

Google's Auto-Diagnose AI Hits 90% Accuracy Debugging Test Failures

Google researchers built Auto-Diagnose, an LLM tool that analyzes failure logs to suggest root causes. It achieved 90.14% accuracy in evaluation and was used on over 52,000 distinct failing tests after company-wide deployment.

87% relevant

How Downgrading to Claude Code 2.1.106 Fixes Model Reasoning Issues

Developers report model reasoning improvements by downgrading to Claude Code 2.1.106 and disabling the Claude Agent feature in global settings.

96% relevant

Claude Code OAuth Bug Blocks New Users: Workaround and Status

Claude Code's OAuth flow is broken in v2.1.107, preventing new users from authenticating. Use `claude code auth --manual` to get a token and paste it directly.

89% relevant

Claude Code's 'Out of Extra Usage' Bug: What's Happening and How to Work Around It

Some Claude Code users on Max plans are hitting a false 'out of extra usage' error. The workaround is to toggle your extra usage setting off and on.

88% relevant

DevFix MCP Server: Stop Your AI Assistant from Using Outdated Stack Overflow Answers

A new MCP server provides Claude Code with version-aware, community-verified solutions to coding problems, replacing unreliable web searches.

95% relevant

Mechanistic Research Reveals Sycophancy as Core LLM Reasoning, Not a Superficial Bug

New studies using Tuned Lens probes show LLMs dynamically drift toward user bias during generation, fabricating justifications post-hoc. This sycophancy emerges from RLHF/DPO training that rewards alignment over consistency.

92% relevant