gentic.news — AI News Intelligence Platform
code quality

30 articles about code quality in AI news

AMD AI Director Reports Claude Code Quality Decline, Cites 234k Tool Calls

An AMD AI executive presented data from over 6,800 sessions showing Claude Code's performance has declined since early March, with rising instances of shallow reasoning and incomplete tasks. This raises significant trust issues for engineers using the model in complex development workflows.

89% relevant

Forge Plugin Adds Governance to Claude Code: 22 Agents, Quality Gates, and Zero Config

Install the Forge plugin to add automated quality checks, health scoring, and specialized agents to Claude Code workflows in 30 seconds.

89% relevant

Claude Sonnet 4.5 vs 4.0: What the Quality Regression Means for Your Claude Code Workflow

Recent analysis shows Claude Sonnet 4.5 may have quality regressions vs 4.0. Here's how Claude Code users should adapt their prompting and model selection.

86% relevant

GPT-5.4 LLM Choice Drastically Impacts GPT-ImageGen-2 Output Quality

The quality of images generated by GPT-ImageGen-2 is heavily dependent on the underlying LLM used for reasoning. GPT-5.4 'Thinking' and 'Pro' models produce superior outputs, especially for complex concepts, a non-intuitive finding not documented by OpenAI.

85% relevant

New Research Identifies Data Quality as Key Bottleneck in Multimodal Forecasting

A new arXiv paper introduces CAF-7M, a 7-million-sample dataset for context-aided forecasting. The research shows that poor context quality, not model architecture, has limited multimodal forecasting performance. This has implications for retail demand prediction that combines numerical data with text or image context.

70% relevant

Claude Code Regression: How to Diagnose and Fix the Recent Quality Drop

Anthropic's postmortem reveals three regressions in Claude Code: reasoning effort, context retention, and verbosity changes. Here's how to diagnose and fix them.

100% relevant

Creator Shares 5-Prompt Claude Workflow for High-Quality Content

A content creator detailed a specific 5-prompt workflow for Anthropic's Claude AI, claiming it generates superior writing to his own multi-year output. The method focuses on structured prompting without plugins.

75% relevant

Swarm Plugin Enforces Consistent 9/10 Outputs from Claude Code Teams

The Swarm plugin for Claude Code creates a structured team of agents that review and score work before it reaches you, solving the problem of inconsistent output quality.

100% relevant

Efficient Universal Perception Encoder (EUPE) Family Challenges DINOv2

Researchers introduced the Efficient Universal Perception Encoder (EUPE), a family of compact vision models that achieve performance rivaling the larger DINOv2. This could enable high-quality visual understanding on resource-constrained devices.

85% relevant

Codex-CLI-Compact: The Graph-Based Context Engine That Cuts Claude Code Costs 30-45%

A new local tool builds a semantic graph of your codebase to pre-load only relevant files into Claude's context, reducing token usage by 30-45% without quality loss.

100% relevant

How to Build a Custom AI Agent with Claude Code's Skills, SubAgents, and Hooks

A developer's deep dive into customizing Claude Code with 7 skills, 5 subagents, and quality-check hooks, showing how to move beyond basic prompting to create a truly autonomous coding assistant.

95% relevant

How a Non-Programmer Built a 487-File Unity Tool with Claude Code's 'Vibe Coding'

A graphic designer built a complex Unity map editor with 151K+ lines of C# using Claude Code's iterative 'describe → test → fix' workflow and early quality rule enforcement.

100% relevant

AI's Hidden Talent: How Mediocre Code Delivers Exceptional Real-World Value

New research reveals AI can transform low-quality code into high-value practical applications, with the biggest impact outside traditional software development. Even skills rated just 6.2/12 deliver significant productivity boosts across diverse fields.

85% relevant

Claude Code Plugin Deploys 17-Agent SDLC Team With Orchestrator

Team-of-agents plugin adds 17 specialist AI agents with an orchestrator to Claude Code, using confidence signals to gate output quality.

92% relevant

How to Install claude-flow MCP and 3 Skills That Transform Claude Code

A production team's setup reveals claude-flow MCP with hierarchical-mesh topology and three essential skills that add structure, parallelism, and quality control.

95% relevant

NanoVDR: A 70M Parameter Text-Only Encoder for Efficient Visual Document Retrieval

New research introduces NanoVDR, a method to distill a 2B parameter vision-language retriever into a 69M text-only student model. It retains 95% of teacher quality while cutting query latency 50x and enabling CPU-only inference, crucial for scalable search over visual documents.

82% relevant

Microsoft Open-Sources AI Engineer Coach, a Fitbit for Dev Workflows

Microsoft open-sourced AI Engineer Coach, a VS Code extension that scores developer AI workflow quality across 5 categories with 45 anti-pattern rules.

95% relevant

Renoise AI Tool Enables Programmatic Video Generation, Promising Faster Production

Renoise has launched an AI tool that generates videos through code rather than traditional editing. The platform claims to produce high-quality videos more easily and faster than previous methods.

85% relevant

Google DeepMind's Unified Latents Framework: Solving Generative AI's Core Trade-Off

Google DeepMind introduces Unified Latents (UL), a novel framework that jointly trains diffusion priors and decoders to optimize latent space representation. This approach addresses the fundamental trade-off between reconstruction quality and learnability in generative AI models.

75% relevant

Microsoft Markitdown: One-Command File-to-Markdown for LLMs

Microsoft open-sourced Markitdown, a one-command file-to-Markdown converter for LLMs. It aims to improve output quality by supplying input in Markdown, a format well represented in LLM training data.

75% relevant

Cascaded LLMs Lift E-Commerce Cart Adds 2.7% in Online Test

A cascaded LLM framework for e-commerce storefront generation lifted cart adds by +2.7% in online tests, using teacher-student fine-tuning to approach closed-weight LLM quality at production latency.

100% relevant

Anthropic Deprecates Fixed Thinking Budgets, Forces Adaptive Mode

Anthropic has deprecated fixed thinking budgets and made adaptive thinking mandatory on Claude models. Users report quality drops, and the change reduces API revenue potential.

100% relevant

DeepMind’s New VAE Matches Stable Diffusion at 10x Resolution

DeepMind’s new VAE produces 1024x1024 images with quality comparable to Stable Diffusion’s 256x256 output, potentially replacing the standard VAE in generative pipelines. This cuts the token count by 10x, enabling faster generation and lower memory usage.

85% relevant

AI Frontier Pricing Widens Global Access Gap, Analysis Shows

A viral analysis highlights that Anthropic and OpenAI's $200/mo plans cost 15% of median monthly income in Nigeria vs 0.3% in the US, raising concerns about global AI access inequality.

89% relevant

Qwen3.6-27B: How to Run a 17GB Local Model That Beats 397B MoE on Coding Tasks

Qwen3.6-27B delivers flagship-level coding performance in a 55.6GB model that can be quantized to 16.8GB, making high-quality local coding assistance accessible.

100% relevant

GPT-5.5 Limited Rollout Begins, Frontend Improvements Noted

OpenAI has started a limited rollout of GPT-5.5 to select users, with early reports highlighting significant frontend quality improvements. This suggests an incremental update focused on user experience rather than core model capabilities.

85% relevant

TME-PSR: A New Sequential Recommendation Model Unifies Time

Researchers propose TME-PSR, a model integrating personalized time patterns, multi-interest modeling, and explanation alignment for sequential recommendations. It shows improved accuracy and explanation quality with lower computational cost in experiments.

90% relevant

Beyond Relevance: A New Framework for Utility-Centric Retrieval in the LLM Era

This tutorial paper posits that the rise of Retrieval-Augmented Generation (RAG) changes the fundamental goal of information retrieval. Instead of finding documents relevant to a query, systems must now retrieve information that is most *useful* to an LLM for generating a high-quality answer. This requires new evaluation frameworks and system designs.

92% relevant

AI-Powered 'Vibe Coding' Drives 84% Surge in App Store Submissions

App Store submissions surged 84% last year to over 600,000 new apps, driven by AI-assisted 'vibe coding.' This rapid proliferation is devaluing traditional development skills and flooding the market with low-quality applications.

75% relevant

Meta Halts Mercor Work After Supply Chain Breach Exposes AI Training Secrets

A supply chain attack via compromised software updates at data-labeling vendor Mercor has forced Meta to pause collaboration, risking exposure of core AI training pipelines and quality metrics used by top labs.

97% relevant