best practices
30 articles about best practices in AI news
Claude Code 2.5: New CLI, Dashboard, and Best Practices for Web Devs
Anthropic's latest Claude Code update adds a CLI, usage dashboard, and web-focused best practices. Here's how to use them.
How the New Claude Certified Architect Exam Reveals Best Practices for Claude Code
Anthropic's new certification exam outlines the core principles for effectively using Claude in development, which you can apply directly to your Claude Code workflow.
CLAUDE.md Explained: How Anthropic's Agent Memory Works
CLAUDE.md is Anthropic's project config file for Claude Code, now two years old with settled best practices for agent memory and context.
NVIDIA and Unsloth Release Comprehensive Guide to Building RL Environments from Scratch
NVIDIA and Unsloth have published a detailed practical guide on constructing reinforcement learning environments from the ground up. The guide addresses critical gaps often overlooked in tutorials, covering environment design, when RL outperforms supervised fine-tuning, and best practices for verifiable rewards.
Evaluating AI Agents in Practice: Benchmarks, Frameworks, and Lessons Learned
A new report details the practical challenges and emerging best practices for evaluating AI agents in real-world applications, moving beyond simple benchmarks to assess reliability, safety, and business value.
Google Hits 75% AI-Generated Code, Up From 50% in Fall 2025
Google reports 75% of all new code is now AI-generated and engineer-approved, a sharp increase from 50% last fall. This indicates a massive, accelerating shift in software development practices at the tech giant.
Open-Source 'Claude Code' Dev Setup Replicates Anthropic Engineer's Workflow
A developer has reverse-engineered and published the complete Claude Code development setup used by Anthropic engineer Boris Cherny. The project is available for free on GitHub, offering a window into high-level AI-assisted programming practices.
Stop Using Elaborate Personas: Research Shows They Degrade Claude Code Output
Research finds that common Claude Code prompting practices, such as elaborate personas and multi-agent teams, measurably hurt performance.
Regulators in Italy Probe Sephora, LVMH for Youth Marketing
Italian authorities are investigating LVMH and its beauty retailer Sephora for marketing practices targeting minors. This marks the first such European probe into the luxury conglomerate's youth outreach, signaling heightened regulatory scrutiny.
The Silent Data Harvest: Stanford Exposes How AI Giants Use Your Private Conversations
Stanford researchers reveal that all major AI companies—OpenAI, Google, Meta, Anthropic, Microsoft, and Amazon—train their models on user chat data by default, with minimal transparency, unclear opt-out mechanisms, and concerning practices around data retention and child privacy.
Beyond Architecture: How Training Tricks Make or Break AI Fraud Detection Systems
New research reveals that weight initialization and normalization techniques—often overlooked in AI development—are critical for graph neural networks detecting financial fraud on blockchain networks. The study shows these training practices affect different GNN architectures in dramatically different ways.
The AI Context Paradox: Why More Instructions Make Coding Agents Less Effective
ETH Zurich research reveals AI coding agents perform worse with overly detailed AGENTS.md files. The study shows excessive context creates 'obedient failure' where agents follow unnecessary instructions instead of solving problems efficiently. This challenges current industry practices for configuring AI development assistants.
Pruning LLMs for Edge Triples Bias, Perplexity Hides Damage
Pruning LLMs for edge deployment amplifies bias up to 83.7% while perplexity barely changes, revealing a paradox that undermines standard evaluation practices.
Why Production AI Needs More Than Benchmark Scores
The article argues that high benchmark scores are insufficient for production AI success, highlighting the need for robust MLOps practices, monitoring, and real-world testing—critical for retail applications.
Claude Desktop's Undisclosed Native Messaging Bridge
Claude Desktop installs a preauthorized native messaging bridge for browser extensions without explicit disclosure, impacting developer workflows and security practices.
Claude Code Best Practice Repo Hits 19.7K Stars with 84 Anthropic Tips
A GitHub repository called 'claude-code-best-practice' has amassed 19.7K stars by compiling 84 production tips from Anthropic's Claude Code creators. It provides a full open-source framework for moving from basic usage to advanced agentic workflows.
SpaceXAI Partners with Cursor AI to Build 'World's Best' Coding Assistant
SpaceXAI and Cursor AI announced a partnership to integrate SpaceX's engineering data with Cursor's editor, aiming to create a top-tier AI for coding and knowledge work.
Why the Best Generative AI Projects Start With the Most Powerful Model —
The article suggests that while initial AI projects leverage the broad capabilities of large foundation models, the most successful implementations eventually transition to smaller, more targeted systems. This reflects a maturation from experimentation to production optimization.
Google Cloud's Vertex AI Experiments Solves the 'Lost Model' Problem in ML Development
A Google Cloud team recounts losing their best-performing model after training 47 versions, highlighting a common MLOps failure. They detail how Vertex AI Experiments provides systematic tracking to prevent this.
The Persistence Paradox: Why Safety Training Sticks in AI Agents Even When You Try to Make Them More Helpful
New research reveals that safety training in AI agents persists through subsequent helpfulness optimization, creating a linear trade-off frontier rather than achieving 'best of both worlds' outcomes. This challenges assumptions about how to balance safety and capability in multi-step AI systems.
11-Agent Company Earned $0: CLAUDE.md Mistakes Cost Revenue
An 11-agent company experiment earned $0 after 896 tasks. The operator open-sourced a CLAUDE.md template with 72 lessons on coordination failures and legal constraints.
Claude Code Digest — May 14–May 17
Cut CLAUDE.md token waste by 99.3% with progressive disclosure skills.
Pichai: Frontier Models Can Break 'Pretty Much All Software'
Pichai says frontier models can break pretty much all software, and may already be able to, which poses a systemic risk to enterprise stacks.
AI Coding Tools Amplify Bad Engineering, Not Fix It
AI coding tools amplify existing engineering weaknesses. Teams without discipline produce bad code faster, not good code.
CLAUDE.md for Mobile: How One File Fixes Claude Code's CSS Blindspot
A specialized CLAUDE.md file fixes Claude Code's generic CSS by injecting mobile-specific rules, preventing iOS zoom, untappable buttons, and dark mode failures before shipping.
Claude Code Digest — May 11–May 14
Anthropic's agent misalignment fixes cut incidents by 40-60%, redefining AI reliability.
Anthropic Research Cuts Agent Misalignment With 7 System Prompt Lessons
Anthropic published 7 lessons for fixing misaligned AI agents by restructuring system prompts, aimed at Claude Code developers. The changes cut misalignment incidents by 40-60%.
Claude Code Digest — May 08–May 11
90% first-pass acceptance with Spec Kit and Claude Code transforms dev workflows.
Claude Code's Six-Layer Architecture: Harness, Not Magic
Claude Code's six-layer architecture uses a 3-layer context compressor with a 92% threshold and a Redis-based multi-agent FSM protocol. The model is just one node in a harness.
Skills as Untrusted Code: A Security Precedent for Agent Runtimes
A paper argues that agent skills are untrusted code until verified, and that runtimes must enforce verification gates to prevent supply-chain attacks, echoing decades of software security lessons.