best practices
30 articles about best practices in AI news
Claude Code 2.5: New CLI, Dashboard, and Best Practices for Web Devs
Anthropic's latest Claude Code update adds a CLI, usage dashboard, and web-focused best practices. Here's how to use them.
How the New Claude Certified Architect Exam Reveals Best Practices for Claude Code
Anthropic's new certification exam outlines the core principles for effectively using Claude in development, which you can apply directly to your Claude Code workflow.
CLAUDE.md Explained: How Anthropic's Agent Memory Works
CLAUDE.md is Anthropic's project config file for Claude Code, now two years old with settled best practices for agent memory and context.
NVIDIA and Unsloth Release Comprehensive Guide to Building RL Environments from Scratch
NVIDIA and Unsloth have published a detailed practical guide on constructing reinforcement learning environments from the ground up. The guide addresses critical gaps often overlooked in tutorials, covering environment design, when RL outperforms supervised fine-tuning, and best practices for verifiable rewards.
Evaluating AI Agents in Practice: Benchmarks, Frameworks, and Lessons Learned
A new report details the practical challenges and emerging best practices for evaluating AI agents in real-world applications, moving beyond simple benchmarks to assess reliability, safety, and business value.
Google Hits 75% AI-Generated Code, Up From 50% in Fall 2025
Google reports 75% of all new code is now AI-generated and engineer-approved, a sharp increase from 50% last fall. This indicates a massive, accelerating shift in software development practices at the tech giant.
Open-Source 'Claude Code' Dev Setup Replicates Anthropic Engineer's Workflow
A developer has reverse-engineered and published the complete Claude Code development setup used by Anthropic engineer Boris Cherny. The project is available for free on GitHub, offering a window into high-level AI-assisted programming practices.
Stop Using Elaborate Personas: Research Shows They Degrade Claude Code Output
Research shows that common Claude Code prompting practices, such as elaborate personas and multi-agent teams, measurably hurt performance.
Regulators in Italy Probe Sephora, LVMH for Youth Marketing
Italian authorities are investigating LVMH and its beauty retailer Sephora for marketing practices targeting minors. This marks the first such European probe into the luxury conglomerate's youth outreach, signaling heightened regulatory scrutiny.
The Silent Data Harvest: Stanford Exposes How AI Giants Use Your Private Conversations
Stanford researchers reveal that all major AI companies—OpenAI, Google, Meta, Anthropic, Microsoft, and Amazon—train their models on user chat data by default, with minimal transparency, unclear opt-out mechanisms, and concerning practices around data retention and child privacy.
Beyond Architecture: How Training Tricks Make or Break AI Fraud Detection Systems
New research reveals that weight initialization and normalization techniques—often overlooked in AI development—are critical for graph neural networks detecting financial fraud on blockchain networks. The study shows these training practices affect different GNN architectures in dramatically different ways.
The AI Context Paradox: Why More Instructions Make Coding Agents Less Effective
ETH Zurich research reveals AI coding agents perform worse with overly detailed AGENTS.md files. The study shows excessive context creates 'obedient failure' where agents follow unnecessary instructions instead of solving problems efficiently. This challenges current industry practices for configuring AI development assistants.
Pruning LLMs for Edge Triples Bias, Perplexity Hides Damage
Pruning LLMs for edge deployment amplifies bias up to 83.7% while perplexity barely changes, revealing a paradox that undermines standard evaluation practices.
Why Production AI Needs More Than Benchmark Scores
The article argues that high benchmark scores are insufficient for production AI success, highlighting the need for robust MLOps practices, monitoring, and real-world testing—critical for retail applications.
Claude Desktop's Undisclosed Native Messaging Bridge
Claude Desktop installs a preauthorized native messaging bridge for browser extensions without explicit disclosure, impacting developer workflows and security practices.
Claude Code Best Practice Repo Hits 19.7K Stars with 84 Anthropic Tips
A GitHub repository called 'claude-code-best-practice' has amassed 19.7K stars by compiling 84 production tips from Anthropic's Claude Code creators. It provides a full open-source framework for moving from basic usage to advanced agentic workflows.
SpaceXAI Partners with Cursor AI to Build 'World's Best' Coding Assistant
SpaceXAI and Cursor AI announced a partnership to integrate SpaceX's engineering data with Cursor's editor, aiming to create a top-tier AI for coding and knowledge work.
Why the Best Generative AI Projects Start With the Most Powerful Model —
The article suggests that while initial AI projects leverage the broad capabilities of large foundation models, the most successful implementations eventually transition to smaller, more targeted systems. This reflects a maturation from experimentation to production optimization.
Google Cloud's Vertex AI Experiments Solves the 'Lost Model' Problem in ML Development
A Google Cloud team recounts losing their best-performing model after training 47 versions, highlighting a common MLOps failure. They detail how Vertex AI Experiments provides systematic tracking to prevent this.
The Persistence Paradox: Why Safety Training Sticks in AI Agents Even When You Try to Make Them More Helpful
New research reveals that safety training in AI agents persists through subsequent helpfulness optimization, creating a linear trade-off frontier rather than achieving 'best of both worlds' outcomes. This challenges assumptions about how to balance safety and capability in multi-step AI systems.
Claude Code Digest — May 31–Jun 03
Claude Code is quietly becoming an operating system: teams are replacing brittle UI layers with deterministic tools, while per-project rules and skills finally make the agent behave like it belongs in the repo.
skillkit: The Per-Project Claude Code Skill Manager That Finally Tames
skillkit gives Claude Code users per-project skill management via a `skills.toml` manifest and `skillkit sync` command, ending the global skill directory chaos.
Claude Code Digest — May 23–May 26
Spec-Driven Development slashes agent confusion and costs by decomposing tasks into manageable specs.
Claude Code Digest — May 18–May 21
Anthropic's $300M Stainless acquisition signals a shift towards integration-layer dominance.
11-Agent Company Earned $0: CLAUDE.md Mistakes Cost Revenue
An 11-agent company experiment earned $0 after 896 tasks. The operator open-sourced a CLAUDE.md template with 72 lessons on coordination failures and legal constraints.
Claude Code Digest — May 14–May 17
Cut CLAUDE.md token waste by 99.3% with progressive disclosure skills.
Pichai: Frontier Models Can Break 'Pretty Much All Software'
Pichai says frontier models can break pretty much all software and may already be able to do so, a systemic risk to enterprise stacks.
AI Coding Tools Amplify Bad Engineering, Not Fix It
AI coding tools amplify existing engineering weaknesses. Teams without discipline produce bad code faster, not good code.
CLAUDE.md for Mobile: How One File Fixes Claude Code's CSS Blindspot
A specialized CLAUDE.md file fixes Claude Code's generic CSS by injecting mobile-specific rules, preventing iOS zoom, untappable buttons, and dark mode failures before shipping.
Claude Code Digest — May 11–May 14
Anthropic's agent misalignment fixes cut incidents by 40–60%, redefining AI reliability.