Stop Reviewing AI Code. Start Reviewing CLAUDE.md.

Anthropic's research shows the bottleneck is verification, not generation. Shift your Claude Code workflow from writing prompts to writing precise, testable specifications.

Gala Smith & AI Research Desk · 3h ago · 4 min read · AI-Generated
Source: dev.to

## The Problem You're Already Feeling

You've experienced it: Claude Code generates a 200-line feature in seconds, but reviewing it takes 30 minutes. You can't ask "why did you choose this approach?" You're left reverse-engineering assumptions. According to Anthropic's research, this verification gap is the real bottleneck in AI-assisted development.

The issue isn't Claude's code quality—it's that our current workflow treats AI as a junior developer. We give vague prompts, get code back, and then debug the implementation. But bugs in AI-generated code are usually specification failures, not coding errors. The model implemented exactly what you asked for, but what you asked for was ambiguous.

## The Claude Code Workflow Shift: From Prompts to Specifications

Anthropic's vision—backed by their development of Claude Agent and Claude Cowork—is that engineers should become AI system architects. Your primary output shifts from code to two things:

  1. Machine-readable specifications (not Jira tickets)
  2. Complete verification setups (tests that certify correctness)

For Claude Code users, this means fundamentally changing how you use the tool. Instead of typing "add user authentication," you need to write a specification that leaves no room for misinterpretation.

## How To Apply This Today in Your CLAUDE.md

Your CLAUDE.md file should evolve from a context document to a specification framework. Here's what to add:

## Specification Format

When requesting new features, use this template:

### Component: [Name]
- **Inputs:** [Exact data structure, types, validation rules]
- **Outputs:** [Expected return structure, error formats]
- **Behavior:** [Step-by-step logic, edge cases handled]
- **Constraints:** [Performance requirements, security rules, dependencies]
- **Tests Required:** [List of specific test cases to generate]
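Because the template is structured, it can be consumed by tooling as well as by Claude. As a rough sketch (the `parse_spec` helper and its regexes are illustrative, not part of Claude Code), a spec in this format parses cleanly into a plain dictionary:

```python
import re

def parse_spec(md: str) -> dict:
    """Parse a Component spec written in the template above into a dict."""
    spec = {"component": None, "fields": {}}
    for line in md.splitlines():
        m = re.match(r"### Component:\s*(.+)", line)
        if m:
            spec["component"] = m.group(1).strip()
            continue
        # Matches lines like "- **Inputs:** String email"
        m = re.match(r"-\s*\*\*(\w[\w ]*):\*\*\s*(.+)", line)
        if m:
            spec["fields"][m.group(1).strip()] = m.group(2).strip()
    return spec

EXAMPLE = """### Component: Email Validator
- **Inputs:** String email
- **Outputs:** Object {valid: boolean}
"""

spec = parse_spec(EXAMPLE)
# spec["component"] holds the name; spec["fields"] maps each bold label to its text
```

A parsed spec like this can feed linters or CI checks that refuse to run generation until every required field is filled in.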

## Verification First

Always generate tests BEFORE implementation code. Use this command:


```bash
claude code --spec "component_spec.md" --output tests/
```

Then review and augment the test suite. Only after tests are approved:

```bash
claude code --spec "component_spec.md" --tests "tests/" --output src/
```

## Example: From Vague Prompt to Precise Specification

**Bad (what you're probably doing):**

"Create a function that validates email addresses"


**Good (the new way):**

Component: Email Validator

  • Inputs: String email, optional Boolean strict_mode (default: false)
  • Outputs: Object {valid: boolean, reason: string|null, suggestions: string[]}
  • Behavior:
    1. Check format against RFC 5322 (use regex library)
    2. If strict_mode=true, also verify MX records exist
    3. Common typo detection with suggestions: e.g., gmial.com → gmail.com
    4. Reject disposable email domains from internal list
  • Constraints:
    • Timeout: 100ms for DNS lookups
    • No external API calls unless strict_mode=true
  • Tests Required:
    • Valid standard emails
    • Invalid formats (missing @, multiple @)
    • Disposable domains
    • MX record existence (mock DNS)
    • Timeout handling
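To make the test-first workflow concrete, here is a minimal sketch of a suite derived from this spec. Everything here is illustrative: `validate_email` is a hypothetical implementation covering only the offline rules (format, typo suggestions, disposable list), the RFC 5322 check is reduced to a simplified regex, and the disposable/typo tables are stand-ins for real lists.

```python
import re

# Simplified format check; a real implementation would use an RFC 5322 library
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
DISPOSABLE = {"mailinator.com", "tempmail.com"}                 # stand-in for internal list
TYPO_FIXES = {"gmial.com": "gmail.com", "gmal.com": "gmail.com"}

def validate_email(email: str, strict_mode: bool = False) -> dict:
    """Offline subset of the Email Validator spec (no MX lookup)."""
    if email.count("@") != 1 or not EMAIL_RE.match(email):
        return {"valid": False, "reason": "invalid format", "suggestions": []}
    local, domain = email.split("@")
    domain = domain.lower()
    if domain in DISPOSABLE:
        return {"valid": False, "reason": "disposable domain", "suggestions": []}
    suggestions = []
    if domain in TYPO_FIXES:
        suggestions.append(f"{local}@{TYPO_FIXES[domain]}")
    return {"valid": True, "reason": None, "suggestions": suggestions}

# Tests mirroring the spec's "Tests Required" list (MX/timeout cases would mock DNS)
assert validate_email("user@example.com")["valid"]
assert not validate_email("no-at-sign.com")["valid"]            # missing @
assert not validate_email("a@b@example.com")["valid"]           # multiple @
assert not validate_email("user@mailinator.com")["valid"]       # disposable domain
assert validate_email("user@gmial.com")["suggestions"] == ["user@gmail.com"]
```

Notice that writing these assertions exposes ambiguities in the spec itself (should a typo'd domain still count as valid?), which is exactly the point of reviewing tests before code.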

## Your New Review Process

Stop reviewing 500 lines of generated code line-by-line. Instead:


1. **Review the specification** in your `CLAUDE.md` or separate spec file
2. **Review the generated tests**—do they cover all edge cases?
3. **Run the tests** against the generated code
4. **Only examine the code** if tests fail or performance metrics are off

This follows Anthropic's own development pattern with Claude Agent, where the system is designed to work from structured specifications rather than free-form prompts.

## Handling Legacy Code

The biggest challenge? Existing codebases. Here's your migration strategy:

```bash
# Step 1: Generate specifications FROM existing code
claude code --analyze "src/legacy/" --output specs/legacy/

# Step 2: Review and refine the generated specs

# Step 3: Regenerate components with tests
claude code --spec "specs/legacy/auth.spec.md" --output src/refactored/
```

This creates a virtuous cycle: legacy code → specifications → tested, maintainable replacements.

## The Tooling Gap (And How To Bridge It)

Anthropic acknowledges current models aren't deterministic compilers. The same spec might generate different code. Your defense:

  1. Version your specifications alongside your code
  2. Use Claude Code's --seed flag for reproducible generations
  3. Implement snapshot testing for generated code
```bash
# Reproducible generation
claude code --spec "api.spec.md" --seed 42 --output src/api/

# Snapshot test
claude code --spec "api.spec.md" --test-snapshot "tests/api.snapshot"
```
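If your tooling lacks built-in snapshot support, the idea needs nothing beyond the standard library. A minimal sketch (the file layout, `.py` filter, and JSON snapshot format are assumptions, not an existing tool): record a content hash per generated file, then report which files drifted on the next generation.

```python
import hashlib
import json
from pathlib import Path

def snapshot(src_dir: str) -> dict:
    """Map each generated file to a content hash."""
    return {
        str(p.relative_to(src_dir)): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(Path(src_dir).rglob("*.py"))
    }

def check_snapshot(src_dir: str, snapshot_file: str) -> list:
    """Return the files whose generated content drifted from the recorded snapshot."""
    current = snapshot(src_dir)
    path = Path(snapshot_file)
    if not path.exists():                      # first run: record the baseline
        path.write_text(json.dumps(current, indent=2))
        return []
    recorded = json.loads(path.read_text())
    return [f for f in current if recorded.get(f) != current[f]]
```

Run `check_snapshot` in CI after every regeneration; a non-empty result means the same spec produced different code and the diff deserves a human look.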

## Start Tomorrow Morning

  1. Create a specs/ directory in your project
  2. Write ONE specification for your next feature (not a prompt)
  3. Generate tests first, review them, then generate code
  4. Notice how much less time you spend debugging

This isn't futuristic thinking—it's the optimal way to use Claude Code today. The engineers who master specification-first development will ship faster with higher quality, while everyone else struggles with AI-generated technical debt.

## AI Analysis

Claude Code users need to make three immediate changes:

1. **Stop treating Claude as a coder; treat it as a compiler.** Your prompts should be formal specifications, not natural language requests. Create a `specs/` directory and write machine-readable definitions of behavior before any code generation. Use the `--spec` flag instead of free-form prompts.
2. **Flip your workflow: tests first, code second.** Always generate and review tests before implementation code. Use `claude code --spec "your_spec.md" --output tests/` to create the verification suite. Only after tests are approved should you generate the implementation with the `--tests` flag. This catches specification ambiguities immediately.
3. **Version your specifications like code.** When a bug appears, check whether the spec was ambiguous or incomplete before debugging the implementation. This follows Anthropic's own development pattern with Claude Agent and aligns with research showing most AI code bugs are specification failures.

This approach turns Claude Code from a code generator into a specification executor: exactly the shift Anthropic's research predicts will separate effective from ineffective AI-assisted developers.