How does the Claude Code agentic loop work step by step?

The Claude Code agentic loop follows 5 steps: (1) Accept user input and add it to the conversation context. (2) Send the full context to the Claude API with streaming enabled. (3) Parse the streaming response for text and tool call blocks. (4) Execute the requested tools: read-only tools run concurrently, while tools that modify state run one at a time. (5) Add the tool results back to the context and repeat from step 2 until no more tool calls are needed.

Does Claude Code execute tools in parallel or sequentially?

Both. When a single API response requests multiple tool calls, read-only tools (Read, Glob, Grep, WebFetch) run in parallel, up to 10 at a time, while tools that modify state (Bash, Edit, Write) run one at a time to avoid conflicts. Running independent reads simultaneously reduces total execution time.

What files implement the Claude Code agent loop?

The main agentic loop is implemented across query.ts (message processing and loop control), claude.ts (API communication and streaming), and toolOrchestration.ts (concurrent tool execution) in Claude Code v2.1.88.

Chapter 01

The Agentic Loop

Everything Claude Code does flows through a single while(true) loop. Understanding this loop is understanding the entire system.

Source: query.ts, claude.ts, toolOrchestration.ts

When you send a message to Claude Code, it enters a loop. The AI thinks, decides if it needs to use any tools (read files, run commands, edit code), executes them, and then thinks again with the results. This repeats until the AI has everything it needs to give you a final answer.

Flow: You send a message → AI thinks and responds → Does the AI need to use tools?

YES: Execute tools (read, edit, bash...), feed the results back, and loop again.

NO: Show the final answer. Done.

Key insight: This loop is the ENTIRE architecture of Claude Code. Every feature — file editing, code search, git operations, multi-agent work — is just a tool that this loop can call. The AI decides which tools to use, and the loop handles the rest.

Source code (query.ts):

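// query.ts: the core loop (simplified pseudocode)
while (true) {
  const response = await callModel(messages, systemPrompt, tools)
  // collect any tool calls the model requested in this turn
  const toolUseBlocks = response.content.filter(block => block.type === 'tool_use')
  if (toolUseBlocks.length > 0) {
    const results = await runTools(toolUseBlocks)
    messages.push(response, ...results)
    continue  // loop back with tool results
  }
  return response  // no tools needed: done
}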
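The loop above can be made concrete with a stubbed model. The following is a minimal sketch, not the actual query.ts: the message and block types, the `agentLoop` name, and the injected `callModel`/`runTool` functions are all hypothetical stand-ins, and tools are run one by one here for simplicity.

```typescript
// Hypothetical, simplified shapes; not Claude Code's real types.
type ToolUse = { type: "tool_use"; id: string; name: string; input: unknown };
type Text = { type: "text"; text: string };
type ToolResult = { type: "tool_result"; toolUseId: string; content: string };
type Block = Text | ToolUse | ToolResult;
type Message = { role: "user" | "assistant"; content: Block[] };

type CallModel = (messages: Message[]) => Promise<Message>;
type RunTool = (call: ToolUse) => Promise<string>;

async function agentLoop(
  messages: Message[],
  callModel: CallModel,
  runTool: RunTool,
): Promise<Message> {
  while (true) {
    const response = await callModel(messages);
    messages.push(response);

    const toolUses = response.content.filter(
      (b): b is ToolUse => b.type === "tool_use",
    );
    if (toolUses.length === 0) return response; // no tools needed: done

    // Run each requested tool, then feed the results back as a user message.
    const results: ToolResult[] = [];
    for (const call of toolUses) {
      results.push({
        type: "tool_result",
        toolUseId: call.id,
        content: await runTool(call),
      });
    }
    messages.push({ role: "user", content: results });
  }
}
```

Injecting `callModel` and `runTool` keeps the loop itself free of any API or tool details, which mirrors the point made above: the loop only decides whether to continue.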


1. Input Processing
2. System Prompt Assembly
3. API Request
4. Streaming Response
5. Tool Execution
6. Loop Back
7. Final Response

Deep Dive: Inside the Pipeline Steps

The pipeline above shows what happens at each step. The sections below go deeper into the technical details: how the API call is constructed, how streaming works, how tools are executed, and how errors are handled.

The API Call

Source: claude.ts

Constructs a streaming request: system prompt (15KB) + message history + 40 tool schemas

Thinking mode: { type: 'adaptive' } for Opus/Sonnet 4.6 — model decides when to think deeply

Temperature is always 1 when thinking is enabled (required by the API)

Cache breakpoints placed on second-to-last user message for prompt caching
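The cache breakpoint placement described above can be sketched as a small pure function. This is an illustration under stated assumptions, not the code in claude.ts: the message shapes are simplified, the `markCacheBreakpoint` name is hypothetical, and the `cache_control` field is modeled on the Messages API's prompt-caching marker.

```typescript
// Simplified, hypothetical message shapes for illustration.
type Block = {
  type: "text";
  text: string;
  cache_control?: { type: "ephemeral" };
};
type Message = { role: "user" | "assistant"; content: Block[] };

// Returns a copy of the history with a cache marker on the last block
// of the second-to-last user message. Input is left unmodified.
function markCacheBreakpoint(messages: Message[]): Message[] {
  const userIdx = messages
    .map((m, i) => (m.role === "user" ? i : -1))
    .filter((i) => i >= 0);
  if (userIdx.length < 2) return messages; // nothing to mark yet

  const target = userIdx[userIdx.length - 2];
  return messages.map((m, i) => {
    if (i !== target || m.content.length === 0) return m;
    const last = m.content.length - 1;
    return {
      ...m,
      content: m.content.map((b, j) =>
        j === last ? { ...b, cache_control: { type: "ephemeral" as const } } : b,
      ),
    };
  });
}
```

Marking an earlier user message rather than the newest one means the cached prefix stays stable while the most recent turn is still changing.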

SSE Streaming

Source: claude.ts

message_start → init message, record TTFT (time to first token)

content_block_delta → accumulate text, thinking, or tool input incrementally

content_block_stop → yield completed block to the UI as an AssistantMessage

message_delta → mutates usage IN-PLACE (not replacement) so transcript refs stay valid
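The event handling above can be sketched as a small accumulator. The event shapes below are simplified assumptions rather than the SDK's types, `consumeStream` and `onBlock` are hypothetical names, and a `content_block_start` event (part of the streaming protocol but not listed above) is included to open each block. TTFT recording is omitted.

```typescript
// Simplified, hypothetical event shapes based on the event names above.
type Usage = { inputTokens: number; outputTokens: number };
type BlockKind = "text" | "tool_use";
type StreamEvent =
  | { type: "message_start"; usage: Usage }
  | { type: "content_block_start"; index: number; kind: BlockKind }
  | { type: "content_block_delta"; index: number; chunk: string }
  | { type: "content_block_stop"; index: number }
  | { type: "message_delta"; usage: Partial<Usage> };
type CompletedBlock = { kind: BlockKind; value: unknown };

function consumeStream(
  events: StreamEvent[],
  usage: Usage,
  onBlock: (block: CompletedBlock) => void,
): void {
  const open: Record<number, { kind: BlockKind; raw: string }> = {};
  for (const ev of events) {
    switch (ev.type) {
      case "message_start":
      case "message_delta":
        // Mutate the shared usage object in place so that anything
        // already holding a reference to it sees the updated counts.
        Object.assign(usage, ev.usage);
        break;
      case "content_block_start":
        open[ev.index] = { kind: ev.kind, raw: "" };
        break;
      case "content_block_delta": {
        const buf = open[ev.index];
        if (buf) buf.raw += ev.chunk; // accumulate incrementally
        break;
      }
      case "content_block_stop": {
        const buf = open[ev.index];
        if (!buf) break;
        delete open[ev.index];
        // Tool input arrives as JSON fragments; parse once complete.
        const value =
          buf.kind === "tool_use" ? JSON.parse(buf.raw || "{}") : buf.raw;
        onBlock({ kind: buf.kind, value });
        break;
      }
    }
  }
}
```

The in-place `Object.assign` is the detail the last bullet describes: replacing the usage object would leave earlier references pointing at stale counts.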

Tool Lifecycle (7 steps)

Source: toolOrchestration.ts

1. validateInput() — pure checks: reject old_string === new_string, block /dev/zero reads

2. backfillObservableInput() — expand ~ to absolute paths BEFORE hooks see it (security)

3. canUseTool() — permission pipeline: deny rules → tool permissions → safety checks → allow rules

4. tool.call() — actual execution, returns ToolResult { data, newMessages? }

5. Serialize — line numbers for text, base64 for images, <persisted-output> for large results

6. Large output check — if > 30K chars: write to ~/.claude/tool-results/, send preview only

7. Yield as UserMessage(tool_result) — goes back into the loop
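Steps 3 and 6 of the lifecycle above lend themselves to a short sketch. Everything here is illustrative: the `Stage` type, the "first stage with a verdict wins, otherwise ask" rule, the preview size, the `persist` callback, and the exact layout inside `<persisted-output>` are assumptions, not details confirmed by the source. Only the stage order and the 30K threshold come from the text.

```typescript
type Verdict = "allow" | "deny" | "ask";
// A stage returns a verdict, or undefined to defer to the next stage.
type Stage = (tool: string, input: unknown) => Verdict | undefined;

// Step 3 sketch: stages are evaluated in order (deny rules, tool
// permissions, safety checks, allow rules). Assumption: the first
// stage that returns a verdict decides, and no verdict means "ask".
function canUseToolSketch(stages: Stage[], tool: string, input: unknown): Verdict {
  for (const stage of stages) {
    const verdict = stage(tool, input);
    if (verdict !== undefined) return verdict;
  }
  return "ask";
}

const MAX_INLINE_CHARS = 30_000; // threshold from the text above
const PREVIEW_CHARS = 2_000; // hypothetical preview size

// Step 6 sketch: large outputs are persisted and only a preview is
// returned. `persist` is a hypothetical callback that saves the full
// output and returns the path it was written to.
function prepareToolOutput(
  output: string,
  persist: (full: string) => string,
): string {
  if (output.length <= MAX_INLINE_CHARS) return output;
  const path = persist(output);
  const preview = output.slice(0, PREVIEW_CHARS);
  return `<persisted-output path="${path}">\n${preview}\n</persisted-output>`;
}
```

Passing `persist` in as a callback keeps the size check testable without touching the file system.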

Recovery & Error Handling

Source: withRetry.ts

max_output_tokens hit → inject 'Resume directly...' message, retry up to 3x

prompt_too_long (413) → trigger reactive compaction, retry with compressed context

429 rate limit → exponential backoff (500ms × 2^attempt, max 32s) + 25% jitter

529 overloaded → 3 consecutive failures triggers fallback to different model

ECONNRESET → disable HTTP keep-alive, retry transparently
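The 429 backoff formula above works out as follows. This is a sketch of the arithmetic only: the `retryDelayMs` name is hypothetical, and it assumes the 32 s cap is applied before jitter and that jitter adds between 0% and 25% of the capped delay, which the text does not specify.

```typescript
// Delay before retry number `attempt` (0-based), in milliseconds.
// `random` is injectable so the jitter can be made deterministic.
function retryDelayMs(
  attempt: number,
  random: () => number = Math.random,
): number {
  const base = Math.min(500 * 2 ** attempt, 32_000); // 500ms × 2^attempt, max 32s
  const jitter = base * 0.25 * random(); // assumed: up to +25%
  return base + jitter;
}
```

With jitter disabled (`random` returning 0), attempts 0 through 6 yield 500, 1000, 2000, 4000, 8000, 16000, and 32000 ms, and later attempts stay at the 32 s cap. Jitter spreads out retries from clients that were rate-limited at the same moment.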

Tool Concurrency

When the AI asks to use 5 tools at once, Claude Code doesn't run them one by one. It splits them into two groups:

Read-only tools

Read, Glob, Grep, WebFetch — these can't break anything, so they run all at the same time. Up to 10 in parallel.

Write tools

Bash, Edit, Write — these modify files, so they run one at a time to avoid conflicts.
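The two groups above suggest a partitioning step followed by batch execution. The sketch below is one way to do it, not the contents of toolOrchestration.ts: `ToolCall`, `partition`, and `runToolsSketch` are hypothetical names, and splitting reads into fixed batches of 10 is a simplification of whatever limit mechanism the real code uses.

```typescript
type ToolCall = { name: string; run: () => Promise<string> };

const READ_ONLY = new Set(["Read", "Glob", "Grep", "WebFetch"]);
const MAX_PARALLEL = 10; // limit from the text above

// Group consecutive read-only calls into batches of up to 10;
// every write call becomes a batch of its own.
function partition(calls: ToolCall[]): ToolCall[][] {
  const batches: ToolCall[][] = [];
  let reads: ToolCall[] = [];
  const flush = () => {
    if (reads.length > 0) batches.push(reads);
    reads = [];
  };
  for (const call of calls) {
    if (READ_ONLY.has(call.name)) {
      if (reads.length === MAX_PARALLEL) flush();
      reads.push(call);
    } else {
      flush(); // writes wait for pending reads
      batches.push([call]);
    }
  }
  flush();
  return batches;
}

// Batches run in order; calls inside a batch run concurrently.
async function runToolsSketch(calls: ToolCall[]): Promise<string[]> {
  const results: string[] = [];
  for (const batch of partition(calls)) {
    const batchResults = await Promise.all(batch.map((c) => c.run()));
    results.push(...batchResults);
  }
  return results;
}
```

For three reads followed by Edit and Bash, `partition` produces batches of sizes 3, 1, and 1, so the reads start together and each write waits for the batch before it.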

The chart below shows 5 tool calls. Notice how the 3 reads all start at the same time (0ms), while Edit waits for reads to finish, and Bash waits for Edit.

Figure: Tool concurrency timeline. Read-only tools run in parallel. Writes wait and execute one by one.

Concurrent: Read(file1.ts), Read(file2.ts), Grep("TODO")

Sequential: Edit(file1.ts), then Bash(npm test)

Source Files

query.ts: The while(true) loop, auto-compaction

claude.ts: API request, SSE streaming, cache breakpoints

toolOrchestration.ts: Concurrency partitioning, runTools()

Tool.ts: Base tool interface, 40+ tool types (30KB)

withRetry.ts: Retry matrix, backoff, model fallback

autoCompact.ts: Context compression, 9-section summary