A developer known as Inertia-UK built a local HTTP proxy that makes Claude Code aware of its own usage limits. The proxy intercepts Anthropic's rate limit headers, revealing that Sonnet and Opus share a single quota pool despite separate UI bars.
Key facts
- Proxy intercepts anthropic-ratelimit-unified-5h-utilization and 7d headers
- No per-model headers; Sonnet and Opus share one pool
- GitHub issue #57050 confirms Sonnet bucket never shipped
- Proxy writes status to ~/.claude/usage-status.md
- Zero npm dependencies, plain Node.js stdlib
Claude Code has no idea how much quota it's burned. You can see usage bars in the UI, but the model itself is completely blind to them. According to the Reddit post, there's no API, tool, or hook that exposes the current rate limit state during a conversation.
Anthropic returns rate limit headers on every inference response (anthropic-ratelimit-unified-5h-utilization, anthropic-ratelimit-unified-7d-utilization, etc.) — Claude Code receives them internally to render the UI bars, but never passes them anywhere the model can see.
The proxy sits between Claude Code and api.anthropic.com, routing traffic by setting ANTHROPIC_BASE_URL to http://127.0.0.1:4080. It intercepts response headers and writes a one-line status file to ~/.claude/usage-status.md:
5h=9% 7d=99%! overage=0% bottleneck=seven_day (10/05/2026, 16:19:04)
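The forwarding and status-file steps can be sketched with the Node.js standard library alone. This is an illustration under stated assumptions, not the developer's actual code. It assumes the utilization headers carry fractions between 0 and 1, that the "!" marks a window at or above 90%, and that the bottleneck is simply the fuller window. The names createUsageProxy, formatStatus, and five_hour are hypothetical. The overage field is left out because the article does not name its header.

```javascript
// Sketch only: a pass-through proxy that records rate limit headers.
const http = require('node:http');
const https = require('node:https');
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

const UPSTREAM_HOST = 'api.anthropic.com';
const STATUS_FILE = path.join(os.homedir(), '.claude', 'usage-status.md');

// Assumption: header values are fractions in [0, 1]; the real format may differ.
function toPercent(value) {
  const n = Number(value);
  return Number.isFinite(n) ? Math.round(n * 100) : null;
}

// Build the one-line status from lowercased response headers.
// Returns null when either utilization header is missing.
function formatStatus(headers, now = new Date()) {
  const fiveHour = toPercent(headers['anthropic-ratelimit-unified-5h-utilization']);
  const sevenDay = toPercent(headers['anthropic-ratelimit-unified-7d-utilization']);
  if (fiveHour === null || sevenDay === null) return null;
  // Assumption: "!" flags a window at or above 90%.
  const mark = (pct) => `${pct}%${pct >= 90 ? '!' : ''}`;
  // Assumption: the bottleneck is whichever window is fuller.
  const bottleneck = sevenDay >= fiveHour ? 'seven_day' : 'five_hour';
  return `5h=${mark(fiveHour)} 7d=${mark(sevenDay)} bottleneck=${bottleneck} (${now.toLocaleString('en-GB')})`;
}

// Create (but do not start) a server that forwards every request upstream.
function createUsageProxy(statusFile = STATUS_FILE) {
  return http.createServer((clientReq, clientRes) => {
    const upstreamReq = https.request(
      {
        host: UPSTREAM_HOST,
        path: clientReq.url,
        method: clientReq.method,
        headers: { ...clientReq.headers, host: UPSTREAM_HOST },
      },
      (upstreamRes) => {
        const line = formatStatus(upstreamRes.headers);
        // Best-effort write; a failed write must not break the request.
        if (line) fs.writeFile(statusFile, line + '\n', () => {});
        clientRes.writeHead(upstreamRes.statusCode, upstreamRes.headers);
        upstreamRes.pipe(clientRes); // stream the body through unchanged
      }
    );
    upstreamReq.on('error', () => {
      if (!clientRes.headersSent) clientRes.writeHead(502);
      clientRes.end();
    });
    clientReq.pipe(upstreamReq);
  });
}
```

To try it, call createUsageProxy().listen(4080, '127.0.0.1') and start Claude Code with ANTHROPIC_BASE_URL set to http://127.0.0.1:4080. Response bodies are piped through untouched, so streaming replies should pass through unmodified.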
Claude can read that file on demand or via a UserPromptSubmit hook. With a rule in CLAUDE.md, Claude can warn before large tasks near the limit, switch to lightweight mode above 90%, or refuse new work at 98%.
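The threshold behavior can be made concrete with a small parser for the status line. This is a rough sketch, assuming the line format shown earlier. The 90% and 98% cut-offs come from the post, while parseStatus, decideMode, and the mode names are hypothetical. In practice the rule could just as well live as plain instructions in CLAUDE.md.

```javascript
// Sketch only: turn the one-line status into a suggested working mode.

// Extract every "name=NN%" field, e.g. { '5h': 9, '7d': 99, overage: 0 }.
function parseStatus(line) {
  const fields = {};
  for (const [, key, pct] of line.matchAll(/(\w+)=(\d+)%/g)) {
    fields[key] = Number(pct);
  }
  return fields;
}

// Assumption: the fuller of the two windows decides the mode.
function decideMode(fields) {
  const worst = Math.max(fields['5h'] ?? 0, fields['7d'] ?? 0);
  if (worst >= 98) return 'refuse'; // refuse new work at 98%
  if (worst > 90) return 'lightweight'; // lightweight mode above 90%
  return 'normal';
}
```

A hook script could print the result next to the raw status line. How that output reaches the model depends on the Claude Code hook configuration, which the post does not detail.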
The interesting discovery: while testing, the developer dumped every anthropic-ratelimit-* header from both Opus and Sonnet requests. There are no per-model headers — one unified pool covers everything. The separate Sonnet usage bar in the Claude Code UI doesn't reflect a real separate limit. According to GitHub issue #57050, Anthropic intended to give Sonnet its own bucket (announced Nov 2025) but the backend never shipped it. Using Sonnet drains the same unified pool as Opus.
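The header dump is easy to reproduce in spirit. This is a minimal sketch, assuming the response headers are available as a plain object, as Node.js provides them. The function names are hypothetical. It keeps only the anthropic-ratelimit-* entries and lists the header names that appear for one model but not the other.

```javascript
// Sketch only: isolate rate limit headers and compare two responses.

// Keep only headers whose name starts with "anthropic-ratelimit-".
function rateLimitHeaders(headers) {
  return Object.fromEntries(
    Object.entries(headers).filter(([name]) =>
      name.toLowerCase().startsWith('anthropic-ratelimit-')
    )
  );
}

// List rate limit header names present in one response but not the other.
function headerNameDiff(first, second) {
  const a = Object.keys(rateLimitHeaders(first));
  const b = Object.keys(rateLimitHeaders(second));
  return {
    onlyInFirst: a.filter((name) => !b.includes(name)),
    onlyInSecond: b.filter((name) => !a.includes(name)),
  };
}
```

If two responses from different models yield empty onlyInFirst and onlyInSecond lists, neither carries a header the other lacks. That is the pattern the developer reports for Opus and Sonnet.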
This only works with Claude Code (the CLI). The web chat and browser extension make requests through Anthropic's own infrastructure, so there's no local traffic for a proxy to intercept.
Key Takeaways
- A developer's proxy makes Claude Code usage-aware by intercepting hidden rate limit headers.
- Sonnet and Opus share one quota pool despite separate UI bars.
What to watch

Watch for Anthropic's response to GitHub issue #57050 — whether the promised separate Sonnet quota bucket ever ships, or if the unified pool becomes an official feature. Also watch for Anthropic adding a native usage-status tool or API endpoint to Claude Code, which would render this proxy obsolete.