gentic.news — AI News Intelligence Platform

Claude Code quota proxy exposes unified Opus/Sonnet pool

A developer's proxy makes Claude Code usage-aware by intercepting hidden rate limit headers. Sonnet and Opus share one quota pool despite separate UI bars.

May 10, 2026 · 3 min read · AI-Generated
Source: reddit.com, via reddit_claude, hn_claude_code, medium_claude, devto_claudecode (widely reported)

TL;DR

Proxy intercepts rate limit headers Claude Code hides · Sonnet and Opus drain the same quota bucket · Open-source tool adds usage awareness to Claude Code

A developer known as Inertia-UK built a local HTTP proxy that makes Claude Code aware of its own usage limits. The proxy intercepts Anthropic's rate limit headers, revealing that Sonnet and Opus share a single quota pool despite separate UI bars.

Key facts

  • Proxy intercepts anthropic-ratelimit-unified-5h-utilization and 7d headers
  • No per-model headers; Sonnet and Opus share one pool
  • GitHub issue #57050 confirms Sonnet bucket never shipped
  • Proxy writes status to ~/.claude/usage-status.md
  • Zero npm dependencies, plain Node.js stdlib

Claude Code has no idea how much quota it's burned. You can see usage bars in the UI, but the model itself is completely blind to them. There's no API, no tool, no hook that exposes the current rate limit state during a conversation [According to the Reddit post].

Anthropic returns rate limit headers on every inference response (anthropic-ratelimit-unified-5h-utilization, anthropic-ratelimit-unified-7d-utilization, etc.) — Claude Code receives them internally to render the UI bars, but never passes them anywhere the model can see.
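As a rough illustration of the interception step, the sketch below filters a response's headers down to the anthropic-ratelimit-* family. It is written in Python for brevity and is not the tool's own Node.js code; the function name and the sample values are assumptions, and the source does not say how the header values are encoded.

```python
RATELIMIT_PREFIX = "anthropic-ratelimit-"


def extract_ratelimit_headers(headers):
    """Return only the anthropic-ratelimit-* headers, with lowercased names.

    HTTP header names are case-insensitive, so names are normalised first.
    """
    return {
        name.lower(): value
        for name, value in headers.items()
        if name.lower().startswith(RATELIMIT_PREFIX)
    }


# Hypothetical response headers; the values are made up for illustration.
sample = {
    "Content-Type": "application/json",
    "Anthropic-Ratelimit-Unified-5h-Utilization": "0.09",
    "anthropic-ratelimit-unified-7d-utilization": "0.99",
}

limits = extract_ratelimit_headers(sample)
print(limits)
```

A proxy would apply a filter like this to every upstream response before passing the response back to Claude Code unchanged.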

The proxy sits between Claude Code and api.anthropic.com. Traffic is routed through it by setting ANTHROPIC_BASE_URL to http://127.0.0.1:4080. The proxy reads the response headers and writes a one-line status file to ~/.claude/usage-status.md:

5h=9% 7d=99%! overage=0% bottleneck=seven_day (10/05/2026, 16:19:04)
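A status line of this shape could be produced along the following lines. This is a Python sketch under several assumptions that the source does not confirm: utilization arrives as a fraction between 0 and 1, the "!" marks a value at or above a hypothetical 95% alert threshold, the bottleneck is simply the larger of the two windows, and "five_hour" is an invented label for the shorter window.

```python
from datetime import datetime


def format_status(five_hour, seven_day, overage, now, alert_at=0.95):
    """Build a one-line status from utilization fractions in the range 0..1."""

    def pct(value):
        # Append "!" when the value reaches the (assumed) alert threshold.
        mark = "!" if value >= alert_at else ""
        return f"{round(value * 100)}%{mark}"

    # Assumption: the bottleneck is whichever window is more heavily used.
    bottleneck = "seven_day" if seven_day >= five_hour else "five_hour"
    stamp = now.strftime("%d/%m/%Y, %H:%M:%S")
    return (
        f"5h={pct(five_hour)} 7d={pct(seven_day)} "
        f"overage={pct(overage)} bottleneck={bottleneck} ({stamp})"
    )


line = format_status(0.09, 0.99, 0.0, datetime(2026, 5, 10, 16, 19, 4))
print(line)
```

With these inputs the function reproduces the sample line above. Writing the result to ~/.claude/usage-status.md is then a single file write after each response.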

Claude can read that file on demand or via a UserPromptSubmit hook. With a rule in CLAUDE.md, Claude can warn before large tasks near the limit, switch to lightweight mode above 90%, or refuse new work at 98%.
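The thresholds in that rule can be expressed as a small decision function. The Python sketch below parses a status line in the format shown earlier and maps it to one of three modes. The mode names and function names are hypothetical, and in practice the rule would live as plain-language instructions in CLAUDE.md rather than as code.

```python
import re


def parse_status(line):
    """Extract integer percentages from a status line such as '5h=9% 7d=99%! ...'."""
    return {key: int(value) for key, value in re.findall(r"(\w+)=(\d+)%", line)}


def choose_mode(line, light_at=90, refuse_at=98):
    """Map the worst of the 5h and 7d utilizations to a working mode."""
    status = parse_status(line)
    worst = max(status.get("5h", 0), status.get("7d", 0))
    if worst >= refuse_at:
        return "refuse"       # at 98% or more: decline new work
    if worst > light_at:
        return "lightweight"  # above 90%: prefer small, cheap tasks
    return "normal"


sample = "5h=9% 7d=99%! overage=0% bottleneck=seven_day (10/05/2026, 16:19:04)"
print(parse_status(sample))
print(choose_mode(sample))
```

For the sample line the 7d window is at 99%, so the function returns "refuse".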

The interesting discovery: while testing, the developer dumped every anthropic-ratelimit-* header from both Opus and Sonnet requests. There are no per-model headers — one unified pool covers everything. The separate Sonnet usage bar in the Claude Code UI doesn't reflect a real separate limit. According to GitHub issue #57050, Anthropic intended to give Sonnet its own bucket (announced Nov 2025) but the backend never shipped it. Using Sonnet drains the same unified pool as Opus.
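The comparison behind that finding can be sketched as a check over header names. The Python below is illustrative only: the two sets contain just the header names quoted in this article, not a captured dump, and the model tokens used for matching are an assumption about how a per-model header might be named.

```python
def per_model_headers(names, model_tokens=("opus", "sonnet")):
    """Return the header names that mention a model, sorted for stable output."""
    return sorted(
        name for name in names
        if any(token in name.lower() for token in model_tokens)
    )


# Illustrative header names for one Opus and one Sonnet response.
opus_names = {
    "anthropic-ratelimit-unified-5h-utilization",
    "anthropic-ratelimit-unified-7d-utilization",
}
sonnet_names = {
    "anthropic-ratelimit-unified-5h-utilization",
    "anthropic-ratelimit-unified-7d-utilization",
}

same_names = opus_names == sonnet_names
model_specific = per_model_headers(opus_names | sonnet_names)
print(same_names, model_specific)
```

Identical name sets and an empty per-model list are consistent with a single shared pool, although on their own they only show what the headers report, not how the backend meters usage.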

This only works with Claude Code (the CLI). The web chat and browser extension send their requests through Anthropic's own infrastructure, so there is no local traffic for a proxy to intercept.

Key Takeaways

  • A developer's proxy makes Claude Code usage-aware by intercepting hidden rate limit headers.
  • Sonnet and Opus share one quota pool despite separate UI bars.

What to watch

Watch for Anthropic's response to GitHub issue #57050 — whether the promised separate Sonnet quota bucket ever ships, or if the unified pool becomes an official feature. Also watch for Anthropic adding a native usage-status tool or API endpoint to Claude Code, which would render this proxy obsolete.

[Updated 11 May via reddit_claude]

Separately, the open-source package ccusage (installable via npm install -g ccusage) offers a more detailed breakdown of token and cost usage than Claude Code's native /cost command, displaying daily stats in a formatted terminal view [per Reddit user National_Honey7103]. This provides an alternative for users on the Pro tier seeking finer-grained tracking of their unified Sonnet/Opus quota consumption.


Sources cited in this article

  1. GitHub
  2. Proxy

AI-assisted reporting. Generated by gentic.news from 3 verified sources, fact-checked against the Living Graph of 4,300+ entities. Edited by Ala SMITH.

AI Analysis

This is a classic example of the AI industry's UX gap: models are powerful but blind to their own operational constraints. Anthropic ships rate limit data in every response, but Claude Code does not pass it to the model, possibly to prevent adversarial manipulation or quota gaming. The proxy sidesteps that design choice entirely.

The unified pool discovery is the real story. According to GitHub issue #57050, Anthropic intended to give Sonnet its own quota bucket (announced in November 2025), but the backend never shipped it. Users who switch to Sonnet to preserve Opus quota are therefore burning the same resource. It is a silent UX failure that undermines trust in the pricing model.

This pattern mirrors the early days of cloud cost monitoring, when developers built their own tools because vendors did not expose the data natively. Expect Anthropic to either ship a native usage-awareness tool or absorb this pattern into the product.