gentic.news — AI News Intelligence Platform

Muxer: Open-Source Model Multiplexer Slashes Claude Code Costs by Routing

Muxer reduces Claude Code costs by multiplexing models per subtask via agent frontmatter and session hooks. Keep Fable/Opus for planning; route boilerplate to Haiku.

Source: github.com, via hn_claude_code (corroborated)
How do I use Muxer to reduce Claude Code costs by routing subtasks to cheaper models?

Muxer is an open-source Claude Code plugin that multiplexes models per subtask: keep Fable/Opus for planning and review, and route grep and boilerplate work to Haiku or Gemini via agent frontmatter and session hooks. The goal is to save credits without giving up quality on the work that needs a stronger model.

TL;DR

Route high-level work to expensive planning models and boilerplate to cheap ones. Muxer enforces the split through agent frontmatter and session hooks.

What Changed — The Update

Muxer is an open-source Claude Code plugin that lets an expensive model orchestrate a session while cheaper models do the actual work. It hit GitHub recently and solves a pain point anyone on a Max plan knows: the orchestrator model (Fable, Opus) bills for every subtask it spawns, even trivial ones like grepping files or writing boilerplate.

Muxer works through three mechanisms:

  1. Agent frontmatter — Each agent in agents/*.md has a model: line. The scout always runs on Haiku; the builder always runs on Opus. This is a hard guarantee, not a suggestion.

  2. SessionStart hook — The script scripts/session-policy.sh injects a routing policy: on Fable sessions it says "keep the main loop lean"; on cheaper models it adds an escalation path up to Fable via muxer:oracle.

  3. PreToolUse guard: scripts/guard-model.sh catches subagents (Explore, Plan) that inherit the session model and pins them to Opus unless overridden. This prevents premium billing on file exploration.
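
The guard's decision can be sketched in a few lines of shell. This is a minimal illustration of the rule described above (an explicit override wins, otherwise pin to Opus), not the project's actual guard-model.sh. The function name resolve_model and its arguments are hypothetical, and how a real PreToolUse hook receives the subagent's requested model is left out on purpose.

```shell
#!/usr/bin/env bash
# Sketch of the guard's decision only; NOT the project's guard-model.sh.
# resolve_model REQUESTED [DEFAULT]   (hypothetical helper)
#   REQUESTED: model named explicitly for the subagent ("" if none was given)
#   DEFAULT:   fallback model, "opus" as described in the article
resolve_model() {
  local requested="${1:-}"
  local default="${2:-opus}"
  if [ -n "$requested" ]; then
    # An explicit override wins.
    printf '%s\n' "$requested"
  else
    # No override: pin to the default instead of inheriting the session model.
    printf '%s\n' "$default"
  fi
}
```

Calling resolve_model "" prints opus, while resolve_model haiku prints haiku, so an agent that already names its model is left alone.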

What It Means For You — Concrete Impact

If you're on a Max plan, your biggest cost driver isn't the number of prompts—it's the model running each subtask. Claude Code's built-in subagents inherit the session model by default. On a Fable session, every grep, find, and boilerplate write bills at premium rates.

Muxer's approach mirrors the strategy we covered in "Claude Fable 5 in Claude Code: The Routing Strategy That Saves Your Weekly Limit" (2026-07-02). The difference: Muxer gives you hard guarantees via agent frontmatter rather than relying on prompt engineering.

The project also has rules for quality control:

  • Taste-critical work (UI, CSS, game feel) never goes below Opus regardless of cost hints.
  • The verifier is never a cheaper model than the builder it's judging.
  • Work that fails review twice at one tier gets redone a tier up with a fresh brief.
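
The three rules above can be written as small shell checks. This is a sketch under stated assumptions: the tier order haiku < opus < fable is inferred from the article's description of escalation "up to Fable", and the helper names (tier_rank, verifier_ok, next_tier, taste_floor) are hypothetical rather than taken from the Muxer repository.

```shell
#!/usr/bin/env bash
# Assumed tier order, cheapest first: haiku < opus < fable.
tier_rank() {
  case "$1" in
    haiku) echo 1 ;;
    opus)  echo 2 ;;
    fable) echo 3 ;;
    *)     echo 0 ;;   # unknown tier ranks below every known tier
  esac
}

# verifier_ok BUILDER VERIFIER
# Succeeds when the verifier is not a cheaper tier than the builder.
verifier_ok() {
  [ "$(tier_rank "$2")" -ge "$(tier_rank "$1")" ]
}

# next_tier TIER FAILED_REVIEWS
# Prints the tier for the next attempt: one tier up after two failed reviews.
next_tier() {
  local tier="$1" failures="$2"
  if [ "$failures" -ge 2 ]; then
    case "$tier" in
      haiku) echo opus ;;
      opus)  echo fable ;;
      *)     echo "$tier" ;;   # already at the top tier, or unknown
    esac
  else
    echo "$tier"
  fi
}

# taste_floor TIER
# Taste-critical work never runs below opus.
taste_floor() {
  if [ "$(tier_rank "$1")" -lt "$(tier_rank opus)" ]; then
    echo opus
  else
    echo "$1"
  fi
}
```

For example, next_tier haiku 2 prints opus, and verifier_ok opus haiku fails because a Haiku verifier would be cheaper than the Opus builder it is judging.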

Try It Now — Commands, Config, and Prompts

1. Clone and Set Up

git clone https://github.com/DangerousYams/muxer.git
cd muxer
# Copy agents and hook scripts into your user-level Claude Code directory
mkdir -p ~/.claude/agents ~/.claude/hooks
cp -r agents/* ~/.claude/agents/
cp scripts/* ~/.claude/hooks/
chmod +x ~/.claude/hooks/*.sh

2. Configure Agent Frontmatter

Create agents/scout.md:

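---
# name and description are illustrative values; Claude Code expects both fields
name: scout
description: Explores the codebase and gathers information quickly.
model: haiku
---
You are a scout. Your job is to explore the codebase and gather information. Be fast and concise.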

Create agents/builder.md:

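---
# name and description are illustrative values; Claude Code expects both fields
name: builder
description: Implements features from detailed briefs.
model: opus
---
You are a builder. You implement features from detailed briefs. Never accept under-scoped briefs.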
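
Because the routing guarantee depends on every agent file carrying a model: line, it can help to check for it. The following is a minimal sketch, assuming the frontmatter is a leading block delimited by --- lines as in the examples above. The helper name agent_model is hypothetical and not part of Muxer.

```shell
#!/usr/bin/env bash
# agent_model FILE   (hypothetical helper)
# Prints the value of the first "model:" key inside the leading "---"
# frontmatter block; prints nothing if the file has no frontmatter or no key.
agent_model() {
  awk '
    NR == 1 && $0 != "---" { exit }   # no frontmatter at all
    NR > 1 && $0 == "---"  { exit }   # closing delimiter: stop looking
    /^model:[ \t]*/ {
      sub(/^model:[ \t]*/, "")        # drop the key, keep the value
      print
      exit
    }
  ' "$1"
}
```

A loop such as for f in agents/*.md; do [ -n "$(agent_model "$f")" ] || echo "no model pinned: $f"; done would then list any agent that still inherits the session model.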

3. Install Session Hook

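In ~/.claude/hooks/session-policy.sh (the script copied from scripts/ in step 1):

#!/bin/bash
# SessionStart hook: if the session model is Fable, print a delegation policy.
# CLAUDE_SESSION_MODEL is the variable name this walkthrough uses; confirm how
# your Claude Code version exposes the session model before relying on it.
if [ "${CLAUDE_SESSION_MODEL:-}" = "fable" ]; then
  echo "Policy: Delegate all file operations to haiku. Escalate complex decisions to opus."
fi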
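
The snippet above covers only the Fable case. The article also describes a second branch, where sessions on cheaper models get an escalation path up to Fable via muxer:oracle. A sketch of both branches might look like the following. The function name session_policy and the exact policy wording are assumptions for illustration, not text from the Muxer repository.

```shell
#!/usr/bin/env bash
# session_policy MODEL   (hypothetical helper)
# Prints the routing policy text for a session running on MODEL.
# The policy wording is illustrative, not Muxer's actual text.
session_policy() {
  case "$1" in
    fable)
      # Premium session: keep the orchestrator's own work small.
      echo "Policy: keep the main loop lean; delegate file operations to cheaper agents."
      ;;
    *)
      # Cheaper session: offer a way up when a decision is too hard.
      echo "Policy: escalate decisions beyond this model to Fable via muxer:oracle."
      ;;
  esac
}
```

Keeping the text in one function makes it easy to see that each session gets exactly one of the two policies.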

4. Run and Watch Savings

Start a Claude Code session on Fable. Muxer prints $ saved after each task. You'll see the scout run on Haiku, the builder on Opus, and Fable only for planning and review.

Why It Works

Claude Code picks a model for a subtask at the moment the orchestrator spawns it. Muxer leans on that decision from three directions: agent frontmatter fixes the model per agent, the session hook sets policy, and the guard catches subagents spawned without an explicit model override. With all three layers in place, you get hard routing guarantees instead of relying on prompt engineering.

When To Use It

  • Max plan users — Your biggest cost is Fable/Opus subtasks. Muxer cuts that.
  • Multi-model workflows — You want to route specific work to Gemini or OpenAI Codex via their CLIs.
  • Quality-sensitive projects — The review escalation rules ensure cheap models don't produce garbage.

Caveats

Muxer is new (2 points, 0 comments on HN as of writing). The guard script approach is experimental. Test on a small project before rolling out to production. Also, as we noted in "How Navan's MCP Server Cuts Travel Booking from 8 Steps to 1 Command" (2026-07-02), any tool that modifies session behavior can introduce unexpected interactions—monitor your first few sessions closely.



AI-assisted reporting. Generated by gentic.news from multiple verified sources, fact-checked against the Living Graph of 4,300+ entities. Edited by Ala SMITH.


AI Analysis

Claude Code users should immediately adopt Muxer's agent frontmatter pattern to enforce model routing per subtask. The key insight is that Claude Code's subagents inherit the session model by default, which is expensive. By adding a `model:` line to every agent file, you get hard guarantees without relying on prompt engineering. Start by defining a scout agent (Haiku), a builder agent (Opus), and a reviewer agent (Opus).

Second, implement the PreToolUse guard to catch subagents that spawn without explicit model overrides. This prevents the Explore and Plan subagents from billing at premium rates during routine file operations. The guard pattern is simple: check if `$MODEL` is set; if not, pin to `opus`.

Third, adopt the review escalation rules: never let a cheaper model judge a more expensive model's work, and escalate after two failures. This prevents the failure mode where a cheap model builds something visually off and an equally cheap reviewer can't see what's wrong. The project prints `$ saved` after each task, so you can quantify the impact immediately.