OpenRouter has launched the Fusion API, a compound model that routes queries across providers and, the company claims, matches Fable-level intelligence at half the cost. According to @intheworldofai, the routing logic selects the cheapest provider for each query tier while balancing latency and accuracy.
Key facts
- Fusion API launched by OpenRouter
- Claims to match Fable-level intelligence at half the cost
- Routes queries across multiple providers dynamically
- No benchmark scores or provider list disclosed
- Compound model trend growing across AI infrastructure
The API does not train a new model; it orchestrates existing ones, a growing pattern in the cost-optimization layer of the AI stack. Anthropic's Fable, priced at $15 per million input tokens, is the benchmark against which Fusion claims parity on internal evaluations. OpenRouter did not disclose which base models or providers are in the routing pool.
How the routing works
Fusion intercepts each request and classifies it by complexity. Simple summarization might route to a cheaper model such as Llama 3.1 8B, while multi-step reasoning routes to a frontier model such as Claude 4; these pairings are illustrative, since the routing pool is undisclosed. The system caches frequent query patterns and pre-allocates provider capacity to minimize cold-start latency. Early testers report response times within 10–15% of a single-model call, which would make the trade-off viable for production workloads.
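The classify-then-route pattern described above can be sketched in a few lines of Python. This is a minimal illustration, not OpenRouter's implementation: the tier names, the model identifiers in `ROUTES`, the keyword heuristic in `classify`, and the `Router` class are all hypothetical, since the actual classifier and provider pool are undisclosed.

```python
# Hypothetical tier-to-model table; the real routing pool is not public.
ROUTES = {
    "simple": "llama-3.1-8b",
    "complex": "frontier-model",
}

# Crude keyword heuristic standing in for a real complexity classifier.
REASONING_HINTS = ("step by step", "prove that", "debug", "derive")


def classify(prompt: str) -> str:
    """Return 'complex' for long or reasoning-flavoured prompts, else 'simple'."""
    text = prompt.lower()
    if len(text.split()) > 200 or any(hint in text for hint in REASONING_HINTS):
        return "complex"
    return "simple"


class Router:
    """Picks a model per prompt and caches the decision for repeated prompts."""

    def __init__(self) -> None:
        self._cache: dict[str, str] = {}  # normalized prompt -> chosen model

    def route(self, prompt: str) -> str:
        # Normalize case and whitespace so near-identical prompts share an entry.
        key = " ".join(prompt.lower().split())
        if key not in self._cache:
            self._cache[key] = ROUTES[classify(prompt)]
        return self._cache[key]


if __name__ == "__main__":
    router = Router()
    print(router.route("Summarize this paragraph in one sentence."))
    print(router.route("Prove that the sum of two even numbers is even."))
```

A production router would presumably replace the keyword check with a learned classifier and cache on something more robust than the normalized prompt string, but the control flow would look similar.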
The compound model trend
Fusion follows a broader shift: Together AI's router, Fireworks' speculative decoding, and Anyscale's multi-model batching all aim to decouple capability from cost. Fusion's bet is that routing intelligence, not training compute, is the next marginal differentiator. OpenAI's GPT-4o and Anthropic's Claude 4 are still the frontier, but for cost-sensitive enterprise workloads, a compound approach can undercut them by 40–60% on price while retaining 90%+ of benchmark performance.
What's missing
OpenRouter did not publish benchmark scores, disclose the routing algorithm's inference overhead, or specify which providers participate. The "half the cost" claim likely refers to per-token pricing versus Fable, but total cost of ownership also includes latency penalties and potential quality degradation on edge cases. Without third-party evals, the claim remains a vendor assertion.
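To see what "half the cost" would require on per-token pricing alone, a rough blended-price calculation helps. The sketch below assumes two tiers, with the frontier tier priced at Fable's $15 per million input tokens and a hypothetical cheap tier at $0.10 per million; the cheap-tier price and both function names are assumptions for illustration, and routing overhead and latency costs are ignored.

```python
def blended_price(cheap_share: float, cheap_price: float, frontier_price: float) -> float:
    """Average price per million tokens when cheap_share of traffic goes to the cheap tier."""
    if not 0.0 <= cheap_share <= 1.0:
        raise ValueError("cheap_share must be between 0 and 1")
    return cheap_share * cheap_price + (1.0 - cheap_share) * frontier_price


def share_needed(target_price: float, cheap_price: float, frontier_price: float) -> float:
    """Fraction of traffic that must go to the cheap tier to hit target_price."""
    return (frontier_price - target_price) / (frontier_price - cheap_price)


if __name__ == "__main__":
    FRONTIER = 15.00  # Fable's input price per million tokens, from the article
    CHEAP = 0.10      # hypothetical cheap-tier price, an assumption

    share = share_needed(FRONTIER / 2, CHEAP, FRONTIER)
    print(f"cheap-tier share needed for half cost: {share:.1%}")
    print(f"blended price at that share: ${blended_price(share, CHEAP, FRONTIER):.2f}")
```

Under these assumptions, roughly half of all traffic would have to be served by the cheap tier to halve the blended price, which is why the undisclosed routing mix and its effect on quality matter so much to the claim.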
What to watch
Watch for independent evals on SWE-Bench and MATH-500. If Fusion scores within 5% of Fable at half the price, expect a wave of routing-layer products from competitors like Together AI and Fireworks, and potential pricing pressure on Anthropic's per-token rates.