Router
The routing layer. Selects the right model for each call based on rules, context budget, and cost constraints. Runs entirely in-process.Why routing matters
Naive routing — “short prompt = cheap model” — silently breaks things:- A 150k token conversation sent to a model with an 8k context window gets truncated without warning
- Switching from Sonnet to Haiku for a complex multi-constraint prompt causes capability regression
- There is no way to know routing failed until a user reports a wrong answer
