Cost Optimization

Cut your LLM bill up to 95% without losing quality

Route easy prompts to cheap models and reserve the best models for the requests that matter. One endpoint, conditional rules, and semantic caching. Customers cut production AI spend by 40-95%.
Cost routing
Request
CEL rule: complexity == 'low'

Support ticket · low complexity

Routed
vs premium tier -91% cost

groq/gpt-oss-120b · $0.001 · 180ms

Powered by
Router

Cut spend without cutting quality.

Conditional routing, semantic caching, cost-aware model selection, and pass-through pricing: four levers that shrink the LLM bill by 40-70%.
40-70% typical savings

Cut the bill without cutting the quality.

Customer baselines land at 40-70% spend reduction with no quality loss. The savings come from picking the right model per request, not downgrading every call.
Router · production baseline
40-70%
Spend cut with no quality loss
Published customer baselines on live LLM traffic.
Route by difficulty

Easy prompts hit cheap models. Escalations hit the best ones.

Conditional routing classifies each request per turn. A FAQ lands on Groq for a fraction of a cent; an escalation routes to Claude Sonnet. Same user, right-sized model.
Route easy to cheap, hard to best
Simple FAQ
groq/gpt-oss-120b
$0.001
Support
openai/gpt-5.4
$0.004
Escalation
anthropic/claude-sonnet-4-6
$0.012
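The routing table above can be sketched as a first-match rule list. This is a conceptual illustration of the decision Router makes server-side, not its implementation: the predicates mirror CEL expressions like `complexity == 'low'`, and the per-request costs are the figures from the table.

```python
# Conceptual sketch of conditional routing: first matching rule wins.
# Models and per-request costs are from the table above; Router evaluates
# real CEL expressions server-side rather than Python lambdas.

ROUTES = [
    # (predicate over request metadata, model, est. cost per request)
    (lambda r: r["complexity"] == "low",    "groq/gpt-oss-120b",           0.001),
    (lambda r: r["complexity"] == "medium", "openai/gpt-5.4",              0.004),
    (lambda r: True,                        "anthropic/claude-sonnet-4-6", 0.012),  # fallback: escalation
]

def route(request: dict) -> tuple[str, float]:
    """Return the first matching (model, est_cost) for a request."""
    for predicate, model, cost in ROUTES:
        if predicate(request):
            return model, cost
    raise ValueError("no route matched")
```

The fallback rule at the bottom guarantees every request routes somewhere, so an unclassified escalation still gets the strongest model.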
Semantic cache hits cost zero

Paraphrased prompts hit the same cached answer.

Exact-match caching is free by default. Semantic caching matches by meaning, so "what's your return policy" and "how do returns work" share one response.
Semantic cache · zero tokens on a hit
73%
Hit rate
24h
+412M tokens saved
Paraphrased prompts match by meaning and skip the model entirely.
One knob per request

Say what you care about and the model shows up.

Cheapest answer for batch work. Fastest for voice. Smartest for hard reasoning. You tell Router what matters per request and it picks the model that hits it.
Sort by price, latency, or intelligence
Quality
Cost
gpt-oss-120b
gpt-5.4-mini
gpt-5.4
claude-sonnet-4-6
One field per request. Router picks the model that hits your constraint.
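The single-constraint selection above can be sketched as a lookup over a model catalog. The model names and prices come from this page; the latency and quality figures are made-up placeholders, and the `optimize_for` field name is an assumption, not Router's request schema.

```python
# Sketch of "one knob per request": pick a model by the one constraint the
# caller cares about. Prices are from this page; latency and quality
# numbers are illustrative placeholders.

MODELS = [
    # (name, $ per typical request, latency ms, quality score)
    ("groq/gpt-oss-120b",           0.001, 180, 0.72),
    ("gpt-5.4-mini",                0.002, 300, 0.80),
    ("openai/gpt-5.4",              0.004, 600, 0.90),
    ("anthropic/claude-sonnet-4-6", 0.012, 900, 0.95),
]

def pick(optimize_for: str) -> str:
    """Cheapest for batch work, fastest for voice, smartest for hard reasoning."""
    if optimize_for == "price":
        return min(MODELS, key=lambda m: m[1])[0]
    if optimize_for == "latency":
        return min(MODELS, key=lambda m: m[2])[0]
    if optimize_for == "intelligence":
        return max(MODELS, key=lambda m: m[3])[0]
    raise ValueError(f"unknown constraint: {optimize_for}")
```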
Pay the provider, not the middleman

Same rate as going direct, minus the direct integration.

You pay the provider rate, nothing extra. During Research Preview, routing, caching, and observability come free on top of the bill you'd already be paying.
Your bill · receipt
openai/gpt-5.4
1.2k tokens
$0.004
anthropic/claude-sonnet-4-6
890 tokens
$0.012
groq/gpt-oss-120b
1.1k tokens
$0.001
Router markup
$0.00
You pay the provider rate. Router adds nothing.
Attribute every cent

Know which feature, which tier, which user is driving spend.

Every request logs user, tier, feature tag, tokens, and cost. Finance can attribute blended cost per active user. No blind billing, no surprises.
Spend by tier
last 7 days
Enterprise
41K calls
$1,242
Premium
28K calls
$380
Free
12K calls
$0.08
Every request logs user, tier, feature, tokens, and cost.
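The roll-up finance runs can be sketched directly over the logged fields. The log rows below are invented examples; only the field names (user, tier, feature, tokens, cost) come from the text.

```python
# Sketch of cost attribution: group logged request costs by any attribute.
# Rows are illustrative; every real request carries these fields.
from collections import defaultdict

logs = [
    {"user": "u1", "tier": "enterprise", "feature": "chat",   "tokens": 1200, "cost": 0.004},
    {"user": "u2", "tier": "enterprise", "feature": "search", "tokens": 890,  "cost": 0.012},
    {"user": "u3", "tier": "free",       "feature": "chat",   "tokens": 1100, "cost": 0.001},
]

def spend_by(key: str) -> dict[str, float]:
    """Total cost grouped by a logged attribute, e.g. 'tier' or 'feature'."""
    totals: dict[str, float] = defaultdict(float)
    for row in logs:
        totals[row[key]] += row["cost"]
    return dict(totals)
```

The same grouping over `user` gives blended cost per active user; over `feature`, it shows which product surface drives spend.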

FAQ

How much will I actually save?
Published customer baselines show 40-70% savings on production LLM spend after enabling conditional routing plus semantic caching. Specific savings depend on workload: cache-friendly traffic saves more, heavy-reasoning traffic saves less. Router is free during Research Preview, so you only pay underlying model costs.

Does routing to cheaper models hurt quality?
No. Conditional routing sends easy prompts to cheap models and escalations to the best ones. Easy requests keep quality because they never needed the premium model; hard ones keep quality because they get the model that can handle them. The failure mode is downgrading everything, and we explicitly don't do that.

What is CEL?
CEL (Common Expression Language) is the rule language Router uses for conditional routing. You write expressions like `complexity == 'low'` or `user.tier == 'premium'` and Router evaluates them per request to pick the model. Human-readable, no custom DSL to learn.

How does semantic caching work?
Exact-match caching is free and always on. Semantic caching (opt-in per request) matches by meaning, so two differently worded requests for the same answer hit the same cached response. Production traffic typically caches ~73% of repeated intents.

What does Router log for each request?
Every request logs model, provider, tokens, cost, cache status, and user attribution (via the `user` field on the request). Query the logs API or view live tail in the portal. Export to your FinOps or billing system.

What's the markup?
Zero during Research Preview. Pass-through pricing means you pay the underlying provider rate through Inworld. Most LLM gateways add 5-15%; Router adds 0%.

Can I keep my existing OpenAI SDK code?
Yes. Router is OpenAI SDK compatible: change the base URL, keep your code. See the Router docs for the drop-in pattern.

Where can I see pricing?
Router pricing and model rates live on the pricing page. For workload-specific estimates, contact the team with your current monthly spend and we'll model the expected reduction.

40-70% less LLM spend. Same answers.

Route by difficulty. Cache by meaning. Sort by price. Zero gateway markup during Research Preview.
Copyright © 2021-2026 Inworld AI