































































Route to different models based on user attributes like language, location, or subscription tier. Each user gets the model that fits them best.
curl 'https://api.inworld.ai/router/v1/routers' \
-H "Authorization: Basic $INWORLD_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"displayName": "User-Aware",
"routes": [
{
"condition": {
"cel_expression": "language == \"es\""
},
"route": {
"variants": [
{ "variant": { "modelId": "openai/gpt-5.2" }, "weight": 100 }
]
}
},
{
"condition": {
"cel_expression": "plan == \"free\""
},
"route": {
"variants": [
{ "variant": { "modelId": "anthropic/claude-haiku-4-5" }, "weight": 100 }
]
}
}
],
"defaultRoute": {
"variants": [
{ "variant": { "modelId": "anthropic/claude-sonnet-4-6" }, "weight": 100 }
]
}
}'
Most routers take 5% on every call. Realtime Router charges zero markup during Research Preview. At $150K/month in LLM spend, that is $90K/year back in your pocket. After preview, bring your own keys for continued zero-markup access.
View pricing
Pass language, country, tier, or emotion as metadata. CEL expressions evaluate conditions in real time and pick the right model per user. Update rules without redeploying.
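The first-match semantics can be sketched locally. This is illustrative only: real CEL evaluation runs server-side inside the router, and the attribute names here simply mirror the router config shown above.

```python
# Local sketch of first-match routing: conditions are checked in order
# against per-request user attributes; the first match wins, otherwise
# the default route applies. (Real CEL evaluation happens server-side.)
ROUTES = [
    (lambda attrs: attrs.get("language") == "es", "openai/gpt-5.2"),
    (lambda attrs: attrs.get("plan") == "free", "anthropic/claude-haiku-4-5"),
]
DEFAULT_MODEL = "anthropic/claude-sonnet-4-6"

def pick_model(attrs: dict) -> str:
    for condition, model_id in ROUTES:
        if condition(attrs):
            return model_id
    return DEFAULT_MODEL
```

Note that route order matters: a Spanish-speaking free-tier user matches the language rule first.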
Swap your base URL to api.inworld.ai, update your API key, done. Full OpenAI and Anthropic SDK compatibility. No request or response changes.

from openai import OpenAI

client = OpenAI(base_url="https://api.inworld.ai/v1")  # was: https://api.openai.com/v1
Split traffic between Claude and GPT with sticky user assignment. Measure retention, satisfaction, or conversion per model. Ramp the winner to 100% via API or dashboard.
A/B testing docs
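Sticky assignment is typically implemented by hashing the user ID into weighted buckets, so the same user always sees the same variant. This sketch shows the general technique, not the router's actual internals; the variant weights are illustrative.

```python
import hashlib

# Sketch of sticky user assignment for an A/B split: hash the user ID
# deterministically so a given user always lands in the same bucket,
# then map buckets to weighted variants. Illustrative only.
VARIANTS = [("anthropic/claude-sonnet-4-6", 50), ("openai/gpt-5.2", 50)]

def assign_variant(user_id: str) -> str:
    total = sum(weight for _, weight in VARIANTS)
    bucket = int(hashlib.sha256(user_id.encode()).hexdigest(), 16) % total
    for model_id, weight in VARIANTS:
        if bucket < weight:
            return model_id
        bucket -= weight
    return VARIANTS[-1][0]
```

Using a hash rather than random assignment is what makes per-user metrics like retention meaningful: each user's experience stays consistent across sessions.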
Add an audio parameter to any chat completions request. Get streamed text and speech in a single response. Route to the best LLM, then pipe straight into Realtime TTS. No second integration. No added latency.

{
  "model": "inworld/my-router",
  "messages": [...],
  "audio": {
    "voice": "Sarah",
    "model": "inworld-tts-1.5-max"
  }
}
If a model goes down or rate-limits you, traffic reroutes instantly to the next best option. Your workflow never stops. Works with Cursor, Claude Code, Codex CLI, Aider, Continue, and Windsurf.
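The failover pattern can be sketched as a simple fallthrough over a preference-ordered model list. This is a client-side illustration of the idea, not the router's implementation (which does this server-side, informed by TTFT).

```python
# Illustrative failover sketch: try models in preference order and fall
# through to the next one on any failure (outage, rate limit, etc.).
def call_with_failover(models: list[str], call_fn):
    last_err = None
    for model_id in models:
        try:
            return call_fn(model_id)
        except Exception as err:  # e.g. rate limit or provider outage
            last_err = err
    raise last_err if last_err else RuntimeError("no models configured")
```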

| Capability | Direct APIs | Inworld Router | OpenRouter |
|---|---|---|---|
| Models available | 1 per provider | Hundreds, across all providers | 200+ across providers |
| Pricing markup | None | None (Research Preview) | ~5% commission |
| Context-aware routing (CEL) | Build it yourself | Built in | Not available |
| A/B testing with sticky users | Build it yourself | Built in | Not available |
| Automatic failover | Build it yourself | TTFT-based | Retry on error |
| Integrated TTS | Separate integration | Single API call | Not available |
| Per-request observability | Build it yourself | Built in + export | Basic logging |
| SDK compatibility | N/A (native) | OpenAI + Anthropic | OpenAI only |
| Caching | Provider-dependent | Implicit + explicit | Provider-dependent |
| Web search / grounding | Provider-dependent | Tool-based + native | Not available |
