Ultra-low latency
An optimized pipeline delivers full-duplex audio streaming over a single WebSocket or WebRTC connection.
Provider agnostic
Route to the model that fits your latency, cost, or quality requirements.
Semantic VAD
Context-aware turn detection with adjustable eagerness.
Function calling
Models can invoke tools and APIs mid-conversation for dynamic responses.
Dynamic context management
Create, retrieve, delete, or truncate conversation items mid-session to control context length and token cost.
Multimodal
Send and receive text, audio, or both simultaneously. Switch modalities per response.
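
To make the dynamic context management feature above more concrete, here is a minimal client-side sketch of the four item operations (create, retrieve, delete, truncate) and a token-budget trim. It is an in-memory model only, not this service's API: the `Item` and `ConversationContext` classes, their method names, and the word-count token estimate are all assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Item:
    """Hypothetical conversation item; field names are illustrative."""
    item_id: str
    role: str
    text: str

    @property
    def tokens(self) -> int:
        # Crude stand-in for a real tokenizer: one token per whitespace-separated word.
        return len(self.text.split())


@dataclass
class ConversationContext:
    """Hypothetical in-memory mirror of a session's conversation items."""
    items: list[Item] = field(default_factory=list)

    def create(self, item_id: str, role: str, text: str) -> Item:
        item = Item(item_id, role, text)
        self.items.append(item)
        return item

    def retrieve(self, item_id: str) -> Item | None:
        return next((i for i in self.items if i.item_id == item_id), None)

    def delete(self, item_id: str) -> bool:
        before = len(self.items)
        self.items = [i for i in self.items if i.item_id != item_id]
        return len(self.items) < before

    def truncate(self, item_id: str, max_tokens: int) -> None:
        # Keep only the first max_tokens words of the item's text.
        item = self.retrieve(item_id)
        if item is not None:
            item.text = " ".join(item.text.split()[:max_tokens])

    def total_tokens(self) -> int:
        return sum(i.tokens for i in self.items)

    def trim_to_budget(self, budget: int) -> None:
        # Drop the oldest items until the estimated total fits the budget.
        while self.items and self.total_tokens() > budget:
            self.items.pop(0)
```

In a real session each of these operations would correspond to a request sent over the connection; the sketch only shows how truncating or deleting older items keeps the estimated context size under a chosen budget.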