How VoiceHub Works
Core Models
VoiceHub operates as an orchestration layer over three core components in the voice AI pipeline:
Transcriber (STT) — Converts audio input into text
Language Model (LLM) — Interprets the text and generates intelligent responses
Voice (TTS) — Converts the LLM’s response back into speech
These modules can be flexibly configured using top-tier providers:
STT: Deepgram, Gladia, Azure, etc.
LLM: OpenAI, Groq, Claude, Cohere, etc.
TTS: ElevenLabs, PlayHT, LMNT, etc.
VoiceHub handles orchestration, streaming, and optimization across the three components, keeping interactions real-time and latency low while letting you switch between providers seamlessly.
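To make the idea of per-stage provider selection concrete, here is a minimal sketch of how such a configuration could be modeled. It is not VoiceHub's actual API: `PipelineConfig`, `switch_provider`, and the lowercase provider identifiers are hypothetical names chosen for illustration, based only on the provider lists above.

```python
from dataclasses import dataclass, replace

# Hypothetical provider catalogue built from the lists above.
# These identifiers are assumptions, not VoiceHub's real configuration keys.
PROVIDERS = {
    "stt": {"deepgram", "gladia", "azure"},
    "llm": {"openai", "groq", "claude", "cohere"},
    "tts": {"elevenlabs", "playht", "lmnt"},
}


@dataclass(frozen=True)
class PipelineConfig:
    """One provider choice per pipeline stage (hypothetical structure)."""
    stt: str
    llm: str
    tts: str

    def __post_init__(self):
        # Reject provider names that are not in the catalogue for their stage.
        for stage in ("stt", "llm", "tts"):
            name = getattr(self, stage)
            if name not in PROVIDERS[stage]:
                raise ValueError(f"unknown {stage} provider: {name}")


def switch_provider(config: PipelineConfig, stage: str, provider: str) -> PipelineConfig:
    """Return a copy of the config with one stage swapped; other stages are untouched."""
    if stage not in PROVIDERS:
        raise ValueError(f"unknown stage: {stage}")
    return replace(config, **{stage: provider})


base = PipelineConfig(stt="deepgram", llm="openai", tts="elevenlabs")
swapped = switch_provider(base, "tts", "playht")
```

In this sketch, swapping the TTS provider leaves the STT and LLM choices unchanged, which mirrors the idea that each stage can be configured independently.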
The Voice-to-Voice Pipeline (Real-Time Streaming)
Step 1: Listen (Intake Raw Audio)
The user speaks into their device (laptop, phone, etc.). The audio is captured and streamed in real time, then transcribed into text by the selected STT engine.
Step 2: Understand (Run an LLM)
The transcribed text is sent to the selected LLM, which uses the agent’s prompt and context to generate a response.
Step 3: Speak (Text → Raw Audio)
The response text is passed to a TTS engine, which synthesizes speech audio and streams it back to the user.
🎯 All three steps are optimized for real-time execution, targeting an end-to-end latency of roughly 200–500 ms, depending on configuration.
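The three steps above can be sketched as a chain of Python generators, where each stage consumes the previous stage's output as a stream. This is only an illustration of the Listen → Understand → Speak flow: `transcribe`, `respond`, and `synthesize` are hypothetical stand-ins that fake the STT, LLM, and TTS engines with simple string operations, and none of them correspond to real VoiceHub or provider calls.

```python
from typing import Iterable, Iterator


def transcribe(audio_chunks: Iterable[bytes]) -> Iterator[str]:
    """Step 1 (Listen): stand-in STT that treats each audio chunk as one word."""
    for chunk in audio_chunks:
        yield chunk.decode("utf-8")


def respond(words: Iterable[str]) -> Iterator[str]:
    """Step 2 (Understand): stand-in LLM that reads the utterance, then streams reply tokens."""
    utterance = " ".join(words)
    for token in f"You said: {utterance}".split():
        yield token


def synthesize(tokens: Iterable[str]) -> Iterator[bytes]:
    """Step 3 (Speak): stand-in TTS that emits one audio chunk per reply token."""
    for token in tokens:
        yield token.encode("utf-8")


def voice_to_voice(audio_chunks: Iterable[bytes]) -> Iterator[bytes]:
    """Chain the three stages; each stage pulls from the previous one lazily."""
    return synthesize(respond(transcribe(audio_chunks)))


reply_audio = list(voice_to_voice([b"hello", b"there"]))
```

Because the stages are generators, the TTS stand-in starts emitting chunks as soon as the first reply token is available, rather than waiting for the full response. A real pipeline would presumably use asynchronous streams, but the data flow is the same.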
What Makes VoiceHub Unique
You can switch providers at each stage without writing custom glue code
We stream audio and text between stages, which keeps response latency as low as ~200 ms
Our routing system handles scaling, retrying, failover, and QoS behind the scenes
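As a rough illustration of what retrying and failover can look like, the sketch below tries each provider in order, retries a fixed number of times, and falls through to the next provider on failure. It is a generic pattern, not a description of VoiceHub's routing system: `call_with_failover`, `flaky_primary`, and `backup` are hypothetical names, and the retry count is an arbitrary assumption.

```python
from typing import Callable, Optional, Sequence


def call_with_failover(
    providers: Sequence[Callable[[str], str]],
    payload: str,
    retries_per_provider: int = 2,
) -> str:
    """Try each provider in order, retrying before failing over to the next one."""
    last_error: Optional[Exception] = None
    for provider in providers:
        for _ in range(retries_per_provider):
            try:
                return provider(payload)
            except Exception as exc:  # a real system would catch narrower error types
                last_error = exc
    raise RuntimeError("all providers failed") from last_error


def flaky_primary(text: str) -> str:
    """Hypothetical primary provider that is always unavailable."""
    raise ConnectionError("primary unavailable")


def backup(text: str) -> str:
    """Hypothetical backup provider that succeeds."""
    return text.upper()


result = call_with_failover([flaky_primary, backup], "hello")
```

Here the primary fails on both attempts, so the call is served by the backup. A production router would typically add backoff between retries and health tracking per provider, which this sketch leaves out.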
Built-in DQ Models
For teams focused on Arabic, English, or Dutch support, VoiceHub offers in-house DataQueue (DQ) models:
Optimized for MENA voice patterns
Fine-tuned for dialect-specific recognition and tone
Lower latency and higher reliability than many generic models
Use DQ Mode for:
Arabic-first customer service
Multilingual deployments with low setup overhead
Government, telco, or regulated deployments with regional preferences
You can switch to DQ Mode at any time from the Configuration panel.