How VoiceHub Works

Core Models

VoiceHub operates as an orchestration layer over three core components in the voice AI pipeline:

Transcriber (STT) — Converts audio input into text
Language Model (LLM) — Interprets the text and generates intelligent responses
Voice (TTS) — Converts the LLM’s response back into speech

Each module can be configured independently with a third-party provider:

STT: Deepgram, Gladia, Azure, etc.
LLM: OpenAI, Groq, Claude, Cohere, etc.
TTS: ElevenLabs, PlayHT, LMNT, etc.

VoiceHub handles orchestration, streaming, and optimization across the three components, keeping interaction real-time and latency low while letting you switch providers without changes to the rest of the pipeline.
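To make the per-stage provider choice concrete, here is a minimal sketch of what such a configuration could look like. The `PipelineConfig` class and its field names are assumptions for illustration only, not VoiceHub's actual schema; the provider names are taken from the lists above.

```python
from dataclasses import dataclass, replace

# Hypothetical configuration shape (assumption); VoiceHub's real schema may differ.
@dataclass(frozen=True)
class PipelineConfig:
    stt: str  # transcriber provider, e.g. "deepgram"
    llm: str  # language model provider, e.g. "openai"
    tts: str  # voice provider, e.g. "elevenlabs"

config = PipelineConfig(stt="deepgram", llm="openai", tts="elevenlabs")

# Swapping one stage returns a new config and leaves the other two stages untouched.
swapped = replace(config, llm="groq")
```

Treating the configuration as an immutable value is one possible design choice: a provider swap becomes a single, explicit change that is easy to review or roll back.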

The Voice-to-Voice Pipeline (Real-Time Streaming)

Step 1: Listen (Intake Raw Audio)

The user speaks into their device (laptop, phone, etc.). The audio is streamed and recorded in real time, then transcribed into text by the selected STT engine.

Step 2: Understand (Run an LLM)

The transcribed text is sent to the selected LLM, which uses the agent’s prompt and context to generate a response.

Step 3: Speak (Text → Raw Audio)

The response text is passed to a TTS engine, which synthesizes speech audio and streams it back to the user.

🎯 All three steps are optimized for real-time execution, targeting an end-to-end latency of roughly 200–500 ms, depending on configuration.
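The three steps above can be sketched as a chain of asynchronous generators, where each stage starts consuming output before the previous stage has finished. This is an illustrative sketch only: `microphone`, `transcribe`, `respond`, `synthesize`, and `run_pipeline` are hypothetical placeholders, not VoiceHub functions, and the stand-in stages simply pass text through instead of calling real STT, LLM, or TTS providers.

```python
import asyncio
from typing import AsyncIterator, List

async def microphone() -> AsyncIterator[bytes]:
    """Stand-in for a live audio stream: yields fake audio chunks."""
    for chunk in (b"hello", b"world"):
        yield chunk

async def transcribe(audio: AsyncIterator[bytes]) -> AsyncIterator[str]:
    """Step 1 (Listen): placeholder STT that decodes each chunk to text."""
    async for chunk in audio:
        yield chunk.decode()

async def respond(text: AsyncIterator[str], prompt: str) -> AsyncIterator[str]:
    """Step 2 (Understand): placeholder LLM that tags each piece with the prompt."""
    async for piece in text:
        yield f"[{prompt}] {piece}"

async def synthesize(reply: AsyncIterator[str]) -> AsyncIterator[bytes]:
    """Step 3 (Speak): placeholder TTS that encodes text as bytes."""
    async for piece in reply:
        yield piece.encode()

async def run_pipeline(prompt: str) -> List[bytes]:
    # Chaining the generators means no stage waits for a full utterance.
    out: List[bytes] = []
    async for audio_out in synthesize(respond(transcribe(microphone()), prompt)):
        out.append(audio_out)
    return out

result = asyncio.run(run_pipeline("agent"))
```

In a real deployment each placeholder would wrap a provider's streaming API; the chaining pattern is what keeps the time to first audio short.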

What Makes VoiceHub Unique

You can switch providers at each stage without writing custom glue code
We stream audio and text between stages for responsiveness down to ~200ms
Our routing system handles scaling, retrying, failover, and QoS behind the scenes
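VoiceHub's routing is internal, so the following is only a rough sketch of the general retry-then-failover idea, not a description of how VoiceHub implements it. The helper `call_with_failover` and the two toy providers are hypothetical names introduced for illustration.

```python
from typing import Callable, Optional, Sequence

def call_with_failover(
    providers: Sequence[Callable[[str], str]],
    payload: str,
    retries_per_provider: int = 2,
) -> str:
    """Try each provider in order, retrying a few times before moving on."""
    last_error: Optional[Exception] = None
    for provider in providers:
        for _ in range(retries_per_provider):
            try:
                return provider(payload)
            except Exception as exc:  # a real system would catch narrower errors
                last_error = exc
    raise RuntimeError("all providers failed") from last_error

# Toy providers (assumptions): the primary always fails, the backup succeeds.
def flaky_primary(text: str) -> str:
    raise TimeoutError("primary unavailable")

def backup(text: str) -> str:
    return text.upper()

result = call_with_failover([flaky_primary, backup], "hi")
```

Here the primary is retried twice, then the call falls through to the backup, so the caller sees a normal response rather than the underlying timeout.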

Built-in DQ Models

For teams focused on Arabic, English, or Dutch support, VoiceHub offers in-house DataQueue (DQ) models:

Optimized for MENA voice patterns
Fine-tuned for dialect-specific recognition and tone
Lower latency and higher reliability than many generic models

Use DQ Mode for:

Arabic-first customer service
Multilingual deployments with low setup overhead
Government, telco, or regulated deployments with regional preferences

You can switch to DQ Mode at any time from the Configuration panel.

Last updated 1 month ago