WebCall Socket V1

The WebCall WebSocket provides real-time interaction with voice and text agents. To establish a connection, use the webCallSocketPath returned from POST /calls/public/create-web-call.

Call Initiation Events

channel.join Event (Client -> Server)

Purpose: Subscribes the client to one or more specified channels within a call session, enabling real-time communication via Text and/or Audio.

Join Request Format

To join specific channels, send the following JSON payload:

{
  "callId": "your-call-id",
  "channels": ["audio", "text"]
}
  • callId — The unique identifier of the call session.

  • channels — An array of channels to join. Valid values include "audio" and/or "text".

Response

Upon successful subscription, the server will emit a channel.joined (Server -> Client) event, confirming your access to the requested channels.
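As a minimal sketch of the join request above, the helper below builds the payload and emits it. The `SocketLike` interface and the `buildChannelRequest` and `joinChannels` helpers are illustrative names, not part of the WebCall API; any client object exposing `emit(event, payload)` should fit. Dropping duplicate channel names is a choice made for this example.

```typescript
type Channel = "audio" | "text";

interface ChannelRequest {
  callId: string;
  channels: Channel[];
}

// Hypothetical minimal view of the socket client: only `emit` is needed here.
interface SocketLike {
  emit(event: string, payload: unknown): void;
}

// Builds the documented { callId, channels } payload, dropping duplicate channels.
function buildChannelRequest(callId: string, channels: Channel[]): ChannelRequest {
  if (callId.length === 0) {
    throw new Error("callId is required");
  }
  const unique = channels.filter((c, i) => channels.indexOf(c) === i);
  if (unique.length === 0) {
    throw new Error("at least one channel is required");
  }
  return { callId, channels: unique };
}

// Emits channel.join; the server is expected to answer with channel.joined.
function joinChannels(socket: SocketLike, callId: string, channels: Channel[]): void {
  socket.emit("channel.join", buildChannelRequest(callId, channels));
}
```

The same payload shape is used by channel.leave, so `buildChannelRequest` could be reused there with a different event name.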

channel.leave Event (Client -> Server)

The channel.leave event allows clients to unsubscribe from one or more specific channels within an active call session. This works similarly to the channel.join event but is used to remove the client’s connection from the selected Text and/or Audio channels.

Leave Request Format

To leave one or more channels, send the following JSON payload:

{
  "callId": "your-call-id",
  "channels": ["audio", "text"]
}
  • callId — The unique identifier of the call session.

  • channels — An array specifying which channels to leave. Valid values are "audio" and/or "text".

Important

  • Once you leave a channel, you will no longer receive messages or audio streams from that channel until you rejoin it.

  • For Text calls, you can't join Audio channels.

  • WebAudio and TextAudio calls need the Audio channel to work properly.

  • For a better experience while connected to the Audio channel, also join the Text channel, since its connection allows you to receive the "created" event.

  • To minimize the credits that each call uses, please only join the needed channels.
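The rules above can be sketched as a small channel-selection helper. The `CallType` string values and the `channelsFor` function are assumptions made for illustration; only the rules themselves (Text calls cannot join Audio, WebAudio and TextAudio need Audio, Text is recommended alongside Audio) come from the notes above.

```typescript
type Channel = "audio" | "text";

// Call type names follow the notes above; the exact string values are an assumption.
type CallType = "Text" | "WebAudio" | "TextAudio";

// Returns the channels to request for a call type.
// `withTranscripts` controls whether audio calls also join "text"
// in order to receive `created` events.
function channelsFor(callType: CallType, withTranscripts: boolean = true): Channel[] {
  if (callType === "Text") {
    return ["text"]; // Text calls cannot join the Audio channel.
  }
  return withTranscripts ? ["audio", "text"] : ["audio"];
}
```

Passing `withTranscripts = false` trades the `created` events for fewer joined channels, which may matter when credit usage is the priority.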


Audio Processing Events

Note: For a better experience while connected to the Audio channel, also join the Text channel.

audio Event (Server -> Client)

Purpose: Sends user audio chunks to the server for processing and analysis.

Request Payload

{
  "callId": "string",
  "audio": "base64-encoded audio chunk",
  "timestamp": number
}
  • callId — Unique identifier of the call session.

  • audio — Base64-encoded audio data chunk.

  • timestamp — Unix timestamp (milliseconds) representing when the audio chunk was captured.

Server Actions

  • Processes the incoming audio data.

  • Triggers Voice Activity Detection (VAD) to detect speech segments.

audio Event (Client -> Server)

Purpose: Sends audio chunks from the client to the server for real-time processing.

Payload

{
  "callId": "string",
  "audio": "base64-encoded audio chunk",
  "timestamp": 1733013245000
}
  • callId — Unique identifier of the call session.

  • audio — Base64-encoded audio data.

  • timestamp — Unix timestamp (ms) of when the chunk was captured.

Example (Client-Side)

const audioData = new Uint8Array(event.data[0]);
const base64Audio = window.btoa(
  String.fromCharCode(...Array.from(audioData))
);

socketRef.current?.emit('audio', {
  callId,
  audio: base64Audio,
  timestamp: Date.now()
});
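Spreading a large byte array into `String.fromCharCode` in a single call may exceed the JavaScript engine's argument limit. A possible variant, sketched below, encodes the bytes in slices. The helper names `bytesToBase64` and `buildAudioPayload` and the 32 KiB slice size are illustrative choices, not part of the WebCall API; `btoa` is assumed to be available as a global, as it is in browsers and recent Node.js versions.

```typescript
interface AudioPayload {
  callId: string;
  audio: string;     // Base64-encoded audio chunk
  timestamp: number; // Unix timestamp in milliseconds
}

// Encodes bytes as Base64, converting them to a binary string slice by slice
// so that no single String.fromCharCode call receives too many arguments.
function bytesToBase64(bytes: Uint8Array, sliceSize: number = 0x8000): string {
  let binary = "";
  for (let i = 0; i < bytes.length; i += sliceSize) {
    const slice = bytes.subarray(i, i + sliceSize);
    binary += String.fromCharCode(...Array.from(slice));
  }
  return btoa(binary);
}

// Builds the documented payload for the client-to-server `audio` event.
function buildAudioPayload(
  callId: string,
  bytes: Uint8Array,
  capturedAt: number = Date.now()
): AudioPayload {
  return { callId, audio: bytesToBase64(bytes), timestamp: capturedAt };
}
```

The result can be passed to `emit('audio', ...)` in place of the inline object in the example above.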

agent.timestamp Event (Client -> Server)

Purpose: Synchronizes the timing of agent audio chunks during the call.

Request Payload

{
  "callId": "string",
  "chunkId": number,
  "timestamp": number
}
  • callId — Unique identifier of the call session.

  • chunkId — Identifier for the specific audio chunk.

  • timestamp — Unix timestamp (milliseconds) indicating the playback or recording time of the chunk.

Server Actions

  • Updates timing metadata related to the call recording for accurate synchronization of audio playback and storage.
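A minimal sketch of emitting this event is shown below. It assumes the client reports the moment an agent audio chunk starts playing; the `SocketLike` interface and the `reportChunkPlayback` helper are hypothetical names used only for illustration.

```typescript
interface AgentTimestampPayload {
  callId: string;
  chunkId: number;
  timestamp: number; // Unix timestamp in milliseconds
}

// Hypothetical minimal view of the socket client: only `emit` is needed here.
interface SocketLike {
  emit(event: string, payload: unknown): void;
}

// Reports the playback time of one agent audio chunk and returns what was sent.
function reportChunkPlayback(
  socket: SocketLike,
  callId: string,
  chunkId: number,
  playedAt: number = Date.now()
): AgentTimestampPayload {
  const payload: AgentTimestampPayload = { callId, chunkId, timestamp: playedAt };
  socket.emit("agent.timestamp", payload);
  return payload;
}
```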

agent.interruption Event (Client -> Server)

Purpose: Signals that the user has interrupted the agent during audio playback, allowing the system to handle the interruption appropriately.

Request Payload

{
  "callId": "string",
  "cursorIndex": number,
  "chunkId": number
}
  • callId — Unique identifier of the call session.

  • cursorIndex — Position within the audio buffer where the interruption occurred.

  • chunkId — Identifier of the audio chunk being interrupted.

Server Actions

  • Trims the interrupted audio buffer to remove overlapping audio.

  • Triggers analytics to log and analyze interruption events.
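To fill in `chunkId` and `cursorIndex`, the client needs to know which agent chunk is playing and how far playback has progressed. The `PlaybackTracker` class below is a hypothetical sketch of that bookkeeping; the unit counted by `cursorIndex` (samples are assumed here) is not specified in this document and should be confirmed against the actual audio pipeline.

```typescript
interface InterruptionPayload {
  callId: string;
  cursorIndex: number;
  chunkId: number;
}

// Tracks the agent chunk currently playing and the playback position inside it.
class PlaybackTracker {
  private chunkId: number | null = null;
  private cursorIndex = 0;

  // Call when a new agent chunk starts playing.
  startChunk(chunkId: number): void {
    this.chunkId = chunkId;
    this.cursorIndex = 0;
  }

  // Call as playback advances; `units` is assumed to be a count of samples.
  advance(units: number): void {
    this.cursorIndex += units;
  }

  // Call when the current chunk finishes playing.
  finishChunk(): void {
    this.chunkId = null;
    this.cursorIndex = 0;
  }

  // Returns the agent.interruption payload, or null when nothing is playing.
  interrupt(callId: string): InterruptionPayload | null {
    if (this.chunkId === null) {
      return null;
    }
    return { callId, cursorIndex: this.cursorIndex, chunkId: this.chunkId };
  }
}
```

When the user starts speaking, a non-null result of `tracker.interrupt(callId)` can be emitted as the `agent.interruption` event.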

agent.stop Event (Server -> Client)

Purpose: Immediately terminates all agent audio playback for the specified call.

Request Payload

{
  "callId": "string"
}
  • callId — Unique identifier of the call session.

Server Actions

  • Stops all outgoing audio streams to the client associated with the call.

agent.end Event (Server -> Client)

Purpose: Initiates a graceful termination of the call session.

Request Payload

{
  "callId": "string"
}
  • callId — Unique identifier of the call session.

Server Actions

  • Finalizes and saves the call recording.

  • Triggers events to end the call.


Text Processing Events

text Event (Client -> Server)

Purpose: Allows the client to send a text message to the agent during a call session.

Request Payload

{
  "callId": "string",
  "textMessage": "Hello, I need help with my order."
}
  • callId — Unique identifier of the active call session.

  • textMessage — The message content the user wants to send to the agent.

Server Actions

  • Delivers the text message to the agent in real-time.

  • May trigger transcript events to update conversation logs.
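A small sketch of preparing the `text` payload follows. Trimming the input and skipping blank messages are choices made for this example rather than documented requirements, and `buildTextPayload` is a hypothetical helper name.

```typescript
interface TextPayload {
  callId: string;
  textMessage: string;
}

// Builds the payload for the `text` event, or returns null for blank input
// so that the caller can skip sending it.
function buildTextPayload(callId: string, message: string): TextPayload | null {
  const textMessage = message.trim();
  if (textMessage.length === 0) {
    return null;
  }
  return { callId, textMessage };
}
```

A non-null result can then be sent with `emit('text', payload)` on the connected socket.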


Transcript Events

created Event (Server -> Client)

The created event delivers real-time system updates about the call’s progress, transcripts, and internal state changes. Clients should listen for this event and handle its payload based on the type.

Event Payload Structure

{
  type: 'transcript' | 'pathwayInfo' | 'trace';
  createdAt: string; // ISO-8601 timestamp
  payload: TranscriptData | PathwayData | TraceData;
}

Handling by Event Type

1. Transcript Events (type: 'transcript')

  • Purpose: Receive conversation transcript segments with speaker attribution.

  • Payload example:

    {
      "text": "Hello, how can I help you?",
      "speaker": "agent",
      "confidence": 0.95,
      "isFinal": true
    }
  • Typical client actions:

    • Render incoming messages in chat UI.

    • Detect user interruptions by matching transcript text.

    • Update conversation history and analytics.
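One way to render transcripts is sketched below. It assumes that a segment with `isFinal: false` is an interim result that later segments from the same speaker replace; this document does not confirm that behavior, so treat it as an assumption. `TranscriptLog` is a hypothetical helper, and the field names follow the payload example above.

```typescript
interface TranscriptData {
  text: string;
  speaker: string;
  confidence: number;
  isFinal: boolean;
}

// Keeps finalized segments in order, plus at most one interim segment per speaker.
class TranscriptLog {
  private finals: TranscriptData[] = [];
  private interim = new Map<string, TranscriptData>();

  add(segment: TranscriptData): void {
    if (segment.isFinal) {
      this.finals.push(segment);
      this.interim.delete(segment.speaker); // the final segment supersedes the interim one
    } else {
      this.interim.set(segment.speaker, segment);
    }
  }

  // Lines to render: finalized segments followed by any in-progress ones.
  lines(): string[] {
    const pending = Array.from(this.interim.values());
    return this.finals.concat(pending).map((t) => `${t.speaker}: ${t.text}`);
  }
}
```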

2. Pathway Info Events (type: 'pathwayInfo')

  • Purpose: Notify client of pathway node transitions in the agent’s conversation flow.

  • Payload example:

    {
      "node": {
        "id": "node_123",
        "type": "MessageNode",
        "data": {
          "text": "Welcome to our service!"
        }
      }
    }
  • Typical client actions:

    • Highlight current pathway node in UI.

    • Dynamically update UI components based on node data.

    • Debug or monitor conversation flow progression.

3. Trace Events (type: 'trace')

Purpose: Trace events signal key milestones and state changes in the call lifecycle, such as call initialization, node executions, and call termination.

Payload Example

{
  "event": "call.started",
  "payload": {}
}

Common Sub-Event Types

  • call.init — Call is initializing.

  • call.started — Call has started.

  • call.node.exec — Execution of a specific conversation node.

  • call.end — Call has ended, possibly including a reason for termination.

Typical Client Actions

  • Update UI to reflect current call status (e.g., start timers on call.started).

  • Track conversation progress by monitoring node executions (call.node.exec).

  • Gracefully handle call termination (call.end) and display error alerts if termination was due to an error.

Example Client Code (React/JS)

webCallSocket.on('created', ({ callLog }) => {
  switch (callLog.type) {
    case 'transcript':
      // Add message to chat, detect interruptions, etc.
      break;

    case 'pathwayInfo':
      // Update active node in UI
      break;

    case 'trace':
      // Handle call lifecycle events and errors
      switch (callLog.payload.event) {
        case 'call.init':
          console.log('Call initializing');
          break;

        case 'call.started':
          showCallTimer();
          break;

        case 'call.node.exec':
          updateActiveNode(callLog.payload.payload.nodeId);
          break;

        case 'call.end':
          if (callLog.payload.reason && callLog.payload.reason.toLowerCase().includes('error')) {
            showErrorAlert(callLog.payload.reason);
          }
          endCallCleanup();
          break;
      }
      break;
  }
});
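For TypeScript clients, the handler above can be given stricter types. The sketch below models the event as a discriminated union so that `payload` narrows together with `type`. The interfaces are inferred from the payload examples in this section and may be incomplete; the optional `reason` field mirrors what the handler above reads and is otherwise an assumption, as is the `summarize` helper.

```typescript
interface TranscriptData {
  text: string;
  speaker: string;
  confidence: number;
  isFinal: boolean;
}

interface PathwayData {
  node: { id: string; type: string; data: { [key: string]: unknown } };
}

interface TraceData {
  event: string; // e.g. "call.init", "call.started", "call.node.exec", "call.end"
  payload: { [key: string]: unknown };
  reason?: string; // assumed: read by the handler above on call.end
}

// Each variant ties a `type` value to its matching payload shape.
type CallLog =
  | { type: "transcript"; createdAt: string; payload: TranscriptData }
  | { type: "pathwayInfo"; createdAt: string; payload: PathwayData }
  | { type: "trace"; createdAt: string; payload: TraceData };

// Produces a one-line description; the compiler narrows `payload` in each case.
function summarize(callLog: CallLog): string {
  switch (callLog.type) {
    case "transcript":
      return `${callLog.payload.speaker}: ${callLog.payload.text}`;
    case "pathwayInfo":
      return `node ${callLog.payload.node.id} (${callLog.payload.node.type})`;
    case "trace":
      return `trace ${callLog.payload.event}`;
  }
}
```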

Call and Socket Status Events

connect Event (Server -> Client)

Purpose: Triggered when the WebCall socket successfully connects.

Typical Client Actions:

  • Update UI to show connection status.

  • Optionally start initializing the call or mark it as active.

channel.joined Event (Server -> Client)

Purpose: Confirms that the client has successfully joined a specific channel.

Payload Example:

{
  "channel": "audio"
}
  • channel — The channel that was joined ("audio" or "text").

Typical Client Actions:

  • Mark audio or text channels as active.

  • Update voice state to "listening" for audio calls.

call.error Event (Server -> Client)

Purpose: Notifies the client of a call-specific error.

Payload Example:

{
  "message": "Failed to start call"
}

Typical Client Actions:

  • Show error message to the user.

  • Optionally terminate or retry the call.

disconnect Event (Server -> Client)

Purpose: Triggered when the WebCall socket disconnects.

Typical Client Actions:

  • Update UI to reflect disconnection.

  • Stop active streams and reset voice/text channel state.

connect_error Event (Server -> Client)

Purpose: Emitted if the WebCall socket fails to connect.

Payload Example:

{
  "message": "Connection timeout"
}

Typical Client Actions:

  • Log or display connection errors.

  • Stop initialization and notify the user.

error Event (Server -> Client)

Purpose: General socket error notification for unexpected issues.

Payload Example:

{
  "message": "WebCall stream error occurred"
}

Typical Client Actions:

  • Log the error for debugging.

  • Update UI and stop any audio processing if necessary.
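The status events in this section can be folded into a single piece of client state. The reducer below is a sketch under that assumption: `ConnectionState`, `StatusEvent`, and `reduceStatus` are illustrative names, and the payload shapes follow the examples above.

```typescript
type Channel = "audio" | "text";

interface ConnectionState {
  connected: boolean;
  channels: Channel[];      // channels confirmed by channel.joined
  lastError: string | null; // most recent error message, if any
}

type StatusEvent =
  | { name: "connect" }
  | { name: "disconnect" }
  | { name: "channel.joined"; channel: Channel }
  | { name: "call.error" | "connect_error" | "error"; message: string };

const initialState: ConnectionState = { connected: false, channels: [], lastError: null };

// Pure function: returns the next state without mutating the previous one.
function reduceStatus(state: ConnectionState, event: StatusEvent): ConnectionState {
  switch (event.name) {
    case "connect":
      return { ...state, connected: true, lastError: null };
    case "channel.joined":
      return state.channels.indexOf(event.channel) >= 0
        ? state
        : { ...state, channels: state.channels.concat([event.channel]) };
    case "disconnect":
      // Reset channel state so the UI does not show stale subscriptions.
      return { ...state, connected: false, channels: [] };
    case "call.error":
    case "connect_error":
    case "error":
      return { ...state, lastError: event.message };
  }
}
```

Each socket listener would then translate its payload into a `StatusEvent` and store the result of `reduceStatus`, keeping UI updates in one place.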
