🎧 Real-Time Speech Analytics Guide

This guide explains how to integrate your live audio stream (e.g., from Genesys AudioHook) with our real-time speech analytics WebSocket endpoint. You'll receive live transcripts and optional AI-powered analysis results during the call.

📡 WebSocket Endpoint

Connect to the following WebSocket URL:

?x-api-key=YOUR_API_KEY

Pass the API key as a query parameter, not a header.

🔐 Authentication

Append your API key to the WebSocket URL as the x-api-key query parameter:

?x-api-key=YOUR_API_KEY

🔑 API Key: You can view your organization’s live stream API key on your Profile page.
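
As an illustration of the query-parameter rule above, here is a minimal sketch that appends the key to a base URL. The function name build_stream_url and the host wss://example.invalid/stream are placeholders invented for this example; substitute the real endpoint for your account.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def build_stream_url(base_url: str, api_key: str) -> str:
    """Return base_url with the API key added as the x-api-key query parameter."""
    parts = urlsplit(base_url)
    # Preserve any query parameters already present on the base URL.
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append(("x-api-key", api_key))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Placeholder host (hypothetical): replace with your real WebSocket endpoint.
print(build_stream_url("wss://example.invalid/stream", "YOUR_API_KEY"))
```

urlencode also percent-encodes the key, which matters if it ever contains characters that are not URL-safe.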

🎙️ Audio Input Format

Audio codec: PCM16 (raw linear)
Sample rate: 16 kHz
Channels: Mono (1)
Frame size: 500 ms chunks (16,000 bytes each at 16 kHz, 16-bit mono)
Language: Auto-detected (specifying the spoken language gives faster and more accurate transcription)
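
The chunk size follows directly from the format above: 16,000 samples/s × 2 bytes × 1 channel × 0.5 s = 16,000 bytes. The sketch below splits a raw PCM16 buffer into chunks of that size. The helper name iter_frames is invented for this example, and it simply emits a shorter final chunk; whether the server expects padding instead is not stated in this guide.

```python
SAMPLE_RATE_HZ = 16_000   # 16 kHz
BYTES_PER_SAMPLE = 2      # PCM16 = 16 bits per sample
CHANNELS = 1              # mono
FRAME_MS = 500            # chunk duration

# Bytes in one 500 ms chunk of 16 kHz mono PCM16 audio.
FRAME_BYTES = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS * FRAME_MS // 1000


def iter_frames(pcm: bytes, frame_bytes: int = FRAME_BYTES):
    """Yield consecutive chunks of raw PCM; the last one may be shorter."""
    for start in range(0, len(pcm), frame_bytes):
        yield pcm[start:start + frame_bytes]


# 1.25 s of silence splits into two full chunks and one partial chunk.
silence = bytes(40_000)
print([len(chunk) for chunk in iter_frames(silence)])
```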

📤 Initial Message

After connecting, send a JSON message to open the session:

{
  "type": "open",
  "id": "your-session-id",
  "seq": 1,
  "parameters": {
    "media": [
      { "type": "audio", "codec": "audio/pcm" }
    ]
  }
}
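
A minimal sketch of building this message programmatically rather than by hand. The function name build_open_message is invented for this example, and generating the session id with uuid4 is an assumption; the guide only shows that "id" is a caller-chosen string.

```python
import json
import uuid
from typing import Optional


def build_open_message(session_id: Optional[str] = None) -> str:
    """Serialize the open message shown above as a JSON string."""
    message = {
        "type": "open",
        # ASSUMPTION: any unique string works as a session id.
        "id": session_id or str(uuid.uuid4()),
        "seq": 1,
        "parameters": {
            "media": [{"type": "audio", "codec": "audio/pcm"}],
        },
    }
    return json.dumps(message)


print(build_open_message("your-session-id"))
```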

📥 Server Response

The server replies with:

{
  "type": "opened",
  "id": "your-session-id",
  "parameters": {
    "startPaused": false,
    "media": [...],
    "supportedLanguages": ["en", "tr"]
  }
}
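
Before streaming audio, a client will probably want to confirm that this reply arrived and belongs to its session. The sketch below does that and returns the advertised languages. The function name handle_opened and the decision to raise ValueError are choices made for this example, not documented behavior; the sample omits "media" because its contents are elided above.

```python
import json
from typing import List


def handle_opened(raw: str, expected_id: str) -> List[str]:
    """Validate an opened reply and return its supportedLanguages list."""
    message = json.loads(raw)
    if message.get("type") != "opened":
        raise ValueError(f"expected 'opened', got {message.get('type')!r}")
    if message.get("id") != expected_id:
        raise ValueError("reply belongs to a different session")
    return message.get("parameters", {}).get("supportedLanguages", [])


sample = (
    '{"type": "opened", "id": "your-session-id", '
    '"parameters": {"startPaused": false, "supportedLanguages": ["en", "tr"]}}'
)
print(handle_opened(sample, "your-session-id"))  # ['en', 'tr']
```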

🔄 Message Flow

Send open → Initiate session
Receive opened → Session confirmed
Send binary audio/pcm data chunks continuously
Receive event messages with the real-time transcript (and analysis, when available)
Send close message → Cleanly terminate session
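
The sending side of this flow can be sketched as a transport-agnostic generator: JSON messages are yielded as str (text frames) and audio as bytes (binary frames), so any WebSocket client can forward them. Note the assumptions: the guide does not show the close message, so its shape here (mirroring open, with an incremented seq) is a guess, and outgoing_messages is a name invented for this example. Event messages arrive concurrently and would be read in a separate loop.

```python
import json
from typing import Iterator, Union


def outgoing_messages(
    session_id: str, pcm: bytes, frame_bytes: int = 16_000
) -> Iterator[Union[str, bytes]]:
    """Yield the client-side messages for one session, in order."""
    # 1. Open the session (format taken from the Initial Message section).
    yield json.dumps({
        "type": "open",
        "id": session_id,
        "seq": 1,
        "parameters": {"media": [{"type": "audio", "codec": "audio/pcm"}]},
    })
    # 2. Stream raw PCM16 audio in 500 ms chunks (16,000 bytes each).
    for start in range(0, len(pcm), frame_bytes):
        yield pcm[start:start + frame_bytes]
    # 3. ASSUMPTION: close message shape is not documented in this guide.
    yield json.dumps({"type": "close", "id": session_id, "seq": 2})


for message in outgoing_messages("your-session-id", bytes(20_000)):
    kind = "text" if isinstance(message, str) else "binary"
    print(kind, len(message))
```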

✅ Transcript Output

Transcripts are returned in this format:

{
  "version": "2",
  "type": "event",
  "id": "your-session-id",
  "parameters": {
    "entity": {
      "type": "transcript",
      "value": "Live transcribed text here..."
    },
    "analysis": {
      "summary": "...",
      "sentiment": "neutral",
      "emotion": "calm",
      "agent_score": 4,
      "process_issues": "...",
      "repeated_problems": "...",
      "marketing_mentions": "..."
    }
  }
}

The analysis field is included only once at least 5 words have been spoken; shorter transcripts arrive without it.
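
Because analysis is optional, client code should treat it as possibly absent. A minimal sketch of extracting both parts from an event message follows; the function name parse_event and the choice to return None for non-transcript messages are assumptions made for this example.

```python
import json
from typing import Optional, Tuple


def parse_event(raw: str) -> Optional[Tuple[str, Optional[dict]]]:
    """Return (transcript, analysis or None), or None if not a transcript event."""
    message = json.loads(raw)
    if message.get("type") != "event":
        return None
    parameters = message.get("parameters", {})
    entity = parameters.get("entity", {})
    if entity.get("type") != "transcript":
        return None
    # analysis is missing until enough words have been spoken.
    return entity.get("value", ""), parameters.get("analysis")


short_event = json.dumps({
    "version": "2",
    "type": "event",
    "id": "your-session-id",
    "parameters": {"entity": {"type": "transcript", "value": "Hello there"}},
})
text, analysis = parse_event(short_event)
print(text, analysis)  # Hello there None
```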

💾 Final Results

When the session ends (via a close message or a disconnection), the full transcript and final analysis are saved as a CSV file on the server for offline review.