Real-Time Speech Analytics Guide
This guide explains how to integrate your live audio stream (e.g., from Genesys AudioHook) with our real-time speech analytics WebSocket endpoint. You'll receive live transcripts and optional AI-powered analysis results during the call.
WebSocket Endpoint
Connect to the following WebSocket URL:
?x-api-key=YOUR_API_KEY
Pass the API key as a query parameter, not a header.
Authentication
Your API key must be included as a query parameter:
?x-api-key=YOUR_API_KEY
API Key: You can view your organization's live stream API key on your Profile page.
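As a minimal sketch of the query-parameter rule above, the helper below appends x-api-key to a base URL. The host wss://example.invalid/stream and the function name build_stream_url are placeholders invented for illustration; substitute the WebSocket URL you were given.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def build_stream_url(base_url: str, api_key: str) -> str:
    """Return base_url with the API key added as the x-api-key query parameter."""
    parts = urlsplit(base_url)
    # Keep any query parameters that are already on the base URL.
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append(("x-api-key", api_key))
    # urlencode percent-escapes characters that are unsafe in a query string.
    return urlunsplit(parts._replace(query=urlencode(query)))


# Placeholder host (assumption): replace with your real endpoint.
url = build_stream_url("wss://example.invalid/stream", "YOUR_API_KEY")
```

Encoding the key with urlencode rather than string concatenation keeps the URL valid even if the key contains characters such as & or +.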
Audio Input Format
- Audio codec: PCM16 (raw linear)
- Sample rate: 16 kHz
- Channels: Mono (1)
- Frame size: 500 ms chunks
- Language: Auto-detected. Specifying the spoken language makes transcription faster and more accurate.
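The format above fixes the size of each frame: 16,000 samples per second × 2 bytes per sample × 1 channel × 0.5 s = 16,000 bytes. The sketch below derives that number and splits a raw PCM buffer into frames; the names CHUNK_BYTES and iter_chunks are illustrative, not part of the service.

```python
from typing import Iterator

SAMPLE_RATE_HZ = 16_000   # 16 kHz
BYTES_PER_SAMPLE = 2      # PCM16 means 16-bit (2-byte) samples
CHANNELS = 1              # mono
CHUNK_MS = 500            # frame duration

# Bytes in one 500 ms frame of 16 kHz, 16-bit mono audio.
CHUNK_BYTES = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS * CHUNK_MS // 1000


def iter_chunks(pcm: bytes, chunk_bytes: int = CHUNK_BYTES) -> Iterator[bytes]:
    """Yield consecutive frames of raw PCM; the last frame may be shorter."""
    for start in range(0, len(pcm), chunk_bytes):
        yield pcm[start:start + chunk_bytes]
```

This guide does not say whether a short final frame must be padded, so the sketch sends it as-is.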
Initial Message
After connecting, send a JSON message to open the session:
{
  "type": "open",
  "id": "your-session-id",
  "seq": 1,
  "parameters": {
    "media": [
      { "type": "audio", "codec": "audio/pcm" }
    ]
  }
}
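One way to produce this message in code is sketched below. The function name build_open_message is illustrative; the field names and values are taken from the example above.

```python
import json


def build_open_message(session_id: str, seq: int = 1) -> str:
    """Serialize the session-opening message as a JSON text frame."""
    message = {
        "type": "open",
        "id": session_id,
        "seq": seq,
        "parameters": {
            "media": [{"type": "audio", "codec": "audio/pcm"}],
        },
    }
    return json.dumps(message)
```

Send the returned string as a text frame; audio chunks go out as binary frames.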
Server Response
The server replies with:
{
  "type": "opened",
  "id": "your-session-id",
  "parameters": {
    "startPaused": false,
    "media": [...],
    "supportedLanguages": ["en", "tr"]
  }
}
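A client may want to check this reply before streaming audio. The sketch below validates the message type and session id and returns the advertised languages; parse_opened is a hypothetical helper, and raising ValueError on a mismatch is a design choice of this example rather than documented behavior.

```python
import json
from typing import List


def parse_opened(raw: str, session_id: str) -> List[str]:
    """Validate an "opened" reply and return its supportedLanguages list."""
    msg = json.loads(raw)
    if msg.get("type") != "opened":
        raise ValueError(f"expected 'opened', got {msg.get('type')!r}")
    if msg.get("id") != session_id:
        raise ValueError("reply belongs to a different session")
    # Treat a missing list as empty rather than failing.
    return list(msg.get("parameters", {}).get("supportedLanguages", []))
```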
Message Flow
- Send open message: initiate session
- Send binary audio/pcm data chunks continuously
- Receive event messages with real-time transcript (and analysis)
- Send close message: cleanly terminate session
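The sending side of this flow can be sketched as a generator that yields frames in order: a text frame for open, binary frames for audio, and a text frame for close. This guide does not show the close message body, so the shape used here ({"type": "close", "id": ...}) is an assumption, and outgoing_messages is an illustrative name.

```python
import json
from typing import Iterable, Iterator, Union


def outgoing_messages(
    session_id: str, audio_chunks: Iterable[bytes]
) -> Iterator[Union[str, bytes]]:
    """Yield frames in protocol order: open (text), audio (binary), close (text)."""
    yield json.dumps({
        "type": "open",
        "id": session_id,
        "seq": 1,
        "parameters": {"media": [{"type": "audio", "codec": "audio/pcm"}]},
    })
    for chunk in audio_chunks:
        yield chunk  # raw PCM16 bytes, sent as a binary frame
    # ASSUMPTION: the close body is not documented here; this is a guess.
    yield json.dumps({"type": "close", "id": session_id})
```

With a WebSocket client of your choice, send each str as a text frame and each bytes object as a binary frame, and read event messages concurrently on the same connection. For live audio, pace the chunks at roughly one per 500 ms.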
Transcript Output
Transcripts are returned in this format:
{
  "version": "2",
  "type": "event",
  "id": "your-session-id",
  "parameters": {
    "entity": {
      "type": "transcript",
      "value": "Live transcribed text here..."
    },
    "analysis": {
      "summary": "...",
      "sentiment": "neutral",
      "emotion": "calm",
      "agent_score": 4,
      "process_issues": "...",
      "repeated_problems": "...",
      "marketing_mentions": "..."
    }
  }
}
The analysis field is included once enough words have been spoken (≥ 5 words).
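Because analysis is optional, a handler should not assume it is present. The sketch below extracts the transcript text and returns the analysis only when the server included it; parse_event is a hypothetical helper built from the example message above.

```python
import json
from typing import Optional, Tuple


def parse_event(raw: str) -> Optional[Tuple[str, Optional[dict]]]:
    """Return (transcript, analysis-or-None), or None for non-transcript messages."""
    msg = json.loads(raw)
    if msg.get("type") != "event":
        return None
    params = msg.get("parameters", {})
    entity = params.get("entity", {})
    if entity.get("type") != "transcript":
        return None
    # "analysis" is absent until enough words have been spoken.
    return entity.get("value", ""), params.get("analysis")
```

A caller can then branch on whether the second element is None before reading fields such as sentiment or agent_score.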
Final Results
When the session ends (via close or disconnection), a full transcript and final analysis are saved to CSV on the server for offline review.