Deepgram
Deepgram is an enterprise voice AI platform that provides speech-to-text, text-to-speech, and voice agent APIs powered by advanced AI models. The platform offers real-time and batch transcription through its Nova model family, natural-sounding speech synthesis through its Aura model family, and an e…
3 channels across 3 AsyncAPI specs
Channels
- Deepgram Voice Agent Events: WebSocket channel for the Voice Agent API. After connecting, the client sends a Settings message to configure the agent's listen (STT), think (LLM), and speak (TTS) providers, followed by binary audio…
- Deepgram Speech-to-Text Streaming Events: WebSocket channel for real-time speech-to-text streaming. The client sends binary audio frames and receives JSON transcription events. Connection parameters include model, language, punctuate, diarize…
- Deepgram Text-to-Speech Streaming Events: WebSocket channel for real-time text-to-speech streaming. The client sends text as JSON messages and receives synthesized audio as binary frames. Connection parameters include model, encoding, sample_…
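The channels above share two mechanics: connection parameters travel in the WebSocket URL, and each frame is either binary audio or a JSON message. The following is a minimal Python sketch of both, not Deepgram's SDK. The helper names build_stream_url and decode_frame are hypothetical, the endpoint URL and the "nova-3" model value are assumptions to check against the AsyncAPI specs, and sending parameters as a query string is also an assumption, since the listing only says they are connection parameters.

```python
import json
from urllib.parse import urlencode

# Assumed speech-to-text streaming endpoint; verify against the AsyncAPI spec.
STT_URL = "wss://api.deepgram.com/v1/listen"


def build_stream_url(base_url, **params):
    """Append connection parameters to a WebSocket URL as a query string."""
    # Booleans become "true"/"false"; every other value is stringified as-is.
    query = {
        key: str(value).lower() if isinstance(value, bool) else str(value)
        for key, value in params.items()
    }
    return f"{base_url}?{urlencode(query)}" if query else base_url


def decode_frame(frame):
    """Classify one WebSocket frame as audio (binary) or a JSON event (text)."""
    if isinstance(frame, (bytes, bytearray)):
        return ("audio", bytes(frame))
    return ("event", json.loads(frame))


# Example values only: parameter names come from the channel description,
# the model name is an assumed member of the Nova family.
url = build_stream_url(STT_URL, model="nova-3", language="en", punctuate=True)
```

Here url evaluates to wss://api.deepgram.com/v1/listen?model=nova-3&language=en&punctuate=true. A client built on any WebSocket library could pass each received frame to decode_frame: on the speech-to-text channel it would mostly see JSON events, and on the text-to-speech channel mostly audio frames.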