AsyncAPI channel · cartesia · Cartesia Streaming WebSocket APIs

/tts/websocket

Sonic TTS bidirectional WebSocket. Clients send JSON `generationRequest` or `cancelRequest` frames; the server returns JSON `chunk`, `flush_done`, `done`, `timestamps`, `phoneme_timestamps`, and `error` frames. Multiple concurrent contexts are multiplexed by `context_id`.
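Multiplexing by `context_id` means a client has to sort incoming server frames back into per-context streams. The sketch below shows one way to do that on already-received text frames. The frame type names come from the description above. The `type` discriminator field, and the choice to file frames without a `context_id` under `None`, are assumptions for illustration and are not confirmed by this listing.

```python
import json
from collections import defaultdict

# Frame type names as listed in the channel description.
SERVER_FRAME_TYPES = {
    "chunk", "flush_done", "done",
    "timestamps", "phoneme_timestamps", "error",
}


def route_frames(raw_frames):
    """Group JSON server frames by context_id, preserving arrival order."""
    by_context = defaultdict(list)
    for raw in raw_frames:
        frame = json.loads(raw)
        # ASSUMPTION: each frame names its kind in a "type" field.
        kind = frame.get("type")
        if kind not in SERVER_FRAME_TYPES:
            raise ValueError(f"unexpected frame type: {kind!r}")
        # ASSUMPTION: frames with no context_id (for example a
        # connection-level error) are collected under the key None.
        by_context[frame.get("context_id")].append(frame)
    return dict(by_context)
```

In a real client the same routing would typically run inside the WebSocket receive loop, handing each frame to a per-context queue rather than collecting a list.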

Provider: cartesia
AsyncAPI: v2.6.0
Spec: Cartesia Streaming WebSocket APIs
Operations: 2
Messages: 8

Channel address

/tts/websocket

Operations

publish
ttsClientFrame
Client → Server frames for Sonic TTS.
ttsServerFrame
Server → Client frames for Sonic TTS.

Messages

ttsGenerationRequest
Submit (or continue) a transcript for streaming synthesis on a context.
Content-Type: application/json
ttsCancelRequest
Terminate an in-flight generation for a context.
Content-Type: application/json
ttsChunk
A base64-encoded audio chunk for an active context.
Content-Type: application/json
ttsTimestamps
Word-level start/end timings for synthesized audio.
Content-Type: application/json
ttsPhonemeTimestamps
Phoneme-level start/end timings for synthesized audio.
Content-Type: application/json
ttsFlushDone
Acknowledgement that a flush boundary has been emitted on the context.
Content-Type: application/json
ttsDone
Final generation-complete signal for a context.
Content-Type: application/json
ttsError
Error condition on the TTS WebSocket or a specific context.
Content-Type: application/json
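Because audio arrives as base64 text inside JSON `ttsChunk` frames, a client must decode and concatenate the chunks for a context until `ttsDone` arrives. The following is a minimal sketch of that step for one context's frames, already parsed into dictionaries. The field names `type`, `data`, and `error` are hypothetical placeholders; the actual payload schemas are not shown in this listing.

```python
import base64


def collect_audio(frames):
    """Decode chunk frames for one context into raw audio bytes.

    Returns (audio_bytes, finished), where finished is True once a
    done frame has been seen.
    """
    audio = bytearray()
    for frame in frames:
        kind = frame.get("type")  # ASSUMPTION: "type" discriminator field
        if kind == "chunk":
            # ASSUMPTION: the base64 audio payload lives in "data".
            audio += base64.b64decode(frame["data"])
        elif kind == "error":
            # ASSUMPTION: the error text lives in "error".
            raise RuntimeError(frame.get("error", "TTS error"))
        elif kind == "done":
            return bytes(audio), True
        # flush_done and timestamp frames carry no audio; skip them here.
    return bytes(audio), False
```

Returning a `finished` flag lets the caller distinguish a completed generation from a stream that simply has not delivered its final frame yet.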

About AsyncAPI

The AsyncAPI specification describes event-driven APIs the way OpenAPI describes request/response APIs. A channel is the named route (a webhook URL, a Kafka topic, a WebSocket route, an MQTT topic) that producers and consumers publish or subscribe to. Each channel carries one or more messages with structured payloads, and an operation declares whether a given party sends or receives on that channel.
