AsyncAPI channel · cartesia · Cartesia Streaming WebSocket APIs

/stt/websocket

Ink STT streaming WebSocket. Clients send binary audio frames matching the encoding/sample_rate query parameters, plus optional textual `finalize` and `close` control frames. The server emits JSON `transcript`, `flush_done`, `done`, and `error` frames.
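The client side of this exchange can be sketched without a network connection. This is a minimal illustration, not Cartesia's reference client: the host `example.invalid`, the encoding value `pcm_s16le`, the 16000 Hz rate, and the 3200-byte chunk size are all assumptions chosen for the example. Only the channel path, the `encoding` and `sample_rate` parameter names, and the `finalize` and `close` control frames come from the description above.

```python
from urllib.parse import urlencode

BASE_URL = "wss://example.invalid"  # hypothetical host, not the real endpoint
CHANNEL = "/stt/websocket"          # channel address from this page


def build_url(encoding: str, sample_rate: int) -> str:
    """Attach the audio format to the channel address as query parameters."""
    query = urlencode({"encoding": encoding, "sample_rate": sample_rate})
    return f"{BASE_URL}{CHANNEL}?{query}"


def client_frames(audio: bytes, chunk_size: int = 3200):
    """Yield frames in the order a client might send them.

    bytes values stand for binary audio frames; str values stand for
    plain-text control frames.
    """
    for start in range(0, len(audio), chunk_size):
        yield audio[start:start + chunk_size]
    yield "finalize"  # ask the server to transcribe buffered audio
    yield "close"     # flush remaining audio and end the session


if __name__ == "__main__":
    print(build_url("pcm_s16le", 16000))
    for frame in client_frames(b"\x00" * 7000):
        kind = "binary" if isinstance(frame, bytes) else "text"
        print(kind, len(frame))
```

With a WebSocket library, each `bytes` value would be sent as a binary message and each `str` value as a text message; most libraries pick the frame type from the Python type of the payload.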

Provider: cartesia
AsyncAPI: v2.6.0
Spec: Cartesia Streaming WebSocket APIs
Operations: 2
Messages: 7

Channel address

/stt/websocket

Operations

publish
sttClientFrame
Client → Server frames for Ink STT.
sttServerFrame
Server → Client frames for Ink STT.

Messages

sttAudioBinary
Raw audio bytes matching the negotiated encoding and sample rate.
Content-Type: application/octet-stream
sttFinalize
Plain-text `finalize` control frame; triggers transcription of buffered audio.
Content-Type: text/plain
sttClose
Plain-text `close` control frame; flushes remaining audio and ends the session.
Content-Type: text/plain
sttTranscript
Delta transcript with word-level timing.
Content-Type: application/json
sttFlushDone
Acknowledgement of a `finalize` control frame.
Content-Type: application/json
sttDone
Acknowledgement of a `close` control frame.
Content-Type: application/json
sttError
Error condition on the STT WebSocket.
Content-Type: application/json
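
The four JSON server messages listed above can be routed with a small dispatcher. This is a sketch under stated assumptions: the page does not show the payload schemas, so the `type` discriminator field and the `text` and `message` fields used here are hypothetical and should be checked against the actual spec. The function name `handle_server_frame` is likewise illustrative.

```python
import json


def handle_server_frame(raw: str, transcript_parts: list) -> bool:
    """Process one JSON server frame; return True once the session is over.

    Assumes each frame carries a "type" field naming the message kind
    (hypothetical; the real schema may differ).
    """
    frame = json.loads(raw)
    kind = frame.get("type")
    if kind == "transcript":
        # Delta transcript: accumulate the text (field name assumed).
        transcript_parts.append(frame.get("text", ""))
        return False
    if kind == "flush_done":
        # Acknowledges an earlier `finalize`; the session stays open.
        return False
    if kind == "done":
        # Acknowledges `close`; no further frames are expected.
        return True
    if kind == "error":
        raise RuntimeError(frame.get("message", "STT WebSocket error"))
    raise ValueError(f"unexpected frame type: {kind!r}")


if __name__ == "__main__":
    parts = []
    for raw in ('{"type": "transcript", "text": "hello"}',
                '{"type": "flush_done"}',
                '{"type": "done"}'):
        if handle_server_frame(raw, parts):
            break
    print("".join(parts))
```

Raising on `error` and on unknown types keeps a receive loop from silently waiting on a session the server has already given up on.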

About AsyncAPI

The AsyncAPI specification describes event-driven APIs the way OpenAPI describes request/response APIs. A channel is the named pipe (a webhook URL, a Kafka topic, a WebSocket route, an MQTT topic) that producers and consumers publish or subscribe to. Each channel carries one or more messages with structured payloads, and an operation declares whether a given party sends or receives on that channel.
