Fireworks AI
Fireworks AI is a production-grade inference platform for open-source and proprietary generative models. The Fireworks API hosts Llama, DeepSeek, Qwen, Mixtral, Stable Diffusion, and other models with serverless pay-per-token, on-demand dedicated GPU, and batch deployment options, plus managed fine-
4 channels across 1 AsyncAPI spec · Provider profile
Channels
- OpenAI-compatible chat completions. When the request body sets `stream: true`, the response is `text/event-stream`. Each event is emitted as a `data:` line whose payload is a JSON `ChatCompletionStrea…`
  Spec: Fireworks AI Streaming Inference API
- OpenAI-compatible legacy text completions. When the request body sets `stream: true`, the response is `text/event-stream`. Each event is a `data:` line whose payload is a JSON `CompletionStreamRespons…`
  Spec: Fireworks AI Streaming Inference API
- Anthropic-compatible Messages endpoint. When `stream: true`, the response is `text/event-stream`. Unlike the OpenAI-compatible endpoints, each SSE event includes both an `event:` line naming the event…
  Spec: Fireworks AI Streaming Inference API
- OpenAI-compatible Responses API. When the request body sets `stream: true`, the response is `text/event-stream`. Per the Fireworks docs, each chunk is an SSE event delivering the incremental Response state…
  Spec: Fireworks AI Streaming Inference API