Endpoints
Audio to Lipsync Data
Takes base64-encoded audio and streams viseme (mouth shape) predictions using Server-Sent Events (SSE). Requires a valid API key.
POST
Authorizations
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
Body
application/json
Base64-encoded audio data
Example:
"UklGRiQAAABXQVZFZm10IBAAAAABAAEARKwAAIhYAQACABAAZGF0YQAAAAA="
Audio sample rate in Hz
Example:
16000
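As a rough illustration of the request described above, the following Python sketch builds the headers and JSON body. The field names "audio" and "sample_rate" and the helper build_request are assumptions made for this example; this page does not state the actual field names or the endpoint URL, so check them against the API schema before use.

```python
import base64
import json


def build_request(audio_bytes: bytes, sample_rate: int, token: str):
    """Return (headers, body) for the lipsync request.

    NOTE: "audio" and "sample_rate" are hypothetical field names;
    the real names are not given in this section.
    """
    headers = {
        # Bearer authentication, as described under Authorizations.
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
        # The response is streamed as Server-Sent Events.
        "Accept": "text/event-stream",
    }
    body = json.dumps(
        {
            # Raw audio bytes must be base64 encoded before sending.
            "audio": base64.b64encode(audio_bytes).decode("ascii"),
            # Sample rate in Hz, for example 16000.
            "sample_rate": sample_rate,
        }
    )
    return headers, body
```

The returned headers and body can be passed to any HTTP client that supports streaming responses, using the POST method.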
Response
200
text/event-stream
A successful response streams viseme predictions as Server-Sent Events.
Each event contains viseme data for one processed audio chunk.
The stream has the following format:
- Each event starts with the "data: " prefix, followed by a JSON payload
- Example normal event: data: {"visemes":[...],"chunk_progress":"1/10","chunk_duration_ms":100}
- Example error event: error: {"message":"Error message","chunk_id":5}
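The event format above can be parsed line by line. The sketch below is a minimal Python example, assuming each event arrives on a single line and that the "data" and "error" prefixes shown in the examples are the only ones used. The helper names parse_sse_line, iter_events, and progress_fraction are hypothetical, and reading chunk_progress as "chunks done / total chunks" is an inference from the "1/10" example rather than something this page states.

```python
import json
from typing import Iterable, Iterator, Optional, Tuple

# Prefixes taken from the two example events in this section.
PREFIXES = ("data", "error")


def parse_sse_line(line: str) -> Optional[Tuple[str, dict]]:
    """Return (kind, payload) for a recognized line, else None."""
    line = line.strip()
    for kind in PREFIXES:
        prefix = kind + ":"
        if line.startswith(prefix):
            # Everything after the prefix is expected to be JSON.
            return kind, json.loads(line[len(prefix):].strip())
    return None  # blank separator lines and unknown fields are skipped


def iter_events(lines: Iterable[str]) -> Iterator[Tuple[str, dict]]:
    """Yield parsed events from an iterable of text lines."""
    for line in lines:
        parsed = parse_sse_line(line)
        if parsed is not None:
            yield parsed


def progress_fraction(chunk_progress: str) -> float:
    """Convert a string such as "1/10" into a fraction (assumed meaning)."""
    done, total = chunk_progress.split("/")
    return int(done) / int(total)
```

In practice, iter_events would be fed the decoded lines of the streaming HTTP response; "error" events can then be handled separately from "data" events.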