The Mascotbot avatar SDK turns speech into a real-time talking avatar. It is a small, composable, low-level surface (audio in → a serializable viseme timeline → a thin Rive playback layer), backed by the licensed model and asset delivery. It does not ship a call UI, a TTS engine, or provider glue; those are recipes you compose, not a framework you adopt.

How it works

1. Authorize

MascotProvider (or LipsyncClient.init) exchanges your API key with the edge worker, which returns a short-lived license and the WASM runtime. Sessions auto-refresh in the background.

2. Process speech

You hand the SDK 16 kHz mono audio: a recorded buffer, microphone windows, or a tapped MediaStream. The SDK produces one viseme id per 10 ms frame.

3. Animate

Visemes drive the mouth inputs of a Rive avatar through the playback engine. The SDK writes only the mouth, is_speaking, and stress; everything else on the Rive file stays yours.
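
The 16 kHz mono requirement in step 2 implies some preprocessing when the source is, say, 48 kHz stereo. The following is a minimal sketch of that arithmetic, assuming Float32 samples in the range [-1, 1] (the sample format the SDK accepts is not stated on this page). All helper names are hypothetical and none of them are part of the SDK.

```typescript
// Sketch only: plain TypeScript, no Mascotbot imports. All names are hypothetical.
const TARGET_RATE_HZ = 16000; // sample rate named in the text
const FRAME_MS = 10;          // one viseme id per 10 ms frame
const SAMPLES_PER_FRAME = (TARGET_RATE_HZ * FRAME_MS) / 1000; // 160 samples

/** Average all channels into a single mono channel. */
function downmixToMono(channels: Float32Array[]): Float32Array {
  if (channels.length === 0) return new Float32Array(0);
  const length = channels[0].length;
  const mono = new Float32Array(length);
  for (const channel of channels) {
    for (let i = 0; i < length; i++) {
      mono[i] += channel[i] / channels.length;
    }
  }
  return mono;
}

/** Linear-interpolation resampler: fine for a sketch, not a production-grade filter. */
function resampleLinear(input: Float32Array, fromRate: number, toRate: number): Float32Array {
  if (fromRate === toRate) return input.slice();
  const outLength = Math.floor((input.length * toRate) / fromRate);
  const out = new Float32Array(outLength);
  const step = fromRate / toRate;
  for (let i = 0; i < outLength; i++) {
    const pos = i * step;
    const left = Math.floor(pos);
    const right = Math.min(left + 1, input.length - 1);
    const frac = pos - left;
    out[i] = input[left] * (1 - frac) + input[right] * frac;
  }
  return out;
}

/** Number of complete 10 ms frames contained in a 16 kHz buffer. */
function wholeFrames(sampleCount: number): number {
  return Math.floor(sampleCount / SAMPLES_PER_FRAME);
}
```

At 16 kHz a 10 ms frame spans 160 samples, so one second of audio corresponds to 100 viseme frames. Linear interpolation is used here only for brevity; downsampling without a low-pass filter can alias, so a real pipeline would more likely lean on the browser's own resampling, for example an AudioContext or OfflineAudioContext created at the target sample rate.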

Packages

The SDK is two packages, each with a root and a /rive subpath. Import the narrowest one for your use case.

| Import | What it is |
| --- | --- |
| @mascotbot/core | Engine + offline VisemeTimeline + createPCMStreamPlayer. Framework-agnostic, no Rive, no React. |
| @mascotbot/core/rive | Framework-agnostic Rive playback (MascotPlayback, getRiveInputs, hasRiveInput). |
| @mascotbot/react | React provider + useMascot / useProcessAudio. |
| @mascotbot/react/rive | React Rive layer: <Mascot>, useMascotRive, useMascotInputs, useMascotPlayback, useLipsyncStream. |

Most React apps use @mascotbot/react + @mascotbot/react/rive. The /rive subpaths take @rive-app/webgl2 (and @rive-app/react-webgl2 for React) as an optional peer dependency; install it only if you render an avatar.

Three integration paths

Offline

Run inference once, persist the timeline as JSON, replay forever with zero reprocessing.

Microphone & streaming

Drive the avatar live from the user’s mic, a tapped MediaStream, or manually pushed audio.

Realtime AI

Connect OpenAI Realtime, Gemini Live, or ElevenLabs by tapping the assistant’s voice in real time.

All three end at the same place: a MascotPlayback instance driven by either a VisemeTimeline (offline) or a live audio source. There is no separate API to learn per path.
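
Because the offline path persists the timeline as JSON, replay reduces to a lookup by playback time. The sketch below illustrates that idea with a hypothetical `TimelineSketch` shape (one viseme id per 10 ms frame, as described earlier). The SDK's actual VisemeTimeline format and API are not shown on this page and may well differ.

```typescript
// Hypothetical shape for illustration only; this is NOT the SDK's VisemeTimeline.
interface TimelineSketch {
  frameMs: number;   // duration of one frame in milliseconds
  visemes: number[]; // one viseme id per frame
}

/** Persist once: the timeline is plain data, so JSON.stringify is enough. */
function serializeTimeline(timeline: TimelineSketch): string {
  return JSON.stringify(timeline);
}

/** Load and validate a previously stored timeline. */
function parseTimeline(json: string): TimelineSketch {
  const data = JSON.parse(json) as Partial<TimelineSketch> | null;
  if (data === null || typeof data !== "object") {
    throw new Error("timeline must be a JSON object");
  }
  const { frameMs, visemes } = data;
  if (typeof frameMs !== "number" || !(frameMs > 0)) {
    throw new Error("frameMs must be a positive number");
  }
  if (!Array.isArray(visemes) || !visemes.every((v) => Number.isInteger(v))) {
    throw new Error("visemes must be an array of integer ids");
  }
  return { frameMs, visemes };
}

/** Viseme id active at a playback time, or undefined outside the timeline. */
function visemeAt(timeline: TimelineSketch, timeMs: number): number | undefined {
  if (timeMs < 0) return undefined;
  return timeline.visemes[Math.floor(timeMs / timeline.frameMs)];
}
```

With this shape, replaying a stored clip needs no inference at all: on each animation tick the player reads the audio clock, converts it to milliseconds, and looks up the frame.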

What the SDK does and does not do

The SDK writes exactly three Rive input families: mouth visemes (100..118), is_speaking, and stress. Every other state-machine input, data-binding ViewModel, event, and listener on the Rive instance is yours, accessed directly on the raw rive object. The SDK never wraps, gates, or proxies it. This contract is detailed in Rive co-existence.

The SDK is intentionally minimal: audio in, animation out.

Upgrading an existing integration? The migration guide maps every change.
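
One practical consequence of this contract is that application code can check, before writing to an input, whether that input falls in the SDK's territory. The following is a minimal sketch, assuming (this page does not confirm it) that the mouth inputs are named by their numeric ids `100` to `118`. Both helper names are hypothetical.

```typescript
// Sketch only. The input naming scheme is an ASSUMPTION, not documented here.
const MOUTH_VISEME_MIN = 100;
const MOUTH_VISEME_MAX = 118;
const SDK_NAMED_INPUTS = ["is_speaking", "stress"];

/** True for integer ids inside the mouth viseme range 100..118 (19 ids). */
function isMouthVisemeId(id: number): boolean {
  return Number.isInteger(id) && id >= MOUTH_VISEME_MIN && id <= MOUTH_VISEME_MAX;
}

/** True if an input name belongs to one of the three SDK-written families. */
function isSdkOwnedInput(name: string): boolean {
  if (SDK_NAMED_INPUTS.indexOf(name) !== -1) return true;
  // Assumed convention: mouth inputs are named by their decimal id.
  return /^\d+$/.test(name) && isMouthVisemeId(Number(name));
}
```

A guard like this could sit in front of any app-level helper that sets Rive inputs by name, so that custom inputs (a blink trigger, an outfit selector) are written freely while the three SDK-written families are left alone.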

Browser support

  • Chrome / Edge — full.
  • Safari (desktop 17+, iOS 17+) — full; WebAssembly + WebGL2 required for the Rive avatar.
  • Firefox — audio pipeline supported; the Rive renderer requires WebGL2.

The SDK refuses to run if either WebAssembly or crypto.subtle is unavailable.
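
An application may want to run equivalent checks up front and show a fallback instead of letting initialization fail. The following is a minimal sketch using standard web platform feature detection. `ScopeLike`, `SupportReport`, and `detectSupport` are hypothetical names, and the SDK's own checks may differ.

```typescript
// Sketch only: the scope is passed in so the function also works outside a browser.
interface ScopeLike {
  WebAssembly?: unknown;
  crypto?: { subtle?: unknown };
  document?: { createElement(tag: string): unknown };
}

interface SupportReport {
  wasm: boolean;
  subtleCrypto: boolean;
  webgl2: boolean;
  canRunSdk: boolean;       // WebAssembly and crypto.subtle are both present
  canRenderAvatar: boolean; // additionally, a WebGL2 context can be created
}

/** Try to create a WebGL2 context on a throwaway canvas. */
function hasWebGL2(scope: ScopeLike): boolean {
  if (!scope.document) return false;
  const canvas = scope.document.createElement("canvas") as {
    getContext?: (kind: string) => unknown;
  };
  if (typeof canvas.getContext !== "function") return false;
  return canvas.getContext("webgl2") != null;
}

function detectSupport(scope: ScopeLike): SupportReport {
  const wasmObject = scope.WebAssembly;
  const subtle = scope.crypto?.subtle;
  const wasm = typeof wasmObject === "object" && wasmObject !== null;
  const subtleCrypto = typeof subtle === "object" && subtle !== null;
  const webgl2 = hasWebGL2(scope);
  const canRunSdk = wasm && subtleCrypto;
  return { wasm, subtleCrypto, webgl2, canRunSdk, canRenderAvatar: canRunSdk && webgl2 };
}
```

In a browser this could be called as `detectSupport(window)`. Note that crypto.subtle is only exposed in secure contexts (HTTPS or localhost), so a page served over plain HTTP would report `subtleCrypto: false` even in a fully supported browser.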

Next

Installation

Private registry, keys, peer deps.

Quickstart

A working avatar in a few lines.

Visemes & the timeline

The core data model.