Mascotbot Avatar SDK Overview - Build Interactive AI Avatars

The Mascotbot avatar SDK turns speech into a real-time talking avatar. It is a small, composable, low-level surface (audio in → a serializable viseme timeline → a thin Rive playback layer) backed by the licensed model and asset delivery. It does not ship a call UI, a TTS engine, or provider glue; those are recipes you compose, not a framework you adopt.

How it works

Authorize

MascotProvider (or LipsyncClient.init) exchanges your API key with the edge worker, which returns a short-lived license and the WASM runtime. Sessions auto-refresh in the background.

Process speech

You hand the SDK 16 kHz mono audio — a recorded buffer, microphone windows, or a tapped MediaStream. The SDK produces a viseme id per 10 ms frame.
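As a rough illustration of the input format, the sketch below downmixes stereo to mono and counts how many 10 ms frames a 16 kHz buffer holds (160 samples per frame). The function names are hypothetical and not part of the SDK. Whether the SDK pads or drops a trailing partial frame is not stated here, so this sketch counts complete frames only.

```typescript
// Constants taken from the text: 16 kHz mono input, one viseme per 10 ms frame.
const SAMPLE_RATE_HZ = 16_000;
const FRAME_MS = 10;
const SAMPLES_PER_FRAME = (SAMPLE_RATE_HZ * FRAME_MS) / 1000; // 160 samples

/** Average two equal-length channels into one mono channel (hypothetical helper). */
function downmixToMono(left: Float32Array, right: Float32Array): Float32Array {
  if (left.length !== right.length) {
    throw new Error("channels must have the same length");
  }
  const mono = new Float32Array(left.length);
  for (let i = 0; i < left.length; i++) {
    mono[i] = (left[i] + right[i]) / 2;
  }
  return mono;
}

/** Number of complete 10 ms frames in a 16 kHz mono buffer (hypothetical helper). */
function countFrames(mono: Float32Array): number {
  return Math.floor(mono.length / SAMPLES_PER_FRAME);
}
```

One second of 16 kHz audio is 16,000 samples, so it yields 100 frames and therefore 100 viseme ids.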

Animate

Visemes drive the mouth inputs of a Rive avatar through the playback engine. The SDK writes only the mouth inputs, is_speaking, and stress; everything else on the Rive file stays yours.
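To make the frame-to-mouth mapping concrete, here is a minimal sketch of the lookup a playback layer needs: given the playback clock in milliseconds, pick the viseme id for the current 10 ms frame. `visemeAt` and the flat array of ids are assumptions for illustration, not the SDK's `MascotPlayback` API.

```typescript
const FRAME_DURATION_MS = 10; // one viseme id per 10 ms frame, per the text

/**
 * Return the viseme id active at `timeMs`, or null outside the timeline.
 * `visemeIds` is a hypothetical flat array holding one id per frame.
 */
function visemeAt(visemeIds: readonly number[], timeMs: number): number | null {
  if (!(timeMs >= 0)) return null; // rejects negative values and NaN
  const index = Math.floor(timeMs / FRAME_DURATION_MS);
  return index < visemeIds.length ? visemeIds[index] : null;
}
```

Working in integer milliseconds rather than fractional seconds avoids floating-point rounding at frame boundaries.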

Packages

The SDK is two packages, each with a root and a /rive subpath. Import the narrowest one for your use case.

| Import | What it is |
| --- | --- |
| `@mascotbot/core` | Engine + offline `VisemeTimeline` + `createPCMStreamPlayer`. Framework-agnostic, no Rive, no React. |
| `@mascotbot/core/rive` | Framework-agnostic Rive playback (`MascotPlayback`, `getRiveInputs`, `hasRiveInput`). |
| `@mascotbot/react` | React provider + `useMascot` / `useProcessAudio`. |
| `@mascotbot/react/rive` | React Rive layer: `<Mascot>`, `useMascotRive`, `useMascotInputs`, `useMascotPlayback`, `useLipsyncStream`. |

Most React apps use @mascotbot/react + @mascotbot/react/rive. The /rive subpaths take @rive-app/webgl2 (and @rive-app/react-webgl2 for React) as an optional peer dependency — install it only if you render an avatar.

Three integration paths

Offline

Run inference once, persist the timeline as JSON, replay forever with zero reprocessing.
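The persist-and-replay idea can be sketched with plain JSON. The `SketchTimeline` shape below is hypothetical; the real `VisemeTimeline` format is not described on this page and may differ. The point is only that a timeline is plain data, so it survives a round trip through storage.

```typescript
// Hypothetical timeline shape for illustration; not the SDK's actual format.
interface SketchTimeline {
  frameMs: number; // duration of each frame in milliseconds
  visemeIds: number[]; // one viseme id per frame
}

/** Serialize a timeline to a JSON string suitable for a file or database. */
function serializeTimeline(timeline: SketchTimeline): string {
  return JSON.stringify(timeline);
}

/** Parse and validate a stored timeline; throws on malformed input. */
function parseTimeline(json: string): SketchTimeline {
  const data: unknown = JSON.parse(json);
  if (typeof data !== "object" || data === null) {
    throw new Error("timeline must be an object");
  }
  const { frameMs, visemeIds } = data as { frameMs?: unknown; visemeIds?: unknown };
  if (typeof frameMs !== "number" || !(frameMs > 0)) {
    throw new Error("frameMs must be a positive number");
  }
  if (!Array.isArray(visemeIds) || !visemeIds.every((v) => Number.isInteger(v))) {
    throw new Error("visemeIds must be an array of integers");
  }
  return { frameMs, visemeIds: visemeIds as number[] };
}
```

Validating on load keeps a corrupted or hand-edited file from reaching the playback layer.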

Microphone & streaming

Drive the avatar live from the user’s mic, a tapped MediaStream, or manually pushed audio.
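Live sources rarely deliver audio in frame-sized pieces, so a streaming integration typically buffers incoming chunks and re-slices them. The sketch below assumes 160-sample frames (10 ms at 16 kHz); `FrameChunker` is a hypothetical helper, and the SDK may handle this windowing for you.

```typescript
/** Re-slices arbitrarily sized audio chunks into fixed-size frames (hypothetical helper). */
class FrameChunker {
  private pending: Float32Array = new Float32Array(0);
  private readonly frameSize: number;

  constructor(frameSize: number = 160) {
    if (!Number.isInteger(frameSize) || frameSize <= 0) {
      throw new Error("frameSize must be a positive integer");
    }
    this.frameSize = frameSize;
  }

  /** Append a chunk and return every complete frame now available. */
  push(chunk: Float32Array): Float32Array[] {
    // Join leftover samples from the previous call with the new chunk.
    const merged = new Float32Array(this.pending.length + chunk.length);
    merged.set(this.pending, 0);
    merged.set(chunk, this.pending.length);

    const frames: Float32Array[] = [];
    let offset = 0;
    while (offset + this.frameSize <= merged.length) {
      frames.push(merged.slice(offset, offset + this.frameSize));
      offset += this.frameSize;
    }
    // Keep the incomplete tail for the next call.
    this.pending = merged.slice(offset);
    return frames;
  }
}
```

Pushing 100 samples yields no frames; pushing 100 more yields one frame and leaves 40 samples buffered.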

Realtime AI

Connect OpenAI Realtime, Gemini Live, or ElevenLabs by tapping the assistant’s voice in real time.
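Realtime voice providers often deliver audio in a format other than 16 kHz float samples. As a sketch, assuming the tapped audio arrives as 16-bit PCM at 24 kHz (check your provider's documentation for the real format), the conversion could look like this. Both function names are hypothetical, and the linear interpolation shown applies no anti-aliasing filter, so treat it as a starting point rather than production resampling.

```typescript
/** Convert signed 16-bit PCM to floats in [-1, 1). */
function pcm16ToFloat32(pcm: Int16Array): Float32Array {
  const out = new Float32Array(pcm.length);
  for (let i = 0; i < pcm.length; i++) {
    out[i] = pcm[i] / 32768;
  }
  return out;
}

/** Resample by linear interpolation between neighbouring input samples. */
function resampleLinear(input: Float32Array, fromHz: number, toHz: number): Float32Array {
  if (!(fromHz > 0) || !(toHz > 0)) {
    throw new Error("sample rates must be positive");
  }
  const outLength = Math.floor((input.length * toHz) / fromHz);
  const out = new Float32Array(outLength);
  const step = fromHz / toHz; // input samples advanced per output sample
  for (let i = 0; i < outLength; i++) {
    const pos = i * step;
    const i0 = Math.floor(pos);
    const i1 = Math.min(i0 + 1, input.length - 1); // clamp at the last sample
    const frac = pos - i0;
    out[i] = input[i0] * (1 - frac) + input[i1] * frac;
  }
  return out;
}
```

A typical call chain would be `resampleLinear(pcm16ToFloat32(chunk), 24000, 16000)`, which turns one second of 24 kHz input into 16,000 samples.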

All three end at the same place: a MascotPlayback instance driven by either a VisemeTimeline (offline) or a live audio source. There is no separate API to learn per path.

What the SDK does and does not do

The SDK writes exactly three Rive input families: mouth visemes (100..118), is_speaking, and stress. Every other state-machine input, data-binding ViewModel, event, and listener on the Rive instance is yours, accessed directly on the raw rive object. The SDK never wraps, gates, or proxies it. This contract is detailed in Rive co-existence.

The SDK is intentionally minimal: audio in, animation out.

Upgrading an existing integration? The migration guide maps every change.
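The ownership split can be expressed as a simple predicate. This sketch assumes, purely for illustration, that mouth inputs are identified by the numeric ids 100 through 118 (19 ids) written as strings; how the inputs are actually named in a Rive file is not specified here. `isSdkOwnedInput` and `partitionInputs` are hypothetical helpers, not SDK exports.

```typescript
const MOUTH_ID_MIN = 100;
const MOUTH_ID_MAX = 118;

/** True if the SDK would write this input, under the naming assumption above. */
function isSdkOwnedInput(name: string): boolean {
  if (name === "is_speaking" || name === "stress") return true;
  if (!/^\d+$/.test(name)) return false;
  const id = Number(name);
  return id >= MOUTH_ID_MIN && id <= MOUTH_ID_MAX;
}

/** Split input names into those the SDK writes and those left to the application. */
function partitionInputs(names: readonly string[]): { sdk: string[]; yours: string[] } {
  const sdk: string[] = [];
  const yours: string[] = [];
  for (const name of names) {
    (isSdkOwnedInput(name) ? sdk : yours).push(name);
  }
  return { sdk, yours };
}
```

A check like this is handy in development to confirm that your own state-machine inputs do not collide with the three families the SDK drives.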

Browser support

Chrome / Edge — full.
Safari (desktop 17+, iOS 17+) — full; WebAssembly + WebGL2 required for the Rive avatar.
Firefox — audio pipeline supported; the Rive renderer requires WebGL2.

The SDK refuses to run if either WebAssembly or crypto.subtle is unavailable.
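If you want to fail early with a friendly message, a pre-flight check along these lines may help. It is a sketch, not the SDK's own check: `missingFeatures` and `RuntimeScope` are hypothetical names, and the function takes the global scope as a parameter so it can be exercised outside a browser. Note that browsers expose `crypto.subtle` only in secure contexts (HTTPS or localhost).

```typescript
// Minimal view of the global scope needed for the check (hypothetical type).
interface RuntimeScope {
  WebAssembly?: unknown;
  crypto?: { subtle?: unknown };
}

/** Return the names of required features that the given scope lacks. */
function missingFeatures(scope: RuntimeScope): string[] {
  const missing: string[] = [];
  if (typeof scope.WebAssembly !== "object" || scope.WebAssembly === null) {
    missing.push("WebAssembly");
  }
  if (!scope.crypto || !scope.crypto.subtle) {
    missing.push("crypto.subtle");
  }
  return missing;
}
```

In a browser you would pass the global object and show a fallback UI when the returned list is non-empty.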

Next

Installation

Private registry, keys, peer deps.

Quickstart

A working avatar in a few lines.

Visemes & the timeline

The core data model.
