Troubleshooting
Common integration issues with @mascotbot/react 0.2.x and their
fixes. If your symptom is a license refusal, match the error.code against
the error-code reference first.
Install fails with 401 / 403 from npm.mascot.bot
The private registry needs a valid token in .npmrc.
403 wrong_key_scope means the token is the wrong kind for the registry —
mint one at app.mascot.bot/api-keys. Make
sure the .npmrc is at the project root and the token has no trailing
newline.
Status never reaches ready
Read status and error from useMascot():
- status === "refused" → an authorization problem. Branch on error.code (dev_key_on_public_domain, prod_key_on_localhost, origin_not_allowed, key_disabled, …) and show the matching fix. See Licensing & keys.
- status === "error" with a NetworkError → the device cannot reach license.mascot.bot. Check connectivity, ad blockers, and corporate proxies.
- Stuck on initializing with no error → WebAssembly or crypto.subtle is unavailable (old browser, insecure context). The SDK requires a secure context (HTTPS or localhost).
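The three branches above can be folded into one triage helper. This is a minimal sketch, not SDK code: the MascotErrorLike shape, the error.name === "NetworkError" test, and the detectEnv helper are assumptions made for illustration. Only the status values and error codes named above come from this page.

```typescript
// Hypothetical shapes: only the fields this sketch reads. The SDK's real types may differ.
interface MascotErrorLike {
  code?: string; // e.g. "origin_not_allowed" when status is "refused"
  name?: string; // assumed to be "NetworkError" for connectivity failures
}

interface EnvCheck {
  secureContext: boolean;
  hasWasm: boolean;
  hasSubtle: boolean;
}

type Cause = "ok" | "authorization" | "network" | "environment" | "pending" | "unknown";

interface Diagnosis {
  cause: Cause;
  detail: string;
}

/** Reads the required capabilities from standard browser globals. */
function detectEnv(): EnvCheck {
  const g = globalThis as unknown as {
    isSecureContext?: boolean;
    WebAssembly?: unknown;
    crypto?: { subtle?: unknown };
  };
  return {
    secureContext: g.isSecureContext === true,
    hasWasm: typeof g.WebAssembly === "object",
    hasSubtle: g.crypto?.subtle !== undefined,
  };
}

/** Maps (status, error) to the troubleshooting branch it belongs to. */
function triage(status: string, error: MascotErrorLike | null, env: EnvCheck): Diagnosis {
  if (status === "ready") return { cause: "ok", detail: "SDK is ready" };
  if (status === "refused") {
    return { cause: "authorization", detail: `license refused: ${error?.code ?? "no code"}` };
  }
  if (status === "error" && error?.name === "NetworkError") {
    return { cause: "network", detail: "cannot reach license.mascot.bot" };
  }
  if (status === "initializing" && error === null) {
    // Collect whichever prerequisites are absent in this environment.
    const missing = [
      env.secureContext ? null : "secure context (HTTPS or localhost)",
      env.hasWasm ? null : "WebAssembly",
      env.hasSubtle ? null : "crypto.subtle",
    ].filter((m): m is string => m !== null);
    return missing.length > 0
      ? { cause: "environment", detail: `missing: ${missing.join(", ")}` }
      : { cause: "pending", detail: "still initializing" };
  }
  return { cause: "unknown", detail: `status=${status}` };
}
```

In a component you would call triage(status, error, detectEnv()) with the values read from useMascot() and render the detail string next to the avatar while debugging.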
Blank canvas — the avatar never renders
This is almost always the Rive state machine name. Pass only mascotStateMachine
(or STATE_MACHINE_NAMES[0]) to Rive. Rive 2.37+ throws on any unknown
state-machine name in the array; the throw fires LoadError, suppresses
Load, and leaves the canvas blank. Also verify the artboard is named
Character and the file exposes mouth inputs 100–118.
Mouth frozen during an active call
The SDK does not freeze on parent re-renders: Rive input handles are referentially stable and playback is carried across any internal recreate. A frozen mouth during a call is almost always one of:
- Unstable naturalLipSyncConfig: a new object literal every render reinitializes the natural-lipsync processor. Pass a module constant or a useState/useMemo reference.
- Audio is not reaching the tap: pass onFrame to useLipsyncStream and log silenceDetected / emittedVisemeId. Rising emitted IDs with a dead mouth means audio is reaching the engine but the Rive handle isn’t being written; a flat zero / silenceDetected: true means the tap is on a silent corpse (typical of a self-playing realtime provider torn down and not re-attached; see ElevenLabs 2nd-call diagnostic).
- Wrong source shape: { kind: "mediaStream", stream } where stream is null or has only ended tracks.
Call disconnects immediately after onConnect (reason: 'user')
Symptom: Conversation.startSession({ ... }) resolves, your onConnect
fires, the agent’s first_message may even reach onMessage, then
onStatusChange flips to disconnecting → disconnected and
onDisconnect runs with details === { reason: 'user' }. There’s no
network error, no onError, no server-side disconnect — your own code
called endSession().
The usual cause is an unmount-cleanup effect whose dep array contains a
teardown callback whose identity flips on every render. A setter in that
teardown’s dependency chain (setSomeMascotInput, say) typically traces back
to handles returned by useMascotInputs(), which intentionally returns a
fresh { custom, has } object per render (see Rive co-existence).
Each fresh handle → new useCallback chain → new teardown identity →
the cleanup runs, which calls endSession(), which surfaces as a
'user' disconnect right after onConnect.
Fix: stabilise the unmount cleanup with a ref so it runs once on
unmount only, while always invoking the latest teardown closure. Refresh
the ref with the current teardown on every render, and call it from an
effect whose dep array is empty.
The cleanup then no longer re-runs when teardown’s identity
changes; the ref always points at the latest closure when the component
finally unmounts. Apply the same shape to any other long-lived cleanup
that depends on hook handles which are fresh-per-render
(useMascotInputs, useMascotRive; see also the
ElevenLabs onModeChange recipe,
which captures custom in a ref for the same reason).
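The difference between the two cleanup shapes can be seen without React. The sketch below is a framework-free simulation, not SDK or React code: EffectSlot is a hypothetical stand-in that mimics how an effect re-runs (cleanup first) whenever a dep changes identity, and simulate is an illustrative driver.

```typescript
type Cleanup = () => void;
type Effect = () => Cleanup;

/** Stand-in for one effect slot: re-runs the effect, cleanup first, when a dep changes identity. */
class EffectSlot {
  private deps: readonly unknown[] | undefined;
  private cleanup: Cleanup | undefined;

  render(effect: Effect, deps: readonly unknown[]): void {
    const prev = this.deps;
    const changed =
      prev === undefined ||
      prev.length !== deps.length ||
      deps.some((dep, i) => !Object.is(dep, prev[i]));
    if (!changed) return;
    this.cleanup?.(); // this is the call that ends the session mid-call
    this.cleanup = effect();
    this.deps = deps;
  }

  unmount(): void {
    this.cleanup?.();
    this.cleanup = undefined;
  }
}

/** Renders a fake component `renders` times, then unmounts. Returns the endSession() call log. */
function simulate(renders: number, stable: boolean): string[] {
  const log: string[] = [];
  const slot = new EffectSlot();
  const ref: { current: Cleanup } = { current: () => {} };

  for (let n = 1; n <= renders; n++) {
    // A fresh closure per render, like a callback whose deps changed.
    const teardown: Cleanup = () => {
      log.push(`endSession (closure from render ${n})`);
    };
    if (stable) {
      ref.current = teardown; // refresh the ref on every render
      slot.render(() => () => ref.current(), []); // empty deps: cleanup runs on unmount only
    } else {
      slot.render(() => () => teardown(), [teardown]); // dep flips: cleanup runs on each render
    }
  }
  slot.unmount();
  return log;
}

const buggy = simulate(3, false); // three endSession calls, two of them mid-call
const fixed = simulate(3, true); // one endSession call, using the latest closure
```

In a real component the same shape is a useRef that is reassigned on every render plus a useEffect with an empty dep array whose cleanup calls ref.current().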
How to diagnose: temporarily log the disconnect detail and any frames so
you can distinguish a self-end from an agent/server end:
reason: 'user' is your code; reason: 'agent' is the agent stopping
the call; any onError first means a server-side problem.
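A small logging helper makes that distinction visible in the console. It is a sketch under assumptions: the DisconnectDetails shape is inferred from the { reason: 'user' } value quoted above, the handler signatures are guesses, and createCallDiagnostics is a hypothetical name.

```typescript
// Assumed shape, based on the `{ reason: 'user' }` detail described in the text.
interface DisconnectDetails {
  reason: string;
}

type EndedBy = "your code" | "agent" | "server" | "unknown";

/** Builds onError / onDisconnect handlers that log and classify how a call ended. */
function createCallDiagnostics(log: (line: string) => void = console.log) {
  let sawError = false;
  return {
    onError(message: unknown): void {
      sawError = true;
      log(`[call] onError: ${String(message)}`);
    },
    onDisconnect(details: DisconnectDetails): EndedBy {
      // An error before the disconnect points at the server side, whatever the reason says.
      const endedBy: EndedBy = sawError
        ? "server"
        : details.reason === "user"
          ? "your code"
          : details.reason === "agent"
            ? "agent"
            : "unknown";
      log(`[call] onDisconnect reason=${details.reason} -> ended by ${endedBy}`);
      sawError = false; // reset for the next call
      return endedBy;
    },
  };
}
```

Wire the two handlers into the callbacks you already pass when starting the session, and remove them once the self-end is found.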
Mouth flickers when speech stops
This is handled by the SDK’s internal −50 dBFS silence gate, so do not add your own gate. If you still see phantom shapes at end of utterance, you are likely feeding a self-playing realtime provider through createPCMStreamPlayer (double audio / doubled inference). Tap the provider’s
own output instead — see Realtime providers.
Both the SDK and the provider play audio (double voice)
createPCMStreamPlayer is only for providers that hand you raw PCM and do
not play it (Gemini Live, OpenAI Realtime over WebSocket). For self-playing
providers (ElevenLabs, OpenAI Realtime over WebRTC), do not use the player —
tap their existing playback with the SDK’s cross-browser
createElementTap() and feed that
to useLipsyncStream({ source: { kind: "mediaStream", stream } }).
No audio in a realtime/TTS demo
The AudioContext (and createPCMStreamPlayer) must be created inside the
user-gesture handler, before any await. A context created in a post-fetch
microtask starts suspended and cannot resume without another gesture. Create
or resume() the player synchronously at the top of the click handler.
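The ordering rule can be sketched as a click-handler factory. This is illustrative only: AudioContextLike is a structural subset of the browser AudioContext (state and resume() are standard Web Audio members), while makeStartHandler and the injected fetchSession are hypothetical names.

```typescript
// Structural subset of the standard AudioContext, so the sketch also runs outside a browser.
interface AudioContextLike {
  readonly state: string;
  resume(): Promise<void>;
}

/**
 * Returns a click handler that touches audio synchronously, before the first await,
 * and only then starts network work.
 */
function makeStartHandler<S>(
  getContext: () => AudioContextLike,
  fetchSession: () => Promise<S>,
): () => Promise<{ ctx: AudioContextLike; session: S }> {
  return async () => {
    // 1. Still inside the user gesture: create (or reuse) the context.
    const ctx = getContext();
    // 2. Still inside the gesture: start resume() if the context is suspended.
    const resumed = ctx.state === "suspended" ? ctx.resume() : Promise.resolve();
    // 3. Only now yield to the event loop.
    const session = await fetchSession();
    await resumed;
    return { ctx, session };
  };
}
```

In the browser, getContext would be something like () => new AudioContext(); if you use createPCMStreamPlayer, create the player at the same point, before step 3.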
ElevenLabs 2nd call has no lip sync (1st call worked)
Symptom: an ElevenLabs widget animates the mouth on the very first call, you end it cleanly, then start a new call: voice plays, the console is clean, but the mouth is frozen for the entire second call. You’re using the window.Audio patch + <audio> poll pattern from
the ElevenLabs avatar guide (the
cross-browser tap approach). The bug class:
- The patch stashes a reference to the <audio> element ElevenLabs constructs (e.g. w.__el = el) so a 100 ms poll can tap.attach() it once it’s wired up.
- On call-end, endSession() stops the conversation but the stashed reference and the srcObject MediaStream both remain on window. The MediaStream’s audio tracks transition to readyState: 'ended', but el.srcObject instanceof MediaStream is still true.
- On call #2, the poll runs almost immediately, typically before ElevenLabs has called new Audio() again. The naive check el && el.srcObject instanceof MediaStream accepts the stale reference, tap.attach() lands on a silent corpse, and zero audio reaches the new tap.
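A liveness check that rejects the stale element from the steps above could look like the sketch below. The interfaces are structural stand-ins for HTMLAudioElement, MediaStream, and MediaStreamTrack (getAudioTracks() and readyState are standard members); the __el stash name follows the example in the list, and pickAttachable and clearStash are hypothetical helper names.

```typescript
// Structural stand-ins so the check can be exercised outside a browser.
interface TrackLike {
  readonly readyState: "live" | "ended";
}
interface StreamLike {
  getAudioTracks(): TrackLike[];
}
interface AudioElementLike {
  srcObject: unknown;
}
interface StashWindow {
  __el?: AudioElementLike | null;
}

function isStreamLike(value: unknown): value is StreamLike {
  return (
    typeof value === "object" &&
    value !== null &&
    typeof (value as StreamLike).getAudioTracks === "function"
  );
}

/** True only when the element's stream still has at least one live audio track. */
function isLive(el: AudioElementLike | null | undefined): boolean {
  const source = el?.srcObject;
  if (!isStreamLike(source)) return false;
  return source.getAudioTracks().some((track) => track.readyState === "live");
}

/** Poll body: returns the stashed element only if it is safe to attach a tap to. */
function pickAttachable(w: StashWindow): AudioElementLike | null {
  const el = w.__el ?? null;
  return el !== null && isLive(el) ? el : null;
}

/** Teardown: drop the stashed reference so the next call starts clean. */
function clearStash(w: StashWindow): void {
  w.__el = null;
}
```

The poll would call pickAttachable on each tick and attach the tap only when it returns an element; clearStash belongs in the call-end path.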
The isLive check (attach only when the stashed element’s stream still has
an audio track with readyState === 'live') is the real defense: even if you
forget the null-out in teardown, no element whose tracks have ended will
ever be attached. The null-out is belt-and-suspenders that also avoids one
wasteful poll iteration.
One widget’s lip sync is fast/garbled after another widget ran
Symptom: widget A (e.g. a Gemini call) works; you end it and start widget B (e.g. an ElevenLabs widget) on the same page, and B’s mouth runs at ~2× / flickers. B alone, or B-then-A, is fine.
The whole page shares one <MascotProvider> → one
LipsyncClient. useLipsyncStream’s mediaStream pipeline is
keyed on the stream’s identity and tears down (closes its
AudioContext + worklet + streaming session) only when that stream
reference changes. If, on call-end, you only player.stop() but keep
the same player.outputStream in state, the pipeline never tears down —
it lingers on the shared client. Widget B then opens a second
inference pipeline on the same client and the two corrupt each other’s
pacing.
Fully release the pipeline on every call-end path (stop,
onclose, error), symmetric with however you created it.
createPCMStreamPlayer().stop() only drops queued audio (barge-in);
.close() releases the context. Self-playing taps must likewise stop
polling and setStream(null). (Switching the avatar by unmounting the
<Mascot> subtree tears down implicitly — this bug only surfaces
when a call ends without unmounting.)
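One way to keep every call-end path symmetric is a single release function shared by all of them. This is a sketch: PcmPlayerLike lists only the stop() and close() methods mentioned above, setStream stands for whatever state setter holds the stream you pass to useLipsyncStream, and makeRelease plus the run-once guard are illustrative choices rather than SDK behaviour.

```typescript
// Only the two player methods the text names; the real player may expose more.
interface PcmPlayerLike {
  stop(): void;
  close(): void | Promise<void>;
}

interface ReleaseOptions {
  player?: PcmPlayerLike | null; // PCM providers only
  stopPolling?: () => void; // self-playing taps only
  setStream: (stream: null) => void; // clears the stream reference held in state
}

/** Builds a release function that is safe to call from stop, onclose, and error alike. */
function makeRelease(options: ReleaseOptions): () => void {
  let released = false;
  return () => {
    if (released) return; // several end paths can fire for the same call
    released = true;
    options.stopPolling?.();
    options.player?.stop(); // drop queued audio
    void options.player?.close(); // release the audio context
    options.setStream(null); // changing the stream reference lets the pipeline tear down
  };
}
```

Create one release function per call and invoke it from every handler that can end that call.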
Avatar customizations (gender, colors, outline) don’t apply
Symptom: you set useMascotInputs().custom.gender.value = … once (e.g.
in a mount effect) and the avatar still shows defaults.
Custom inputs are no-op shims until Rive has bound the real
state-machine handles, which happens asynchronously after load. A
single early write lands on a shim and is lost; the state machine then
settles into its default pose and never re-evaluates.
Consume raw useMascotInputs() (its custom/has are a fresh
object every render, so do not freeze them in a memo for this),
gate the write on has(...), and re-assert every render. The
re-application is idempotent and load-bearing; a one-shot write is the
bug.
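The gate-and-re-assert rule can be captured in a small helper. The shapes here are assumptions: MascotInputsLike models only the custom and has members named above, has is assumed to take an input name, and applyCustomInputs is a hypothetical helper, not part of the SDK.

```typescript
// Assumed shapes: a handle with a writable `value`, and `has(name)` reporting whether
// the real state-machine input is bound yet.
interface InputHandle {
  value: unknown;
}
interface MascotInputsLike {
  custom: Record<string, InputHandle | undefined>;
  has(name: string): boolean;
}

/**
 * Writes every desired value whose input is bound, and returns the names written.
 * Unbound inputs are skipped so the caller retries on the next render.
 */
function applyCustomInputs(
  inputs: MascotInputsLike,
  desired: Readonly<Record<string, unknown>>,
): string[] {
  const written: string[] = [];
  for (const [name, value] of Object.entries(desired)) {
    const handle = inputs.custom[name];
    if (handle === undefined || !inputs.has(name)) continue; // still a shim
    handle.value = value; // idempotent: re-asserting the same value is harmless
    written.push(name);
  }
  return written;
}
```

Call it with the raw useMascotInputs() result on every render (or in an effect with no dep array) rather than once on mount.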
Avatar is hidden behind a section background
<MascotRive> renders a position: relative element with no
z-index. A positioned background sibling (z-5, an absolute image,
etc.) will paint over it. Wrap only <MascotRive> in a low positive
z — never a wrapper that also contains your call controls, or that
wrapper becomes a stacking context and traps the controls under a
sibling gradient.
Next.js Pages Router: “Named export not found”
Pages Router has stricter module resolution. Transpile the package, then clear the build cache with rm -rf .next.
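A minimal config sketch, assuming a Next.js version where transpilePackages is a built-in option (13.1 or later). The file name next.config.ts assumes a release that accepts TypeScript config files; on older versions put the same object in next.config.js.

```typescript
// next.config.ts
// Ask Next.js to compile the SDK package instead of consuming its published build as-is.
const nextConfig = {
  transpilePackages: ["@mascotbot/react"],
};

export default nextConfig;
```

After changing the config, delete the .next directory and restart the dev server so the cached build is not reused.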
CSP blocks the audio worklet
The worklet is served from a Blob URL by default. If your Content Security Policy forbids worker-src blob:, either allow it or host the worklet
yourself and pass its URL via workletUrl on useLipsyncStream.
Still stuck?
Compare against the reference integration in apps/lipsync-demo (single file, no design-system deps), or
email support@mascot.bot. See also the
migration guide.