Animation Flow System - Flexible Intro/Loop/Outro
Overview
The animation system now supports flexible combinations of intro, loop, and outro sections. Each animation can define any combination of these sections, and the system selects the playback flow based on which sections are present.
The legacy play_once flag is still supported but works differently depending on the animation structure.
Animation Structure
Each animation can have up to three sections defined in its .fbx.json descriptor:
{
"intro": {
"start_frame": 0,
"end_frame": 20
},
"loop": {
"start_frame": 21,
"end_frame": 120
},
"outro": {
"start_frame": 121,
"end_frame": 160
}
}
Sections are optional and can be defined in any combination.
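A descriptor can be sanity-checked before use. The following is a minimal sketch, not the project's own validation: the function name validate_sections is hypothetical, and the rules that ranges must be non-negative, ordered, and non-overlapping are assumptions drawn from the examples in this document rather than stated requirements.

```python
from typing import Any

SECTION_NAMES = ("intro", "loop", "outro")


def validate_sections(descriptor: dict[str, Any]) -> list[str]:
    """Return human-readable problems; an empty list means no issue was found."""
    problems: list[str] = []
    previous_end = -1
    for name in SECTION_NAMES:
        section = descriptor.get(name)
        if section is None:
            continue  # every section is optional
        start = section.get("start_frame")
        end = section.get("end_frame")
        if not isinstance(start, int) or not isinstance(end, int):
            problems.append(f"{name}: start_frame and end_frame must be integers")
            continue
        if start < 0 or end < start:
            problems.append(f"{name}: invalid range {start}-{end}")
        # Assumption: sections appear in intro, loop, outro order without overlap.
        if start <= previous_end:
            problems.append(f"{name}: overlaps the preceding section")
        previous_end = max(previous_end, end)
    return problems
```

An empty result for the three-section descriptor above indicates that the ranges are consistent under these assumed rules.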
Play Once Flag Behavior
The play_once flag interacts differently with animation structures:
Case 1: play_once + intro/outro (CONFLICT)
{
"play_once": true,
"intro": {"start_frame": 0, "end_frame": 20},
"outro": {"start_frame": 121, "end_frame": 160}
}
Behavior:
- play_once flag is IGNORED (structured sections take precedence)
- A warning is logged explaining the conflict
- Animation executes its intro → outro flow normally
- Rationale: intro/outro define a complete structured flow; play_once is redundant
Case 2: play_once + loop only (COMPATIBLE)
{
"play_once": true,
"loop": {"start_frame": 0, "end_frame": 100}
}
Behavior:
- Loop section plays once only (not repeated)
- No looping occurs
- Useful for isolating a portion of animation to play once
- Rationale: loop defines frame range to use; play_once restricts to single playback
Supported Combinations
Full Animation (intro + loop + outro)
Playback flow:
START → [INTRO] → [LOOP (repeat)] → STOP command → [OUTRO] → IDLE
Example: Thinking animation starts with intro frames, loops the thinking motion, and ends with outro frames before returning to idle.
Loop with Outro (loop + outro)
Playback flow:
START → [LOOP (repeat)] → STOP command → [OUTRO] → IDLE
Example: A repeating animation that has a graceful ending sequence.
Intro + Loop (no outro)
Playback flow:
START → [INTRO] → [LOOP (repeat)] → STOP command → IDLE
Example: Animation starts with intro but stops immediately without outro.
Loop Only (no intro, no outro)
Playback flow:
START → [LOOP (repeat)] → STOP command → IDLE
Special case with play_once:
{ "play_once": true, "loop": {"start_frame": 30, "end_frame": 90} }
Behavior: Plays loop section once only (doesn’t repeat). Useful for extracting a portion of animation and playing it as a one-shot.
Intro + Outro (no loop)
Playback flow:
START → [INTRO] → [OUTRO] → IDLE
Example: One-shot animation with setup and teardown.
Intro Only (no loop, no outro)
Playback flow:
START → [INTRO] → IDLE
Example: Quick animation that plays once and stops.
Solo Sections
Loop only: Repeating animation, stops immediately
Outro only: Ending animation (unusual)
Intro only: One-shot animation
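The combinations above can be summarized in one function. This is an illustrative sketch only: playback_flow is a hypothetical helper, and the "LOOP (once)" step for the loop-only play_once case is an assumption about how that flow would be written out, since the document does not spell it out.

```python
def playback_flow(has_intro: bool, has_loop: bool, has_outro: bool,
                  play_once: bool = False) -> list[str]:
    """Return the ordered playback steps for a given section combination."""
    structured = has_intro or has_outro  # play_once is ignored when structured
    flow = ["START"]
    if has_intro:
        flow.append("INTRO")
    if has_loop:
        if play_once and not structured:
            flow.append("LOOP (once)")  # assumed label for the one-shot case
        else:
            flow += ["LOOP (repeat)", "STOP command"]
    if has_outro:
        flow.append("OUTRO")
    flow.append("IDLE")
    return flow


if __name__ == "__main__":
    print(" → ".join(playback_flow(True, True, True)))
```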
Implementation Details
Animation Analysis
The handler includes an _analyze_animation_structure() method that detects which
sections are present in a descriptor:
structure = handler._analyze_animation_structure(descriptor, animation_file)
# Returns: {
# "has_intro": bool,
# "has_loop": bool,
# "has_outro": bool
# }
This analysis also validates the play_once flag and logs warnings if conflicts are detected.
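For illustration, a standalone function with the same return shape might look like the sketch below. It is not the handler's actual code: the logger name, the warning text, and the treatment of a missing descriptor as empty are all assumptions.

```python
import logging
from typing import Any, Optional

logger = logging.getLogger("animation_flow_sketch")  # hypothetical logger name


def analyze_animation_structure(descriptor: Optional[dict[str, Any]],
                                animation_file: str) -> dict[str, bool]:
    """Detect which sections a descriptor defines and warn on play_once conflicts."""
    descriptor = descriptor or {}  # assumption: a missing descriptor has no sections
    structure = {
        f"has_{name}": isinstance(descriptor.get(name), dict)
        for name in ("intro", "loop", "outro")
    }
    structured = structure["has_intro"] or structure["has_outro"]
    if descriptor.get("play_once") and structured:
        logger.warning(
            "%s: play_once is ignored because intro/outro sections are defined",
            animation_file,
        )
    return structure
```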
Play Animation Logic
When play_animation() is called:
- Load descriptor and analyze structure
- Determine effective loop behavior:
  - If has intro/outro (structured):
    - If also has loop → loop=True
    - Else → loop=False (play once through structure)
    - play_once flag is ignored with warning
  - Else if only loop + play_once flag:
    - loop=False (plays once only, doesn’t repeat)
  - Else if only loop:
    - loop=True (repeats normally)
  - Else:
    - Use provided loop parameter
- Send animation command with descriptor to WebUI
- WebUI uses frame ranges to play correct sections
- No rotation task started for structured animations
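The loop decision listed above can be expressed compactly. This sketch assumes a structure dictionary in the shape returned by the analysis step; effective_loop is a hypothetical name. It follows the listed branches literally, so for a descriptor with no sections it returns the provided loop parameter and does not model the legacy play_once handling for unstructured animations.

```python
def effective_loop(structure: dict[str, bool], play_once: bool,
                   requested_loop: bool) -> bool:
    """Decide whether the animation command should be sent with loop=True."""
    if structure["has_intro"] or structure["has_outro"]:
        # Structured animation: play_once is ignored; loop only if a loop section exists.
        return structure["has_loop"]
    if structure["has_loop"]:
        # Loop-only animation: play_once restricts it to a single pass.
        return not play_once
    # No sections at all: fall back to the caller's loop parameter.
    return requested_loop
```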
Stop Animation Logic
When stop_animation() is called:
- Check if animation has an outro section
- If has outro:
  - Send animation command to play outro
  - Calculate duration based on frame count (approx 30fps)
  - Wait for outro to complete
  - Then transition to Idle
- If no outro:
  - Immediately transition to Idle
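The stop sequence can be sketched with asyncio. Several details here are assumptions rather than the handler's real behavior: the frame count is treated as inclusive (end_frame - start_frame + 1), the callbacks send_command and set_idle are hypothetical stand-ins for the WebUI transport and the idle transition, and the "phase" field in the outro command is illustrative.

```python
import asyncio
from typing import Any, Awaitable, Callable

ASSUMED_FPS = 30.0  # the document states "approx 30fps"


def outro_duration_seconds(descriptor: dict[str, Any], fps: float = ASSUMED_FPS) -> float:
    """Estimate how long the outro takes; 0.0 when no outro is defined."""
    outro = descriptor.get("outro")
    if not outro:
        return 0.0
    # Assumption: both start_frame and end_frame are played (inclusive range).
    frame_count = outro["end_frame"] - outro["start_frame"] + 1
    return max(frame_count, 0) / fps


async def stop_animation(descriptor: dict[str, Any],
                         send_command: Callable[[dict[str, Any]], Awaitable[None]],
                         set_idle: Callable[[], Awaitable[None]]) -> None:
    """Play the outro if there is one, wait for it, then go idle."""
    duration = outro_duration_seconds(descriptor)
    if duration > 0:
        await send_command({"type": "animation", "phase": "outro", "descriptor": descriptor})
        await asyncio.sleep(duration)
    await set_idle()
```

Under these assumptions, an outro spanning frames 121 to 160 counts 40 frames, which at 30 fps is about 1.33 seconds.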
WebUI Integration
The WebUI receives animation commands with this structure:
{
"type": "animation",
"animation": "/skins/Rei/animations/Thinking.fbx",
"loop": true,
"state": "think",
"descriptor": {
"intro": {"start_frame": 0, "end_frame": 20},
"loop": {"start_frame": 21, "end_frame": 120},
"outro": {"start_frame": 121, "end_frame": 160}
}
}
The WebUI uses this information to:
- Play specific frame ranges
- Handle looping logic for the loop section
- Prepare outro frames for graceful stopping
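A command with this shape could be assembled as in the sketch below. The helper name build_animation_command is hypothetical, and omitting the descriptor key when no descriptor exists is an assumption; the backend may handle that case differently.

```python
import json
from typing import Any, Optional


def build_animation_command(animation: str, state: str, loop: bool,
                            descriptor: Optional[dict[str, Any]] = None) -> dict[str, Any]:
    """Assemble the payload fields shown in the example above."""
    command: dict[str, Any] = {
        "type": "animation",
        "animation": animation,
        "loop": loop,
        "state": state,
    }
    if descriptor is not None:
        command["descriptor"] = dict(descriptor)  # shallow copy; caller's dict stays untouched
    return command


if __name__ == "__main__":
    payload = build_animation_command(
        "/skins/Rei/animations/Thinking.fbx", "think", True,
        {"loop": {"start_frame": 21, "end_frame": 120}},
    )
    print(json.dumps(payload, indent=2))
```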
Optional animation_state payload (facial state)
Alongside type: "animation" commands, the backend may include an optional rich animation_state
object. This is backward compatible: if absent, the WebUI behaves as before.
animation_state is designed for a hybrid approach:
- The backend can provide suggestions (descriptor expressions/blink/eye_movement) and the current emotion snapshot (emotions).
- The WebUI applies facial changes by resolving logical keys via the active skin’s persona.json (blendshape_map), so different skins can map emotions/visemes/blendshapes differently.
Minimal example (shortened):
{
"type": "animation",
"state": "write",
"animation": "/skins/Rei/animations/Write/Texting.fbx",
"descriptor": { "loop": {"start_frame": 0, "end_frame": 120} },
"animation_state": {
"action": "write",
"phase": "loop",
"animation": "/skins/Rei/animations/Write/Texting.fbx",
"descriptor": { "loop": {"start_frame": 0, "end_frame": 120} },
"clip": { "name": "Texting", "duration": 4.0, "fps": 30 },
"timing": { "started_at": "2025-12-20T20:00:00Z", "time_in_clip": 0.0, "current_frame": 0 },
"expressions": [],
"blink": { "auto": true, "rate_s": 3.5, "intensity": 0.6 },
"eye_movement": { "auto": true, "saccade_rate_s": 2 },
"emotions": { "dominant": "happy", "values": { "happy": 7.5, "calm": 5.2 } },
"lipsync": false
}
}
Notes:
- lipsync is a boolean consent flag only (default: false when not present).
- expressions.targets uses logical keys; the WebUI resolves them using the skin mapping.
Emotion overlay (client-side)
When animation_state.emotions is present, the WebUI may apply a short “emotion overlay” facial pose:
- pick the strongest emotion from emotions.values (ties are broken randomly)
- wait a small random delay after action start
- apply the corresponding face for a random duration that scales with the emotion intensity
This is intentionally not tied to WRITING specifically, because plugins may override or bypass the
writing phase/action. The mapping is done through persona.json under the new emotions mapping.
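The selection logic can be sketched as follows. The real overlay runs client-side in the WebUI; this Python version only illustrates the steps. The function name pick_overlay and every timing constant (maximum delay, base duration, seconds per intensity point) are hypothetical values chosen for the example.

```python
import random
from typing import Optional


def pick_overlay(emotion_values: dict[str, float],
                 rng: Optional[random.Random] = None,
                 max_delay_s: float = 1.0,
                 base_duration_s: float = 0.5,
                 seconds_per_intensity: float = 0.2) -> Optional[dict]:
    """Choose the strongest emotion plus a random delay and duration."""
    if not emotion_values:
        return None
    rng = rng or random.Random()
    top = max(emotion_values.values())
    # Ties are broken randomly among all emotions sharing the top value.
    candidates = sorted(name for name, value in emotion_values.items() if value == top)
    emotion = rng.choice(candidates)
    delay_s = rng.uniform(0.0, max_delay_s)
    # The upper bound of the duration grows with the emotion intensity.
    duration_s = rng.uniform(base_duration_s, base_duration_s + seconds_per_intensity * top)
    return {"emotion": emotion, "delay_s": delay_s, "duration_s": duration_s}
```

Passing a seeded random.Random instance makes the choice reproducible, which is convenient when testing.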
Per-skin overrides
Skins can optionally expose an emotions mapping in skins/<SkinName>/persona.json to
customize emotion face definitions for that skin. The new compact format uses a mapping keyed
by emotion name; each value is a flat dictionary of blendshape names to weights. For example:
"emotions": {
"angry": { "mouth_frown": 1.0, "brow_down": 0.9 },
"happy": { "mouth_smile": 1.0, "eyes_smile": 0.6 },
"sad": { "mouth_frown": 1.0, "eyes_closed": 0.8 }
}
Notes:
- The emotions field is a mapping; UI components derive the exposed emotion list from the mapping keys (no separate list is required).
- The per-emotion objects contain blendshape -> weight pairs (floats 0.0-1.0). The previous targets wrapper and priority field are no longer used in the new format.
The WebUI will expose these presets in window.__synth_emotion_face_presets and will set
window.__synth_persona_emotions_list to the list of keys from the mapping for UI components
(sliders, overlays) to consume.
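A persona file in the compact format can be checked offline before shipping a skin. The sketch below is not part of the WebUI; emotion_names and invalid_weights are hypothetical helpers that derive the emotion list from the mapping keys and flag weights outside the 0.0-1.0 range described above.

```python
from typing import Any


def emotion_names(persona: dict[str, Any]) -> list[str]:
    """The exposed emotion list is simply the keys of the emotions mapping."""
    return list(persona.get("emotions", {}).keys())


def invalid_weights(persona: dict[str, Any]) -> list[tuple[str, str, Any]]:
    """Return (emotion, blendshape, weight) triples whose weight is not in 0.0-1.0."""
    bad = []
    for emotion, shapes in persona.get("emotions", {}).items():
        for shape, weight in shapes.items():
            is_number = isinstance(weight, (int, float)) and not isinstance(weight, bool)
            if not is_number or not 0.0 <= weight <= 1.0:
                bad.append((emotion, shape, weight))
    return bad
```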
Optional animation_state payload (facial state)
In addition to the legacy fields (state, animation, descriptor), the backend may attach a richer
animation_state object to the WebSocket payload. This is optional and fully backward compatible.
Note: the lightweight animation state summary broadcast (used for state synchronization) will also try to
enrich the animation_state with runtime emotions when an Emotion Manager plugin is available.
Typical use cases:
Provide a single structured snapshot for facial controllers (expressions, blink, eye movement)
Expose emotional state to the UI (for skin-specific mapping)
Declare lip-sync consent via a simple boolean flag
Example (shortened):
{
"type": "animation",
"state": "think",
"animation": "/skins/Rei/animations/Think/Thinking.fbx",
"descriptor": { ... },
"animation_state": {
"action": "think",
"phase": "loop",
"animation": "/skins/Rei/animations/Think/Thinking.fbx",
"descriptor": { ... },
"clip": { "name": "Thinking", "duration": 2.34, "fps": 30 },
"timing": { "started_at": "2025-12-17T20:00:00Z", "time_in_clip": 1.2, "current_frame": 36 },
"expressions": [ ... ],
"blink": { "auto": true, "rate_s": 3.5, "intensity": 0.6 },
"eye_movement": { "auto": true, "saccade_rate_s": 2 },
"emotions": { "dominant": "happy", "values": { "happy": 7.5, "calm": 5.2 } },
"lipsync": false
}
}
Notes:
- lipsync is a boolean consent flag only (default: false when not provided).
- Blink defaults are tuned to a human-like frequency (~15–20 blinks/min), i.e. roughly one blink every 3–4 seconds.
- The WebUI emits browser events when a rich animation_state is received:
  - synth_animation_state_updated (detail: full state)
  - synth_animation_lipsync_changed (detail: { lipsync: boolean })
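The blink figures are related by simple arithmetic. The sketch below assumes that blink.rate_s is the average interval in seconds between blinks, which matches the example values but is not stated explicitly; blinks_per_minute is a hypothetical helper.

```python
def blinks_per_minute(rate_s: float) -> float:
    """Convert an average blink interval in seconds to blinks per minute."""
    if rate_s <= 0:
        raise ValueError("rate_s must be positive")
    return 60.0 / rate_s


if __name__ == "__main__":
    for interval in (3.0, 3.5, 4.0):
        print(f"{interval} s between blinks = {blinks_per_minute(interval):.1f} blinks/min")
```

Under that reading, intervals of 3 to 4 seconds correspond to 20 down to 15 blinks per minute, so the example value of 3.5 falls inside the stated range.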
Backward Compatibility
- Animations without descriptors work as before (use provided loop parameter)
- The legacy play_once flag is still supported:
  - With intro/outro: ignored (warning logged)
  - With loop only: plays loop once
  - Without structured sections: plays animation once
- Existing animations continue to work unchanged
Creating New Animations
To create an animation with intro/loop/outro:
Create the FBX animation with:
Intro frames: setup/transition frames
Loop frames: repeating motion frames
Outro frames: wind-down/transition frames
Create a .fbx.json descriptor:
{
"intro": { "start_frame": 0, "end_frame": 29 },
"loop": { "start_frame": 30, "end_frame": 119 },
"outro": { "start_frame": 120, "end_frame": 149 }
}
Save both files in the same directory:
animations/state/Name.fbx
animations/state/Name.fbx.json
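Writing the descriptor next to the FBX file can be scripted. This is a small sketch under the naming convention shown above (the descriptor path is the FBX path with .json appended); descriptor_path and write_descriptor are hypothetical helpers, not part of the project.

```python
import json
from pathlib import Path
from typing import Any, Union


def descriptor_path(fbx_path: Union[str, Path]) -> Path:
    """Name.fbx -> Name.fbx.json in the same directory."""
    return Path(str(fbx_path) + ".json")


def write_descriptor(fbx_path: Union[str, Path], descriptor: dict[str, Any]) -> Path:
    """Serialize the descriptor beside the FBX file and return the written path."""
    path = descriptor_path(fbx_path)
    path.write_text(json.dumps(descriptor, indent=2) + "\n", encoding="utf-8")
    return path
```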
Testing
Run the animation flow tests:
python test_animation_flow.py
Tests verify:
- Descriptor loading
- Structure analysis
- Loop behavior determination
- play_once flag handling
- Outro playback
Animation State (server → WebUI)
The backend may include an optional animation_state object in the WebSocket payload. This object provides fine-grained instructions and the current emotional state for client-side facial animation.
Server endpoints
- GET /api/animation_state – Return the current centralized animation state (used by remote clients to sync).
- POST /api/animation_state – Request a centralized animation state change. Body: {state, session_id?, loop?, context_id?, source?}.
Security note: Accepting remote requests to modify the global animation state can affect how the avatar behaves persistently; the server should apply appropriate access controls (API keys, interface-level trust, or admin confirmation) before honoring requests that alter the central state.
Example schema (abridged):
{
"animation_state": {
"action": "think",
"phase": "loop",
"descriptor": { ... },
"clip": { "name": "Thinking", "duration": 2.34, "fps": 30 },
"timing": { "started_at": "2025-12-17T20:00:00Z", "time_in_clip": 1.2, "current_frame": 36 },
"expressions": [ { "start_frame":0, "end_frame":15, "targets": { "eyes_closed": 0.1, "mouth.O": 0.02 }, "source": "server", "priority": 10 } ],
"blink": { "auto": true, "rate_s": 4, "intensity": 0.6 },
"eye_movement": { "auto": true, "saccade_rate_s": 2 },
"emotions": { "dominant": "happy", "values": { "happy": 7.5, "calm": 5.2 } },
"lipsync": false
}
}
Notes:
- animation_state is optional; when it is missing, the WebUI behaves as before (backward compatible).
- lipsync is a boolean flag (default false). It signals that lip‑sync may be enabled by a consumer; it does not automatically start lip‑sync processing.
- The WebUI is responsible for resolving expression targets to per‑skin blendshapes using skins/<skin>/persona.json (blendshape_map) and applying smoothing locally.
Per-skin persona mapping
Place mappings in skins/<skin>/persona.json under the blendshape_map key. Example:
{
"blendshape_map": {
"happy": "Smile",
"mouth.O": "Vowel_O"
},
"emotion_speed": { "default": 6.0, "decay": 4.0 }
}
The map can be minimal: only include the logical keys your skin actually uses (e.g., eyelid closure, mouth O, and the viseme aliases required by your VRM). Example minimal map:
"blendshape_map": {
"eyes_closed": "eyes_closed",
"mouth.O": "mouth_O",
"visemes": { "A": {"mouth_A": 1.0}, "O": {"mouth_O": 1.0} }
}
Do not place per-emotion presets inside blendshape_map — use the top-level emotions mapping instead (see section above). Keeping blendshape_map minimal reduces maintenance and lowers the chance of mismatches between animation descriptors and skin mappings.
The WebUI will fetch /skins/<skin>/persona.json and apply the mapping when resolving targets from animation_state.expressions.
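The resolution step can be illustrated with a short sketch. The actual resolution happens in the WebUI; this Python version only shows the idea. The function name resolve_targets is hypothetical, and silently skipping logical keys that have no string mapping (including nested entries such as visemes) is an assumption.

```python
from typing import Any


def resolve_targets(targets: dict[str, float],
                    blendshape_map: dict[str, Any]) -> dict[str, float]:
    """Translate logical expression keys to the skin's blendshape names."""
    resolved: dict[str, float] = {}
    for logical_key, weight in targets.items():
        mapped = blendshape_map.get(logical_key)
        if isinstance(mapped, str):
            resolved[mapped] = weight
        # Assumption: keys without a direct string mapping are skipped.
    return resolved
```

With the minimal map above, the targets { "eyes_closed": 0.1, "mouth.O": 0.02 } would resolve to the blendshapes eyes_closed and mouth_O with the same weights.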
Testing & manual QA
- A small manual test harness is available at docs/_static/animation_face_test.html to exercise face expressions and verify that blendShapeProxy.setValue is invoked.
- The WebUI template now emits global events synth_animation_state_updated and synth_animation_lipsync_changed that can be used by other consumers.
Graceful stopping
Various animation combinations