ENGINE MODULE
Visual Action Orchestration Engine
If you mute the audio, do you understand the scene?
“Visual blocking is the language of cinema. VAOE writes it for every shot.”
VAOE generates the visual blocking specification for every shot in your story. Blocking is the physical arrangement of subjects, objects, and motion within a frame, the spatial language that makes a scene readable without dialogue. VAOE produces a six-field specification for each shot: frame subject, spatial layout, primary motion, secondary motion, physics consequence, and end state.
This is the system that answers the fundamental question of visual storytelling: "If you mute the audio, does the viewer still understand what is happening?" If the answer is no, the blocking is insufficient. VAOE ensures the answer is always yes.
VAOE also coordinates with Impact Dynamics (ID) for physics consequences and Kinetic Intelligence (KI) for action intensity scoring, ensuring that blocking specifications are physically grounded and proportional to the narrative energy of each shot.
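One way to picture the six-field specification is as a small record type with a completeness check. The sketch below is illustrative only: the class name `BlockingSpec`, its attribute names, and the `missing_fields` helper are assumptions made for this example and are not taken from the VAOE codebase. Only the six field names come from the description above.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class BlockingSpec:
    """Hypothetical container for the six blocking fields (names assumed)."""

    frame_subject: str
    spatial_layout: str
    primary_motion: str
    secondary_motion: str
    physics_consequence: str
    end_state: str

    def missing_fields(self) -> list[str]:
        """Return the names of fields that are empty or whitespace-only."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]

    def is_complete(self) -> bool:
        """A spec counts as complete only when all six fields carry text."""
        return not self.missing_fields()


# Example values drawn from the kind of shot this page describes.
convoy_shot = BlockingSpec(
    frame_subject="Lead vehicle in convoy",
    spatial_layout="Vehicle center-frame, checkpoint lights at frame-right horizon",
    primary_motion="Vehicle accelerating forward through frame",
    secondary_motion="Dust wake expanding, checkpoint barrier approaching rapidly",
    physics_consequence="Suspension compression as vehicle hits rough terrain",
    end_state="Vehicle past checkpoint, dust settling",
)

# A spec with a blank end state, to show what the check reports.
unfinished_shot = BlockingSpec(
    frame_subject="Lead vehicle in convoy",
    spatial_layout="Vehicle center-frame",
    primary_motion="Vehicle accelerating forward through frame",
    secondary_motion="Dust wake expanding",
    physics_consequence="Dust displacement from tires",
    end_state="",
)
```

Treating an empty field as an error is a design choice for this sketch: a shot with no stated end state, for instance, gives a generation model nothing to resolve toward.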
VAOE — Shot 7 Blocking Specification
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
FRAME SUBJECT: Lead vehicle in convoy
SPATIAL LAYOUT: Vehicle center-frame, checkpoint lights at frame-right horizon, dust cloud behind
PRIMARY MOTION: Vehicle accelerating forward through frame
SECONDARY MOTION: Dust wake expanding, checkpoint barrier approaching rapidly
PHYSICS CONSEQUENCE: Suspension compression as vehicle hits rough terrain, dust displacement from tires, headlight beams cutting through particulate
END STATE: Vehicle past checkpoint, barrier visible in rearview position, dust settling
Blocking Density: High (ESCALATION_TRIGGER role)
Camera Shake Allowance: Moderate (handheld + vehicle motion)
Actual engine output from a StoryDirector story compilation.
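For readers who want to consume a specification like the one above in their own tooling, the following is a minimal parsing sketch. It assumes the plain `LABEL: value` text layout shown on this page, folds wrapped lines into the preceding field, and skips any other `Key: value` line (such as Blocking Density). The function name `parse_blocking_text` and the `FIELD_LABELS` constant are hypothetical, and the sketch assumes wrapped continuation lines contain no colon. It is not the engine's own parser.

```python
FIELD_LABELS = (
    "FRAME SUBJECT",
    "SPATIAL LAYOUT",
    "PRIMARY MOTION",
    "SECONDARY MOTION",
    "PHYSICS CONSEQUENCE",
    "END STATE",
)


def parse_blocking_text(text: str) -> dict[str, str]:
    """Parse 'LABEL: value' lines into a dict keyed by the six field labels."""
    parsed: dict[str, str] = {}
    current = None  # label of the field that wrapped lines should extend
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        label, sep, value = line.partition(":")
        label = label.strip()
        if sep and label in FIELD_LABELS:
            # Start of one of the six known fields.
            current = label
            parsed[current] = value.strip()
        elif sep:
            # Some other "Key: value" line; it is not a blocking field,
            # and it ends the field that was being read.
            current = None
        elif current is not None:
            # Wrapped continuation line: append to the current field.
            parsed[current] = f"{parsed[current]} {line}".strip()
    return parsed


SAMPLE = """\
FRAME SUBJECT: Lead vehicle in convoy
SPATIAL LAYOUT: Vehicle center-frame, checkpoint lights
at frame-right horizon, dust cloud behind
PRIMARY MOTION: Vehicle accelerating forward through frame
SECONDARY MOTION: Dust wake expanding, checkpoint barrier
approaching rapidly
PHYSICS CONSEQUENCE: Suspension compression as vehicle hits
rough terrain, dust displacement from tires,
headlight beams cutting through particulate
END STATE: Vehicle past checkpoint, barrier visible
in rearview position, dust settling
Blocking Density: High (ESCALATION_TRIGGER role)
Camera Shake Allowance: Moderate (handheld + vehicle motion)
"""

shot_fields = parse_blocking_text(SAMPLE)
```

The same function handles both wrapped and single-line field values, since a line without a recognised label is only ever appended to the field that precedes it.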
Director's Notes
Without blocking, an AI video model renders "two people talking in a room" as a static two-shot with no spatial relationship, no motivated movement, and no physical consequence. VAOE tells the generation model exactly what should be happening in the frame, where subjects are relative to each other, and what changes by the end of the shot. This is what separates generated video that looks directed from generated video that looks random.
Classification: Deterministic (rules-based) + LLM-Assisted (VAOE LLM Augmentor for complex blocking)
Introduced: Engine 5.2.0
Dependencies: DirectorLogic™, Beat context, MOMA energy data
Outputs to: Prompt Builder V3, Prompt Assembly V2 (BLOCKING line)
Determinism: Rules-based core; LLM augmentor uses temperature 0.0 + DB cache for determinism
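The pairing of temperature 0.0 with a database cache can be sketched as follows. This is a guess at the general pattern, not the engine's implementation: the class `CachedAugmentor`, the table name `blocking_cache`, the `generate` callable, and the use of SQLite are all assumptions made for illustration. The idea shown is that identical shot context hashes to the same key, so a repeat request returns the stored text instead of calling the model again.

```python
import hashlib
import json
import sqlite3
from typing import Callable


def cache_key(shot_context: dict) -> str:
    """Hash a canonical JSON form so equal inputs always map to the same key."""
    canonical = json.dumps(shot_context, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class CachedAugmentor:
    """Hypothetical wrapper: call the model once per distinct input, then reuse."""

    def __init__(self, conn: sqlite3.Connection, generate: Callable[[dict, float], str]):
        self.conn = conn
        self.generate = generate  # assumed signature: (shot_context, temperature) -> text
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS blocking_cache "
            "(key TEXT PRIMARY KEY, output TEXT NOT NULL)"
        )

    def augment(self, shot_context: dict) -> str:
        key = cache_key(shot_context)
        row = self.conn.execute(
            "SELECT output FROM blocking_cache WHERE key = ?", (key,)
        ).fetchone()
        if row is not None:
            return row[0]  # cache hit: no model call
        output = self.generate(shot_context, 0.0)  # temperature pinned to 0.0
        self.conn.execute(
            "INSERT INTO blocking_cache (key, output) VALUES (?, ?)", (key, output)
        )
        self.conn.commit()
        return output


# Stand-in for a real model call, counting how often it is invoked.
model_calls = []


def fake_generate(shot_context: dict, temperature: float) -> str:
    model_calls.append(temperature)
    return f"blocking for shot {shot_context['shot']}"


augmentor = CachedAugmentor(sqlite3.connect(":memory:"), fake_generate)
first = augmentor.augment({"shot": 7, "role": "ESCALATION_TRIGGER"})
second = augmentor.augment({"role": "ESCALATION_TRIGGER", "shot": 7})  # same data, new key order
```

Temperature 0.0 alone reduces sampling variation but does not by itself guarantee byte-identical output across model versions or hardware, which is one plausible reason to store the first result and replay it.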