Generate a video from 1–9 reference images and a text prompt. Supports 720P or 1080P resolution and flexible aspect ratios.
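
The limits stated above (1 to 9 reference images, 720P or 1080P) can be checked on the client before a request is sent. The following is a minimal sketch, not the service's actual API: the function name `build_request`, the payload keys `prompt`, `images`, and `resolution`, the 720P default, and the non-empty prompt check are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List

# Limits taken from the description above.
MIN_IMAGES = 1
MAX_IMAGES = 9
ALLOWED_RESOLUTIONS = {"720P", "1080P"}


def build_request(prompt: str, image_urls: List[str], resolution: str = "720P") -> Dict[str, object]:
    """Validate inputs and return a request payload.

    The payload keys are assumptions for illustration; the real
    service may use different field names.
    """
    # Assumption: an empty prompt is treated as invalid.
    if not prompt.strip():
        raise ValueError("prompt must be a non-empty string")

    # The description allows between 1 and 9 reference images.
    if not (MIN_IMAGES <= len(image_urls) <= MAX_IMAGES):
        raise ValueError(
            f"expected {MIN_IMAGES}-{MAX_IMAGES} reference images, got {len(image_urls)}"
        )

    # Only the two resolutions named in the description are accepted.
    if resolution not in ALLOWED_RESOLUTIONS:
        raise ValueError(f"resolution must be one of {sorted(ALLOWED_RESOLUTIONS)}")

    return {"prompt": prompt, "images": list(image_urls), "resolution": resolution}
```

For example, `build_request("a cat walking on a beach", ["cat.png"], "1080P")` returns a payload dictionary, while passing an empty image list or ten images raises `ValueError`. Aspect ratio is left out of the sketch because the description does not list the accepted values.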