xAI's text-to-video model with three generation modes, 6- or 10-second durations, and 480p or 720p output.
Grok Imagine Text-to-Video is xAI's entry into generative video, bringing the creative flexibility of the Grok platform to motion content. It generates videos from text prompts with configurable aspect ratios, generation modes, duration, and resolution settings.
The model offers three generation modes (fun, normal, and spicy), each producing a different creative interpretation of the same prompt. This makes it easy to explore a range of visual approaches without rewriting your description. Duration options of 6 or 10 seconds cover both short social clips and longer narrative scenes.
Resolution options include 480p for rapid previews and 720p for higher-fidelity output. With five aspect ratio choices spanning portrait, landscape, and square formats, Grok Imagine T2V fits most distribution platforms and content formats.
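
The settings described above can be checked on the client side before a request is sent. The following is a minimal sketch, not the actual xAI API: the class name, field names (mode, duration_seconds, resolution, aspect_ratio), and payload shape are all hypothetical, chosen only for illustration. The allowed values for mode, duration, and resolution are taken from this description; the five aspect ratio values are not listed here, so the sketch passes that field through without validating it.

```python
from dataclasses import dataclass
from typing import Optional

# Allowed values as stated in the description above.
# The resolution list may be incomplete ("include"), so treat it as an assumption.
MODES = {"fun", "normal", "spicy"}
DURATIONS_SECONDS = {6, 10}
RESOLUTIONS = {"480p", "720p"}


@dataclass(frozen=True)
class VideoRequest:
    """Hypothetical request settings; field names are illustrative only."""
    prompt: str
    mode: str = "normal"
    duration_seconds: int = 6
    resolution: str = "480p"
    # The description mentions five aspect ratios but does not list them,
    # so this field is passed through unvalidated.
    aspect_ratio: Optional[str] = None

    def __post_init__(self) -> None:
        if not self.prompt.strip():
            raise ValueError("prompt must not be empty")
        if self.mode not in MODES:
            raise ValueError(f"mode must be one of {sorted(MODES)}")
        if self.duration_seconds not in DURATIONS_SECONDS:
            raise ValueError(
                f"duration_seconds must be one of {sorted(DURATIONS_SECONDS)}"
            )
        if self.resolution not in RESOLUTIONS:
            raise ValueError(f"resolution must be one of {sorted(RESOLUTIONS)}")


def build_payload(request: VideoRequest) -> dict:
    """Turn validated settings into a plain dict (hypothetical payload shape)."""
    payload = {
        "prompt": request.prompt,
        "mode": request.mode,
        "duration_seconds": request.duration_seconds,
        "resolution": request.resolution,
    }
    # Only include the aspect ratio when the caller supplied one.
    if request.aspect_ratio is not None:
        payload["aspect_ratio"] = request.aspect_ratio
    return payload
```

A typical workflow under these assumptions would build a 480p, 6-second request for a quick preview, then reuse the same prompt with resolution="720p" or a different mode once the composition looks right.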