xAI · video

Complete Guide to Using Grok Imagine

Animate still images into video with xAI Grok's image-to-video engine and flexible creative controls.

Overview

Grok Imagine Image-to-Video transforms a static image into a dynamic video clip. Upload a reference image and optionally describe the desired motion to produce an animated video that preserves the visual identity of the source material.

Like its text-to-video counterpart, this model offers three generation modes (fun, normal, and spicy) for different creative interpretations. Note that spicy mode automatically falls back to normal when the source image is supplied as an external URL, which keeps results stable for user-uploaded content.
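The mode fallback described above can be sketched in a few lines of Python. This is only an illustration of the documented behavior, not the service's actual implementation: the function name `effective_mode` and the `uses_external_image` flag are hypothetical, and only the three mode names come from this guide.

```python
# Mode names as listed in this guide.
VALID_MODES = ("fun", "normal", "spicy")


def effective_mode(requested: str, uses_external_image: bool) -> str:
    """Return the mode we would expect to be applied for a request.

    Hypothetical helper: mirrors the documented rule that 'spicy'
    switches to 'normal' when an external image URL is used.
    """
    if requested not in VALID_MODES:
        raise ValueError(f"mode must be one of {VALID_MODES}, got {requested!r}")
    if requested == "spicy" and uses_external_image:
        return "normal"
    return requested
```

Running this check on the client side before submitting a job makes it obvious which mode a given request is likely to end up using, rather than discovering the switch from the output.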

The model supports durations of 6 or 10 seconds at either 480p or 720p resolution. Combined with an optional text prompt for motion guidance, it gives you precise control over how the still image comes to life. This is ideal for animating product shots, portraits, landscapes, and any static visual that would benefit from subtle or dramatic motion.

Capabilities

  • Image-to-video animation with visual identity preservation
  • Optional text prompt for directed motion control
  • Three creative modes for varied animation styles
  • Configurable duration (6s or 10s) and resolution (480p or 720p)
  • Faithful to the color palette, composition, and elements of the source image

Use Cases

  1. Animating product photography for e-commerce
  2. Bringing portrait photos to life with subtle motion
  3. Creating dynamic backgrounds from landscape photographs
  4. Social media stories from static brand assets
  5. Motion previews from concept art and illustrations

Input Parameters

image_urls
file, required

Reference images (JPEG/PNG/WebP, ≤10MB each, up to 7).

Upload one reference image. The model will use this as the visual foundation for the generated video. Higher resolution inputs yield better results.

prompt
textarea

The text prompt describing the desired video motion (max 5000 chars).

Optional but recommended. Describe the motion you want: 'Camera slowly pans right, leaves rustle in the wind'. Without a prompt, the model infers natural motion from the image content.

aspect_ratio
select

Width-to-height ratio of the generated video.

Options
2:3, 3:2, 1:1, 9:16, 16:9
Default: 16:9
mode
select

Generation mode.

'Normal' produces natural, faithful animation. 'Fun' adds creative interpretation. Note that 'spicy' is automatically switched to normal for uploaded images.

Options
fun, normal, spicy
Default: normal
duration
slider

Length of the generated video in seconds (6–30).

6 seconds for quick animations and social clips. 10 seconds for more developed motion sequences.

Min: 6, Max: 30, Default: 6
resolution
select

Resolution of the generated video.

480p for fast previews. 720p for final quality output.

Options
480p, 720p
Default: 480p
NSFW Filter
toggle

Enable NSFW content filtering.

Default: false
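
As a rough illustration of how the parameters above fit together, the following Python sketch validates a set of inputs and assembles them into a request dictionary. Several things here are assumptions rather than documented facts: the function name `build_request` is hypothetical, the `nsfw_filter` key is a guess (this guide only shows the label "NSFW Filter"), and the payload shape is not taken from any official API reference. The duration check uses the 6–30 slider range listed above. No network call is made.

```python
from typing import List, Optional

# Allowed values and limits as listed in the parameter reference above.
ASPECT_RATIOS = ("2:3", "3:2", "1:1", "9:16", "16:9")
MODES = ("fun", "normal", "spicy")
RESOLUTIONS = ("480p", "720p")
MAX_IMAGES = 7
MAX_PROMPT_CHARS = 5000
MIN_DURATION, MAX_DURATION = 6, 30


def build_request(
    image_urls: List[str],
    prompt: Optional[str] = None,
    aspect_ratio: str = "16:9",
    mode: str = "normal",
    duration: int = 6,
    resolution: str = "480p",
    nsfw_filter: bool = False,  # hypothetical key name for the NSFW Filter toggle
) -> dict:
    """Validate inputs against the documented limits and return a payload dict."""
    if not 1 <= len(image_urls) <= MAX_IMAGES:
        raise ValueError(f"expected 1 to {MAX_IMAGES} image URLs, got {len(image_urls)}")
    if prompt is not None and len(prompt) > MAX_PROMPT_CHARS:
        raise ValueError(f"prompt exceeds {MAX_PROMPT_CHARS} characters")
    if aspect_ratio not in ASPECT_RATIOS:
        raise ValueError(f"aspect_ratio must be one of {ASPECT_RATIOS}")
    if mode not in MODES:
        raise ValueError(f"mode must be one of {MODES}")
    if not MIN_DURATION <= duration <= MAX_DURATION:
        raise ValueError(f"duration must be {MIN_DURATION}-{MAX_DURATION} seconds")
    if resolution not in RESOLUTIONS:
        raise ValueError(f"resolution must be one of {RESOLUTIONS}")

    payload = {
        "image_urls": list(image_urls),
        "aspect_ratio": aspect_ratio,
        "mode": mode,
        "duration": duration,
        "resolution": resolution,
        "nsfw_filter": nsfw_filter,
    }
    # The prompt is optional, so it is only included when provided.
    if prompt:
        payload["prompt"] = prompt
    return payload
```

For example, `build_request(["https://example.com/product.png"], prompt="Camera slowly pans right", resolution="720p")` would return a dictionary using the defaults listed above for every other field. Validating locally like this surfaces out-of-range values before a generation job is submitted.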

Tips & Best Practices

  • Guide the motion with prompts
  • Use high-quality source images
  • Pair with Grok Imagine I2I for prep