xAI's fast text-to-image generation with a standard mode and a higher-accuracy quality mode.
Grok Imagine Text-to-Image is xAI's image generation model, built for speed and creative range. It turns text prompts into images across five aspect ratios, with two output modes: a fast standard mode for high-volume exploration and a quality mode that prioritizes accuracy and detail.
The standard mode is one of the most affordable ways to generate images on the platform, making it ideal for brainstorming, thumbnailing, and rapid iteration. Switching to the quality mode trades some speed for better prompt adherence and finer detail, which makes it the right choice when an image graduates from draft to deliverable.
Prompts of up to 5,000 characters give plenty of room for layered scene descriptions, and the model pairs naturally with Grok Imagine's image-to-image and video models for pipelines built entirely on xAI models.
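As a rough illustration of the draft-to-deliverable workflow and the prompt length limit described above, a client could validate a prompt and pick a mode before sending anything. This is a minimal sketch, not the actual xAI API: the `ImageRequest` shape, the `build_request` helper, the `final` flag, and the `"standard"` / `"quality"` string labels are all hypothetical names chosen for the example. Only the 5,000-character limit and the existence of two modes come from the description.

```python
from dataclasses import dataclass

MAX_PROMPT_CHARS = 5000  # prompt limit stated in the description
MODES = ("standard", "quality")  # two output modes; these string labels are assumed


@dataclass(frozen=True)
class ImageRequest:
    """Hypothetical request shape; not the real API schema."""
    prompt: str
    mode: str = "standard"


def build_request(prompt: str, final: bool = False) -> ImageRequest:
    """Validate the prompt, then use standard mode for drafts and quality mode for finals."""
    text = prompt.strip()
    if not text:
        raise ValueError("prompt must not be empty")
    if len(text) > MAX_PROMPT_CHARS:
        raise ValueError(
            f"prompt is {len(text)} characters; the limit is {MAX_PROMPT_CHARS}"
        )
    mode = "quality" if final else "standard"
    return ImageRequest(prompt=text, mode=mode)


if __name__ == "__main__":
    # Cheap, fast iterations while exploring an idea.
    draft = build_request("a lighthouse at dusk, loose oil-painting style")
    # Same prompt, promoted to the quality mode for the deliverable.
    deliverable = build_request(draft.prompt, final=True)
    print(draft.mode, deliverable.mode)
```

Checking the length on the client side avoids spending a request on a prompt that would exceed the limit; how the service itself handles over-long prompts is not stated here.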