
Grok Imagine Video 1.5: Image to Video with Audio

Grok Imagine Video 1.5 takes a still image and a short text description of the motion you want, then produces a video clip with synchronized audio. It suits content creation, e-commerce, and social media work that needs animated visuals without video production software. The model supports resolutions up to 720p and several aspect ratios, including 16:9, 9:16, 4:3, and 1:1, so your output fits the target platform without post-processing. Clips run up to 5 seconds per generation, which is enough for product previews, animated thumbnails, and short social posts. When you set the aspect ratio to auto, the model reads your input image's native proportions and matches them in the output.

The model also fits a batch workflow: feed in a set of product photos, write a motion description for each, and collect a ready-to-publish set of video clips in one sitting. Generation is fast enough that you can run several variations on the same image, compare the results, and pick the one that works.

Official

xAI

84.4k runs

Grok Imagine Video 1.5

2026-06-01

Commercial Use

Table of contents

  • Overview
  • How It Works
  • Frequently Asked Questions
  • Credit Cost
  • Features
  • Use Cases

Overview

Grok Imagine Video 1.5 takes a still image and a text prompt and turns them into a short animated video with synchronized audio. If you have a product photo, a portrait, or a static scene that needs motion, this model handles it with no video editing background required. On Picasso IA, you upload your image, describe the movement or atmosphere you want, and get back a polished clip at up to 720p resolution. Because the audio is generated in sync with the motion, the result sounds like a finished piece, not a silent loop you have to score yourself. It fits naturally into social media workflows, client presentations, and content pipelines that need video but have no budget for a production crew.

How It Works

  • Upload a still image in JPG, PNG, or WebP format. The output video matches your image's native aspect ratio by default, or you can pick from options like 16:9, 9:16, 1:1, and 4:3.
  • Write a text prompt describing the motion or scene you want: a slow camera pan, a subject in movement, a sky shifting from day to dusk, or any combination of effects.
  • Set the video duration (up to 5 seconds) and choose your output resolution: 720p for sharper results or 480p for a lighter file size.
  • Submit your request and wait while the model processes your inputs into a video clip with synchronized audio already included.
  • Download the finished video file ready to publish, drop into an editing timeline, or hand off directly to a client.
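The steps above boil down to a handful of settings, and it can help to check them before you spend credits. Below is a minimal sketch of such a check in Python. The limits (input formats, 480p or 720p, up to 5 seconds, the named aspect ratios) come from this page; the `ClipSettings` class and `validate` function are hypothetical names used for illustration and are not part of any Picasso IA or xAI API. The page mentions eight aspect ratio options but names only five, so only those five (plus auto) appear here.

```python
from dataclasses import dataclass
from pathlib import Path

# Limits stated on this page. The full set of eight ratios is not listed,
# so only the ratios the page names are included (assumption: incomplete).
ALLOWED_FORMATS = {".jpg", ".jpeg", ".png", ".webp"}
NAMED_RATIOS = {"auto", "16:9", "9:16", "4:3", "1:1", "3:2"}
RESOLUTIONS = {"480p", "720p"}
MAX_DURATION_S = 5


@dataclass
class ClipSettings:
    """Hypothetical container for one generation request (not a real API type)."""
    image_path: str
    prompt: str
    duration_s: int = 5
    resolution: str = "720p"
    aspect_ratio: str = "auto"


def validate(settings: ClipSettings) -> list[str]:
    """Return a list of problems; an empty list means the settings look fine."""
    problems = []
    if Path(settings.image_path).suffix.lower() not in ALLOWED_FORMATS:
        problems.append("image must be JPG, PNG, or WebP")
    if not settings.prompt.strip():
        problems.append("prompt must describe the motion you want")
    if not 0 < settings.duration_s <= MAX_DURATION_S:
        problems.append("duration must be positive and at most 5 seconds")
    if settings.resolution not in RESOLUTIONS:
        problems.append("resolution must be 480p or 720p")
    if settings.aspect_ratio not in NAMED_RATIOS:
        problems.append("aspect ratio is not one of the ratios named on the page")
    return problems
```

For example, `validate(ClipSettings("shoe.png", "slow pan around the shoe"))` returns an empty list, while a GIF input with an empty prompt, an 8-second duration, 1080p, and a 21:9 ratio reports one problem per setting.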

Frequently Asked Questions

Do I need programming skills or technical knowledge to use this?

No. Open Grok Imagine Video 1.5 on Picasso IA, adjust the settings you want, and hit generate.

Is it free to try?

You can run the model without a paid subscription. Check the credit system on the platform to see how many free generations are available before any charges apply.

How long does it take to get results?

Most generations finish within a minute or two, depending on current server load. Longer clips or higher resolutions may add a bit more processing time.

What output formats are supported?

The model produces a standard video file with synchronized audio already baked in. You can download it and use it in any video editor, or post it directly to social platforms without extra steps.

Can I customize the aspect ratio to match my target format?

Yes. You can choose from eight aspect ratio options, including 16:9, 4:3, 1:1, and 9:16, or leave it on auto to inherit your source image's proportions. This makes it straightforward to produce content sized for any platform.

What happens if I am not happy with the result?

Try rewriting the prompt with more specific motion details, or adjust the duration. Small changes to the wording often produce noticeably different outputs, so iteration is quick.

Can I use the generated videos for commercial projects?

The videos you generate are yours to use. Review the terms of service on Picasso IA to confirm the usage rights that apply to your specific project type.

Credit Cost

Each generation consumes 20 credits, or 100 credits for 5 generations.
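If you plan a batch, the credit math is simple enough to sketch. The snippet below assumes the flat rate of 20 credits per generation stated above, with no bulk discount (the page shows 100 credits for 5 generations, which is the same rate). The function names are illustrative, not part of the platform.

```python
CREDITS_PER_GENERATION = 20  # rate stated on this page


def credits_needed(images: int, variations_per_image: int = 1) -> int:
    """Credits for a batch where each image is run a fixed number of times."""
    if images < 0 or variations_per_image < 0:
        raise ValueError("counts must not be negative")
    return images * variations_per_image * CREDITS_PER_GENERATION


def affordable_generations(balance: int) -> int:
    """How many full generations a credit balance covers."""
    return max(balance, 0) // CREDITS_PER_GENERATION
```

For instance, four product photos with three prompt variations each come to `credits_needed(4, 3)`, which is 240 credits, and a balance of 110 credits covers 5 generations.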

Features

Everything this model can do for you

Synchronized audio output

Generates audio that matches the motion in the video so clips are ready to publish without separate sound work.

Up to 720p resolution

Exports at up to 720p for clear, sharp video suitable for web and social media use, with 480p available for lighter files.

Flexible aspect ratios

Accepts 16:9, 9:16, 4:3, 1:1, 3:2, and more so the output fits any platform without cropping.

Auto aspect ratio

Reads the input image's native proportions and applies them to the output when no ratio is manually set.
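To predict what auto will give you, you can work out your image's proportions yourself. The sketch below reduces pixel dimensions to their simplest ratio and, separately, finds the closest of the ratios named on this page. How the model actually derives the output ratio is not documented here, so treat this as an illustration of the idea and not a description of its internals; the function names are hypothetical.

```python
from math import gcd

# Ratios named on this page (the full list of eight is not given).
NAMED_RATIOS = {
    "16:9": 16 / 9,
    "9:16": 9 / 16,
    "4:3": 4 / 3,
    "1:1": 1.0,
    "3:2": 3 / 2,
}


def native_ratio(width: int, height: int) -> str:
    """Reduce pixel dimensions to the simplest width:height ratio."""
    if width <= 0 or height <= 0:
        raise ValueError("dimensions must be positive")
    divisor = gcd(width, height)
    return f"{width // divisor}:{height // divisor}"


def closest_named_ratio(width: int, height: int) -> str:
    """Pick the named ratio whose value is nearest to width / height."""
    if width <= 0 or height <= 0:
        raise ValueError("dimensions must be positive")
    target = width / height
    return min(NAMED_RATIOS, key=lambda name: abs(NAMED_RATIOS[name] - target))
```

A 1920x1080 photo reduces to `16:9`. A 1080x1350 portrait reduces to `4:5`, which is not among the named ratios, so if you want a fixed ratio the nearest named option is `1:1`.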

5-second clip generation

Produces clips up to 5 seconds per run, the standard length for animated thumbnails and short-form posts.

Wide format input

Accepts JPG, JPEG, PNG, and WEBP files so you can use existing assets without converting them first.

Text-driven motion

Write a plain description of the motion you want and the model animates the scene to match it.

Use Cases

Animate a product photo by describing the motion you want, such as a slow pan or gentle rotation, and get a short video clip ready for social posting

Turn a portrait photo into a brief video with natural movement, like a subtle head turn, by typing a short motion description alongside the image

Convert a landscape photo into a 5-second scene with atmospheric motion, such as drifting clouds or moving water, using a plain text prompt

Generate a vertical 9:16 video from a square image by selecting a specific aspect ratio before running the model

Create an animated thumbnail for a video or article by feeding in a still image and describing a simple motion that draws the eye

Produce a short animated intro from a brand photograph by specifying the camera motion and mood in the text prompt

Test multiple motion styles on the same image by running the model several times with different text descriptions and comparing the results
