Fabric 1.0 is an image-to-video model that takes a still photo and an audio clip and generates a video in which the subject appears to speak or sing in sync with the sound. It solves a practical production problem: creating a spokesperson or character video without scheduling a film crew, booking studio time, or recording anyone on camera. You supply the assets you already have, and the model handles the animation.

The model outputs video at up to 720p resolution, clean enough for websites, presentations, and social media posts, and it reads the audio track syllable by syllable to keep mouth movements accurate throughout the clip. If you need a smaller file or a faster export, you can choose 480p instead.

This fits naturally into workflows where you already have a face photo and a recorded audio file. A marketer can produce a spokesperson segment from a product photo without a shoot; a course creator can animate a character with their own narration in minutes. Upload your image and audio on Picasso IA, pick a resolution, and your talking video is ready to download.
Fabric 1.0 is an image-to-video model that animates a still photo to match any audio you provide, producing a video where the subject appears to speak or sing in sync with the sound. On Picasso IA, you run it directly from your browser with no software to install and no code to write. The model addresses a real production bottleneck: getting a talking video from a static image without booking a filming session or hiring on-screen talent. A product marketer, a teacher building an online course, or a social media creator can all use assets they already own (a face photo and an audio file) to produce broadcast-ready video in seconds.
Do I need programming skills or technical knowledge to use this? No, just open Fabric 1.0 on Picasso IA, adjust the settings you want, and hit generate.
Is it free to try? Yes, you can run Fabric 1.0 on Picasso IA without a paid subscription. Free-tier usage lets you test the output quality before committing to anything.
How long does it take to get results? Most generations finish within a minute, and many take only seconds. The exact time depends on the length of your audio clip and the resolution you choose.
What output formats are supported? The model returns a downloadable video file compatible with standard editing tools and social media upload requirements.
What kind of photo works best? The model works best with a clear, forward-facing photo where the subject's face is well-lit and fully visible. Heavily obscured or side-profile images may reduce lip-sync accuracy.
Can I use the output videos in commercial projects? Yes, the videos you generate belong to you and can be used in client work, marketing materials, or published content.
What if the lip-sync does not look accurate? Try using a higher-contrast photo with the face clearly centered, and make sure your audio recording is clean with minimal background noise. Small improvements to the source files usually produce noticeably better results.
Everything this model can do for you
Mouth movements track every phoneme in the audio, staying accurate frame by frame.
Export videos at up to 720p resolution, clean and ready for web or social media.
Choose 480p for faster exports or 720p for higher-quality final deliverables.
Produce a talking video from a single still photo with no camera setup.
Works with any audio input, from studio voiceovers to casual voice recordings.
Receive a finished lip-synced video within seconds of submitting your files.
Download the result as a standard video file ready to use in any project.
User-friendly API integration
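For teams that want to script generations rather than use the browser interface, a programmatic call would come down to submitting an image, an audio file, and a resolution choice. The sketch below is a hypothetical illustration only: the function name, field names, and workflow are assumptions, not documented Picasso IA API behavior. It shows how a client might assemble and validate the request fields before uploading.

```python
# Hypothetical sketch of preparing a Fabric 1.0 generation request.
# Field names ("image", "audio", "resolution") and the helper function
# are illustrative assumptions, not a documented API.
from pathlib import Path

# The two resolutions the model is stated to support.
ALLOWED_RESOLUTIONS = {"480p", "720p"}

def build_generation_request(image_path: str, audio_path: str,
                             resolution: str = "720p") -> dict:
    """Assemble the form fields for one image+audio generation job."""
    if resolution not in ALLOWED_RESOLUTIONS:
        raise ValueError(
            f"resolution must be one of {sorted(ALLOWED_RESOLUTIONS)}"
        )
    return {
        "image": Path(image_path).name,   # still photo of the subject
        "audio": Path(audio_path).name,   # speech or singing track
        "resolution": resolution,         # 480p for speed, 720p for quality
    }

# Example: request a 480p render for a faster, smaller file.
req = build_generation_request("spokesperson.jpg", "narration.mp3", "480p")
```

In practice these fields would be sent as a multipart upload (for example with Python's `requests` library) to whatever endpoint the service exposes; consult the Picasso IA documentation for the actual URL and authentication details.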