
Explore voices to match your need

  • ASMR: Japanese, Whisper
  • Whispering Woman: Whisper, Relaxation
  • Lucky Robot: Robotic, Creative
  • Angry Pirate: Character, Creative

Clone Your Voice

Experience instant voice magic with just 10 seconds of audio input!

  • Pirate Captain
  • Greedy Goblin
  • Southern Belle

Voice Design

Create Any Voice You Can Imagine - From Simple Text Description

Qwen3 TTS: Clone Any Voice or Design Your Own

Qwen3 TTS turns written text into natural-sounding speech with three distinct modes, giving you full control over how your audio comes out. Whether you need a quick voiceover using a preset speaker or want to clone someone's voice from a short recording, this model handles it in a single generation step. It solves the common frustration of being stuck with a single, generic robot voice when your project demands something more specific. Custom Voice mode lets you choose from nine preset speakers with distinct accents and tones, so you can match the right voice to your content immediately. Voice Clone mode takes a reference audio file and reproduces its characteristics on any new text, which is useful for dubbed content or consistent narration across multiple clips. Voice Design mode goes further: describe the voice you want in plain language, like "a calm male narrator with a slight French accent", and the model generates it from scratch. Qwen3 TTS fits naturally into content production workflows where voiceovers need to sound human without hiring a voice actor. Paste in your script, choose your mode, and download the result in seconds. If the first take misses the mark, adjust the style instruction and re-run it at the usual per-generation cost.

Official

Qwen

260.5k runs

Qwen3 TTS

2026-01-23

Commercial Use


Table of contents

  • Overview
  • How It Works
  • Frequently Asked Questions
  • Credit Cost
  • Features
  • Use Cases

Overview

Qwen3 TTS converts written text into natural-sounding speech and offers three modes to match what your project needs: selecting a preset voice, cloning an existing one, or designing a brand-new voice from a written description. Whether you need a consistent narrator for a podcast series or a custom voice for a product walkthrough, the model adapts without requiring any audio engineering background. On Picasso IA, you type your text, choose your mode, and receive a finished audio file in seconds. Multilingual support covers ten languages, so creators working across different regions can produce localized audio without switching tools.

How It Works

  • Choose your TTS mode: Custom Voice for preset speakers, Voice Clone to clone a voice from a reference audio file, or Voice Design to describe the voice you want in plain text.
  • Type or paste the text you want synthesized, then set the language manually or leave it on auto-detect.
  • For Custom Voice mode, pick one of the available preset speakers and optionally add a style instruction like "speak slowly" or "excited tone" to shape the delivery.
  • For Voice Clone mode, upload a short reference audio clip and optionally include a transcript of it to improve the accuracy of the cloned voice.
  • For Voice Design mode, write a natural-language description of the voice you want (such as accent, tone, or warmth), generate the audio, and download the finished file.
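
The steps above can be sketched as a small pre-flight check that confirms each mode has the input it depends on. This is only an illustration: Picasso IA's actual request format is not described on this page, so the `TTSRequest` class, its field names, and the mode identifiers below are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical identifiers for the three modes described above.
MODES = {"custom_voice", "voice_clone", "voice_design"}


@dataclass
class TTSRequest:
    """Hypothetical container for one generation; not an official schema."""
    text: str
    mode: str
    language: str = "auto"                   # "auto" stands in for auto-detect
    speaker: Optional[str] = None            # preset speaker (Custom Voice)
    style_instruction: Optional[str] = None  # e.g. "speak slowly"
    reference_audio: Optional[str] = None    # path to a short clip (Voice Clone)
    reference_text: Optional[str] = None     # optional transcript of that clip
    voice_description: Optional[str] = None  # plain-language brief (Voice Design)


def validate(req: TTSRequest) -> List[str]:
    """Return a list of problems; an empty list means nothing required is missing."""
    problems = []
    if not req.text.strip():
        problems.append("text is empty")
    if req.mode not in MODES:
        problems.append(f"unknown mode: {req.mode}")
    elif req.mode == "custom_voice" and not req.speaker:
        problems.append("custom_voice needs a preset speaker")
    elif req.mode == "voice_clone" and not req.reference_audio:
        problems.append("voice_clone needs a reference audio clip")
    elif req.mode == "voice_design" and not req.voice_description:
        problems.append("voice_design needs a voice description")
    return problems
```

For example, `validate(TTSRequest(text="Hello", mode="voice_clone"))` reports the missing reference clip, while the optional transcript and style instruction are never required.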

Frequently Asked Questions

Do I need programming skills or technical knowledge to use this? No. Just open Qwen3 TTS on Picasso IA, adjust the settings you want, and hit generate.

Is it free to try? Yes, you can run Qwen3 TTS on Picasso IA without any upfront payment. Check your account page for current usage details and available credits.

How long does it take to get results? Most short texts return audio within a few seconds. Longer passages or Voice Clone mode with an uploaded reference file may take a bit longer depending on file size and length.

What languages does Qwen3 TTS support? The model covers Chinese, English, Japanese, Korean, French, German, Italian, Spanish, Portuguese, and Russian. You can set the language manually or leave it on auto-detect and the model will identify it from your input.
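
As a rough illustration of the manual-versus-auto choice, the sketch below normalizes a language setting against the ten languages listed in this answer. The `resolve_language` helper and the `"auto"` value are assumptions made for the example, not part of any documented Picasso IA interface.

```python
from typing import Optional

# The ten languages named in the answer above.
SUPPORTED_LANGUAGES = (
    "Chinese", "English", "Japanese", "Korean", "French",
    "German", "Italian", "Spanish", "Portuguese", "Russian",
)


def resolve_language(choice: Optional[str] = None) -> str:
    """Return a supported language name, or "auto" when no choice is made."""
    if choice is None or choice.strip().lower() == "auto":
        return "auto"  # hypothetical stand-in for the auto-detect setting
    wanted = choice.strip().lower()
    for name in SUPPORTED_LANGUAGES:
        if name.lower() == wanted:
            return name
    raise ValueError(f"unsupported language: {choice!r}")
```

Rejecting an unlisted language before generating avoids spending a credit on output in a language the model is not stated to support.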

Can I control how the voice sounds beyond choosing a preset speaker? Yes. In any mode you can add a style instruction written in plain language, such as "calm and measured" or "enthusiastic and upbeat," to influence the pace, tone, and energy of the output.

What audio format does the output come in? The model returns a standard audio file you can download and drop directly into video editors, podcast software, or any platform that accepts common audio formats.

What if the cloned voice doesn't match what I expected? Try using a cleaner reference audio clip with minimal background noise, and include an accurate transcript in the reference text field. Small adjustments to the style instruction can also help dial in the result.

Credit Cost

Each generation consumes 1 credit, so 5 generations cost 5 credits.

Features

Everything this model can do for you

  • Three TTS modes: Switch between preset speakers, voice cloning, and voice design within a single interface.
  • Voice cloning: Reproduce the characteristics of any voice from a short reference audio file.
  • Voice design: Describe a voice in plain language and generate it from scratch without a sample.
  • Nine preset speakers: Choose from a diverse set of voices with distinct accents, tones, and genders.
  • Multilingual support: Generate speech in 10 languages including English, Spanish, Japanese, and Chinese.
  • Style instructions: Direct tone and delivery by adding natural language cues like "speak slowly" or "excited tone".
  • Auto language detection: Leave the language on auto and let the model identify the input text automatically.

Use Cases

  • Record a product explainer narration by typing your script and selecting a warm, confident preset voice from the speaker list.
  • Clone a podcast host's voice from a short audio sample to produce consistent narration for new episodes without re-recording.
  • Design a custom character voice for a game or animation by describing its tone, gender, and accent in a text field.
  • Add a multilingual voiceover to a translated script by setting the language field to match your target audience.
  • Generate audiobook chapter narrations from a text file using a preset speaker that fits the book's tone.
  • Create on-brand voice ads by describing the desired voice personality and reusing the same voice across multiple scripts.
  • Test different voice styles for a virtual assistant by running the same greeting text through multiple speakers or design descriptions.
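
The last use case, comparing several voices on one greeting, can be planned ahead of time so the credit spend is known up front. The sketch below assumes the 1-credit-per-generation rate from the Credit Cost section; the job dictionaries, their keys, and the speaker names are hypothetical placeholders rather than real Picasso IA values.

```python
from typing import Dict, List, Sequence

CREDITS_PER_GENERATION = 1  # rate stated in the Credit Cost section


def plan_voice_tests(
    text: str,
    speakers: Sequence[str] = (),
    descriptions: Sequence[str] = (),
) -> List[Dict[str, str]]:
    """Build one job per preset speaker and one per voice description."""
    jobs = [
        {"mode": "custom_voice", "text": text, "speaker": s} for s in speakers
    ]
    jobs += [
        {"mode": "voice_design", "text": text, "voice_description": d}
        for d in descriptions
    ]
    return jobs


def credits_needed(jobs: List[Dict[str, str]]) -> int:
    """Each job is one generation, so the cost scales with the job count."""
    return len(jobs) * CREDITS_PER_GENERATION


# Hypothetical speaker names and description, for illustration only.
jobs = plan_voice_tests(
    "Hi, how can I help you today?",
    speakers=["speaker_a", "speaker_b"],
    descriptions=["a calm male narrator with a slight French accent"],
)
print(credits_needed(jobs))
```

Keeping the text fixed and varying only the voice makes the resulting clips directly comparable.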
