Voice Clone Multilingual TTS

Text to Synthesize

Generated Audio

Speaker Selection

Temperature (lower = more stable tone, higher = more expressive)

Range: 0.1 to 1

Repetition Penalty

Range: 0.5 to 2
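As a rough illustration of what these two sliders usually control in autoregressive generation, here is a minimal sketch in plain Python. It assumes the common conventions (logits divided by the temperature before softmax, and a CTRL-style repetition penalty on tokens that were already generated). The function names are hypothetical, and this is not necessarily how this app implements either setting.

```python
import math

def apply_repetition_penalty(logits, generated_ids, penalty):
    """CTRL-style penalty (assumed convention): with penalty > 1,
    tokens that were already generated become less likely."""
    out = list(logits)
    for i in set(generated_ids):
        # Positive logits shrink, negative logits move further from zero.
        out[i] = out[i] / penalty if out[i] > 0 else out[i] * penalty
    return out

def softmax_with_temperature(logits, temperature):
    """Divide logits by the temperature, then normalize to probabilities."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]

# Low temperature concentrates probability on the top token (more stable);
# higher temperature spreads it out (more varied, more expressive).
cool = softmax_with_temperature(logits, 0.1)
warm = softmax_with_temperature(logits, 1.0)

# Penalize token 0 as if it had already been generated.
penalized = apply_repetition_penalty(logits, generated_ids=[0], penalty=2.0)
```

Under these assumptions, a temperature near the low end of the slider makes the output nearly deterministic, while a repetition penalty above 1 discourages the model from looping on the same tokens.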

Reference Audio (for voice cloning)

Voice Cloning Guidelines:

Use around 7-10 seconds of clear, noise-free audio
The interface uses Whisper turbo to transcribe the reference audio file
Longer reference clips reduce the maximum output length
A custom speaker (uploaded reference audio) overrides the Speaker Selection setting
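To check a reference clip against the length guideline before uploading, something like the following sketch may help. It uses only the Python standard library and assumes the clip is an uncompressed WAV file. The helper names and the temporary `reference.wav` file are hypothetical and exist only for this demo.

```python
import os
import tempfile
import wave

def write_silence(path, seconds, rate=16000):
    """Write a mono 16-bit WAV of silence (only used to demo the check)."""
    with wave.open(path, "wb") as wf:
        wf.setnchannels(1)
        wf.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wf.setframerate(rate)
        wf.writeframes(b"\x00\x00" * int(rate * seconds))

def wav_duration_seconds(path):
    """Duration = number of frames / frames per second."""
    with wave.open(path, "rb") as wf:
        return wf.getnframes() / wf.getframerate()

def is_recommended_length(seconds, low=7.0, high=10.0):
    """True if the clip falls in the suggested 7-10 second window."""
    return low <= seconds <= high

with tempfile.TemporaryDirectory() as tmp:
    clip = os.path.join(tmp, "reference.wav")
    write_silence(clip, seconds=8)
    duration = wav_duration_seconds(clip)
    ok = is_recommended_length(duration)
```

This only checks length. It says nothing about noise or clarity, which the guidelines also call for.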