Introducing Eleven v3 (alpha) - the most expressive Text to Speech model ever.

Eleven v3 (alpha) brings new levels of realism and control to speech generation:
- 70+ languages
- Multi-speaker dialogue
- Audio tags like [excited], [whispers], [laughs], and [sighs]

This is a research preview. It requires more prompt engineering than previous models, but the generations are breathtaking. We'll continue fine-tuning for reliability and control.

The new architecture of Eleven v3 understands text at a deeper level, delivering much greater expressiveness. You can now guide generations more directly using audio tags:
- Emotions: [sad], [angry], [happily]
- Delivery direction: [whispers], [shouts]
- Non-verbal reactions: [laughs], [clears throat], [sighs]

Generate dynamic multi-speaker dialogue that handles interruptions, shifts in tone, and emotional cues based on conversational context.

If you're working on videos, audiobooks, or media tools, v3 unlocks a new level of expressiveness. For real-time and conversational use cases, we recommend staying with v2.5 Turbo or Flash for now; a real-time version of v3 is in development.

Public API access is coming soon. For early access, please contact sales.

Eleven v3 is available today with 80% off in June. Try it out (link in the comments) and share your best generations with us.
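As a rough illustration, the audio tags above might be embedded in a multi-speaker dialogue script like this. This is a minimal sketch: the bracketed tags come from the announcement, but the "Speaker: line" layout and the `dialogue_line` helper are illustrative assumptions, not an official prompt format.

```python
# Sketch: composing a dialogue script that uses Eleven v3-style audio tags.
# The tags ([excited], [whispers], [laughs], ...) are from the announcement;
# the speaker-label layout is a hypothetical convention for illustration.

def dialogue_line(speaker, text, *tags):
    """Prefix a line of dialogue with its speaker and any audio tags."""
    tag_str = "".join(f"[{t}] " for t in tags)
    return f"{speaker}: {tag_str}{text}"

script = "\n".join([
    dialogue_line("Alice", "You won't believe what just happened!", "excited"),
    dialogue_line("Bob", "Keep it down, they might hear us.", "whispers"),
    dialogue_line("Alice", "Okay, okay. So here's the story.", "laughs"),
])

print(script)
```

The resulting script string would then be passed to the model as ordinary input text, with the tags steering emotion and delivery inline rather than through separate settings.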