Description
"We are open-sourcing Higgs Audio v2, a powerful audio foundation model pretrained on over 10 million hours of audio data and a diverse set of text data. Despite having no post-training or fine-tuning, Higgs Audio v2 excels in expressive audio generation, thanks to its deep language and acoustic understanding."
Demo video: open_source_repo_demo.mp4
Why add this?
The project describes capabilities "including generating natural multi-speaker dialogues in multiple languages, automatic prosody adaptation during narration, melodic humming with the cloned voice, and simultaneous generation of speech and background music."
It can be used with vLLM; the upstream repository has an example:
https://github.com/boson-ai/higgs-audio/tree/main/examples/vllm
For optimal performance, the authors recommend running the generation examples on a machine with a GPU that has at least 24GB of memory.
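To check a machine against the 24GB recommendation before trying the examples, something like the following might help. It is only a rough sketch, not part of Higgs Audio: the helper names (`parse_total_mib`, `meets_requirement`, `query_gpus`) are made up for illustration, it assumes an NVIDIA GPU with `nvidia-smi` on the PATH, and it assumes "24GB" means decimal gigabytes, since the quoted recommendation does not say which unit is meant.

```python
import shutil
import subprocess

# Figure quoted in the issue; reading it as decimal gigabytes is an assumption.
MIN_GPU_MEMORY_GB = 24


def parse_total_mib(smi_output: str) -> list[int]:
    """Parse one integer (MiB) per line, skipping blank or non-numeric lines."""
    values = []
    for line in smi_output.splitlines():
        token = line.strip()
        if token.isdigit():
            values.append(int(token))
    return values


def meets_requirement(total_mib: int, min_gb: int = MIN_GPU_MEMORY_GB) -> bool:
    """Compare a MiB figure against min_gb, interpreted as 10**9 bytes per GB."""
    return total_mib * 1024 * 1024 >= min_gb * 1000**3


def query_gpus() -> list[int]:
    """Return total memory in MiB per visible NVIDIA GPU, or [] if unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.total", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return []
    return parse_total_mib(result.stdout)


if __name__ == "__main__":
    gpus = query_gpus()
    if not gpus:
        print("No NVIDIA GPU detected (or nvidia-smi is not installed).")
    for index, mib in enumerate(gpus):
        verdict = "meets" if meets_requirement(mib) else "is below"
        print(f"GPU {index}: {mib} MiB {verdict} the {MIN_GPU_MEMORY_GB}GB recommendation")
```

Using decimal gigabytes is deliberate: some cards sold as 24GB report slightly under 24 GiB (24576 MiB) in `nvidia-smi`, and a strict GiB threshold would reject them.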