- UFMT
- Cuiabá, Mato Grosso - Brazil
- https://www.fredso.com.br
- @fred_s0
Stars
VoxCPM: Tokenizer-Free TTS for Context-Aware Speech Generation and True-to-Life Voice Cloning
AI Edge Quantizer: flexible post training quantization for LiteRT models.
Audio-to-Audio Schrodinger Bridges is a diffusion-based audio restoration model for bandwidth extension and inpainting.
Fast Streaming TTS with Orpheus + WebRTC (with FastRTC)
Open Source Text-To-Speech Portuguese Dataset
Spotify scraper that extracts all the information from Spotify and downloads MP3s with the song's cover art
Run any GGUF SLMs/LLMs locally, on-device, on Android
Awesome speech/audio LLMs, representation learning, and codec models
DiFlow-TTS: Compact and Low-Latency Zero-Shot Text-to-Speech with Factorized Discrete Flow Matching
SoTA open-source TTS
Fine-tune the LLM part of the Spark-TTS model
stlohrey / dia-finetuning
Forked from nari-labs/dia
A TTS model capable of generating ultra-realistic dialogue in one pass.
An AI-powered interactive avatar engine using Live2D, LLM, ASR, TTS, and RVC. Ideal for VTubing, streaming, and virtual assistant applications.
A TTS model capable of generating ultra-realistic dialogue in one pass.
Codec for paper: LLaSA: Scaling Train-time and Inference-time Compute for LLaMA-based Speech Synthesis
A family of state-of-the-art Transformer-based audio codecs for low-bitrate high-quality audio coding.
Code for vec2wav 2.0, a speech token vocoder for VC. Paper: https://arxiv.org/abs/2409.01995
SSR-Speech: Towards Stable, Safe and Robust Zero-shot Speech Editing and Synthesis
idiap / coqui-ai-TTS
Forked from coqui-ai/TTS
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
GeneFace++: Generalized and Stable Real-Time 3D Talking Face Generation; Official Code
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
Official implementation for the SIGGRAPH Asia 2024 paper SPARK: Self-supervised Personalized Real-time Monocular Face Capture
A list of publicly available room impulse response datasets and scripts to download them.
[Official Implementation] Acoustic Autoregressive Modeling 🔥