Stars
Localized watermarking for AI-generated speech audio, with SOTA robustness and a very fast detector
🚀 The fast, Pythonic way to build MCP servers and clients
A Model Context Protocol (MCP) server that provides Xcode-related tools for integration with AI assistants and other MCP clients.
Unified automatic quality assessment for speech, music, and sound.
Low-latency timbre transfer models for instrumental interaction.
NotaGen: Advancing Musicality in Symbolic Music Generation with Large Language Model Training Paradigms
YuE: Open Full-song Music Generation Foundation Model, something similar to Suno.ai but open
[WIP] Resources for AI engineers. Also contains supporting materials for the book AI Engineering (Chip Huyen, 2025)
Neural network inference template for real-time critical audio environments - presented at ADC23
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and support state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorR…
Self-supervised learning for fast pitch estimation
Implementation of "ByteCover: Cover song identification via multi-loss training" paper (ICASSP 2021)
Real Time Speech Enhancement in the Waveform Domain (Interspeech 2020). We provide a PyTorch implementation of the paper Real Time Speech Enhancement in the Waveform Domain. In which, we present a ca…
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable…
Large Language Model Text Generation Inference
Harmony-Rhythm Disentanglement audio remixer plugin
Official implementation of the RAVE model: a Realtime Audio Variational autoEncoder
An implementation of the diffusers API in Rust
Redpanda is a streaming data platform for developers. Kafka API compatible. 10x faster. No ZooKeeper. No JVM!
State-of-the-art deep learning based audio codec supporting both mono 24 kHz audio and stereo 48 kHz audio.
AITemplate is a Python framework which renders neural networks into high-performance CUDA/HIP C++ code. Specialized for FP16 TensorCore (NVIDIA GPU) and MatrixCore (AMD GPU) inference.
Python library for analysing faces using PyTorch
Differentiable FM Synthesis of Musical Instrument Sounds