Stars
AFTER: Audio Features Transfer and Exploration in Real-time
A toolbox that provides hackable building blocks for generic 1D/2D/3D UNets, in PyTorch.
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable…
State-of-the-art audio codec with 90x compression factor. Supports 44.1kHz, 24kHz, and 16kHz mono/stereo audio.
RWKV (pronounced RwaKuv) is an RNN with great LLM performance, which can also be directly trained like a GPT transformer (parallelizable). We are at RWKV-7 "Goose". So it's combining the best of RN…
Vector (and Scalar) Quantization, in PyTorch
Multidimensional indexing for tensors
🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch.
Audio generation using diffusion models, in PyTorch.
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.
State-of-the-art deep-learning-based audio codec supporting both mono 24 kHz audio and stereo 48 kHz audio.
Robust Speech Recognition via Large-Scale Weak Supervision
Repository of Jupyter notebook tutorials for teaching the Deep Learning Course at the University of Amsterdam (MSc AI), Fall 2023
A tiny scalar-valued autograd engine and a neural net library on top of it with PyTorch-like API
Neural Networks: Zero to Hero
Creative Machine Learning course and notebook tutorials in JAX, PyTorch and NumPy
Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
Flexible audio loudness meter in Python with an implementation of the ITU-R BS.1770-4 loudness algorithm
Universal audio synthesizer control learning with normalizing flows
Open SoundStream-ish VAE codecs for downstream neural audio synthesis
⚡️ Optimizing einsum functions in NumPy, TensorFlow, Dask, and more with contraction order optimization.
Panel: The powerful data exploration & web app framework for Python
A bunch of scriptable audio transforms based on the torchaudio backend
Self-Supervised Speech Pre-training and Representation Learning Toolkit
Google Drive Public File Downloader when Curl/Wget Fails