Stars
Encode and decode audio samples to/from continuous and discrete compressed representations!
Code and Dataset for <Quantitative Analysis of Melodic Similarity in Music Copyright Infringement Cases, ISMIR 2024>
Code for <Mel2Word: A Text-based Melody Representation for Symbolic Music Analysis, Music and Science, 2024>
declarative polyamorous cross-system intermedia objects
an architecture for neural network inference in real-time audio applications
"Fx-Encoder++: Extracting Instrument-wise Audio Effect Representations from Mixtures"
🌈 React for interactive command-line apps
A Python library for extracting structured information from unstructured text using LLMs with precise source grounding and interactive visualization.
Localized watermarking for AI-generated speech audio, with SOTA robustness and a very fast detector
🚀 The fast, Pythonic way to build MCP servers and clients
A Model Context Protocol (MCP) server that provides Xcode-related tools for integration with AI assistants and other MCP clients.
Unified automatic quality assessment for speech, music, and sound.
Low-latency timbre transfer models for instrumental interaction.
NotaGen: Advancing Musicality in Symbolic Music Generation with Large Language Model Training Paradigms
YuE: Open Full-song Music Generation Foundation Model, something similar to Suno.ai but open
[WIP] Resources for AI engineers. Also contains supporting materials for the book AI Engineering (Chip Huyen, 2025)
Neural network inference template for real-time critical audio environments - presented at ADC23
TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and support state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorR…
Self-supervised learning for real-time pitch estimation
Implementation of the "ByteCover: Cover song identification via multi-loss training" paper (ICASSP 2021)
Real Time Speech Enhancement in the Waveform Domain (Interspeech 2020). We provide a PyTorch implementation of the paper Real Time Speech Enhancement in the Waveform Domain. In which, we present a ca…