Stars
Code for the paper "pix2gestalt: Amodal Segmentation by Synthesizing Wholes" (CVPR 2024)
Official code for EnvSDD (Environmental Sound Deepfake Detection)
HuMo: Human-Centric Video Generation via Collaborative Multi-Modal Conditioning
Dataset/code for AudioMarkBench: Benchmarking Robustness of Audio Watermarking
A curated list of watermarking schemes for generative AI models
Localized watermarking for AI-generated speech audio, with SOTA robustness and a very fast detector
Resources for Speech Processing: a collection of speech-algorithm resources || NEWS: the official VoxCeleb link has recently stopped working, so an external download link has been added
Guidelines for Scholarship Renewal / Annual Progress Report, Qualifying Examination and Graduate Training Programme / Graduate Assistantship Programme
A programmer's guide to living longer
MAGI-1: Autoregressive Video Generation at Scale
📊⚽ A collection of football analytics projects, data, and analysis by Edd Webster (@eddwebster), including a curated list of publicly available resources published by the football analytics community.
The code in this repository is used to generate the GloHydroRes dataset and the figures featured in the associated paper.
⚡ Dynamically generated stats for your github readmes
Use Math and Data to Understand Football
CNN-based audio segmentation toolkit. Detects speech, music, noise and speaker gender. Designed for large-scale gender equality studies based on speech time per gender.
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable…
AudioLDM: Generate speech, sound effects, music and beyond, with text.
[CVIU, DICTA Award] Glitch in the Matrix: A Large Scale Benchmark for Content Driven Audio-Visual Forgery Detection and Localization
[ACM MM Award] AV-Deepfake1M: A Large-Scale LLM-Driven Audio-Visual Deepfake Dataset
Code for Video Deepfake Detection model from "Combining EfficientNet and Vision Transformers for Video Deepfake Detection" presented at ICIAP 2021.
Implementation of the paper: Replay and Synthetic Speech Detection with Res2Net architecture (ICASSP 2021) https://arxiv.org/abs/2010.15006
Detection and identification of cough onset points based on MusicYOLO.