Jiapeng Wang jpWang

@SCUT-DLVCLab

46 followers · 30 following

South China University of Technology
Guangzhou, China

Stars

xingchensong / TouchNet

A native-PyTorch library for large scale M-LLM (text/audio) training with tp/cp/dp/pp.

Python 110 8 Updated Jul 9, 2025

flashinfer-ai / flashinfer

FlashInfer: Kernel Library for LLM Serving

Cuda 3,353 375 Updated Jul 11, 2025

NVIDIA / TensorRT-LLM

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and support state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorR…

C++ 10,995 1,579 Updated Jul 12, 2025

THUNLP-MT / StreamingBench

StreamingBench: Assessing the Gap for MLLMs to Achieve Streaming Video Understanding

Python 134 6 Updated May 16, 2025

friedrichor / Awesome-Multimodal-Papers

A curated list of awesome Multimodal studies.

224 19 Updated Jun 27, 2025

inclusionAI / Ming

Ming - facilitating advanced multimodal understanding and generation capabilities built upon the Ling LLM.

Jupyter Notebook 375 28 Updated Jul 12, 2025

PKU-Alignment / align-anything

Align Anything: Training All-modality Model with Feedback

Jupyter Notebook 4,230 500 Updated May 28, 2025

AudioLLMs / Awesome-Audio-LLM

Audio Large Language Models

Python 607 34 Updated Jul 5, 2025

AudioLLMs / AudioBench

AudioBench: A Universal Benchmark for Audio Large Language Models

Python 234 9 Updated Jun 17, 2025

modelscope / easydistill

a toolkit on knowledge distillation for large language models

Python 109 5 Updated Jul 9, 2025

kyutai-labs / moshi

Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audio codec.

Python 8,636 748 Updated Jul 11, 2025

hiyouga / LLaMA-Factory

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

Python 54,139 6,628 Updated Jul 11, 2025

modelscope / ms-swift

Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 500+ LLMs (Qwen3, Qwen3-MoE, Llama4, InternLM3, DeepSeek-R1, ...) and 200+ MLLMs (Qwen2.5-VL, Qwen2.5-Omni, Qwen2-Audio, Ovis2, InternVL3, Llava, GLM4…

Python 8,626 743 Updated Jul 12, 2025

yangjianxin1 / Firefly

Firefly: a large-model training tool that supports training Qwen2.5, Qwen2, Yi1.5, Phi-3, Llama3, Gemma, MiniCPM, Yi, Deepseek, Orion, Xverse, Mixtral-8x7B, Zephyr, Mistral, Baichuan2, Llama2, Llama, Qwen, Baichuan, ChatGLM2, InternLM, Ziya2, Vicuna, Bloom, and other large models

Python 6,481 584 Updated Oct 24, 2024

jzq2000 / MoonCast

Python 271 33 Updated Apr 11, 2025

mjpost / sacrebleu

Reference BLEU implementation that auto-downloads test sets and reports a version string to facilitate cross-lab comparisons

Python 1,154 166 Updated Mar 13, 2025

open-mmlab / Amphion

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audi…

Python 9,232 732 Updated May 27, 2025

JusperLee / AudioTrust

AudioTrust: Benchmarking the Multi-faceted Trustworthiness of Audio Large Language Models

Python 197 23 Updated May 29, 2025

SparkAudio / Spark-TTS

Spark-TTS Inference Code

Python 10,016 1,056 Updated Apr 9, 2025

fishaudio / fish-speech

SOTA Open Source TTS

Python 22,314 1,827 Updated Jul 2, 2025

nari-labs / dia

A TTS model capable of generating ultra-realistic dialogue in one pass.

Python 17,444 1,446 Updated Jul 6, 2025

2noise / ChatTTS

A generative speech model for daily dialogue.

Python 37,119 4,020 Updated Jul 6, 2025

myshell-ai / OpenVoice

Instant voice cloning by MIT and MyShell. Audio foundation model.

Python 32,947 3,476 Updated Apr 19, 2025

RVC-Boss / GPT-SoVITS

1 min voice data can also be used to train a good TTS model! (few shot voice cloning)

Python 48,678 5,355 Updated Jul 11, 2025

travisvn / openai-edge-tts

Free, high-quality text-to-speech API endpoint to replace OpenAI, Azure, or ElevenLabs

Python 977 154 Updated Jul 1, 2025

Yuan-ManX / ai-audio-datasets

AI Audio Datasets (AI-ADS) 🎵, including Speech, Music, and Sound Effects, which can provide training data for Generative AI, AIGC, AI model training, intelligent audio tool development, and audio a…

777 74 Updated Jul 8, 2025

jim-schwoebel / voice_datasets

🔊 A comprehensive list of open-source datasets for voice and sound computing (95+ datasets).

1,966 241 Updated Jun 6, 2024

stepfun-ai / Step-Audio

Python 4,408 358 Updated Jun 12, 2025

MoonshotAI / Kimi-Audio-Evalkit

Python 126 5 Updated Apr 29, 2025

gpt-omni / mini-omni

An open-source multimodal large language model that can hear and talk while thinking, featuring real-time end-to-end speech input and streaming audio output for conversation.

Python 3,363 285 Updated Nov 5, 2024
