Starred repositories
Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 80+ languages.
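Download common web resources such as WeChat Channels, mini programs, Douyin, Kuaishou, Xiaohongshu, live streams, m3u8, Kugou, and QQ Music!
A system-proxy-based program for capturing Douyin live-comment (danmaku) WSS traffic. It can capture from all data sources, including Chrome and Douyin Live Companion, and supports process filtering.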
Industry leading face manipulation platform
MuseTalk: Real-Time High Quality Lip Synchronization with Latent Space Inpainting
An AI-Powered Speech Processing Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Enhancement, Separation, and Target Speaker Extraction, etc.
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
Original book code, exercise answers, and personal notes for C++ Primer Plus, 6th Edition (Chinese edition), for study and exchange only.
so-vits-svc fork with realtime support, improved interface and more features.
Just 1 minute of voice data is enough to train a good TTS model! (few-shot voice cloning)
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
Python library to communicate with an obs-websocket server (for OBS Studio)
Remote-control of OBS Studio through WebSocket
🎥 Python and OpenCV-based scene cut/transition detection program & library.
Video Duplicate Finder - Crossplatform
An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System
A generative speech model for daily dialogue.
Open-source IoT Gateway - integrates devices connected to legacy and third-party systems with ThingsBoard IoT Platform using Modbus, CAN bus, BACnet, BLE, OPC-UA, MQTT, ODBC and REST protocols
An extension for nvim-dap providing configurations for launching go debugger (delve) and debugging individual tests
Learn Go with test-driven development