Intel - Shanghai
Stars
📰 Must-read papers and blogs on Speculative Decoding ⚡️
DeepEP: an efficient expert-parallel communication library
HabanaAI / vllm-fork
Forked from vllm-project/vllm: a high-throughput and memory-efficient inference and serving engine for LLMs
A Datacenter Scale Distributed Inference Serving Framework
The simplest, fastest repository for training/finetuning small-sized VLMs.
A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations
Real-time interactive streaming digital human
Open Source framework for voice and multimodal conversational AI
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
FlashInfer: Kernel Library for LLM Serving
Fast inference from large language models via speculative decoding
An Application Framework for AI Engineering
[CVPR 2025] EchoMimicV2: Towards Striking, Simplified, and Semi-Body Human Animation
[ICCV 2025] LLaVA-CoT, a visual language model capable of spontaneous, systematic reasoning
Development repository for the Triton language and compiler
g1: Using Llama-3.1 70b on Groq to create o1-like reasoning chains
The official repo of Qwen2-Audio chat & pretrained large audio language model proposed by Alibaba Cloud.
Evaluation, benchmark, and scorecard targeting performance (throughput and latency), accuracy on popular evaluation harnesses, safety, and hallucination
GenAI components at the microservice level; a GenAI service composer to create mega-services
1 minute of voice data can be used to train a good TTS model! (few-shot voice cloning)
Simple and efficient PyTorch-native transformer text generation in <1000 LOC of Python
Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"
Standardized Distributed Generative and Predictive AI Inference Platform for Scalable, Multi-Framework Deployment on Kubernetes
TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs.