  • Intel
  • Shanghai


📰 Must-read papers and blogs on Speculative Decoding ⚡️

974 52 Updated Sep 19, 2025

DeepEP: an efficient expert-parallel communication library

Cuda 8,611 955 Updated Oct 15, 2025

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 83 132 Updated Oct 16, 2025

A Datacenter Scale Distributed Inference Serving Framework

Rust 5,305 642 Updated Oct 16, 2025

The simplest, fastest repository for training/finetuning small-sized VLMs.

Python 4,127 397 Updated Oct 15, 2025

A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations

Python 15,179 1,092 Updated Oct 12, 2025

Real time interactive streaming digital human

Python 6,606 1,023 Updated Oct 3, 2025

Open Source framework for voice and multimodal conversational AI

Python 8,397 1,359 Updated Oct 16, 2025

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.

Python 7,166 609 Updated Oct 15, 2025

FlashInfer: Kernel Library for LLM Serving

Cuda 3,912 530 Updated Oct 16, 2025

Fast inference from large language models via speculative decoding

Python 835 86 Updated Aug 22, 2024

An Application Framework for AI Engineering

Java 6,923 1,945 Updated Oct 14, 2025

C++ 23 20 Updated Oct 9, 2025

[CVPR 2025] EchoMimicV2: Towards Striking, Simplified, and Semi-Body Human Animation

Python 4,296 503 Updated Aug 11, 2025

[ICCV 2025] LLaVA-CoT, a visual language model capable of spontaneous, systematic reasoning

Python 2,081 79 Updated Oct 16, 2025

An ultra-lightweight digital human model that can run in real time on mobile devices

Python 2,264 323 Updated Sep 18, 2025

A digital human that everyone can use

Python 1,702 359 Updated Sep 22, 2025

Development repository for the Triton language and compiler

MLIR 17,230 2,310 Updated Oct 16, 2025

g1: Using Llama-3.1 70b on Groq to create o1-like reasoning chains

Python 4,223 377 Updated Sep 11, 2025

The official repo of Qwen2-Audio chat & pretrained large audio language model proposed by Alibaba Cloud.

Python 1,901 146 Updated Apr 21, 2025

Evaluation, benchmark, and scorecard, targeting performance (throughput and latency), accuracy on popular evaluation harnesses, safety, and hallucination

Jupyter Notebook 37 58 Updated Oct 6, 2025

GenAI components at micro-service level; GenAI service composer to create mega-service

Python 178 209 Updated Oct 13, 2025

The official Meta Llama 3 GitHub site

Python 29,032 3,470 Updated Jan 26, 2025

1 min voice data can also be used to train a good TTS model! (few shot voice cloning)

Python 51,596 5,668 Updated Sep 10, 2025

Simple and efficient pytorch-native transformer text generation in <1000 LOC of python.

Python 6,122 566 Updated Aug 22, 2025

Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"

Python 12,805 846 Updated Dec 17, 2024

Standardized Distributed Generative and Predictive AI Inference Platform for Scalable, Multi-Framework Deployment on Kubernetes

Python 4,649 1,276 Updated Oct 15, 2025

TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and support state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorR…

C++ 11,865 1,801 Updated Oct 16, 2025