Stars
FlashInfer: Kernel Library for LLM Serving
A throughput-oriented high-performance serving framework for LLMs
Dynamic Memory Management for Serving LLMs without PagedAttention
📚A curated list of Awesome LLM/VLM Inference Papers with Codes: Flash-Attention, Paged-Attention, WINT8/4, Parallelism, etc.🎉
TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. Tensor…
LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.
A high-throughput and memory-efficient inference and serving engine for LLMs
SCQL (Secure Collaborative Query Language) is a system that allows multiple distrusting parties to run joint analysis without revealing their private data.
Running large language models on a single GPU for throughput-oriented scenarios.
High-Resolution Image Synthesis with Latent Diffusion Models
Simple samples for TensorRT programming
Kernl lets you run PyTorch transformer models several times faster on GPU with a single line of code, and is designed to be easily hackable.
🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch.
Synthesizer for optimal collective communication algorithms
Transformer-related optimization, including BERT and GPT
Development repository for the Triton language and compiler
OneFlow is a deep learning framework designed to be user-friendly, scalable and efficient.
A baseline repository of Auto-Parallelism in Training Neural Networks
XGo is the first AI-native programming language that integrates software engineering into a unified whole. Our vision is to enable everyone to become a builder of the world.
Kubernetes-native Deep Learning Framework
Training and serving large-scale neural networks with auto parallelization.
Flexible and powerful tensor operations for readable and reliable code (for PyTorch, JAX, TensorFlow, and others)