Stars
DocuSnap frontend built in Android Studio
A curated reading list for machine learning reliability research and practice
Mirage: Automatically Generating Fast GPU Kernels without Programming in Triton/CUDA
Artifact Evaluation Scripts and Workloads for TrainCheck (OSDI'25)
A Framework for Automated Validation of Deep Learning Training Tasks
ByteCheckpoint: A Unified Checkpointing Library for LFMs
JaxPP is a library for JAX that enables flexible MPMD pipeline parallelism for large-scale LLM training
Super-Efficient RLHF Training of LLMs with Parameter Reallocation
Collective communications library with various primitives for multi-machine training.
Disseminated, Distributed OS for Hardware Resource Disaggregation. USENIX OSDI 2018 Best Paper.
Qwen3 is the large language model series developed by the Qwen team, Alibaba Cloud.
You call this piece of junk operating system source code: reading the Linux 0.11 core code like a novel
Tile primitives for speedy kernels
DeepSeek-V3/R1 inference performance simulator
VIP cheatsheet for Stanford's CME 295 Transformers and Large Language Models
Create beautiful diagrams just by typing notation in plain text.
NEO is an LLM inference engine built to ease the GPU memory crisis through CPU offloading
A Datacenter Scale Distributed Inference Serving Framework
Automated Testing and Adaptive Detection of Slow Faults in Distributed Systems
Must-read papers on improving efficiency for LLM serving clusters
📰 Must-read papers and blogs on LLM-based Long Context Modeling 🔥
📰 Must-read papers on KV Cache Compression (constantly updating 🤗).
Domain-specific language designed to streamline the development of high-performance GPU/CPU/accelerator kernels
verl: Volcano Engine Reinforcement Learning for LLMs