Stars
Calculate tokens/s & GPU memory requirements for any LLM. Supports llama.cpp/ggml/bnb/QLoRA quantization (a rough memory-estimate sketch follows this list)
KernelBench: Can LLMs Write GPU Kernels? - Benchmark with Torch -> CUDA problems
Genai-bench is a powerful benchmark tool designed for comprehensive token-level performance evaluation of large language model (LLM) serving systems.
A single-file educational implementation for understanding vLLM's core concepts and running LLM inference.
🤗 LeRobot: Making AI for Robotics more accessible with end-to-end learning
Open Source DeepWiki: AI-Powered Wiki Generator for GitHub/Gitlab/Bitbucket Repositories. Join the discord: https://discord.gg/gMwThUMeme
My learning notes and code for ML systems (MLSys).
Efficient Triton Kernels for LLM Training
Supercharge Your LLM with the Fastest KV Cache Layer
vLLM’s reference system for K8S-native cluster-wide deployment with community-driven performance optimization
Minimalistic 4D-parallelism distributed training framework for educational purposes
Model Context Protocol Servers
LLM knowledge sharing that anyone can understand; essential reading before spring/autumn campus-recruitment LLM interviews, so you can hold your own with the interviewer
This is the official repository for The Hundred-Page Language Models Book by Andriy Burkov
The Python code to reproduce the illustrations from The Hundred-Page Machine Learning Book.
Fully open reproduction of DeepSeek-R1
Interactive deep learning book with multi-framework code, math, and discussions. Adopted at 500 universities from 70 countries including Stanford, MIT, Harvard, and Cambridge.
Python code for the "Probabilistic Machine Learning" book by Kevin Murphy
Transformer Explained Visually: Learn How LLM Transformer Models Work with Interactive Visualization
Coding a Multimodal (Vision) Language Model from scratch in PyTorch with full explanation: https://www.youtube.com/watch?v=vAmKB7iPkWw
🔍 An LLM-based multi-agent framework for a web search engine (like Perplexity.ai Pro and SearchGPT)
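
For the GPU-memory calculator entry above, here is a minimal back-of-envelope sketch of the kind of estimate such a tool performs: weight memory ≈ parameter count × bytes per parameter for a chosen quantization. This is not that project's actual code; the names, table of byte widths, and example model size are illustrative assumptions, and it ignores KV cache and activation memory.

```python
# Rough GPU weight-memory estimate per quantization level (illustrative sketch,
# not the calculator's real implementation). Ignores KV cache and activations.
BYTES_PER_PARAM = {
    "fp16": 2.0,  # 16-bit weights
    "int8": 1.0,  # 8-bit quantization (e.g. bitsandbytes)
    "q4": 0.5,    # 4-bit quantization (e.g. QLoRA / llama.cpp Q4)
}

def weight_memory_gib(num_params_billion: float, quant: str = "fp16") -> float:
    """GPU memory needed just for model weights, in GiB."""
    total_bytes = num_params_billion * 1e9 * BYTES_PER_PARAM[quant]
    return total_bytes / (1024 ** 3)

if __name__ == "__main__":
    # Example: a hypothetical 7B-parameter model at different quantization levels.
    for q in ("fp16", "int8", "q4"):
        print(f"7B @ {q}: {weight_memory_gib(7, q):.1f} GiB")
```

Running this prints roughly 13 GiB for fp16, 6.5 GiB for int8, and 3.3 GiB for 4-bit, which matches the usual rule of thumb that 4-bit quantization cuts weight memory to about a quarter of fp16.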