Stars
Share interesting, entry-level open source projects on GitHub.
MemOS (Preview) | Intelligence Begins with Memory
A high-throughput and memory-efficient inference and serving engine for LLMs
Distributed Compiler based on Triton for Parallel Systems
Efficient Triton Kernels for LLM Training
Run your own AI cluster at home with everyday devices 📱💻 🖥️⌚
Open-source implementation for "Helix: Serving Large Language Models over Heterogeneous GPUs and Network via Max-Flow"
🌸 Run LLMs at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading
Development repository for the Triton language and compiler
A collection of memory efficient attention operators implemented in the Triton language.
Flash attention tutorials written in Python, Triton, CUDA, and CUTLASS
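The two entries above both center on flash-attention-style kernels. Not part of the original descriptions, but as context for what "memory efficient attention" means: a minimal NumPy sketch of the online-softmax idea, which processes K/V in tiles and keeps a running max and normalizer per query row so the full N×N attention matrix is never materialized. Function names and the tile size are illustrative, not taken from any of the listed repos.

```python
import numpy as np

def tiled_attention(Q, K, V, tile=16):
    """Attention via online softmax: K/V are consumed tile by tile."""
    N, d = Q.shape
    out = np.zeros((N, d))
    m = np.full(N, -np.inf)          # running row max of the scores
    l = np.zeros(N)                  # running softmax normalizer
    scale = 1.0 / np.sqrt(d)
    for start in range(0, K.shape[0], tile):
        Kt = K[start:start + tile]
        Vt = V[start:start + tile]
        S = (Q @ Kt.T) * scale                 # scores for this tile only
        m_new = np.maximum(m, S.max(axis=1))   # updated row max
        p = np.exp(S - m_new[:, None])         # tile probabilities, shifted
        correction = np.exp(m - m_new)         # rescale earlier partial sums
        l = l * correction + p.sum(axis=1)
        out = out * correction[:, None] + p @ Vt
        m = m_new
    return out / l[:, None]

def attention(Q, K, V):
    """Reference: ordinary attention that materializes the full score matrix."""
    S = (Q @ K.T) / np.sqrt(Q.shape[1])
    P = np.exp(S - S.max(axis=1, keepdims=True))
    P /= P.sum(axis=1, keepdims=True)
    return P @ V
```

Both functions return the same result; the tiled version only ever holds an N×tile slice of scores, which is the memory saving these kernels exploit (the real kernels additionally fuse the tiles into on-chip SRAM).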
LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.
A series of GPU optimization topics introducing in detail how to optimize CUDA kernels, covering several basic kernel optimizations, including elementwise, reduce, s…
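The "reduce" optimization mentioned above refers to the tree-style parallel reduction used in CUDA kernels. As an illustration (not code from the listed repo), here is a pure-Python sketch in which each loop index `i` plays the role of one thread in a block, halving the active stride each step just as a shared-memory block reduce does:

```python
def block_reduce(values):
    """Sum a list with the log-step pairwise pattern of a CUDA block reduce."""
    vals = list(values)          # stands in for shared memory
    n = len(vals)
    stride = 1
    while stride < n:            # round the stride up to a power of two,
        stride *= 2              # as kernels do for non-power-of-two sizes
    stride //= 2
    while stride > 0:
        for i in range(stride):  # each i stands in for one active thread
            if i + stride < n:   # bounds check, like the in-kernel guard
                vals[i] += vals[i + stride]
        n = stride               # only the first half stays active
        stride //= 2
    return vals[0]
```

On the GPU the inner loop runs in parallel across threads with a `__syncthreads()` barrier between strides; the sequential version above shows only the access pattern, which is what the optimization series analyzes.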
Machine Learning Engineering Open Book
📚A curated list of Awesome LLM/VLM Inference Papers with Codes: Flash-Attention, Paged-Attention, WINT8/4, Parallelism, etc.🎉
📚LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners🐑, 200+ CUDA Kernels, Tensor Cores, HGEMM, FA-2 MMA.🎉
A high-performance distributed file system designed to address the challenges of AI training and inference workloads.
A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations
Fast inference engine for Transformer models
Connect home devices into a powerful cluster to accelerate LLM inference. More devices mean faster inference.
High-speed Large Language Model Serving for Local Deployment
Awesome LLMs on Device: A Comprehensive Survey
An introductory tutorial on recommender systems; read online at: https://datawhalechina.github.io/fun-rec/