amulil (status: Focusing)

Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM

Python 2,115 263 Updated Oct 20, 2025

fmchisel: Efficient Compression and Training Algorithms for Foundation Models

Python 66 7 Updated Oct 9, 2025

Introduction to Machine Learning Systems

Python 3,634 402 Updated Oct 20, 2025
Python 41 63 Updated Mar 14, 2025

Nano vLLM

Python 7,129 912 Updated Aug 31, 2025

FlexAttention based, minimal vllm-style inference engine for fast Gemma 2 inference.

Python 298 16 Updated Aug 7, 2025

Renderer for the harmony response format to be used with gpt-oss

Rust 3,911 214 Updated Aug 15, 2025

Mirage Persistent Kernel: Compiling LLMs into a MegaKernel

C++ 1,893 139 Updated Oct 20, 2025

To record is already an act of resistance.

1 Updated Aug 3, 2025

Calculate token/s & GPU memory requirement for any LLM. Supports llama.cpp/ggml/bnb/QLoRA quantization

JavaScript 1,375 84 Updated Dec 3, 2024

KernelBench: Can LLMs Write GPU Kernels? - Benchmark with Torch -> CUDA problems

Python 616 76 Updated Oct 10, 2025

Genai-bench is a powerful benchmark tool designed for comprehensive token-level performance evaluation of large language model (LLM) serving systems.

Python 220 26 Updated Oct 20, 2025

Learn everything about AI Infra.

1 Updated Jun 21, 2025

A single-file educational implementation for understanding vLLM's core concepts and running LLM inference.

Python 23 3 Updated Jun 22, 2025
Python 97 9 Updated Sep 13, 2025

🤗 LeRobot: Making AI for Robotics more accessible with end-to-end learning

Python 18,268 2,797 Updated Oct 20, 2025

Open Source DeepWiki: AI-Powered Wiki Generator for GitHub/Gitlab/Bitbucket Repositories. Join the discord: https://discord.gg/gMwThUMeme

Python 11,304 1,200 Updated Oct 11, 2025

My learning notes/codes for ML SYS.

Python 3,910 234 Updated Oct 6, 2025

Efficient Triton Kernels for LLM Training

Python 5,756 418 Updated Oct 20, 2025

Material for gpu-mode lectures

Jupyter Notebook 5,187 518 Updated Sep 23, 2025

Supercharge Your LLM with the Fastest KV Cache Layer

Python 5,626 651 Updated Oct 20, 2025

vLLM’s reference system for K8S-native cluster-wide deployment with community-driven performance optimization

Python 1,862 309 Updated Oct 13, 2025

Minimalistic 4D-parallelism distributed training framework for educational purposes

Python 1,862 138 Updated Aug 26, 2025

Model Context Protocol Servers

TypeScript 70,862 8,457 Updated Oct 20, 2025

Learn how to use GPUs to accelerate deep learning.

2 Updated Jun 18, 2025

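Large-model knowledge sharing that everyone can understand: a must-read before LLM spring/autumn recruitment interviews, so you can talk confidently with your interviewer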

Jupyter Notebook 4,536 443 Updated Oct 13, 2025

This is the official repository for The Hundred-Page Language Models Book by Andriy Burkov

Jupyter Notebook 1,954 322 Updated May 21, 2025

The Python code to reproduce the illustrations from The Hundred-Page Machine Learning Book.

Python 1,988 587 Updated Jun 27, 2024

Fully open reproduction of DeepSeek-R1

Python 25,560 2,396 Updated Sep 8, 2025