Stars
Simple and efficient pytorch-native transformer text generation in <1000 LOC of python.
A Python-embedded DSL that makes it easy to write fast, scalable ML kernels with minimal boilerplate.
A verification tool for ensuring parallelization equivalence in distributed model training.
A CPU+GPU profiling library that provides access to timeline traces and hardware performance counters.
Representation and Reference Lowering of ONNX Models in MLIR Compiler Infrastructure
A high-throughput and memory-efficient inference and serving engine for LLMs
gpt-oss-120b and gpt-oss-20b are two open-weight language models by OpenAI
Performance and Detection Benchmarks for TrainCheck (https://github.com/OrderLab/TrainCheck)
DocuSnap: Your AI-powered Personal Document Assistant.
DocuSnap frontend built in Android Studio
A curated reading list for machine learning reliability research and practice
Mirage Persistent Kernel: Compiling LLMs into a MegaKernel
Artifact Evaluation Scripts and Workloads for TrainCheck (OSDI'25)
A Framework for Automated Validation of Deep Learning Training Tasks
ByteCheckpoint: A Unified Checkpointing Library for LFMs
JaxPP is a library for JAX that enables flexible MPMD pipeline parallelism for large-scale LLM training
Super-Efficient RLHF Training of LLMs with Parameter Reallocation
Collective communications library with various primitives for multi-machine training.
Disseminated, Distributed OS for Hardware Resource Disaggregation. USENIX OSDI 2018 Best Paper.
Qwen3 is the large language model series developed by the Qwen team at Alibaba Cloud.
Source code for "You Call This Broken Thing an Operating System?": read the core Linux 0.11 code like a novel
Tile primitives for speedy kernels
DeepSeek-V3/R1 inference performance simulator
VIP cheatsheet for Stanford's CME 295 Transformers and Large Language Models
Create beautiful diagrams just by typing notation in plain text.