ziming-zh

Highlights

  • Pro

Organizations

@JI-DeepSleep


Simple and efficient PyTorch-native transformer text generation in <1000 LOC of Python.

Python 6,123 566 Updated Aug 22, 2025

A Python-embedded DSL that makes it easy to write fast, scalable ML kernels with minimal boilerplate.

Python 365 39 Updated Oct 16, 2025

A verification tool for ensuring parallelization equivalence in distributed model training.

Python 10 1 Updated Sep 1, 2025

A CPU+GPU Profiling library that provides access to timeline traces and hardware performance counters.

HTML 881 207 Updated Oct 11, 2025

Representation and Reference Lowering of ONNX Models in MLIR Compiler Infrastructure

C++ 921 372 Updated Oct 16, 2025

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 60,270 10,599 Updated Oct 16, 2025

Dynamic Instrumentation Tool Platform

C 2,917 598 Updated Oct 16, 2025

gpt-oss-120b and gpt-oss-20b are two open-weight language models by OpenAI

Python 18,838 1,847 Updated Oct 6, 2025

Performance and Detection Benchmarks for TrainCheck (https://github.com/OrderLab/TrainCheck)

Python 3 Updated Aug 1, 2025

DocuSnap: Your AI-powered Personal Document Assistant.

5 Updated Aug 5, 2025

DocuSnap frontend built in Android Studio

Kotlin 3 Updated Jul 29, 2025

A toolchain for distributed system runtime checkers

Java 8 1 Updated May 20, 2025

Ultra and Unified CCL

C++ 590 49 Updated Oct 16, 2025

A curated reading list for machine learning reliability research and practice

28 2 Updated Sep 18, 2025

Mirage Persistent Kernel: Compiling LLMs into a MegaKernel

C++ 1,885 137 Updated Oct 16, 2025

Automatic AI-powered test suite generator

Python 92 13 Updated Jul 31, 2025

Artifact Evaluation Scripts and Workloads for TrainCheck (OSDI'25)

Python 2 Updated May 22, 2025

A Framework for Automated Validation of Deep Learning Training Tasks

Python 52 4 Updated Sep 30, 2025

ByteCheckpoint: A Unified Checkpointing Library for LFMs

Python 249 17 Updated Jul 10, 2025

JaxPP is a library for JAX that enables flexible MPMD pipeline parallelism for large-scale LLM training

Python 54 1 Updated Oct 13, 2025

Super-Efficient RLHF Training of LLMs with Parameter Reallocation

Python 320 20 Updated Apr 24, 2025

Collective communications library with various primitives for multi-machine training.

C++ 1,362 338 Updated Sep 12, 2025

Disseminated, Distributed OS for Hardware Resource Disaggregation. USENIX OSDI 2018 Best Paper.

C 493 75 Updated May 6, 2021

Qwen3 is the large language model series developed by Qwen team, Alibaba Cloud.

Python 25,027 1,746 Updated Oct 13, 2025

You call this piece of junk operating system source code? Reading the Linux 0.11 core code like a novel

HTML 21,514 2,874 Updated Mar 22, 2025

Tile primitives for speedy kernels

Cuda 2,820 190 Updated Oct 12, 2025

DeepSeek-V3/R1 inference performance simulator

Jupyter Notebook 169 22 Updated Mar 27, 2025

VIP cheatsheet for Stanford's CME 295 Transformers and Large Language Models

3,185 427 Updated Jul 27, 2025

Create beautiful diagrams just by typing notation in plain text.

TypeScript 7,849 356 Updated Sep 30, 2025