xinhaoc

30 results for source starred repositories
A lightweight design for computation-communication overlap.

Cuda 185 8 Updated Oct 10, 2025

A Datacenter Scale Distributed Inference Serving Framework

Rust 5,491 693 Updated Nov 18, 2025

A curated collection of resources, tutorials, and best practices for learning and mastering NVIDIA CUTLASS

244 12 Updated May 6, 2025

FlashInfer: Kernel Library for LLM Serving

Cuda 4,088 569 Updated Nov 18, 2025

FlashMLA: Efficient Multi-head Latent Attention Kernels

C++ 11,867 903 Updated Sep 30, 2025

Makefile tutorial

HTML 296 35 Updated Mar 4, 2024

GitHub mirror of the triton-lang/triton repo.

MLIR 98 23 Updated Nov 18, 2025

FlexFlow Serve: Low-Latency, High-Performance LLM Serving

C++ 63 6 Updated Sep 15, 2025

Multi-Faceted AI Agent and Workflow Autotuning. Automatically optimizes LangChain, LangGraph, DSPy programs for better quality, lower execution latency, and lower execution cost. Also has a simple …

Python 261 31 Updated May 16, 2025

Translation of C++ Core Guidelines [https://github.com/isocpp/CppCoreGuidelines] into Simplified Chinese.

2,427 334 Updated Sep 22, 2025

Mirage Persistent Kernel: Compiling LLMs into a MegaKernel

C++ 1,951 153 Updated Nov 18, 2025

Make a personal website using Notion and GitHub Pages

Shell 143 66 Updated Oct 27, 2023

CUDA Templates and Python DSLs for High-Performance Linear Algebra

C++ 8,806 1,530 Updated Nov 15, 2025

An Attention Superoptimizer

C++ 22 Updated Jan 20, 2025

MLX: An array framework for Apple silicon

C++ 22,839 1,392 Updated Nov 18, 2025

Paper collection on retrieval-based (augmented) language models.

232 12 Updated May 24, 2024

Papers and their code for AI systems

337 23 Updated Aug 15, 2025

Universal cross-platform tokenizers binding to HF and sentencepiece

C++ 418 99 Updated Aug 8, 2025

Automatically Discovering Fast Parallelization Strategies for Distributed Deep Neural Network Training

C++ 1,842 245 Updated Nov 17, 2025

The repository provides code for running inference with the SegmentAnything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.

Jupyter Notebook 52,523 6,143 Updated Sep 18, 2024

📝 My blog / notes

246 34 Updated Sep 22, 2022

Quick, visual, principled introduction to PyTorch code through five Colab notebooks.

Jupyter Notebook 448 70 Updated Jan 13, 2025

A repo illustrating the usage of transformers in Chinese

Shell 3,026 494 Updated Aug 18, 2024

C++-based high-performance parallel environment execution engine (vectorized env) for general RL environments.

C++ 1,207 117 Updated Aug 12, 2024

A curated list of awesome READMEs

20,044 3,916 Updated Nov 16, 2025

A collection of full time roles in SWE, Quant, and PM for new grads.

15,652 1,227 Updated Nov 18, 2025