Stars
Quantized Attention achieves 2-5x and 3-11x speedups over FlashAttention and xformers, respectively, without losing end-to-end metrics across language, image, and video models.
verl: Volcano Engine Reinforcement Learning for LLMs
Allow torch tensor memory to be released and resumed later
Examples of CUDA implementations using CUTLASS CuTe
Implementation of Flash Attention using CuTe.
Distributed Compiler based on Triton for Parallel Systems
A bidirectional pipeline parallelism algorithm for computation-communication overlap in V3/R1 training.
DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling
DeepEP: an efficient expert-parallel communication library
Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation
This is the homepage of a new book entitled "Mathematical Foundations of Reinforcement Learning."
Tensors and Dynamic neural networks in Python with strong GPU acceleration
A minimal GPU design in Verilog to learn how GPUs work from the ground up
This project aims to reproduce Sora (OpenAI's T2V model); we welcome contributions from the open-source community.
Modeling, training, eval, and inference code for OLMo
Development repository for the Triton language and compiler
High-speed Large Language Model Serving for Local Deployment
Data preparation code for CrystalCoder 7B LLM
Pre-training code for CrystalCoder 7B LLM
Data processing for and with foundation models! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal domains, for both inference and training.
BladeDISC is an end-to-end DynamIc Shape Compiler project for machine learning workloads.
DLRover: An Automatic Distributed Deep Learning System
DeepRec is a high-performance recommendation deep learning framework based on TensorFlow. It is hosted in incubation in LF AI & Data Foundation.