Zhejiang University
Shanghai, China
Stars
🚀 A very efficient Texas Hold'em GTO solver
Implement a ChatGPT-like LLM in PyTorch from scratch, step by step
📚 LeetCUDA: modern CUDA learning notes with PyTorch for beginners; 200+ CUDA kernels, Tensor Cores, HGEMM, FA-2 MMA. 🎉
DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling
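As a rough illustration of what "fine-grained scaling" means here, the sketch below quantizes each 128-element block of a tensor's last dimension with its own scale, so local magnitudes fill FP8's narrow dynamic range. The function name and block size are illustrative assumptions; DeepGEMM's actual kernels fuse this with the GEMM on-device.

```python
import torch

FP8_MAX = 448.0  # largest finite value of torch.float8_e4m3fn

def quantize_per_block(x: torch.Tensor, block: int = 128):
    # Hypothetical sketch of fine-grained (per-block) FP8 scaling.
    # Assumes x is contiguous and its last dim is a multiple of `block`.
    xb = x.view(*x.shape[:-1], -1, block)             # split into blocks
    scales = xb.abs().amax(dim=-1, keepdim=True) / FP8_MAX
    q = (xb / scales).to(torch.float8_e4m3fn)         # quantize per block
    return q, scales                                  # keep scales to dequantize
```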
Shared Middle-Layer for Triton Compilation
Enables on-the-fly manipulation of the LLVM IR of CUDA sources
LLM notes covering model inference, Transformer model structure, and LLM framework code analysis.
Efficient Triton Kernels for LLM Training
Implementation of 1D, 2D, and 3D FFT convolutions in PyTorch. Much faster than direct convolutions for large kernel sizes.
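The speed claim follows from complexity: direct convolution of an n-sample signal with a k-tap kernel costs O(n·k), while the FFT route costs O(n log n) regardless of kernel size. A minimal 1D sketch of the idea using torch.fft (not the repo's actual implementation; note it computes true convolution, whereas torch.nn.functional.conv1d is cross-correlation):

```python
import torch

def fft_conv1d(signal: torch.Tensor, kernel: torch.Tensor) -> torch.Tensor:
    # Full linear convolution via the convolution theorem:
    # pad both inputs to the output length, multiply spectra, invert.
    n = signal.shape[-1] + kernel.shape[-1] - 1
    s = torch.fft.rfft(signal, n=n)
    k = torch.fft.rfft(kernel, n=n)
    return torch.fft.irfft(s * k, n=n)

x = torch.randn(100_000)
w = torch.randn(4_096)
y = fft_conv1d(x, w)  # O(n log n) vs O(n*k) for the direct method
```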
A self-study tutorial for CUDA high-performance programming.
Chinese translation, slides, and labs for Professor Eijkhout's Introduction to HPC.
Make your Vim more powerful and much easier to use. The most practical Vim configuration. 🔥
CUDA Templates and Python DSLs for High-Performance Linear Algebra
How to optimize common algorithms in CUDA.
FlagGems is an operator library for large language models implemented in the Triton Language.
The latest activation crack for Typora: activate in three steps. 😊 Continuously updated / 👩🎓 a must-have for students; if you are able to support the genuine version, please don't use this. 🔞🈲 Activate Typora
Development repository for the Triton language and compiler
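For readers new to Triton, the canonical first kernel is a blocked, masked vector add: each program instance handles one BLOCK_SIZE tile, with a mask guarding the ragged tail. A minimal sketch along the lines of the official tutorial:

```python
import torch
import triton
import triton.language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    pid = tl.program_id(axis=0)                       # which tile am I?
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements                       # guard the ragged tail
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)

def add(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    out = torch.empty_like(x)
    n = x.numel()
    grid = (triton.cdiv(n, 1024),)                    # one program per tile
    add_kernel[grid](x, y, out, n, BLOCK_SIZE=1024)
    return out
```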
AISystem mainly refers to AI systems, covering full-stack low-level AI technologies such as AI chips, AI compilers, and AI inference and training frameworks.
The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
We want to create a repo to illustrate the usage of Transformers, in Chinese.
Code for the paper "Language Models are Unsupervised Multitask Learners"
Video+code lecture on building nanoGPT from scratch
An annotated implementation of the Transformer paper.
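The core shared by that paper, GPT-2, and nanoGPT is scaled dot-product attention: softmax(QKᵀ/√d_k)V. A minimal PyTorch sketch (the function name and masking convention here are my own, not copied from the Annotated Transformer's code):

```python
import math
import torch

def attention(q, k, v, mask=None):
    # softmax(Q K^T / sqrt(d_k)) V, from "Attention Is All You Need".
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)
    if mask is not None:
        # e.g. a causal mask for GPT-style autoregressive decoding
        scores = scores.masked_fill(mask == 0, float("-inf"))
    weights = torch.softmax(scores, dim=-1)
    return weights @ v
```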