-
llm.c Public
Forked from karpathy/llm.c
LLM training in simple, raw C/CUDA
-
cuda-samples Public
Forked from NVIDIA/cuda-samples
Samples for CUDA Developers which demonstrates features in CUDA Toolkit
C Other Updated Aug 6, 2025 -
cutlass Public
Forked from NVIDIA/cutlass
CUDA Templates for Linear Algebra Subroutines
C++ Other Updated Aug 1, 2025 -
multi-gpu-programming-models Public
Forked from NVIDIA/multi-gpu-programming-models
Examples demonstrating available options to program multiple GPUs in a single node or a cluster
Cuda BSD 3-Clause "New" or "Revised" License Updated Jun 16, 2025 -
pplx-kernels Public
Forked from perplexityai/pplx-kernels
Perplexity GPU Kernels
-
DeeperGEMM Public
Forked from deepseek-ai/DeepGEMM
DeeperGEMM: crazy optimized version
-
fast.cu Public
Forked from pranjalssh/fast.cu
Fastest kernels written from scratch
Cuda MIT License Updated Feb 15, 2025 -
lectures Public
Forked from gpu-mode/lectures
Material for gpu-mode lectures
Jupyter Notebook Apache License 2.0 Updated Nov 23, 2024 -
nanoGPT Public
Forked from karpathy/nanoGPT
The simplest, fastest repository for training/finetuning medium-sized GPTs.
Python MIT License Updated Mar 24, 2024