mosure
41 stars written in Cuda

LLM training in simple, raw C/CUDA

Cuda 28,164 3,289 Updated Jun 26, 2025

Instant neural graphics primitives: lightning fast NeRF and more

Cuda 17,066 2,027 Updated Oct 8, 2025

A massively parallel, optimal functional runtime in Rust

Cuda 11,154 426 Updated Nov 21, 2024

DeepEP: an efficient expert-parallel communication library

Cuda 8,721 988 Updated Nov 6, 2025

DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling

Cuda 5,887 744 Updated Nov 14, 2025

CUDA accelerated rasterization of gaussian splatting

Cuda 3,954 603 Updated Oct 2, 2025

Tile primitives for speedy kernels

Cuda 2,909 196 Updated Nov 15, 2025

[ICLR2025, ICML2025, NeurIPS2025 Spotlight] Quantized Attention achieves speedup of 2-5x compared to FlashAttention, without losing end-to-end metrics across language, image, and video models.

Cuda 2,687 263 Updated Nov 6, 2025

[ARCHIVED] Cooperative primitives for CUDA C++. See https://github.com/NVIDIA/cccl

Cuda 1,803 463 Updated Oct 9, 2023

[MICRO'23, MLSys'22] TorchSparse: Efficient Training and Inference Framework for Sparse Convolution on GPUs.

Cuda 1,413 177 Updated Feb 24, 2025

FSA/FST algorithms, differentiable, with PyTorch compatibility.

Cuda 1,282 231 Updated Nov 4, 2025

[CVPR 2023 Highlight] Neural Kernel Surface Reconstruction

Cuda 892 61 Updated Sep 24, 2025

Cuda 803 88 Updated May 10, 2025

[ICML2025] SpargeAttention: A training-free sparse attention that accelerates any model inference.

Cuda 774 65 Updated Nov 14, 2025

[ICCV 2023] Official code for NeuS2

Cuda 710 51 Updated Mar 22, 2024

UNet diffusion model in pure CUDA

Cuda 654 31 Updated Jun 28, 2024

Causal depthwise conv1d in CUDA, with a PyTorch interface

Cuda 642 136 Updated Oct 20, 2025

Original implementation of "Radiant Foam: Real-Time Differentiable Ray Tracing"

Cuda 620 36 Updated May 14, 2025

Instant neural graphics primitives: lightning fast NeRF and more

Cuda 503 73 Updated Aug 14, 2022

State of the art sorting and segmented sorting, including OneSweep. Implemented in CUDA, D3D12, and Unity style compute shaders. Theoretically portable to all wave/warp/subgroup sizes.

Cuda 401 22 Updated Dec 14, 2024

Differentiable Iso-Surface Extraction Package (DISO)

Cuda 304 34 Updated May 1, 2025

Differentiable Gaussian rasterization with depth, alpha, normal map, and extra per-Gaussian attributes; also supports camera pose gradient

Cuda 293 27 Updated Oct 9, 2024

3DGS-LM accelerates Gaussian-Splatting optimization by replacing the ADAM optimizer with Levenberg-Marquardt. (ICCV 2025)

Cuda 276 13 Updated Aug 22, 2025

Marching cubes implementation for PyTorch environment.

Cuda 222 53 Updated Dec 26, 2024

A modular differential gaussian rasterization library.

Cuda 206 13 Updated Jul 28, 2024

[ECCV'24] On the Error Analysis of 3D Gaussian Splatting and an Optimal Projection Strategy

Cuda 189 19 Updated Jul 10, 2025

A differentiable rasterizer used in the project "2D Gaussian Splatting"

Cuda 174 49 Updated Nov 10, 2024

Cuda 157 9 Updated Jul 26, 2025

Official code release for "Efficient Perspective-Correct 3D Gaussian Splatting Using Hybrid Transparency"

Cuda 131 10 Updated Oct 16, 2025