41 stars written in Cuda

LLM training in simple, raw C/CUDA

Cuda 28,171 3,290 Updated Jun 26, 2025

Instant neural graphics primitives: lightning fast NeRF and more

Cuda 17,068 2,027 Updated Oct 8, 2025

A massively parallel, optimal functional runtime in Rust

Cuda 11,153 426 Updated Nov 21, 2024

DeepEP: an efficient expert-parallel communication library

Cuda 8,722 991 Updated Nov 6, 2025

DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling

Cuda 5,888 745 Updated Nov 14, 2025

CUDA accelerated rasterization of gaussian splatting

Cuda 3,957 607 Updated Oct 2, 2025

Tile primitives for speedy kernels

Cuda 2,910 197 Updated Nov 15, 2025

[ICLR2025, ICML2025, NeurIPS2025 Spotlight] Quantized Attention achieves speedup of 2-5x compared to FlashAttention, without losing end-to-end metrics across language, image, and video models.

Cuda 2,692 265 Updated Nov 6, 2025

[ARCHIVED] Cooperative primitives for CUDA C++. See https://github.com/NVIDIA/cccl

Cuda 1,805 464 Updated Oct 9, 2023

[MICRO'23, MLSys'22] TorchSparse: Efficient Training and Inference Framework for Sparse Convolution on GPUs.

Cuda 1,414 177 Updated Feb 24, 2025

FSA/FST algorithms, differentiable, with PyTorch compatibility.

Cuda 1,283 231 Updated Nov 4, 2025

[CVPR 2023 Highlight] Neural Kernel Surface Reconstruction

Cuda 892 61 Updated Sep 24, 2025
Cuda 804 88 Updated May 10, 2025

[ICML2025] SpargeAttention: A training-free sparse attention that accelerates any model inference.

Cuda 779 65 Updated Nov 14, 2025

[ICCV 2023] Official code for NeuS2

Cuda 710 51 Updated Mar 22, 2024

UNet diffusion model in pure CUDA

Cuda 653 31 Updated Jun 28, 2024

Causal depthwise conv1d in CUDA, with a PyTorch interface

Cuda 642 137 Updated Oct 20, 2025

Original implementation of "Radiant Foam: Real-Time Differentiable Ray Tracing"

Cuda 620 36 Updated May 14, 2025

Instant neural graphics primitives: lightning fast NeRF and more

Cuda 503 73 Updated Aug 14, 2022

State of the art sorting and segmented sorting, including OneSweep. Implemented in CUDA, D3D12, and Unity style compute shaders. Theoretically portable to all wave/warp/subgroup sizes.

Cuda 401 23 Updated Dec 14, 2024

Differentiable Iso-Surface Extraction Package (DISO)

Cuda 304 34 Updated May 1, 2025

Differentiable Gaussian rasterization with depth, alpha, normal map, and extra per-Gaussian attributes; also supports camera pose gradients

Cuda 293 27 Updated Oct 9, 2024

3DGS-LM accelerates Gaussian-Splatting optimization by replacing the ADAM optimizer with Levenberg-Marquardt. (ICCV 2025)

Cuda 276 13 Updated Aug 22, 2025

Marching cubes implementation for PyTorch environment.

Cuda 222 53 Updated Dec 26, 2024

A modular differential gaussian rasterization library.

Cuda 206 13 Updated Jul 28, 2024

[ECCV'24] On the Error Analysis of 3D Gaussian Splatting and an Optimal Projection Strategy

Cuda 189 19 Updated Jul 10, 2025

A differentiable rasterizer used in the project "2D Gaussian Splatting"

Cuda 174 49 Updated Nov 10, 2024
Cuda 157 9 Updated Jul 26, 2025

Official code release for "Efficient Perspective-Correct 3D Gaussian Splatting Using Hybrid Transparency"

Cuda 132 11 Updated Oct 16, 2025