Stars
Instant neural graphics primitives: lightning fast NeRF and more
A massively parallel, optimal functional runtime in Rust
DeepEP: an efficient expert-parallel communication library
DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling
CUDA accelerated rasterization of gaussian splatting
Tile primitives for speedy kernels
[ICLR2025, ICML2025, NeurIPS2025 Spotlight] Quantized Attention achieves speedup of 2-5x compared to FlashAttention, without losing end-to-end metrics across language, image, and video models.
[ARCHIVED] Cooperative primitives for CUDA C++. See https://github.com/NVIDIA/cccl
[MICRO'23, MLSys'22] TorchSparse: Efficient Training and Inference Framework for Sparse Convolution on GPUs.
FSA/FST algorithms, differentiable, with PyTorch compatibility.
[CVPR 2023 Highlight] Neural Kernel Surface Reconstruction
[ICML2025] SpargeAttention: A training-free sparse attention that accelerates any model inference.
Causal depthwise conv1d in CUDA, with a PyTorch interface
Original implementation of "Radiant Foam: Real-Time Differentiable Ray Tracing"
bycloudai / instant-ngp-Windows
Forked from NVlabs/instant-ngp
Instant neural graphics primitives: lightning fast NeRF and more
State of the art sorting and segmented sorting, including OneSweep. Implemented in CUDA, D3D12, and Unity style compute shaders. Theoretically portable to all wave/warp/subgroup sizes.
Differentiable Iso-Surface Extraction Package (DISO)
Differentiable gaussian rasterization with depth, alpha, normal map, and extra per-Gaussian attributes; also supports camera pose gradients
3DGS-LM accelerates Gaussian-Splatting optimization by replacing the ADAM optimizer with Levenberg-Marquardt. (ICCV 2025)
Marching cubes implementation for PyTorch environment.
A modular differential gaussian rasterization library.
[ECCV'24] On the Error Analysis of 3D Gaussian Splatting and an Optimal Projection Strategy
A differentiable rasterizer used in the project "2D Gaussian Splatting"
Official code release for "Efficient Perspective-Correct 3D Gaussian Splatting Using Hybrid Transparency"