Stars
4 stars written in Cuda
Flash Attention in ~100 lines of CUDA (forward pass only)
Reference implementation of Megalodon 7B model
Lightweight Llama 3 8B Inference Engine in CUDA C