🤯
Computing

Organizations

@FDPS @rioyokotalab @mori-lab @rapidsai @wmmae @hpc-wakate


GEMMul8 (GEMMulate): GEMM emulation using int8 matrix engines based on the Ozaki Scheme II

C++ 27 6 Updated Oct 12, 2025

collection of articles about PhD life written in 🇯🇵

307 7 Updated Aug 3, 2025

The book "Performance Analysis and Tuning on Modern CPU"

TeX 3,338 230 Updated Jun 9, 2025

LLM training in simple, raw C/CUDA

Cuda 27,866 3,232 Updated Jun 26, 2025

The official Vim repository

Vim Script 39,104 5,841 Updated Oct 17, 2025

A K-SVD implementation written in Python.

Python 113 23 Updated Dec 26, 2022

Itoyori: A distributed multi-threading runtime system for global-view fork-join task parallelism

C++ 22 2 Updated Feb 9, 2024

A lightweight TUI (ncurses-like) display manager for Linux and BSD (mirror of https://codeberg.org/fairyglade/ly).

Zig 6,520 336 Updated Oct 15, 2025

int8_t and int16_t matrix multiply based on https://arxiv.org/abs/1705.01991

C++ 74 24 Updated Dec 30, 2023

The fastest and most memory efficient lattice Boltzmann CFD software, running on all GPUs and CPUs via OpenCL. Free for non-commercial use.

C++ 4,708 423 Updated Oct 14, 2025

Synchronize your working directory efficiently to a remote place without committing the changes.

Go 75 11 Updated Nov 7, 2022

GPTPU for SC 2021

C++ 52 9 Updated Mar 22, 2023

Repository for nvCOMP docs and examples. nvCOMP is a library for fast lossless compression/decompression on the GPU that can be downloaded from https://developer.nvidia.com/nvcomp.

C++ 597 89 Updated Sep 11, 2024

stdgpu: Efficient STL-like Data Structures on the GPU

C++ 1,233 91 Updated Oct 14, 2025

Linux Kernel for Surface Devices

Shell 6,345 270 Updated Oct 11, 2025

A single-header C++ library for simplifying the use of CUDA Runtime Compilation (NVRTC).

C++ 557 74 Updated Sep 15, 2025

Dear ImGui: Bloat-free Graphical User interface for C++ with minimal dependencies

C++ 68,780 11,255 Updated Oct 16, 2025

Templight is a Clang-based tool to profile the time and memory consumption of template instantiations and to perform interactive debugging sessions to gain introspection into the template instantia…

C++ 783 41 Updated Dec 7, 2024

GPGPU-Sim provides a detailed simulation model of contemporary NVIDIA GPUs running CUDA and/or OpenCL workloads. It includes support for features such as TensorCores and CUDA Dynamic Parallelism as…

C++ 1,460 582 Updated Feb 15, 2025

Test suite for probing the numerical behavior of NVIDIA tensor cores

Cuda 41 13 Updated Jul 24, 2024

Important concepts in numerical linear algebra and related areas

780 67 Updated Jan 13, 2024

A massively-parallel, block-sparse tensor framework written in C++

C++ 309 58 Updated Oct 3, 2025

Parallel Library for Tensor Network Methods

C++ 32 8 Updated Aug 22, 2023

⚡ Dark powered Vim/Neovim plugin manager

Vim Script 3,445 195 Updated Sep 13, 2025

gpuprec: Extended-Precision Libraries on GPUs

Cuda 38 7 Updated Jan 9, 2016

Crow is a very fast and easy-to-use C++ micro web framework (inspired by Python Flask)

C++ 7,611 886 Updated Jun 6, 2024

A compact split ortholinear keyboard.

Python 931 181 Updated Nov 20, 2022

Binary Neural Network Framework for FPGA (Differentiable LUT)

C++ 162 22 Updated Aug 12, 2025

rust-cuda working group

65 7 Updated Jun 12, 2019