
The best ChatGPT that $100 can buy.

Python 37,005 4,478 Updated Nov 17, 2025

Fast and memory-efficient exact attention

Python 20,594 2,145 Updated Nov 18, 2025

WhatsApp MCP server

Go 5,074 793 Updated Jul 13, 2025

CUDA Templates and Python DSLs for High-Performance Linear Algebra

C++ 8,807 1,530 Updated Nov 15, 2025

Fastest kernels written from scratch

Cuda 392 53 Updated Sep 18, 2025

DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling

Cuda 5,895 745 Updated Nov 14, 2025

A curated collection of resources, tutorials, and best practices for learning and mastering NVIDIA CUTLASS

244 12 Updated May 6, 2025

[CVPR 2024 Highlight] DistriFusion: Distributed Parallel Inference for High-Resolution Diffusion Models

Python 713 31 Updated Dec 2, 2024

The code powering searchthearxiv.com, a simple semantic search engine for more than 300,000 ML papers on arXiv.

Python 162 16 Updated Apr 21, 2025

Intel CPU undervolting and throttling configuration tool

C 1,029 70 Updated Aug 24, 2023

Guide to Linux undervolting for Haswell and newer Intel CPUs

390 13 Updated Apr 4, 2018

[NeurIPS 2024] Simple and Effective Masked Diffusion Language Model

Python 561 80 Updated Sep 29, 2025

Exploring Hacker News by mapping and analyzing 40 million posts and comments for fun

TypeScript 200 8 Updated May 14, 2025

A JAX research toolkit for building, editing, and visualizing neural networks.

Python 1,824 68 Updated Jun 22, 2025

LLM training in simple, raw C/CUDA

Cuda 28,182 3,293 Updated Jun 26, 2025

This repository contains integer operators on GPUs for PyTorch.

Python 222 56 Updated Sep 29, 2023

PyTorch compiler that accelerates training and inference. Get built-in optimizations for performance, memory, parallelism, and easily write your own.

Python 1,421 108 Updated Nov 18, 2025

FP16xINT4 LLM inference kernel that can achieve near-ideal ~4x speedups up to medium batch sizes of 16-32 tokens.

Python 951 79 Updated Sep 4, 2024

Open-Sora: Democratizing Efficient Video Production for All

Python 27,900 2,773 Updated Apr 30, 2025

Grok open release

Python 50,566 8,374 Updated Aug 30, 2024

Flash Attention in ~100 lines of CUDA (forward pass only)

Cuda 971 99 Updated Dec 30, 2024

Optimized Stable Diffusion modified to run on lower GPU VRAM

Jupyter Notebook 3,117 457 Updated Sep 20, 2023

AI suite powered by state-of-the-art models and providing advanced AI/AGI functions. Includes AI personas, AGI functions, world-class Beam multi-model chats, text-to-image, voice, response streamin…

TypeScript 6,689 1,550 Updated Nov 18, 2025

Stop messing around with finicky sampling parameters and just use DRµGS!

HTML 358 21 Updated Jun 1, 2024

#1 Locally hosted web application that allows you to perform various operations on PDF files

Java 70,007 5,914 Updated Nov 18, 2025

Turn (almost) any Python command line program into a full GUI application with one line

Python 21,703 1,038 Updated Jul 12, 2025

Simple, free and efficient ad-blocker and privacy guard for Windows, macOS and Linux

Go 3,618 111 Updated Nov 14, 2025

Distribute and run LLMs with a single file.

C 23,394 1,239 Updated Nov 5, 2025

Fast, collaborative live terminal sharing over the web

Rust 7,130 251 Updated Jun 19, 2025

Display and control your Android device

C 131,248 12,287 Updated Nov 18, 2025