Starred repositories

MoE training system for research, speed-running and profit

Python 1 Updated Oct 10, 2025

Parrot is a C++ library for fused array operations using CUDA/Thrust. It provides efficient GPU-accelerated operations with lazy evaluation semantics, allowing for chaining of operations without un…

Cuda 113 6 Updated Oct 9, 2025

Cosmos-Predict2.5, the latest version of the Cosmos World Foundation Models (WFMs) family, specialized for simulating and predicting the future state of the world in the form of video.

Python 123 4 Updated Oct 7, 2025

a size profiler for cuda binary

Python 48 Updated Oct 7, 2025

easily compare pytorch json profiles. vibe coded

Shell 3 Updated Oct 1, 2025

Zonos-v0.1 is a leading open-weight text-to-speech model trained on more than 200k hours of varied multilingual speech, delivering expressiveness and quality on par with—or even surpassing—top TTS …

Python 7,067 811 Updated Mar 5, 2025

Technical report of Kimina-Prover Preview.

Python 335 13 Updated Jul 10, 2025

DC-Gen: Post-Training Diffusion Acceleration with Deeply Compressed Latent Space

Python 229 6 Updated Oct 5, 2025

Frontier Open-Source Text-to-Speech

9,537 1,174 Updated Sep 5, 2025
Python 1,294 115 Updated Sep 12, 2025

call cubin from python or julia.

Python 1 Updated Apr 9, 2025
C++ 31 2 Updated Jul 2, 2025

DRAMSim2: A cycle accurate DRAM simulator

C++ 283 155 Updated Nov 11, 2020

Scalable, fast, and disk-friendly vector search in Postgres, the successor of pgvecto.rs.

Rust 1,165 38 Updated Oct 10, 2025

AIPerf is a comprehensive benchmarking tool that measures the performance of generative AI models served by your preferred inference solution.

Python 18 2 Updated Oct 11, 2025

An algorithm for weight-activation quantization (W4A4, W4A8) of LLMs, supporting both static and dynamic quantization

Python 153 13 Updated May 22, 2025

Post-training with Tinker

Python 934 62 Updated Oct 10, 2025

Quantized LLM training in pure CUDA/C++.

C++ 185 7 Updated Oct 10, 2025

GitHub mirror of the triton-lang/triton repo.

MLIR 84 20 Updated Oct 11, 2025

Kimi K2 is the large language model series developed by the Moonshot AI team

8,317 549 Updated Sep 11, 2025

Home for "How To Scale Your Model", a short blog-style textbook about scaling LLMs on TPUs

HTML 655 93 Updated Oct 6, 2025
Python 685 41 Updated Oct 2, 2025

GPTQ and efficient search for GGUF

Python 51 4 Updated Sep 17, 2025

Reduce kernel based on CUTLASS CuTe and TMA.

Cuda 9 Updated Sep 25, 2025

Fast and memory-efficient exact kmeans

Python 100 6 Updated Sep 30, 2025

Verify the precision of all Kimi K2 API vendors

Python 194 8 Updated Oct 10, 2025

A minimal implementation of DeepMind's Genie world model

Python 960 64 Updated Sep 28, 2025

Research code artifacts for Code World Model (CWM) including inference tools, reproducibility, and documentation.

Python 655 52 Updated Sep 24, 2025