antferdom

A.J antferdom

engineer

52 followers · 474 following

@datacrunch-research
Seville, Spain
@antferdom

Highlights

Pro

Lists (32)

AI Efficiency

482 repositories

Attention

56 repositories

Checkpointing

29 repositories

Collective Communication

52 repositories

Compilers

132 repositories

CUDA

236 repositories

Datasets

55 repositories

Diffusion

241 repositories

Distributed Systems

170 repositories

Graphs

13 repositories

gRPC

20 repositories

Hardware

81 repositories

HPC

92 repositories

Infrastructure as Code

30 repositories

Jax

50 repositories

Job Orchestration

20 repositories

K8s

44 repositories

Kernel Language

145 repositories

Language Models

502 repositories

Linux

34 repositories

Mechanistic Interpretability

31 repositories

NVIDIA GDS: DMA/RDMA

36 repositories

PEFT

16 repositories

PyTorch

279 repositories

Quantization

55 repositories

Research Tools

283 repositories

Rust Machine Learning Ecosystem

40 repositories

Serving

178 repositories

Simulation

47 repositories

Storage

20 repositories

Vision

88 repositories

WASM

27 repositories

Starred repositories

Noumena-Network / nanoMoE

MoE training system for research, speed-running and profit

Python 1 Updated Oct 10, 2025

NVlabs / parrot

Parrot is a C++ library for fused array operations using CUDA/Thrust. It provides efficient GPU-accelerated operations with lazy evaluation semantics, allowing for chaining of operations without un…

Cuda 113 6 Updated Oct 9, 2025

nvidia-cosmos / cosmos-predict2.5

Cosmos-Predict2.5, the latest version of the Cosmos World Foundation Models (WFMs) family, specialized for simulating and predicting the future state of the world in the form of video.

Python 123 4 Updated Oct 7, 2025

flashinfer-ai / cubloaty

A size profiler for CUDA binaries

Python 48 Updated Oct 7, 2025

ezyang / torch-profile-compare

easily compare pytorch json profiles. vibe coded

Shell 3 Updated Oct 1, 2025

Zyphra / Zonos

Zonos-v0.1 is a leading open-weight text-to-speech model trained on more than 200k hours of varied multilingual speech, delivering expressiveness and quality on par with—or even surpassing—top TTS …

Python 7,067 811 Updated Mar 5, 2025

MoonshotAI / Kimina-Prover-Preview

Technical report of Kimina-Prover Preview.

Python 335 13 Updated Jul 10, 2025

dc-ai-projects / DC-Gen

DC-Gen: Post-Training Diffusion Acceleration with Deeply Compressed Latent Space

Python 229 6 Updated Oct 5, 2025

ColfaxResearch / cfx-article-src

C++ 148 29 Updated May 7, 2025

microsoft / VibeVoice

Frontier Open-Source Text-to-Speech

9,537 1,174 Updated Sep 5, 2025

microsoft / rStar

Python 1,294 115 Updated Sep 12, 2025

cherichy / cubin_caller

call cubin from python or julia.

Python 1 Updated Apr 9, 2025

cherichy / tilecute

C++ 31 2 Updated Jul 2, 2025

umd-memsys / DRAMSim2

DRAMSim2: A cycle accurate DRAM simulator

C++ 283 155 Updated Nov 11, 2020

tensorchord / VectorChord

Scalable, fast, and disk-friendly vector search in Postgres, the successor of pgvecto.rs.

Rust 1,165 38 Updated Oct 10, 2025

ai-dynamo / aiperf

AIPerf is a comprehensive benchmarking tool that measures the performance of generative AI models served by your preferred inference solution.

Python 18 2 Updated Oct 11, 2025

ChenMnZ / PrefixQuant

An algorithm for weight-activation quantization (W4A4, W4A8) of LLMs, supporting both static and dynamic quantization

Python 153 13 Updated May 22, 2025

thinking-machines-lab / tinker-cookbook

Post-training with Tinker

Python 934 62 Updated Oct 10, 2025

IST-DASLab / llmq

Quantized LLM training in pure CUDA/C++.

C++ 185 7 Updated Oct 10, 2025

facebookexperimental / triton

GitHub mirror of the triton-lang/triton repo.

MLIR 84 20 Updated Oct 11, 2025

deepseek-ai / DeepSeek-V3.2-Exp

Python 863 49 Updated Oct 2, 2025

MoonshotAI / Kimi-K2

Kimi K2 is the large language model series developed by Moonshot AI team

8,317 549 Updated Sep 11, 2025

jax-ml / scaling-book

Home for "How To Scale Your Model", a short blog-style textbook about scaling LLMs on TPUs

HTML 655 93 Updated Oct 6, 2025

NVlabs / Jet-Nemotron

Python 685 41 Updated Oct 2, 2025

IST-DASLab / gptq-gguf-toolkit

GPTQ and efficient search for GGUF

Python 51 4 Updated Sep 17, 2025

HydraQYH / cute_reduce

Reduce kernel based on CUTLASS CuTe and TMA.

Cuda 9 Updated Sep 25, 2025

svg-project / flash-kmeans

Fast and memory-efficient exact kmeans

Python 100 6 Updated Sep 30, 2025

MoonshotAI / K2-Vendor-Verifier

Verify the precision of all Kimi K2 API vendors

Python 194 8 Updated Oct 10, 2025

AlmondGod / tinyworlds

A minimal implementation of DeepMind's Genie world model

Python 960 64 Updated Sep 28, 2025

facebookresearch / cwm

Research code artifacts for Code World Model (CWM) including inference tools, reproducibility, and documentation.

Python 655 52 Updated Sep 24, 2025

Starred topics

Coq

Homebrew
