TII · Abu Dhabi · @akanyaani · in/akanyaani
Stars
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
DeerFlow is a community-driven Deep Research framework, combining language models with tools like web search, crawling, and Python execution, while contributing back to the open-source community.
Free online textbook of Jupyter notebooks for fast.ai Computational Linear Algebra course
huggingface / yourbench (forked from sumukshashidhar/yourbench): 🤗 Benchmark Large Language Models Reliably On Your Data
Minimalistic 4D-parallelism distributed training framework for educational purposes
Ongoing research on training transformer models at scale
A TTS model capable of generating ultra-realistic dialogue in one pass.
Development repository for the Triton language and compiler
Official implementation of the paper: "ZClip: Adaptive Spike Mitigation for LLM Pre-Training".
Official implementation of the paper: "A Refined Analysis of Massive Activations in LLMs".
Official implementation of the "Variance control via weight rescaling in LLM pretraining" paper.
A high-performance distributed file system designed to address the challenges of AI training and inference workloads.
DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling
Annotations of the interesting ML papers I read
Pretrain, finetune ANY AI model of ANY size on 1 or 10,000+ GPUs with zero code changes.
Flexible and powerful tensor operations for readable and reliable code (for pytorch, jax, TF and others)
Efficient Triton Kernels for LLM Training
A tiny scalar-valued autograd engine and a neural net library on top of it with PyTorch-like API
Write scalable load tests in plain Python 🚗💨
Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization (a short illustrative sketch of the core merge step follows this list).
A simplified LLaMA implementation for training and inference tasks.
Accessible large language models via k-bit quantization for PyTorch.
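As referenced in the BPE entry above, here is a minimal sketch of one BPE training step on a toy byte-level corpus. The helper names (most_frequent_pair, merge) and the example string are hypothetical illustrations of the technique, not the minbpe implementation itself.

```python
# A minimal sketch of one BPE training step, assuming byte-level
# token ids on a toy string; illustrative only, not minbpe's code.
from collections import Counter

def most_frequent_pair(ids):
    """Count adjacent id pairs and return the most common one."""
    pairs = Counter(zip(ids, ids[1:]))
    return max(pairs, key=pairs.get)

def merge(ids, pair, new_id):
    """Replace every occurrence of `pair` in `ids` with `new_id`."""
    out, i = [], 0
    while i < len(ids):
        if i + 1 < len(ids) and (ids[i], ids[i + 1]) == pair:
            out.append(new_id)
            i += 2
        else:
            out.append(ids[i])
            i += 1
    return out

# Usage: start from raw UTF-8 bytes and apply a single merge,
# assigning the new token the first id beyond the byte range (256).
ids = list("aaabdaaabac".encode("utf-8"))
pair = most_frequent_pair(ids)   # most frequent adjacent pair
ids = merge(ids, pair, 256)      # sequence shrinks, vocab grows by one
```

Training a full tokenizer just repeats this step: each iteration records the chosen pair as a merge rule and assigns the next free id, until the target vocabulary size is reached.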