Starred repositories
Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads
Turn expensive prompts into cheap fine-tuned models
MTEB: Massive Text Embedding Benchmark
Convert documents to structured data effortlessly. Unstructured is an open-source ETL solution for transforming complex documents into clean, structured formats for language models. Visit our website …
Code and models for ICML 2024 paper, NExT-GPT: Any-to-Any Multimodal Large Language Model
PyTorch code and models for VJEPA2 self-supervised learning from video.
Textbook on reinforcement learning from human feedback
Distributed Compiler based on Triton for Parallel Systems
A mixed-curvature approach to deal with transformer representation anisotropy
Implementing DeepSeek R1's GRPO algorithm from scratch
An example starter repo for Python projects
Multi-backend recommender systems with Keras 3
Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation
Code for studying the super weight in LLMs
Notes from the Latent Space paper club. Follow along or start your own!
Supporting PyTorch FSDP for optimizers
The Programmable Cypher-based Neuro-Symbolic AGI that lets you program its behavior using Graph-based Prompt Programming: for people who want AI to behave as expected
Meta Lingua: a lean, efficient, and easy-to-hack codebase to research LLMs.
Entropy Based Sampling and Parallel CoT Decoding
Official code for "RB-Modulation: Training-Free Personalization of Diffusion Models using Stochastic Optimal Control"
Code for "Chameleon: Plug-and-Play Compositional Reasoning with Large Language Models".
An extremely fast Python package and project manager, written in Rust.
The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery 🧑‍🔬
Official inference repo for FLUX.1 models