Starred repositories
Scalable toolkit for efficient model reinforcement
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
Code for "Learning Diverse Attacks on Large Language Models for Robust Red-Teaming and Safety Tuning", ICLR 2025.
SkyRL: A Modular Full-stack RL Library for LLMs
Using reinforcement learning to train teachers that can make LLMs learn how to reason for test-time scaling.
Darwin Gödel Machine: Open-Ended Evolution of Self-Improving Agents
Model Merging in LLMs, MLLMs, and Beyond: Methods, Theories, Applications and Opportunities. arXiv:2408.07666.
Search-R1: An Efficient, Scalable RL Training Framework for Reasoning & Search Engine Calling interleaved LLM based on veRL
A reading list on LLM-based Synthetic Data Generation 🔥
Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models
Democratizing Reinforcement Learning for LLMs
Official Repository of Absolute Zero Reasoner
L1: Controlling How Long A Reasoning Model Thinks With Reinforcement Learning
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
Sky-T1: Train your own O1 preview model within $450
Recent research papers about Foundation Models for Combinatorial Optimization
This is the reading list for the survey "A Survey on the Optimization of LLM-based Agents". We will keep adding papers and improving the list. Any suggestions and PRs are welcome!
GenAI for Optimization and Decision Intelligence
GPU-optimized version of the MuJoCo physics simulator, designed for NVIDIA hardware.
A collection of agents that use Large Language Models (LLMs) to perform tasks common in our day-to-day jobs in cyber security.
An open-source, code-first Python toolkit for building, evaluating, and deploying sophisticated AI agents with flexibility and control.
Research and development (R&D) is crucial for the enhancement of industrial productivity, especially in the AI era, where the core aspects of R&D are mainly focused on data and models. We are commi…
Mirror of official Lustre development repository http://git.whamcloud.com/
Code for the Fractured Entangled Representation Hypothesis position paper!
This repository contains related work, benchmarks and datasets for the paper "Large Language Models in Finance (FinLLMs)", currently under review.
Repository hosting code for "Actions Speak Louder than Words: Trillion-Parameter Sequential Transducers for Generative Recommendations" (https://arxiv.org/abs/2402.17152).
An open protocol enabling communication and interoperability between opaque agentic applications.
Ceph is a distributed object, block, and file storage platform