Stars
An agentic solution for creating a hierarchical page index for any PDF file, optimized for consumption by LLMs
Wan: Open and Advanced Large-Scale Video Generative Models
This is a repo with links to everything you'd ever want to learn about data engineering
A Python library for extracting structured information from unstructured text using LLMs with precise source grounding and interactive visualization.
mcp-use is the easiest way to interact with MCP servers using custom agents
R Extension for Visual Studio Code
GenAI Processors is a lightweight Python library that enables efficient, parallel content processing.
MemOS (Preview) | Intelligence Begins with Memory
Databricks framework to validate the data quality of PySpark DataFrames
The AI coding agent built for the terminal.
What are the principles we can use to build LLM-powered software that is actually good enough to put in the hands of production customers?
Context engineering is the new vibe coding - it's the way to actually make AI coding assistants work. Claude Code is the best for this so that's what this repo is centered around, but you can apply…
AutoEvals is a tool for quickly and easily evaluating AI model outputs using best practices.
All the open source AI Agents hosted on the oTTomator Live Agent Studio platform!
Run all your local AI together in one package - Ollama, Supabase, n8n, Open WebUI, and more!
An open-source AI agent that brings the power of Gemini directly into your terminal.
A lightweight LMM-based Document Parsing Model
Open-Source Chrome extension for AI-powered web automation. Run multi-agent workflows using your own LLM API key. Alternative to OpenAI Operator.
The official repo for “Dolphin: Document Image Parsing via Heterogeneous Anchor Prompting”, ACL, 2025.
The fastest way to create an HTML app
A powerful AI coding agent. Built for the terminal.
Accelerates migrations to Databricks by automating key migration activities
WikiChat is an improved RAG. It stops the hallucination of large language models by retrieving data from a corpus.
PPTAgent: Generating and Evaluating Presentations Beyond Text-to-Slides [EMNLP 2025]
[NeurIPS 2025] Open-source Multi-agent Poster Generation from Papers
📄🧠 PageIndex: Document Index for Reasoning-based RAG
Python + Markdown framework for building internal apps.