- Bay Area, CA
- https://peggywang0.github.io/
Starred repositories
The power of Claude Code + [Gemini / OpenAI / Grok / OpenRouter / Ollama / Custom Model / All Of The Above] working as one.
Qwen2.5-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.
The Open All-in-One Multimodal AI Agent Stack connecting Cutting-edge AI Models and Agent Infra.
A TTS model capable of generating ultra-realistic dialogue in one pass.
Implementing DeepSeek R1's GRPO algorithm from scratch
c/ua is the Docker Container for Computer-Use AI Agents.
Spongecake is the easiest way to launch computer use agents.
🌐 Make websites accessible for AI agents. Automate tasks online with ease.
We write your reusable computer vision tools. 💜
A React-based starter app for using the Live API over WebSockets with Gemini
Preswald is a WASM packager for Python-based interactive data apps: bundle full complex data workflows, particularly visualizations, into single files, runnable completely in-browser, using Pyodide…
openvla / openvla
Forked from TRI-ML/prismatic-vlms. OpenVLA: An open-source vision-language-action model for robotic manipulation.
A flexible and efficient codebase for training visually-conditioned language models (VLMs)
PyTorch code for Vision Transformers training with the Self-Supervised learning method DINO
A natural language interface for computers
AI demo for playing ARPG/Souls-like games with an RL framework
A simple project to beat bosses in Black Myth: Wukong, using YOLOv8 to detect boss movement and a script to react to certain detections
Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train Qwen3, Llama 4, DeepSeek-R1, Gemma 3, TTS 2x faster with 70% less VRAM.
ASCII generator (image to text, image to image, video to video)
Out-of-the-box (OOTB) GUI Agent for Windows and macOS
Run your own AI cluster at home with everyday devices 📱💻 🖥️⌚
Autonomous coding agent right in your IDE, capable of creating/editing files, executing commands, using the browser, and more with your permission every step of the way.
Enhanced ChatGPT Clone: Features Agents, DeepSeek, Anthropic, AWS, OpenAI, Responses API, Azure, Groq, o1, GPT-4o, Mistral, OpenRouter, Vertex AI, Gemini, Artifacts, AI model switching, message sea…
Convert any PDF into a podcast episode!