Stars
A powerful framework for building realtime voice AI agents 🤖🎙️📹
What are the principles we can use to build LLM-powered software that is actually good enough to put in the hands of production customers?
A lightweight end-of-utterance detection model fine-tuned on SmolLM2-135M, optimized for Raspberry Pi and low-power devices.
This is a simple demonstration of more advanced, agentic patterns built on top of the Realtime API.
[ICML'24 Oral] "MagicLens: Self-Supervised Image Retrieval with Open-Ended Instructions"
Use Segment Anything 2, grounded with Florence-2, to auto-label data for use in training vision models.
A GPT-4 AI Tutor Prompt for customizable personalized learning experiences.