A speech-to-text application for macOS that allows you to dictate into any application using global hotkeys and AI-enhanced text processing.
Open Voice is a Mac desktop application that transforms speech into text and inserts it into whichever application you are using. Press and hold Cmd+Option, speak naturally, and your words appear in the current text field with AI-powered enhancements.
It is heavily influenced by Aqua Voice, but aims to offer the same experience with BYOK (bring your own key) and pay-as-you-go pricing across multiple providers.
Open Voice supports multiple AI providers, each offering both speech-to-text and LLM processing. All are plug-and-play and only require you to get a key:
Providers:
There is also a Local Whisper option, which runs locally without LLM processing and requires no setup.
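The difference between a provider with both stages and the Local Whisper option can be sketched as follows. This is only an illustration: the `Provider` class, its fields, and the stand-in `fake_transcribe` / `fake_enhance` functions are hypothetical and are not taken from the Open Voice codebase.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Provider:
    """Hypothetical provider record: a speech-to-text stage plus an optional LLM stage."""
    name: str
    transcribe: Callable[[bytes], str]
    enhance: Optional[Callable[[str], str]] = None  # None means no LLM processing


def process(audio: bytes, provider: Provider) -> str:
    """Transcribe audio, then apply the LLM stage only if the provider has one."""
    text = provider.transcribe(audio)
    if provider.enhance is not None:
        text = provider.enhance(text)
    return text


# Stand-in stages so the sketch runs without network access or API keys.
def fake_transcribe(audio: bytes) -> str:
    return "hello world"


def fake_enhance(text: str) -> str:
    return text.capitalize() + "."


cloud = Provider("example-cloud", fake_transcribe, fake_enhance)
local_whisper = Provider("local-whisper", fake_transcribe)  # no LLM stage

print(process(b"", cloud))          # Hello world.
print(process(b"", local_whisper))  # hello world
```

Treating the LLM stage as optional keeps a single code path for both cases, so a local option simply skips enhancement instead of needing its own pipeline.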
- Python 3.9+
- Clone the repository:

      git clone https://github.com/0xToshii/open-voice.git
      cd open-voice

- Create a virtual environment (recommended):

      python -m venv myenv
      source myenv/bin/activate  # On macOS/Linux

- Install Python dependencies:

      pip install -r requirements.txt

- Run the application:

      python main.py

- Grant permissions when prompted:
  - Microphone access
  - Accessibility permissions
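Before working through the steps above, it can help to confirm that the environment matches the stated requirements (macOS, Python 3.9+). The check below is a small sketch and not part of the repository; `meets_requirements` is a hypothetical helper name.

```python
import platform
import sys


def meets_requirements(version=None, system=None):
    """Return True if running on macOS with Python 3.9 or newer."""
    # Default to the running interpreter and OS; arguments allow testing other values.
    version = tuple(sys.version_info[:2]) if version is None else version
    system = platform.system() if system is None else system
    # platform.system() reports "Darwin" on macOS.
    return system == "Darwin" and tuple(version[:2]) >= (3, 9)


if __name__ == "__main__":
    if meets_requirements():
        print("Environment looks OK.")
    else:
        print("Open Voice needs macOS and Python 3.9+.")
```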
- Start the app - The GUI opens on the settings page
- Add a key for your desired provider - Optionally select your preferred provider from the dropdown, then paste your key
- Press and hold Cmd+Option - A recording overlay appears at the bottom of the screen
- Speak clearly - The audio waveform shows real-time levels
- Release Cmd+Option - Your speech is transcribed and inserted into your current app
- View results - Check the transcript history in the main window
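The press-and-hold flow above can be modelled as a small state machine. The sketch below is illustrative only: `PushToTalk`, the key names `"cmd"` and `"option"`, and the callbacks are hypothetical, and it assumes that releasing either key ends the recording. A real implementation would feed `press` and `release` from a global keyboard listener.

```python
class PushToTalk:
    """Hypothetical state machine: record while every hotkey key is held."""

    def __init__(self, hotkey, on_start, on_stop):
        self.hotkey = frozenset(hotkey)
        self.held = set()
        self.recording = False
        self.on_start = on_start
        self.on_stop = on_stop

    def press(self, key):
        self.held.add(key)
        # Start once the full combination is down.
        if not self.recording and self.hotkey <= self.held:
            self.recording = True
            self.on_start()

    def release(self, key):
        self.held.discard(key)
        # Assumption: releasing any key of the combination ends the recording.
        if self.recording and not self.hotkey <= self.held:
            self.recording = False
            self.on_stop()


events = []
ptt = PushToTalk(
    {"cmd", "option"},
    on_start=lambda: events.append("start recording"),
    on_stop=lambda: events.append("transcribe and insert"),
)
ptt.press("cmd")       # only half of the hotkey: nothing happens
ptt.press("option")    # full hotkey held: recording starts
ptt.release("option")  # hotkey broken: recording stops
ptt.release("cmd")
print(events)  # ['start recording', 'transcribe and insert']
```

Tracking the set of held keys, instead of reacting to single key events, avoids starting a recording when only one of the two keys is pressed.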
If desired, customize how your speech is processed by the LLM before being pasted:
- Go to Settings → Custom Instructions
- Add prompts like:
  - "Use all lowercase words"
  - "Never include punctuation at the end of sentences"
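One plausible way such instructions could reach the LLM is by merging them into the prompt alongside the raw transcript. The sketch below shows that idea under stated assumptions: `build_prompt` and the base prompt wording are hypothetical and do not reflect how Open Voice actually constructs its prompts.

```python
def build_prompt(transcript, instructions):
    """Combine a raw transcript with user-defined instructions into one LLM prompt."""
    # Hypothetical base instruction; the real application's wording is not known here.
    lines = ["Clean up the following dictated text."]
    # Ignore blank entries so an empty settings field adds nothing to the prompt.
    rules = [rule.strip() for rule in instructions if rule.strip()]
    if rules:
        lines.append("Follow these instructions:")
        lines.extend(f"- {rule}" for rule in rules)
    lines.append("Text:")
    lines.append(transcript)
    return "\n".join(lines)


prompt = build_prompt(
    "Send the report by Friday.",
    ["Use all lowercase words", "Never include punctuation at the end of sentences"],
)
print(prompt)
```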