🤖 ChromeAiAgent v0.1.0 - AI-Powered Browser Automation Extension

Automate your browser with natural language commands - Open source browser AI agent solution

**⭐ Star this repo if you find it helpful! ⭐**

🤖 What is ChromeAiAgent?

ChromeAiAgent is a Chrome extension that turns your browser into an automation platform. Using natural language commands and AI models, it can automate a wide range of browser tasks, from simple repetitive actions to complex multi-step workflows.

🆕 What's New in v0.1.0?

🔍 Local AI Server Detection & Management

Automatic Discovery: Instantly detects 6 popular local AI servers (Ollama, LM Studio, Jan.ai, LocalAI, Text-Gen-WebUI, GPT4All)
Real-Time Status: Live monitoring of server status and model availability
One-Click Switching: Switch between AI models with automatic configuration
Smart UI: Visual indicators showing server status and available model count
Enhanced UX: Modern gradients, improved error handling, and contextual help
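
As a rough illustration of how local server discovery like this could work, the sketch below probes a list of localhost endpoints and reports which ones respond. It is not taken from the ChromeAiAgent source: the names `CANDIDATE_SERVERS`, `ProbeFn`, and `detectLocalServers` are hypothetical, and the ports and paths are the commonly used defaults for each tool, which may differ on your machine.

```typescript
// Minimal fetch-like shape, so the sketch also runs outside a browser.
type ProbeFn = (url: string) => Promise<{ ok: boolean }>;

interface LocalServer {
  name: string;
  probeUrl: string; // assumed default endpoint, user-configurable in each tool
}

// Assumption: commonly used default ports, not values read from the extension.
const CANDIDATE_SERVERS: LocalServer[] = [
  { name: "Ollama", probeUrl: "http://localhost:11434/api/tags" },
  { name: "LM Studio", probeUrl: "http://localhost:1234/v1/models" },
  { name: "Jan.ai", probeUrl: "http://localhost:1337/v1/models" },
  { name: "LocalAI", probeUrl: "http://localhost:8080/v1/models" },
  { name: "Text-Gen-WebUI", probeUrl: "http://localhost:5000/v1/models" },
  { name: "GPT4All", probeUrl: "http://localhost:4891/v1/models" },
];

// Probe every candidate in parallel and return the names of servers that answered.
async function detectLocalServers(
  probe: ProbeFn,
  servers: LocalServer[] = CANDIDATE_SERVERS
): Promise<string[]> {
  const results = await Promise.all(
    servers.map(async (server) => {
      try {
        const response = await probe(server.probeUrl);
        return response.ok ? server.name : null;
      } catch {
        // A refused connection or network error means the server is not running.
        return null;
      }
    })
  );
  return results.filter((name): name is string => name !== null);
}
```

In a browser context, the probe could simply be `(url) => fetch(url)`. Injecting it keeps the detection logic testable without any server running.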

🎯 Why Choose ChromeAiAgent for Browser Automation?

🧠 Natural Language Control: Command your browser in plain English - no coding required
🤖 AI-Powered Intelligence: 30+ MCP tools that understand context and adapt to your needs
⚡ Multi-Step Automation: Execute complex workflows with single commands
🔄 Universal Compatibility: Works with any website - no special setup needed
📊 Smart Data Extraction: Automatically collect and organize information from web pages
🎯 Precision Actions: Click, fill, scroll, and interact with elements using AI vision
📝 Form Automation: Fill out forms, submit data, and handle complex interactions
🖼️ Visual Understanding: AI can see and understand page content for intelligent automation
🔧 Developer Friendly: Open source with extensive API for custom automation
🚀 Lightning Fast: Execute automation tasks in seconds, not minutes

✨ Core Automation Features

📊 Intelligent Data Extraction

Smart Content Analysis: Extract structured data from any webpage
Price Monitoring: Track prices across multiple e-commerce sites
Research Automation: Gather information from multiple sources automatically

🎯 Precision Element Interaction

Visual Element Detection: AI can see and interact with page elements
Form Automation: Fill out complex forms with intelligent field mapping
Dynamic Content Handling: Adapt to changing page layouts and content

📝 Content Processing & Analysis

Text Highlighting & Summarization: Automatically highlight and summarize important content
Document Processing: Extract and organize information from web documents
Smart Note-Taking: Capture and organize insights from web browsing

🗂️ Advanced Tab & Window Management

AI-Powered Organization: Automatically group and organize tabs by topic
Smart Tab Switching: Find and switch between tabs using natural language
Multi-Window Coordination: Manage complex workflows across multiple browser windows

🤖 Multiple LLM Provider Support

ChromeAiAgent supports multiple AI providers, giving you flexibility in choosing the best model for your needs:

☁️ Cloud Providers

GitHub Models (Free tier available) - GPT-5, Claude, Llama models
OpenAI - GPT-4.1, GPT-4o, latest models
Anthropic Claude - Claude 3.5 Sonnet, advanced reasoning
Google Gemini - Multimodal capabilities
DeepSeek - Cost-effective, strong reasoning
Azure OpenAI - Enterprise-grade

🏠 Local Providers

Ollama - Run models locally, complete privacy
LM Studio - User-friendly local AI interface
LocalAI - Self-hosted OpenAI-compatible API
Text Generation WebUI - Community solution
Custom API - Connect any OpenAI-compatible endpoint

📚 See LLM-PROVIDERS.md for detailed setup instructions
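
For the "Custom API" option, an OpenAI-compatible endpoint accepts a `POST` to `/chat/completions` with a `model` and a list of `messages`. The sketch below builds such a request. It is only an illustration: `ProviderConfig`, `buildChatRequest`, and the assumption that `baseUrl` already ends in `/v1` are not taken from the extension.

```typescript
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Hypothetical config shape; the extension's real settings may differ.
interface ProviderConfig {
  baseUrl: string; // assumed to include the version prefix, e.g. ".../v1"
  model: string;
  apiKey?: string; // local servers often need no key
}

interface HttpRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Build an OpenAI-style chat completion request without sending it.
function buildChatRequest(config: ProviderConfig, messages: ChatMessage[]): HttpRequest {
  const headers: Record<string, string> = { "Content-Type": "application/json" };
  if (config.apiKey) {
    headers["Authorization"] = `Bearer ${config.apiKey}`;
  }
  return {
    // Strip trailing slashes so the path is joined exactly once.
    url: `${config.baseUrl.replace(/\/+$/, "")}/chat/completions`,
    method: "POST",
    headers,
    body: JSON.stringify({ model: config.model, messages }),
  };
}
```

The returned object can be passed to `fetch(request.url, request)`. Keeping construction separate from sending makes it easy to point the same code at a cloud provider or a local server.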

🚀 Getting Started

Quick Start

1. Open ChromeAiAgent
   - Press ⌘+M (Mac) or Ctrl+M (Windows/Linux)
   - Or click the ChromeAiAgent icon in your toolbar
2. Start Automating
   - Type /ai to start AI automation chat
   - Use natural language: "Click the login button", "Fill out this form"
   - Try complex workflows: "Research React best practices and save to notes"

🛠️ Development & Contributing

We love contributions! Here's how you can help make ChromeAiAgent even better:

📖 For detailed development setup, build instructions, and contribution guidelines, please see DEVELOPMENT.md

Quick Start for Contributors

🏗️ Local Development: See DEVELOPMENT.md#local-development-setup
🔧 Building: See DEVELOPMENT.md#building-for-production
🤝 Contributing: See DEVELOPMENT.md#how-to-contribute
📊 Project Status: See DEVELOPMENT.md#development-status

📊 Tool Categories Overview

🗂️ Tab Management - 8 tools

Complete tab control and navigation:

get_all_tabs - Get all open tabs across all windows
get_current_tab - Get information about the currently active tab
switch_to_tab - Switch to a specific tab by ID
create_new_tab - Create a new tab with the specified URL
get_tab_info - Get detailed information about a specific tab
duplicate_tab - Duplicate an existing tab
close_tab - Close a specific tab
get_current_tab_content - Get the visible text content of the current tab
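
One way to picture how named tools like these could be exposed to a model is a registry keyed by tool name, with a dispatcher that routes each call to its handler. The sketch below is illustrative only. It uses an in-memory tab list in place of the real `chrome.tabs` API so it runs anywhere, and the `tabId` argument name, `createTabTools`, and `dispatch` are assumptions rather than the extension's actual interface.

```typescript
interface Tab {
  id: number;
  url: string;
  title: string;
  active: boolean;
}

// Shape of a tool invocation: a tool name plus named arguments.
interface ToolCall {
  name: string;
  arguments: Record<string, unknown>;
}

type ToolHandler = (args: Record<string, unknown>) => unknown;

// Build handlers over an in-memory tab list (a stand-in for the browser's tabs).
function createTabTools(tabs: Tab[]): Record<string, ToolHandler> {
  return {
    get_all_tabs: () => tabs.slice(),
    get_current_tab: () => tabs.find((t) => t.active) ?? null,
    switch_to_tab: (args) => {
      const target = tabs.find((t) => t.id === args.tabId);
      if (!target) {
        throw new Error(`No tab with id ${String(args.tabId)}`);
      }
      // Exactly one tab is active after switching.
      tabs.forEach((t) => {
        t.active = t.id === target.id;
      });
      return target;
    },
  };
}

// Route a tool call to its handler, rejecting unknown tool names.
function dispatch(tools: Record<string, ToolHandler>, call: ToolCall): unknown {
  if (!Object.prototype.hasOwnProperty.call(tools, call.name)) {
    throw new Error(`Unknown tool: ${call.name}`);
  }
  return tools[call.name](call.arguments);
}
```

Rejecting unknown names up front matters because tool names come from model output, which can contain typos or invented tools.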

📄 Page Content & Interaction - 16 tools

Content extraction, analysis, and page interaction:

get_page_metadata - Get page metadata including title, description, keywords
extract_page_text - Extract text content with word count and reading time
get_page_links - Get all links from the current page
search_page_text - Search for text on the current page
get_interactive_elements - Get all interactive elements (links, buttons, inputs)
get_interactive_elements_optimized - Optimized version for complex pages
click_element - Click an element using CSS selector
summarize_page - Summarize page content with key points
fill_input - Fill an input field with text
clear_input - Clear the content of an input field
get_input_value - Get the current value of an input field
submit_form - Submit a form using CSS selector
get_form_elements - Get all form elements and input fields
scroll_to_element - Scroll to a DOM element and center it
highlight_element - Permanently highlight DOM elements
highlight_text_inline - Highlight specific words or phrases within text
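
To make the "word count and reading time" output of `extract_page_text` concrete, here is a minimal sketch of how such statistics could be computed from extracted text. The function name `computeTextStats`, the 200 words-per-minute default, and the rounding rule are assumptions for illustration, not the extension's documented behavior.

```typescript
interface TextStats {
  wordCount: number;
  readingTimeMinutes: number;
}

// Count whitespace-separated words and estimate reading time.
// Assumption: 200 words per minute, rounded up, with a 1-minute minimum for non-empty text.
function computeTextStats(text: string, wordsPerMinute: number = 200): TextStats {
  const words = text
    .trim()
    .split(/\s+/)
    .filter((word) => word.length > 0);
  const wordCount = words.length;
  const readingTimeMinutes =
    wordCount === 0 ? 0 : Math.max(1, Math.ceil(wordCount / wordsPerMinute));
  return { wordCount, readingTimeMinutes };
}
```

In a content script, the input could come from something like `document.body.innerText`. Whitespace splitting is a simplification that does not suit languages written without spaces.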

⬇️ Downloads & Files - 4 tools

Download control and file management:

download_text_as_markdown - Download text content as markdown file
download_image - Download an image from base64 data
download_chat_images - Download multiple images from chat messages
download_current_chat_images - Download all images from current AI chat

📸 Screenshots - 3 tools

Visual capture and screenshot management:

capture_screenshot - Capture screenshot of current visible tab
capture_tab_screenshot - Capture screenshot of a specific tab by ID
capture_screenshot_to_clipboard - Capture screenshot and save to clipboard

🔧 Advanced Features - 3+ tools

Advanced browser automation and utilities:

Additional specialized tools for enhanced browser control
AI-powered content analysis and processing
Custom automation workflows

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Support & Community

🐛 Found a bug? Open an issue
💡 Have a feature request? Start a discussion
🤝 Want to contribute? See our Contributing Guide
💬 Need help? Join our community discussions

🏆 Contributors

Thank you to all the amazing contributors who help make ChromeAiAgent better:

ropzislaw - 56 commits
Codexiaoyi - 10 commits
guberm - 5 commits

Total Contributors: 3 | Total Commits: 71

Want to contribute? Check out our Contributing Guide and help make ChromeAiAgent even better!

Made with ❤️ by the ChromeAiAgent Team
