
guberm/AIPex

 
 


🤖 ChromeAiAgent v0.1.0 - AI-Powered Browser Automation Extension

Automate your browser with natural language commands - Open source browser AI agent solution


**⭐ Star this repo if you find it helpful! ⭐**


🤖 What is ChromeAiAgent?

ChromeAiAgent is a Chrome extension that turns your browser into an automation platform. You describe a task in natural language, and the AI carries it out in the browser, from simple repetitive actions to complex multi-step workflows.

🆕 What's New in v0.1.0?

🔍 Local AI Server Detection & Management

  • Automatic Discovery: Instantly detects 6 popular local AI servers (Ollama, LM Studio, Jan.ai, LocalAI, Text-Gen-WebUI, GPT4All)
  • Real-Time Status: Live monitoring of server status and model availability
  • One-Click Switching: Switch between AI models with automatic configuration
  • Smart UI: Visual indicators showing server status and available model count
  • Enhanced UX: Modern gradients, improved error handling, and contextual help

Local AI Detection
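
As a rough illustration of how automatic discovery could work, the sketch below probes a list of local endpoints and counts the models each one reports. It is not taken from the ChromeAiAgent source: the `CANDIDATES` table, the `probe` and `detectLocalServers` helpers, and the timeout are hypothetical, and the ports are only the commonly used defaults for Ollama and LM Studio. Only two of the six servers are shown.

```typescript
// Hypothetical probe table. Ports and paths are common defaults, not values from this project.
interface LocalServer {
  name: string;
  baseUrl: string;
  modelsPath: string;
}

const CANDIDATES: LocalServer[] = [
  { name: "Ollama", baseUrl: "http://localhost:11434", modelsPath: "/api/tags" },
  { name: "LM Studio", baseUrl: "http://localhost:1234", modelsPath: "/v1/models" },
];

interface DetectionResult {
  name: string;
  online: boolean;
  modelCount: number;
}

// Minimal subset of the fetch API that the probe needs, so it can be mocked in tests.
type FetchLike = (
  url: string,
  init?: { signal?: AbortSignal },
) => Promise<{ ok: boolean; json(): Promise<unknown> }>;

// Ollama's /api/tags answers { models: [...] }; OpenAI-style /v1/models answers { data: [...] }.
function countModels(body: unknown): number {
  if (typeof body !== "object" || body === null) return 0;
  const record = body as Record<string, unknown>;
  if (Array.isArray(record.models)) return record.models.length;
  if (Array.isArray(record.data)) return record.data.length;
  return 0;
}

// Probe one server; any network error or timeout is reported as "offline".
async function probe(
  server: LocalServer,
  fetchFn: FetchLike,
  timeoutMs = 1500,
): Promise<DetectionResult> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const res = await fetchFn(server.baseUrl + server.modelsPath, { signal: controller.signal });
    if (!res.ok) return { name: server.name, online: false, modelCount: 0 };
    return { name: server.name, online: true, modelCount: countModels(await res.json()) };
  } catch {
    return { name: server.name, online: false, modelCount: 0 };
  } finally {
    clearTimeout(timer);
  }
}

// Probe all candidates in parallel and return one status entry per server.
async function detectLocalServers(
  fetchFn: FetchLike,
  servers: LocalServer[] = CANDIDATES,
): Promise<DetectionResult[]> {
  return Promise.all(servers.map((server) => probe(server, fetchFn)));
}
```

In a browser context the global `fetch` could be passed as `fetchFn`. Calling `detectLocalServers` on a timer would give the kind of live status display described above.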

🎯 Why Choose ChromeAiAgent for Browser Automation?

  • 🧠 Natural Language Control: Command your browser in plain English - no coding required
  • 🤖 AI-Powered Intelligence: 30+ MCP tools that understand context and adapt to your needs
  • Multi-Step Automation: Execute complex workflows with single commands
  • 🔄 Universal Compatibility: Works with any website - no special setup needed
  • 📊 Smart Data Extraction: Automatically collect and organize information from web pages
  • 🎯 Precision Actions: Click, fill, scroll, and interact with elements using AI vision
  • 📝 Form Automation: Fill out forms, submit data, and handle complex interactions
  • 🖼️ Visual Understanding: AI can see and understand page content for intelligent automation
  • 🔧 Developer Friendly: Open source with extensive API for custom automation
  • 🚀 Lightning Fast: Execute automation tasks in seconds, not minutes

✨ Core Automation Features

📊 Intelligent Data Extraction

  • Smart Content Analysis: Extract structured data from any webpage
  • Price Monitoring: Track prices across multiple e-commerce sites
  • Research Automation: Gather information from multiple sources automatically

Data Extraction

🎯 Precision Element Interaction

  • Visual Element Detection: AI can see and interact with page elements
  • Form Automation: Fill out complex forms with intelligent field mapping
  • Dynamic Content Handling: Adapt to changing page layouts and content

Element Interaction

📝 Content Processing & Analysis

  • Text Highlighting & Summarization: Automatically highlight and summarize important content
  • Document Processing: Extract and organize information from web documents
  • Smart Note-Taking: Capture and organize insights from web browsing

Content Processing

🗂️ Advanced Tab & Window Management

  • AI-Powered Organization: Automatically group and organize tabs by topic
  • Smart Tab Switching: Find and switch between tabs using natural language
  • Multi-Window Coordination: Manage complex workflows across multiple browser windows

Tab Management

🤖 Multiple LLM Provider Support

ChromeAiAgent supports multiple AI providers, giving you flexibility in choosing the best model for your needs:

☁️ Cloud Providers

  • GitHub Models (Free tier available) - GPT-5, Claude, Llama models
  • OpenAI - GPT-4.1, GPT-4o, latest models
  • Anthropic Claude - Claude 3.5 Sonnet, advanced reasoning
  • Google Gemini - Multimodal capabilities
  • DeepSeek - Cost-effective, strong reasoning
  • Azure OpenAI - Enterprise-grade

🏠 Local Providers

  • Ollama - Run models locally, complete privacy
  • LM Studio - User-friendly local AI interface
  • LocalAI - Self-hosted OpenAI-compatible API
  • Text Generation WebUI - Community solution
  • Custom API - Connect any OpenAI-compatible endpoint

📚 See LLM-PROVIDERS.md for detailed setup instructions
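
Most of the providers above expose, or can be reached through, an OpenAI-compatible chat completions endpoint. The sketch below shows what a request to such an endpoint generally looks like. `ProviderConfig` and `buildChatRequest` are hypothetical names for illustration, not the extension's actual configuration format. The base URL is assumed to already include the `/v1` prefix.

```typescript
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Hypothetical provider settings; field names are illustrative only.
interface ProviderConfig {
  baseUrl: string; // e.g. "http://localhost:1234/v1" for a local server
  model: string;
  apiKey?: string; // local servers usually do not require one
}

// Build the URL and fetch options for an OpenAI-style POST /chat/completions call.
function buildChatRequest(config: ProviderConfig, messages: ChatMessage[]) {
  const headers: Record<string, string> = { "Content-Type": "application/json" };
  if (config.apiKey) {
    headers["Authorization"] = `Bearer ${config.apiKey}`;
  }
  return {
    // Strip trailing slashes so "…/v1/" and "…/v1" behave the same.
    url: `${config.baseUrl.replace(/\/+$/, "")}/chat/completions`,
    init: {
      method: "POST",
      headers,
      body: JSON.stringify({ model: config.model, messages }),
    },
  };
}
```

The result can be passed straight to `fetch(request.url, request.init)`. Switching providers then only means changing `baseUrl`, `model`, and `apiKey`.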

🚀 Getting Started

Quick Start

  1. Open ChromeAiAgent

    • Press ⌘+M (Mac) or Ctrl+M (Windows/Linux)
    • Or click the ChromeAiAgent icon in your toolbar
  2. Start Automating

    • Type /ai to start AI automation chat
    • Use natural language: "Click the login button", "Fill out this form"
    • Try complex workflows: "Research React best practices and save to notes"

🛠️ Development & Contributing

We love contributions! Here's how you can help make ChromeAiAgent even better:

📖 For detailed development setup, build instructions, and contribution guidelines, please see DEVELOPMENT.md

Quick Start for Contributors

📊 Tool Categories Overview

🗂️ Tab Management - 8 tools

Complete tab control and navigation:

  • get_all_tabs - Get all open tabs across all windows
  • get_current_tab - Get information about the currently active tab
  • switch_to_tab - Switch to a specific tab by ID
  • create_new_tab - Create a new tab with the specified URL
  • get_tab_info - Get detailed information about a specific tab
  • duplicate_tab - Duplicate an existing tab
  • close_tab - Close a specific tab
  • get_current_tab_content - Get the visible text content of the current tab
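
Tools like `get_all_tabs` and `switch_to_tab` map naturally onto Chrome's `chrome.tabs` extension API. The sketch below is an assumption about how such tools might be written, not the project's actual implementation. `TabsApi` is a hypothetical, minimal interface covering only the `query` and `update` calls used here, so that the real `chrome.tabs` object or a test double can be passed in.

```typescript
// Minimal shapes mirroring the parts of chrome.tabs used in this sketch.
interface TabLike {
  id?: number;
  windowId: number;
  title?: string;
  url?: string;
  active: boolean;
}

interface TabsApi {
  query(queryInfo: Record<string, unknown>): Promise<TabLike[]>;
  update(tabId: number, props: { active: boolean }): Promise<TabLike | undefined>;
}

interface TabSummary {
  id: number;
  windowId: number;
  title: string;
  url: string;
}

// get_all_tabs: an empty query object matches every tab in every window.
async function getAllTabs(tabs: TabsApi): Promise<TabSummary[]> {
  const all = await tabs.query({});
  const summaries: TabSummary[] = [];
  for (const tab of all) {
    if (tab.id === undefined) continue; // some tabs have no ID
    summaries.push({ id: tab.id, windowId: tab.windowId, title: tab.title ?? "", url: tab.url ?? "" });
  }
  return summaries;
}

// switch_to_tab: activate the tab; report false if the ID is unknown.
async function switchToTab(tabs: TabsApi, tabId: number): Promise<boolean> {
  try {
    const updated = await tabs.update(tabId, { active: true });
    return updated !== undefined;
  } catch {
    return false; // the update call rejects when no tab has this ID
  }
}
```

Inside an extension with the appropriate permissions, `chrome.tabs` could be passed as the `tabs` argument.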

📄 Page Content & Interaction - 16 tools

Content extraction, analysis, and page interaction:

  • get_page_metadata - Get page metadata including title, description, keywords
  • extract_page_text - Extract text content with word count and reading time
  • get_page_links - Get all links from the current page
  • search_page_text - Search for text on the current page
  • get_interactive_elements - Get all interactive elements (links, buttons, inputs)
  • get_interactive_elements_optimized - Optimized version for complex pages
  • click_element - Click an element using CSS selector
  • summarize_page - Summarize page content with key points
  • fill_input - Fill an input field with text
  • clear_input - Clear the content of an input field
  • get_input_value - Get the current value of an input field
  • submit_form - Submit a form using CSS selector
  • get_form_elements - Get all form elements and input fields
  • scroll_to_element - Scroll to a DOM element and center it
  • highlight_element - Permanently highlight DOM elements
  • highlight_text_inline - Highlight specific words or phrases within text
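
To make the `extract_page_text` description concrete, here is a minimal sketch of how a word count and reading time could be derived from extracted text. The `textStats` helper and the 200 words-per-minute figure are assumptions for illustration. The actual tool may count words or estimate time differently.

```typescript
interface TextStats {
  wordCount: number;
  readingTimeMinutes: number;
}

// Assumed average reading speed; the real tool may use a different figure.
const WORDS_PER_MINUTE = 200;

// Count whitespace-separated words and round the reading time up to whole minutes.
function textStats(text: string, wordsPerMinute: number = WORDS_PER_MINUTE): TextStats {
  const trimmed = text.trim();
  const wordCount = trimmed === "" ? 0 : trimmed.split(/\s+/).length;
  return {
    wordCount,
    readingTimeMinutes: Math.ceil(wordCount / wordsPerMinute),
  };
}
```

Splitting on whitespace is a simplification that works for space-delimited languages. Text in languages without word spacing would need a different tokenizer.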

⬇️ Downloads & Files - 4 tools

Download control and file management:

  • download_text_as_markdown - Download text content as markdown file
  • download_image - Download an image from base64 data
  • download_chat_images - Download multiple images from chat messages
  • download_current_chat_images - Download all images from current AI chat
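
A tool such as `download_image` has to turn base64 data into something the browser can download. One plausible approach, sketched below, is to wrap the data in a `data:` URL and pair it with a sanitized filename. The resulting `{ url, filename }` object has the shape accepted by Chrome's `chrome.downloads.download` call. `buildImageDownload` and its defaults are hypothetical and not taken from the project.

```typescript
interface DownloadRequest {
  url: string;
  filename: string;
}

// Turn raw base64 (or an existing data URL) plus a suggested name into a download request.
function buildImageDownload(
  base64: string,
  suggestedName: string,
  mimeType: string = "image/png",
): DownloadRequest {
  // Accept input that is already a data URL; otherwise add the prefix.
  const url = base64.startsWith("data:") ? base64 : `data:${mimeType};base64,${base64}`;

  // Replace characters that are unsafe in filenames with underscores.
  let filename = suggestedName.replace(/[^A-Za-z0-9._-]+/g, "_");
  if (filename === "") filename = "image";

  // Add an extension derived from the MIME subtype if the name has none.
  if (!/\.[A-Za-z0-9]+$/.test(filename)) {
    filename += "." + mimeType.split("/")[1];
  }
  return { url, filename };
}
```

In an extension with the `downloads` permission, the result could be handed to `chrome.downloads.download(request)`.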

📸 Screenshots - 3 tools

Visual capture and screenshot management:

  • capture_screenshot - Capture screenshot of current visible tab
  • capture_tab_screenshot - Capture screenshot of a specific tab by ID
  • capture_screenshot_to_clipboard - Capture screenshot and save to clipboard

🔧 Advanced Features - 3+ tools

Advanced browser automation and utilities:

  • Additional specialized tools for enhanced browser control
  • AI-powered content analysis and processing
  • Custom automation workflows

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Support & Community

🏆 Contributors

Thank you to all the amazing contributors who help make ChromeAiAgent better:


ropzislaw

56 commits

Codexiaoyi

10 commits

guberm

5 commits

Total Contributors: 3 | Total Commits: 71


Want to contribute? Check out our Contributing Guide and help make ChromeAiAgent even better!

🌟 Star History

Star History Chart


Made with ❤️ by the ChromeAiAgent Team


About

AIPex: open claude for chrome, automate your browser


Languages

  • TypeScript 97.1%
  • JavaScript 1.6%
  • Other 1.3%