Probe

Probe is an AI-friendly, fully local, semantic code search tool designed to power the next generation of AI coding assistants. By combining the speed of ripgrep with the code-aware parsing of tree-sitter, Probe delivers precise results with complete code blocks—perfect for large codebases and AI-driven development workflows.

Quick Start

Basic Search Example

Search for code containing the phrase "llm pricing" in the current directory:

probe search "llm pricing" ./

Advanced Search (with Token Limiting)

Search for "prompt injection" in the current directory, limiting the total output to 10,000 tokens (useful for AI tools with context window constraints):

probe search "prompt injection" ./ --max-tokens 10000
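To make the idea of a token cap concrete, here is a minimal sketch of how a budget could be applied to a ranked list of code blocks. It is not Probe's implementation: the 4-characters-per-token estimate and the function names `estimate_tokens` and `limit_by_tokens` are assumptions made for illustration.

```python
from typing import List


def estimate_tokens(text: str) -> int:
    """Rough token estimate: about 4 characters per token (an assumption)."""
    return max(1, len(text) // 4)


def limit_by_tokens(blocks: List[str], max_tokens: int) -> List[str]:
    """Keep ranked blocks in order until the next one would exceed the budget."""
    kept: List[str] = []
    used = 0
    for block in blocks:
        cost = estimate_tokens(block)
        if used + cost > max_tokens:
            break  # stop at the first block that no longer fits
        kept.append(block)
        used += cost
    return kept


# Three hypothetical blocks of roughly 100 estimated tokens each.
ranked = ["a" * 400, "b" * 400, "c" * 400]
print(len(limit_by_tokens(ranked, 250)))
```

Because the blocks are assumed to arrive already ranked, stopping at the first block that does not fit keeps the most relevant results and never returns a partial block.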

Interactive AI Chat

Use the built-in AI assistant to ask questions about your codebase:

# Set your API key first
export ANTHROPIC_API_KEY=your_api_key
# Then start the chat interface
probe chat

MCP server

Integrate with any AI editor:

{
  "mcpServers": {
    "probe": {
      "command": "npx",
      "args": [
        "-y",
        "@buger/probe-mcp"
      ]
    }
  }
}
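If your editor already has an MCP configuration file, the Probe entry has to be merged into it rather than pasted over it. The sketch below shows one way to do that on an in-memory dictionary. The helper name `add_probe_server`, the `probe` label, and the `other-server` entry are hypothetical, and the sketch assumes the key under `mcpServers` is a free-form label.

```python
import json


def add_probe_server(config: dict, name: str = "probe") -> dict:
    """Return a copy of an MCP config with a Probe server entry added."""
    servers = dict(config.get("mcpServers", {}))
    # Command and args are taken from the configuration shown above.
    servers[name] = {"command": "npx", "args": ["-y", "@buger/probe-mcp"]}
    return {**config, "mcpServers": servers}


# A hypothetical existing configuration with one unrelated server.
existing = {"mcpServers": {"other": {"command": "other-server", "args": []}}}
print(json.dumps(add_probe_server(existing), indent=2))
```

The function returns a new dictionary instead of mutating its input, so the original configuration can still be written back unchanged if anything goes wrong.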

Example queries:

"Do the probe and search my codebase for implementations of the ranking algorithm"

"Using probe find all functions related to error handling in the src directory"

Features

AI-Friendly: Extracts entire functions, classes, or structs so AI models get full context.
Fully Local: Search runs entirely on your machine with no external APIs (only the optional AI chat modes send requests to the LLM provider you configure).
Powered by ripgrep: Extremely fast scanning of large codebases.
Tree-sitter Integration: Parses and understands code structure accurately.
Re-Rankers & NLP: Uses tokenization, stemming, BM25, TF-IDF, or hybrid ranking methods for better search results.
Multi-Language: Works with popular languages like Rust, Python, JavaScript, TypeScript, Java, Go, C/C++, Swift, C#, and more.
Interactive AI Chat: Built-in AI assistant that can answer questions about your codebase using Claude or GPT models.
Flexible: Run as a CLI tool, an MCP server, or an interactive AI chat.

Installation

Quick Installation

You can install Probe with a single command:

curl -fsSL https://raw.githubusercontent.com/buger/probe/main/install.sh | bash

What this script does:

Detects your operating system and architecture
Fetches the latest release from GitHub
Downloads the appropriate binary for your system
Verifies the checksum for security
Installs the binary to /usr/local/bin
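If you download a release archive yourself, you can repeat the checksum step by hand. The following is a minimal sketch, assuming the published checksum is a SHA256 hex digest; the function names are illustrative and are not part of the install script.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checksum(path: Path, expected_hex: str) -> bool:
    """Compare a file's digest with the expected value, ignoring case and whitespace."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, `verify_checksum(Path("probe-x86_64-linux.tar.gz"), expected)` returns `True` only when the archive matches the digest you copied from the release.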

Requirements

Operating Systems: macOS, Linux, or Windows (with MSYS/Git Bash/WSL)
Architectures: x86_64 (all platforms) or ARM64 (macOS only)
Tools: curl, bash, and sudo/root privileges

Manual Installation

Download the appropriate binary for your platform from the GitHub Releases page:
- probe-x86_64-linux.tar.gz for Linux (x86_64)
- probe-x86_64-darwin.tar.gz for macOS (Intel)
- probe-aarch64-darwin.tar.gz for macOS (Apple Silicon)
- probe-x86_64-windows.zip for Windows
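Picking the right archive comes down to a lookup on operating system and architecture. The sketch below encodes the four archives listed above. The alias table (`amd64`, `arm64`) and the function name `asset_for` are assumptions added for illustration.

```python
ASSETS = {
    ("linux", "x86_64"): "probe-x86_64-linux.tar.gz",
    ("darwin", "x86_64"): "probe-x86_64-darwin.tar.gz",
    ("darwin", "aarch64"): "probe-aarch64-darwin.tar.gz",
    ("windows", "x86_64"): "probe-x86_64-windows.zip",
}

# Common alternative spellings reported by some platforms (an assumption).
ARCH_ALIASES = {"amd64": "x86_64", "arm64": "aarch64"}


def asset_for(system: str, machine: str) -> str:
    """Map an OS name and CPU architecture to a release archive name."""
    arch = machine.lower()
    arch = ARCH_ALIASES.get(arch, arch)
    try:
        return ASSETS[(system.lower(), arch)]
    except KeyError:
        raise ValueError(f"no prebuilt binary for {system}/{machine}") from None


print(asset_for("Darwin", "arm64"))
```

On a real machine you could pass `platform.system()` and `platform.machine()` from Python's standard library. Combinations that are not listed raise a `ValueError`, which is the point at which building from source becomes the fallback.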

Extract the archive:

# For Linux/macOS
tar -xzf probe-*-*.tar.gz

# For Windows
unzip probe-x86_64-windows.zip

Move the binary to a location in your PATH:

# For Linux/macOS
sudo mv probe /usr/local/bin/

# For Windows
# Move probe.exe to a directory in your PATH

Building from Source

Install Rust and Cargo (if not already installed):

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

Clone this repository:

git clone https://github.com/buger/probe.git
cd probe

Build the project:
```
cargo build --release
```
(Optional) Install globally:
```
cargo install --path .
```

Verifying the Installation

probe --version

Troubleshooting

Permissions: Ensure you can write to /usr/local/bin.
System Requirements: Double-check your OS/architecture.
Manual Install: If the quick install script fails, try Manual Installation.
GitHub Issues: Report issues on the GitHub repository.

Uninstalling

sudo rm /usr/local/bin/probe

Usage

Probe can be used in three main modes:

CLI Mode: Direct code search from the command line
MCP Server Mode: Run as a server exposing search functionality via MCP
AI Chat Mode: Interactive AI assistant for code exploration

CLI Mode

probe search <SEARCH_PATTERN> [OPTIONS]

Key Options

<SEARCH_PATTERN>: Pattern to search for (required)
--files-only: Skip AST parsing; only list files with matches
--ignore: Custom ignore patterns (in addition to .gitignore)
--include-filenames, -n: Include files whose names match query words
--reranker, -r: Choose a re-ranking algorithm (hybrid, hybrid2, bm25, tfidf)
--frequency, -s: Frequency-based search (tokenization, stemming, stopword removal)
--exact: Exact matching (overrides frequency search)
--max-results: Maximum number of results to return
--max-bytes: Maximum total bytes of code to return
--max-tokens: Maximum total tokens of code to return (useful for AI)
--allow-tests: Include test files and test code blocks
--any-term: Match files containing any query terms (default is all terms)
--no-merge: Disable merging of adjacent code blocks after ranking (merging enabled by default)
--merge-threshold: Max lines between code blocks to consider them adjacent for merging (default: 5)
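The effect of `--merge-threshold` is easiest to see on line ranges. This is a minimal sketch, not Probe's implementation: it assumes all blocks come from one file and that the gap is measured as the difference between a block's start line and the previous block's end line. The name `merge_adjacent` is hypothetical.

```python
from typing import List, Tuple

Block = Tuple[int, int]  # (start_line, end_line), both inclusive


def merge_adjacent(blocks: List[Block], threshold: int = 5) -> List[Block]:
    """Merge blocks whose gap to the previous block is at most `threshold` lines."""
    merged: List[Block] = []
    for start, end in sorted(blocks):
        if merged and start - merged[-1][1] <= threshold:
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))  # extend previous block
        else:
            merged.append((start, end))
    return merged


# Blocks at lines 1-10 and 14-20 are close enough to merge; 30-40 is not.
print(merge_adjacent([(1, 10), (14, 20), (30, 40)]))
```

Under this reading, a larger threshold yields fewer, longer results, while `--no-merge` corresponds to skipping the step entirely.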

Examples

# 1) Search for "setTools" in the current directory with frequency-based search
probe search "setTools"

# 2) Search for "impl" in ./src with exact matching
probe search "impl" ./src --exact

# 3) Search for "keyword" returning only the top 5 results
probe search "keyword" --max-results 5

# 4) Search for "function" and disable merging of adjacent code blocks
probe search "function" --no-merge
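When calling Probe from a script, it helps to build the argument list programmatically instead of concatenating strings. The sketch below uses only options documented above. The function `build_search_command` and its parameter names are hypothetical.

```python
import shlex
from typing import List, Optional


def build_search_command(
    pattern: str,
    path: str = ".",
    *,
    exact: bool = False,
    allow_tests: bool = False,
    max_results: Optional[int] = None,
    max_tokens: Optional[int] = None,
    reranker: Optional[str] = None,
) -> List[str]:
    """Build an argument list for `probe search` from keyword options."""
    cmd = ["probe", "search", pattern, path]
    if exact:
        cmd.append("--exact")
    if allow_tests:
        cmd.append("--allow-tests")
    if max_results is not None:
        cmd += ["--max-results", str(max_results)]
    if max_tokens is not None:
        cmd += ["--max-tokens", str(max_tokens)]
    if reranker is not None:
        cmd += ["--reranker", reranker]
    return cmd


cmd = build_search_command("prompt injection", "./", max_tokens=10000)
print(shlex.join(cmd))  # shell-quoted form, handy for logging
```

The list can be passed directly to `subprocess.run(cmd, capture_output=True, text=True)`, which avoids shell quoting problems with patterns that contain spaces.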


MCP Server Mode

Add the following to your AI editor's MCP configuration file:
  
~~~json
{
  "mcpServers": {
    "probe": {
      "command": "npx",
      "args": [
        "-y",
        "@buger/probe-mcp"
      ]
    }
  }
}
~~~

Example Usage in AI Editors:

Once configured, you can ask your AI assistant to search your codebase with natural language queries like:

"Do the probe and search my codebase for implementations of the ranking algorithm"

"Using probe find all functions related to error handling in the src directory"

AI Chat Mode

Run Probe as an interactive AI assistant:

probe chat

This starts an interactive CLI interface where you can ask questions about your codebase and get AI-powered responses.

Features

AI-Powered Search: Uses LLMs to understand your questions and search the codebase intelligently
Multi-Model Support: Works with both Anthropic's Claude and OpenAI's GPT models
Token Tracking: Monitors token usage for both requests and responses
Conversation History: Maintains context across multiple interactions
Colored Output: Provides a user-friendly terminal interface with syntax highlighting

Configuration

Configure the chat using environment variables:

# Use Claude models (recommended)
export ANTHROPIC_API_KEY=your_api_key
probe chat

# Or use OpenAI models
export OPENAI_API_KEY=your_api_key
probe chat

# Override the default model
export MODEL_NAME=claude-3-opus-20240229
probe chat

# Override API URLs (useful for proxies or enterprise deployments)
export ANTHROPIC_API_URL=https://your-anthropic-proxy.com
export OPENAI_API_URL=https://your-openai-proxy.com/v1
probe chat

# Enable debug mode for detailed logging
export DEBUG=1
probe chat
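The environment variables above imply a simple selection rule: whichever API key is present decides the provider, and `MODEL_NAME` optionally overrides the model. The sketch below illustrates that rule. Preferring Anthropic when both keys are set is an assumption, not documented behavior, and `pick_provider` is a hypothetical name.

```python
from typing import Mapping, Optional, Tuple


def pick_provider(env: Mapping[str, str]) -> Optional[Tuple[str, Optional[str]]]:
    """Return (provider, model_override), or None if no API key is set."""
    model = env.get("MODEL_NAME") or None
    if env.get("ANTHROPIC_API_KEY"):  # assumed to take precedence
        return ("anthropic", model)
    if env.get("OPENAI_API_KEY"):
        return ("openai", model)
    return None


print(pick_provider({"OPENAI_API_KEY": "your_api_key"}))
```

Passing the environment as a mapping (for example `os.environ`) keeps the function easy to test without touching real credentials.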

Example Usage

❯ How does the ranking algorithm work?

─ Response ─────────────────────────────────────────────────────────
I'll explain how the ranking algorithm works in the codebase.

The ranking system in Probe uses multiple algorithms to sort search results by relevance. The main ranking algorithms are:

1. TF-IDF (Term Frequency-Inverse Document Frequency)
2. BM25 (Best Matching 25)
3. Hybrid (a combination approach)

Let me break down each approach:

## TF-IDF Ranking

TF-IDF weighs terms based on how frequently they appear in a document versus how common they are across all documents.

Key implementation details:
- Term frequency (TF) measures how often a term appears in a document
- Inverse document frequency (IDF) measures how rare a term is across all documents
- Final score is calculated as TF × IDF

## BM25 Ranking

BM25 is an advanced ranking function that improves upon TF-IDF by adding document length normalization.

Key implementation details:
- Uses parameters k1 (term frequency saturation) and b (document length normalization)
- Handles edge cases like empty documents and rare terms
- Provides better results for longer documents

## Hybrid Ranking

The hybrid approach combines multiple ranking signals for better results:

1. Combines scores from both TF-IDF and BM25
2. Considers document length and term positions
3. Applies normalization to ensure fair comparison

The default reranker is "hybrid" which provides the best overall results for code search.

The ranking implementation can be found in `src/search/result_ranking.rs`.
─────────────────────────────────────────────────────────────────────
Token Usage: Request: 1245 Response: 1532 (Current message only: ~1532)
Total: 2777 tokens (Cumulative for entire session)
─────────────────────────────────────────────────────────────────────
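The BM25 scoring mentioned in the example response can be sketched in a few lines. This is a generic textbook-style variant, not the code in Probe's ranking module: the defaults `k1 = 1.5` and `b = 0.75` and the smoothed IDF formula are assumptions, and documents are represented as pre-tokenized lists of terms.

```python
import math
from collections import Counter
from typing import List


def bm25_scores(
    query: List[str], docs: List[List[str]], k1: float = 1.5, b: float = 0.75
) -> List[float]:
    """Score each tokenized document against the query terms with BM25."""
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n if n else 0.0
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for doc in docs:
        tf = Counter(doc)
        score = 0.0
        for term in query:
            if tf[term] == 0:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            # k1 controls term-frequency saturation, b controls length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(doc) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


docs = [["rank", "bm25", "score"], ["parse", "tree"], ["rank", "rank", "rank", "file"]]
print(bm25_scores(["rank"], docs))
```

In this toy corpus the second document scores zero because it never mentions the query term, and the third outranks the first because the term appears more often, even after length normalization.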

Web Interface

Probe includes a web-based chat interface that provides a user-friendly way to interact with your codebase using AI. The web interface is located in the web/ folder and offers a modern UI for code search and AI-powered code exploration.

Features

Interactive Chat UI: Clean, modern interface with markdown and syntax highlighting
AI-Powered Code Search: Uses Claude AI to search and explain your codebase
Mermaid Diagram Support: Renders visual diagrams for code architecture and flows
Configurable Search Paths: Define which directories can be searched via environment variables

Setup and Configuration

Navigate to the web directory:
```
cd web
```
Install dependencies:
```
npm install
```

Configure environment variables: Create or edit the .env file in the web directory:

ANTHROPIC_API_KEY=your_anthropic_api_key
PORT=8080
ALLOWED_FOLDERS=/path/to/folder1,/path/to/folder2

Start the server:
```
npm start
```
Access the web interface: Open your browser and navigate to http://localhost:8080

Technical Details

Built with vanilla JavaScript and Node.js
Uses the Vercel AI SDK for Claude integration
Executes Probe commands via the probeTool.js module
Renders markdown with Marked.js and syntax highlighting with Highlight.js
Supports Mermaid.js for diagram generation and visualization

Supported Languages

Probe currently supports:

Rust (.rs)
JavaScript / JSX (.js, .jsx)
TypeScript / TSX (.ts, .tsx)
Python (.py)
Go (.go)
C / C++ (.c, .h, .cpp, .cc, .cxx, .hpp, .hxx)
Java (.java)
Ruby (.rb)
PHP (.php)
Swift (.swift)
C# (.cs)
Markdown (.md, .markdown)
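The list above amounts to a lookup table from file extension to language. The following sketch shows that table as data. Lowercasing the extension before the lookup is an assumption, and `detect_language` is a hypothetical helper rather than Probe's detection code.

```python
import os
from typing import Optional

EXTENSIONS = {
    "Rust": [".rs"],
    "JavaScript / JSX": [".js", ".jsx"],
    "TypeScript / TSX": [".ts", ".tsx"],
    "Python": [".py"],
    "Go": [".go"],
    "C / C++": [".c", ".h", ".cpp", ".cc", ".cxx", ".hpp", ".hxx"],
    "Java": [".java"],
    "Ruby": [".rb"],
    "PHP": [".php"],
    "Swift": [".swift"],
    "C#": [".cs"],
    "Markdown": [".md", ".markdown"],
}

# Invert the table so lookups go from extension to language.
LANGUAGE_BY_EXT = {ext: lang for lang, exts in EXTENSIONS.items() for ext in exts}


def detect_language(filename: str) -> Optional[str]:
    """Return the language for a file name, or None if the extension is unknown."""
    ext = os.path.splitext(filename)[1].lower()
    return LANGUAGE_BY_EXT.get(ext)


print(detect_language("src/main.rs"))
```

Files with an unknown or missing extension return `None`, which a caller could treat as "skip AST parsing for this file".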

How It Works

Probe combines fast file scanning with deep code parsing to provide highly relevant, context-aware results:

Ripgrep Scanning: Probe uses ripgrep to quickly search across your files, identifying lines that match your query. Ripgrep's efficiency allows it to handle massive codebases at lightning speed.
AST Parsing with Tree-sitter: For each file containing matches, Probe uses tree-sitter to parse the file into an Abstract Syntax Tree (AST). This process ensures that code blocks (functions, classes, structs) can be identified precisely.
NLP & Re-Rankers: Next, Probe applies classical NLP methods (tokenization, stemming, and stopword removal) alongside re-rankers such as BM25, TF-IDF, or the hybrid approach (combining multiple ranking signals). This step elevates the most relevant code blocks to the top, which is especially helpful for AI-driven searches.
Block Extraction: Probe identifies the smallest complete AST node containing each match (e.g., a full function or class). It extracts these code blocks and aggregates them into search results.
Context for AI: Finally, these structured blocks can be returned directly or fed into an AI system. By providing the full context of each code segment, Probe helps AI models navigate large codebases and produce more accurate insights.
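The block extraction step, finding the smallest complete node that contains a matching line, can be illustrated on a toy syntax tree. The `Node` class below is a simplified stand-in and not the tree-sitter API. The set of block kinds and the function name `smallest_block` are assumptions for the sketch.

```python
from dataclasses import dataclass, field
from typing import List, Optional

BLOCK_KINDS = {"function", "class", "struct"}


@dataclass
class Node:
    kind: str
    start: int  # first line, inclusive
    end: int    # last line, inclusive
    children: List["Node"] = field(default_factory=list)


def smallest_block(node: Node, line: int) -> Optional[Node]:
    """Return the deepest block-like node whose line range contains `line`."""
    if not (node.start <= line <= node.end):
        return None
    for child in node.children:
        found = smallest_block(child, line)
        if found is not None:
            return found  # a deeper block wins over its ancestors
    return node if node.kind in BLOCK_KINDS else None


tree = Node("file", 1, 50, [
    Node("class", 1, 30, [
        Node("function", 5, 12, [Node("statement", 7, 7)]),
        Node("function", 14, 28),
    ]),
    Node("function", 35, 48),
])
match = smallest_block(tree, 7)
print(match.kind, match.start, match.end)
```

A match on line 7 resolves to the method spanning lines 5 to 12 rather than the single statement or the whole class, which is what lets a result carry a complete, self-contained unit of code.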

Adding Support for New Languages

Tree-sitter Grammar: In Cargo.toml, add the tree-sitter parser for the new language.
Language Module: Create a new file in src/language/ for parsing logic.
Implement Language Trait: Adapt the parse method for the new language constructs.
Factory Update: Register your new language in Probe's detection mechanism.

Releasing New Versions

Probe uses GitHub Actions for multi-platform builds and releases.

Update Cargo.toml with the new version.

Create a new Git tag:

git tag -a vX.Y.Z -m "Release vX.Y.Z"
git push origin vX.Y.Z

GitHub Actions will build, package, and draft a new release with checksums.

Each release includes:

Linux binary (x86_64)
macOS binaries (x86_64 and aarch64)
Windows binary (x86_64)
SHA256 checksums

We believe that local, privacy-focused, semantic code search is essential for the future of AI-assisted development. Probe is built to empower developers and AI alike to navigate and comprehend large codebases more effectively.

For questions or contributions, please open an issue on GitHub. Happy coding—and searching!
