A fast, context-aware code documentation chat interface that provides intelligent responses based on your codebase and documentation.
Doc2Talk was born out of a real need I encountered while developing Crawl4AI (https://github.com/unclecode/crawl4ai). I wanted to provide excellent coding assistance to users with questions about the library and its usage. The tool turned out to be remarkably helpful, smarter and more accurate than many other available solutions.
The key insight was that LLMs need both code and documentation to provide truly helpful responses. By building a knowledge graph that combines the codebase and documentation, and searching it with embeddings and BM25, Doc2Talk can intelligently generate context for each user question.
As conversations continue, it efficiently maintains context, adding or removing information as needed to ensure it can answer follow-up questions. The system decides whether to replace, append, or reuse existing context based on the conversation flow.
Currently, Doc2Talk operates in the terminal, with plans for a web version coming soon. I hope this tool proves as helpful for your projects as it has been for mine.
- Knowledge Graph Indexing: Indexes both code and documentation for comprehensive understanding
- Smart Context Management: Intelligently decides when to fetch new context or use existing context
- Lazy Initialization: Fast startup with on-demand index building
- Repository Caching: Efficiently caches GitHub repositories for faster repeated access
- Custom LLM Support: Configure different models for decisions and responses
- High-Performance Storage: Uses optimized serialization for 100x faster loading times
- Terminal UI: Rich terminal interface with interactive streaming responses
- Session Management: Save and restore chat sessions for continuous workflows
```bash
# Install from PyPI
pip install doc2talk

# Install from source
git clone https://github.com/unclecode/doc2talk.git
cd doc2talk
pip install -e .
```
```bash
# Specify code and documentation sources (required)
doc2talk --code /path/to/code --docs /path/to/docs

# Use a GitHub repository
doc2talk --code https://github.com/unclecode/crawl4ai/tree/main/crawl4ai --docs https://github.com/unclecode/crawl4ai/tree/main/docs/md_v2

# Exclude specific patterns
doc2talk --exclude "*/tests/*" --exclude "*/node_modules/*"

# Session management
doc2talk --list                  # List all chat sessions
doc2talk --continue SESSION_ID   # Continue existing session
doc2talk --delete SESSION_ID     # Delete a session
```
```python
from doc2talk import Doc2Talk

# Create an instance with sources (required)
doc2talk = Doc2Talk(
    code_source="https://github.com/unclecode/crawl4ai/tree/main/crawl4ai",
    docs_source="https://github.com/unclecode/crawl4ai/tree/main/docs/md_v2",
    exclude_patterns=["*/tests/*"]
)

# The index is built lazily: chat operations build it automatically,
# or you can build it explicitly before chatting
doc2talk.build_index()

# Ask questions and get responses
response = doc2talk.chat("How does the crawler work?")
print(response)

# Streaming responses (synchronous)
for chunk in doc2talk.chat_stream("What are the main components?"):
    print(chunk, end="", flush=True)

# Streaming responses (asynchronous, from within an async function)
import asyncio

async def stream_example():
    async for chunk in doc2talk.chat_stream_async("How does it work?"):
        print(chunk, end="", flush=True)

asyncio.run(stream_example())

# Customize history and context limits
doc2talk = Doc2Talk(
    code_source="/path/to/code",
    max_history=100,  # Keep up to 100 messages (default is 50)
    max_contexts=10   # Keep up to 10 contexts (default is 5)
)

# Custom LLM configurations
from doc2talk.models import LLMConfig

# Configure different models with specific parameters
doc2talk = Doc2Talk(
    code_source="/path/to/code",
    # Model for context decisions
    decision_llm_config=LLMConfig(
        model="gpt-4o",  # Stronger model for context decisions
        temperature=0.6,
        max_tokens=200
    ),
    # Model for response generation
    generation_llm_config=LLMConfig(
        model="gpt-4o-mini",  # Cheaper model for responses
        temperature=0.6,
        max_tokens=2 ** 12  # Allow longer responses
    )
)

# Session management
print(f"Current session ID: {doc2talk.session_id}")
sessions = Doc2Talk.list_sessions()
Doc2Talk.delete_session("session_id")

# Loading from an existing index
doc2talk = Doc2Talk.from_index(
    "/path/to/index.c4ai",
    max_history=75,
    max_contexts=3
)
```
Doc2Talk combines several powerful components:
- DocGraph: A knowledge graph engine that:
  - Indexes Python code (classes, functions) and Markdown docs
  - Uses the BM25 search algorithm for accurate retrieval (see the sketch after this list)
  - Supports both local paths and GitHub repositories
  - Maintains optimized serialization for fast loading
- Contextual Decision Making:
  - Uses LLMs to determine when to fetch new context
  - Decides between replacement, addition, or reuse of context
  - Avoids redundant context fetching
- Repository Management:
  - Caches repositories at `~/.doctalk/repos/`
  - Intelligently handles multiple paths from the same repository
  - Auto-cleans old repositories not accessed in 30 days
- Index Caching:
  - Stores indexed knowledge graphs at `~/.doctalk/index/`
  - Uses a compressed binary format for 50-70% smaller files
  - Memory-maps files for efficient partial loading (see the sketch after the directory layout below)
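To make the retrieval step concrete, here is a minimal sketch of BM25 ranking over code and documentation chunks. It uses the third-party `rank_bm25` package and invented chunk strings; it illustrates the technique, not DocGraph's actual implementation.

```python
# A minimal sketch of BM25 retrieval over code/doc chunks, using the
# third-party rank_bm25 package. Chunk strings are invented for
# illustration; DocGraph's real chunking and scoring may differ.
from rank_bm25 import BM25Okapi

# Hypothetical chunks extracted from Python code and Markdown docs
chunks = [
    "class DocGraph: builds a knowledge graph from code and docs",
    "def search(query): rank chunks with BM25 and return the top matches",
    "## Installation: pip install doc2talk",
]

tokenized = [chunk.lower().split() for chunk in chunks]
bm25 = BM25Okapi(tokenized)

query = "how does search work".split()
scores = bm25.get_scores(query)

# Hand the highest-scoring chunks to the LLM as context
top = sorted(zip(scores, chunks), reverse=True)[:2]
for score, chunk in top:
    print(f"{score:.2f}  {chunk}")
```

In the real system the chunks come from parsed classes, functions, and Markdown sections, and the resulting graph is cached on disk under the following layout: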
```
~/.doctalk/
├── index/      # Cached knowledge graphs
├── repos/      # Cached GitHub repositories
└── sessions/   # Saved chat sessions
```
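The index files combine two standard techniques: compressed binary serialization to shrink the files, and memory-mapped reading to avoid loading everything eagerly. The snippet below sketches both with `pickle`, `zlib`, and `mmap`; the real `.c4ai` format is Doc2Talk's own and may differ.

```python
# Illustrative only: compression and memory-mapping shown with pickle,
# zlib, and mmap. This is not Doc2Talk's actual on-disk format.
import mmap
import pickle
import zlib

graph = {"nodes": ["DocGraph", "search"], "edges": [("DocGraph", "search")]}

# Compressed binary serialization: typically far smaller on disk
with open("index.bin", "wb") as f:
    f.write(zlib.compress(pickle.dumps(graph)))

# Memory-mapped reading: the OS pages bytes in on demand instead of
# copying the whole file into memory up front
with open("index.bin", "rb") as f:
    with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        restored = pickle.loads(zlib.decompress(mm[:]))

print(restored["nodes"])
```

Compression keeps the cache small, while memory-mapping means only the parts of the index a query actually touches are paged into memory.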
Start a new chat session by running the CLI:
```bash
doc2talk --code /path/to/code --docs /path/to/docs
```
The first time you run it, it will:
- Clone the repository if it's a GitHub URL, or reuse the cached copy if one exists (see the sketch after this list)
- Build the knowledge graph (or load from cache if available)
- Present you with an interactive chat interface
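For illustration, the following sketch shows a cache-or-clone helper with the 30-day cleanup described earlier. The cache path matches the layout above, but the helper itself (its name, the git CLI call, the access-time check) is an assumption, not Doc2Talk's actual code.

```python
# Illustrative only: a cache-or-clone helper with a 30-day cleanup,
# in the spirit of Doc2Talk's repository caching. The helper name, the
# git CLI call, and the access-time check are assumptions.
import shutil
import subprocess
import time
from pathlib import Path

CACHE = Path.home() / ".doctalk" / "repos"
MAX_AGE = 30 * 24 * 3600  # 30 days, in seconds

def get_repo(url: str) -> Path:
    CACHE.mkdir(parents=True, exist_ok=True)
    # Drop cached repositories that have not been accessed in 30 days
    for repo in CACHE.iterdir():
        if repo.is_dir() and time.time() - repo.stat().st_atime > MAX_AGE:
            shutil.rmtree(repo)
    target = CACHE / url.rstrip("/").split("/")[-1]
    if not target.exists():
        subprocess.run(["git", "clone", "--depth", "1", url, str(target)],
                       check=True)
    return target

print(get_repo("https://github.com/unclecode/doc2talk"))
```

Subsequent runs with the same repository skip the clone entirely, which removes most of the startup cost.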
Simply type your questions in the terminal. Doc2Talk will:
- Analyze your question
- Determine if new context is needed
- Search the knowledge graph for relevant information
- Generate a response based on the retrieved context
Example questions:
- "How does the crawling functionality work?"
- "Can you explain the BM25 search implementation?"
- "What's the structure of the DocGraph class?"
Doc2Talk uses three context modes:
- New: Completely replaces existing context with fresh search results
- Additional: Adds new search results to existing context
- None: Uses existing context without searching
The system automatically decides which mode to use based on your questions.
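As a rough sketch, the three modes map onto simple list operations over the conversation's context window; the function and mode names below are illustrative, not Doc2Talk's internal API.

```python
# Illustrative only: how the three context modes could act on the running
# context list. Function and mode names are hypothetical.
def apply_context_decision(mode, contexts, search_results, max_contexts=5):
    if mode == "new":
        # Replace everything with fresh search results
        contexts = list(search_results)
    elif mode == "additional":
        # Keep the existing context and append the new results
        contexts = contexts + list(search_results)
    # mode == "none": reuse the existing context unchanged
    # Trim to the configured limit (cf. max_contexts in the Python API above)
    return contexts[-max_contexts:]

contexts = apply_context_decision("new", [], ["chunk about the crawler"])
contexts = apply_context_decision("additional", contexts, ["chunk about BM25"])
contexts = apply_context_decision("none", contexts, [])
print(contexts)  # ['chunk about the crawler', 'chunk about BM25']
```

Because "None" is an option, repeated follow-up questions on the same topic never trigger redundant searches.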
Doc2Talk supports using different models for different tasks:
- Context Decisions: Use a cheaper model to decide when to update context
- Response Generation: Use a more powerful model to generate responses
This allows you to balance cost and quality.
Doc2Talk initializes quickly by using lazy loading:
- Creating a Doc2Talk instance is nearly instantaneous
- The knowledge graph is only built when needed
- You can control when to build the index with `build_index()`
This provides faster startup time for applications.
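The pattern behind this is ordinary lazy initialization; here is a minimal sketch using `functools.cached_property` as a stand-in for Doc2Talk's internals.

```python
# A minimal sketch of the lazy-initialization pattern, using
# functools.cached_property. A stand-in, not Doc2Talk's internals.
import time
from functools import cached_property

class LazyIndex:
    @cached_property
    def graph(self):
        # The expensive build runs only on first access, then is cached
        time.sleep(1)  # stand-in for real indexing work
        return {"indexed": True}

    def chat(self, question):
        _ = self.graph  # the first chat call triggers the build
        return f"answer to {question!r}"

lazy = LazyIndex()         # returns instantly; nothing is built yet
print(lazy.chat("hi"))     # pays the one-time indexing cost
print(lazy.chat("again"))  # reuses the cached graph
```

This is why creating an instance in the Python API returns immediately, while the first `chat()` call (or an explicit `build_index()`) pays the indexing cost.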
Doc2Talk is designed for speed and efficiency:
- Cached repositories avoid repeated cloning
- Optimized serialization for 100x faster loading
- Memory-mapped reading for efficient resource usage
- Smart context decisions to minimize redundant searches
- Lazy initialization for faster application startup
Contributions are welcome! Here are some areas for improvement:
- Adding support for more repository types (GitLab, Bitbucket)
- Extending language support beyond Python and Markdown
- Improving the semantic search capabilities
- Enhancing the terminal UI