RAG Chatbot

A Retrieval-Augmented Generation (RAG) chatbot built with Next.js, Pinecone, HuggingFace embeddings, and Google Gemini. It answers questions grounded in your uploaded documents and can run at zero monthly cost on free-tier services.

Features

Document Intelligence - Upload and chat with PDF, DOCX, MD, and TXT files
Multi-User Support - Secure namespace-based user isolation
Real-time Chat - Streaming responses with document context
Smart Search - Semantic document retrieval with relevance scoring
Production Ready - Authentication, error handling, and monitoring
Developer Friendly - TypeScript, comprehensive error handling

Architecture

graph TD
    A[User] --> B[Next.js Frontend]
    B --> C[API Routes]
    C --> D[Authentication]
    D --> E[Document Processing]
    E --> F[HuggingFace Embeddings]
    F --> G[Pinecone Vector Store]
    G --> H[RAG Context]
    H --> I[Gemini AI]
    I --> J[Streaming Response]

Quick Start

Prerequisites

Node.js 18+ and npm
Pinecone Account (free tier)
HuggingFace Account (optional, for higher rate limits)
Google AI API Key (for Gemini)
PostgreSQL Database (local or cloud)

Step 1: Clone and Install

# Clone the repository
git clone <your-repository-url>
cd rag

# Install dependencies
npm install

Step 2: Set Up Free Tier Accounts

2.1 Pinecone Setup (Required)

Create Account: Go to pinecone.io and sign up
Create Project: Create a new project in the dashboard
Get API Key:
- Go to "API Keys" in the left sidebar
- Copy your API key
- Note: The system will auto-create your index on first use

2.2 HuggingFace Setup (Optional but Recommended)

Create Account: Go to huggingface.co and sign up
Generate Token:
- Go to Settings → Access Tokens
- Create a new token with "Read" permissions
- Copy the token

2.3 Google AI Setup (Required)

Get API Key: Go to Google AI Studio
Create Key: Generate a new API key for Gemini

Step 3: Neon Postgres Setup via Vercel Marketplace

Set up your serverless PostgreSQL database through Vercel's integration with Neon:

3.1 Create Neon Database via Vercel

Login to Vercel: Go to vercel.com and sign in
Navigate to Storage: In your project dashboard, click the Storage tab
Create Database:
- Click Create Database or Browse Storage
- Select Neon from the "Marketplace Database Providers" section
- Choose a database name (e.g., rag-chatbot-db)
- Select your preferred region (closest to your users)
- Click Continue and authorize the integration

3.2 Connect Database to Project

Complete Setup: After authorizing Neon, complete the database creation process
Connect Project: Click Connect Project and select your RAG chatbot project
Auto-Injection: Vercel automatically injects POSTGRES_URL and other database environment variables
Verify Connection: Check your project's environment variables to confirm POSTGRES_URL is present

Note: You now have a Neon serverless PostgreSQL database connected through Vercel's marketplace. No separate Neon account setup required.

3.3 Pull Environment Variables Locally

# Install Vercel CLI (if not already installed)
npm install -g vercel

# Login to Vercel
vercel login

# Link your local project (if not already linked)
vercel link

# Pull environment variables to local development
vercel env pull .env.local

This creates or updates your .env.local file with the Neon Postgres connection string (POSTGRES_URL).

Step 4: Environment Configuration

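After pulling the Vercel environment variables, your .env.local should already contain the Neon database connection. Add the remaining required variables to it:

# If you have no .env.local yet, start from the example file.
# If you ran `vercel env pull` in Step 3.3, this leaves the pulled file untouched (a plain cp would overwrite it).
[ -f .env.local ] || cp .env.example .env.local

# Then edit .env.local with your actual API keys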

Your .env.local should contain (see .env.example for template):

# Database (Automatically added by Neon via Vercel)
POSTGRES_URL="neon-connection-string-from-vercel-env-pull"

# Authentication (Required - Generate with: openssl rand -base64 32)
AUTH_SECRET="your-random-secret-key-min-32-chars"

# AI Models (Required)
GOOGLE_GENERATIVE_AI_API_KEY="your-google-ai-api-key"

# Pinecone (Required)
PINECONE_API_KEY="your-pinecone-api-key"
PINECONE_INDEX_NAME="rag-documents"

# HuggingFace (Optional but Recommended - for higher rate limits)
HUGGINGFACE_API_KEY="your-huggingface-token"

# RAG Configuration (Optional - uses smart defaults)
RAG_CHUNK_SIZE=1000
RAG_CHUNK_OVERLAP=200
RAG_MAX_DOCS=5
RAG_SIMILARITY_THRESHOLD=0.7
RAG_QUERY_EXPANSION=false
RAG_RERANKING=false
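
The optional RAG settings can be read with fallbacks. The following is a minimal sketch, assuming the values shown above are the defaults; `loadRagConfig` and the `RagConfig` shape are hypothetical names for illustration, not the project's actual loader.

```typescript
// Hypothetical config loader; names and structure are assumptions.
interface RagConfig {
  chunkSize: number;
  chunkOverlap: number;
  maxDocs: number;
  similarityThreshold: number;
  queryExpansion: boolean;
  reranking: boolean;
}

type Env = Record<string, string | undefined>;

// Parse a numeric variable; fall back when it is unset, empty, or not a finite number.
function numberFrom(env: Env, key: string, fallback: number): number {
  const raw = env[key];
  if (raw === undefined || raw.trim() === "") return fallback;
  const parsed = Number(raw);
  return isFinite(parsed) ? parsed : fallback;
}

// Treat only the string "true" (case-insensitive) as enabled.
function flagFrom(env: Env, key: string, fallback: boolean): boolean {
  const raw = env[key];
  return raw === undefined ? fallback : raw.trim().toLowerCase() === "true";
}

function loadRagConfig(env: Env): RagConfig {
  const config: RagConfig = {
    chunkSize: numberFrom(env, "RAG_CHUNK_SIZE", 1000),
    chunkOverlap: numberFrom(env, "RAG_CHUNK_OVERLAP", 200),
    maxDocs: numberFrom(env, "RAG_MAX_DOCS", 5),
    similarityThreshold: numberFrom(env, "RAG_SIMILARITY_THRESHOLD", 0.7),
    queryExpansion: flagFrom(env, "RAG_QUERY_EXPANSION", false),
    reranking: flagFrom(env, "RAG_RERANKING", false),
  };
  // An overlap as large as the chunk itself would never advance through the text.
  if (config.chunkOverlap >= config.chunkSize) {
    throw new Error("RAG_CHUNK_OVERLAP must be smaller than RAG_CHUNK_SIZE");
  }
  return config;
}
```

In a Node.js process this would be called as `loadRagConfig(process.env)`; taking the environment as a parameter keeps the function easy to test.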

Step 5: Database Migration

# Run database migrations to set up tables in Neon database
npm run db:migrate

# Optional: Push schema changes directly to Neon
npm run db:push

Note: The database operations will run against your Neon PostgreSQL database using the POSTGRES_URL environment variable.

Step 6: Test the RAG System

# Test Pinecone + HuggingFace integration
npm run rag:test

Expected output:

{
  "success": true,
  "embeddings_test": { "success": true, "dimensions": 384 },
  "pinecone_test": { "pinecone_connected": true, "index_name": "rag-documents" }
}

Step 7: Start the Application

# Development mode
npm run dev

# Production build
npm run build
npm start

Open http://localhost:3000 in your browser.

How to Use

1. User Registration/Login

Navigate to the application
Click "Register" to create an account
Verify your email (if configured)
Login with your credentials

2. Upload Documents

Click the Document Manager (folder icon in the navbar)
Upload Files:
- Drag & drop files or click "Choose Files"
- Supported: PDF, DOCX, MD, TXT (max 10MB each)
- Files are automatically processed and indexed
View Progress: See upload status and chunk counts
Manage Documents: View, delete, or re-upload files
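
The chunk count shown after an upload comes from splitting the extracted text into overlapping pieces. A minimal sketch of that idea, assuming character-based windows controlled by RAG_CHUNK_SIZE and RAG_CHUNK_OVERLAP; `chunkText` is a hypothetical name, and the project's processor may split on tokens or sentence boundaries instead.

```typescript
// Hypothetical chunker: fixed-size character windows that overlap by `overlap` characters.
function chunkText(text: string, chunkSize: number = 1000, overlap: number = 200): string[] {
  if (chunkSize <= 0 || overlap < 0 || overlap >= chunkSize) {
    throw new Error("Require chunkSize > 0 and 0 <= overlap < chunkSize");
  }
  const step = chunkSize - overlap; // how far each window advances
  const chunks: string[] = [];
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    // Stop once a window has reached the end of the text.
    if (start + chunkSize >= text.length) break;
  }
  return chunks;
}
```

With the default settings, each chunk repeats the last 200 characters of the previous one, so a sentence that straddles a boundary still appears whole in at least one chunk.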

3. Chat with Your Documents

Start a Conversation: Type your question in the chat
Automatic Context: The system automatically finds relevant documents
Manual Search: Use the searchDocuments tool for specific queries
Source Attribution: See which documents informed the response

Example Conversations:

You: "What are the main findings in the research paper?"
AI: Based on your uploaded research paper "AI_Study_2024.pdf", the main findings include...
[Source: AI_Study_2024.pdf]

You: "Search for information about machine learning algorithms"
AI: *searches documents* Found 3 relevant sections about ML algorithms...
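
A source line like the one in the first example can be derived from the retrieved chunks. This is a minimal sketch, assuming each chunk records the file it came from; `RetrievedChunk` and `formatSources` are hypothetical, and the comma-separated format for several files is an assumption.

```typescript
// Hypothetical shape of a retrieved chunk; the real metadata fields may differ.
interface RetrievedChunk {
  fileName: string;
  score: number;
}

// Build a "[Source: ...]" line listing each contributing file once, in first-seen order.
function formatSources(chunks: RetrievedChunk[]): string {
  const seen: string[] = [];
  chunks.forEach((chunk) => {
    if (seen.indexOf(chunk.fileName) === -1) seen.push(chunk.fileName);
  });
  return seen.length === 0 ? "" : `[Source: ${seen.join(", ")}]`;
}
```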

4. Advanced Features

Document Search Tool

You: "Search my documents for 'quarterly revenue'"
AI: *uses searchDocuments tool* Found 2 documents mentioning quarterly revenue:
1. Q3_Report.pdf - Revenue increased 15%...
2. Annual_Summary.docx - Quarterly breakdown shows...

Multi-Document Conversations

You: "Compare the findings between document A and document B"
AI: *analyzes both documents* Comparing the two documents:
Document A suggests... while Document B indicates...

Configuration Guide

Free Tier Limits

Service	Free Tier Limit	Typical Fit
Pinecone	2GB storage (~300K vectors)	Perfect for <50 documents
HuggingFace	1000 requests/hour	Sufficient for personal use
Gemini	15 requests/minute	Great for chat interactions
Neon	512MB storage, 1 compute hour/month	Ideal for development and small projects

Performance Tuning

For Better Accuracy:

RAG_SIMILARITY_THRESHOLD=0.8  # Higher threshold
RAG_MAX_DOCS=7               # More context

For Faster Responses:

RAG_SIMILARITY_THRESHOLD=0.6  # Lower threshold
RAG_MAX_DOCS=3               # Less context
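
These two settings work together: the threshold drops weak matches, and RAG_MAX_DOCS caps how many of the remaining ones are passed to the model. A minimal sketch, assuming each match carries a similarity score between 0 and 1; the `Match` shape and `selectContext` are hypothetical, not the project's actual retrieval code.

```typescript
// Hypothetical shape of a vector-search match.
interface Match {
  id: string;
  score: number; // similarity, higher is more relevant
  text: string;
}

// Keep matches at or above the threshold, best first, capped at maxDocs.
function selectContext(matches: Match[], threshold: number = 0.7, maxDocs: number = 5): Match[] {
  return matches
    .filter((m) => m.score >= threshold) // filter() copies, so the input is not reordered
    .sort((a, b) => b.score - a.score)
    .slice(0, maxDocs);
}
```

A higher threshold with a larger cap favors precision when many chunks are relevant; a smaller cap shortens the prompt sent to the model.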

For Cost Optimization:

# Leave HUGGINGFACE_API_KEY unset to use the public endpoint
RAG_CHUNK_SIZE=800           # Smaller chunks

Environment Profiles

Development

NODE_ENV=development
RAG_SIMILARITY_THRESHOLD=0.6

Production

NODE_ENV=production
RAG_SIMILARITY_THRESHOLD=0.75
HUGGINGFACE_API_KEY=your-token  # For reliability

Development

Project Structure

rag/
├── app/                    # Next.js App Router
│   ├── (auth)/            # Authentication pages
│   ├── (chat)/            # Chat interface & API
│   └── api/               # API endpoints
├── components/            # React components
│   ├── custom/           # App-specific components
│   └── ui/               # Reusable UI components
├── lib/                  # Core libraries
│   ├── pinecone-rag-core.ts    # Pinecone RAG implementation
│   ├── huggingface-embeddings.ts # HF embeddings
│   ├── document-processor.ts    # File processing
│   └── utils.ts          # Utilities
├── db/                   # Database
│   ├── schema.ts         # Drizzle schema
│   ├── queries.ts        # Database queries
│   └── migrate.ts        # Migration runner
└── public/               # Static assets

Adding New Document Types

Update file validation in lib/document-processor.ts:

const ALLOWED_TYPES = [
  "text/plain",
  "text/markdown", 
  "application/pdf",
  "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
  "your-new-mime-type" // Add here
];

Add processing logic for the new file type
Update frontend validation in components/custom/document-manager.tsx
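
The validation step can be sketched as a single function that checks both the MIME type and the 10MB size limit. This is an illustration only: `validateUpload`, `UploadCandidate`, and `MAX_FILE_BYTES` are hypothetical names, and reading 10MB as 10 * 1024 * 1024 bytes is an assumption.

```typescript
const ALLOWED_TYPES: string[] = [
  "text/plain",
  "text/markdown",
  "application/pdf",
  "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
];

// Assumption: "10MB" means 10 MiB.
const MAX_FILE_BYTES = 10 * 1024 * 1024;

// Hypothetical minimal description of an uploaded file.
interface UploadCandidate {
  name: string;
  type: string; // MIME type reported by the client
  size: number; // bytes
}

type ValidationResult = { ok: true } | { ok: false; reason: string };

function validateUpload(file: UploadCandidate): ValidationResult {
  if (!ALLOWED_TYPES.some((allowed) => allowed === file.type)) {
    return { ok: false, reason: `Unsupported file type: ${file.type || "unknown"}` };
  }
  if (file.size > MAX_FILE_BYTES) {
    return { ok: false, reason: `${file.name} exceeds the 10MB limit` };
  }
  return { ok: true };
}
```

Because the browser reports the MIME type, a server-side check like this is best paired with the frontend validation rather than replaced by it.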

Custom Embedding Models

Replace the default model in lib/huggingface-embeddings.ts:

// Current: all-MiniLM-L6-v2 (384 dimensions)
this.model = 'sentence-transformers/all-MiniLM-L6-v2';

// Alternatives:
// 'sentence-transformers/all-mpnet-base-v2' (768 dim, better quality)
// 'sentence-transformers/all-distilroberta-v1' (768 dim)

Important: If changing dimensions, update Pinecone index configuration.
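
A guard like the following can catch a dimension mismatch before any vectors are written. It is a minimal sketch using the dimensions listed above; `MODEL_DIMENSIONS` and `assertIndexMatchesModel` are hypothetical names, and how you obtain the index dimension (for example from the Pinecone dashboard) is left to the caller.

```typescript
// Output dimensions of the models mentioned above.
const MODEL_DIMENSIONS: Record<string, number> = {
  "sentence-transformers/all-MiniLM-L6-v2": 384,
  "sentence-transformers/all-mpnet-base-v2": 768,
  "sentence-transformers/all-distilroberta-v1": 768,
};

// Throw if the embedding model's output size does not match the vector index.
function assertIndexMatchesModel(model: string, indexDimension: number): void {
  const expected = MODEL_DIMENSIONS[model];
  if (expected === undefined) {
    throw new Error(`Unknown embedding model: ${model}`);
  }
  if (expected !== indexDimension) {
    throw new Error(
      `Model ${model} produces ${expected}-dimensional vectors, but the index expects ${indexDimension}`
    );
  }
}
```

An existing index generally cannot simply take vectors of a different size, so switching from a 384-dimension model to a 768-dimension one usually means creating a new index and re-uploading documents.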

Running Tests

# Type checking
npm run type-check

# Linting
npm run lint

# Database operations
npm run db:generate  # Generate migrations
npm run db:push      # Push schema changes

# RAG system testing
npm run rag:test     # Test embeddings + Pinecone

Production Deployment

Vercel Deployment (Recommended)

Connect Repository: Import your GitHub repo to Vercel
Database Integration: Neon database is already connected via marketplace and environment variables are automatically injected
Additional Environment Variables: Add the remaining variables (API keys, secrets) to Vercel project settings
Deploy: Automatic deployment on push with seamless Neon database connectivity

# Optional: Deploy via CLI
npm i -g vercel
vercel --prod

Docker Deployment

FROM node:18-alpine

WORKDIR /app
COPY package*.json ./
# Install all dependencies: the build step below typically needs devDependencies (TypeScript, Tailwind CSS)
RUN npm ci

COPY . .
RUN npm run build

EXPOSE 3000
CMD ["npm", "start"]

Environment Variables Checklist

Production deployment requires:

Required POSTGRES_URL - Automatically injected by Neon via Vercel marketplace
Required AUTH_SECRET - Session encryption (add manually)
Required GOOGLE_GENERATIVE_AI_API_KEY - Gemini API (add manually)
Required PINECONE_API_KEY - Vector database (add manually)
Required PINECONE_INDEX_NAME - Index name (add manually)
Optional HUGGINGFACE_API_KEY - Higher embedding rate limits, recommended (add manually)
Optional NODE_ENV=production - Production mode (automatically set)
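
The checklist can be enforced at startup so a missing variable fails fast instead of surfacing later as a runtime error. A minimal sketch, assuming the five required names above; `findMissingEnvVars` is a hypothetical helper, not part of the project.

```typescript
// Required variables from the checklist above.
const REQUIRED_ENV_VARS: string[] = [
  "POSTGRES_URL",
  "AUTH_SECRET",
  "GOOGLE_GENERATIVE_AI_API_KEY",
  "PINECONE_API_KEY",
  "PINECONE_INDEX_NAME",
];

// Return the names of required variables that are unset or blank.
function findMissingEnvVars(env: Record<string, string | undefined>): string[] {
  return REQUIRED_ENV_VARS.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}
```

In a Node.js process this could be called as `findMissingEnvVars(process.env)`, throwing an error that lists the missing names when the result is not empty.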

Troubleshooting

Common Issues

"Pinecone connection failed"

# Confirm the key is set in .env.local (prints the number of matching lines, not the secret)
grep -c '^PINECONE_API_KEY=' .env.local

# Test connection
npm run rag:test

"HuggingFace rate limit exceeded"

# Add API key for higher limits
HUGGINGFACE_API_KEY=your-token

# Or reduce batch size in code
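
Reducing the batch size means sending fewer chunks per embedding request. A minimal sketch of the splitting step; `toBatches` is a hypothetical helper, and where the project actually batches its requests is not shown in this README.

```typescript
// Split a list into consecutive groups of at most `batchSize` items.
function toBatches<T>(items: T[], batchSize: number): T[][] {
  if (!(batchSize >= 1) || Math.floor(batchSize) !== batchSize) {
    throw new Error("batchSize must be a positive integer");
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += batchSize) {
    batches.push(items.slice(i, i + batchSize));
  }
  return batches;
}
```

Each batch can then be embedded in turn, optionally with a short pause between requests to stay under the hourly limit.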

"Database connection error"

# Verify Neon database connection via Vercel
vercel env pull .env.local

# Check that POSTGRES_URL is set in .env.local (prints the number of matching lines, not the connection string)
grep -c '^POSTGRES_URL=' .env.local

# Test database migration
npm run db:push

# If still failing, check Vercel dashboard Storage tab for Neon database status

"Document upload fails"

Check file size (10MB limit)
Verify file type is supported
Check server logs for processing errors

"No search results"

Verify documents are uploaded and indexed
Lower similarity threshold: RAG_SIMILARITY_THRESHOLD=0.6
Check document content quality

Performance Optimization

Slow Document Upload

Reduce chunk size: RAG_CHUNK_SIZE=800
Check HuggingFace rate limits
Use HuggingFace API key

Slow Chat Responses

Lower similarity threshold: RAG_SIMILARITY_THRESHOLD=0.6
Reduce max documents: RAG_MAX_DOCS=3
Check Pinecone region (use us-east-1)

Monitoring

System Health

# Check all services
npm run rag:test

# Verify Neon database connection
vercel env pull .env.local
npm run db:push --dry-run

# Check Vercel deployment logs
vercel logs

# Check local development logs
tail -f .next/server.log

Cost Monitoring

Pinecone: Monitor vector count in dashboard
HuggingFace: Check request usage
Gemini: Monitor API usage in Google Cloud Console
Neon: Monitor compute hours and storage in Vercel Storage dashboard

Usage Analytics

Free Tier Capacity

With default settings, you can handle:

Documents: ~50-100 typical PDFs (depends on length)
Users: Unlimited (namespace isolation)
Queries: ~900/hour for chat (Gemini's 15 requests/minute is the tighter limit; HuggingFace allows ~1000 requests/hour)
Storage: 2GB vectors + 512MB database (Neon free tier)

Scaling Up

When you exceed free tiers:

Pinecone: $70/month for 20GB storage
HuggingFace: $9/month for Inference Endpoints
Neon: $19/month for Launch plan with 10GB storage and more compute hours

Contributing

Fork the repository
Create a feature branch: git checkout -b feature-name
Make changes and test: npm run rag:test
Submit a pull request

Development Setup

# Clone your fork
git clone https://github.com/yourusername/rag.git
cd rag

# Install dependencies
npm install

# Set up environment
cp .env.example .env.local
# Edit .env.local with your API keys

# Run tests
npm run type-check
npm run lint
npm run rag:test

# Start development
npm run dev

License

This project is licensed under the MIT License. See the LICENSE file for details.

Acknowledgments

Built with:

Next.js - React framework
Pinecone - Vector database
HuggingFace - Embeddings
Google Gemini - Language model
Drizzle ORM - Database toolkit
Tailwind CSS - Styling

Happy building! If you have questions or need help, please open an issue or start a discussion.
