A Retrieval-Augmented Generation (RAG) chatbot built with Next.js, Pinecone, and HuggingFace. This system provides intelligent document-based conversations with zero monthly costs using free tier services.
- Document Intelligence - Upload and chat with PDF, DOCX, MD, and TXT files
- Multi-User Support - Secure namespace-based user isolation
- Real-time Chat - Streaming responses with document context
- Smart Search - Semantic document retrieval with relevance scoring
- Production Ready - Authentication, error handling, and monitoring
- Developer Friendly - TypeScript, comprehensive error handling
```mermaid
graph TD
    A[User] --> B[Next.js Frontend]
    B --> C[API Routes]
    C --> D[Authentication]
    D --> E[Document Processing]
    E --> F[HuggingFace Embeddings]
    F --> G[Pinecone Vector Store]
    G --> H[RAG Context]
    H --> I[Gemini AI]
    I --> J[Streaming Response]
```
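In code, this flow boils down to: embed the question, pull matching chunks from the user's Pinecone namespace, and stream a Gemini answer grounded in them. A minimal sketch, assuming `lib/huggingface-embeddings.ts` exposes an `embedText()` helper and the Vercel AI SDK handles streaming (names are illustrative, not the exact code in this repo):

```typescript
import { Pinecone } from "@pinecone-database/pinecone";
import { google } from "@ai-sdk/google";
import { streamText } from "ai";
import { embedText } from "@/lib/huggingface-embeddings"; // assumed helper

const pc = new Pinecone({ apiKey: process.env.PINECONE_API_KEY! });

export async function answerWithContext(userId: string, question: string) {
  // 1. Embed the question (384-dim vector from all-MiniLM-L6-v2)
  const vector = await embedText(question);

  // 2. Retrieve the most relevant chunks from this user's namespace
  const { matches } = await pc
    .index(process.env.PINECONE_INDEX_NAME!)
    .namespace(userId) // namespace-based user isolation
    .query({ vector, topK: 5, includeMetadata: true });

  const context = matches
    .map((m) => String(m.metadata?.text ?? ""))
    .join("\n---\n");

  // 3. Stream a Gemini response grounded in the retrieved context
  return streamText({
    model: google("gemini-1.5-flash"),
    system: `Answer using only this context:\n${context}`,
    prompt: question,
  });
}
```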
- Node.js 18+ and npm
- Pinecone Account (free tier)
- HuggingFace Account (optional, for higher rate limits)
- Google AI API Key (for Gemini)
- PostgreSQL Database (local or cloud)
```bash
# Clone the repository
git clone <your-repository-url>
cd rag

# Install dependencies
npm install
```
- Create Account: Go to pinecone.io and sign up
- Create Project: Create a new project in the dashboard
- Get API Key:
  - Go to "API Keys" in the left sidebar
  - Copy your API key
- Note: The system will auto-create your index on first use
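With the Pinecone Node SDK, that auto-creation looks roughly like the sketch below (illustrative, not necessarily the exact code in `lib/pinecone-rag-core.ts`; the dimension and region match the defaults used elsewhere in this README):

```typescript
import { Pinecone } from "@pinecone-database/pinecone";

const pc = new Pinecone({ apiKey: process.env.PINECONE_API_KEY! });

// Create the index on first use if it doesn't exist yet.
export async function ensureIndex(name = process.env.PINECONE_INDEX_NAME!) {
  const { indexes } = await pc.listIndexes();
  if (!indexes?.some((index) => index.name === name)) {
    await pc.createIndex({
      name,
      dimension: 384, // matches all-MiniLM-L6-v2 embeddings
      metric: "cosine",
      spec: { serverless: { cloud: "aws", region: "us-east-1" } },
    });
  }
  return pc.index(name);
}
```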
- Create Account: Go to huggingface.co and sign up
- Generate Token:
  - Go to Settings → Access Tokens
  - Create a new token with "Read" permissions
  - Copy the token
- Get API Key: Go to Google AI Studio
- Create Key: Generate a new API key for Gemini
Set up your serverless PostgreSQL database through Vercel's integration with Neon:
- Login to Vercel: Go to vercel.com and sign in
- Navigate to Storage: In your project dashboard, click the Storage tab
- Create Database:
  - Click Create Database or Browse Storage
  - Select Neon from the "Marketplace Database Providers" section
  - Choose a database name (e.g., `rag-chatbot-db`)
  - Select your preferred region (closest to your users)
  - Click Continue and authorize the integration
- Complete Setup: After authorizing Neon, complete the database creation process
- Connect Project: Click Connect Project and select your RAG chatbot project
- Auto-Injection: Vercel automatically injects `POSTGRES_URL` and other database environment variables
- Verify Connection: Check your project's environment variables to confirm `POSTGRES_URL` is present
Note: You now have a Neon serverless PostgreSQL database connected through Vercel's marketplace. No separate Neon account setup required.
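Once `POSTGRES_URL` is injected, the app can reach Neon over its serverless driver. A minimal sketch of the connection setup with Drizzle (the real wiring lives in `db/`; this assumes the `@neondatabase/serverless` package):

```typescript
import { neon } from "@neondatabase/serverless";
import { drizzle } from "drizzle-orm/neon-http";

// POSTGRES_URL is injected by the Vercel + Neon integration.
const sql = neon(process.env.POSTGRES_URL!);
export const db = drizzle(sql);
```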
```bash
# Install Vercel CLI (if not already installed)
npm install -g vercel

# Login to Vercel
vercel login

# Link your local project (if not already linked)
vercel link

# Pull environment variables to local development
vercel env pull .env.local
```
This will automatically create/update your `.env.local` file with the Vercel Postgres connection string.

After pulling the Vercel environment variables, your `.env.local` should contain the Neon database connection. Add the remaining required variables:
```bash
# Copy the example file to get started
cp .env.example .env.local

# Then edit .env.local with your actual API keys
```
Your `.env.local` should contain (see `.env.example` for a template):
```bash
# Database (Automatically added by Neon via Vercel)
POSTGRES_URL="neon-connection-string-from-vercel-env-pull"

# Authentication (Required - Generate with: openssl rand -base64 32)
AUTH_SECRET="your-random-secret-key-min-32-chars"

# AI Models (Required)
GOOGLE_GENERATIVE_AI_API_KEY="your-google-ai-api-key"

# Pinecone (Required)
PINECONE_API_KEY="your-pinecone-api-key"
PINECONE_INDEX_NAME="rag-documents"

# HuggingFace (Optional but Recommended - for higher rate limits)
HUGGINGFACE_API_KEY="your-huggingface-token"

# RAG Configuration (Optional - uses smart defaults)
RAG_CHUNK_SIZE=1000
RAG_CHUNK_OVERLAP=200
RAG_MAX_DOCS=5
RAG_SIMILARITY_THRESHOLD=0.7
RAG_QUERY_EXPANSION=false
RAG_RERANKING=false
```
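The optional RAG settings might be read with fallbacks along these lines (a sketch; the env keys are real, the helper is illustrative):

```typescript
// Read optional RAG tuning knobs, falling back to the documented defaults.
const num = (key: string, fallback: number): number =>
  process.env[key] ? Number(process.env[key]) : fallback;

export const ragConfig = {
  chunkSize: num("RAG_CHUNK_SIZE", 1000), // characters per chunk
  chunkOverlap: num("RAG_CHUNK_OVERLAP", 200), // characters shared between adjacent chunks
  maxDocs: num("RAG_MAX_DOCS", 5), // top-k chunks retrieved per query
  similarityThreshold: num("RAG_SIMILARITY_THRESHOLD", 0.7), // minimum cosine score
  queryExpansion: process.env.RAG_QUERY_EXPANSION === "true",
  reranking: process.env.RAG_RERANKING === "true",
};
```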
```bash
# Run database migrations to set up tables in Neon database
npm run db:migrate

# Optional: Push schema changes directly to Neon
npm run db:push
```
Note: The database operations run against your Neon PostgreSQL database using the `POSTGRES_URL` environment variable.
```bash
# Test Pinecone + HuggingFace integration
npm run rag:test
```
Expected output:
```json
{
  "success": true,
  "embeddings_test": { "success": true, "dimensions": 384 },
  "pinecone_test": { "pinecone_connected": true, "index_name": "rag-documents" }
}
```
```bash
# Development mode
npm run dev

# Production build
npm run build
npm start
```
Open http://localhost:3000 in your browser.
- Navigate to the application
- Click "Register" to create an account
- Verify your email (if configured)
- Login with your credentials
- Click the Document Manager (folder icon in the navbar)
- Upload Files:
  - Drag & drop files or click "Choose Files"
  - Supported: PDF, DOCX, MD, TXT (max 10MB each)
  - Files are automatically processed and indexed
- View Progress: See upload status and chunk counts
- Manage Documents: View, delete, or re-upload files
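"Automatically processed and indexed" means, roughly: extract the text, split it into overlapping chunks, embed each chunk, and upsert the vectors into your namespace. A simplified sketch (assuming the `ensureIndex()` helper sketched earlier, plus hypothetical `chunkText()`/`embedText()` helpers standing in for `lib/document-processor.ts` and `lib/huggingface-embeddings.ts`):

```typescript
// Simplified indexing path: chunk -> embed -> upsert with metadata.
export async function indexDocument(
  userId: string,
  fileName: string,
  text: string,
) {
  const chunks = chunkText(text, 1000, 200); // RAG_CHUNK_SIZE / RAG_CHUNK_OVERLAP
  const index = (await ensureIndex()).namespace(userId);

  for (const [i, chunk] of chunks.entries()) {
    const values = await embedText(chunk);
    await index.upsert([
      { id: `${fileName}#${i}`, values, metadata: { text: chunk, source: fileName } },
    ]);
  }
}
```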
- Start a Conversation: Type your question in the chat
- Automatic Context: The system automatically finds relevant documents
- Manual Search: Use the `searchDocuments` tool for specific queries (see the sketch below)
- Source Attribution: See which documents informed the response
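With the Vercel AI SDK, a tool like `searchDocuments` is typically declared with `tool()` and a Zod schema. A hedged sketch of what the definition might look like (the actual implementation lives in the chat API route; `embedText()` and the user lookup are assumptions):

```typescript
import { tool } from "ai";
import { z } from "zod";

// A tool the model can invoke to run an explicit semantic search.
export const searchDocuments = (userId: string) =>
  tool({
    description: "Search the user's uploaded documents for relevant passages",
    parameters: z.object({
      query: z.string().describe("What to search for"),
    }),
    execute: async ({ query }) => {
      const vector = await embedText(query); // assumed embedding helper
      const { matches } = await (await ensureIndex())
        .namespace(userId) // userId resolved from the session (assumption)
        .query({ vector, topK: 5, includeMetadata: true });
      return matches.map((m) => m.metadata); // chunk text + source file names
    },
  });
```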
You: "What are the main findings in the research paper?"
AI: Based on your uploaded research paper "AI_Study_2024.pdf", the main findings include...
[Source: AI_Study_2024.pdf]
You: "Search for information about machine learning algorithms"
AI: *searches documents* Found 3 relevant sections about ML algorithms...
You: "Search my documents for 'quarterly revenue'"
AI: *uses searchDocuments tool* Found 2 documents mentioning quarterly revenue:
1. Q3_Report.pdf - Revenue increased 15%...
2. Annual_Summary.docx - Quarterly breakdown shows...
You: "Compare the findings between document A and document B"
AI: *analyzes both documents* Comparing the two documents:
Document A suggests... while Document B indicates...
| Service | Free Tier Limit | Your Usage |
|---------|-----------------|------------|
| Pinecone | 2GB storage (~300K vectors) | Perfect for <50 documents |
| HuggingFace | 1000 requests/hour | Sufficient for personal use |
| Gemini | 15 requests/minute | Great for chat interactions |
| Neon | 512MB storage, 1 compute hour/month | Ideal for development and small projects |
```bash
# Better accuracy
RAG_SIMILARITY_THRESHOLD=0.8   # Higher threshold
RAG_MAX_DOCS=7                 # More context

# Faster responses
RAG_SIMILARITY_THRESHOLD=0.6   # Lower threshold
RAG_MAX_DOCS=3                 # Less context

# Free-tier friendly
# Don't set HUGGINGFACE_API_KEY to use the public endpoint
RAG_CHUNK_SIZE=800             # Smaller chunks

# Development
NODE_ENV=development
RAG_SIMILARITY_THRESHOLD=0.6

# Production
NODE_ENV=production
RAG_SIMILARITY_THRESHOLD=0.75
HUGGINGFACE_API_KEY=your-token # For reliability
```
```
rag/
├── app/                          # Next.js App Router
│   ├── (auth)/                   # Authentication pages
│   ├── (chat)/                   # Chat interface & API
│   └── api/                      # API endpoints
├── components/                   # React components
│   ├── custom/                   # App-specific components
│   └── ui/                       # Reusable UI components
├── lib/                          # Core libraries
│   ├── pinecone-rag-core.ts      # Pinecone RAG implementation
│   ├── huggingface-embeddings.ts # HF embeddings
│   ├── document-processor.ts     # File processing
│   └── utils.ts                  # Utilities
├── db/                           # Database
│   ├── schema.ts                 # Drizzle schema
│   ├── queries.ts                # Database queries
│   └── migrate.ts                # Migration runner
└── public/                       # Static assets
```
- Update file validation in `lib/document-processor.ts`:
```typescript
const ALLOWED_TYPES = [
  "text/plain",
  "text/markdown",
  "application/pdf",
  "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
  "your-new-mime-type", // Add here
];
```
- Add processing logic for the new file type (see the sketch below)
- Update frontend validation in `components/custom/document-manager.tsx`
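The processing step usually amounts to a new branch that converts the format to plain text before chunking. A sketch of what that dispatch could look like in `lib/document-processor.ts`, with CSV as a purely illustrative example (PDF/DOCX branches omitted):

```typescript
// Hypothetical dispatch: route each MIME type to a text extractor.
async function extractText(file: Buffer, mimeType: string): Promise<string> {
  switch (mimeType) {
    case "text/plain":
    case "text/markdown":
      return file.toString("utf-8");
    case "text/csv": // the newly added type
      return file
        .toString("utf-8")
        .split("\n")
        .map((row) => row.split(",").join(" | "))
        .join("\n");
    default:
      throw new Error(`Unsupported file type: ${mimeType}`);
  }
}
```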
Replace the default model in `lib/huggingface-embeddings.ts`:
```typescript
// Current: all-MiniLM-L6-v2 (384 dimensions)
this.model = 'sentence-transformers/all-MiniLM-L6-v2';

// Alternatives:
// 'sentence-transformers/all-mpnet-base-v2' (768 dim, better quality)
// 'sentence-transformers/all-distilroberta-v1' (768 dim)
```
Important: If changing dimensions, update Pinecone index configuration.
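A Pinecone index's dimension is fixed at creation, so moving to a 768-dimension model means recreating the index and re-indexing every document. A sketch using the `pc` client from the earlier snippet:

```typescript
// WARNING: deleting the index removes all stored vectors;
// documents must be re-uploaded and re-indexed afterwards.
await pc.deleteIndex(process.env.PINECONE_INDEX_NAME!);
await pc.createIndex({
  name: process.env.PINECONE_INDEX_NAME!,
  dimension: 768, // e.g., all-mpnet-base-v2
  metric: "cosine",
  spec: { serverless: { cloud: "aws", region: "us-east-1" } },
});
```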
```bash
# Type checking
npm run type-check

# Linting
npm run lint

# Database operations
npm run db:generate   # Generate migrations
npm run db:push       # Push schema changes

# RAG system testing
npm run rag:test      # Test embeddings + Pinecone
```
- Connect Repository: Import your GitHub repo to Vercel
- Database Integration: The Neon database is already connected via the marketplace, and its environment variables are injected automatically
- Additional Environment Variables: Add the remaining variables (API keys, secrets) to Vercel project settings
- Deploy: Automatic deployment on push with seamless Neon database connectivity
```bash
# Optional: Deploy via CLI
npm i -g vercel
vercel --prod
```
```dockerfile
FROM node:18-alpine
WORKDIR /app
COPY package*.json ./
# Install all dependencies (dev deps are needed for the build step)
RUN npm ci
COPY . .
RUN npm run build
# Drop dev dependencies before running in production
RUN npm prune --omit=dev
EXPOSE 3000
CMD ["npm", "start"]
```
Production deployment requires:
- `POSTGRES_URL` (required) - Automatically injected by Neon via the Vercel marketplace
- `AUTH_SECRET` (required) - Session encryption (add manually)
- `GOOGLE_GENERATIVE_AI_API_KEY` (required) - Gemini API (add manually)
- `PINECONE_API_KEY` (required) - Vector database (add manually)
- `PINECONE_INDEX_NAME` (required) - Index name (add manually)
- `HUGGINGFACE_API_KEY` (optional) - Recommended for higher rate limits (add manually)
- `NODE_ENV=production` (optional) - Production mode (automatically set)
```bash
# Check API key format
echo $PINECONE_API_KEY

# Test connection
npm run rag:test
```

```bash
# Add API key for higher limits
HUGGINGFACE_API_KEY=your-token
# Or reduce batch size in code
```

```bash
# Verify Neon database connection via Vercel
vercel env pull .env.local

# Check if POSTGRES_URL is properly set
echo $POSTGRES_URL

# Test database migration
npm run db:push

# If still failing, check the Vercel dashboard Storage tab for Neon database status
```
- Check file size (10MB limit)
- Verify file type is supported
- Check server logs for processing errors
- Verify documents are uploaded and indexed
- Lower the similarity threshold: `RAG_SIMILARITY_THRESHOLD=0.6`
- Check document content quality
- Reduce chunk size: `RAG_CHUNK_SIZE=800`
- Check HuggingFace rate limits
- Use HuggingFace API key
- Lower the similarity threshold: `RAG_SIMILARITY_THRESHOLD=0.6`
- Reduce max documents: `RAG_MAX_DOCS=3`
- Check your Pinecone region (use `us-east-1`)
```bash
# Check all services
npm run rag:test

# Verify Neon database connection
vercel env pull .env.local
npm run db:push -- --dry-run

# Check Vercel deployment logs
vercel logs

# Check local development logs
tail -f .next/server.log
```
- Pinecone: Monitor vector count in dashboard
- HuggingFace: Check request usage
- Gemini: Monitor API usage in Google Cloud Console
- Neon: Monitor compute hours and storage in Vercel Storage dashboard
With default settings, you can handle:
- Documents: ~50-100 typical PDFs (depends on length)
- Users: Unlimited (namespace isolation)
- Queries: ~1000/hour (HuggingFace limit)
- Storage: 2GB of vectors (Pinecone) plus the Neon database (512MB on the free tier)
When you exceed free tiers:
- Pinecone: $70/month for 20GB storage
- HuggingFace: $9/month for Inference Endpoints
- Neon: $19/month for Launch plan with 10GB storage and more compute hours
- Fork the repository
- Create a feature branch: `git checkout -b feature-name`
- Make changes and test: `npm run rag:test`
- Submit a pull request
```bash
# Clone your fork
git clone https://github.com/yourusername/rag.git
cd rag

# Install dependencies
npm install

# Set up environment
cp .env.example .env.local
# Edit .env.local with your API keys

# Run tests
npm run type-check
npm run lint
npm run rag:test

# Start development
npm run dev
```
This project is licensed under the MIT License. See the LICENSE file for details.
Built with:
- Next.js - React framework
- Pinecone - Vector database
- HuggingFace - Embeddings
- Google Gemini - Language model
- Drizzle ORM - Database toolkit
- Tailwind CSS - Styling
Happy building! If you have questions or need help, please open an issue or start a discussion.