🔮 Prism - Multimodal RAG System

Prism Logo

Your intelligent multimodal search companion



🌟 Overview

Prism is a multimodal RAG (Retrieval-Augmented Generation) system that enables intelligent search and question-answering across documents (PDF, DOCX), images, and audio files. It takes a local-first approach: all processing happens on-device with no cloud dependencies, so your data stays private and under your control.

✨ Key Features

  • 🔍 Multimodal Search - Search across documents, images, and audio with natural language
  • 🧠 Document Q&A - Ask questions about uploaded documents using Mistral 7B LLM
  • 🏠 Local-First - All processing happens on your device, no cloud dependencies
  • 📄 Document Processing - Support for PDF and DOCX files with intelligent chunking
  • 🎨 Modern UI - Beautiful React interface with Tailwind CSS and Framer Motion
  • 🚀 Fast API - High-performance FastAPI backend with async processing
  • 🔒 Privacy-Focused - Your data never leaves your device

🎯 Current Status

Implemented Features:

  • ✅ Document upload and processing (PDF/DOCX)
  • ✅ Intelligent text chunking with token awareness
  • ✅ Document Q&A using Mistral 7B via llama.cpp
  • ✅ Modern React frontend with multiple interfaces
  • ✅ FastAPI backend with comprehensive endpoints
  • ✅ Local LLM integration for offline operation

In Development:

  • 🔨 Image OCR and processing
  • 🔨 Audio transcription
  • 🔨 Vector embeddings for semantic search
  • 🔨 Full multimodal search interface

🚀 Quick Start

Prerequisites

  • Python 3.8+ with pip
  • Node.js 16+ with npm
  • 8GB+ RAM (for running Mistral 7B)
  • 4GB+ disk space (for model storage)

1. Clone the Repository

git clone <repository-url>
cd prism

2. Set Up the Backend

cd backend

# Create and activate virtual environment
python -m venv .venv
.venv\Scripts\activate  # Windows
# source .venv/bin/activate  # macOS/Linux

# Install dependencies
pip install -r requirements.txt

3. Download the LLM Model

# Create models directory
mkdir -p models/llm

# Download Mistral 7B (4.1GB)
curl -L -o models/llm/mistral-7b-instruct-v0.2.Q4_K_M.gguf https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.2-GGUF/resolve/main/mistral-7b-instruct-v0.2.Q4_K_M.gguf

4. Set Up the Frontend

cd frontend
npm install

5. Start the Application

Terminal 1 (Backend):

cd backend
.venv\Scripts\activate  # Windows (macOS/Linux: source .venv/bin/activate)
python -m uvicorn app.main:app --reload --host 0.0.0.0 --port 8000

Terminal 2 (Frontend):

cd frontend
npm run dev

Open your browser and go to http://localhost:3000 🎉

📋 For detailed setup instructions, see SETUP_QA.md


🏗️ Architecture

┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   React Frontend │    │  FastAPI Backend │    │ Mistral 7B LLM  │
│   (Port 3000)   │◄──►│   (Port 8000)   │◄──►│  (llama.cpp)    │
└─────────────────┘    └─────────────────┘    └─────────────────┘
         │                       │                       │
         ▼                       ▼                       ▼
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│  File Upload    │    │ Document        │    │ Question        │
│  Interface      │    │ Processing      │    │ Answering       │
└─────────────────┘    └─────────────────┘    └─────────────────┘
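
The sketch below condenses this flow into a single FastAPI endpoint. It is a minimal, hypothetical reduction: the helper functions stand in for qa_service and llm_service, and the real code in backend/app is organized differently.

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class QuestionRequest(BaseModel):
    file_id: str
    question: str

def retrieve_chunks(file_id):
    # Stand-in for qa_service: load the stored chunks for this document.
    return ["...chunk text..."]

def generate_answer(prompt):
    # Stand-in for llm_service: run the prompt through the local LLM.
    return "...model output..."

@app.post("/api/question")
async def ask_question(req: QuestionRequest):
    # Retrieve context, build a prompt, and answer locally.
    context = "\n\n".join(retrieve_chunks(req.file_id))
    prompt = f"Context:\n{context}\n\nQuestion: {req.question}\nAnswer:"
    return {"answer": generate_answer(prompt)}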

🔧 Tech Stack

Frontend:

  • React 18 with Vite
  • Tailwind CSS for styling
  • Framer Motion for animations
  • Lucide React for icons
  • Axios for API calls

Backend:

  • FastAPI for high-performance APIs
  • llama.cpp for LLM inference
  • PyPDF2 & python-docx for document parsing
  • tiktoken for intelligent text chunking

AI/ML:

  • Mistral 7B Instruct v0.2 (GGUF format)
  • Token-aware document chunking
  • Local inference with no cloud dependencies

📁 Project Structure

prism/
├── 🎨 frontend/                 # React application
│   ├── src/
│   │   ├── components/          # UI components
│   │   │   ├── DocumentQA.jsx   # Q&A interface
│   │   │   ├── FileUpload.jsx   # File upload UI
│   │   │   ├── SearchInterface.jsx
│   │   │   └── ...
│   │   ├── App.jsx              # Main app component
│   │   └── index.css            # Tailwind styles
│   └── package.json
│
├── 🔧 backend/                  # FastAPI application
│   ├── app/
│   │   ├── main.py              # API endpoints
│   │   └── services/
│   │       ├── llm_service.py   # Mistral LLM integration
│   │       └── qa_service.py    # Document Q&A logic
│   ├── ingestion/
│   │   ├── parse_pdf.py         # PDF/DOCX processing
│   │   └── chunker.py           # Intelligent text chunking
│   └── requirements.txt
│
├── 🧠 models/
│   └── llm/                     # LLM model storage
│       └── mistral-7b-instruct-v0.2.Q4_K_M.gguf
│
├── 📊 data/                     # Application data
│   ├── uploads/                 # Uploaded files
│   ├── processed/               # Processed documents
│   └── indices/                 # Search indices
│
└── 📚 docs/                     # Documentation
    ├── architecture.md
    └── runbook.md

🎯 Usage

Document Q&A

  1. Upload Documents - Upload PDF or DOCX files via the Q&A interface
  2. Processing - Documents are automatically chunked and indexed
  3. Ask Questions - Type natural language questions about your documents
  4. Get Answers - Receive contextual answers with source citations

Example Questions:

  • "What are the main conclusions in this report?"
  • "Summarize the budget section"
  • "What recommendations are mentioned?"
  • "Find information about project timelines"

Search Interface

  • Text Search - Search across all processed documents
  • Multimodal Support - Future support for image and audio search
  • Smart Results - Relevance scoring and source citations

🔧 Configuration

Model Settings

Edit backend/app/services/llm_service.py:

# Context window size (adjust for memory usage)
self.n_ctx = 4096

# Thread count (adjust for CPU cores)
self.n_threads = 4

# Temperature (0.0 = focused, 1.0 = creative)
temperature = 0.7
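
For reference, here is how these settings typically map onto llama-cpp-python's Llama class; this is a minimal sketch of the library's completion API, not the repository's actual wiring in llm_service.py:

from llama_cpp import Llama

llm = Llama(
    model_path="models/llm/mistral-7b-instruct-v0.2.Q4_K_M.gguf",
    n_ctx=4096,      # context window size (larger uses more memory)
    n_threads=4,     # match your physical CPU core count
)

out = llm(
    "[INST] Summarize the budget section. [/INST]",  # Mistral instruct format
    max_tokens=256,
    temperature=0.7,  # 0.0 = focused, 1.0 = creative
)
print(out["choices"][0]["text"])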

Chunking Settings

Edit backend/ingestion/chunker.py:

# Chunk size in tokens
chunk_size = 1000

# Overlap between chunks
chunk_overlap = 200
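
To show how chunk_size and chunk_overlap interact, here is a minimal sketch of token-aware chunking with tiktoken; the encoding name is an assumption, and the repository's chunker.py may differ:

import tiktoken

def chunk_text(text, chunk_size=1000, chunk_overlap=200):
    # Split on token boundaries rather than characters so each chunk
    # fits a known token budget; consecutive chunks share an overlap
    # so answers spanning a chunk boundary are not cut in half.
    enc = tiktoken.get_encoding("cl100k_base")  # encoding choice assumed
    tokens = enc.encode(text)
    step = chunk_size - chunk_overlap
    return [
        enc.decode(tokens[start : start + chunk_size])
        for start in range(0, len(tokens), step)
    ]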

🛠️ Development

Frontend Development

cd frontend
npm run dev     # Start development server
npm run build   # Build for production
npm run lint    # Run ESLint

Backend Development

cd backend
.venv\Scripts\activate  # Windows (macOS/Linux: source .venv/bin/activate)
python -m uvicorn app.main:app --reload  # Start with auto-reload

Testing

# Run backend tests
cd backend
python -m pytest tests/

# Run frontend tests
cd frontend
npm test

📊 API Documentation

The FastAPI backend serves auto-generated, interactive API documentation:

  • Interactive Docs: http://localhost:8000/docs
  • ReDoc: http://localhost:8000/redoc

Key Endpoints

Method   Endpoint                     Description
GET      /                            Health check and model status
POST     /api/upload                  Upload and process documents
POST     /api/question                Ask questions about documents
GET      /api/documents               List processed documents
DELETE   /api/documents/{file_id}     Delete a document
GET      /api/model/status            Check LLM model status
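
A quick liveness check against these endpoints (a sketch; the response shapes depend on the running backend):

import requests

BASE = "http://localhost:8000"

print(requests.get(f"{BASE}/").json())                  # health + model status
print(requests.get(f"{BASE}/api/model/status").json())  # LLM model status
print(requests.get(f"{BASE}/api/documents").json())     # processed documents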

🔒 Privacy & Security

  • Local Processing - All data processing happens on your device
  • No Cloud Dependencies - Documents never leave your machine
  • Offline Operation - Works completely offline after setup
  • Data Control - You own and control all your data

🤝 Contributing

We welcome contributions! Please read our contributing guidelines:

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

📝 License

This project is licensed under the MIT License - see the LICENSE file for details.


🆘 Support

  • Documentation: Check SETUP_QA.md for detailed setup
  • Issues: Report bugs via GitHub Issues
  • Discussions: Join discussions for questions and ideas

Built with ❤️ for privacy-focused AI

⭐ Star this repo · 🐛 Report Bug · 💡 Request Feature
