TejOCR v0.1.5 - LibreOffice OCR Extension

🎉 Phase 2 Complete: Professional UI/UX with Real Configurable Dialogs!

TejOCR is a powerful LibreOffice extension that adds Optical Character Recognition (OCR) capabilities to your documents. Extract text from images directly within LibreOffice Writer.

✅ What's New in v0.1.5

🎨 COMPLETE UI/UX OVERHAUL:

✅ Real Settings Dialog: Configurable XDL-based settings with dependency checking
✅ Professional OCR Options Dialog: Language selection, output modes, advanced options
✅ Smart Workflow Integration: Seamless dialog flow for both OCR methods
✅ Enhanced User Experience: Grouped controls, helpful hints, and error guidance

🔧 MAJOR IMPROVEMENTS:

Dependency Status Dashboard: Live status checking with installation guidance
Tesseract Path Configuration: Browse, test, and validate Tesseract installation
Advanced OCR Options: Page segmentation modes, engine modes, preprocessing
Multiple Output Modes: Cursor, text box, replace image, clipboard
Smart Defaults: Remembers your preferences between sessions

🎯 Current Status

Phase 1 (Core Stability): ✅ COMPLETE

Core OCR functionality fully working
Multi-strategy error handling
Robust dependency detection

Phase 2 (Professional UI/UX): ✅ COMPLETE

Real XDL-based dialogs
Configurable settings system
Professional user experience
Advanced OCR options

Phase 3 (Advanced Features): 🚧 Next Priority

Batch processing capabilities
Enhanced output formatting
Performance optimizations

🚀 Quick Start

Prerequisites

Tesseract OCR (Required):

# macOS
brew install tesseract

# Ubuntu/Debian
sudo apt install tesseract-ocr

# Windows
# Download from: https://github.com/UB-Mannheim/tesseract/wiki
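
After installing, it can help to confirm that Tesseract is actually reachable before opening LibreOffice. The following is a minimal sketch using only the Python standard library; the helper names `find_tesseract` and `tesseract_version` are illustrative and are not part of TejOCR.

```python
import shutil
import subprocess
from typing import Optional


def find_tesseract(executable: str = "tesseract") -> Optional[str]:
    """Return the full path of the Tesseract binary found on PATH, or None."""
    return shutil.which(executable)


def tesseract_version(path: str) -> Optional[str]:
    """Return the first line printed by `<path> --version`, or None on failure."""
    try:
        result = subprocess.run(
            [path, "--version"],
            capture_output=True,
            text=True,
            timeout=10,
            check=False,
        )
    except (OSError, subprocess.TimeoutExpired):
        # Binary missing, not executable, or it hung.
        return None
    # Some builds print the version banner on stderr instead of stdout.
    output = (result.stdout or result.stderr).strip()
    return output.splitlines()[0] if output else None


if __name__ == "__main__":
    path = find_tesseract()
    if path is None:
        print("Tesseract not found on PATH")
    else:
        print(f"{path}: {tesseract_version(path)}")
```

If the binary is installed but not on PATH, the path can be set manually in the Settings dialog instead.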

Python Dependencies (for LibreOffice's Python):

Automated Installation (Recommended):

python3 install_dependencies.py

Manual Installation:

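# macOS example: install the packages with LibreOffice's bundled Python, not the system Python.
# The interpreter path differs on other platforms.
/Applications/LibreOffice.app/Contents/Frameworks/LibreOfficePython.framework/Versions/Current/bin/python3 -m pip install numpy pytesseract pillow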
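
To check that the three packages landed in the right interpreter, a short script like the one below can be run with LibreOffice's Python. This is only a sketch: `check_packages` and `REQUIRED_MODULES` are hypothetical names, and the script is not the check the Settings dialog performs.

```python
import importlib.util
from typing import Dict

# pip package name -> module name used in `import` statements.
# Pillow is installed as "pillow" but imported as "PIL".
REQUIRED_MODULES: Dict[str, str] = {
    "numpy": "numpy",
    "pytesseract": "pytesseract",
    "pillow": "PIL",
}


def check_packages(modules: Dict[str, str] = REQUIRED_MODULES) -> Dict[str, bool]:
    """Map each package name to True if its module can be found, without importing it."""
    return {
        package: importlib.util.find_spec(module) is not None
        for package, module in modules.items()
    }


if __name__ == "__main__":
    for package, available in check_packages().items():
        print(f"{package}: {'OK' if available else 'MISSING'}")
```

Running it with the system `python3` reports on the system interpreter only, so the result is meaningful for TejOCR only when the script is run with the interpreter that ships with LibreOffice.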

Installation

Download: Get the latest TejOCR-0.1.5.oxt from releases
Install: LibreOffice → Tools → Extension Manager → Add → Select the .oxt file
Restart: Close and restart LibreOffice completely
Verify: Look for "TejOCR" in the top menu bar

Usage

Open LibreOffice Writer
Configure Settings: Tools → TejOCR → Settings (first time setup)
For File OCR: Tools → TejOCR → OCR Image from File → Select options → Start OCR
For Selected Image: Insert image → Select it → Tools → TejOCR → OCR Selected Image → Select options → Start OCR

🔧 Troubleshooting

Check Dependencies

Go to Tools → TejOCR → Settings to see real-time status:

✅ Tesseract: Shows installed version and path
✅ Python packages: Shows NumPy, Pytesseract, Pillow status
📁 Browse & Test: Built-in path finder and validator

Common Issues

"Settings dialog won't open":

Check LibreOffice version (4.0+ required)
Restart LibreOffice completely
Check that the extension is properly installed

"OCR options not working":

Use the Settings dialog to verify all dependencies
Check the Tesseract path with the built-in tester
Ensure an image is selected before running OCR Selected Image

Advanced Configuration

Language Selection: Choose from all installed Tesseract languages
Output Modes: Customize where text appears
Page Segmentation: Optimize for different image types
Preprocessing: Enable image enhancement for better results
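
For readers who want to see how such options typically reach Tesseract, the sketch below maps a language list, a page segmentation mode, and an engine mode onto the `lang` and `config` arguments that `pytesseract.image_to_string` accepts. The function name `build_ocr_kwargs` is hypothetical and this is not necessarily how TejOCR builds its own call; `--psm` and `--oem` are standard Tesseract command-line options.

```python
from typing import Dict, Sequence


def build_ocr_kwargs(
    languages: Sequence[str] = ("eng",),
    psm: int = 3,
    oem: int = 3,
) -> Dict[str, str]:
    """Build keyword arguments for pytesseract.image_to_string.

    languages: Tesseract language codes; several codes are joined with "+".
    psm: page segmentation mode (0-13), e.g. 6 for a single uniform block of text.
    oem: OCR engine mode (0-3), where 3 lets Tesseract pick the default engine.
    """
    if not languages:
        raise ValueError("at least one language code is required")
    if not 0 <= psm <= 13:
        raise ValueError(f"psm must be between 0 and 13, got {psm}")
    if not 0 <= oem <= 3:
        raise ValueError(f"oem must be between 0 and 3, got {oem}")
    return {
        "lang": "+".join(languages),
        "config": f"--oem {oem} --psm {psm}",
    }


# Example usage (requires pytesseract, Pillow, and a Tesseract install;
# "scan.png" is a placeholder file name):
#
#   import pytesseract
#   from PIL import Image
#   text = pytesseract.image_to_string(
#       Image.open("scan.png"), **build_ocr_kwargs(["eng", "hin"], psm=6)
#   )
```

Keeping the option mapping in a small pure function makes it easy to validate values before an OCR run starts.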

🏗️ Development

Building from Source

git clone <repository>
cd TejOCR
python3 build.py
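
An .oxt package is a ZIP archive, so the result of a build can be inspected with the standard library. The sketch below lists the files inside a package; the helper name `list_oxt_contents` and the example path are assumptions, since the output location of `build.py` is not stated here.

```python
import zipfile
from typing import List


def list_oxt_contents(oxt_path: str) -> List[str]:
    """Return the sorted list of file names stored in an .oxt package."""
    with zipfile.ZipFile(oxt_path) as archive:
        return sorted(archive.namelist())


# Example usage (the path is a placeholder; adjust it to wherever the build wrote the file):
#
#   for name in list_oxt_contents("TejOCR-0.1.5.oxt"):
#       print(name)
```

A quick look at the listing is a simple way to confirm that `description.xml`, `Addons.xcu`, and the dialog files made it into the package.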

Project Structure

TejOCR/
├── python/tejocr/          # Main Python package
│   ├── constants.py        # Version and configuration constants
│   ├── tejocr_service.py   # Main UNO service with dialog integration
│   ├── tejocr_engine.py    # OCR processing engine
│   ├── tejocr_output.py    # Text insertion handling
│   ├── tejocr_dialogs.py   # Professional XDL dialog handlers
│   └── uno_utils.py        # UNO utilities and helpers
├── dialogs/                # XDL dialog definitions
│   ├── tejocr_settings_dialog.xdl     # Settings UI
│   └── tejocr_options_dialog.xdl      # OCR options UI
├── icons/                  # Extension icons
├── description.xml         # Extension metadata
├── Addons.xcu             # LibreOffice menu/toolbar integration
└── build.py               # Build script

📝 License

This project is licensed under the Mozilla Public License 2.0 - see the LICENSE file for details.

🙏 Acknowledgments

Tesseract OCR team for the excellent OCR engine
LibreOffice community for extension development resources
Python community for pytesseract and imaging libraries

Note: This is v0.1.5 with Phase 2 (Professional UI/UX) complete. Phase 3 (Advanced Features) is coming next!

For detailed changes and technical information, see CHANGELOG.md.

🧠 About the Name

Tej (तेज) in Sanskrit and other Indian languages means light, effulgence, sharpness, or brilliance. TejOCR aims to bring clarity and insight to your documents by making the text within images accessible and editable.

📧 Contact

Maintainer: Devansh Varshney
GitHub: varshneydevansh
Twitter: @varshneydevansh
