
kp-forks/deepdoctection

 
 



NEW

Version v.0.43 includes a significant redesign of the Analyzer's default configuration. Key changes include:

  • More powerful models for Document Layout Analysis and OCR.
  • Expanded functionality.
  • Fewer dependencies.

A Package for Document Understanding

deepdoctection is a Python library that orchestrates layout analysis and content extraction for scanned documents and PDFs, for example to prepare documents for RAG. It also provides a framework for training, evaluating, and running inference with Document AI models.

Overview

Have a look at the introduction notebook for an easy start.

Check the release notes for recent updates.


Hugging Face Space Demo

Try the demo of a document layout analysis pipeline with OCR on 🤗 Hugging Face Spaces, or call it through the Gradio client.

pip install gradio_client   # requires Python >= 3.10 

To process a single image:

from gradio_client import Client, handle_file

if __name__ == "__main__":

    client = Client("deepdoctection/deepdoctection")
    result = client.predict(
        img=handle_file("/local_path/to/dir/file_name.jpeg"),  # accepts image files, e.g. JPEG, PNG
        pdf=None,
        max_datapoints=2,
        api_name="/analyze_image",
    )
    print(result)

To process a PDF document:

from gradio_client import Client, handle_file

if __name__ == "__main__":

    client = Client("deepdoctection/deepdoctection")
    result = client.predict(
        img=None,
        pdf=handle_file("/local_path/to/dir/your_doc.pdf"),
        max_datapoints=2,  # increase to process up to 9 pages
        api_name="/analyze_image",
    )
    print(result)
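
The two calls above differ only in which of `img` and `pdf` receives the file. The following is a minimal sketch of a wrapper that picks the slot from the file extension. `predict_kwargs` and `analyze_remote` are hypothetical helper names, not part of gradio_client or the demo, and the set of accepted image suffixes is an assumption based on the "JPEG, PNG" comment above.

```python
from pathlib import Path

# Assumption: only the image types named in the README comment are listed here.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def predict_kwargs(path, max_datapoints=2):
    """Build keyword arguments for client.predict, keeping the path as a plain string."""
    suffix = Path(path).suffix.lower()
    if suffix == ".pdf":
        slots = {"img": None, "pdf": str(path)}
    elif suffix in IMAGE_SUFFIXES:
        slots = {"img": str(path), "pdf": None}
    else:
        raise ValueError(f"unsupported file type: {suffix or path}")
    return {**slots, "max_datapoints": max_datapoints, "api_name": "/analyze_image"}


def analyze_remote(path, max_datapoints=2):
    """Send one file to the hosted demo (needs network access and gradio_client)."""
    from gradio_client import Client, handle_file  # imported lazily on purpose

    kwargs = predict_kwargs(path, max_datapoints)
    for slot in ("img", "pdf"):
        if kwargs[slot] is not None:
            kwargs[slot] = handle_file(kwargs[slot])  # wrap the local path for upload
    client = Client("deepdoctection/deepdoctection")
    return client.predict(**kwargs)
```

Keeping the argument selection in a pure function means it can be checked locally without contacting the Space; only `analyze_remote` touches the network.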

Example

import deepdoctection as dd
from IPython.display import HTML
from matplotlib import pyplot as plt

analyzer = dd.get_dd_analyzer()  # instantiate the built-in analyzer, similar to the Hugging Face Space demo

df = analyzer.analyze(path="/path/to/your/doc.pdf")  # set up the pipeline
df.reset_state()  # trigger initialization before iterating

doc = iter(df)
page = next(doc)  # fetch the first page

image = page.viz(show_figures=True, show_residual_layouts=True)  # render the page with layout annotations
plt.figure(figsize=(25, 17))
plt.axis("off")
plt.imshow(image)


HTML(page.tables[0].html)


print(page.text)
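
The example above inspects only the first page. As a rough sketch, the same attributes can be gathered for every page by looping over the pages. `collect_pages` is a hypothetical helper, and it assumes only what the example shows: each page has `text` and `tables`, and each table has `html`.

```python
def collect_pages(pages):
    """Gather text and table HTML from an iterable of page objects."""
    results = []
    for number, page in enumerate(pages, start=1):
        results.append(
            {
                "page": number,  # position in the iteration, not a deepdoctection field
                "text": page.text,
                "tables": [table.html for table in page.tables],
            }
        )
    return results
```

With the objects from the example, this would be called as `collect_pages(df)` after `df.reset_state()`, in place of the manual `iter`/`next` calls.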



Requirements


  • Linux or macOS. Windows is not supported, but a Dockerfile is available.
  • Python >= 3.9
  • PyTorch >= 2.2, or 2.11 <= TensorFlow < 2.16. (With lower TensorFlow versions the code only runs on a GPU.) TensorFlow support will be dropped from Python 3.11 onwards.
  • A GPU is recommended for fine-tuning models.
Task PyTorch Torchscript Tensorflow
Layout detection via Detectron2/Tensorpack ✅ (CPU only) ✅ (GPU only)
Table recognition via Detectron2/Tensorpack ✅ (CPU only) ✅ (GPU only)
Table transformer via Transformers
Deformable-Detr
DocTr
LayoutLM (v1, v2, v3, XLM) via Transformers
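
The version bounds listed above can be checked programmatically. The sketch below is illustrative only: `version_tuple`, `installed_version`, and `meets_requirements` are hypothetical helpers, the distribution names `torch` and `tensorflow` are assumptions (TensorFlow is also published under other names), and neither the operating system nor the Python 3.11 cut-off for TensorFlow is checked.

```python
import re
import sys
from importlib import metadata


def version_tuple(version):
    """Turn '2.2.1+cu121' into (2, 2); only major and minor are compared."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"cannot parse version: {version!r}")
    return int(match.group(1)), int(match.group(2))


def installed_version(distribution):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


def meets_requirements(python, torch=None, tensorflow=None):
    """Apply the documented bounds: Python >= 3.9 and one supported framework."""
    if version_tuple(python) < (3, 9):
        return False
    torch_ok = torch is not None and version_tuple(torch) >= (2, 2)
    tf_ok = tensorflow is not None and (2, 11) <= version_tuple(tensorflow) < (2, 16)
    return torch_ok or tf_ok


if __name__ == "__main__":
    ok = meets_requirements(
        python=f"{sys.version_info.major}.{sys.version_info.minor}",
        torch=installed_version("torch"),            # assumed distribution name
        tensorflow=installed_version("tensorflow"),  # assumed distribution name
    )
    print("requirements met" if ok else "requirements not met")
```

Comparing integer tuples rather than raw strings avoids the usual pitfall where "2.11" sorts before "2.2" lexicographically.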

Installation

We recommend using a virtual environment.

Get started installation

For a simple setup that is sufficient to parse documents with the default settings, install the following:

PyTorch

pip install transformers
pip install python-doctr
pip install deepdoctection

TensorFlow

pip install tensorpack
pip install python-doctr
pip install deepdoctection

Both setups are sufficient to run the introduction notebook.
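
After installing, it can be handy to confirm that the packages of the chosen setup are importable. The snippet below is a small sketch under stated assumptions: `MINIMAL_SETUPS` and `missing_modules` are hypothetical names, the framework modules `torch` and `tensorflow` are added to each list even though the commands above do not install them, and python-doctr is looked up under its import name `doctr`.

```python
from importlib.util import find_spec

# Import names for the two minimal setups; the framework itself (torch or
# tensorflow) is an assumption added on top of the pip commands above.
MINIMAL_SETUPS = {
    "pytorch": ("torch", "transformers", "doctr", "deepdoctection"),
    "tensorflow": ("tensorflow", "tensorpack", "doctr", "deepdoctection"),
}


def is_importable(name):
    """True if a top-level module with this name can be located."""
    return find_spec(name) is not None


def missing_modules(setup, available=is_importable):
    """List the modules of the given setup that cannot be found."""
    return [name for name in MINIMAL_SETUPS[setup] if not available(name)]


if __name__ == "__main__":
    for setup in MINIMAL_SETUPS:
        missing = missing_modules(setup)
        status = "ok" if not missing else "missing " + ", ".join(missing)
        print(f"{setup}: {status}")
```

`find_spec` only locates the modules without importing them, so the check stays fast even when large frameworks are installed.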

Full installation

The following installation gives you all models available for the chosen deep learning framework, as well as all models that are independent of TensorFlow/PyTorch.

PyTorch

First install Detectron2 separately, as it is not distributed via PyPI. Check the Detectron2 installation instructions or try:

pip install detectron2@git+https://github.com/deepdoctection/detectron2.git

Then install deepdoctection with all its dependencies:

pip install "deepdoctection[pt]"  # quotes keep shells such as zsh from expanding the brackets

Tensorflow

pip install "deepdoctection[tf]"  # quotes keep shells such as zsh from expanding the brackets

For further information, please consult the full installation instructions.

Installation from source

Download the repository or clone via

git clone https://github.com/deepdoctection/deepdoctection.git

PyTorch

cd deepdoctection
pip install ".[pt]" # or "pip install -e .[pt]"

Tensorflow

cd deepdoctection
pip install ".[tf]" # or "pip install -e .[tf]"

Running a Docker container from Docker Hub

Prebuilt Docker images can be pulled from Docker Hub.

docker pull deepdoctection/deepdoctection:<release_tag> 

Use the Docker Compose file ./docker/pytorch-gpu/docker-compose.yaml. In the provided .env file, specify the host directory where deepdoctection's cache should be stored, and a working directory whose files will be mounted into the container for processing.

docker compose up -d

This starts the container in the background. No endpoint is exposed, though.


Credits

We thank all libraries that provide high-quality code and pre-trained models. Without them, it would have been impossible to develop this framework.

If you like deepdoctection ...

...you can easily support the project by making it more visible. Leaving a star or a recommendation will help.

License

Distributed under the Apache 2.0 License. Check LICENSE for additional information.
