jsaribeirolopes/PdfDocumentParser

 
 

PdfDocumentParser

PdfDocumentParser is a parsing engine for finding and extracting text and images from PDF documents that conform to a predictable graphic layout, such as reports, bills, forms, tickets and the like. Its approach is to find certain text or image fragments on a page and then extract the text or images located relative to those fragments.
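The anchor-then-offset idea described above can be illustrated with a small, library-independent sketch. It is written in Python purely for illustration and is not PdfDocumentParser's API: the `Word` and `Field` types, the `extract` function, and the sample coordinates are all hypothetical, and it assumes the page text is already available as words with positions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Word:
    """A piece of page text with the position of its top-left corner (hypothetical type)."""
    text: str
    x: float
    y: float


@dataclass(frozen=True)
class Field:
    """A rectangle to read, positioned relative to an anchor fragment (hypothetical type)."""
    anchor: str    # text fragment to search for on the page
    dx: float      # horizontal offset from the anchor to the rectangle's left edge
    dy: float      # vertical offset from the anchor to the rectangle's top edge
    width: float
    height: float


def extract(words: List[Word], field: Field) -> Optional[str]:
    """Find the anchor, then join the words whose corner falls inside the offset rectangle."""
    anchor = next((w for w in words if w.text == field.anchor), None)
    if anchor is None:
        return None  # the page does not match this layout
    left = anchor.x + field.dx
    top = anchor.y + field.dy
    inside = [
        w for w in words
        if left <= w.x < left + field.width and top <= w.y < top + field.height
    ]
    inside.sort(key=lambda w: (w.y, w.x))  # reading order: top to bottom, left to right
    return " ".join(w.text for w in inside)


# Made-up sample page: an invoice with a number and a total.
page = [
    Word("Invoice", 50, 40), Word("No:", 110, 40), Word("A-1042", 150, 40),
    Word("Total:", 50, 300), Word("118.50", 120, 300), Word("EUR", 170, 300),
]

total = extract(page, Field(anchor="Total:", dx=60, dy=-5, width=150, height=10))
number = extract(page, Field(anchor="No:", dx=30, dy=-5, width=100, height=10))
print(total, number)
```

Because each rectangle is defined relative to its anchor rather than to the page, the same field definitions keep working when the whole block shifts from one document to the next, which is what makes this style of parsing suit fixed-layout documents.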

PdfDocumentParser handles the tricky work of building parsing templates, searching, recognition and extraction, leaving you to code only your custom logic.

PdfDocumentParser is a .NET DLL.

For an example of using PdfDocumentParser, or for a framework to build on, refer to the SampleParser project in the repository.

More details...

Support

Contact me if you want me to enhance PdfDocumentParser. You can also hire me to solve a parsing task of any complexity or for general development.

About

PdfDocumentParser is a .NET toolset for building PDF parsers.

Languages

  • C# 94.3%
  • HTML 3.2%
  • JavaScript 1.7%
  • Other 0.8%