@timothycarambat timothycarambat commented Sep 18, 2023

Taking over for PR #183

  • Migrates PDF loading to PyMuPDF
  • Simplifies PDF parsing and loading
  • 200% speed increase from this change alone
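
For readers unfamiliar with PyMuPDF, here is a minimal sketch of what page-by-page PDF loading with it can look like. This is not the code merged in this PR; the function names `pages_to_documents` and `load_pdf` and the record fields (`source`, `page`, `text`) are assumptions made for illustration. Only `fitz.open`, page iteration, `page.number`, and `page.get_text()` are actual PyMuPDF calls.

```python
from typing import Dict, Iterable, List, Tuple


def pages_to_documents(pages: Iterable[Tuple[int, str]], source: str) -> List[Dict]:
    """Turn (page_number, text) pairs into simple records, skipping empty pages.

    The record layout here is hypothetical, not the project's schema.
    """
    docs = []
    for number, text in pages:
        cleaned = text.strip()
        if not cleaned:
            continue  # pages with no extractable text (e.g. scanned images)
        docs.append({"source": source, "page": number, "text": cleaned})
    return docs


def load_pdf(path: str) -> List[Dict]:
    """Extract the text of every page of a PDF using PyMuPDF."""
    import fitz  # PyMuPDF; imported lazily so the helper above has no dependency

    with fitz.open(path) as pdf:
        # page.number is zero-based, so add 1 for human-friendly page numbers
        pages = [(page.number + 1, page.get_text()) for page in pdf]
    return pages_to_documents(pages, source=path)
```

With PyMuPDF installed (`pip install pymupdf`), `load_pdf("example.pdf")` would return one record per non-empty page. The path here is a placeholder.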

c/o: franzbischoff

@timothycarambat timothycarambat mentioned this pull request Sep 18, 2023
@timothycarambat timothycarambat merged commit 3e78476 into master Sep 18, 2023
@timothycarambat timothycarambat deleted the franzbischoff-document-improvements branch September 18, 2023 23:21
cabwds pushed a commit to cabwds/anything-llm that referenced this pull request Jul 3, 2025
* cosmetic changes to be compatible with hadolint

* common configuration for most editors until better plugins come up

* Changes on PDF metadata, using PyMuPDF (faster and more compatible)

* small changes on other file ingestions in order to try to keep the fields equal

* Lint, review, and review

* fixed unknown chars

* Use PyMuPDF for pdf loading for 200% speed increase
linting
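
The notes above on PDF metadata and on keeping fields equal across file ingestions could be sketched roughly as follows. The target field names (`title`, `author`, `created`, `source`) and the fallback values are assumptions for illustration, not the schema this commit introduced. The `title`, `author`, and `creationDate` keys do match the dictionary that PyMuPDF exposes as `Document.metadata`.

```python
from typing import Dict, Optional


def normalize_metadata(raw: Optional[Dict], source: str) -> Dict[str, str]:
    """Map loader-specific metadata onto one common set of fields.

    Hypothetical sketch: every ingested file ends up with the same keys,
    whichever loader produced the raw metadata.
    """
    raw = raw or {}
    return {
        # fall back to the file name when the document has no title
        "title": (raw.get("title") or "").strip() or source,
        "author": (raw.get("author") or "").strip() or "Unknown",
        # PyMuPDF reports "creationDate"; other loaders may use "created"
        "created": (raw.get("creationDate") or raw.get("created") or "").strip(),
        "source": source,
    }
```

For a PDF, the `raw` argument could be the `metadata` attribute of a document opened with `fitz.open`. For formats without embedded metadata, passing `None` still yields the same keys.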

---------

Co-authored-by: Francisco Bischoff <franzbischoff@gmail.com>
Co-authored-by: Francisco Bischoff <984592+franzbischoff@users.noreply.github.com>