PDFDocument() from pdfminer requires a parser argument

Using the fix from #24 with Python 3.5.2, calling slate.PDF(file) fails because PDFDocument requires a parser argument. What should be passed here? I tried self.parser, but that did not work either.

Traceback (most recent call last):
  File "pdftotext.py", line 7, in <module>
    doc = slate.PDF(f)
  File "//anaconda/lib/python3.5/site-packages/slate/classes.py", line 56, in __init__
    self.doc = PDFDocument()
TypeError: __init__() missing 1 required positional argument: 'parser'

Changing the line to self.doc = PDFDocument(self.parser) leads to the following error, which I cannot fix either:

Traceback (most recent call last):
  File "pdftotext.py", line 7, in <module>
    doc = slate.PDF(f)
  File "//anaconda/lib/python3.5/site-packages/slate/classes.py", line 57, in __init__
    self.doc = PDFDocument(self.parser)
  File "//anaconda/lib/python3.5/site-packages/pdfminer/pdfdocument.py", line 559, in __init__
    pos = self.find_xref(parser)
  File "//anaconda/lib/python3.5/site-packages/pdfminer/pdfdocument.py", line 773, in find_xref
    for line in parser.revreadlines():
  File "//anaconda/lib/python3.5/site-packages/pdfminer/psparser.py", line 285, in revreadlines
    s = self.fp.read(prevpos-pos)
  File "//anaconda/lib/python3.5/codecs.py", line 321, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x88 in position 2: invalid start byte
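
A note on the second traceback: the last frames go through codecs.py while executing self.fp.read(...), which suggests the file object handed to slate.PDF was opened in text mode, so Python tried to decode raw PDF bytes as UTF-8. The sketch below reproduces that failure with the standard library only. The sample bytes and the helper names read_text_mode and read_binary_mode are made up for illustration and are not part of slate or pdfminer.

```python
import os
import tempfile


def read_text_mode(path):
    # Text mode: Python decodes the bytes as UTF-8 while reading.
    with open(path, "r", encoding="utf-8") as f:
        return f.read()


def read_binary_mode(path):
    # Binary mode: bytes are returned untouched, no decoding step.
    with open(path, "rb") as f:
        return f.read()


# Hypothetical stand-in for a PDF: a header line followed by raw bytes
# (0x88 is not a valid UTF-8 start byte, as in the traceback above).
sample = b"%PDF-1.4\n%\x88\x99\n"

fd, path = tempfile.mkstemp(suffix=".pdf")
with os.fdopen(fd, "wb") as f:
    f.write(sample)

try:
    try:
        read_text_mode(path)
        text_mode_failed = False
    except UnicodeDecodeError:
        text_mode_failed = True
    data = read_binary_mode(path)
finally:
    os.remove(path)

print("text mode failed:", text_mode_failed)
print("binary mode read", len(data), "bytes")
```

If pdftotext.py opens the file with open(path) rather than open(path, 'rb'), switching to binary mode may avoid this particular error. Whether slate then works with this pdfminer version is not something the traceback confirms.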

