I ran into a ValueError and then a KeyError while trying to fix the bug
I checked out the dev branch and reproduced the bug using: python pdf2txt.py xxx.pdf.
I can't share the document right now because it is private and protected.
But the stack traces tell the story:
Traceback (most recent call last):
  File "pdf2txt.py", line 204, in <module>
    sys.exit(main())
  File "pdf2txt.py", line 198, in main
    outfp = extract_text(**vars(A))
  File "pdf2txt.py", line 66, in extract_text
    pdfminer.high_level.extract_text_to_fp(fp, **locals())
  File "/Users/baojiatong/Work/pdfminer.six/pdfminer/high_level.py", line 83, in extract_text_to_fp
    caching=not disable_caching):
  File "/Users/baojiatong/Work/pdfminer.six/pdfminer/pdfpage.py", line 128, in get_pages
    doc = PDFDocument(parser, password=password, caching=caching)
  File "/Users/baojiatong/Work/pdfminer.six/pdfminer/pdfdocument.py", line 572, in __init__
    self.read_xref_from(parser, pos, self.xrefs)
  File "/Users/baojiatong/Work/pdfminer.six/pdfminer/pdfdocument.py", line 826, in read_xref_from
    self.read_xref_from(parser, pos, xrefs)
  File "/Users/baojiatong/Work/pdfminer.six/pdfminer/pdfdocument.py", line 815, in read_xref_from
    xref.load(parser)
  File "/Users/baojiatong/Work/pdfminer.six/pdfminer/pdfdocument.py", line 233, in load
    or stream['Type'] is not LITERAL_XREF:
  File "/Users/baojiatong/Work/pdfminer.six/pdfminer/pdftypes.py", line 219, in __getitem__
    return self.attrs[name]
KeyError: 'Type'
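The KeyError above comes from indexing a stream that has no 'Type' entry. As a rough sketch of one way to guard against that (not the actual pdfminer.six code), the lookup can fall back to a default instead of raising. `FakeStream`, `is_xref_stream`, and the `LITERAL_XREF` placeholder below are hypothetical stand-ins written for illustration only:

```python
# Hypothetical stand-ins for illustration; these are NOT pdfminer.six classes.
LITERAL_XREF = object()  # placeholder for the interned XRef literal


class FakeStream:
    """Minimal stand-in for a PDF stream whose attributes live in a dict."""

    def __init__(self, attrs):
        self.attrs = attrs

    def __getitem__(self, name):
        # Same shape as the failing line in the traceback:
        # raises KeyError when the key is absent.
        return self.attrs[name]

    def get(self, name, default=None):
        # Tolerant lookup: returns a default instead of raising.
        return self.attrs.get(name, default)


def is_xref_stream(stream):
    """Return True only for a stream whose 'Type' is the XRef literal."""
    if not isinstance(stream, FakeStream):
        return False
    # A missing 'Type' yields None here, so the check fails cleanly.
    return stream.get('Type') is LITERAL_XREF
```

With a guard like this, a stream without 'Type' is simply treated as "not an xref stream", and the caller can decide whether to report a malformed file or try another strategy.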
Traceback (most recent call last):
  File "pdf2txt.py", line 204, in <module>
    sys.exit(main())
  File "pdf2txt.py", line 198, in main
    outfp = extract_text(**vars(A))
  File "pdf2txt.py", line 66, in extract_text
    pdfminer.high_level.extract_text_to_fp(fp, **locals())
  File "/Users/baojiatong/Work/pdfminer.six/pdfminer/high_level.py", line 83, in extract_text_to_fp
    caching=not disable_caching):
  File "/Users/baojiatong/Work/pdfminer.six/pdfminer/pdfpage.py", line 128, in get_pages
    doc = PDFDocument(parser, password=password, caching=caching)
  File "/Users/baojiatong/Work/pdfminer.six/pdfminer/pdfdocument.py", line 572, in __init__
    self.read_xref_from(parser, pos, self.xrefs)
  File "/Users/baojiatong/Work/pdfminer.six/pdfminer/pdfdocument.py", line 826, in read_xref_from
    self.read_xref_from(parser, pos, xrefs)
  File "/Users/baojiatong/Work/pdfminer.six/pdfminer/pdfdocument.py", line 815, in read_xref_from
    xref.load(parser)
  File "/Users/baojiatong/Work/pdfminer.six/pdfminer/pdfdocument.py", line 231, in load
    (_, stream) = parser.nextobject()
  File "/Users/baojiatong/Work/pdfminer.six/pdfminer/psparser.py", line 610, in nextobject
    self.do_keyword(pos, token)
  File "/Users/baojiatong/Work/pdfminer.six/pdfminer/pdfparser.py", line 72, in do_keyword
    ((_, objid), (_, genno)) = self.pop(2)
ValueError: not enough values to unpack (expected 2, got 1)
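The ValueError above is raised when a two-element unpacking receives only one item from the parser stack. A minimal sketch of a length check before unpacking follows. It assumes the stack holds `(pos, value)` pairs, as the unpacking pattern in the traceback suggests; `pop_obj_ref` is a hypothetical helper, not part of pdfminer.six:

```python
def pop_obj_ref(stack):
    """Pop the last two (pos, value) pairs and return (objid, genno).

    Returns None, leaving the stack untouched, when fewer than two
    entries are available, instead of failing during unpacking.
    """
    if len(stack) < 2:
        return None
    (_, objid), (_, genno) = stack[-2:]
    del stack[-2:]  # remove the two consumed entries
    return objid, genno
```

Returning None here is only one possible design. Raising a dedicated parse error would work equally well, as long as the caller can tell a truncated or malformed object header apart from a valid one.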
I will file a PR soon to fix this.