Incorrect parsing of filenames in file specifications #152

Closed
davidtr1037 opened this issue Jun 9, 2018 · 1 comment · Fixed by #338

Comments

davidtr1037 commented Jun 9, 2018

Filenames in file specifications are parsed incorrectly when the 'UF' key is missing.
For example, when I run:

dumppdf.py -E /tmp/ sample.pdf

I get:

Traceback (most recent call last):
  File "./build/scripts-2.7/dumppdf.py", line 275, in <module>
    if __name__ == '__main__': sys.exit(main(sys.argv))
  File "./build/scripts-2.7/dumppdf.py", line 272, in main
    dumpall=dumpall, codec=codec, extractdir=extractdir)
  File "./build/scripts-2.7/dumppdf.py", line 200, in extractembedded
    extract1(obj)
  File "./build/scripts-2.7/dumppdf.py", line 173, in extract1
    filename = os.path.basename(obj['UF'] or obj['F'])
KeyError: 'UF'

Possible fix:

filename = obj.get('UF') or obj.get('F') or <some_default_string>
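
The suggestion above could be sketched as a small helper. This is only an illustration, not the code in dumppdf.py or in the eventual fix: the function name `pick_filename`, the default value `embedded.bin`, and the bytes handling are assumptions, and `filespec` is treated as a plain dict rather than a resolved PDF object.

```python
import os


def pick_filename(filespec, default="embedded.bin"):
    """Return a base filename for a file-specification-like mapping.

    Hypothetical helper: 'filespec' is assumed to behave like a dict
    that may or may not contain the 'UF' and 'F' keys.
    """
    # .get() returns None for a missing key instead of raising KeyError,
    # so the 'or' chain can fall through to 'F' and then to the default.
    raw = filespec.get("UF") or filespec.get("F") or default
    # Assumption: some parsers hand back bytes; decode them leniently.
    if isinstance(raw, bytes):
        raw = raw.decode("utf-8", errors="replace")
    # Keep only the last path component, as the original line did.
    return os.path.basename(raw) or default


if __name__ == "__main__":
    print(pick_filename({"F": "attachments/report.txt"}))
    print(pick_filename({}))
```

Keeping `os.path.basename` preserves the behaviour of the original line, which strips directory components before the file is written to the extraction directory.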

The pdf is attached:
7c127eb6889074efbfac63d35ba0b69cfae22d56bfa2755ef0e925b6c032c4b0.pdf

pietermarsman (Member) commented

Thanks for reporting this! I've created #338 to fix it. Do you have more PDFs with embedded images? If so, could you check whether the PR code works for those?
