566 fix cjk characters extraction #593

Merged

@wind-chh commented Feb 26, 2021

Pull request

This PR tries to fix issue #566 and the same failure (with a different cause) in my own PDF files.

How Has This Been Tested?

I've added two PDF files under sample/contrib: one from the issue reporter and
one of my own that hits the same issue. I've also added two tests to the
test_highlevel_extracttext.py test suite. All unit tests pass in my environment.
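
As an illustration only (this is not the code of the tests added in this PR), a check on extracted text could look roughly like the sketch below. It assumes that characters the extractor cannot map show up as "(cid:N)" placeholders, and the helper names `is_cjk`, `summarize_extraction` and `looks_correctly_extracted` are hypothetical. The Unicode ranges cover only a few common CJK blocks and are not exhaustive.

```python
import re

# Assumption: unmapped glyphs appear in the extracted text as "(cid:N)".
CID_PLACEHOLDER = re.compile(r"\(cid:\d+\)")


def is_cjk(ch: str) -> bool:
    """Return True for characters in a few common CJK blocks (not exhaustive)."""
    cp = ord(ch)
    return (
        0x4E00 <= cp <= 0x9FFF      # CJK Unified Ideographs
        or 0x3400 <= cp <= 0x4DBF   # CJK Unified Ideographs Extension A
        or 0x3040 <= cp <= 0x30FF   # Hiragana and Katakana
        or 0xAC00 <= cp <= 0xD7AF   # Hangul Syllables
    )


def summarize_extraction(text: str) -> dict:
    """Count CJK characters and "(cid:N)" placeholders in extracted text."""
    return {
        "cjk_chars": sum(1 for ch in text if is_cjk(ch)),
        "cid_placeholders": len(CID_PLACEHOLDER.findall(text)),
    }


def looks_correctly_extracted(text: str) -> bool:
    """Heuristic: some CJK text is present and no placeholders remain."""
    summary = summarize_extraction(text)
    return summary["cjk_chars"] > 0 and summary["cid_placeholders"] == 0
```

With pdfminer.six installed, the input string could come from `pdfminer.high_level.extract_text` called on one of the sample PDFs; the heuristic only flags obvious failures and does not replace comparing against the expected text.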

Checklist

  • I have added tests that prove my fix is effective or that my feature
    works
  • I have added docstrings to newly created methods and classes
  • I have optimized the code at least one time after creating the initial
    version
  • I have updated the README.md or I have verified that this
    is not necessary
  • I have updated the readthedocs documentation or I
    verified that this is not necessary
  • I have added a concise human-readable description of the change to
    CHANGELOG.md

Member

@pietermarsman pietermarsman left a comment

Your changes look good, and I'm happy that all the tests are working. However, I'm not very familiar with this part of the code. Can you indicate how certain you are that this code will not break other PDFs?

Also, I've left some requests for small changes and some questions. If you change/answer them this PR is good to go.

pdfminer/pdffont.py (resolved)
pdfminer/cmapdb.py (resolved)
CHANGELOG.md (outdated, resolved)
@pietermarsman pietermarsman merged commit 234c466 into pdfminer:develop Aug 26, 2021