-
Notifications
You must be signed in to change notification settings - Fork 946
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Always try to get CMap, even if name is not recognized #438
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems Good,
Initially, any value outside IDENTITY_ENCODER was set "unknown" as default, now this will be set to the cmap_name itself.
@fakabbir Could you elaborate on that? I don't see the difference in the |
So, at this place https://github.com/pdfminer/pdfminer.six/pull/438/files#diff-9d138ff43c58cd4903b1e16ce49c98fcR766 |
But in the old code the Happy to improve thing here, but I'm not (yet) seeing it. |
I felt it's getting assigned as |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need Root Cause Figured Out
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removing Approved Status.
@fakabbir I'm a bit confused on what to do with this PR now. |
Hey @pietermarsman and @fakabbir. I've been sorting through the issues (as per conversation on gitter). I've marked the issue relating to this PR as "in progress". Is this moving forwards somehow? |
… default if the key is not in there
That's indeed the case. The differences:
So I think this is an improvement. But it would be great if someone else can confirm this. |
It improves the output of the issue, so thats a good start :) |
Does the current patch resolve the issue with the Chinese characters in the PDF in #391 ? |
Yes |
I need another review before merging this. Either from @fakabbir or someone else. |
I can take a look if you want, but I think fakabbir probably understood this more than I will so you should probably wait for him. Looks like CI is failing at the moment anyway? |
…nto 391-fix-cmap-from-pickle-file
I thought I fixed that earlier, but actually it never passed 🤔 Now it is! :) Actually, the test output improved because the CJK characters in simple3.pdf are now also recognized. |
Now it is... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, just one suggestion to add a comment (which you're welcome to ignore if you think it's obvious)
Pull request
Fixes #391
This PR allows always tries to get a
cmap_name
. Previously this was only done for . It does not break any existing behavior.How Has This Been Tested?
Checklist
works
version
is not necessary
verified that this is not necessary
CHANGELOG.md