Encode PDFString containing non-Latin characters #162
Hello @PlushBeaver! I just had a chance to review this. My apologies for the late response. Thank you very much for the time and effort you spent on this! I have a couple of questions/concerns:
Thanks for the concerns raised, @Hopding.
Hello @PlushBeaver! My apologies for the delayed response - I've been quite busy. But I haven't forgotten about this. I've been thinking about the best way to handle all of this. I finally made some decisions, and have been working on implementing things in #204 over the past few days. The changes I made in #204 are all based on your work here, but with a few differences. Please take a look at #204 and let me know if I missed anything that you solved here. If it looks good to you, then I'll merge it (closing both #162 and #204) and the changes will go out in the next pdf-lib release!
Hello, @Hopding. Specialized facilities from #204 are both handy and flexible enough not only to solve metadata encoding issues, but also to implement Unicode outline using new
The ASCII representation of a PDFString cannot handle code points above 127 (the 7-bit ASCII limit). Strings containing such characters have to be encoded as Unicode (UTF-16 with a BOM). This implementation opts to represent them as hex strings for ease of encoding.
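For illustration, here is a minimal sketch of the approach described above. It is not the code from this PR or from #204; `toPdfStringSyntax` is a hypothetical helper name, and the sketch assumes big-endian UTF-16 with the `FEFF` byte order mark, which is what the PDF specification expects for Unicode text strings.

```typescript
// Hypothetical helper for illustration only; not part of pdf-lib's API.
function toPdfStringSyntax(text: string): string {
  // Pure 7-bit ASCII can be written as a literal string in parentheses.
  if (/^[\x00-\x7F]*$/.test(text)) {
    // Backslashes and parentheses must be escaped inside a literal string.
    const escaped = text.replace(/[\\()]/g, (c) => '\\' + c);
    return '(' + escaped + ')';
  }

  // Otherwise emit a hex string: the BOM (FEFF) followed by every UTF-16
  // code unit as four hex digits, most significant byte first.
  // JavaScript strings are already sequences of UTF-16 code units, so
  // surrogate pairs need no special handling here.
  let hex = 'FEFF';
  for (let i = 0; i < text.length; i++) {
    hex += ('0000' + text.charCodeAt(i).toString(16).toUpperCase()).slice(-4);
  }
  return '<' + hex + '>';
}
```

With this sketch, `toPdfStringSyntax('Hello')` yields `(Hello)`, while `toPdfStringSyntax('Привет')` yields `<FEFF041F04400438043204350442>`. The hex form sidesteps escaping altogether, which is the "ease of encoding" mentioned above.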