Emoji clues are rendered incorrectly on Windows #137

Closed
jpd236 opened this issue Jan 4, 2021 · 2 comments · Fixed by #156
Comments

@jpd236
Contributor

jpd236 commented Jan 4, 2021

If I set a clue to an emoji character in a format which supports UTF-8 (like JPZ), it fails to render correctly. Not sure whether it's a wxWidgets limitation or if we're mangling the text before passing it on to be rendered.

@jpd236 jpd236 changed the title Emoji clues are rendered incorrectly (at least on Windows) Emoji clues are rendered incorrectly on Windows Jul 4, 2021
@jpd236
Contributor Author

jpd236 commented Jul 4, 2021

Verified that Emoji render correctly on Mac (using the UTF-8 .puz support from #154).

I believe the problem here is the platform definition of wchar_t. On Mac/Linux it's 4 bytes and holds UTF-32 code points, which is what XWord appears to have implemented. On Windows it's only 2 bytes, and wide strings are expected to be UTF-16. As a result, UTF-8 to Unicode conversions (e.g. utf8_to_unicode in puzstring.cpp) may try to store a code point above U+FFFF in a single 2-byte wchar_t instead of a surrogate pair, which corrupts any character outside the Basic Multilingual Plane, including most emoji.
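To illustrate the suspected failure mode and one possible fix, here is a minimal sketch of a UTF-8 decoder that emits a UTF-16 surrogate pair when wchar_t is 2 bytes. The names `append_code_point` and `decode_utf8` are hypothetical and are not XWord's actual `utf8_to_unicode`; the sketch only does light validation and substitutes U+FFFD for malformed input.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper (not from XWord): append one Unicode code point to a
// wide string. When wchar_t is 16 bits (as on Windows), code points above
// U+FFFF are split into a UTF-16 surrogate pair instead of being truncated.
inline void append_code_point(std::wstring& out, std::uint32_t cp) {
    if (sizeof(wchar_t) == 2 && cp > 0xFFFF) {
        cp -= 0x10000;
        out.push_back(static_cast<wchar_t>(0xD800 + (cp >> 10)));    // high surrogate
        out.push_back(static_cast<wchar_t>(0xDC00 + (cp & 0x3FF)));  // low surrogate
    } else {
        out.push_back(static_cast<wchar_t>(cp));
    }
}

// Hypothetical decoder: UTF-8 bytes to a platform-appropriate wide string.
// Malformed sequences are replaced with U+FFFD.
inline std::wstring decode_utf8(const std::string& in) {
    std::wstring out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char lead = static_cast<unsigned char>(in[i]);
        std::uint32_t cp = 0;
        std::size_t extra = 0;  // number of continuation bytes expected
        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else                            { cp = 0xFFFD;      extra = 0; }
        ++i;
        for (std::size_t k = 0; k < extra; ++k) {
            // Stop early if the sequence is truncated or the next byte is
            // not a continuation byte (10xxxxxx).
            if (i >= in.size() ||
                (static_cast<unsigned char>(in[i]) & 0xC0) != 0x80) {
                cp = 0xFFFD;
                break;
            }
            cp = (cp << 6) | (static_cast<unsigned char>(in[i]) & 0x3F);
            ++i;
        }
        append_code_point(out, cp);
    }
    return out;
}
```

For example, the UTF-8 bytes `F0 9F 98 80` (U+1F600) would decode to the two code units `0xD83D 0xDE00` where wchar_t is 2 bytes, and to the single value `0x1F600` where it is 4 bytes.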

We could have Windows-specific implementations of UTF-8 encoding/decoding here, though I'm not sure if there might be wider difficulties/consequences.
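The encoding direction would need the matching change. As a rough sketch (the name `encode_utf8` is hypothetical, not an existing XWord function), a wide-to-UTF-8 encoder can recombine surrogate pairs before emitting bytes, which also keeps it correct on platforms where wchar_t is 4 bytes. Unpaired surrogates are replaced with U+FFFD here, which is an assumption about the desired behavior.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical encoder: wide string (UTF-16 or UTF-32 code units, depending
// on the platform's wchar_t) to UTF-8 bytes.
inline std::string encode_utf8(const std::wstring& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        std::uint32_t cp = static_cast<std::uint32_t>(in[i]);

        // Recombine a high/low surrogate pair into a single code point.
        if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < in.size()) {
            const std::uint32_t lo = static_cast<std::uint32_t>(in[i + 1]);
            if (lo >= 0xDC00 && lo <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
                ++i;  // consume the low surrogate as well
            }
        }
        // Anything still in the surrogate range is unpaired: substitute U+FFFD.
        if (cp >= 0xD800 && cp <= 0xDFFF) {
            cp = 0xFFFD;
        }

        if (cp < 0x80) {
            out.push_back(static_cast<char>(cp));
        } else if (cp < 0x800) {
            out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else if (cp < 0x10000) {
            out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else {
            out.push_back(static_cast<char>(0xF0 | (cp >> 18)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        }
    }
    return out;
}
```

Branching on the code unit values rather than on `#ifdef _WIN32` keeps a single implementation for all platforms, at the cost of a few extra comparisons per character.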

@mrichards42
Owner

Ah good call, yes that's almost certainly it. Probably worth scrapping those hand-rolled functions and using a library. Looks like pugixml handles encodings, so that might be the lightest lift. I didn't realize it also has a wchar implementation that does the encoding and decoding for you or I might have used that in the jpz parser originally.
