Support Across Lite 2.0 format #154

Open
jpd236 opened this issue Jun 19, 2021 · 3 comments

Comments

@jpd236
Contributor

jpd236 commented Jun 19, 2021

Evidently there is a minor tweak to the .puz format where the version string is 2.0 and strings (at least clues, if not others) are in UTF-8 instead of ISO-8859-1. The attached example puzzle opens fine in Across Lite and shows various Unicode characters, but fails to open in XWord. It would be nice to support this in XWord (though we might also need to take a closer look at #137).
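As a rough illustration of the version-dependent decoding described above, here is a minimal Python sketch. The function name `decode_puz_string` is hypothetical, and treating every major version of 2 or higher as UTF-8 is an assumption; the thread only establishes this for version 2.0.

```python
def decode_puz_string(raw: bytes, version: str) -> str:
    """Decode one .puz string (without its NUL terminator).

    Assumption: files whose version string has major version >= 2 store
    strings as UTF-8; older files use ISO-8859-1.
    """
    major = int(version.split(".")[0])
    encoding = "utf-8" if major >= 2 else "iso-8859-1"
    return raw.decode(encoding)


# The same text is stored as different bytes depending on the version.
legacy = decode_puz_string(b"caf\xe9", "1.3")
modern = decode_puz_string("café ☃".encode("utf-8"), "2.0")
```

Writing would mirror this with `str.encode` and the same encoding choice.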

example.zip

@mrichards42
Owner

Huh, well, so it's easy enough to replace decode_puz with decode_utf8 (and the same when saving), although the way the code is structured it's kind of obnoxious to support both at the same time, so it might take a little refactoring.

The more annoying part is that the clues seem to produce different checksums than Across Lite wants. XWord and AL generate identical contents, including most of the checksums, but the two sets of checksums that include the clues are off. It's also possible that I missed something when I hacked the saving code to change the encoding function. I also couldn't get AL to enter accents in the grid, so I didn't get any info about whether it still tries to save the grid as utf-8 or cp-1252 (small thing, but iirc the legacy encoding is cp-1252, not iso-8859-1).
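One thing worth checking here is that the checksums are computed over the encoded bytes, so the encoding choice changes the result. The sketch below uses the rotate-and-add region checksum from the community-documented (reverse-engineered) .puz format. Whether XWord's implementation matches it exactly, and whether this accounts for the mismatch described above, are assumptions.

```python
def cksum_region(data: bytes, cksum: int = 0) -> int:
    """16-bit rotate-right-and-add checksum over a byte region."""
    for byte in data:
        # Rotate the 16-bit checksum right by one bit.
        if cksum & 1:
            cksum = (cksum >> 1) + 0x8000
        else:
            cksum >>= 1
        # Add the next byte, keeping the result to 16 bits.
        cksum = (cksum + byte) & 0xFFFF
    return cksum


clue = "Café order"
latin1_sum = cksum_region(clue.encode("iso-8859-1"))
utf8_sum = cksum_region(clue.encode("utf-8"))
# The two sums differ because "é" is one byte in ISO-8859-1 and two in UTF-8.
```

If the clue checksums are fed the old encoding while the file body is written as UTF-8 (or the other way around), only the checksums that cover the clues would disagree.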

@jpd236
Contributor Author

jpd236 commented Jun 22, 2021

Here's another reference implementation - not sure if it runs into the same issue w/ checksums: https://github.com/alexdej/puzpy/pull/26/files

AFAIK, the grid is still unchanged in terms of supported characters (https://twitter.com/evanbirnholz/status/1406431096653422593).

I used to think it was Cp1252 as well, but I think I found inconsistencies in how certain Cp1252 characters work across platforms - e.g. that Mac Across Lite couldn't handle at least some of the Cp1252-exclusive characters.

Also, we may want to retain the input version (and/or use the minimum version needed to support the characters in the puzzle) rather than always "upgrading" to v2.0. I've found at least one bug in the current version of Across Lite w/ v2.0 puzzles - rebus entry appears to be fairly broken.
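A minimal Python sketch of that version-selection idea follows. The function name `choose_output_version` is hypothetical, and the policy shown (keep the input version unless some string cannot be represented in ISO-8859-1) is only one possible reading of the suggestion.

```python
def choose_output_version(strings, input_version: str) -> str:
    """Keep the input version unless a string needs UTF-8 (v2.0)."""
    for text in strings:
        try:
            text.encode("iso-8859-1")
        except UnicodeEncodeError:
            # At least one character is outside ISO-8859-1,
            # so the file has to be written as v2.0.
            return "2.0"
    return input_version
```

With this policy a v1.3 puzzle containing only Latin-1 text stays v1.3, and a puzzle that was already v2.0 is never downgraded.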

@jpd236
Contributor Author

jpd236 commented Jul 4, 2021

Kicked off #155. The good news is that the files being saved here still seem to open in Across Lite, so I think the checksums are being calculated correctly. The bad news is that while one of the puzzles I tested with a "simple" Unicode character (https://herbach.dnsalias.com/WaPo/wp210620_a.puz, see 72 Down) reads and writes the correct clue text, the original example attached above gets corrupted. I'll look into it soon, but I suspect that one or both of encode_utf8 or decode_utf8 has a bug with higher codepoints - perhaps the same root cause as #137.
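To narrow down a suspected higher-codepoint bug, a round-trip check over characters of each UTF-8 length can help. The sketch below uses Python's built-in codec as a known-good reference. It does not reproduce XWord's `encode_utf8`/`decode_utf8`; the helper names and sample characters are illustrative only.

```python
def utf8_length(ch: str) -> int:
    """Number of bytes a single character occupies in UTF-8."""
    return len(ch.encode("utf-8"))


def round_trips(text: str) -> bool:
    """True if encoding then decoding returns the original text."""
    return text.encode("utf-8").decode("utf-8") == text


# One sample per UTF-8 length: ASCII, Latin-1 range, rest of the BMP,
# and a supplementary-plane character (above U+FFFF).
samples = ["A", "é", "☃", "😀"]
lengths = [utf8_length(ch) for ch in samples]
all_ok = all(round_trips(ch) for ch in samples)
```

Running the same samples through the codec under test and comparing against these reference bytes would show which byte length first diverges.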
