-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add CI for s390x and fix big-endian arch #98
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to fix the following files as well for the same issue:
- unicode-data-scripts/test/Unicode/Char/General/ScriptsSpec.hs
- unicode-data-scripts/lib/Unicode/Char/General/Scripts.hs
We may have problems in future if we forget to include MachDeps.h , this is a nasty thing that is not caught by any tool and can be easily missed. We can do two things to mitigate this:
- define a CPP macro for reading Word32 with endianness handling and then use that macro everywhere. That way we need to include MachDeps only in the file defining that macro. Both byteswap and narrowing can be included in that macro so we do not miss it anywhere.
- Add a CI for s390x. We can copy it from unicode-transforms repo.
@@ -51,9 +51,9 @@ name (C# c#) = getName 0# 149185# | |||
indexInt32OffAddr'# k# = | |||
#ifdef WORDS_BIGENDIAN | |||
#if MIN_VERSION_base(4,16,0) | |||
word2Int# (byteSwap32# (word32ToWord# (indexWord32OffAddr# offsets# k#))) | |||
word2Int# (narrow32Word# (byteSwap32# (word32ToWord# (indexWord32OffAddr# offsets# k#)))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume byteSwap32# changes the byte ordering. The doc (https://hackage.haskell.org/package/base-4.17.0.0/docs/GHC-Exts.html#v:byteSwap32-35-) says:
Swap bytes in the lower 32 bits of a word. The higher bytes are undefined.
I do not understand what "swap bytes" means for 4 bytes, the term "swap" is applicable only to 2 bytes.
Is there a documentation somewhere about what narrow32Word# does? Does it zero the higher 32 bits? The doc of byteSwap32#
above says the "higher bytes are undefined" so perhaps that is what the bug was?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did not find much information. I found some hints in cryptonite
package.
My understanding is that byte swapping adapt the endianness and is not limited to 2 bytes. Here byteSwap32#
is for 4 bytes, and byteSwap16#
would be for 2, etc.
56e57e1
to
1a75a0b
Compare
These are already fixed.
I think the latter is better. I am giving a try. |
@harendra-kumar there is an issue with the CI for GHC 8.0.2 and 8.2.2. It seems they are not supported anymore. |
It see this in
Don't we need a call to narrowWord here? |
The installation seems to be failing, not sure what's wrong. We can investigate if someone asks for it, otherwise let's ignore it for now. |
- Add missing include "MachDeps.h" - Use narrow32Word# to ensure correct result after byteSwap32# on Big-Endian arch
Adapted from unicode-transforms
Try to spare memory for s390x CI.
@harendra-kumar After the experiment in #99, I think this is ready to merge if the relevant tests pass (let’s discard GHC ≤ 8.2). I reorganized and squashed some commits. Keep in mind that we can fully test only against GHC 9.6. And GHC support for Unicode < 12.1 is not reliable, because there are some discrepancies (I checked manually some errors). |
It seems you missed this #98 (comment) . Can you please respond to this. |
I think I did fix it. Are you using the latest version? |
@harendra-kumar kind reminder |
Looks like I missed it, maybe I did not pull it again or ignored the pull error. |
Fix for 32bits words on big-endian arch.
Fixes #97