Fix 32 bit compilation #322

amesgen · 2023-06-12T19:13:08Z

This is a rebased and fixed version of #296 (it does not compile for me on e.g. GHC 9.2 and 9.6). This should resolve #293 and resolve #309.

This PR also adds i386 CI jobs (also see #207); feel free to remove if you consider this out-of-scope.

My actual motivation for this PR is the GHC 9.6/9.8 WASM/WASI backend; I locally ensured that the tests pass, but I did not add a CI job as this still needs unmerged/unreleased PRs of transitive dependencies (concretely, haskellari/splitmix#73).

dcoutts

This would be easier to follow if we could split it into two patches: the first making it compile with older compilers (fixing the things that look like they were just broken and type incorrect), and then a patch to also make it build with newer compilers (that make the changes to the unboxed primitive numeric types).

dcoutts · 2023-09-01T10:14:53Z

cborg/src/Codec/CBOR/Decoding.hs

@@ -423,7 +424,7 @@ decodeWord64 =
 #if defined(ARCH_64bit)
  Decoder (\k -> return (ConsumeWord (\w# -> k (toWord64 w#))))
 #else
-  Decoder (\k -> return (ConsumeWord64 (\w64# -> k (toWord64 w64#))))
+  Decoder (\k -> return (ConsumeWord64 (\w64# -> k (W64# w64#))))


I don't understand why we're replacing our conversion helper function toWord64 with a concrete implementation. And indeed deleting the helper function. Why not just make the helper function do the right thing?

Will change, see https://github.com/well-typed/cborg/pull/322/files#r1320790119

dcoutts · 2023-09-01T10:17:40Z

cborg/src/Codec/CBOR/Decoding.hs

-#else
-toInt64  n = I64# (intToInt64# n)
-toWord64 n = W64# (wordToWord64# n)


Why are we deleting these rather than fixing them? It's breaking the pattern used below. And it also means we cannot support multiple compiler versions that need differences in these functions.

Surely we should just use CPP here to change the impl of toWord64 and toInt64 so that with older GHC versions they're implemented as now, and with newer versions (where the unboxed integer types changed) to use the new implementation (that's inlined at the use site below) of

toInt64 n = I64# n toWord64 n = W64# n

I think I just removed these compat shims as they were not strictly necessary, but I will add them back for consistency and future-proofing.

dcoutts · 2023-09-01T10:19:39Z

cborg/src/Codec/CBOR/Decoding.hs

 toWord8  :: Word# -> Word8
 toWord16 :: Word# -> Word16
 toWord32 :: Word# -> Word32
+#if WORD_SIZE_IN_BITS == 64


For consistency, can we use the same macro as everywhere else: either ARCH_64bit or ARCH_32bit.

Right, will also change the existing usage of WORD_SIZE_IN_BITS 👍

dcoutts · 2023-09-01T10:21:09Z

cborg/src/Codec/CBOR/Decoding.hs

@@ -986,7 +987,7 @@ type ByteOffset = Int64
 -- @since 0.2.2.0
 peekByteOffset :: Decoder s ByteOffset
 peekByteOffset = Decoder (\k -> return (PeekByteOffset (\off# -> k (I64#
-#if MIN_VERSION_base(4,17,0)
+#if MIN_VERSION_base(4,17,0) && !defined(ARCH_32bit)


Surely this can't be quite the right logic, because 32bit arches prior to some ghc version did need intToInt64# didn't they?

PeekByteOffset always provides an Int64# on 32bit:

cborg/cborg/src/Codec/CBOR/Decoding.hs

Lines 191 to 195 in c8013b3

#if defined(ARCH_32bit)

| PeekByteOffset (Int64# -> ST s (DecodeAction s a))

#else

| PeekByteOffset (Int# -> ST s (DecodeAction s a))

#endif

so this seems correct to me.

PeekByteOffset always provides an Int64# on 32bit

which would mean this logic here is wrong. The logic currently says "use intToInt64 if base >= 4.17 and only on 64bit arches".

But what you're saying above is that it should be "always use intToInt64 on 32bit arches", so the combo would be

#if MIN_VERSION_base(4,17,0) || defined(ARCH_32bit)

so that we use intToInt64 on 32bit arches (for all base versions), and we also use it on all arches from base 4.17 onwards.

Did we check this patch actually compiles on 32bit arches prior to ghc 9.2?

PeekByteOffset always provides an Int64# on 32bit

which would mean this logic here is wrong. The logic currently says "use intToInt64 if base >= 4.17 and only on 64bit arches".

But what you're saying above is that it should be "always use intToInt64 on 32bit arches", so the combo would be

#if MIN_VERSION_base(4,17,0) || defined(ARCH_32bit)

so that we use intToInt64 on 32bit arches (for all base versions), and we also use it on all arches from base 4.17 onwards.

I think the existing logic is exactly right (it also compiles on all GHC versions >=8.4), as seen eg by a case distinction.

The purpose of this code is to fill the hole _f in PeekByteOffset $ \off# -> k (I64# (_f off#)) (where k has input type ByteOffset ~ Int64).

On 32bit:
We have off# :: Int64# and I64# :: Int64# -> Int64 on all GHCs. Hence, we can fill _f = id.

On 64bit:
We have off# :: Int#.

On GHC < 9.4, we have I64# :: Int# -> Int64.
Hence, we can fill _f = id.

Since GHC >= 9.4, we have I64# :: Int64# -> Int64 (FTR: corresponding section in the migration guide for 9.4).
Hence, we can fill _f = intToInt64# :: Int# -> Int64#

Note that we can fill _f = id except when we are on GHC >=9.4 and on 64bit, which is exactly the existing condition.

Would be neat if there were some more principled/locally understandable alternative to the current CPP maze 😅

Did we check this patch actually compiles on 32bit arches prior to ghc 9.2?

Yes, to double check once again, I pushed to a branch in my fork with the second commit (Support GHC >=9.2 on 32bit) removed: https://github.com/amesgen/cborg/actions/runs/6183531233
One can see that i686 compilation with all GHCs from 8.4 to 9.0 succeeds (as expected, compilation on >=9.2 fails as the respective commit was removed).
Also, here is the CI run that everything works on all GHCs with all three commits: https://github.com/amesgen/cborg/actions/runs/6138762687

dcoutts · 2023-09-01T10:28:48Z

cborg/src/Codec/CBOR/Read.hs

-    DecodedToken sz i@(I64# i#)
-      | isInt64Canonical sz i  -> k i# >>= go_fast (BS.unsafeDrop sz bs)
+    DecodedToken sz (I64# i#)
+      | isInt64Canonical sz i# -> k i# >>= go_fast (BS.unsafeDrop sz bs)


I'm confused by this. Did this simply never work?

I would have to do some git blame archaeology to find out whether this worked in the past due to different type signatures or was just introduced without any testing; even more extremely, the code didn't even parse on 32bit (introduced in #273) previously, so this code went without any testing since at least then.

dcoutts · 2023-09-01T10:29:26Z

cborg/src/Codec/CBOR/Read.hs

@@ -1796,7 +1796,7 @@ tryConsumeInteger hdr !bs = case word8ToWord hdr of
  0x1b -> let !w = eatTailWord64 bs
              sz = 9
 #if defined(ARCH_32bit)
-          in DecodedToken sz (BigIntToken (isWord64Canonical sz (word64ToWord w)) $! toInteger w)
+          in DecodedToken sz (BigIntToken (isWord64Canonical sz w) $! toInteger w)


Also looks like this couldn't ever have been type correct.

see #322 (comment)

amesgen · 2023-09-10T18:34:13Z

This would be easier to follow if we could split it into two patches: the first making it compile with older compilers (fixing the things that look like they were just broken and type incorrect), and then a patch to also make it build with newer compilers (that make the changes to the unboxed primitive numeric types).

Good idea, did that in the latest push (also addressed most comments) 👍

dcoutts

I'm still struggling to follow this, and I'm not totally convinced the first patch does compile on 32bit arches. I'd like to try that.

dcoutts · 2023-09-14T08:33:29Z

cborg/src/Codec/CBOR/Decoding.hs

+#if defined(ARCH_64bit)
+toInt64  n = I64# n
+toWord64 n = W64# n
+#endif
+#endif
+#if defined(ARCH_32bit)
+toInt64  n = I64# n


It's unclear why this identity change is needed. The definitions in the 64 and 32bit cases here are the same as each other, and they are the same definitions as what where here before this code change.

I was half expecting it to be an identity refactoring in preparation for the next patch, but it's not. Was this intended to be something else? Otherwise it has no effect but makes the logic more complicated.

they are the same definitions as what where here before this code change.

No, they are different for 32bit and GHC >=9.4: Before this change, this meant that we compiled to toInt64 n = I64# (intToInt64# n) (incorrect, see #322 (comment)), and with this change, to toInt64 n = I64# n (correct).

But I just realized that this diff belongs into the second, not the first commit; will change that and also try to simplify the logic a bit.

dcoutts · 2023-09-14T08:40:59Z

cborg/src/Codec/CBOR/Decoding.hs

@@ -986,7 +987,7 @@ type ByteOffset = Int64
 -- @since 0.2.2.0
 peekByteOffset :: Decoder s ByteOffset
 peekByteOffset = Decoder (\k -> return (PeekByteOffset (\off# -> k (I64#
-#if MIN_VERSION_base(4,17,0)
+#if MIN_VERSION_base(4,17,0) && !defined(ARCH_32bit)


PeekByteOffset always provides an Int64# on 32bit

which would mean this logic here is wrong. The logic currently says "use intToInt64 if base >= 4.17 and only on 64bit arches".

But what you're saying above is that it should be "always use intToInt64 on 32bit arches", so the combo would be

#if MIN_VERSION_base(4,17,0) || defined(ARCH_32bit)

so that we use intToInt64 on 32bit arches (for all base versions), and we also use it on all arches from base 4.17 onwards.

Did we check this patch actually compiles on 32bit arches prior to ghc 9.2?

amesgen · 2023-09-14T10:24:01Z

I'm still struggling to follow this, and I'm not totally convinced the first patch does compile on 32bit arches. I'd like to try that.

Yes, I found it very hard to modify these CPP-overflowing modules without constantly double-checking everything by actually compiling it. See #322 (comment) at the bottom for CI runs confirming that everything compiles as epxected on various GHCs on 32bit.

dcoutts reviewed Sep 1, 2023

View reviewed changes

amesgen force-pushed the restore-32bit branch from db7d049 to 8011748 Compare September 10, 2023 18:34

dcoutts reviewed Sep 14, 2023

View reviewed changes

amesgen force-pushed the restore-32bit branch from 8011748 to f490e01 Compare September 14, 2023 10:46

amesgen added 3 commits November 14, 2023 19:48

Fix compilation on 32bit

a4757c4

Support GHC >=9.2 on 32bit

94a856e

i386 CI

c3b5c69

amesgen force-pushed the restore-32bit branch from f490e01 to c3b5c69 Compare November 18, 2023 15:49

amesgen mentioned this pull request Jul 23, 2024

How to build other packages for wasm32? tweag/ghc-wasm-miso-examples#17

Open

amesgen mentioned this pull request Aug 1, 2024

Allow building on 32bit platforms #337

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix 32 bit compilation #322

Fix 32 bit compilation #322

amesgen commented Jun 12, 2023 •

edited

Loading

dcoutts left a comment

dcoutts Sep 1, 2023

amesgen Sep 10, 2023

dcoutts Sep 1, 2023

amesgen Sep 10, 2023

dcoutts Sep 1, 2023

amesgen Sep 10, 2023

dcoutts Sep 1, 2023

amesgen Sep 10, 2023

dcoutts Sep 14, 2023

amesgen Sep 14, 2023

dcoutts Sep 1, 2023

amesgen Sep 10, 2023

dcoutts Sep 1, 2023

amesgen Sep 10, 2023 •

edited

Loading

amesgen commented Sep 10, 2023

dcoutts left a comment

dcoutts Sep 14, 2023

amesgen Sep 14, 2023

dcoutts Sep 14, 2023

amesgen commented Sep 14, 2023

	#if defined(ARCH_32bit)
	\| PeekByteOffset (Int64# -> ST s (DecodeAction s a))
	#else
	\| PeekByteOffset (Int# -> ST s (DecodeAction s a))
	#endif

Fix 32 bit compilation #322

Are you sure you want to change the base?

Fix 32 bit compilation #322

Conversation

amesgen commented Jun 12, 2023 • edited Loading

dcoutts left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

amesgen Sep 10, 2023 • edited Loading

Choose a reason for hiding this comment

amesgen commented Sep 10, 2023

dcoutts left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

amesgen commented Sep 14, 2023

amesgen commented Jun 12, 2023 •

edited

Loading

amesgen Sep 10, 2023 •

edited

Loading