Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

valid doesn't seem to work as expected with -O0 #459

Closed
sjakobi opened this issue May 9, 2022 · 5 comments
Closed

valid doesn't seem to work as expected with -O0 #459

sjakobi opened this issue May 9, 2022 · 5 comments

Comments

@sjakobi
Copy link
Member

sjakobi commented May 9, 2022

I ran into this issue while working on #458. I'm using the sjakobi/valid-O0 branch to investigate.

To reproduce:

$ cabal test --test-show-details=direct --test-options ' --quickcheck-tests 4 -p Lazy.alterF.valid --quickcheck-max-size 100 --quickcheck-verbose --quickcheck-replay=836177' -w ghc-9.2
All
  Properties
    Data.HashMap.Lazy
      alterF
        valid: FAIL
          Passed:
          <fun>
          K {hash = 1, _x = C}
          fromList []
          [] == []
          
          Failed:
          <fun>
          K {hash = 9, _x = A}
          fromList [(K {hash = -724960, _x = D},2),(K {hash = 1, _x = D},12),(K {hash = 8, _x = A},7),(K {hash = 2305843009213693960, _x = C},9),(K {hash = 264, _x = C},10),(K {hash = 9, _x = A},1),(K {hash = 1961, _x = C},10),(K {hash = 1482, _x = B},6),(K {hash = -22, _x = B},25),(K {hash = 1111, _x = A},26)]
          [Invalid (INV6_misplaced_hash 0) (SubHashPath {partialHash = 9, lengthInBits = 10}),Invalid (INV6_misplaced_hash 0) (SubHashPath {partialHash = 9, lengthInBits = 10}),Invalid (INV6_misplaced_hash 0) (SubHashPath {partialHash = 9, lengthInBits = 10}),Valid,Invalid (INV6_misplaced_hash 0) (SubHashPath {partialHash = 9, lengthInBits = 10}),Invalid (INV6_misplaced_hash 0) (SubHashPath {partialHash = 9, lengthInBits = 10})] /= [Valid,Valid,Valid,Valid,Valid,Valid]

<lots of shrinking>

          *** Failed! Falsified (after 2 tests and 23 shrinks):
          {_->[Just 1]}
          K {hash = 8, _x = A}
          fromList [(K {hash = 0, _x = A},1),(K {hash = 8, _x = A},1)]
          [Invalid (INV6_misplaced_hash 0) (SubHashPath {partialHash = 8, lengthInBits = 5})] /= [Valid]
          Use --quickcheck-replay=836177 to reproduce.

The code used by alterF here is actually buggy (fixed in 1e8ca8c and 418aa71) so the test failure is expected.

However if I run the same command and add -O0, we apparently get the same test case, but valid reports the resulting HashMaps to be Valid:

$ cabal test --test-show-details=direct --test-options ' --quickcheck-tests 4 -p Lazy.alterF.valid --quickcheck-max-size 100 --quickcheck-verbose --quickcheck-replay=836177' -w ghc-9.2 -O0
All
  Properties
    Data.HashMap.Lazy
      alterF
        valid: OK
          Passed:
          <fun>
          K {hash = 1, _x = C}
          fromList []
          [] == []
          
          Passed:
          <fun>
          K {hash = 9, _x = A}
          fromList [(K {hash = -724960, _x = D},2),(K {hash = 1, _x = D},12),(K {hash = 8, _x = A},7),(K {hash = 2305843009213693960, _x = C},9),(K {hash = 264, _x = C},10),(K {hash = 9, _x = A},1),(K {hash = 1961, _x = C},10),(K {hash = 1482, _x = B},6),(K {hash = -22, _x = B},25),(K {hash = 1111, _x = A},26)]
          [Valid,Valid,Valid,Valid,Valid,Valid] == [Valid,Valid,Valid,Valid,Valid,Valid]

<two more tests>

I've tried to increase the number of tests and the size of the test cases but I still can't get the testsuite to fail with -O0.

I have tried to reproduce the shrunk failure case in GHCi but that's reported as Valid too.

EDIT: I'm getting the same behaviour with GHC 8.2.2, so it doesn't seem to be a recent regression in GHC.

@sjakobi
Copy link
Member Author

sjakobi commented May 9, 2022

My initial suspicion is that GHC somehow produces incorrect code for valid. There's some bit-twiddling going on that may rely on some optimizations to actually be correct?!

-- | A part of a 'Hash' with 'bitsPerSubkey' bits.
type SubHash = Word
data SubHashPath = SubHashPath
{ partialHash :: !Word
-- ^ The bits we already know, starting from the lower bits.
-- The unknown upper bits are @0@.
, lengthInBits :: !Int
-- ^ The number of bits known.
} deriving (Eq, Show)
initialSubHashPath :: SubHashPath
initialSubHashPath = SubHashPath 0 0
addSubHash :: SubHashPath -> SubHash -> SubHashPath
addSubHash (SubHashPath ph l) sh =
SubHashPath (ph .|. (sh `unsafeShiftL` l)) (l + bitsPerSubkey)
hashMatchesSubHashPath :: SubHashPath -> Hash -> Bool
hashMatchesSubHashPath (SubHashPath ph l) h = maskToLength h l == ph
where
-- Note: This needs to use `shiftL` instead of `unsafeShiftL` because
-- @l'@ may be greater than 32/64 at the deepest level.
maskToLength h' l' = h' .&. complement (complement 0 `shiftL` l')
valid :: Hashable k => HashMap k v -> Validity k
valid Empty = Valid
valid t = validInternal initialSubHashPath t
where
validInternal p Empty = Invalid INV1_internal_Empty p
validInternal p (Leaf h l) = validHash p h <> validLeaf p h l
validInternal p (Collision h ary) = validHash p h <> validCollision p h ary
validInternal p (BitmapIndexed b ary) = validBitmapIndexed p b ary
validInternal p (Full ary) = validFull p ary
validHash p h | hashMatchesSubHashPath p h = Valid
| otherwise = Invalid (INV6_misplaced_hash h) p
validLeaf p h (L k _) | hash k == h = Valid
| otherwise = Invalid (INV7_key_hash_mismatch k h) p
validCollision p h ary = validCollisionSize <> A.foldMap (validLeaf p h) ary <> distinctKeys
where
n = A.length ary
validCollisionSize | n < 2 = Invalid (INV9_Collision_size n) p
| otherwise = Valid
distinctKeys = A.foldMap (\(L k _) -> appearsOnce k) ary
appearsOnce k | A.foldMap (\(L k' _) -> if k' == k then Sum @Int 1 else Sum 0) ary == 1 = Valid
| otherwise = Invalid (INV10_Collision_duplicate_key k h) p
validBitmapIndexed p b ary = validBitmap <> validArraySize <> validSubTrees p b ary
where
validBitmap | b .&. complement fullBitmap == 0 = Valid
| otherwise = Invalid (INV2_Bitmap_unexpected_1_bits b) p
n = A.length ary
validArraySize | n < 1 || n >= maxChildren = Invalid (INV3_bad_BitmapIndexed_size n) p
| popCount b == n = Valid
| otherwise = Invalid (INV4_bitmap_array_size_mismatch b n) p
validSubTrees p b ary
| A.length ary == 1
, isLeafOrCollision (A.index ary 0)
= Invalid INV5_BitmapIndexed_invalid_single_subtree p
| otherwise = go b
where
go 0 = Valid
go b' = validInternal (addSubHash p (fromIntegral c)) (A.index ary i) <> go b''
where
c = countTrailingZeros b'
m = 1 `unsafeShiftL` c
i = sparseIndex b m
b'' = b' .&. complement m
validFull p ary = validArraySize <> validSubTrees p fullBitmap ary
where
n = A.length ary
validArraySize | n == maxChildren = Valid
| otherwise = Invalid (INV8_bad_Full_size n) p

@treeowl
Copy link
Collaborator

treeowl commented May 9, 2022

I don't understand why your tweaks were necessary. Shouldn't those hashes always be the same??? As for optimizations, I think there are some RULES for alterF; those will only fire with optimizations enabled. Maybe you're running into a problem with the code they rewrite to?

@sjakobi
Copy link
Member Author

sjakobi commented May 9, 2022

I don't understand why your tweaks were necessary. Shouldn't those hashes always be the same???

The optimizations I'm trying in #458 actually manipulate the key-hashes themselves, so they end up different from the stores hashes in the tree. This will need further cleanup to reduce confusion. See #458.

As for optimizations, I think there are some RULES for alterF; those will only fire with optimizations enabled. Maybe you're running into a problem with the code they rewrite to?

Ah, that's it! :) The default alterF implementation doesn't use the buggy code, so to reproduce the issue this rule must fire:

"alterFWeird" forall f. alterF f =
   alterFWeird (f Nothing) (f (Just test_bottom)) f

@sjakobi sjakobi closed this as completed May 9, 2022
@treeowl
Copy link
Collaborator

treeowl commented May 9, 2022

I don't understand the optimization you're trying to do. The hashes change somehow??

@sjakobi
Copy link
Member Author

sjakobi commented May 9, 2022

I don't understand the optimization you're trying to do. The hashes change somehow??

Let me finish #458. That should clear things up.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants