Create a more compact TxOut using unpacked Word64s #2534

Merged

DavidEichmann merged 4 commits into master from davide/simple_smaller_TxOut

Nov 8, 2021

Contributor

DavidEichmann commented Oct 28, 2021

I've special cased when the address hash size is 28 bytes and the data
hash size is 32 bytes and the value is Ada only.
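
To illustrate the special case described above, here is a minimal sketch of what such a layout could look like. The names `TxOutSketch`, `TxOutGeneral`, `TxOutAdaOnly28` and `adaAmount` are hypothetical and not taken from the PR; the sketch only shows that strict `Word64` fields marked `UNPACK` are stored inline in the constructor instead of behind pointers.

```haskell
import Data.Word (Word64)

-- Hypothetical stand-in type; the real ledger TxOut is more involved.
data TxOutSketch
  = -- General case: placeholder boxed fields (address words, coin amount).
    TxOutGeneral [Word64] Integer
  | -- Special case: 28-byte address hash and an Ada-only value.
    -- Three full words of hash, a fourth word holding the remaining
    -- 4 hash bytes plus flag bits, and the Ada amount, all stored inline.
    TxOutAdaOnly28
      {-# UNPACK #-} !Word64
      {-# UNPACK #-} !Word64
      {-# UNPACK #-} !Word64
      {-# UNPACK #-} !Word64
      {-# UNPACK #-} !Word64
  deriving (Eq, Show)

-- Read the Ada amount back out of either representation.
adaAmount :: TxOutSketch -> Integer
adaAmount (TxOutGeneral _ coin) = coin
adaAmount (TxOutAdaOnly28 _ _ _ _ coin) = toInteger coin
```

Keeping a general constructor next to the compact one means outputs that do not fit the special case can still be represented, at the cost of a tag check on every access.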

DavidEichmann changed the title ~~Create a more compact TxOut using unpacked Word64s~~ WIP Create a more compact TxOut using unpacked Word64s

DavidEichmann force-pushed the davide/simple_smaller_TxOut branch from d427b05 to 5729ea5 Compare

October 28, 2021 22:43

Collaborator

lehins commented Oct 28, 2021

What's all the bit twiddling about? Is there any reason you can't use PackedBytes, exposed in IntersectMBO/cardano-base#243?

dcoutts reviewed

Contributor

dcoutts left a comment

Nice. I realise we cannot use PackedBytes (because it cannot itself be unpacked, being a multi-constructor type), but I think the code would be a little cleaner if we factored out a 4-word, 28-byte hash type (which would be single-constructor and thus unpackable). It'd give us the same representation, but would factor the code more concisely.
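
The factoring suggested here could look roughly like the following sketch. `Hash28` and `CompactOut` are hypothetical names invented for illustration; the assumption, following the comment above, is that a single-constructor type with strict fields can be unpacked into an enclosing constructor.

```haskell
import Data.Word (Word64)

-- Hypothetical single-constructor 28-byte hash: three full words plus
-- one word that holds the final 4 bytes.
data Hash28 = Hash28
  {-# UNPACK #-} !Word64
  {-# UNPACK #-} !Word64
  {-# UNPACK #-} !Word64
  {-# UNPACK #-} !Word64
  deriving (Eq, Show)

-- Because Hash28 has exactly one constructor, it can be unpacked into
-- an enclosing record, so its four words sit inline in CompactOut.
data CompactOut = CompactOut
  { coAddr :: {-# UNPACK #-} !Hash28
  , coAda :: {-# UNPACK #-} !Word64
  }
  deriving (Eq, Show)
```

The in-memory representation should match five inline `Word64` fields, while the conversion code only has to deal with `Hash28` in one place.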

Contributor Author

DavidEichmann commented Oct 29, 2021

Hmm, I have another branch that tries to extract things into something like PackedBytes, but with a single constructor so that it's unpackable. I'll update this PR to use that. I'm busy at the moment, so expect an update on Tuesday evening.

lehins requested changes

Collaborator

lehins left a comment

Here are the more detailed suggestions that I meant in my #2534 (comment)

eras/alonzo/impl/src/Cardano/Ledger/Alonzo/TxBody.hs Outdated

Comment on lines 229 to 265

+                  go :: Hash (CC.ADDRHASH crypto) a -> (Word64, Word64, Word64, Word64)
+                  go h = case fmap fromIntegral (BS.unpack (hashToBytes h)) of
+                    [ b1,
+                      b2,
+                      b3,
+                      b4,
+                      b5,
+                      b6,
+                      b7,
+                      b8,
+                      b9,
+                      b10,
+                      b11,
+                      b12,
+                      b13,
+                      b14,
+                      b15,
+                      b16,
+                      b17,
+                      b18,
+                      b19,
+                      b20,
+                      b21,
+                      b22,
+                      b23,
+                      b24,
+                      b25,
+                      b26,
+                      b27,
+                      b28
+                      ] ->
+                        ( toWord64 b1 b2 b3 b4 b5 b6 b7 b8,
+                          toWord64 b9 b10 b11 b12 b13 b14 b15 b16,
+                          toWord64 b17 b18 b19 b20 b21 b22 b23 b24,
+                          toWord64 b25 b26 b27 b28 0 0 0 (networkBit .&. payCredTypeBit)
+                        )
+                    _ -> error "Impossible! Wrong number of bytes"

Collaborator

lehins Nov 1, 2021

By using PackedBytes we can avoid creating an intermediate ByteString and remove the need to convert hashes through lists, byte by byte, all of which is much slower than it has to be.

Suggested change

                go :: SizeHash (CC.ADDRHASH crypto) ~ 28 => Hash (CC.ADDRHASH crypto) a -> (Word64, Word64, Word64, Word64)
                go h =
                  case hashToPackedBytes h of
                    PackedBytes28 a64 b64 c64 d32 -> (a64, b64, c64, (fromIntegral d32 `shiftL` 32) .|. fromIntegral (networkBit .&. payCredTypeBit))
                    _ -> error "Impossible! Wrong number of bytes"

Contributor Author

DavidEichmann Nov 2, 2021

Yes! Great suggestion, I'll do that. I think the .&. is meant to be .|. instead. I'll also elaborate on the error.

Collaborator

lehins Nov 2, 2021

Oh yeah, you are right: .&. is meant to be .|. there. I was modifying your code when making the suggestion and didn't give that part much thought.
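
A small sketch of why the operator matters. The flag positions below are hypothetical (the thread does not show how `networkBit` and `payCredTypeBit` are defined), but for any two flags that occupy different bits, `.&.` yields zero and discards both, whereas `.|.` keeps both.

```haskell
import Data.Bits (shiftL, (.&.), (.|.))
import Data.Word (Word32, Word64)

-- Hypothetical flag masks occupying two different bits.
networkBit, payCredTypeBit :: Word64
networkBit = 1 `shiftL` 1
payCredTypeBit = 1 `shiftL` 0

-- Both flags combined with .&.: distinct bits never overlap, so this is 0.
flagsWithAnd :: Word64
flagsWithAnd = networkBit .&. payCredTypeBit

-- Both flags combined with .|.: each bit is preserved.
flagsWithOr :: Word64
flagsWithOr = networkBit .|. payCredTypeBit

-- Pack the final 4 hash bytes into the upper half of the word and the
-- flags into the low bits, as in the suggested change.
packLastWord :: Word32 -> Word64
packLastWord d32 = (fromIntegral d32 `shiftL` 32) .|. flagsWithOr
```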

eras/alonzo/impl/src/Cardano/Ledger/Alonzo/TxBody.hs Outdated

Comment on lines 278 to 285

+                unsafeMakeSafeHash $
+                  fromJust $
+                    hashFromBytes $
+                      BS.pack $
+                        [ fromIntegral (w64 `shiftR` offset)
+                          | w64 <- [a, b, c, d],
+                            offset <- [56, 48 .. 0]
+                        ]

Collaborator

lehins Nov 1, 2021

Same here. Using lists in such a critical part of the code to convert between hash representations could have a serious impact on performance. I also recommend that everyone ban fromJust from their vocabulary. Not only is it a partial function, it also provides no information about the location of the failure when it does happen (no matter how impossible that seems).

Suggested change

              hashFromPackedBytes (PackedBytes32 a b c d)

eras/alonzo/impl/src/Cardano/Ledger/Alonzo/TxBody.hs Outdated

Comment on lines 298 to 337

+              encodeDataHash32 dataHash = case fmap fromIntegral (BS.unpack (hashToBytes (extractHash dataHash))) of
+                [ b1,
+                  b2,
+                  b3,
+                  b4,
+                  b5,
+                  b6,
+                  b7,
+                  b8,
+                  b9,
+                  b10,
+                  b11,
+                  b12,
+                  b13,
+                  b14,
+                  b15,
+                  b16,
+                  b17,
+                  b18,
+                  b19,
+                  b20,
+                  b21,
+                  b22,
+                  b23,
+                  b24,
+                  b25,
+                  b26,
+                  b27,
+                  b28,
+                  b29,
+                  b30,
+                  b31,
+                  b32
+                  ] ->
+                    ( toWord64 b1 b2 b3 b4 b5 b6 b7 b8,
+                      toWord64 b9 b10 b11 b12 b13 b14 b15 b16,
+                      toWord64 b17 b18 b19 b20 b21 b22 b23 b24,
+                      toWord64 b25 b26 b27 b28 b29 b30 b31 b32
+                    )
+                _ -> error "Impossible! Wrong number of bytes"

Collaborator

lehins Nov 1, 2021

Suggested change

            encodeDataHash32 dataHash =
              case hashToPackedBytes (extractHash dataHash) of
                PackedBytes32 a b c d -> (a, b, c, d)

eras/alonzo/impl/src/Cardano/Ledger/Alonzo/TxBody.hs Outdated (resolved)

eras/alonzo/impl/src/Cardano/Ledger/Alonzo/TxBody.hs Outdated

               import Data.Coders
-              import Data.Maybe (fromMaybe)
+              import Data.Maybe (fromJust, fromMaybe)

Collaborator

lehins Nov 1, 2021

This function, along with others like head, tail, etc., should be banned from production code bases. It is better to use an explicit error: not only is it more descriptive, it also provides a stack trace.

Suggested change

            import Data.Maybe (fromMaybe)

Here is a good example that I saw recently that depicts the problem very well: haskell/haskell-language-server#1618
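
A minimal sketch of the alternative recommended here: replace `fromJust` with an explicit `error` that names the violated invariant. `parseExact` and `parseExactOrFail` are hypothetical helpers standing in for a conversion such as `hashFromBytes`; only `fromMaybe`, `error` and `HasCallStack` are real library names.

```haskell
import Data.Maybe (fromMaybe)
import GHC.Stack (HasCallStack)

-- Hypothetical parser: succeeds only on input of exactly the expected length.
parseExact :: Int -> [a] -> Maybe [a]
parseExact n xs
  | length xs == n = Just xs
  | otherwise = Nothing

-- Instead of fromJust, fail with a message that describes what went wrong.
-- The HasCallStack constraint lets the reported call stack include the
-- caller of this function, not just the location of `error` itself.
parseExactOrFail :: HasCallStack => Int -> [a] -> [a]
parseExactOrFail n xs =
  fromMaybe
    (error ("parseExactOrFail: expected " ++ show n ++ " elements, got " ++ show (length xs)))
    (parseExact n xs)
```

Where the caller can handle failure, returning the `Maybe` (or an `Either` with an error value) avoids the partial function altogether.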

eras/alonzo/impl/src/Cardano/Ledger/Alonzo/TxBody.hs Outdated

+                _ -> error "Impossible! Wrong number of bytes"
+              toWord64 :: Word8 -> Word8 -> Word8 -> Word8 -> Word8 -> Word8 -> Word8 -> Word8 -> Word64
+              toWord64 b1 b2 b3 b4 b5 b6 b7 b8 =

Collaborator

lehins Nov 1, 2021

This function can be removed with the suggested changes in this review.

DavidEichmann force-pushed the davide/simple_smaller_TxOut branch 2 times, most recently from 43b5e30 to 0ec274f Compare

November 2, 2021 20:06

DavidEichmann closed this

DavidEichmann force-pushed the davide/simple_smaller_TxOut branch from 0ec274f to 1419b2a Compare

November 2, 2021 20:22

DavidEichmann reopened this

DavidEichmann force-pushed the davide/simple_smaller_TxOut branch from 7a7a3ef to 00da965 Compare

November 2, 2021 20:36

Contributor Author

DavidEichmann commented Nov 2, 2021 (edited)

@lehins after much fiddling with getting my gpg keys working, I've updated this PR with your suggestions.

DavidEichmann requested a review from lehins

November 2, 2021 20:38

lehins reviewed

eras/alonzo/impl/src/Cardano/Ledger/Alonzo/TxBody.hs Outdated

+                  addrHash :: Hash (CC.ADDRHASH crypto) a
+                  addrHash =
+                    hashFromPackedBytes $
+                      PackedBytes32 a b c (fromIntegral (d `shiftR` offset))

Collaborator

lehins Nov 2, 2021

I'll be really surprised if this compiles. PackedBytes32 is the constructor for a 32-byte hash, but we are decoding an address with 28 bytes. Also, offset doesn't seem to be defined.
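
For the decoding direction, a sketch of how the fourth word could be split back apart. It assumes the layout from the earlier suggested change (the last 4 hash bytes in the upper 32 bits, flags in the low bits); the flag positions and the function names are hypothetical.

```haskell
import Data.Bits (shiftR, testBit)
import Data.Word (Word32, Word64)

-- Recover the last 4 bytes of the 28-byte hash from the upper half of
-- the fourth word; the result fits the Word32 field of a 28-byte
-- packed representation.
lastHashBytes :: Word64 -> Word32
lastHashBytes d = fromIntegral (d `shiftR` 32)

-- Hypothetical flag positions in the low bits of the same word.
hasNetworkBit, hasPayCredTypeBit :: Word64 -> Bool
hasNetworkBit d = testBit d 1
hasPayCredTypeBit d = testBit d 0
```

With a fixed shift of 32 there is no need for a separate offset variable, and the `Word32` result matches the shape of a 28-byte constructor rather than a 32-byte one.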

lehins reviewed

eras/alonzo/impl/src/Cardano/Ledger/Alonzo/TxBody.hs (resolved)

Collaborator

lehins commented Nov 2, 2021

Looks much better! 👍 However, there are still too many partial functions and some compilation errors that should be easy to fix.

So, here is a challenge: it is possible to get rid of all partial functions in this PR. If you aren't able to get to it tomorrow, I'll make a commit with the fix; I just don't have any more time to get to it today.

DavidEichmann force-pushed the davide/simple_smaller_TxOut branch from 00da965 to 813d175 Compare

November 2, 2021 23:04

Contributor Author

DavidEichmann commented Nov 2, 2021 (edited)

In terms of partial functions, there is valueToCompactErr, and in one other place I assume that compacting a Value will always succeed. I think this is an assumption we already make in master: https://github.com/input-output-hk/cardano-ledger-specs/blob/master/eras/alonzo/impl/src/Cardano/Ledger/Alonzo/TxBody.hs#L174. Do you have some other solution in mind? I guess we could have a second, non-partial version of the Compactible class with toCompact :: a -> CompactForm a, but that seems out of scope for this PR.

I also have error cases when matching on PackedBytes even though the number of bytes is known statically in the type. This is because the PackedBytes# constructor is defined for all sizes, so we must have a catch-all case and rely on the invariant that we don't use PackedBytes# when we can use the unpacked constructors. One solution would be to reimplement the constructors of PackedBytes as pattern synonyms that discharge the invariant, as follows. Thoughts? Or is there something else I'm missing?

{-# COMPLETE  PackedBytes32 #-}
pattern PackedBytes32 :: Word64 -> Word64 -> Word64 -> Word64 -> PackedBytes 32
pattern PackedBytes32 a b c d = view32 -> (a, b, c, d)

view32 :: PackedBytes 32 -> (Word64, Word64, Word64, Word64)
view32 (PackedBytes32 a b c d) = (a,b,c,d)
view32 (PackedBytes# _) = error "Impossible!"
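
A self-contained variant of this idea, as a sketch rather than the actual cardano-base definitions. `Packed`, `Packed32Raw`, `PackedGeneric` and `toWords` are hypothetical names, and the generic constructor holds a list instead of a raw byte array. Two details differ from the snippet above: the view pattern is written with `<-` plus an explicit builder, and the underlying constructor has a different name from the synonym so that `view32` does not refer to itself.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE PatternSynonyms #-}
{-# LANGUAGE ViewPatterns #-}

import Data.Word (Word64)
import GHC.TypeLits (Nat)

-- Hypothetical miniature of a size-indexed packed representation:
-- one specialised constructor for 32 bytes and a generic fallback.
data Packed (n :: Nat) where
  Packed32Raw :: Word64 -> Word64 -> Word64 -> Word64 -> Packed 32
  PackedGeneric :: [Word64] -> Packed n

-- The single place where the invariant "32-byte values never use the
-- generic constructor" is discharged.
view32 :: Packed 32 -> (Word64, Word64, Word64, Word64)
view32 (Packed32Raw a b c d) = (a, b, c, d)
view32 (PackedGeneric _) =
  error "view32: invariant violated, generic constructor used for 32 bytes"

-- Explicitly bidirectional pattern synonym: matching goes through
-- view32, construction uses the specialised constructor.
pattern Packed32 :: Word64 -> Word64 -> Word64 -> Word64 -> Packed 32
pattern Packed32 a b c d <- (view32 -> (a, b, c, d))
  where
    Packed32 a b c d = Packed32Raw a b c d

{-# COMPLETE Packed32 #-}

-- Callers can now match without writing a catch-all case.
toWords :: Packed 32 -> (Word64, Word64, Word64, Word64)
toWords (Packed32 a b c d) = (a, b, c, d)
```

The partial match still exists, but it is confined to `view32`; call sites such as `toWords` no longer need their own "Impossible!" branch.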

DavidEichmann changed the title ~~WIP Create a more compact TxOut using unpacked Word64s~~ Create a more compact TxOut using unpacked Word64s

DavidEichmann requested a review from lehins

November 3, 2021 09:28

lehins approved these changes

Collaborator

lehins left a comment

👍

DavidEichmann and others added 4 commits

November 5, 2021 15:48


          Create a more compact TxOut using unpacked Word64s

9725aba

I've special cased when the address hash size is 28 bytes and the data
hash size is 32 bytes and the value is Ada only.


          Remove majority of partial functions

ac80ced


          Fix old pattern synonyms for TxOut

34fc893


          Do not uncompact Null and Ptr staking credentials

47f2a81

DavidEichmann force-pushed the davide/simple_smaller_TxOut branch from 60fd847 to 47f2a81 Compare

November 5, 2021 15:48

DavidEichmann merged commit 0411c3e into master

iohk-bors bot deleted the davide/simple_smaller_TxOut branch

November 8, 2021 12:14

lehins mentioned this pull request

Improve tx out compacting #2553

Merged
