Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create a more compact TxOut using unpacked Word64s #2534

Merged
merged 4 commits into from
Nov 8, 2021

Conversation

DavidEichmann
Copy link
Contributor

I've special cased when the address hash size is 28 bytes and the data
hash size is 32 bytes and the value is Ada only.

@DavidEichmann DavidEichmann changed the title Create a more compact TxOut using unpacked Word64s WIP Create a more compact TxOut using unpacked Word64s Oct 28, 2021
@lehins
Copy link
Collaborator

lehins commented Oct 28, 2021

Whats all the bit twiddling is about? Is there any reason you can't use PackedBytes exposed in IntersectMBO/cardano-base#243

Copy link
Contributor

@dcoutts dcoutts left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice. I realise we cannot use PackedBytes (because that itself cannot be unpacked due to it being a multi-constructor type), but I think the code would be a little cleaner if we factored out a 4-word 28byte hash type (which would be single-constructor and thus unpackable). It'd give us the same representation, but would factor the code more concisely.

@DavidEichmann
Copy link
Contributor Author

Humm I have another branch that tries to extract things into something like PackedBytes but with a single constructor so it's unpackable. I'll update this PR to use that. I'm busy at the moment, so expect an update on Tuesday evening.

Copy link
Collaborator

@lehins lehins left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here are more detailed suggestions that I meant by my #2534 (comment)

Comment on lines 229 to 265
go :: Hash (CC.ADDRHASH crypto) a -> (Word64, Word64, Word64, Word64)
go h = case fmap fromIntegral (BS.unpack (hashToBytes h)) of
[ b1,
b2,
b3,
b4,
b5,
b6,
b7,
b8,
b9,
b10,
b11,
b12,
b13,
b14,
b15,
b16,
b17,
b18,
b19,
b20,
b21,
b22,
b23,
b24,
b25,
b26,
b27,
b28
] ->
( toWord64 b1 b2 b3 b4 b5 b6 b7 b8,
toWord64 b9 b10 b11 b12 b13 b14 b15 b16,
toWord64 b17 b18 b19 b20 b21 b22 b23 b24,
toWord64 b25 b26 b27 b28 0 0 0 (networkBit .&. payCredTypeBit)
)
_ -> error "Impossible! Wrong number of bytes"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By using PackedBytes we can avoid creating an intermediate ByteString and remove the need to convert hashes through lists and doing it byte-by-byte, all of which is much slower than it has to be.

Suggested change
go :: Hash (CC.ADDRHASH crypto) a -> (Word64, Word64, Word64, Word64)
go h = case fmap fromIntegral (BS.unpack (hashToBytes h)) of
[ b1,
b2,
b3,
b4,
b5,
b6,
b7,
b8,
b9,
b10,
b11,
b12,
b13,
b14,
b15,
b16,
b17,
b18,
b19,
b20,
b21,
b22,
b23,
b24,
b25,
b26,
b27,
b28
] ->
( toWord64 b1 b2 b3 b4 b5 b6 b7 b8,
toWord64 b9 b10 b11 b12 b13 b14 b15 b16,
toWord64 b17 b18 b19 b20 b21 b22 b23 b24,
toWord64 b25 b26 b27 b28 0 0 0 (networkBit .&. payCredTypeBit)
)
_ -> error "Impossible! Wrong number of bytes"
go :: SizeHash (CC.ADDRHASH crypto) ~ 28 => Hash (CC.ADDRHASH crypto) a -> (Word64, Word64, Word64, Word64)
go h =
case hashToPackedBytes h of
PackedBytes28 a64 b64 c64 d32 -> (a64, b64, c64, (fromIntegral d32 `shiftL` 32) .|. fromIntegral (networkBit .&. payCredTypeBit))
_ -> error "Impossible! Wrong number of bytes"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes! great suggestion. I'll do that. I think that .&. is meant to be a .|. I'll also elaborate on the error too.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yeah, you are right .&. is meant to be .|.. I was modifying your code when making the suggestion and didn't give that part much thought.

Comment on lines 278 to 285
unsafeMakeSafeHash $
fromJust $
hashFromBytes $
BS.pack $
[ fromIntegral (w64 `shiftR` offset)
| w64 <- [a, b, c, d],
offset <- [56, 48 .. 0]
]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here. Using lists in such a critical part of code for conversions of hash representation could have serious impact on performance. Also I recommend everyone to ban fromJust from their vocabulary. Not only it is a partial function, but it also does not provide any information on the location of the failure, when it does happen (no matter how impossible it is)

Suggested change
unsafeMakeSafeHash $
fromJust $
hashFromBytes $
BS.pack $
[ fromIntegral (w64 `shiftR` offset)
| w64 <- [a, b, c, d],
offset <- [56, 48 .. 0]
]
hashFromPackedBytes (PackedBytes32 a b c d)

Comment on lines 298 to 337
encodeDataHash32 dataHash = case fmap fromIntegral (BS.unpack (hashToBytes (extractHash dataHash))) of
[ b1,
b2,
b3,
b4,
b5,
b6,
b7,
b8,
b9,
b10,
b11,
b12,
b13,
b14,
b15,
b16,
b17,
b18,
b19,
b20,
b21,
b22,
b23,
b24,
b25,
b26,
b27,
b28,
b29,
b30,
b31,
b32
] ->
( toWord64 b1 b2 b3 b4 b5 b6 b7 b8,
toWord64 b9 b10 b11 b12 b13 b14 b15 b16,
toWord64 b17 b18 b19 b20 b21 b22 b23 b24,
toWord64 b25 b26 b27 b28 b29 b30 b31 b32
)
_ -> error "Impossible! Wrong number of bytes"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
encodeDataHash32 dataHash = case fmap fromIntegral (BS.unpack (hashToBytes (extractHash dataHash))) of
[ b1,
b2,
b3,
b4,
b5,
b6,
b7,
b8,
b9,
b10,
b11,
b12,
b13,
b14,
b15,
b16,
b17,
b18,
b19,
b20,
b21,
b22,
b23,
b24,
b25,
b26,
b27,
b28,
b29,
b30,
b31,
b32
] ->
( toWord64 b1 b2 b3 b4 b5 b6 b7 b8,
toWord64 b9 b10 b11 b12 b13 b14 b15 b16,
toWord64 b17 b18 b19 b20 b21 b22 b23 b24,
toWord64 b25 b26 b27 b28 b29 b30 b31 b32
)
_ -> error "Impossible! Wrong number of bytes"
encodeDataHash32 dataHash =
case hashToPackedBytes (extractHash dataHash) of
PackedBytes32 a b c d -> (a, b, c, d)

eras/alonzo/impl/src/Cardano/Ledger/Alonzo/TxBody.hs Outdated Show resolved Hide resolved
import Data.Coders
import Data.Maybe (fromMaybe)
import Data.Maybe (fromJust, fromMaybe)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function along with other like head, tail, etc. should be banned in production code base. It is better to use an explicit error, because not only it is more descriptive, but it also provides a stack trace.

Suggested change
import Data.Maybe (fromJust, fromMaybe)
import Data.Maybe (fromMaybe)

Here is a good example that I saw recently that depicts the problem very well: haskell/haskell-language-server#1618

_ -> error "Impossible! Wrong number of bytes"

toWord64 :: Word8 -> Word8 -> Word8 -> Word8 -> Word8 -> Word8 -> Word8 -> Word8 -> Word64
toWord64 b1 b2 b3 b4 b5 b6 b7 b8 =
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function can be removed with suggested changes in this review

@DavidEichmann DavidEichmann force-pushed the davide/simple_smaller_TxOut branch 2 times, most recently from 43b5e30 to 0ec274f Compare November 2, 2021 20:06
@DavidEichmann DavidEichmann reopened this Nov 2, 2021
@DavidEichmann
Copy link
Contributor Author

DavidEichmann commented Nov 2, 2021

@lehins after much fiddling with getting my gpg keys working, I've updated this PR with your suggestions.

addrHash :: Hash (CC.ADDRHASH crypto) a
addrHash =
hashFromPackedBytes $
PackedBytes32 a b c (fromIntegral (d `shiftR` offset))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll be really surprised if this compiles. PackedBytes32 is a 32 byte hash constructor, but we are decoding an address with 28 bytes. Also offset doesn't seem to be defined

@lehins
Copy link
Collaborator

lehins commented Nov 2, 2021

Looks much better! 👍 However, there are still too many partial functions and some compilation errors that should be easy to fix.

So, here is a challenge, it is possible to get rid of all partial functions from this PR, if you won't be able to get to it tomorrow I'll make a commit with the fix, I just don't have any more time to get to it today.

@DavidEichmann
Copy link
Contributor Author

DavidEichmann commented Nov 2, 2021

In terms of partial functions there is valueToCompactErr and in one other place I assume that compacting a Value will always succeed. I think this is an assumption we already make in master: https://github.com/input-output-hk/cardano-ledger-specs/blob/master/eras/alonzo/impl/src/Cardano/Ledger/Alonzo/TxBody.hs#L174. Do you have some other solution in mind? I guess we could have a second non-partial version of the Compactible class with toCompact :: a -> CompactForm a, but that seems out of scope of this PR.

I also have error cases when matching on PackedBytes even though the number of bytes is known statically in the type. This is because the PackedBytes# constructor is defined for all sizes so we must have a catch all case and rely on the invariant that we don't use PackedBytes# when we can used the unpacked constructors. One solution would be to reimplement the constructors of PackedBytes into pattern synonyms that discharge the invariant as follows. Thoughts? Or is there somet else I'm missing?

{-# COMPLETE  PackedBytes32 #-}
pattern PackedBytes32 :: Word64 -> Word64 -> Word64 -> Word64 -> PackedBytes 32
pattern PackedBytes32 a b c d = view32 -> (a, b, c, d)

view32 :: PackedBytes 32 -> (Word64, Word64, Word64, Word64)
view32 (PackedBytes32 a b c d) = (a,b,c,d)
view32 (PackedBytes# _) = error "Impossible!"

@DavidEichmann DavidEichmann changed the title WIP Create a more compact TxOut using unpacked Word64s Create a more compact TxOut using unpacked Word64s Nov 3, 2021
Copy link
Collaborator

@lehins lehins left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@DavidEichmann DavidEichmann merged commit 0411c3e into master Nov 8, 2021
@iohk-bors iohk-bors bot deleted the davide/simple_smaller_TxOut branch November 8, 2021 12:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants