Fixed hash script function (updated) #217

Merged on Nov 1, 2021 (4 commits)

rooooooooob
Contributor

Continues #209 to fix utils::hash_script_data(). The Haskell node does not hash cost_mdls directly; it seems to construct some kind of canonical LangViewDep structure and hash that instead.

for (key, key_bytes) in keys_bytes.iter() {
    serializer.write_bytes(key_bytes).unwrap();
    let cost_model = self.0.get(&key).unwrap();
    // not sure why but the cardano node seems to use indefinite encoding despite this not being standard canonical CBOR
Contributor Author

@alessandrokonrad In the language views you posted in #209, the inner bytes representing the serialized cost models seem to use indefinite-length encoding. How exactly did you get the a141005901d59f1a000302590001011a00060bc719026d00011a000249f01903e800011a000249f018201a0025cea81971f70419744d186419744d186419744d186419744d186419744d186419744d18641864186419744d18641a000249f018201a000249f018201a000249f018201a000249f01903e800011a000249f018201a000249f01903e800081a000242201a00067e2318760001011a000249f01903e800081a000249f01a0001b79818f7011a000249f0192710011a0002155e19052e011903e81a000249f01903e8011a000249f018201a000249f018201a000249f0182001011a000249f0011a000249f0041a000194af18f8011a000194af18f8011a0002377c190556011a0002bdea1901f1011a000249f018201a000249f018201a000249f018201a000249f018201a000249f018201a000249f018201a000242201a00067e23187600010119f04c192bd200011a000249f018201a000242201a00067e2318760001011a000242201a00067e2318760001011a0025cea81971f704001a000141bb041a000249f019138800011a000249f018201a000302590001011a000249f018201a000249f018201a000249f018201a000249f018201a000249f018201a000249f018201a000249f018201a00330da70101ff language views that you posted in #209? All the references I found to canonical encodings seem to mention only definite length being allowed.

Contributor Author

For example, that decodes as CBOR to:

MAP(Len(1)) {
	BYTES([0])
	BYTES([159, 26, 0, 3, 2, 89, 0, 1, 1, 26, 0, 6, 11, 199, 25, 2, 109, 0, 1, 26, 0, 2, 73, 240, 25, 3, 232, 0, 1, 26, 0, 2, 73, 240, 24, 32, 26, 0, 37, 206, 168, 25, 113, 247, 4, 25, 116, 77, 24, 100, 25, 116, 77, 24, 100, 25, 116, 77, 24, 100, 25, 116, 77, 24, 100, 25, 116, 77, 24, 100, 25, 116, 77, 24, 100, 24, 100, 24, 100, 25, 116, 77, 24, 100, 26, 0, 2, 73, 240, 24, 32, 26, 0, 2, 73, 240, 24, 32, 26, 0, 2, 73, 240, 24, 32, 26, 0, 2, 73, 240, 25, 3, 232, 0, 1, 26, 0, 2, 73, 240, 24, 32, 26, 0, 2, 73, 240, 25, 3, 232, 0, 8, 26, 0, 2, 66, 32, 26, 0, 6, 126, 35, 24, 118, 0, 1, 1, 26, 0, 2, 73, 240, 25, 3, 232, 0, 8, 26, 0, 2, 73, 240, 26, 0, 1, 183, 152, 24, 247, 1, 26, 0, 2, 73, 240, 25, 39, 16, 1, 26, 0, 2, 21, 94, 25, 5, 46, 1, 25, 3, 232, 26, 0, 2, 73, 240, 25, 3, 232, 1, 26, 0, 2, 73, 240, 24, 32, 26, 0, 2, 73, 240, 24, 32, 26, 0, 2, 73, 240, 24, 32, 1, 1, 26, 0, 2, 73, 240, 1, 26, 0, 2, 73, 240, 4, 26, 0, 1, 148, 175, 24, 248, 1, 26, 0, 1, 148, 175, 24, 248, 1, 26, 0, 2, 55, 124, 25, 5, 86, 1, 26, 0, 2, 189, 234, 25, 1, 241, 1, 26, 0, 2, 73, 240, 24, 32, 26, 0, 2, 73, 240, 24, 32, 26, 0, 2, 73, 240, 24, 32, 26, 0, 2, 73, 240, 24, 32, 26, 0, 2, 73, 240, 24, 32, 26, 0, 2, 73, 240, 24, 32, 26, 0, 2, 66, 32, 26, 0, 6, 126, 35, 24, 118, 0, 1, 1, 25, 240, 76, 25, 43, 210, 0, 1, 26, 0, 2, 73, 240, 24, 32, 26, 0, 2, 66, 32, 26, 0, 6, 126, 35, 24, 118, 0, 1, 1, 26, 0, 2, 66, 32, 26, 0, 6, 126, 35, 24, 118, 0, 1, 1, 26, 0, 37, 206, 168, 25, 113, 247, 4, 0, 26, 0, 1, 65, 187, 4, 26, 0, 2, 73, 240, 25, 19, 136, 0, 1, 26, 0, 2, 73, 240, 24, 32, 26, 0, 3, 2, 89, 0, 1, 1, 26, 0, 2, 73, 240, 24, 32, 26, 0, 2, 73, 240, 24, 32, 26, 0, 2, 73, 240, 24, 32, 26, 0, 2, 73, 240, 24, 32, 26, 0, 2, 73, 240, 24, 32, 26, 0, 2, 73, 240, 24, 32, 26, 0, 2, 73, 240, 24, 32, 26, 0, 51, 13, 167, 1, 1, 255])
}

The second byte string is the serialized cost model, but it uses indefinite-length encoding (it starts with 159 = 0x9f and ends with 255 = 0xff). That seems weird to me, since canonical CBOR usually does not allow that, and their checks in code seem to check against it. Maybe they only check the outer level, but that still doesn't seem very canonical to me. I just want to confirm how you got those bytes before I talk with IOHK about it.
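As a rough illustration of the check described above, the following sketch walks the outer structure (a one-entry map whose key and value are both byte strings) and reports whether the value's payload is wrapped in an indefinite-length array, i.e. starts with 0x9f and ends with the 0xff break byte. It uses only the Rust standard library; the function names are hypothetical and are not part of cardano-serialization-lib or any CBOR crate, and it assumes a definite-length, non-empty outer map.

```rust
/// Reads one CBOR head at `pos`.
/// Returns (major type, additional info, argument, position after the head).
fn read_head(data: &[u8], pos: usize) -> Option<(u8, u8, u64, usize)> {
    let first = *data.get(pos)?;
    let major = first >> 5;
    let info = first & 0x1f;
    // Number of extra bytes that carry the argument.
    let extra = match info {
        0..=23 => return Some((major, info, info as u64, pos + 1)),
        24 => 1,
        25 => 2,
        26 => 4,
        27 => 8,
        31 => return Some((major, info, 0, pos + 1)), // indefinite length / break
        _ => return None,                             // 28..=30 are reserved
    };
    let bytes = data.get(pos + 1..pos + 1 + extra)?;
    // Arguments are big-endian.
    let arg = bytes.iter().fold(0u64, |acc, &b| (acc << 8) | b as u64);
    Some((major, info, arg, pos + 1 + extra))
}

/// Assumes `data` is a definite-length map whose first key and value are
/// definite-length byte strings (the shape of the dump above).
/// Returns Some(true) if the value's payload starts with 0x9f and ends with 0xff.
fn first_value_is_indefinite_array(data: &[u8]) -> Option<bool> {
    let (major, _, entries, pos) = read_head(data, 0)?;
    if major != 5 || entries == 0 {
        return None;
    }
    // Key: byte string (major type 2); skip over its payload.
    let (k_major, k_info, k_len, k_start) = read_head(data, pos)?;
    if k_major != 2 || k_info == 31 {
        return None;
    }
    let v_pos = k_start.checked_add(k_len as usize)?;
    // Value: byte string holding the serialized cost model.
    let (v_major, v_info, v_len, v_start) = read_head(data, v_pos)?;
    if v_major != 2 || v_info == 31 {
        return None;
    }
    let inner = data.get(v_start..v_start.checked_add(v_len as usize)?)?;
    Some(inner.first() == Some(&0x9f) && inner.last() == Some(&0xff))
}
```

Called on a shortened, made-up sample such as `[0xa1, 0x41, 0x00, 0x45, 0x9f, 0x01, 0x02, 0x03, 0xff]`, the function reports an indefinite-length inner array; with a definite array head (0x83) in the payload instead, it reports false.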

Contributor

@alessandrokonrad, Oct 6, 2021

Yes, the encoding is weird; however, it must be right, because I use it daily for contract-based transactions on the testnet. Otherwise the node would reject the transaction and report that the script data hash is invalid, since it constructs the language view encoding on its side.

I basically got the encoding directly from the Haskell libraries. This was my entry function:
https://github.com/input-output-hk/cardano-ledger-specs/blob/d37757e273d10fb251681faa1a2a0cc1ac018384/alonzo/test/test/Test/Cardano/Ledger/Alonzo/Serialisation/Canonical.hs#L36-L43

canonicalLangDepView :: PParams era -> Set Language -> Property
canonicalLangDepView pparams langs =
  let langViews = Set.fromList $ getLanguageView pparams <$> Set.toList langs
      encodedViews = serializeEncoding $ encodeLangViews langViews
      base16String = show (B16.encode $ LBS.toStrict encodedViews)
   in counterexample base16String $ case isCanonical encodedViews of
        Right () -> QCP.succeeded
        Left message -> QCP.failed {QCP.reason = message}

I remodeled the function a little so that it returns this base16String as its result. I filled in all the other required parameters, such as the CostModel, using the current costs from the mainnet protocol parameters. It then returned this encoding.

@rooooooooob
Contributor Author

IOHK confirmed that it was a bug in the canonical form representation: IntersectMBO/cardano-ledger#2512

I just updated the comments to reflect this. At least for the rest of Alonzo, it looks like we'll have to live with this part being encoded as indefinite-length in order to match existing on-chain data. This PR should be ready now @vsubhuman @alessandrokonrad
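For readers who want to see the resulting layout, here is a minimal sketch of the encoding discussed in this thread: a one-entry map whose key is the language id serialized to CBOR and wrapped in a byte string, and whose value is the cost model serialized as an indefinite-length array and wrapped in a byte string. This is not the code from this PR; the function names are hypothetical, only the standard library is used, a single language is assumed, and costs are assumed to be non-negative integers.

```rust
/// Writes a CBOR head (major type plus argument) using the shortest form.
fn write_head(out: &mut Vec<u8>, major: u8, value: u64) {
    let m = major << 5;
    match value {
        0..=23 => out.push(m | value as u8),
        24..=0xff => {
            out.push(m | 24);
            out.push(value as u8);
        }
        0x100..=0xffff => {
            out.push(m | 25);
            out.extend_from_slice(&(value as u16).to_be_bytes());
        }
        0x1_0000..=0xffff_ffff => {
            out.push(m | 26);
            out.extend_from_slice(&(value as u32).to_be_bytes());
        }
        _ => {
            out.push(m | 27);
            out.extend_from_slice(&value.to_be_bytes());
        }
    }
}

/// Serializes the costs as an indefinite-length array: 0x9f, items, 0xff.
/// Assumption: every cost is a non-negative integer (major type 0).
fn encode_cost_model_indefinite(costs: &[u64]) -> Vec<u8> {
    let mut out = vec![0x9f];
    for &cost in costs {
        write_head(&mut out, 0, cost);
    }
    out.push(0xff);
    out
}

/// Builds the one-entry language view map for a single language.
/// Both key and value are CBOR byte strings wrapping inner CBOR.
fn encode_language_views(language_id: u64, costs: &[u64]) -> Vec<u8> {
    let mut key_inner = Vec::new();
    write_head(&mut key_inner, 0, language_id);
    let value_inner = encode_cost_model_indefinite(costs);

    let mut out = Vec::new();
    write_head(&mut out, 5, 1); // map with one entry
    write_head(&mut out, 2, key_inner.len() as u64); // byte string head for the key
    out.extend_from_slice(&key_inner);
    write_head(&mut out, 2, value_inner.len() as u64); // byte string head for the value
    out.extend_from_slice(&value_inner);
    out
}
```

For instance, `encode_language_views(0, &[197209, 0, 1, 1])` starts with `a1 41 00` followed by a byte string whose payload begins `9f 1a 00 03 02 59` and ends with `ff`, which mirrors the first bytes of the encoding quoted earlier in the thread.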

@vsubhuman vsubhuman added this to the 9.2.0 milestone Oct 7, 2021
@vsubhuman vsubhuman merged commit ec1be9c into Emurgo:master Nov 1, 2021
@rooooooooob rooooooooob deleted the fixed_hash_script_function branch November 3, 2021 04:30
@vsubhuman vsubhuman mentioned this pull request Feb 6, 2022