Write down ADR for data serialization within on-chain validator. #147
In Cardano, transactions may carry information in various ways and, in particular, one must provide Plutus data as part of a transaction witness set. Those data are made available to the underlying validator script context as a (key, value) list where keys are data hashes and values are the data. It's important to note that the correspondence between a hash and its data is verified by the ledger during phase-1 validations.
We want to leverage this data lookup table to pass arbitrary data and their corresponding hashes to a validator. This effectively means that we introduce an extra indirection in the redeemer of the `Close` and `Contest` transitions. Indeed, instead of passing the full data as the redeemer, we can give only a hash, which can be looked up from the script context to obtain its corresponding data. This can be achieved with the following on-chain function:
I don't know what the requirements of the `Close` and `Contest` transactions are, so the question might be stupid... But if the multi-signed data piece isn't duplicated, is representing it as a datum more efficient than a redeemer? Both require the data in the transaction witness set, but finding data in the script context is more costly inside validator scripts. Redeemers also don't require storing unused data (the datum hash of the multi-signed data piece) in the next UTxO set.
I am not quite sure what you mean by "isn't duplicated"?
The multisigned payloads are produced off-chain, as part of the running head. They are basically what entitles participants to move the contract on-chain. Thus, it is precisely something we cannot store on-chain upfront, of course. Nor is it something we can pass as-is as a redeemer to the script (because of the surrounding information that comes with the multisigned payload).
I would need to double check that but, as far as I remember, the ledger rules also forbid adding a supplementary 'unused' redeemer to the script data. Hence the use of a datum for that.
> I am not quite sure what you mean by "isn't duplicated"?
Sorry for the vague description. By that, I meant that redeemers are stored per input (more precisely per pointer, as things like minting policies have redeemers too) while datums are stored per unique hash. If the multi-signed data is only used in the validation of one input, then the storage in the witness set is roughly the same. Datums are only "lighter" when that same data is used in the validation of multiple inputs. In this context, I guess that the multi-signed data is only used in the validation of its specific head UTxO, hence there are no storage gains from the datum approach.
> Nor is it something we can pass as-is as a redeemer to the script (because of the surrounding information that comes with the multisigned payload).
I'm still confused about why you can represent something as a datum but not as a redeemer when building transactions. Aren't they represented in the same `Data` type at both the ledger and script levels?
> I would need to double check that but, as far as I remember, the ledger rules also forbid adding a supplementary 'unused' redeemer to the script data. Hence the use of a datum for that.
The redeemer will be used in the validation script for the multi-signature check?
LGTM. Concerning the obstacle: aren't there also auxiliary datums, which are also hashed into the script validity of the body? In any case, we should ask the ledger ppl and draw the consequences here. That is, in the worst case, requiring changes in the ledger -> additional scope for the Babbage hard-fork.
I am unsure what problem this solves 🤔
Having spent a few hours reconciling serialisation representations on- and off-chain to validate some output, I have doubts about this approach. It seems to me it adds complexity both on-chain (adding the need to look up actual data from a hash that's in another txout, so double indirection) and off-chain (adding the need to pack the hash and the datum in some other txout).
On-chain we can always hash, and if we cannot serialise, then either we should ask for it to be included as a builtin, or pass an additional serialised representation should we need it.
As for all our other ADRs, I think this should be based on a proper experiment demonstrating the validity and generality of the solution.
@abailly-iohk keep in mind that this proposal's strong hypothesis is that we cannot / do not want to modify the ledger rules. While this discussion could happen with the ledger team, it would likely push back any testnet or mainnet integration quite a lot. Hence, in the meantime, this proposal offers "a way" to work around the issue. As for "pass additional serialised representation should we need it", this is precisely what we cannot do 😅! Or, more exactly, the on-chain validator must be in a position to verify that the serialised data matches the unserialised content to verify! Otherwise an attacker may close with a different UTxO than the one in the signature... So, without re-implementing the serialization logic in the on-chain validator, the only hope I see is to make use of the ledger as a middleman for validation.
Fair enough. That said, I still want to see a proper spike done before considering adopting this as a guiding principle. As an alternative, which I have worked on this morning, we should consider "serialising" the data we are interested in, should it matter. I suspect we might have the problem this is supposed to solve because of current limitations or shortcomings in our implementation, not because there will be a need for it in the long run.
Can we also document the alternative? That is, requiring a Plutus language builtin for serializing any
Closing this as we went for writing on-chain encoders. In the end, this approach would probably have worked if we needed a single hash of a large data structure, but in practice we need individual hashes for many tx outs, which would be impractical to all have as independent datums.
13. Data Serialization Within On-Chain Validators
Date: 2021-12-21
Status
Proposed
Context
In Hydra, during the `Close` and `Contest` transitions, one must verify, within on-chain validators, that a certain piece of data has been multi-signed by all head participants. While verifying a multi-signature performed via MuSig2 (which can be made Schnorr-compatible) is relatively easy and can rely on existing Plutus built-ins, producing the payload / pre-image that was signed is problematic because there are no Plutus built-ins for data serialization.
Incidentally, even though there exist quite simple and compact (implementation-wise) serialization algorithms (e.g. CBOR), this is a path we do not want to follow, as there's a high chance of increasing the validator size far above an acceptable limit.
Hence, how to obtain arbitrary serialized data within an on-chain validator?
Decision
Overview
In Cardano, transactions may carry information in various ways and, in particular, one must provide Plutus data as part of a transaction witness set. Those data are made available to the underlying validator script context as a (key, value) list where keys are data hashes and values are the data. It's important to note that the correspondence between a hash and its data is verified by the ledger during phase-1 validations.
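To make the shape of this lookup table concrete, here is a minimal sketch in plain Haskell. It is a model only: `DatumHash`, `Datum`, `DatumTable`, `lookupDatum` and `exampleTable` are hypothetical stand-ins introduced for illustration, not the actual Plutus script context types, and hashes are represented as plain strings.

```haskell
-- Hypothetical stand-ins for the ledger types (assumption: not the real Plutus API).
newtype DatumHash = DatumHash String deriving (Eq, Show)
newtype Datum = Datum String deriving (Eq, Show)

-- The script context is modeled as exposing witness-set data
-- as a (key, value) list: keys are data hashes, values are the data.
type DatumTable = [(DatumHash, Datum)]

-- Resolve a hash to its data, if the table contains it.
-- The hash/data correspondence is assumed to have been checked
-- by the ledger already (phase-1), so no re-hashing happens here.
lookupDatum :: DatumHash -> DatumTable -> Maybe Datum
lookupDatum = lookup

-- Made-up example table with two entries.
exampleTable :: DatumTable
exampleTable =
  [ (DatumHash "h1", Datum "payload-1")
  , (DatumHash "h2", Datum "payload-2")
  ]
```

With this model, `lookupDatum (DatumHash "h2") exampleTable` evaluates to `Just (Datum "payload-2")`, and a hash that is absent from the table yields `Nothing`.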
We want to leverage this data lookup table to pass arbitrary data and their corresponding hashes to a validator. This effectively means that we introduce an extra indirection in the redeemer of the `Close` and `Contest` transitions. Indeed, instead of passing the full data as the redeemer, we can give only a hash, which can be looked up from the script context to obtain its corresponding data. This can be achieved with the following on-chain function:
Obstacles
There's a little quirk with this approach, unfortunately: the ledger does not allow the presence of extraneous datums in the witness set. In fact, the ledger will fail phase-1 with a `NonOutputSupplimentaryDatums` error if a transaction includes any datum that is neither:
a. Required by an input associated to a script address
b. Referenced by an output
Thus, without requiring a hard-fork, we must be careful to include an extra output carrying the required datum hash. In the context where we control the underlying wallet, we can rather easily add this to a change output already fueling the transaction. Note that this barely changes anything for a vk output; the datum will simply be ignored by the ledger and not required for spending.
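The rule above can be sketched as a simple predicate. This is a simplified model under stated assumptions, not the ledger's implementation: the `Tx` record, its fields, and the functions `extraneousDatums` and `passesSupplementaryCheck` are hypothetical names, and only the "no extraneous datum" half of the check is modeled.

```haskell
-- Hypothetical stand-in for a datum hash (assumption: not the real ledger type).
newtype DatumHash = DatumHash String deriving (Eq, Show)

-- Hypothetical, simplified view of a transaction.
data Tx = Tx
  { scriptInputDatums :: [DatumHash] -- (a) required by inputs at script addresses
  , outputDatums      :: [DatumHash] -- (b) referenced by outputs
  , witnessDatums     :: [DatumHash] -- hashes of the datums in the witness set
  }

-- Witness-set datums that satisfy neither (a) nor (b).
extraneousDatums :: Tx -> [DatumHash]
extraneousDatums tx =
  filter (`notElem` allowed) (witnessDatums tx)
  where
    allowed = scriptInputDatums tx ++ outputDatums tx

-- The modeled check passes only when no extraneous datum is present.
passesSupplementaryCheck :: Tx -> Bool
passesSupplementaryCheck = null . extraneousDatums
```

In this model, adding the extra datum's hash to `outputDatums` (for example via a change output) is what turns a failing transaction into a passing one.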
Consequence
`Close` and `Contest` validator