Logbook 2021 H2
- Add a mutator to closeTx which changes the snapshot number without changing the signature -> tests fail sometimes.
- We added labels to locate which mutation failed -> red bin: do this properly, it feels hacked and in a way we should have a sum type enumerating close-specific mutations
- Correctly discard the healthy case when mutating the snapshot number -> tests always pass.
- Add a third close mutator to improve the implementation: changing both the signature and the snapshot number to a valid but unexpected value
- First we did this using the closeRedeemer smart constructor, thus always getting well-formed snapshot numbers
- Later we deliberately did not use closeRedeemer, in order to test using the "on-chain type", which is (still) a ByteString
- For example, mutating this to something not resembling a serialized integer (but still a valid signature)
- Then we changed the snapshot number back to an Integer, as we will want to check > 0 later
- When snapshotNumber is an Integer, we need to serialize and hash on-chain to verify a signature
- We implemented a basic Integer-to-CBOR encoder, or rather Natural-to-CBOR, as it will error on negative numbers
- The fact that negative Integers error made our tests pass, as ill-formed values (< 0) were not valid despite correct signatures
- Discussion on what we will eventually need on-chain, with the realization that ultimately we will only ever need to serialize/hash Integers and Hashes for closeTx, but TxOut for fanoutTx
- We continued working on the close tx validator, starting with a first observation: there was a discrepancy between the on-chain and off-chain representations of a snapshot number (bytestring vs natural). Since any positive number is potentially a valid snapshot number, we created a mutation generating negative numbers as snapshot numbers. As a consequence, and to cope with the now failing test, we changed the on-chain representation to an Integer (there's no Natural available in Plutus!) and we wrote a basic CBOR encoder for unsigned integers (to match the off-chain signable representation).
- From there, we discussed whether this approach could indeed be used in the long run. Writing a CBOR encoder for unsigned integers is quite trivial and does not require much code. While this is mostly sufficient for the close tx (which only requires serializing the snapshot number), it isn't for the fanout. In its simplest form (i.e. no split, full UTXO fits in the transaction), it is necessary for the validator to verify that the output UTXO does indeed match whatever UTXO's hash was specified during the close and stored in the state-machine datum. This could potentially be addressed by #147. However, when we consider the realistic form of the fanout which will likely require splitting UTXO into sub-utxo, we will need to:
- (a) Split the UTXO into structured subsets;
- (b) Prove inclusion of a subset into the bigger set.
In the paper, this is achieved via Merkle-Patricia-Trees, but as discussed previously, in the coordinated form it can also be achieved with much simpler Merkle-Trees. This still means that we will eventually need to construct a hierarchical structure of hashes, where signable representations are Merkle nodes whose leaves are transaction outputs. Unfortunately, we can't really use the trick described in #147 in this case because it would require one extra datum PER TXOUT. Thus, without added builtins in Plutus, we are left with no choice but to find some on-chain signable representation of TxOut. Could this be CBOR in the same way we approached the signable representation of snapshot numbers? Maybe. We would need to write a CBOR encoder for:
- non-negative integers (CBOR type-00)
- negative integers (CBOR type-01)
- bytestrings (CBOR type-02)
- (finite) lists (CBOR type-04)
- (finite) dictionaries (CBOR type-05)
Note that this can be quite straightforward (e.g. https://github.com/elm-toulouse/cbor/blob/master/src/Cbor/Encode.elm) so it may not be a bad idea. Possibly as an independent "Plutus library". We should also probably start a discussion with the ledger and Plutus teams about including CBOR serialization of builtin types as a builtin.
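To give a feel for how little code the unsigned-integer case needs, here is a minimal off-chain sketch of CBOR major type 0 encoding as specified in RFC 8949. The names encodeUnsigned and bigEndian are hypothetical and this is not the encoder from the codebase; an on-chain version would have to be expressed with Plutus builtins rather than base.

```haskell
import Data.Word (Word8)
import Numeric.Natural (Natural)

-- | Big-endian representation of @n@ on exactly @len@ bytes
-- (higher-order bytes beyond @len@ are dropped).
bigEndian :: Int -> Natural -> [Word8]
bigEndian len n =
  [fromIntegral ((n `div` (256 ^ i)) `mod` 256) | i <- reverse [0 .. len - 1]]

-- | CBOR encoding of an unsigned integer (major type 0).
-- Returns Nothing for values that do not fit on 64 bits.
encodeUnsigned :: Natural -> Maybe [Word8]
encodeUnsigned n
  | n < 24 = Just [fromIntegral n] -- value stored in the header byte itself
  | n < 2 ^ (8 :: Int) = Just (0x18 : bigEndian 1 n)
  | n < 2 ^ (16 :: Int) = Just (0x19 : bigEndian 2 n)
  | n < 2 ^ (32 :: Int) = Just (0x1a : bigEndian 4 n)
  | n < 2 ^ (64 :: Int) = Just (0x1b : bigEndian 8 n)
  | otherwise = Nothing
```

Using Natural as input mirrors the "Natural-to-CBOR" remark: negative numbers are simply not representable, so they never reach the encoder.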
- Discussed the ADR13 candidate:
- There was an "optional datum" field in Alonzo before, not in the final spec though.
- If we can get the ledger team to drop https://github.com/input-output-hk/cardano-ledger/blob/70cfbf9be79533a6d1b2ff446567f5b78bf945aa/eras/alonzo/impl/src/Cardano/Ledger/Alonzo/Rules/Utxow.hs#L290-L301, this approach would be less hacky.
- We should write down the alternative: Adding a serialization (+ hashing) builtin to plutus.
- Reviewed open PRs and what had been merged to master
- realized that the mock implementation is actually wrong: nothing checks whether the included hash (which is verified) is indeed corresponding to some specific snapshot number / pre-image, i.e. we could change the "content" + the signature to another valid value
- Goal for the week:
- complete the mock implementation using an on-chain encoder for the snapshot number (it's only an integer)
- implement the ADR13 method for close on a branch to check feasibility
Plan for today:
- merge pending PRs
- error handling in Head Logic
- architecture writeup
- Mithril
Rebased https://github.com/input-output-hk/hydra-poc/pull/144 on master after merging branch expanding MockHead contracts
Writing NodeSpec test to check we properly notify clients when an exception is raised in the Chain component.
- Added a new PostTxOnChainFailed message to the server output
- Introduce combinators to capture server output and mock exceptions in the Chain component
- Not sure this is the right thing to do however, seems like we are somewhat tightly coupling node and implementation of other components?
Also, perhaps "let-it-crash" strategy would be better for the Hydra node?
Need to adapt YAML specifications to add new message and move shared PostChainTx and InvalidTxError to common.yaml.
- Got an "interesting" error in the Log API tests:
1) Hydra.Logging HydraLog
Assertion failed (after 1 test and 9 shrinks):
[Envelope {namespace = "", timestamp = 1864-05-09 09:18:20.203694887175 UTC, threadId = 0, message = DirectChain {directChain = PostingTxFailed {toPost = AbortTx {utxo = fromList []}, reason = CannotSpendInput {input = "", walletUtxo = fromList [], headUtxo = fromList []}}}}]
Traceback (most recent call last):
  File "/nix/store/ypidjcrcsxpnrqm4ivxf8pg475m0axqd-python3.8-jsonschema-3.2.0/lib/python3.8/site-packages/jsonschema/validators.py", line 811, in resolve_fragment
    document = document[part]
KeyError: 'PostChainTx'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/nix/store/ypidjcrcsxpnrqm4ivxf8pg475m0axqd-python3.8-jsonschema-3.2.0/bin/.jsonschema-wrapped", line 9, in <module>
    sys.exit(main())
  File "/nix/store/ypidjcrcsxpnrqm4ivxf8pg475m0axqd-python3.8-jsonschema-3.2.0/lib/python3.8/site-packages/jsonschema/cli.py", line 76, in main
    sys.exit(run(arguments=parse_args(args=args)))
  File "/nix/store/ypidjcrcsxpnrqm4ivxf8pg475m0axqd-python3.8-jsonschema-3.2.0/lib/python3.8/site-packages/jsonschema/cli.py", line 87, in run
    for error in validator.iter_errors(instance):
Errors are now properly reported to clients and the TUI. There are still errors which make the node crash, esp. the ones related to failure to validate the tx and validators.
- These should also be reported as InvalidTxError but this is left for next year
Plans for today:
- Spike implementation of matching mock crypto so that we can verify signatures in MockHead
- PR about reporting failures to submit transaction to end-user, catching exceptions and sending messages
Signature property check fails immediately => algorithm for computing encoding is basically wrong...
I had inverted quotient and remainder in my transformation from integer to bytes on-chain 🤦
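The kind of conversion where this mix-up happens can be sketched as follows. This is an off-chain illustration with a hypothetical name (toBytesBE), not the on-chain code: the remainder is the low-order byte and the quotient is what is left to encode, so swapping them produces bytes in the wrong positions.

```haskell
import Data.Word (Word8)

-- | Big-endian bytes of a non-negative integer.
-- The remainder (r) is the least significant byte, emitted last;
-- the quotient (q) carries the higher-order bytes, emitted first.
toBytesBE :: Integer -> [Word8]
toBytesBE n
  | n < 0 = error "toBytesBE: negative input"
  | n < 256 = [fromIntegral n]
  | otherwise = toBytesBE q ++ [fromIntegral r]
  where
    (q, r) = n `quotRem` 256
```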
The error we get with a failed property is not very helpful as it does not show anything about datums or redeemers, need to enhance output to show those?
- Adding redeemers display and datums to the describeCardanoTx function
Still have failure: There's probably a difference in the representation of snapshot (number) on and off chain that explains the failing signatures verification
- In the SignableRepresentation of Snapshot we used show but in constructing the datum in closeTx we use serialise'
Still having failure: There is a one character difference between what show displays for off-chain Signed data and what it displays from Datums:
MultiSigned {multiSignature = [UnsafeSigned "K\245\DC2/4ET\197\NUL\NUL\NUL\NUL\NUL\NUL\NUL\SOH",UnsafeSigned "K\245\DC2/4ET\197\NUL\NUL\NUL\NUL\NUL\NUL\NUL\STX",UnsafeSigned "K\245\DC2/4ET\197\NUL\NUL\NUL\NUL\NUL\NUL\NUL\ETX"]}
vs
DataConstr Constr 1 [B "\SOH",List [B "PK\245\DC2/4ET\197\NUL\NUL\NUL\NUL\NUL\NUL\NUL\SOH",B "PK\245\DC2/4ET\197\NUL\NUL\NUL\NUL\NUL\NUL\NUL\STX",B "PK\245\DC2/4ET\197\NUL\NUL\NUL\NUL\NUL\NUL\NUL\ETX"]]
That must come from the ToCBOR instance which certainly adds some prefix when we invoke serialize': CBOR encoding of bytestrings prepends one or more bytes identifying the type and length of the bytes.
- Using CBOR encoding to pass bytes on- and off-chain is problematic because we don't have CBOR parsing capabilities on-chain so we must ensure whatever bytes manipulation we do works on compatible representations.
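The extra "P" is consistent with the CBOR header for a short byte string: per RFC 8949, a byte string of fewer than 24 bytes is prefixed with the single byte 0x40 + length, and the signatures shown above are 16 bytes long, giving 0x50, which is ASCII "P". A small sketch, with hypothetical helper names, that computes this header:

```haskell
import Data.Char (chr)
import Data.Word (Word8)

-- | Header byte of a CBOR byte string (major type 2) whose length is below 24.
-- Longer strings use additional length bytes and are not covered here.
shortBytesHeader :: Int -> Maybe Word8
shortBytesHeader len
  | len >= 0 && len < 24 = Just (0x40 + fromIntegral len)
  | otherwise = Nothing

-- | The same header rendered as a character, as 'show' would display it
-- in front of the payload.
headerAsChar :: Int -> Maybe Char
headerAsChar = fmap (chr . fromIntegral) . shortBytesHeader
```

For a 16-byte payload, headerAsChar 16 evaluates to Just 'P'.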
Trying to use directly the bytes from the underlying signature => Unit tests pass
Got some tests still failing with new closeTx validator:
- DirectChainSpec is failing but this is expected as we don't pass any signature in the Snapshot we post :)
- TxSpec is also failing to observe closeTx
- ETESpec and TUISpec
Fixing first DirectChainSpec was easy enough, just needed to sign the snapshot.
- There's a minor snag in that we pass the Party to withDirectChain which means we need to draw the signing key from somewhere else. Perhaps it would make sense to pass the SigningKey to withDirectChain?
The problem with observing closeTx now is that we expect to decode an integer as SnapshotNumber but of course we get a bytestring!
https://github.com/input-output-hk/hydra-poc/blob/ensemble/more-contract-testing/hydra-node/src/Hydra/Chain/Direct/Tx.hs#L592
- Spent an hour troubleshooting the TxSpec test which was not passing because we passed a SNothing datum. The output of the test failure is particularly cryptic and does not provide many clues about what's going on.
I now only have the ETE test failing on validating the signatures, not sure why however 🤔 The cool thing is that we have a proper failure being reported in the test.
Having a look at the node's logs: The CloseTx transaction is properly posted on-chain by node 1 AFAICT, but the message is actually misleading: What happens is that the transaction gets constructed properly but the submission fails and crashes the node => Adding some more detailed log messages
Seems like not all parties have signed the snapshot, here is the plutus reported error:
The data is: Constr 1 [List [I 10,I 20,I 30]]
The redeemer is: Constr 1 [B "\SOH",List [B "K\245\DC2/4ET\197\NUL\NUL\NUL\NUL\NUL\NUL\NUL\n",B "K\245\DC2/4ET\197\NUL\NUL\NUL\NUL\NUL\NUL\NUL\RS"]]
We have 3 parties but only 2 signatures which is fishy! The error might come from the HeadLogic: The confirmed snapshot contains only part of the signatures we received! How come we did not catch this bug earlier 😮?
- Checking the ETE test passes with the "correct" signatures set then will write a proper test for that.
After the change we get the proper number of signatures but still have a failure to validate signatures:
The data is: Constr 1 [List [I 10,I 20,I 30]]
The redeemer is: Constr 1 [B "\SOH",List [B "K\245\DC2/4ET\197\NUL\NUL\NUL\NUL\NUL\NUL\NUL\DC4",B "K\245\DC2/4ET\197\NUL\NUL\NUL\NUL\NUL\NUL\NUL\n",B "K\245\DC2/4ET\197\NUL\NUL\NUL\NUL\NUL\NUL\NUL\RS"]]
This comes from the fact that the signatures are not in the same order as the parties, which is an important assumption made in the mock head code.
- Possible solutions: Pass a map of parties to signatures, check the signatures differently (eg. shuffling lists), or make sure the 2 lists are in the same order
Adding the signatures to the SnapshotConfirmed message so that we can observe it and have a proper unit test
Decided not to implement complex ordering logic checking within MockHead but rather to order the multisignatures in the HeadLogic, where they are produced, to match the parties' ordering.
- Ideally I would have liked to do that in the aggregate function but this one works on already signed ByteStrings so the Party information is buried.
- Also noticed that the APISpec tests did not fail in spite of me adding a field to a ServerOutput's constructor: I would expect the JSON specification validation code to have caught this but it did not.
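The ordering decision described above (producing the signatures in party order off-chain, so the validator can keep assuming aligned lists) can be sketched as follows. This is a minimal illustration with a hypothetical function name and generic types, not the HeadLogic code: signatures are kept keyed by party until the very end, then emitted in the order of the party list.

```haskell
-- | Order received signatures so that they line up with the list of parties.
-- Returns Nothing if any party's signature is missing, which would also
-- have caught the "3 parties but only 2 signatures" situation.
orderSignatures :: Eq party => [party] -> [(party, sig)] -> Maybe [sig]
orderSignatures parties received =
  traverse (\party -> lookup party received) parties
```

For example, with parties [10, 20, 30] and signatures received in the order 20, 10, 30, the result lists the signatures for 10, 20 and 30 in that order.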
TUISpec test cannot pass in the current implementation of validators because it does not produce any snapshot, hence the Close transaction fails to pass validation.
This is explicitly handled in the paper, e.g. when snapshot number is 0, hence we should deal with it in our ContractSpec tests.
Made all tests green but for one TUISpec test which I put in pending because we don't have the ability to make it pass right now: The TUI full lifecycle test does not produce any snapshot hence we close with snapshot 0 and no signatures and the MockHead validator does not cover this.
Trying to generalise mutations, to produce more redeemers then introducing different and more significant ones.
I am "surprised" by the fact one cannot automatically derive instances for Plutus datatypes:
src/Hydra/Contract/MockHead.hs:42:13: error: [-Wmissing-methods, -Werror=missing-methods]
• No explicit implementation for
‘==’
• In the instance declaration for ‘Eq Input’
|
42 | deriving (Eq, Generic, Show)
- Also, the Eq instance does not even seem to be visible to test code. This is probably an artifact from the Plutus plugin compiler and transformation?
test/Hydra/Chain/Direct/ContractSpec.hs:166:15: error:
• No instance for (Arbitrary Plutus.V1.Ledger.Value.Value)
arising from a use of ‘genericArbitrary’
There are instances for similar types:
instance cardano-ledger-shelley-test-0.1.0.0:Test.Cardano.Ledger.Shelley.ConcreteCryptoTypes.Mock
c =>
Arbitrary (Cardano.Ledger.Mary.Value.Value c)
-- Defined in ‘Test.Cardano.Ledger.ShelleyMA.Serialisation.Generators’
• In the expression: genericArbitrary
In an equation for ‘arbitrary’: arbitrary = genericArbitrary
In the instance declaration for ‘Arbitrary MockHead.Input’
|
166 | arbitrary = genericArbitrary
Perhaps they are available in some module we do not import?
- Looks like there are some in the PAB code. As we don't depend on plutus-app anymore, we'll need to rewrite them or vendor this file, like we did for SM code.
- Created a Plutus.Orphans module vendoring stuff from PAB.
I have a test failure with the generator when the redeemer is a Close with a different snapshot number, which is an expected error I would say?
- Anyhow, the close redeemer should not be only a snapshot number but a whole signed snapshot so this paves the way to do the needed changes.
Trying to remove non-documented symbols from haddock document generation, seems like this is only available as a module-level prune attribute, which sucks...
Some discussion around the "mutation testing" approach:
- There is nothing there forcing us to implement the output datum correctly, so a close tx could just as well produce an invalid output datum which won't be consumable by the fanout tx
- However we already have tests in place for the whole "happy path" so if a close does produce some invalid datum, this will be caught by the fanout, or later when we check the produced UTXO
- By thoroughly testing each validator, we ensure each link of the chain is correct but we need test(s) to ensure the whole chain is correct
- Validators are always checked and implemented as purely local functions so it makes sense to test them locally
Discussing what to do next, and which contracts to implement. Seems like Close is the best candidate because it's one we have barely touched so far.
What about the contestationPeriod? Discussing the passing of time in the Head logic, should probably be reported by the underlying ChainComponent as ticks, instead of putting the contestation computation logic in the on-chain component.
2 different things:
- Has enough time passed?
- Is the fanout posting the right UTXO?
Discussing testing strategy for contracts:
- Mutation approach: Generate valid transactions, then mutate them to render them invalid and make sure the validator fails
- Constraints-based approach: Start with an empty transaction then add more "constraints" representing how we expect the transaction to be
- requires to start with a Const False validator to make sure we have a failing test
- We need some higher level of testing involving sequence of transactions/validators to express properties: for example,
- It should always be possible to abort if the head is not open
- We also want to have rock-solid contracts so we need to test all kinds of non-happy paths
We decide to give the "mutation" approach on Close validator a go. The idea is the following:
- We start from an arbitrary supposedly valid Tx (and relevant UTXO) of the required type, in this case genCloseTx
- We start from a Const True validator, i.e. one that validates any transaction
- We indeed verify this transaction passes stage 2 validation
- We don't care about stage 1 as it's supposed to be validated before that, which kind of implies we need to generate structurally valid transactions
- We then generate various relevant mutations to this valid transaction that are supposed to make it invalid
- We started with a simple one, namely replacing the redeemer with Abort
- Various possibilities include:
- Modifying the value of an input or an output,
- modifying the datum of an input (note that we modify the hash in the provided UTXO because if the hash is not compatible the transaction is structurally invalid)
- Removing some output or some input
- By adding more and more mutations, the goal is to "triangulate" the validators, to make them more and more precise and verify more and more conditions
- While constructing the mutations, we let emerge some kind of DSL to construct transactions that we use in Tx module to simplify it
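The shape of this approach can be sketched on a toy model. Everything below is a hypothetical stand-in (ToyTx, Mutation, closeValidator are not the real Hydra or Plutus types, and genCloseTx is not used): it only shows a sum type of mutations, a validator that has already been "triangulated" a little beyond Const True, and a property stating that every effective mutation must be rejected.

```haskell
-- Toy stand-ins; none of these are the real Hydra or Plutus types.
data Redeemer = Close Integer | Abort
  deriving (Eq, Show)

data ToyTx = ToyTx
  { txRedeemer :: Redeemer
  , signedSnapshotNumber :: Integer -- the number the parties actually signed
  }
  deriving (Eq, Show)

-- | Sum type enumerating close-specific mutations.
data Mutation
  = ReplaceRedeemerWithAbort
  | ChangeSnapshotNumber Integer
  deriving (Eq, Show)

applyMutation :: Mutation -> ToyTx -> ToyTx
applyMutation ReplaceRedeemerWithAbort tx = tx {txRedeemer = Abort}
applyMutation (ChangeSnapshotNumber n) tx = tx {txRedeemer = Close n}

-- | Toy close validator: only a positive, correctly signed number passes.
closeValidator :: ToyTx -> Bool
closeValidator tx = case txRedeemer tx of
  Close n -> n > 0 && n == signedSnapshotNumber tx
  Abort -> False

-- | Every mutation that actually changes the transaction must be rejected.
-- A mutation that leaves the transaction unchanged is discarded (counted as
-- trivially passing), since the validator is right to accept it.
propMutationIsRejected :: Mutation -> ToyTx -> Bool
propMutationIsRejected mutation tx
  | mutated == tx = True
  | otherwise = not (closeValidator mutated)
  where
    mutated = applyMutation mutation tx
```

With a Const True validator in place of closeValidator, the property fails for the very first mutation, which is what drives the validator to become more precise.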
- Raised the question of MPT again with researchers based mostly on one observation / concern: Hydra's specification for MPTs is different from Ethereum's, and requires the prefix for each node to be part of the node's hash. Implementation-wise, this introduces significant complexity since it requires hashes to be constructed one by one, like onion layers, for each digit of the prefix -- without which it is not feasible to add / remove an element with only a proof and a root hash. This means that in the case of Cardano output references, an MPT path is at minimum 32 hashes! We are afraid that this would make the computational budget go through the roof.
Thus two legitimate questions:
- Why include the prefix / path as part of the hash? What would be the security consequences / hypotheses for omitting the prefix / path from the hash structure?
- Do we actually need to add / remove elements from an MPT at all in a coordinated Head context?
Researchers are investigating (1) and comparing with Ethereum's implementation, looking at the trade-offs in both solutions. For (2), the answer is almost in the question. Adding and removing elements to and from an MPT is required by the OCV code for the close transition in the presence of dangling transactions (basically, the OCV is re-applying those transactions on top of the signed snapshot and checks that the resulting UTXO matches with the MPT's root). In a coordinated context, there's none. Which means that MPT are only truly needed for fanout splitting.
- As a consequence of (2) above, we may also wonder why an MPT is even needed at all. Since the output references are meaningless on the layer 1, all that is really necessary during fanout is to check that the resulting UTxO (i.e. addresses, values and datums) do indeed correspond to what's been agreed by the participants. As such, a "simple" Merkle Tree (perhaps enhanced with the accumulated size to facilitate splitting) would be sufficient in order to create and verify the split transactions.
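To make the "simple" Merkle Tree idea concrete, here is a minimal sketch of computing a root over hashed outputs and verifying an inclusion proof. All names are hypothetical, and toyHash is a non-cryptographic FNV-1a stand-in used only to keep the example self-contained; a real implementation would use a cryptographic hash and a signable representation of TxOut.

```haskell
import Data.Bits (xor)
import Data.Char (ord)
import Data.Word (Word64)

type Hash = Word64

-- | NOT cryptographic: 64-bit FNV-1a, a stand-in for a real hash function.
toyHash :: String -> Hash
toyHash = foldl step 14695981039346656037
  where
    step h c = (h `xor` fromIntegral (ord c)) * 1099511628211

-- | Hash of an inner node from its two children.
combine :: Hash -> Hash -> Hash
combine l r = toyHash (show l ++ "|" ++ show r)

-- | Root of a Merkle tree built bottom-up over the given leaf hashes.
-- An unpaired node at the end of a level is carried up unchanged.
merkleRoot :: [Hash] -> Hash
merkleRoot [] = toyHash ""
merkleRoot [h] = h
merkleRoot hs = merkleRoot (pairUp hs)
  where
    pairUp (a : b : rest) = combine a b : pairUp rest
    pairUp rest = rest

-- | One step of an inclusion proof: the sibling hash and which side it is on.
data Step = SiblingOnLeft Hash | SiblingOnRight Hash

-- | Recompute the root from a leaf and its proof, then compare.
verifyInclusion :: Hash -> Hash -> [Step] -> Bool
verifyInclusion root leaf proof = foldl climb leaf proof == root
  where
    climb h (SiblingOnLeft s) = combine s h
    climb h (SiblingOnRight s) = combine h s
```

A split fanout transaction would then carry, for each output it produces, the short list of sibling hashes needed to climb back to the root agreed at close time.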
- We discussed the possibility of adding and removing participants to and from an already established head. This would allow a head to on-board new participants, for either a short period of time, or until the end. While there's no apparent issue or concern with this (it was discussed during the writing of the paper, but omitted to avoid bloating the paper), there hasn't been any explicit use-case made for it. One possible idea would be to on-board new validating participants and make the head a bit more of an "open" network (so long as participants agree to onboard someone...).
Trying to remove the annoying messages that are printed when the thread controlling the TUI exits because it's blocked on an STM operation.
- Seems like it's thrown in the MonadSTM but the exception type is not exported which means there's no way to catch it.
- I presume this is intentional, in order to remove the temptation users could have to tamper with those exceptions but in our case this is pretty annoying. Perhaps I could wrap it in a silently call but then how about interaction with stdin/stdout?
- It's possible to add a custom event that would be handled in the main event loop as an AppEvent that will invoke the halt function to stop the TUI. However, it's not clear how to inject that custom event into the channel that distributes them as it's currently private, so there would be a need for some kind of control side channel that would stop the TUI inner loop but this is too significant a change to be dealt with right now.
In CI Build https://github.com/input-output-hk/hydra-poc/runs/4516685770?check_suite_focus=true there is something odd: it reports failed tests but those are nowhere to be seen.
The fact hydra-tui is not built along with hydra-node in the docker-compose is annoying, also there aren't any build instructions so it's not possible to build them in compose.
- Seems like demo instructions assume user pulls images and does not build them locally, going to add instructions on how to build them locally.
There is a need for another level of ETE test, one that would check the docker images are properly working. We could test through the TUI, using existing infrastructure but running the nodes as containers and the TUI in-process with several instances interacting with a cluster.
Trying to simplify TxSpec tests and see if I can extract common features to reuse in testing contracts.
- My idea about contracts would be to provide a way to build transactions and UTXO then apply the tx against given UTXO using underlying ledger-specs infra as provided by Hydra.Ledger.Cardano.
- To test contracts we could do something similar to the constraints eDSL in Plutus: Start from a blank transaction then generate a new transaction applying a sequence of arbitrary constraints to generate a tx that would or would not pass the validator. Then try to validate the transaction. Problem is the oracle: How do we know the property holds? Perhaps what we could do is to have the generator express what a valid init/commit/... transaction is? Like:
prop "check valid commit" $ \ (ACommitTx tx utxo) -> isRight $ runIdentity $ evaluateTransactionExecutionUnits pparams tx utxo epochInfo systemStart costmodels
Goal: Make the demo work.
We have a flawed logic in the observation of commits: We remove the inputs from initials that have not been observed in the commit which obviously leads to the inability to commit after having observed another commit tx.
- To test observeCommit and modify the on-chain state, we need to generate a list of initials (TxIn, PKH). But we also need to populate the list of initials :: [(TxIn, TxOut, Data)]
- The reason why we have both TxOut and Data is that we are using ledger specs where TxOut only contains the datum hash.
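As we read the description of the flaw above, the state update kept only the initials that the observed commit consumed, instead of keeping all the others. A minimal sketch of the difference, with TxIn reduced to an Int and hypothetical function names (this is not the observeCommit code):

```haskell
-- | Stand-in for a transaction input reference.
type TxIn = Int

-- | Flawed update: drops every initial NOT consumed by the observed commit,
-- so the other parties' initials disappear from the state.
flawedRemaining :: [TxIn] -> [TxIn] -> [TxIn]
flawedRemaining consumed initials = filter (`elem` consumed) initials

-- | Intended update: drops only the initials consumed by the observed commit,
-- leaving the others available for later commits.
fixedRemaining :: [TxIn] -> [TxIn] -> [TxIn]
fixedRemaining consumed initials = filter (`notElem` consumed) initials
```

With initials [1, 2, 3] and a commit consuming input 2, the flawed version leaves [2] while the intended one leaves [1, 3].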
Going to fix the TUI's commits. The problem is that we cannot build transactions when the head is open as the UTXO committed are now using full cardano tx hence we need to identify them according to our own addresses. We also need matching signing key to be able to sign the tx
While refactoring TUI we are bitten by the problem that Party only contains the multisig key and not the cardano key, which makes it impossible to use it to identify addresses to use in the TUI
- Workaround is to infer the list of addresses to send money to in the TUI from the list of existing UTXO
Some little bugs remaining in the TUI:
- List of UTXO displays duplicate addresses which messes up with navigation
- When user has no UTXO to send, she can still go to the recipient list but this crashes afterwards => Won't fix for now
We were able to complete the journey through the TUI, observing the fanout transaction in the cardano-node 🎉 :
df44aeb02bb740c86b3745b604bf3eccbb77f13d3493532df8fda38eea95de3a 0 2000000 lovelace + TxOutDatumHash ScriptDataInAlonzoEra "67d8ed01e13f33438ea9059ac9be2e159f943cffe054283485e0300271e3e9f9"
df44aeb02bb740c86b3745b604bf3eccbb77f13d3493532df8fda38eea95de3a 1 100000000 lovelace + TxOutDatumNone
df44aeb02bb740c86b3745b604bf3eccbb77f13d3493532df8fda38eea95de3a 2 100000000 lovelace + TxOutDatumNone
df44aeb02bb740c86b3745b604bf3eccbb77f13d3493532df8fda38eea95de3a 3 10000000 lovelace + TxOutDatumNone
df44aeb02bb740c86b3745b604bf3eccbb77f13d3493532df8fda38eea95de3a 4 90000000 lovelace + TxOutDatumNone
Also, we were able to open another Head after having closed the first one, and have one party not committing anything which is fine 🍾 .
- A problem with our current scheme is that a party which commits nothing or which has consumed all its UTXO won't be listed in the recipients list.
[1 of 7] Compiling CardanoClusterFixture ( src/CardanoClusterFixture.hs, dist/build/CardanoClusterFixture.o, dist/build/CardanoClusterFixture.dyn_o )
src/CardanoClusterFixture.hs:14:15: error:
• Exception when trying to run compile-time code:
/tmp/nix-build-local-cluster-lib-local-cluster-0.1.0.drv-0/hydra-poc-root-local-cluster-lib-local-cluster-root/local-cluster/config: getDirectoryContents:openDirStream: does not exist (No such file or directory)
Code: makeRelativeToProject "config" >>= embedDir
• In the untyped splice:
$(makeRelativeToProject "config" >>= embedDir)
|
14 | configFiles = $(makeRelativeToProject "config" >>= embedDir)
Seems like file-embed does not work correctly inside nix build :sad:
- Using cabal API to package and extract extra data files across packages works just fine, no need to use file-embed module: https://cabal.readthedocs.io/en/3.4/cabal-package.html#accessing-data-files-from-package-code
Still failing to build inside nix:
Setup: filepath wildcard 'config/*.json' refers to the directory 'config',
which does not exist or is not a directory.
- => We probably want to list all files explicitly
- We need to regenerate materialisation when changing a data file (or any content) of a local package
Docker compose build and run is working fine now, needed to:
- Update permissions when running prepare-devnet.sh so that files have 0400 perms
- rebuild hydra-tui properly
Cool thing is that running hydra-tui works fine from the docker-compose using just
docker-compose --profile tui run hydra-tui-alice
Injecting UTXO(s) into demo cardano-node so that TUI user can post transaction and commit
Trying to simplify key juggling code between crypto keys and cardano-api keys
- I am hitting a small snag with the hashKey function which is used by the Tx module to pack in the initial datum, trying to find a hashing function that works with API types
Got a first working version of an exe injecting seed payment for one address, but got a submission error when trying to run it, might be an issue with version of nodes
- Rebuilding docker containers...
- Program can inject a single UTXO + seed payment to be used in the network:
$ cabal run seed-network -- --cardano-node-socket demo/devnet/ipc/node.socket --cardano-signing-key demo/devnet/credentials/alice.sk
Querying node for Protocol Parameters at demo/devnet/ipc/node.socket
Posting seed payment transaction at demo/devnet/ipc/node.socket, amount: Lovelace 100000000, key: demo/devnet/credentials/alice.sk
UTXO for address ShelleyAddress Testnet (KeyHashObj (KeyHash "f8a68cd18e59a6ace848155a0e967af64f4d00cf8acee8adc95a6b0d")) StakeRefNull
{
  "223de11cbda4126bae963c1d653e7c4711554011bcd807ec3eea8bf958199fa7#0": {
    "address": "addr_test1vru2drx33ev6dt8gfq245r5k0tmy7ngqe79va69de9dxkrg09c7d3",
    "value": {
      "lovelace": 100000000
    }
  },
  "223de11cbda4126bae963c1d653e7c4711554011bcd807ec3eea8bf958199fa7#1": {
    "address": "addr_test1vru2drx33ev6dt8gfq245r5k0tmy7ngqe79va69de9dxkrg09c7d3",
    "datumhash": "a654fb60d21c1fed48db2c320aa6df9737ec0204c0ba53b9b94a09fb40e757f3",
    "value": {
      "lovelace": 899899828691
    }
  }
}
Fixed generators for WalletSpec so that we have 100% Right coverage. Trying to get code coverage information to understand what we are testing really
Somewhat correct invocation for code coverage with hpc, generating HTML files but failing to generate an index:
hpc markup \
'--destdir=/home/curry/hydra-poc/dist-newstyle/build/x86_64-linux/ghc-8.10.7/hydra-node-0.1.0.0/t/hydra-node/hpc/vanilla/html/hydra-node' \
--hpcdir=/home/curry/hydra-poc/dist-newstyle/build/x86_64-linux/ghc-8.10.7/hydra-node-0.1.0.0/t/hydra-node/hpc/vanilla/mix/hydra-node \
--hpcdir=/home/curry/hydra-poc/dist-newstyle/build/x86_64-linux/ghc-8.10.7/hydra-node-0.1.0/hpc/vanilla/mix/hydra-node-0.1.0/ \
--hpcdir=/home/curry/hydra-poc/dist-newstyle/build/x86_64-linux/ghc-8.10.7/hydra-node-0.1.0.0/hpc/vanilla/mix/hydra-node-0.1.0.0 \
--srcdir hydra-node \
./dist-newstyle/build/x86_64-linux/ghc-8.10.7/hydra-node-0.1.0/t/tests/hpc/vanilla/tix/tests/tests.tix
Discussing how to retrieve the UTXO from the cardano node, whether or not to put it in existing Client in TUI.
- It makes sense to separate responsibilities between component talking to Hydra and one talking to the node, even though in the end, from the perspective of the TUI, it's a single entry point
Got stuck once more in issues with various types of UTXO being available:
- queryUtxo returns a cardano-api UTxO
- we need a Hydra Utxo
We need to filter the UTXO used for payment with the markedDatum
- We manage to see the Head open in the TUI, with the right commits available
Troubleshooting the issue on CI with TUISpec, namely that End-to-end tests fail, looks related to the fact the fd used to write is not a vty: https://stackoverflow.com/questions/1605195/inappropriate-ioctl-for-device
- Starting to get the demo running: it's IMPORTANT to have the devnet re-created as the nodes do not sync back in time.
- Host-mounting the node.socket allows for some convenient cardano-cli querying
- hydra-node crashes when the TUI selects a randomly generated utxo to commit with "CannotSpendInput" -> expected
- By exposing the IO Vty initializer, we can hook into the Vty interface and re-direct the Output into a normal file
- A generic BrickTest handle / with pattern emerges
- Realized that the update function does write continuously into the file and it contains "multiple" screens
- try to seek backward on each update
- this messes with the terminal
- Theory why splitting individual frames is not possible: only changes are drawn to the Fd?
- There is a Mock output in vty: https://hackage.haskell.org/package/vty-5.33/docs/Graphics-Vty-Output-Mock.html
- Maybe outputPicture can be used instead? If used with a "fresh" displayContext, this shows the picture for real instead of forwarding it to the Fd
- Using a custom Output I can hook into 'outputByteBuffer' and redirect that? This seems to also allow providing an 'assumedStateRef' which could be cleared outside to force a "full re-render"?
- By writing into an IORef, which is cleared before each call to outputPicture we can keep a single frame!
- If we use the real display context, it draws correctly using `writeMoveCursor` but this output is harder to reason about programmatically
- Maybe I could drop the escape codes for expectations and keep them for displaying?
- Adding the shouldRender function was easy now
- `threadDelay`s are necessary right now -> ugly .. double buffering where the `getPicture` uses a `(T)MVar` to block on the next frame could help
- it outputs the frame which does not have the expected bytes -> nice!
- Also, I do get a BlockedIndefinitely exception on failing tests.. should be okay
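The single-frame idea above (write into an `IORef` that is cleared before each render) can be sketched with base only. This is a minimal illustration, not the actual test harness: `FrameBuffer`, `writeChunk` and `renderFrame` are hypothetical names, output is modelled as a `String`, and the vty `Output` hooks themselves are not modelled.

```haskell
import Data.IORef (IORef, modifyIORef', newIORef, readIORef, writeIORef)

-- | Hypothetical stand-in for the buffer a custom output would write into.
newtype FrameBuffer = FrameBuffer (IORef String)

newFrameBuffer :: IO FrameBuffer
newFrameBuffer = FrameBuffer <$> newIORef ""

-- | What the output's write hook would do: append the rendered chunk.
writeChunk :: FrameBuffer -> String -> IO ()
writeChunk (FrameBuffer ref) chunk = modifyIORef' ref (<> chunk)

-- | Clear the buffer, run the render action, and return only what that
-- single render produced.
renderFrame :: FrameBuffer -> IO () -> IO String
renderFrame (FrameBuffer ref) render = do
  writeIORef ref ""
  render
  readIORef ref
```

Because the buffer is reset at the start of every `renderFrame`, an expectation only ever sees the bytes of the latest frame, whatever was drawn before.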
- We want to add hydra & cardano node to the TUI tests using `withBFTNode` and `withHydraNode`
- rewriting the tests was fine
- but fails now as the `withBFTNode` from `local-cluster` can't find the hard-coded fixtures in `config/..`
- We see two options now:
  - generate everything similar to how `cardano-testnet` is doing it
  - embed our hard-coded local-cluster files when compiling the `local-cluster` library -> we go for this as it's more in line with what we have right now
- Writing config JSON files which had been copied before from memory
- Cardano keys are a bit more involved. We had been pointing the `withBFTNode` to the actual file paths instead of copying them to a temporary directory, so we tackle this and need to change quite some signatures of `keysFor` or `signingKeyPathFor`
- After rebasing the TUI test work, we also need to distribute `initialFunds` and make a "seed payment" using `mkSeedPayment`
-> Success, all tests but the expected failure pass!
Goal: Fix all tests implementing fee coverage using "marked" UTXO
Still stuck in fixing `DirectSpec` test, probably because there's a race condition while we are waiting for the payment utxo to appear
- Inject marker UTXO
- Throwing an exception when we cannot cover fee
- `DirectSpec` test is passing but `WalletSpec` properties never cover the `Right` side:
coverFee balances transaction with fees
+++ OK, passed 100 tests (100% Left).
transaction's inputs are removed from wallet
+++ OK, passed 100 tests (100% ErrNoPaymentUtxoFound).
Struggling to write a correct seed payment transaction generator for use in Integration and ETE tests
- There is a mismatch between the config we generate as part of the CardanoNode setup and the existing initial funds: In one case we use 900000 ADA and in the other case 900 ADA. As they use the same address, when the `mkGenesisTx` function runs it retrieves one or the other.
- I would like to start with empty `initialFunds` and then fill them up as we need when we start the cluster
Paying InitTx succeeds but posting all subsequent transactions in `DirectChainSpec` fails, probably because they are not waiting for payment to appear
- Retrying blindly without timeout does not work, of course, so need to add a timeout to all retries
- Problem now is that `generatePaymentToCommit` is probably consuming the `markerDatum` without recreating it so it disappears
Forgot to retry some `postTx` calls, the ones that are supposed to fail
- I can confirm the "marked" UTXO is consumed by the `generatePaymentToCommit`.
ETE tests still failing because we don't have a seed transaction posted so there's no payment utxo available => seeding UTXO
- It's possible there is a race condition between the time the node sees the commits and the time the wallet takes into account the payment utxo to cover the collectCom tx. It's Bob who is trying to post the collectCom tx and is failing to cover its fees. I can see this in Bob's log:
{ "message": { "directChain": { "contents": { "after": { "d0e48424eed4e798aac21e0caae434aa3fbb2fafa4dd62f40f568c6a7c895bdb#0": { "address": "601052386136b347f3bb7c67fe3f2ee4ef120e1836e5d2707bb068afa6", "datahash": null, "value": { "policies": { }, "lovelace": 97834279 } }, "876818297ef126d372d05572ddeb3a4dd971d72eb6062a119987bc20ab6212c5#1": { "address": "601052386136b347f3bb7c67fe3f2ee4ef120e1836e5d2707bb068afa6", "datahash": "a654fb60d21c1fed48db2c320aa6df9737ec0204c0ba53b9b94a09fb40e757f3", "value": { "policies": { }, "lovelace": 899896530691 } } }, "before": { "d0e48424eed4e798aac21e0caae434aa3fbb2fafa4dd62f40f568c6a7c895bdb#0": { "address": "601052386136b347f3bb7c67fe3f2ee4ef120e1836e5d2707bb068afa6", "datahash": null, "value": { "policies": { }, "lovelace": 97834279 } } }, "tag": "ApplyBlock" }, "tag": "Wallet" }, "tag": "DirectChain" }, "timestamp": "2021-12-07T12:14:49.531989295Z", "threadId": 20, "namespace": "HydraNode-2" }
which happens "after" it tries to post collectcom, it's perfectly possible it's missing the payment txout
- Should the node try again to post it if it fails, or should this be handled in the `postTx` definition in the `Direct` module? => It's reasonable to expect various race conditions and the need to retry posting given some conditions are not yet met but could be met in the future.
Added a timeout around `finalizeTx` in `Direct` and reinstated `retry` in the wallet so that we can wait for payment utxo to appear.
- This is actually a case where the client could be interested in the error reported and do something about it, eg. send money to the Node's "wallet" to pay for Head SM
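The "retry, but bounded by a timeout" approach described above can be sketched as follows. This is a generic illustration in plain `IO` using `System.Timeout.timeout` from base; it is not the wallet code (which presumably relies on STM `retry`), and `retryUntilRight` / `retryWithTimeout` are hypothetical names.

```haskell
import Control.Concurrent (threadDelay)
import System.Timeout (timeout)

-- | Retry an action until it yields a 'Right', pausing between attempts.
-- Loops forever on its own; callers are expected to bound it with a timeout.
retryUntilRight :: Int -> IO (Either e a) -> IO a
retryUntilRight delayMicros action = go
 where
  go = action >>= either (const (threadDelay delayMicros >> go)) pure

-- | Bound the whole retry loop; 'Nothing' means the deadline passed
-- before the action succeeded.
retryWithTimeout :: Int -> Int -> IO (Either e a) -> IO (Maybe a)
retryWithTimeout timeoutMicros delayMicros =
  timeout timeoutMicros . retryUntilRight delayMicros
```

A `Nothing` result is the point where the error could be surfaced to the client, for example so that it can fund the node's wallet.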
`LocalClusterSpec` fails because there's no `initialFunds` as I removed it from the `genesis-shelley.json` file => We want to put them back, and overwrite them in tests.
- Start pairing with realization that the `MockHead` is already checking things:
  - it asserts the "newValue" is the same as "oldValue", i.e. nothing is added
  - but of course we are adding the collected value from the commits
- So we start by passing off-chain knowledge to the Head (SM) validator via the redeemer (SM Input)
- NOTE: This might not be a good idea and instead we should look at the script context / all commit (PT) inputs
- We improve error printing on tx submission failures of `Chain.Direct`
- Now the close fails because it does not preserve value and we pass in a `TxOut` of the head utxo
Goal: Have ETE and Benchmarks pass
The errors we are seeing from test execution are painful, so we want to improve their formatting:
- Formatting the submission error is a bit annoying as it requires peeling several layers of stacked errors
- The plutus error is already formatted so it would be nice to print it directly rather than `show`ing it
- Is there not already a way to PPrint ledger/node errors?
The error in the CollectCom comes from the SM:
- It checks the value stored in the SM UTXO is preserved between transitions, and this is not the case currently, hence the errors reported on script execution
- We collect the total value in the `CollectCom` redeemer and use that to update the destination state's value
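The value check described above can be illustrated with a small pure model. This is only a sketch under simplifying assumptions: values are plain lovelace integers, and the `Input` type and function names are hypothetical, not the actual state machine validator.

```haskell
-- | Lovelace amounts, simplified to plain integers for this sketch.
type Lovelace = Integer

-- | Hypothetical state machine inputs; only 'CollectCom' carries a value.
data Input
  = CollectCom Lovelace -- total value collected from the commits
  | Close
  deriving (Eq, Show)

-- | Value the destination state is expected to hold after a transition.
expectedValue :: Lovelace -> Input -> Lovelace
expectedValue old (CollectCom collected) = old + collected
expectedValue old Close = old

-- | The check: the new state's value must match the expected one.
preservesValue :: Lovelace -> Input -> Lovelace -> Bool
preservesValue old input new = new == expectedValue old input
```

With a head output of 2000000 lovelace and 10000000 lovelace collected from commits, only a destination value of 12000000 passes the check; leaving the value unchanged is rejected.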
SM validator now fails on the close tx, for the same reason, eg. missing values
Then `fanoutTx` also fails, for the same reason, but now we must ensure the `Final` state is really final so that the SM logic checks the destination state value is 0 and there's no additional output for the SM.
Commit is failing in the benchmarks run with not enough fees => Trying to fix the wallet logic to remove the input txin from selection logic, but this does not work
Unit tests are failing:
- Struggling with the ledger/api discrepancies to fix unit tests
- We are missing the datum to pass to the `collectComTx` function so we add them, but now this breaks some tests which require the UTXO, but not the datum, and we are stuck in a maze of mapping and transforming back and forth between ledger and api
We are having an error in DirectSpec which fails to post the init tx: It seems we are retrying when posting the initTx but catching all errors, which is sledgehammerish => Adding a proper exception instead
- InitTx submission blocks because we changed the way inputs are selected in the `finalizeTx` and in the wallet we `retry` when there's no `availableUtxo`; removing the `retry` reveals the error
Idea:
- We need to have a distinguished address or utxo we carry around to pay for the fees. Could we use a datum for that?
- Other option: Simply call cardano-cli to do the balancing of a tx
Trying to make the ETE fail by having both Alice and Bob committing UTXO. This should fail according to my theory that we are consuming the wrong UTXO in the collectCom tx
ETE test now fails for the same reason the benchmark fails:
CannotSpendInput
{ input = ("9546383daca50c0c643abca09331c5e58cfef49fa899eb8d15bfb2347ba1b001", 1)
, walletUtxo =
fromList
[ (TxIn "8d383a29a211578298143ab26b3b2e1c4406abe5d7a905c49b234fdccf2627c8" (TxIx 1), TxOut (AddressInEra (ShelleyAddressInEra ShelleyBasedEraAlonzo) (ShelleyAddress Testnet (KeyHashObj (KeyHash "3aaa2e3de913b0f5aa7e7f076e122d737db5329df1aa905192284fea")) StakeRefNull)) (TxOutValue MultiAssetInAlonzoEra (valueFromList [(AdaAssetId, 899996702000)])) TxOutDatumNone)
]
, headUtxo =
fromList
[ (TxIn "bd71f91ba872c4d79f45163dae877c1644a22527df47e1c49c25657acd10603c" (TxIx 0), TxOut (AddressInEra (ShelleyAddressInEra ShelleyBasedEraAlonzo) (ShelleyAddress Testnet (ScriptHashObj (ScriptHash "07204aff6885e57d799ee359f7e401d64ff3b54e0de7c4edc3f163d9")) StakeRefNull)) (TxOutValue MultiAssetInAlonzoEra (valueFromList [(AdaAssetId, 2000000)])) (TxOutDatumHash ScriptDataInAlonzoEra "c2f7589a052854c8877e74b7ec3de892981766ef819fc03bc8c893daf66dd72e"))
, (TxIn "bd71f91ba872c4d79f45163dae877c1644a22527df47e1c49c25657acd10603c" (TxIx 1), TxOut (AddressInEra (ShelleyAddressInEra ShelleyBasedEraAlonzo) (ShelleyAddress Testnet (ScriptHashObj (ScriptHash "12c4f8ff8070f0d659bdb2ecf844190ddb1134ef821764bd2b4649b2")) StakeRefNull)) (TxOutValue MultiAssetInAlonzoEra (valueFromList [(AdaAssetId, 2000000)])) (TxOutDatumHash ScriptDataInAlonzoEra "2502ff9c9c341dd1384724ae35eab0b19e394c90226892fcc8e7cc86342d324e"))
, (TxIn "bd71f91ba872c4d79f45163dae877c1644a22527df47e1c49c25657acd10603c" (TxIx 2), TxOut (AddressInEra (ShelleyAddressInEra ShelleyBasedEraAlonzo) (ShelleyAddress Testnet (ScriptHashObj (ScriptHash "12c4f8ff8070f0d659bdb2ecf844190ddb1134ef821764bd2b4649b2")) StakeRefNull)) (TxOutValue MultiAssetInAlonzoEra (valueFromList [(AdaAssetId, 2000000)])) (TxOutDatumHash ScriptDataInAlonzoEra "3b5e4228faf69ddf21fb84990b54d806c2b1234a250e0f4c54cc953257ff57ac"))
, (TxIn "bd71f91ba872c4d79f45163dae877c1644a22527df47e1c49c25657acd10603c" (TxIx 3), TxOut (AddressInEra (ShelleyAddressInEra ShelleyBasedEraAlonzo) (ShelleyAddress Testnet (ScriptHashObj (ScriptHash "12c4f8ff8070f0d659bdb2ecf844190ddb1134ef821764bd2b4649b2")) StakeRefNull)) (TxOutValue MultiAssetInAlonzoEra (valueFromList [(AdaAssetId, 2000000)])) (TxOutDatumHash ScriptDataInAlonzoEra "59b19510ad6d701df3df01804374a9a5126b265b779804d739e78721ebd3872c"))
]
}
This error is returned by the `Wallet` when it tries to `resolveInput`s. Adding more info about all the inputs of the transaction
- I don't see a way to have a proper `CollectComTx` other than collecting the outputs of the commit transactions and keeping them around in the Direct chain state, or the head state, to pass later on when submitting the collect com tx. Now trying to retrieve the commits from observing the UTXO, we need to return the `ν_commit` UTXO from the observed commit txs.
- Trying to fix the value produced in the commit output
Once again lost in the maze of types between ledger and API...
Why don't we simply pass a `Utxo` to the `commitTx` function instead of passing a `Maybe (x,y)` tuple?
- This tuple is what is selected by the user, which should be a proper `Utxo` containing a single input.
Observation tests now pass but the test for transaction size fails
The problem in the TX size comes from the generated `Value` which is huge. This is also what we observe in the CI and this comes from our use of arbitrary Txs from the Ledger. We have something in the `WalletSpec` already for trimming down the values to something more palatable, perhaps using `ReasonablySized`?
Struggled quite a bit but test checking we observe collectCom properly is failing because we do not use the commit outputs.
Note:
I really think the `observeXXX` functions should work with `OnChainHeadState` as:
- What is relevant for observation depends on the state
- The state can be modified by the observed TX, like the initials being produced/consumed by various transactions
Managed to have the CollectCom transaction consume the commits UTXO and not the committed ones, now checking what happens in ETE test
- ETE and DirectChainSpec tests still failing
I think I know what happens: The `Wallet` tries to resolve an input corresponding to the commit UTXO but it does not have it in its UTXO set because it's a UTXO paid to a script address and not to the Wallet's owner address so we don't track it.
- But what happens for Head script UTXO?
- We pass more UTXO to resolve when we call `coverFee` => Trying to add the accumulated `commits` in `OnChainHeadState` to the cover fee function
We are now observing a
WrappedShelleyEraFailure
( MissingScriptWitnessesUTXOW
( fromList
[ ScriptHash "6679d3c92844becb16c55161b60111336e1ba2f3d14bbb52b051c4db"
]
)
)
which probably means we don't put the script for consuming the commit outputs into the collect com tx. Also putting the redeemers.
Transaction now fails because of scripts execution:
HardForkApplyTxErrFromEra S (S (S (S (Z (WrapApplyTxErr {unwrapApplyTxErr = ApplyTxError [
UtxowFailure (
MissingRequiredDatums (fromList [SafeHash "549485dcc8131ab64122a9163080943f55b83ae368bc55bec73e583f192f3080"]) (fromList [SafeHash "8392f0c940435c06888f9bdb8c74a95dc69f156367d6a089cf008ae05caae01e",SafeHash "f4b9d64e4725efc05d7d078bb19e952b288c8403f5a585a8a6ffe589a9851614"])),UtxowFailure (WrappedShelleyEraFailure (UtxoFailure (UtxosFailure (ValidationTagMismatch (IsValid True)
(FailedUnexpectedly [
PlutusFailure "
The 3 arg plutus script (PlutusScript PlutusV1 ScriptHash \"07204aff6885e57d799ee359f7e401d64ff3b54e0de7c4edc3f163d9\") fails.
CekError An error has occurred: User error:
The provided Plutus code called 'error'.
The data is: Constr 0 [Constr 0 [I 100000000000000],List [I 10]]
The redeemer is: Constr 0 []
The context is:
Purpose: Spending (TxOutRef {txOutRefId = bc1eae5aa8e72d3f9e5c0cd725b49dc47954570b6087eb16b5cc3f4ce6daf4ea, txOutRefIdx = 0})
TxInfo:
TxId: 91b8eae01ad75d635d8e925925195da468bc700c9a20286d2efdfd7957b3d3a8
Inputs: [ a24d818e7fc823c61416a095f98b139fc8c520b9ee5365791245f8d9ec7efc6b!0 -> - Value (Map [(,Map [(\"\",3000000)])]) addressed to
ScriptCredential: 6679d3c92844becb16c55161b60111336e1ba2f3d14bbb52b051c4db (no staking credential)
,a24d818e7fc823c61416a095f98b139fc8c520b9ee5365791245f8d9ec7efc6b!1 -> - Value (Map [(,Map [(\"\",899989536279)])]) addressed to
PubKeyCredential: f8a68cd18e59a6ace848155a0e967af64f4d00cf8acee8adc95a6b0d (no staking credential)
, bc1eae5aa8e72d3f9e5c0cd725b49dc47954570b6087eb16b5cc3f4ce6daf4ea!0 -> - Value (Map [(,Map [(\"\",2000000)])]) addressed to
ScriptCredential: 07204aff6885e57d799ee359f7e401d64ff3b54e0de7c4edc3f163d9 (no staking credential)]
Outputs: [ - Value (Map [(,Map [(\"\",5000000)])]) addressed to
ScriptCredential: 07204aff6885e57d799ee359f7e401d64ff3b54e0de7c4edc3f163d9 (no staking credential)
, - Value (Map [(,Map [(\"\",899986238279)])]) addressed to
PubKeyCredential: f8a68cd18e59a6ace848155a0e967af64f4d00cf8acee8adc95a6b0d (no staking credential) ]
Fee: Value (Map [(,Map [(\"\",3298000)])])
Value minted: Value (Map [])
DCerts: []
Wdrl:[]
Valid range: (-\8734 , +\8734)
Signatories: []
Datums: [ ( 8392f0c940435c06888f9bdb8c74a95dc69f156367d6a089cf008ae05caae01e
Seems like it's missing the datum witnesses for the commit outputs.
- => Adding commit datums
So I don't have the missing datum error anymore, only a script failure
- The datum types are odd in the error, need to dump the transaction to see what's going on
It seems it's the head script which is failing:
The data is: Constr 0 [Constr 0 [I 10000000000000],List [I 10,I 20,I 30]]\nThe redeemer is: Constr 0 []
It's clear the datums are there:
HardForkApplyTxErrFromEra S (S (S (S (Z (WrapApplyTxErr {unwrapApplyTxErr = ApplyTxError [UtxowFailure (WrappedShelleyEraFailure (UtxoFailure (UtxosFailure (ValidationTagMismatch (IsValid True) (FailedUnexpectedly [PlutusFailure "
The 3 arg plutus script (PlutusScript PlutusV1 ScriptHash \"07204aff6885e57d799ee359f7e401d64ff3b54e0de7c4edc3f163d9\") fails.
CekError An error has occurred: User error:
The provided Plutus code called 'error'.
The data is: Constr 0 [Constr 0 [I 10000000000000],List [I 10,I 20,I 30]]
The redeemer is: Constr 0 []
The context is:
Purpose: Spending (TxOutRef {txOutRefId = bd71f91ba872c4d79f45163dae877c1644a22527df47e1c49c25657acd10603c, txOutRefIdx = 0})
TxInfo:
TxId: 35b789b90d675222ca720ec0edb167d3c80d8ea2505327e5a8a6154de39c8ef7
Inputs: [ 0baa47ee668c4a9daf984ca29d2ada80224ea74ecae114f2b49c923252bd612f!0 -> - Value (Map [(,Map [(\"\",4000000)])]) addressed to
ScriptCredential: 6679d3c92844becb16c55161b60111336e1ba2f3d14bbb52b051c4db (no staking credential)
, 0baa47ee668c4a9daf984ca29d2ada80224ea74ecae114f2b49c923252bd612f!1 -> - Value (Map [(,Map [(\"\",899984536279)])]) addressed to
PubKeyCredential: f8a68cd18e59a6ace848155a0e967af64f4d00cf8acee8adc95a6b0d (no staking credential)
, 8d383a29a211578298143ab26b3b2e1c4406abe5d7a905c49b234fdccf2627c8!0 -> - Value (Map [(,Map [(\"\",2000000)])])addressed to
ScriptCredential: 6679d3c92844becb16c55161b60111336e1ba2f3d14bbb52b051c4db (no staking credential)
, 97469573293f61bf761da1adc77c1ad207e4c65249ee78f18c336db0a137a7a8!0 -> - Value (Map [(,Map [(\"\",4000000)])]) addressed to
ScriptCredential: 6679d3c92844becb16c55161b60111336e1ba2f3d14bbb52b051c4db (no staking credential)
, bd71f91ba872c4d79f45163dae877c1644a22527df47e1c49c25657acd10603c!0 -> - Value (Map [(,Map [(\"\",2000000)])]) addressed to
ScriptCredential: 07204aff6885e57d799ee359f7e401d64ff3b54e0de7c4edc3f163d9 (no staking credential) ]
Outputs: [ - Value (Map [(,Map [(\"\",12000000)])]) addressed to
ScriptCredential: 07204aff6885e57d799ee359f7e401d64ff3b54e0de7c4edc3f163d9 (no staking credential)
, - Value (Map [(,Map [(\"\",899981238279)])]) addressed to
PubKeyCredential: f8a68cd18e59a6ace848155a0e967af64f4d00cf8acee8adc95a6b0d (no staking credential) ]
Fee: Value (Map [(,Map [(\"\",3298000)])])
Value minted: Value (Map [])
DCerts: []
Wdrl: []
Valid range: (-\8734 , +\8734)
Signatories: []
Datums: [ ( 8392f0c940435c06888f9bdb8c74a95dc69f156367d6a089cf008ae05caae01e
, <> )
, ( 9352b132cb8dcedbc4d1115321a357d32b538aa1ba57c4c958ee6ebae8f5d50c
, <10,
\"{\\\"9546383daca50c0c643abca09331c5e58cfef49fa899eb8d15bfb2347ba1b001#1\\\":{\\\"address\\\":\\\"addr_test1vru2drx33ev6dt8gfq245r5k0tmy7ngqe79va69de9dxkrg09c7d3\\\",\\\"value\\\":{\\\"lovelace\\\":2000000}}}\"> )
, ( af586b80c5243d28f4c1e9d1984236d2934fecb6d3d0a3b7e50ba94f446c150f
, <30, \"{}\">)
, ( c2f7589a052854c8877e74b7ec3de892981766ef819fc03bc8c893daf66dd72e
, <<10000000000000>, [10, 20, 30]> )
, ( f5bcf944acb09ae13fcdec6517ad3ee23c03de7c6ac779dd4304ebfd38faeb44
, <20,
\"{\\\"ac6e8d41c8e11d7883b1a5f5b025494cde06c7cecd67d10d06d7627374cf81af#1\\\":{\\\"address\\\":\\\"addr_test1vqg9ywrpx6e50uam03nlu0ewunh3yrscxmjayurmkp52lfskgkq5k\\\",\\\"value\\\":{\\\"lovelace\\\":2000000}}}\"> ) ]
"
It's definitely the head script (with hash `07204aff6885e57d799ee359f7e401d64ff3b54e0de7c4edc3f163d9`) that's failing as it's the only output beside change to the Tx.
- Amount is correct, it equals the sum of inputs + 2 ADAs
- I can see the 2 committed UTXO from Alice and Bob.
- Trying to have the mockhead script always succeed does not help
Trying to avoid my LSP displaying "ghost imports" which are a PITA to navigate the file. Also removing the annoying popups.
- It's `lsp-lens-mode` which is enabled by default in lsp-mode. Adding `(use-package lsp-mode :custom (lsp-lens-enable nil))` per https://emacs-lsp.github.io/lsp-mode/page/settings/lens/
Got the following error when running benchmark:
hydra-node: failed to submit tx: HardForkApplyTxErrFromEra S (S (S (S (Z (WrapApplyTxErr {unwrapApplyTxErr = ApplyTxError [UtxowFailure (WrappedShelleyEraFailure (UtxoFailure (ValueNotConservedUTxO (Value 2000000 (fromList [])) (Value 1493272799 (fromList []))))),UtxowFailure (WrappedShelleyEraFailure (UtxoFailure
(BadInputsUTxO (fromList [TxInCompact (TxId {_unTxId = SafeHash "26ecb3a06c1e32f63742f1f7836c42dc86184fd71e32988d0b4099382cf009d1"}) 0])))),UtxowFailure (WrappedShelleyEraFailure (UtxoFailure NoCollateralInputs)),UtxowFailure (WrappedShelleyEraFailure (UtxoFailure (InsufficientCollateral (Coin 0) (Coin 4947000))))]
})))))
Another (different) error:
hydra-node: failed to cover fee for transaction: ErrNotEnoughFunds {missingDelta = Coin 3589920368}, ValidatedTx {body = TxBodyConstr TxBodyRaw {_inputs = fromList [TxInCompact (TxId {_unTxId = SafeHash "852d11d73776b64a9416bbef7811cca03485a01691abc37751dcb866b1353a29"}) 0], _collateral = fromList [], _outputs = St
rictSeq {fromStrict = fromList [(Addr Testnet (ScriptHashObj (ScriptHash "07204aff6885e57d799ee359f7e401d64ff3b54e0de7c4edc3f163d9")) StakeRefNull,Value 2000000 (fromList []),SJust (SafeHash "67d8ed01e13f33438ea9059ac9be2e159f943cffe054283485e0300271e3e9f9")),(Addr Testnet (KeyHashObj (KeyHash "16601980e4ae7eb11e87
180d154cc44a9b24105a6ee7c592ca66329c")) StakeRefNull,Value 3604780403 (fromList []),SNothing),(Addr Testnet (KeyHashObj (KeyHash "542a8a32c2a56fc6081e784a2a0527803015922309eee9ff051f629e")) StakeRefNull,Value 2541227916 (fromList []),SNothing),(Addr Testnet (KeyHashObj (KeyHash "529b55087caf60a7251c68b38f480de1f3ad
d14561322447f25fcf20")) StakeRefNull,Value 1042096452 (fromList []),SNothing)]}, _certs = StrictSeq {fromStrict = fromList []}, _wdrls = Wdrl {unWdrl = fromList []}, _txfee = Coin 0, _vldt = ValidityInterval {invalidBefore = SNothing, invalidHereafter = SNothing}, _update = SNothing, _reqSignerHashes = fromList [],
_mint = Value 0 (fromList []), _scriptIntegrityHash = SNothing, _adHash = SNothing, _txnetworkid = SNothing}, wits = TxWitnessRaw {_txwitsVKey = fromList [], _txwitsBoot = fromList [], _txscripts = fromList [(ScriptHash "07204aff6885e57d799ee359f7e401d64ff3b54e0de7c4edc3f163d9",PlutusScript PlutusV1 ScriptHash "07
204aff6885e57d799ee359f7e401d64ff3b54e0de7c4edc3f163d9")], _txdats = TxDatsRaw (fromList [(SafeHash "67d8ed01e13f33438ea9059ac9be2e159f943cffe054283485e0300271e3e9f9",DataConstr Constr 3 []),(SafeHash "ff5f5c41a5884f08c6e2055d2c44d4b2548b5fc30b47efaa7d337219190886c5",DataConstr Constr 2 [])]), _txrdmrs = RedeemersR
aw (fromList [(RdmrPtr Spend 0,(DataConstr Constr 3 [],WrapExUnits {unWrapExUnits = ExUnits' {exUnitsMem' = 0, exUnitsSteps' = 0}}))])}, isValid = IsValid True, auxiliaryData = SNothing}, using head utxo: fromList [(TxInCompact (TxId {_unTxId = SafeHash "852d11d73776b64a9416bbef7811cca03485a01691abc37751dcb866b1353
a29"}) 0,(Addr Testnet (ScriptHashObj (ScriptHash "07204aff6885e57d799ee359f7e401d64ff3b54e0de7c4edc3f163d9")) StakeRefNull,Value 2000000 (fromList []),SJust (SafeHash "ff5f5c41a5884f08c6e2055d2c44d4b2548b5fc30b47efaa7d337219190886c5")))], and wallet utxo: fromList [(TxIn "852d11d73776b64a9416bbef7811cca03485a01691
abc37751dcb866b1353a29" (TxIx 1),TxOut (AddressInEra (ShelleyAddressInEra ShelleyBasedEraAlonzo) (ShelleyAddress Testnet (KeyHashObj (KeyHash "30b49f3a89bb12e567cc21f749bbad9276e214eee6ffa63257bbcf30")) StakeRefNull)) (TxOutValue MultiAssetInAlonzoEra (valueFromList [(AdaAssetId,5378452288)])) TxOutDatumNone),(TxIn
"8a927269eb6e203d189c5be935efc8c239721a58e1176ffab515bcc3ba69040d" (TxIx 1),TxOut (AddressInEra (ShelleyAddressInEra ShelleyBasedEraAlonzo) (ShelleyAddress Testnet (KeyHashObj (KeyHash "30b49f3a89bb12e567cc21f749bbad9276e214eee6ffa63257bbcf30")) StakeRefNull)) (TxOutValue MultiAssetInAlonzoEra (valueFromList [(Ada
AssetId,3601482403)])) TxOutDatumNone)]
Seems like all nodes try to post the fanout which explains the invalid UTXO we observe -> leader only should try to fanout
- Our `BehaviorSpec` test is passing which is wrong
- We need to observe the transactions posted on chain to ensure a single node posts it
- Refactoring `ConnectToChain` type to have a `history` function exposed to observe it
- Our unit test is still passing :(
We confirm there's only one `FanOutTx` posted, by the party which decided to `Close`. Trying to change the closer in ETE test shows ETE test passes consistently even when we change the closing party, so there's probably something fishy in the transactions we generate in the benchmarks
tx =
HardForkApplyTxErrFromEra
S
( S
( S
( S
( Z
( WrapApplyTxErr
{ unwrapApplyTxErr =
ApplyTxError
[ UtxowFailure (WrappedShelleyEraFailure (UtxoFailure (ValueNotConservedUTxO (Value 2000000 (fromList [])) (Value 520403697 (fromList [])))))
, UtxowFailure (WrappedShelleyEraFailure (UtxoFailure (BadInputsUTxO (fromList [TxInCompact (TxId{_unTxId = SafeHash "5866ca5080edbe814c1a5d05d505b137c8648e4f885bc35b951dfaf82c1a969b"}) 0]))))
, UtxowFailure (WrappedShelleyEraFailure (UtxoFailure NoCollateralInputs))
, UtxowFailure (WrappedShelleyEraFailure (UtxoFailure (InsufficientCollateral (Coin 0) (Coin 4947000))))
]
}
)
)
)
)
)
Seems like the wallet cannot find the input to pay/provide collateral. There's probably a race condition in the wallet whereby we keep a tx input that's been used until we observe it from an on-chain block
- We should remove the inputs as soon as we post the transaction so that they become unavailable
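A minimal sketch of the proposed fix, assuming a simplified wallet model: the UTXO set is an association list, `TxIn` is a plain string, and `markSpent` is a hypothetical helper that would be called right after submitting a transaction rather than when the next block is applied.

```haskell
-- Simplified stand-ins for the real ledger types.
type TxIn = String
type Lovelace = Integer
type WalletUtxo = [(TxIn, Lovelace)]

-- | Drop the inputs spent by a just-submitted transaction so that they
-- cannot be selected again before a block confirms the spend.
markSpent :: [TxIn] -> WalletUtxo -> WalletUtxo
markSpent spent = filter (\(txIn, _) -> txIn `notElem` spent)
```

The trade-off is that a transaction which never makes it on chain leaves the wallet believing its inputs are gone, so a real implementation would also need a way to restore them.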
Trying to add a property to `WalletSpec` checking that: Our properties for covering fees are not relevant as they end up with 100% `False` cases, e.g. the generated tx/outputs don't have enough ADA to pass the function
- Trying to change the generators to produce TxOut with enough value and messing up with api/ledger discrepancies
- Struggling to generate the right combination of UTXOs for the wallet and an arbitrary transaction to cover fees. Our calculation depends on PParams' maximum fees which are too high as we currently compute fees as an upper bound using `maxTxExUnits`
- Provide trimmed down pparams to ensure most of the transactions successfully cover fees
We observe the close fails to submit because it does not have enough funds, which makes sense given the `collectCom` tx does not properly propagate the total funds committed.
- Fixing the value in the CollectCom's output
- Benchmarks now failing consistently because of `CannotSpendInput` error, which is probably caused by some node trying to post a CollectCom transaction concurrently with another node?
Looking at the errors reported by the benchmarks, feeling they could be clearer. Having the logs dumped to stdout as Haskell `Show` instances makes them somewhat less readable and parseable.
Also thinking of a way to prove this is caused by concurrent attempts at posting the collectcom tx: There aren't any hints at this in the logs
- Trying to replace the body of transactions in the `DirectChain` log with their ids
- Turns out we also have a problem in `commitTx`:
commitTx party utxo (initialIn, pkh) =
  mkUnsignedTx body datums redeemers scripts
 where
  body =
    TxBody
      { inputs =
          Set.singleton initialIn
            <> maybe mempty (Set.singleton . toShelleyTxIn . fst) utxo
      , collateral = mempty
      , outputs =
          StrictSeq.fromList
            [ TxOut
                (scriptAddr commitScript)
                (inject $ Coin 2000000) -- TODO: Value of utxo + whatever is in initialIn
                (SJust $ hashData @Era commitDatum)
            ]
There too the values are incorrectly computed. This does not really explain why the benchmarks fail with `CannotSpendInput` though?
It seems we are consuming the wrong UTXO in the collectCom, eg. we are consuming the committed UTXOs instead of the result of the commit tx
In the `observeCommitTx` we return the committed UTXO:
observeCommitTx :: ValidatedTx Era -> Maybe (OnChainTx CardanoTx)
observeCommitTx tx@ValidatedTx{wits} = do
  txOut <- snd <$> findScriptOutput (utxoFromTx tx) commitScript
  dat <- lookupDatum wits txOut
  (party, utxo) <- fromData $ getPlutusData dat
  OnCommitTx (convertParty party) <$> convertUtxo utxo
 where
  commitScript = plutusScript MockCommit.validatorScript
  convertUtxo = Aeson.decodeStrict' . OnChain.toByteString
and the collectCom has no way to know the actual UTXO from the CommitTX itself.
- Quick intro why Rollbacks happen and which parts of the architecture are related to this
- Arnaud presents the strategy:
- Each node just re-applies the events to recover the HeadState as it was
- When re-applying, the events are not reported back to the HeadLogic (really? why not?)
- But what happens when there is e.g. an Abort when re-applying / synchronizing with the chain.
- Any unexpected "replay" is deemed an adversarial action and we would be closing / aborting the head anyways -> anything happened in the Head so far would be lost.
- We aim to expose "stability" to our users so they can decide whether they rely on the open Head.
- Discussion starts
- PAB does this replaying as well and we are in danger of "re-inventing the wheel".
- Seeing an inconsistent transaction might not necessarily be an adversarial move though. Forking chains could result in this even with all honest parties.
- What is the actual problem here?
- Running example: The txs establishing a Head are rolled back and cannot be re-applied -> the Head was never open.
- Is it only when opening the Head? But also when closing/contesting the Head?
- Simple re-submission might not be enough, it could also require the application to re-balance or even re-construct the transaction. (Three levels of reaction)
- "Whatever it takes" to re-establish the HeadState.
- "Confidence" in Head should be visible to the users and they can decide on it
- Is this an individual decision or should it be known apriori? i.e. a parameter
- Users should decide
- Other situations where rollbacks are bad:
- Contestation rollbacks!
- What happens with contestation period / validity?
- Need to adapt the timeout to the new situation / new slots?
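The "re-apply the events to recover the HeadState" strategy discussed above boils down to a fold over the observed chain events. The sketch below uses a deliberately tiny, hypothetical model (`HeadState`, `ChainEvent` and `apply` are illustrative only and far simpler than the real head logic) to show the running example: once the init transaction is rolled back, replaying the remaining events never yields an open head.

```haskell
import Data.List (foldl')

-- | Hypothetical, heavily simplified head states.
data HeadState = Idle | Initial Integer | Open Integer | Closed
  deriving (Eq, Show)

-- | Hypothetical on-chain events observed by a node.
data ChainEvent = OnInit | OnCommit Integer | OnCollectCom | OnClose
  deriving (Eq, Show)

-- | Apply one event; events that do not fit the current state are ignored
-- in this sketch (the real logic would have to react to them).
apply :: HeadState -> ChainEvent -> HeadState
apply Idle OnInit = Initial 0
apply (Initial v) (OnCommit c) = Initial (v + c)
apply (Initial v) OnCollectCom = Open v
apply (Open _) OnClose = Closed
apply s _ = s

-- | Recover the state by replaying all events from the beginning.
replay :: [ChainEvent] -> HeadState
replay = foldl' apply Idle
```

Replaying `[OnInit, OnCommit 2, OnCommit 3, OnCollectCom]` gives an open head, whereas the same list without `OnInit` stays `Idle`, which is the "the Head was never open" situation.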
We have several issues popping up following our changes in the commit/collectCom/fanout logic:
- ETE test is failing intermittently to submit the fanout tx because of insufficient funds and also missing UTXO
- Some properties are also failing on generating init/commit?
Checking the serialisation (Plutus) for `ContestationPeriod`
Trying to track where the `PT1` error comes from. It only appears in the `.pir` code but not in the PLC, however the PIR for head is not generated.
- Other option is to dump the splice with TH.
- This is a dead end, what happens is that the script execution fails because of a mismatch in redeemers, because we are missing an input
Seems like we had a collision in the generators for `TxIn` which we import from `Test.Cardano.Ledger.Shelley.Serialisation.EraIndepGenerators`
The hash generator used is based on an `Int`:
genHash :: forall a h. HashAlgorithm h => Gen (Hash.Hash h a)
genHash = mkDummyHash <$> arbitrary
mkDummyHash :: forall h a. HashAlgorithm h => Int -> Hash.Hash h a
mkDummyHash = coerce . hashWithSerialiser @h toCBOR
...
instance CC.Crypto crypto => Arbitrary (TxId crypto) where
  arbitrary = TxId <$> arbitrary
instance CC.Crypto crypto => Arbitrary (TxIn crypto) where
  arbitrary =
    TxIn
      <$> (TxId <$> arbitrary)
      <*> arbitrary
`Int` has a much smaller domain than a 32-byte BS obviously and we fell into a case where 2 TxIn were generated that led to the inputs being "merged".
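To get a feel for why the domain size matters, here is a rough birthday-bound calculation. It assumes uniform, independent draws, which is only an approximation of how a QuickCheck generator behaves, and the domain size of 201 used in the note below is an illustrative assumption for a small, size-bounded `Int` range, not a measured property of the ledger generators.

```haskell
-- | Probability that at least two of @n@ uniform draws from a domain of
-- size @d@ coincide (the birthday bound).
collisionProbability :: Int -> Integer -> Double
collisionProbability n d =
  1 - product [1 - fromIntegral k / fromIntegral d | k <- [0 .. n - 1]]
```

For five inputs drawn from 201 possible values, `collisionProbability 5 201` is a few percent, enough to show up regularly over hundreds of test cases, while for a 256-bit domain (`2 ^ 256`) it is negligible.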
How to evolve our code to handle that problem? What this means is that we must ensure the head input is unique and does not conflict with the initials input
- Function is currently not total, it fails if this requirement is not met
- We could return an `Either` or create a "smaller" input type with a smart constructor, but the latter is pretty much the same as the first.
- Trying to filter the initials to remove the head input if it's there => does not work, because it then fails to validate the scripts because of discrepancies between utxo, tx and redeemers
- If/when we move to using cardano-api, we know the `makeTransactionXX` functions can fail so we probably need to fail too.
End up returning Either
which ripples all over the codebase
- We have 59 calls to
error
in our codebase, which is not great
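A minimal sketch of the "return an Either" option discussed above, assuming simplified stand-in types. `TxIn`, `Datum`, `AbortInputs`, `InputError` and `mkAbortInputs` are hypothetical names for illustration, not the actual Hydra or ledger definitions. The smart constructor is total: it reports a head input that collides with the initial inputs instead of calling error.

```haskell
import qualified Data.Map.Strict as Map

-- Hypothetical stand-ins for the ledger types; only 'Ord' on TxIn matters here.
newtype TxIn = TxIn Int
  deriving (Eq, Ord, Show)

type Datum = String

-- Inputs of an abort transaction, meant to be built via 'mkAbortInputs' only.
data AbortInputs = AbortInputs
  { headInput :: (TxIn, Datum)
  , initialInputs :: Map.Map TxIn Datum
  }
  deriving (Eq, Show)

newtype InputError = HeadInputAmongInitials TxIn
  deriving (Eq, Show)

-- Total smart constructor: rejects a head input that is also an initial input.
mkAbortInputs ::
  (TxIn, Datum) ->
  Map.Map TxIn Datum ->
  Either InputError AbortInputs
mkAbortInputs hd@(txIn, _) initials
  | txIn `Map.member` initials = Left (HeadInputAmongInitials txIn)
  | otherwise = Right (AbortInputs hd initials)
```

Keeping the initials in a Map already rules out collisions among them, so only the head input needs an explicit check.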
We are observing collisions in the list of initials now:
- Passing a Map TxIn Data as initialInputs to the abortTx function makes it more explicit we don't want collisions there
- Start refactoring of using cardano-api in Hydra.Chain.Direct{.Tx, .Wallet}
- Start from the "inside out" by creating Api.Tx and converting it to Ledger.ValidatedTx on demand to ensure we can create the transaction drafts / pass them to the Wallet as we do right now.
- When creating a plutusScript pendant to the existing one to get a cardano-api script (to get a script address) -> where is the ToCBOR Plutus.Script instance coming from?
- Turns out.. there is only a Serialise instance (from the serialise package) and other parts (plutus-ledger package, Ledger.Scripts module) define an orphan ToCBOR instance which does use encode from the serialise package
- observeInitTx logic could be directly translated via access to Ledger.TxBody, but can be made much simpler using the TxBodyContent
- Rewriting observeInitTx was a bit more work than expected, but works.
- If we use cardano-api types we might be able to drop the Data from OnChainHeadState triples as the Api.TxOut can carry the Datum to spend an output (in CtxTx) .. or maybe not as it's optional in the Api.TxOut type.
- Next Step: ensure finalizeTx can work with only a TxBody and produce a fully balanced & signed ValidatedTx
- ideally using makeTransactionBodyAutoBalance eventually as an alternative to coverFee_
- Looking at the signature of makeTransactionBodyAutoBalance .. should initTx be producing a TxBodyContent BuildTx Era instead? and only the Hydra.Direct.Wallet make it a BalancedTxBody and then sign it? i.e.
initTx :: .. -> TxBodyContent BuildTx Era
-- rename to 'balance'?
coverFee :: .. -> TxBodyContent BuildTx Era -> TxBody Era
sign :: .. -> TxBody Era -> Tx Era
Today's goal: Fanout real UTXO
- We need to add the UTXO to the output of the fanout TX
- The fanout tx is currently incorrect as it outputs a UTXO for the state machine which should not be the case
- To detect the fanout tx, we can look at the inputs and check one of them uses the head script passing FanOut as redeemer
Adding an assert to the DirectChainSpec to observe there is a payment made to Alice after observing the FanOutTx
- We fail to submit tx with a strange error about OutputTooSmall
- Looks like we were tripped by SN's comment about closing with an arbitrary UTXO! The problem is that we cannot make this test pass keeping it as it is, there are still some kind of verifications done to ensure consistency of txs
Going to add more verification in EndToEndSpec
- We want the same kind of assertion to be done in ETE test, eg. to check we correctly fan out the right UTXO for alice and bob
- Refactoring check from DirectChainSpec into a waitForUtxo function. Had to extract \case function to a named and typed function in where clause to make compiler happy
We cannot use the utxo in ETE directly as it's a mixed type TxIn/JSON -> converting to and from JSON to get a correct Utxo
- The ETE test fails because the output we want to fanout is too small! It's 14 which is fine off-chain as our params are very lenient, but not quite so in Alonzo.
- We commit just 1 ADA in the head which is enough to fanout
We have a failure on the /Hydra.Chain.Direct.Tx/fanoutTx/transaction size below limit for small number of UTXO/ test: With too many UTXOs the transaction becomes way too large, esp. as those UTXOs are pretty much arbitrary and can themselves be very large.
- Trimming it down to filter UTXO > 10 items does not help much
- Disabling the property for now
Thinking about rollbacks and laying out a plan:
- If we can resubmit a rollbacked tx, we do it
- How do we detect a transaction has been rolled back?
- We need to record the block at which a transaction of interest is observed (from Chain Sync)
- When we get a rollback message, we can check which transactions are past the rolled back index
- Then we can resubmit them in order
- How do we know it can be resubmitted?
- we always resubmit
- If resubmit fails on a supposedly rolledback tx
- Alert user
- Act on failing tx in head state:
- Init => Head vanished => discard state
- Commit => User can try to recommit, the same or other utxo?? (we know which commit(s) has been rolledback)
- CollectCom => one commit probably disappeared?
- Who does the resubmit?
- the one who initially submitted it
- if the rollback is adversarial ??
- Could we kill OnChainHeadState and only do queries?
- we have 2 competing states => risk of desync is high
- we should stop observing the chain in the DirectChain
- Each party's chain component should insulate the Head state from rollbacks and try to resubmit
- If someone does not resubmit, there's no point in another party trying to resubmit
- The onchain head state maintains a stream of events to post/to observe
- When a rollback happens, past events are rehandled in a "rollback mode" which means they do not propagate to the Head State
- New observations from the chain still entail notification to the head state
- Head state must trust the chain component and any event coming from it resets the head state
- All nodes need to follow the rollback protocol so there cannot be any "interleaved" event
- in this case we need to abort/close, but that can also be tricky
We need to design the Close sequence for rollbacks
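The detection step of the plan above (record where each transaction of interest was observed, then on rollback find the ones past the rollback point and resubmit them in order) could be sketched as follows. This is only an illustration under simplifying assumptions: a chain point is reduced to a slot number, and `Slot`, `Observed` and `rolledBack` are hypothetical names, not part of the chain component.

```haskell
import Data.List (sortOn)

-- Hypothetical: a chain point reduced to a slot number.
type Slot = Int

-- A transaction of interest together with the slot of the block it was seen in.
data Observed tx = Observed
  { observedAt :: Slot
  , observedTx :: tx
  }

-- Transactions observed strictly after the rollback point, oldest first,
-- i.e. the ones that were rolled back and should be resubmitted in order.
rolledBack :: Slot -> [Observed tx] -> [tx]
rolledBack rollbackPoint =
  map observedTx . sortOn observedAt . filter ((> rollbackPoint) . observedAt)
```

Whether resubmission succeeds, and what to do in the head state when it does not, is left open here exactly as in the plan.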
How to deal with benchmarks?
- We need to commit UTXO initially
- We need to pass the keys for the initial UTxO to ensure the commits end up having the same ids between every run
Adding signingKey to Dataset type -> need to implement To/FromJSON, also removing Eq instance
- Adding a roundtrip JSON test for Dataset
- We cannot use plain genDataset, got some errors trying to generate arbitrary transactions:
src/Test/Aeson/Internal/RoundtripSpecs.hs:59:5:
1) Test.Generator, JSON encoding of Dataset, allows to encode values with aeson and read them back
uncaught exception: ErrorCall
findPayKeyPairAddr: expects only Base or Ptr addresses
CallStack (from HasCallStack):
error, called at src/Test/Cardano/Ledger/Shelley/Generator/Core.hs:434:7 in cardano-ledger-shelley-test-0.1.0.0-827a00c3eaf868a9c6ed74e429f91efce6a3bea6c8e377f0e0d8dab608426e8b:Test.Cardano.Ledger.Shelley.Generator.Core
(after 2 tests)
Exception thrown while showing test case:
findPayKeyPairAddr: expects only Base or Ptr addresses
CallStack (from HasCallStack):
error, called at src/Test/Cardano/Ledger/Shelley/Generator/Core.hs:434:7 in cardano-ledger-shelley-test-0.1.0.0-827a00c3eaf868a9c6ed74e429f91efce6a3bea6c8e377f0e0d8dab608426e8b:Test.Cardano.Ledger.Shelley.Generator.Core
Removed arbitrary dataset, now making sure we can commit generated UTXO
- In the generatePayment we extract the UTXO for the initial funds, there is a function apparently for that in the cardano-api we could use to get deterministically the initial txIn from which we can construct the payment transactions and hence its initial UTXO
- Adding a mkGenesisTx function to compute the transaction and output from initialFunds for a given key
- Realising we don't need the initial Utxo but actually an initial payment tx => we still need to store signing key to prime the node
Wart: Would make sense to have the networkId in the CardanoNode, right now we expose a defaultNetworkId which is hard-coded
Implementing castHash to convert from a payment key to genesis utxo key
- We don't have access to constructors so we need to serialise then deserialise to convert the values
GeneratorSpec tests are now failing because we use all UTXO from the initial funding transaction to compute the amount to send, we should only select the "commit UTXO" and pass this around -> writing a (partial) function to select the minimal UTXO.
- This will work fine for the genesis transaction because all UTXO have the same TxId and different index, and the "commit" UTXO is the first one by construction.
Benchmark compiles but fails with a strange error about "peers not connected"
- There was a discrepancy in the value of initial funds leading to an error when committing on-chain -> unified into a hardcoded constant in CardanoCluster
- We now only have a problem with paying the fees for the initial funding tx, so we need to use buildRaw and calculateMinFee to properly build the tx
- We also have a logical problem with the way we generate and run datasets:
- The concurrency parameter defines how many datasets we generate
- We use the number of datasets generated to define the number of nodes to run
Now struggling to retrieve the ProtocolParameters needed to calculate fees.
- We want to extract them from the genesis-shelley.json file, but apparently there's a discrepancy in the formats: The API has a ProtocolParameters format which is different from each Era's format
- There is a JSON instance for genesisShelley, we can use that to read it from file and then convert to API's ProtocolParameters
Benchmark fails on submitting commit tx with a ValueNotConserved error: Seems like the UTXO we consume is not correct, probably unknown by the Wallet
-> This is the one we construct from the initialFunds which is supposed to work thanks to a ledger function to produce a TxIn from initial funds
Managed to generate dataset with initial funding transaction but ran into a snag: The "leader" of the bench run does the InitTx which means it consumes its initial funding and produces a new transaction out of it, hence the initial funding transaction in the dataset does not exist anymore.
- If we move the commit seeding before sending the init, we get another error: Commit transactions have more than one UTXO committed.
- We forgot to filter the utxo returned by the initial funding transaction
It now fails in the finalizeTx: The wallet raises an exception saying it cannot find the input to spend or cover the fees.
Fixed run of the benchmarks:
- We attached the client to the datasets in the wrong order thus the keys and UTXO were not the right ones
- Always select the maximum value UTXO from the wallet for change
Migrating to cardano-api:
- need to materialise nix on MB's machine because of a weird error
- some failing tests: generator test is failing with invalid witnesses, validation of TX also fails?
The generator is failing with NoRedeemer error -> probably generating tx with scripts and not passing the redeemers => Improving error reporting to see the actual tx generated
Transactions generated by the Alonzo generator are not necessarily valid because of the scripts or execution units or what not...
- We should talk to ledger team on how to generate valid Alonzo txs, who does generate Alonzo txs for test?
- In the meantime, would make sense to generate Mary txs and re-sign them because of the issue with body serialisation.
- Trying to increase execution budget in the PParams does not work
Using freeCostModel in our generator to sidestep the issue of execution units. Switching makes the test pass, seems like there's a TODO in ledger code about using a non-free cost model for generating tx with scripts
Rebasing hail-cardano-api branch onto master in order to fix ETE test: We want to be able to commit 0 or 1 UTXO which has been fixed in master and is the last failing test
Completing work on commits from L1.
Fixing ETE test to ensure it uses a properly committed UTXO. It's pretty straightforward thanks to the utxoToJSON function that converts the generated payment to the expected format.
Transaction fails to be submitted off-chain:
seen messages: {"transaction":{"witnesses":{"keys":["8200825820db995fe25169d141cab9bbba92baa01f9f2e1ece7df4cb2ac05190f37fcc1f9d5840dffeaeb16f1b23a76b1f038f835099c81aaaab7d1ac9e8c0fadc192e7593466810193a72d53c9402eeb7748e6b7eef19d287241b385976da929237f279d3d300"],"scripts":{}},"body":{"outputs":[{"address":"addr_test1vz35vu6aqmdw6uuc34gkpdymrpsd3lsuh6ffq6d9vja0s6s67d0l4","value":{"lovelace":1000000}}],"mint":{"lovelace":0},"auxiliaryDataHash":null,"withdrawals":[],"certificates":[],"inputs":["9fdc525c20bc00d9dfa9d14904b65e01910c0dfe3bb39865523c1e20eaeb0903#0"],"fees":0,"validity":{"notBefore":null,"notAfter":null}},"id":"4c69e0154cdc07ca752157ed6cf247fe449b3d21e40bab0c848a822ae5a54c85","auxiliaryData":null},"utxo":{"998eec9baf49ee66c1609157f00a31198621740226584ae0eb4f32c81ff700f0#1":{"address":"addr_test1vru2drx33ev6dt8gfq245r5k0tmy7ngqe79va69de9dxkrg09c7d3","value":{"lovelace":1000000}}},"validationError":{"reason":"ApplyTxError [UtxowFailure (UtxoFailure (ValueNotConservedUTxO (Value 0 (fromList [])) (Value 1000000 (fromList [])))),UtxowFailure (UtxoFailure (BadInputsUTxO (fromList [TxIn (TxId {_unTxId = SafeHash \"9fdc525c20bc00d9dfa9d14904b65e01910c0dfe3bb39865523c1e20eaeb0903\"}) 0])))]"},"tag":"TxInvalid"}
The input to use for the off-chain transaction was hardcoded -> replacing with the one we generate
Now with another error:
{"transaction":{"witnesses":{"keys":["8200825820db995fe25169d141cab9bbba92baa01f9f2e1ece7df4cb2ac05190f37fcc1f9d58405f14a0e9b7da0deca07529cd4d1fa8e59b5efe345afcc94c7dbf1eb7c2e8a485658e36715d04ea305f510291204c0450f4d7cbc119e2495d98430134a3b9c301"],"scripts":{}},"body":{"outputs":[{"address":"addr_test1vz35vu6aqmdw6uuc34gkpdymrpsd3lsuh6ffq6d9vja0s6s67d0l4","value":{"lovelace":1000000}}],"mint":{"lovelace":0},"auxiliaryDataHash":null,"withdrawals":[],"certificates":[],"inputs":["998eec9baf49ee66c1609157f00a31198621740226584ae0eb4f32c81ff700f0#1"],"fees":0,"validity":{"notBefore":null,"notAfter":null}},"id":"d87840c06e3e65d422ed9181273579dc82e6b471024d3899610b5c025a243442","auxiliaryData":null},"utxo":{"998eec9baf49ee66c1609157f00a31198621740226584ae0eb4f32c81ff700f0#1":{"address":"addr_test1vru2drx33ev6dt8gfq245r5k0tmy7ngqe79va69de9dxkrg09c7d3","value":{"lovelace":1000000}}},"validationError":{"reason":"ApplyTxError [UtxowFailure (MissingVKeyWitnessesUTXOW (WitHashes (fromList [KeyHash \"f8a68cd18e59a6ace848155a0e967af64f4d00cf8acee8adc95a6b0d\"])))]"},"tag":"TxInvalid"}
Probably because we are passing the wrong key?
- => Keys and addresses were generated and hardcoded
ETE test is now passing!
Fixing benchmark to work with on-chain commits:
- we currently generate an arbitrary dataset from a seed random Utxo and then generating transactions with the right keys
- in the case of the constant Utxo, things are fine because we generate key pair to produce a utxo so we could as well keep the (initial) keys around and then commit the initial utxo on chain when starting the benchmark
Struggling a bit with getting to/fromJSON instances right for KeyPair, current solution is to store the signing key only and regenerate the verification key from the serialised bytes; the bytestring is hex-encoded in a JSON object
- Added a function to transform a Ledger.KeyPair into a (VerificationKey PaymentKey, SigningKey PaymentKey)
Feels like changing the benchmark is a bit more involved than merely adding keys and committing, as the whole logic of generating transactions beforehand is a bit borked. We should probably provide not a dataset but parameters for a dataset, like number of transactions to run and other things, and generate the txs on the go. This might skew the timings a bit but would probably be dwarfed by IOs anyway. Feels like the "right path" would be:
- have a single way of generating dataset, eg. one utxo per client
- generate keys for each participants
- pass dataset parameters instead of actual dataset
- do not store the dataset? Once we have keys defined then the UTxOs should be constant?
- generate transactions in client from previous transaction, using the same keys
- transactions can be sent at random to some other party? but this might deplete the clients' funds and lead to a client not being able to post txs anymore
- fix the payment graph so that amounts stay constant and all parties can always keep generating txs
Still trying to properly commit actual UTXOs.
Managed to get TxOut transformed between an Alonzo one and a Mary one, without requiring full transformation of the internal ledger
- Tests are still failing but with a different error, namely that we commit more than one UTxO which is odd... Actually not: We wait for all payments at some address and use the retrieved UTxO there to commit, but we should only commit one of them.
got another interesting error:
ErrNotEnoughFunds {missingDelta = Coin 2298000}
At least, I can see that the inputs are correctly set, with the committed UTxO as input and also in the datum:
failed to cover fee for transaction: ErrNotEnoughFunds {missingDelta = Coin 2298000},
ValidatedTx {body = TxBodyConstr TxBodyRaw {
_inputs = fromList [TxIn (TxId {_unTxId = SafeHash "bc1eae5aa8e72d3f9e5c0cd725b49dc47954570b6087eb16b5cc3f4ce6daf4ea"}) 1
,TxIn (TxId {_unTxId = SafeHash "e50062182d5d401d13249a7f7e7e1ac73deec0170421e10bc7d9b346c284ebdd"}) 1],
_collateral = fromList [],
_outputs = StrictSeq {fromStrict = fromList [(Addr Testnet (ScriptHashObj (ScriptHash "6679d3c92844becb16c55161b60111336e1ba2f3d14bbb52b051c4db")) StakeRefNull,Value 2000000 (fromList []),SJust (SafeHash "549485dcc8131ab64122a9163080943f55b83ae368bc55bec73e583f192f3080"))]},
_certs = StrictSeq {fromStrict = fromList []},
_wdrls = Wdrl {unWdrl = fromList []},
_txfee = Coin 0,
_vldt = ValidityInterval {invalidBefore = SNothing, invalidHereafter = SNothing},
_update = SNothing,
_reqSignerHashes = fromList [],
_mint = Value 0 (fromList []),
_scriptIntegrityHash = SNothing,
_adHash = SNothing,
_txnetworkid = SNothing},
wits = TxWitnessRaw {_txwitsVKey = fromList [],
_txwitsBoot = fromList [],
_txscripts = fromList [(ScriptHash "12c4f8ff8070f0d659bdb2ecf844190ddb1134ef821764bd2b4649b2",PlutusScript PlutusV1 ScriptHash "12c4f8ff8070f0d659bdb2ecf844190ddb1134ef821764bd2b4649b2")],
_txdats = TxDatsRaw (fromList [
(SafeHash "2502ff9c9c341dd1384724ae35eab0b19e394c90226892fcc8e7cc86342d324e",DataConstr B "\248\166\140\209\142Y\166\172\232H\NAKZ\SO\150z\246OM\NUL\207\138\206\232\173\201Zk\r"),
(SafeHash "549485dcc8131ab64122a9163080943f55b83ae368bc55bec73e583f192f3080",DataConstr Constr 0 [I 10,B "{\"e50062182d5d401d13249a7f7e7e1ac73deec0170421e10bc7d9b346c284ebdd#1\":{\"address\":\"addr_test1vru2drx33ev6dt8gfq245r5k0tmy7ngqe79va69de9dxkrg09c7d3\",\"value\":{\"lovelace\":1000000}}}"])]),
_txrdmrs = RedeemersRaw (fromList [(RdmrPtr Spend 0,(DataConstr Constr 0 [],WrapExUnits {unWrapExUnits = ExUnits' {exUnitsMem' = 0, exUnitsSteps' = 0}}))])}, isValid = IsValid True, auxiliaryData = SNothing},
using utxo: fromList [(TxIn (TxId {_unTxId = SafeHash "bc1eae5aa8e72d3f9e5c0cd725b49dc47954570b6087eb16b5cc3f4ce6daf4ea"}) 0,(Addr Testnet (ScriptHashObj (ScriptHash "07204aff6885e57d799ee359f7e401d64ff3b54e0de7c4edc3f163d9")) StakeRefNull,Value 2000000 (fromList []),SJust (SafeHash "f4b9d64e4725efc05d7d078bb19e952b288c8403f5a585a8a6ffe589a9851614"))),
(TxIn (TxId {_unTxId = SafeHash "bc1eae5aa8e72d3f9e5c0cd725b49dc47954570b6087eb16b5cc3f4ce6daf4ea"}) 1,(Addr Testnet (ScriptHashObj (ScriptHash "12c4f8ff8070f0d659bdb2ecf844190ddb1134ef821764bd2b4649b2")) StakeRefNull,Value 2000000 (fromList []),SJust (SafeHash "2502ff9c9c341dd1384724ae35eab0b19e394c90226892fcc8e7cc86342d324e")))]
Finally fixed the commit test:
- The UTXO set maintained by the Wallet is right: When we find a block, we traverse the transaction list (topological ordering?), remove the txins we know from the map and add the txouts we found corresponding to our address of interest.
- The problem was in the way we select the UTXO to use in coverFee: We take the maximum of the UTXO from our internal state but this maximum is just an ordering of txids and chances are we get a smaller UTXO. Just filtering the map to select UTXO with a value higher than some threshold makes the test pass.
- The only remaining test that fails is the EndToEndSpec test as we still try to commit an arbitrary UTXO.
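To illustrate the coverFee selection problem described above: the "maximum" of a Map is its entry with the largest key, which says nothing about the value it holds. The sketch below uses hypothetical simplified types (`TxIn` as a string, a plain `Lovelace` amount instead of a full output) and hypothetical function names. It contrasts key-based selection with value-based selection and with the threshold filter mentioned in the log.

```haskell
import Data.List (maximumBy)
import qualified Data.Map.Strict as Map
import Data.Ord (comparing)

-- Hypothetical simplifications of the wallet's UTXO map.
type TxIn = String -- e.g. "txid#ix"
type Lovelace = Integer

-- The pitfall: the maximum of a Map is the entry with the largest key.
largestKey :: Map.Map TxIn Lovelace -> Maybe (TxIn, Lovelace)
largestKey = Map.lookupMax

-- One alternative: pick the entry holding the largest value.
largestValue :: Map.Map TxIn Lovelace -> Maybe (TxIn, Lovelace)
largestValue utxo
  | Map.null utxo = Nothing
  | otherwise = Just (maximumBy (comparing snd) (Map.toList utxo))

-- The fix described in the log: keep only entries above some threshold.
aboveThreshold :: Lovelace -> Map.Map TxIn Lovelace -> Map.Map TxIn Lovelace
aboveThreshold minValue = Map.filter (> minValue)
```

With a map holding a small output under key "ff#0" and a large one under "aa#0", `largestKey` returns the small output while `largestValue` returns the large one.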
Fixing TUI and the use of addresses, comparing by Show instance which is not great but needed because we keep them as keys in Map. Now implementing mkSimpleTx which is the function from TUI that creates the actual transaction to be committed on chain
Completed implementation of mkSimpleTx, it now returns an Either with an error, because makeTransactionBody does: It checks well-formedness of the transaction.
Got a failure with invalid witnesses and an encoding problem for UTxOs, getting an invalid UTF-8 encoding error
- Problem comes from AssetName. AssetName are encoded as Latin-1 in the cardano-api, why?
- ToJSON/FromJSON instances in our version of cardano-api are wrong, they do not roundtrip properly. We want to upgrade our cardano-node dependency as it's been fixed recently
Fixed the dependencies and the few impacts from changes in API
- Still have a few failures related to the UTXO generator
- One test that's failing is the one about size of commits: What is this test about? Perhaps we only need to ensure a single UTXO would fit atm?
- Discussed usage of cardano-api in Hydra.Ledger.Cardano, MB is implementing it; that the Alonzo types are more complete / simpler to handle, but most of Alonzo features are not supported yet by our integration -> need to strip down generators in tests
- "Fixed" benchmarks to only commit a single UTXO (current limitation)
- Continued "committing real UTXO" in pairing session
- Expect postTx of committing arbitrary UTXO to fail when really spending the selected UTXO in commitTx
- To make it succeed though, we need to generate a payment tx such that the Hydra.Direct.Wallet "sees" the resulting UTXOs, knows about them and can spend them
- We are using cardano-api via CardanoClient to construct, sign and submit this transaction
- In order to pass the resulting UTxO to CommitTx (Utxo CardanoTx) we would either need to convert cardano-api UTxO to the cardano-ledger UTxO type, or utilize the refactored Hydra.Ledger.Cardano which uses cardano-api types
- The shell.nix by default does also build local cluster scripts which cannot be disabled with an argument.
- Also, this workbench is a bit confusing and it didn't seem to be giving me something I need.
- The cabal in scope is actually wrapped and uses a different cabal.project and using the standard one is not working well with exactDeps = true
- In summary, here is a diff I used to get into a nix-shell which can cabal test cardano-api:
diff --git a/shell.nix b/shell.nix
index b44ff6d99..c20294d26 100644
--- a/shell.nix
+++ b/shell.nix
@@ -89 +89 @@ let
- cabalWrapped
+ pkgs.cabal-install
@@ -104,5 +103,0 @@ let
- ## Workbench's main script is called directly in dev mode.
- ++ lib.optionals (!workbenchDevMode)
- [
- cluster.workbench.workbench
- ]
@@ -112,9 +106,0 @@ let
- ]
- ## Local cluster not available on Darwin,
- ## because psmisc fails to build on Big Sur.
- ++ lib.optionals (!stdenv.isDarwin)
- [
- pkgs.psmisc
- cluster.start
- cluster.stop
- cluster.restart
@@ -125 +111 @@ let
- exactDeps = true;
+ exactDeps = false;
Looking at PR to review, SN's PR fails to build on CI with the following error:
hydra-node: failed to submit tx: HardForkApplyTxErrFromEra S (S (S (S (Z (WrapApplyTxErr {unwrapApplyTxErr = ApplyTxError [UtxowFailure (WrappedShelleyEraFailure (UtxoFailure (MaxTxSizeUTxO 17634 16384)))]})))))
CallStack (from HasCallStack):
error, called at src/Relude/Debug.hs:288:11 in relude-1.0.0.1-7lFjKp98vwwCeaGqBf05Yi:Relude.Debug
error, called at src/Hydra/Chain/Direct.hs:329:34 in hydra-node-0.1.0-inplace:Hydra.Chain.Direct
Seems like we are already hitting some limits of the chain, in this case the size of Tx apparently (greater than 16K)
- Bumping tx size to 50K and block size to 100K allows running the benchmark
- committed Utxo is about ±100 Utxo in total which makes the datum quite large
Something we could do to reduce the size of the datum on chain would be to make the commit and collectCom transactions pass a datum made from a hash of the MT of the committed Utxos.
- The committed Utxos would then be sent as a first message off-chain, signed by each party and verified thanks to the MT root hash:
- each party sends a Committed message containing its Utxo
- each party reconstructs the Utxo MT and verifies its root hash
- But is it really needed? We can reap the Utxo directly from the committed transactions, no need to pass it around
- Plus, how is it verified on-chain? The ν_commit validator needs to verify, in the case of an abort, that the UTXO posted by the AbortTx are indeed the ones present in the datum of each of the aborted commits: This could be achieved by computing the MT root of the UTXOs committed by the abort transactions, as the validator has access to them but of course this could be computationally relatively expensive.
Another failure running the benchmark:
hydra-node: failed to submit tx: HardForkApplyTxErrFromEra S (S (S (S (Z (WrapApplyTxErr {unwrapApplyTxErr = ApplyTxError [UtxowFailure (WrappedShelleyEraFailure (UtxoFailure (FeeTooSmallUTxO (Coin 3365841) (Coin 3298000))))]})))))
Two problems:
- We can't just pack arbitrarily many Utxo because this would blow up the tx size -> limit to a single Utxo per participant
- We can only ever commit concrete CardanoTx not abstract ones on the direct chain -> removing parameterisation on concrete transactions handling in Direct chain
- We can do things in 2 steps:
- first degenerify the tx when using DirectChain
- second limit commits to a single Utxo
Where to check we do not commit more than one UTxO?
- We could do it in the CommitTx but this would require introducing some more type for representing a single TxOut -> large change
- We do it at the DirectChain level throwing an exception if something goes wrong
- We check the size of committed Utxo in the fromPostChainTx function -> we could do it in the commitTx function instead?
Still 2 tests failing:
- 1 test about the size of the commitTx -> should be fixed once we change the interface to the commitTx function
- End to end test
Interestingly our current code does not allow committing no UTxO which explains why the ETE test is failing: node 2 and 3 do not commit anything.
But the error reported says: MoreThanOneUtxoCommitteed which is certainly misleading.
- Added a test to DirectChainSpec asserting we can commit an empty UTxO set, but this feels a bit too high-level for this kind of test. Perhaps there could be a more granular test module for the postChainTx function? There seems to be a separate responsibility here, which is the handling of the on-chain state
- But the commitTx function accepts one and only one UTxO, so we cannot commit an empty UTxO set. Going down the easy route: pass Maybe Utxo to the commitTx function
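The "zero or one UTxO" rule behind passing a Maybe to commitTx could be checked along these lines. This is a sketch with stand-in types: `TxIn`, `TxOut`, `CommitError` and `singleCommit` are hypothetical names, and the real check currently lives at the DirectChain level as noted above.

```haskell
import qualified Data.Map.Strict as Map

-- Hypothetical simplifications of the committed UTxO set.
type TxIn = String
type TxOut = Integer

-- Carries the number of entries that were offered for commit.
newtype CommitError = TooManyUtxo Int
  deriving (Eq, Show)

-- Accept an empty set (Nothing) or exactly one entry (Just), reject the rest.
singleCommit :: Map.Map TxIn TxOut -> Either CommitError (Maybe (TxIn, TxOut))
singleCommit utxo =
  case Map.toList utxo of
    [] -> Right Nothing
    [entry] -> Right (Just entry)
    entries -> Left (TooManyUtxo (length entries))
```

Distinguishing the empty case from the "too many" case also avoids the misleading error report mentioned above, where committing nothing was reported as committing more than one UTxO.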
What we achieved?
- Swapped Hydra node to use real cardano node on a devnet, removing Mock chain
- Working on making master "green" again following big changes
- Demo works again with cardano node
What we plan to do?
- Properly commit and fanout Utxo from/to the real chain
- Design and implement handling of rollbacks
- Start implementing proper OCV in Plutus (again)
- Follow-up meetings with potential Hydra early adopters
- Seems like haskell.nix puts an exe into the shell env when it's mentioned as a build-tool-depends in one of the local packages' .cabal
- When showing the demo, ensure the devnet is wiped and cardano-node is restarted, otherwise the hydra-node (its wallet) could not find a "seed input" and crashes (for now at least)
- To have the cardano-node docker (entrypoint) not remove key arguments and indeed produce blocks it requires the environment variable CARDANO_BLOCK_PRODUCER=true
- https://github.com/input-output-hk/cardano-node/blob/master/nix/docker/context/bin/entrypoint seems to be the entrypoint in use
- When getting NoLedgerView errors, updating genesis systemStart (byron + shelley) to be "within some time" helps
- Investigating why the returned tx is not captured by observeInitTx
- We should really log why something is not an initTx etc. -> Either Reason OnChainTx
- Found the reason:
- observeInitTx thought party is not elem of the converted parties
- convertParty creates un-aliased Party from the chain data
- Possible solutions:
- alias of Party is not taken into account for Eq of Party
- strip alias from party before checking elem party parties
- Do not incorporate alias into party but wrap it with data AliasedParty = AliasedParty Text Party
- Receive a CommandFailed when trying to commit
- Ran into the problem that the hydra-tui was showing "Initializing" == InitialState, but in fact we were only in "ReadyState" -> this is because we violated "make impossible states unrepresentable" when managing the state in Hydra TUI! :(
- Make HeadLogic "alias"-proof by adding an aliased party in HeadLogicSpec
- Lots of repetition between README.md and demo/README.md, especially after introducing prepare-devnet.sh -> only explain in demo/?
- Chain works, but seems like hydra-nodes are not connecting to each other (using direct cabal invocation of demo setup)
- Nodes seem to be connected in the docker-compose setting
- No tx submission in an open head though with repeating ReqTx:
{"message":{"node":{"by":{"alias":"alice","vkey":"000000000000002a"},"event":{"message":{"transaction":{"witnesses":{"scripts":{},"keys":["820082582060cdff1c5cd672fb7d8df7f60121fabd4416b2381df70d5c65cb1559af81599858406d08cd35336088575712d8f7fb5fc96a9e29fa6c89305a920aa41e2162a98b0daeb82a7696d14cd9ff6b308eebf71620f354a6820467d87ca5ff8ca383f10705"]},"body":{"outputs":[{"address":"addr_test1vre6wmj9zmh0fjfavedh6q9lq32lunnlseda4xk7t0cg47sal9qft","value":{"lovelace":1893670963}}],"mint":{"lovelace":0},"auxiliaryDataHash":null,"withdrawals":[],"certificates":[],"fees":0,"inputs":["ae85d245a3d00bfde01f59f3c4fe0b4bfae1cb37e9cf91929eadcea4985711de#93"],"validity":{"notBefore":null,"notAfter":null}},"id":"7071e48915eb9c3de986cef336544c24af8a01eedbf4721ddbef6fce0b591ad3","auxiliaryData":null},"party":{"alias":"alice","vkey":"000000000000002a"},"tag":"ReqTx"},"tag":"NetworkEvent"},"tag":"ProcessedEvent"},"tag":"Node"},"timestamp":"2021-11-17T18:21:07.164988302Z","namespace":"HydraNode-1","threadId":33}
Just realised we have this section in shell.nix
tools = [
pkgs.pkgconfig
pkgs.haskellPackages.ghcid
pkgs.haskellPackages.hspec-discover
pkgs.haskellPackages.graphmod
pkgs.haskellPackages.cabal-plan
pkgs.haskellPackages.cabal-fmt
# Handy to interact with the hydra-node via websockets
pkgs.ws
# For validating JSON instances against a pre-defined schema
pkgs.python3Packages.jsonschema
pkgs.yq
# For plotting results of local-cluster benchmarks
pkgs.gnuplot
];
but actually it's not used and the tools listed are not available on the command-line!
Seems like they are used in the shell based on haskell.nix but not in the cabal only shell
Problems on master:
- Flaky test on DirectChain
- Test checking conformance of logs with schema does not seem to catch undocumented ctors
Adding an item in the backlog for rollbacks which we should handle sooner rather than later
Looks like the flakiness of DirectChainSpec comes from the use of withCluster which starts 3 nodes and produces rollbacks
- We need to add waitForSocket everywhere which is clumsy => refactor to move into withBFTNode
Last step before merge to master = make benchmark runnable again
- We need to generate all key pairs for all nodes in the cluster and then write relevant files for each node
- We need the (Cardano) keys to modify the entries in initialFunds -> move them before we start BFTNode
- Passing the list of verification keys to makeNodeConfig so that we do the change to genesis-shelley.json inside the withBFT function => add empty list when not needed
Using lenses to update the initialFunds field, we can use addField function from CardanoNode
We need to encode the VerificationKey PaymentKey we have into Hex-encoded thingy
We can run the benchmarks and got results 🎉
- Seems like re-running benchmarks does not work correctly now
- resubmit transaction when it gets rollbacked? if it is valid
- indicate to the user probability of a rollback -> %age of stability = paranoia level
- making it a function of value committed? overridable with own settings
- we could replay the sequence of events? => genuine rollbacks (non-adversarial)
- if L1 can rollback so can L2
- once enough time has passed no rollback can happen => only need to keep stream of events until $k$ slots have passed
- the off-chain can start from where it was, the latest snapshot if the Head can be reopened with same UTXO set
- if contestation period is shorter than rollback period this could be a security issue: we could not submit the exact same close because the validator would check the contestation period extends from the start of the close
- txs in the mempool would be replayed automatically in case of rollbacks => we'll observe them in the ChainSync
- user might want to introspect the on-chain state?
- PAB does not do anything about rollbacks -> pushing it to users
- what to expose to users? stability level, probability that there will be a rollback (99.99% is a few dozen blocks)
- provide a `HeadRollbacked` output to users
- practically, most rollbacks have been pretty small (< k/4)

https://plutus-apps.readthedocs.io/en/latest/plutus/howtos/handling-blockchain-events.html
https://plutus-apps.readthedocs.io/en/latest/plutus/explanations/rollback.html
Some documentation on Settlement error
SettlementError(b, eps, g) = g * exp(-0.69 - b * [0.249 * eps^{2.5} + 0.221 * eps^{3.5}])
Parameters:
b: the number of blocks on top of the transaction in question;
eps = 1 - 2*[adversarial stake], where the adversarial stake is a real between 0 and ½;
g: the grinding power of the adversary. A single-CPU grinding would correspond to g=10^5; a conservative default choice could be g=10^8 corresponding to a 1000-CPU grinding.
The resulting SettlementError(b, eps, g) is an estimate of the probability that a valid transaction appearing b blocks deep can be later invalidated. Here exp(X) refers to e^X where e is the base of the natural logarithm.
Note: given the crude estimate on grinding coming from the factor g, for small values of b the formula will produce outputs greater than 1 (until the exponential term becomes small enough to counter the effect of g). This simply means that for such small values of b this method does not provide any guarantees.
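The formula above can be evaluated directly. The following is a minimal sketch, not code from the Hydra repository: the function names `decay_rate`, `settlement_error` and `blocks_for_error` are made up for illustration, and the default `g = 1e8` is simply the conservative choice mentioned in the parameter list.

```python
import math


def decay_rate(adversarial_stake: float) -> float:
    """Per-block exponent 0.249 * eps^2.5 + 0.221 * eps^3.5 from the formula."""
    if not 0.0 <= adversarial_stake <= 0.5:
        raise ValueError("adversarial stake must lie between 0 and 1/2")
    eps = 1.0 - 2.0 * adversarial_stake
    return 0.249 * eps**2.5 + 0.221 * eps**3.5


def settlement_error(b: int, adversarial_stake: float, g: float = 1e8) -> float:
    """SettlementError(b, eps, g); results above 1 give no guarantee at all."""
    return g * math.exp(-0.69 - b * decay_rate(adversarial_stake))


def blocks_for_error(target: float, adversarial_stake: float, g: float = 1e8) -> int:
    """Smallest depth b whose estimate is at or below `target`."""
    rate = decay_rate(adversarial_stake)
    if rate == 0.0:
        raise ValueError("no depth suffices when the adversary holds half the stake")
    # Solve g * exp(-0.69 - b * rate) <= target for b.
    return max(0, math.ceil((math.log(g / target) - 0.69) / rate))
```

Something like `blocks_for_error(1e-4, adversarial_stake)` could back the "paranoia level" idea from the rollback discussion: the user picks an acceptable error and the node waits for that many blocks.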
- Add instructions on how to start a local, single cardano-node devnet.
- Generating a topology file feels annoying, but providing all peers as arguments (as we do) might scale less?
echo '{"Producers": [{"addr": "127.0.0.1", "port": 3001, "valency": 1}]}' > topology.json
- Got a `HandShakeError` with `VersionMismatch`
  - was led astray on updating to a newer `cardano-node` dependency in our code
  - however, our code was already newer than the docker image so supporting the latest + one before was the solution
- `hydra-node` can connect after fixing protocol version, `initTx` is created and submitted, but not observed
  - Also when re-trying / re-submitting the node crashes with
hydra-node: cannot find a seed input to pass to Init transaction CallStack (from HasCallStack): error, called at src/Relude/Debug.hs:288:11 in relude-1.0.0.1-KWrPF7zdlFZ8gdnjuoSoUr:Relude.Debug error, called at src/Hydra/Chain/Direct.hs:359:13 in hydra-node-0.1.0-inplace:Hydra.Chain.Direct
- After restarting the hydra-node and re-trying [i]nit this is the error
hydra-node: failed to submit tx: HardForkApplyTxErrFromEra S (S (S (S (Z (WrapApplyTxErr {unwrapApplyTxErr = ApplyTxError [UtxowFailure (WrappedShelleyEraFailure (UtxoFailure (ValueNotConservedUTxO (Value 0 (fromList [])) (Value 900000000000 (fromList []))))),UtxowFailure (WrappedShelleyEraFailure (UtxoFailure (BadInputsUTxO (fromList [TxIn (TxId {_unTxId = SafeHash "39786f186d94d8dd0b4fcf05d1458b18cd5fd8c6823364612f4a3c11b77e7cc7"}) 0]))))]}))))) CallStack (from HasCallStack): error, called at src/Relude/Debug.hs:288:11 in relude-1.0.0.1-KWrPF7zdlFZ8gdnjuoSoUr:Relude.Debug error, called at src/Hydra/Chain/Direct.hs:328:34 in hydra-node-0.1.0-inplace:Hydra.Chain.Direct
- When adding all the block signing keys the node spams a weird log message (error?)
{"thread":"31","loc":null,"data":{"val":{"kind":"TraceNoLedgerView","slot":16370873857},"credentials":"Cardano"},"sev":"Error","env":"1.30.1:0fb43","msg":"","app":[],"host":"eiger","pid":"1","ns":["cardano.node.Forge"],"at":"2021-11-16T18:29:45.70Z"}
{"thread":"31","loc":null,"data":{"kind":"TraceStartLeadershipCheck","chainDensity":0,"slot":16370873858,"delegMapSize":0,"utxoSize":6,"credentials":"Cardano"},"sev":"Info","env":"1.30.1:0fb43","msg":"","app":[],"host":"eiger","pid":"1","ns":["cardano.node.LeadershipCheck"],"at":"2021-11-16T18:29:45.80Z"}
- Re-using a db + cardano-node run command from e2e test works (hydra-node shows initializing!)
- Manually invoking cardano-node keeps producing errors (and no blocks), this time:
{"thread":"32","loc":null,"data":{"val":{"kind":"TraceNodeNotLeader","slot":3462},"credentials":"Cardano"},"sev":"Info","env":"1.30.1:0fb43","msg":"","app":[],"host":"eiger","pid":"1","ns":["cardano.node.Forge"],"at":"2021-11-16T18:43:29.20Z"}
Goal: remove a `pendingWith` statement in a test
- Master does not compile, so reverting to the last green point, which is 21 days in the past. We should really take care not to break master in the future, as this prevents rapid intervention and branching when need be (not that I am a big fan of branching anyhow)
Trying to reactivate ServerSpec test, seems like there's a race condition.
- I don't understand how the code works anymore so it's unclear to me why it's failing, this has to do with more messages coming than expected 🤔 ?
Here is a trace I see
received "{\"me\":{\"vkey\":\"0000000000000001\"},\"tag\":\"Greetings\"}" resp: Greetings {me = 0000000000000001} received "{\"me\":{\"vkey\":\"0000000000000001\"},\"tag\":\"Greetings\"}" resp: Greetings {me = 0000000000000001} sending ReadyToCommit {parties = fromList []}
- Trying to augment timeout does not work
Adding `showLogsOnFailure` to have the server's traces displayed
- Of course, the client stops after receiving one message so it never waits for everything after the greetings....
- That was easy :)
Goal: Use DirectChain in `EndToEndSpec` test
Added signing and verification keys for Cardano tx
- We can see the `initTx` being submitted and added to the Mempool but not having it be part of a block
- Adding JSON instances to `DirectChainLog` in order to have better traces (we currently pass a `nullTracer` because we don't have those instances). Note that we don't write full instances because `ValidatedTx` is pretty complex so just use its `show` instance
We need to increase timeout for observing init and commit transactions
- We see 1 node can commit but the other nodes are crashing with `no ownInitial` which says they cannot extract their own pkh from the initials Utxo => 🤦 We forgot to add our own pkh to the list of initials!
- We fail to observe `headIsOpen` because of timeout again

Block length on our cluster is 2 seconds but we produce one block out of 3 because of our config which has 3 validators => we need 6 seconds before we produce a block
- It depends on the `slotLength` and `activeSlotCoeff`: 100ms * 20 = 2s
- The "rule" is that $3k / f > epochLength$
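The arithmetic behind the 2 s and 6 s figures, as a small sketch. The value `activeSlotCoeff = 0.05` is an assumption inferred from the "* 20" factor above (20 = 1 / 0.05), and `expected_block_interval` is a name made up for this note.

```python
def expected_block_interval(slot_length_s: float,
                            active_slot_coeff: float,
                            producers: int = 1) -> float:
    """Average seconds between blocks forged by one given producer.

    On average one slot in 1/active_slot_coeff carries a block; when
    `producers` nodes take turns, each forges one block out of `producers`.
    """
    return slot_length_s / active_slot_coeff * producers
```

With `slot_length_s=0.1` and `active_slot_coeff=0.05` this gives roughly 2 seconds for the whole cluster and roughly 6 seconds per node when 3 nodes take turns, which is why the timeouts for observing init and commit transactions had to grow.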
Changed active slot coeff to 1.0 but to no avail, we still miss some commits and don't see the head being opened
- Carol has no fund so she can't do anything...
We got a green ETE test with DirectChain 🎉
- Tests are a bit slow though, even when we increase slot coefficient
We have a problem with benchmark: they run a cluster with arbitrary number of nodes, so we need to generate `n` addresses and keys for each node. Plan is:
- read genesis file as JSON (raw JSON, we don't care to use cardano-api)
- generate right number of key pairs
- need to store the files in the temporary directory where we run the cluster
- convert to CBOR encoded address using `buildAddress` from CardanoClient
- inject into `initialFunds` field in `genesis-shelley.json`
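The JSON side of this plan can be sketched as follows. This is an illustration only, not the Haskell code in the repository: key generation and address encoding are left out, the hex-encoded addresses are assumed to be computed elsewhere, and `add_initial_funds` and `write_node_genesis` are hypothetical names.

```python
import json
from pathlib import Path
from typing import Mapping


def add_initial_funds(genesis: dict, funds: Mapping[str, int]) -> dict:
    """Return a copy of the genesis object with extra `initialFunds` entries.

    `funds` maps hex-encoded addresses to lovelace amounts.
    """
    updated = dict(genesis)
    updated["initialFunds"] = {**genesis.get("initialFunds", {}), **funds}
    return updated


def write_node_genesis(template: Path, target_dir: Path,
                       funds: Mapping[str, int]) -> Path:
    """Read the raw genesis JSON and write the patched copy for one cluster run."""
    genesis = json.loads(template.read_text())
    out = target_dir / "genesis-shelley.json"
    out.write_text(json.dumps(add_initial_funds(genesis, funds), indent=2))
    return out
```

Treating the genesis file as raw JSON keeps this step independent of cardano-api types, as the plan suggests.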
After switching to use `withDirectChain` in `Bench.EndToEnd` we have issues with overlapping instances on `Utxo CardanoTx` type which is an alias to the ledger's type
- Cardano.Api provides `ToJSON` instances which we should probably use, but we know cardano-api is prone to breaking changes. Also it has no `FromJSON` for some of the types which is annoying as we rely on those in various tests
- Got stuck down a rabbit hole with lot of tests failing => revert and use custom function to bypass overlapping instances + comment out some tests relying on `Arbitrary` instances for `DirectChainLog`

Ending the day with only `DirectChainSpec` tests failing, unclear why.
Goal: Complete the OCV logic with Fanout
We have troubles with rollbacks in the chain: We observe some rollbacks even though there should not be any because we are on BFT nodes => Trying to run the network with a single BFT node makes the flakiness disappear
- BFT nodes should not do rollbacks but we have rollbacks, hence it's probably not BFT nodes we are running.

While writing `observeFanoutTx` we are stuck with an issue: The test fails but there's no obvious reason why it's failing, seems like the redeemer (`Fanout`) cannot be decoded correctly.
- Turns out it was a copy-paste issue 🤦
Ended up having the full test with mock fanout transaction (eg. one without actual committing txs) passing:
- We have a single node to avoid rollbacks
- The transition SM is still very simple, we probably want to ditch it altogether as we are handling the state threading by hand
- Refactored `CardanoCluster` and `CardanoNode` to make it easy to run a single node cluster, also removing copying of keys which is unneeded in the cluster. We can just load the keys from where they are stored in the source tree
- Work on open & close via Direct chain (using mock validators)
- Patterns start to arise: there are many things which could be DRYed
- observing state machine transitions are very similar
- keeping track of "interesting" utxo in `OnChainHeadState` is very similar
- constructing SM transition txs is very similar
- When adding the closing `SnapshotNumber` it was interesting to observe that initially I kept it as `Datum`, but to `observeCloseTx` it was more appropriate to keep it as `Redeemer` and decode it from that
  - There was no need to store it as `Datum` (right now)
  - Also: Redeemers are more space efficient as we would only include them in the spending tx
- Concerning `OnChainHeadState`
  - Seeing the repetition in `OnChainHeadState` of `(TxIn, TxOut, Data)` triples really gives the hint that this state-tracking code could also be generalized and only keep track of "interesting" utxo + their data
  - OTOH, a "head identifier" is currently implicitly encoded in the `threadOutput` `TxOut` address, while it would make perfect sense to add it to the `PostChainTx` / `OnChainTx` types to describe "which head to abort" etc.
  - This makes me think that we could keep the whole state (head id + interesting utxo) abstractly in the `HeadState`, e.g. existentially quantified; That way, we would not need `TVar` in the `Chain.Direct` and make the whole component stateless!
- Copy & extend e2e test to also cover posting & observing `CollectComTx`
- This should be fairly easy, as we know the total utxo from the `PostChainTx` value
  - OCV (not covered now) would "only" need to check all committed, i.e. all PTs present
- How far to go right now? `collectCom` could just ignore all the committed utxo?
- Is probably the smallest step, but there would be no real value in the head (or the ledger would not allow it)
- When drafting `collectCom` I realize that we do not need `HeadParameters`, but rather `Data Era`
  - it is enough to just keep the Datum around uninterpreted in the `OnChainHeadState`
- When fixing `TxSpec` usage of construct functions (because changed signatures), I realize that "cover fee" test was more arbitrary than necessary
  - It was "side-loading" initial inputs, instead of feeding the `initTx` outputs into `abortTx`
- The more complex tests in `TxSpec` cry for some refactoring now
  - Some DSL or operators to easily construct outputs, datums and "forward" them from one to the next tx would help
- Continuing with implementing `collectComTx` via the canonical transaction size prop test
  - also about 7kB transaction size for most `arbitrary :: Utxo SimpleTx`
- Next: roundtrip-test with a newly created `observeCollectComTx`
  - As the `OnCollectComTx` is actually holding no data, this should be quite trivial
- `observeCollectComTx` is just the same as `observeAbortTx` and can be obviously DRYed
  - deliberately holding back on it though
  - it's possible that something comes up which makes it not as straight-forward
- Unit tests pass now. Quick confusion about why it passes even though datum hash of provided output is `SNothing`
- To make the e2e "open Head" test pass, I only needed to plug `observeCollectComTx` into the `<|>` sequence of `runOnChainTxs` .. that was easy!
  - also, `runOnChainTxs` feels a bit off in `Chain.Direct.Tx` -> moving it to `Chain.Direct`
- Made abortTx unit property tests pass by improving observeAbortTx, which requires to pass a Utxo to `observeAbortTx` now
- When adding initials outputs to `initTx`, we need to store the `PubKeyHash` of the participants Cardano credential!
  - This is not yet kept around
  - We need to add the Cardano credentials of all the participants to `initTx` construction
- Discussing on `DirectChainSpec` where the cardano credentials for participants should go now
- First try: Adding them to the `HeadParameters` analogously to `parties = [alice, bob, carol]`
  - We know this is brittle and morally we would change `Party` to relate `Hydra` and `Cardano` (public) credentials to each other
- Second try: Add it to `InitTx` for now as it's used in fewer places
- Realize adding it to `InitTx` is already involved
  - the lowest hanging fruit may be to pass it to the `withDirectChain` and thus make it "non-configurable"
  - would not work as we want to open subsets of participants?
- Third try: Start from bottom-up instead and work on `initTx` + `observeInitTx` for now
  - not worry yet about where the keys come from
- Seeing the "not observed if not invited" test raises the question whether we should determine being part of the Head using the Hydra credentials or Cardano credentials?
- This is an interesting case where we could have used the Mikado method to safely and incrementally build a plan to make cardano credentials available in `Party`
- Start putting credentials into withDirectChain and see where this gets me
- Surprisingly, the `Chain.Direct` integration test passes with [] as cardano keys -> this should matter
- Also interesting: only the "can commit" e2e test fails!
1) Test.DirectChain can commit
uncaught exception: ErrorCall
no ownInitial: []
CallStack (from HasCallStack):
error, called at src/Relude/Debug.hs:288:11 in relude-1.0.0.1-KWrPF7zdlFZ8gdnjuoSoUr:Relude.Debug
error, called at src/Hydra/Chain/Direct.hs:306:24 in hydra-node-0.1.0-inplace:Hydra.Chain.Direct
- Passing `[aliceCardanoVk]` makes the commit test progress further
src/Hydra/Chain/Direct.hs:275:15:
1) Test.DirectChain can commit
uncaught exception: ErrorCall
failed to cover fee for transaction: ErrUnknownInput ...
CallStack (from HasCallStack):
error, called at src/Relude/Debug.hs:288:11 in relude-1.0.0.1-KWrPF7zdlFZ8gdnjuoSoUr:Relude.Debug
error, called at src/Hydra/Chain/Direct.hs:275:15 in hydra-node-0.1.0-inplace:Hydra.Chain.Direct
- Of course: initials are not in `knownUtxo`
- Need to add TxOut to `initials` of `OnChainHeadState`
- this was a bit messy and the tuples are really crying for a refactor
- "can commit" e2e test progresses, but times out
- log is not very conclusive
- try adding more cardano logs to debug there
- For some reason I have not seen the `TraceMempoolRejectedTx` error before...
{"thread":"80","loc":null,"data":{"tx":{"txid":"txid: TxId {_unTxId = SafeHash \"7cda5fb5d5828c4cf1081f406cb1cc3d0241e2b8aa824b3682edfc4ea64c8138\"}"},"mempoolSize":{"numTxs":0,"bytes":0},"kind":"TraceMempoolRejectedTx","err":{"received":["fb5a425ee6b4da39fd9074006af88d7675e24acad19f252c0e133f379d1246c4"],"required":["2502ff9c9c341dd1384724ae35eab0b19e394c90226892fcc8e7cc86342d324e"],"kind":"MissingRequiredDatums","scripts":{"12c4f8ff8070f0d659bdb2ecf844190ddb1134ef821764bd2b4649b2":{"spending":"cf4be62b474fe3047bc8630f462c0e130cb7064872a4e417da70ba321faf34e2#1"}},"errors":[{"kind":"CollectError","scriptpurpose":{"spending":"cf4be62b474fe3047bc8630f462c0e130cb7064872a4e417da70ba321faf34e2#1"},"error":"NoRedeemer"}]}},"sev":"Info","env":"1.30.0:a7085","msg":"","app":[],"host":"eiger","pid":"1980729","ns":["cardano.node.Mempool"],"at":"2021-10-28T16:47:44.00Z"}
- "Handling" the reject result in `txSubmissionClient` has the test fail right away and no need to dig into log files! 🎉
  - For example
src/Hydra/Chain/Direct.hs:289:38:
1) Test.DirectChain can commit
uncaught exception: ErrorCall
failed to submit tx: HardForkApplyTxErrFromEra S (S (S (S (Z (WrapApplyTxErr {unwrapApplyTxErr = ApplyTxError [UtxowFailure (MissingRequiredDatums (fromList [SafeHash "2502ff9c9c341dd1384724ae35eab0b19e394c90226892fcc8e7cc86342d324e"]) (fromList [SafeHash "fb5a425ee6b4da39fd9074006af88d7675e24acad19f252c0e133f379d1246c4"]))]})))))
CallStack (from HasCallStack):
error, called at src/Relude/Debug.hs:288:11 in relude-1.0.0.1-KWrPF7zdlFZ8gdnjuoSoUr:Relude.Debug
error, called at src/Hydra/Chain/Direct.hs:289:38 in hydra-node-0.1.0-inplace:Hydra.Chain.Direct
- Reason: `commitTx` is not including an initial redeemer
  - our unit tests would ideally balance, sign and validate txs against a ledger to catch this earlier
- After adding datum/redeemer the tx submission does not fail anymore, but test times out
- enabling tracers in cardano-node to debug this
- There are funny things in the logs like `FetchDeclineChainNotPlausible`
...declined","declined":"FetchDeclineChainNotPlausible","peer":{"remote":{"addr":"127.0.0.1","port":"25898"},"local":{"addr":"127.0.0.1","port":"41739"}}},{"length":"1","kind":"FetchDecision
- Reading our own logs would help though.. seems we see the tx, but not the corresponding `OnChainTx`
FromDirectChain "alice" (ReceiveTxs {onChainTxs = [], receivedTxs = [ValidatedTx {...
- Duh.. `observeCommitTx` was not even called from Direct chain / `runOnChainTxs`
- What kind of test would cover / improve this?
- Commit e2e test passes!
- Continue with creating a `MockCommit` script to focus on off-chain parts for now
  - the `Commit` script is not wrong per se, but we would not know as no test is covering them right now
  - also, any change driven by the off-chain logic (i.e. observing a commit tx) would force us updating the validator logic (not being checked etc.)
- How to identify a commit tx?
- looking at the outputs? a single pay to v_commit?
- using the PT?
- After introducing an on-chain `Utxo` type, we realize that `convertUtxo :: OnChain.Utxo -> Utxo tx` is tricky
  - after a short discussion we decided to go via a binary representation
  - for now ToJSON / FromJSON, later CBOR or so
  - this allows us to use the `Tx` type class to convert to and from the on-chain `Utxo`
- Observing a commit tx works
- we were surprised by the small size and consistent size (~7kB)
- adding a scale (*100) had the transaction size property fail
- so it likely is because the `Gen (Utxo SimpleTx)` is staying in reasonable orders of magnitude and it's just Integers that we store
- The local-cluster test of committing via DirectChain still fails
  - with a non-saying `PostTxFailed`
  - led us to changing a `Maybe` function chain back to use `error` for better visibility
  - obviously we should improve error handling!
- The reason is that `initials = []`
  - as a next step, creating outputs which pay to a `MockInitial` script could work
- After introducing a `MockInitial` script some unit tests fail
- seems like abortTx + observeAbortTx do not yield a Just anymore
- could it be that the mocked redeemer type `()` messes with `Head.Abort`?
- can't seem to spot the bug in `observeAbortTx` and how it would lead to `Nothing` .. too long of a day as it seems
- Found the reason: `observeAbortTx` does think it sees a `Just CollectCom` when encountering `DataConstr Constr 0 []` in the redeemers, although that was the plutus equivalent of `()` (the `MockInitial` redeemer type)
- Next step: make `observeAbortTx` more robust
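The ambiguity found above can be reproduced with a toy model. This is a sketch under assumptions, not the actual `observeAbortTx`: Plutus data is modelled as a constructor index plus fields, `CollectCom` sits at index 0 as the note reports, the index of `Abort` is assumed, and the input labels are made up.

```python
from typing import Dict, NamedTuple, Optional


class Constr(NamedTuple):
    """Toy model of a Plutus data constructor: index plus fields."""
    index: int
    fields: tuple = ()


UNIT = Constr(0)  # the encoding of (), i.e. Constr 0 []

# Hypothetical constructor order of the head redeemer type.
HEAD_REDEEMERS = {0: "CollectCom", 1: "Abort"}


def decode_head_redeemer(data: Constr) -> Optional[str]:
    """Decoding by shape alone: () is indistinguishable from CollectCom."""
    return None if data.fields else HEAD_REDEEMERS.get(data.index)


def observe_naive(redeemers: Dict[str, Constr]) -> Optional[str]:
    """Take the first redeemer that decodes, whichever input it belongs to."""
    for data in redeemers.values():
        decoded = decode_head_redeemer(data)
        if decoded is not None:
            return decoded
    return None


def observe_robust(redeemers: Dict[str, Constr], head_input: str) -> Optional[str]:
    """Only decode the redeemer attached to the head (thread) input."""
    data = redeemers.get(head_input)
    return None if data is None else decode_head_redeemer(data)
```

One way to make the observation more robust, shown in `observe_robust`, is to decode only the redeemer that belongs to the head output being spent rather than scanning all redeemers of the transaction.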
- Discussion about how `postTx` would fail if a transaction is invalid (or could not be submitted)
  - Synchronous failure via a return value or exception vs. asynchronous failure via the callback "back-channel"
  - Easier to stick with the callback (`OnChainTx`) for now
  - How to extend that type to accommodate a `PostTxFailed` -> wrap or additional data constructor?
  - Decided for the latter
- Test assertion is then:
failAfter 10 $ takeMVar calledBackBob `shouldReturn` PostTxFailed
- Bob thinks he is part of the party now
- Sees the InitTx even though he is not invited
- Two ways:
- Provide the hydra credential to the direct chain and check HeadParameters against it
- Require to be paid (via a script) a participation token, spendable by our Cardano credentials
- Take small steps: We go with checking Hydra credentials being in parameters or not in off-chain
- Formulating a property test in `TxSpec` to facilitate "not being able to observe init tx when not invited"
  - required adding `Party` everywhere in `Chain.Direct`
- Prop tests pass, local-cluster test errs because can't post `AbortTx` in closed state
  - this catches the invalid scenario even in a synchronous manner
  - in contrast to knowing whether a posted tx failed (tx submission thread is asynchronously connected via a queue)
- After wrestling a bit with the tx submission code, we get the DirectChainSpec test to pass!
- unfortunately, this is still not forcing us to do proper on-chain validation
- the stateful nature of the DirectChain component prohibits accidentally closing "other heads"
Today's goals:
- Finish reading Maladex's white-paper (they talk quite a lot about Hydra)
- Timebox trying to upgrade dependencies to make it possible to construct `Abort` transaction with cardano-api
- Draft new ADRs
- Add PT minting, initial validator, and burning to init and abort txs
Trying to find a suitable set of dependencies to be able to build Hydra OCV transactions with cardano-api.
Found recent commits in cardano-node and cardano-ledger-specs that work fine but this leads to issues in plutus because of dependencies to stuff in networking
Removing all dependencies to PAB and off-chain `Contract` from hydra-node to try to get to a smaller set of dependencies in Plutus
Sadly, the on-chain part of the `StateMachine` code is in the new repository plutus-apps, which depends on an older version of cardano-node: it is from more than one month ago and does not contain the changes we need, eg. allowing to store script as part of output's datum.
After upgrading dependencies for cardano-node, cardano-ledger-specs and plutus, and removing stuff related to the PAB, the local cluster test using cardano-api built transactions for init and abort passes:
Test.LocalCluster
should produce blocks and provide funds
Finished in 6.8452 seconds
1 example, 0 failures
Test suite integration: PASS
I had to vendorize the `StateMachine` related code from the `plutus-apps` repository in order to make it work though.
Master build failed following merge, investigating why and fixing it:
- Timeout for observing transaction submission on-chain might be too low for slow CI, might want to increase it
- Also, when tests fail, we cannot easily access the logs that were generated from the tests failure, we could upload the logs to some S3/google storage bucket when CI tests fail
Drafting ADRs related to cardano-api and direct chain interaction
Trying to write a test to introduce the need for Participation Tokens minting in order to make it possible to Commit funds to an opened Head and thus actually do useful stuff with an opened Head
The purpose of the PTs is to
- ensure only identified Head participants can commit
- ensure only Head participants can advance the Head, eg. post transactions for the OCV state machine

The question is: How does one observe the rejection of a transaction submission, given this process is asynchronous?
- Seems like the "right way" to do this would be to add a constructor in the `OnChainTx` data type representing failed submissions, so that the node is notified when a submission fails? When we post a transaction we actually need to evaluate the validators in order to be able to balance it and assign execution units, so we know whether or not the transaction is correct before submitting, even though a submitted transaction can still "fail", eg. being rolled back or rejected by the ledger because of double spending
- There are quite a few `error` calls in `Direct` module anyways, that should be handled in one way or another
Direct chain test is still failing apparently randomly, might come from issues in handling of rollbacks in the wallet: https://github.com/input-output-hk/hydra-poc/runs/3998479889?check_suite_focus=true#step:7:2699
- When Hydra (on mainnet)?
- What does opening a Hydra Head actually mean?
- Can a Hydra Head be opened from a mobile phone?
- How can a Hydra Head (auto-)scale horizontally?
- What are the limitations for Hydra when fully implemented?
- Where does the DApp go in Hydra?
- How would Hydra work with AMM (automatic market makers?)?
- Is Hydra a mixer?
- How is Hydra different from (zk-)rollups?
Trying to execute init -> abort script sequence using `CardanoClient`, eg. only using cardano-api stuff
Currently lost in the maze of type wrappers for scripts...
Got a surprising error when trying to submit transaction:
uncaught exception: CardanoClientException
BuildException "TxBodyError TxBodyMissingProtocolParams"
The Protocol parameters are required in the `TxBodyContent` to build the body in Alonzo era
So the script got executed but it produces the infamous error:
uncaught exception: CardanoClientException
BuildException "TxBodyScriptExecutionError [(ScriptWitnessIndexTxIn 0,ScriptErrorEvaluationFailed (CekError An error has occurred: User error:\nThe provided Plutus code called 'error'.))]"
Activated debugging output for scripts execution to try to see what's happening, but there are no logged errors, so it's a more general problem with the way the transaction is built.
- Trying to add collateral input does not fix the problem.
- MB spots (at least) one problem with the transaction: The datum is not part of it so the `Head.validatorScript` will fail because the state machine requires both datums (input and output) to be present in the transaction.
- As observed by SN, there used to be no way in the cardano-api to include a `Datum` in a transaction, but this has changed "recently": One can now construct a `TxOut` passing either a `TxOutDatumHash` or a `TxOutDatum` which will then be included in the scripts' context
- Problem is: coverFee does not know what's the input value to balance, inputs only is a txref
- short discussion about what to do now
- SN mentions this is a "self-made problem" as there is a split between the chain component and the tiny wallet, where the chain component would know the relevant utxo
- By keeping the separation we might end up with a good interface of "doing it externally" later
- Start pairing by adding a UTXO to lookup inputs to
coverFee :: ValidatedTx Era -> STM m (Either ErrCoverFee (ValidatedTx Era))
- After adding `lookupUtxo` to `coverFee_` we introduce `knownUtxo` to get a set of known utxo from the Chain component's `OnChainHeadState` to provide to `coverFee`
- SN is not convinced by the OnChainHeadState in general
- But this provides the right info at the right time without refactoring too much now
knownUtxo :: OnChainHeadState -> Map (TxIn StandardCrypto) (TxOut Era)
- Consequently we also add `TxOut` values to the `OnChainHeadState`'s `Initial` constructor
  - This required more work in `observeInitTx` to provide the `TxOut` and made its implementation more complex
  - But we also fixed the "bug" that its thread output might not always be the `firstInput`
- When trying to fix `observeAbortTx` with the additional txout for the `threadOutput`...
  - fixing it is smelly as it is not used at all in `observeAbortTx` and one would rather expect a single "Head identifier" or so
  - Remove the unused types with intention to re-add what the `observeXXX` require further down the road to not "overgeneralize"
  - Removing the `OnChainHeadState` made the signatures of `observeAbortTx` again a `-> Maybe (OnChainTx, OnChainHeadState)` and simplified implementation of `observeInitTx` again
- WalletSpec now failing because we changed interface and semantics of `coverFee`
- Need to also `resolveInput` in `walletUtxo` (besides `lookupUtxo`)
- We now get an `ErrorCall` from the ledger for too big TxOut values
  - scaling to `reasonablySized` generators helped
- `coverFee` results in double the amount -> we counted the selected utxo twice
  - include the selected utxo to cover fee to the inputs (use inputs' for resolvedInputs)
- `MissingScriptUTXOW` remains an error
  - we saw this in the past
  - this time again, the ledger sees less needed than provided scripts -> confusing error
  - in our case it was the abort tx not spending any 'initial' outputs (yet)
- We get `FeeTooSmallUTxO` error now from the ledger
  - after initial confusion about a too low fee, or that `minFeeA` and `minFeeB` being 0 are possible problems
  - we found out that in fact the fee is just too low for script execution etc.
  - our cluster had too high `executionPrices` -> use realistic (mainnet) values for genesis-alonzo.json
- The integration test passes!
- We have a full roundtrip of posting and observing Init -> Abort, i.e. the short Head lifecycle (with many simplifications)!
- Another round of discussions over deposit-based Head allowing more participants off-chain than on-chain, e.g. variation on the idea of distinguishing between running a Hydra Head node and using a Hydra Head to make transactions.
- There is some research in that direction, along the lines of the tail protocol but removing the requirement for large amount of collateral deposit from intermediaries
- We discussed a related approach proposed by Matthias based on deposits and inspired by Lightning and drew it up on our Miro board
Analysing transaction that fails validation to check our scripts execution logic and redeemers setting
- Turns out the issue in the `coverFee_` test came from missing coins in the abortTx output: In the initTx output, we add a fixed 2000000 lovelace, but in the output of the abortTx we set the fee to 0
- Now more tests are failing, probably because of changes in `coverFee`. Also the previous `abortTx` property fails because of the missing 2000000.
Test checking abortTx transition was flaky because we were looking at the datums to identify the aborttx, but it could be the case that we decode the initial datum first -> Refactored code to look at the redeemers instead of the datums
- Test is still failing because of the 0 execution units
Writing test to check `coverFee_` updates and cover execution cost of scripts
- Instead of calculating the exact execution cost for the redeemers, we take the maximum for a Tx and divide by the number of redeemers.
- But it's not much more complicated to call the actual function for computing exunits... 🤔
We notice we are consuming the same UTXO in both transactions because the Wallet does not remove the UTXO it consumes when it covers fees
- `retry` in STM for the UTXO set to change when retrieving it
The error we now get is weird, it seems to be a mix of several errors:
- It's unbalanced
- It's using a UTXO which has already been spent before (39786f186d94d8dd0b4fcf05d1458b18cd5fd8c6823364612f4a3c11b77e7cc7)
- The MissingScriptWitnessesUTXOW error should be accompanied with a list of missing hashes
- => We should improve error reporting in our Direct chain test to understand better why it fails...
Continue working on replacing script with actual calls to cardano-api, partly as educational work, partly to build a proper CardanoClient we'll be able to use elsewhere when interacting with a node.
- Struggling to extract the value from the output: Some functions are very recent and not available in our version of the API which is already 3 weeks old
- Replaced more cardano-cli calls with custom functions in CardanoClient module, could go with more but I would like to have the master build.
Working on making master green, e.g. have DirectChainSpec validate
- Added query for PParams as it's needed for computing scriptIntegrityHash
- Compute scriptIntegrityHash in coverFee_ -> InitTx passes correctly, now tackling AbortTx which is failing currently
Adding a property test checking coverFee_ does the right thing with execution units computation and sets redeemer pointers right should it add a UTXO
- There's this annoying PParams lying around, going to add it to some fixture code so that we can use it in different modules.
- Managed to get the property for coverFee to fail with the "right" error, i.e. covering of transaction succeeds but validation fails because rdptr is not correctly set, going to write code to fix that :fingers_crossed:
Adjusting the redeemer pointers, the idea is to compare the two sorted lists of inputs and adjust those RdmrPtr for which the initial value is different from the final one, once we have added the input for balancing tx and paying fees.
- It's not working, still got a script execution error for one of the redeemers but it seems the adjustment of redeemer pointers actually worked.
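The comparison of the two sorted input lists can be sketched outside of the ledger types. This is a minimal illustration, not the Wallet code: inputs are written as txid#index, a pointer is taken to be the zero-based position of an input in the sorted list, and sorting by txid text then numeric index is an assumption standing in for the ledger's ordering. All function names are hypothetical.

```shell
#!/bin/sh
# sorted_inputs: read inputs (one "txid#index" per line) on stdin and print
# them ordered by txid, then by numeric index (assumed ledger-like order).
sorted_inputs() {
  LC_ALL=C sort -t '#' -k1,1 -k2,2n
}

# pointer_of INPUT LIST: zero-based position of INPUT in the sorted LIST,
# where LIST is a newline-separated string of inputs.
pointer_of() {
  printf '%s\n' "$2" | sorted_inputs |
    awk -v want="$1" '$0 == want { print NR - 1 }'
}

# adjusted_pointers INITIAL FINAL: for each input of INITIAL, print
# "input old new" when its pointer differs between INITIAL and FINAL.
adjusted_pointers() {
  printf '%s\n' "$1" | while IFS= read -r input; do
    old=$(pointer_of "$input" "$1")
    new=$(pointer_of "$input" "$2")
    [ "$old" = "$new" ] || printf '%s %s %s\n' "$input" "$old" "$new"
  done
}
```

Adding a fee-paying input that sorts before the script inputs shifts every pointer after it by one, while an input that sorts last changes nothing, so only the reported entries need their redeemer pointer rewritten.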
Writing a cardano-cli "wrapper" module, something that would provide useful functions for common operations in order to remove the need to run scripts from the command-line, running into a couple snags:
- The generators exposed in the sub-library for cardano-api use Hedgehog instead of QuickCheck, so we need to convert them but this loses shrinking capability
- Of course, the cardano-cli uses types from cardano-api, but our Wallet uses cardano-ledger-specs types. Going to write the module using cardano-api types
While trying to replace address building from cardano-cli with Haskell code I got the following error:
test/Test/LocalClusterSpec.hs:25:11:
2) Test.LocalCluster should produce blocks and provide funds
uncaught exception: ErrorCall
InputTextEnvelopeError (TextEnvelopeTypeError [TextEnvelopeType "PaymentVerificationKeyShelley_ed25519"] (TextEnvelopeType "GenesisUTxOVerificationKey_ed25519"))
CallStack (from HasCallStack):
error, called at src/Relude/Debug.hs:288:11 in relude-1.0.0.1-KWrPF7zdlFZ8gdnjuoSoUr:Relude.Debug
error, called at test/Test/LocalClusterSpec.hs:45:21 in main:Test.LocalClusterSpec
assertCanSpendInitialFunds, called at test/Test/LocalClusterSpec.hs:25:11 in main:Test.LocalClusterSpec
although the underlying representation is the same, it fails to deserialise properly because of the envelope tag, so need to make address more robust.
There is the castVerificationKey function which could be used for that?
In the genesis-shelley.json file we need a base16 encoding of the address but cardano-cli uses bech32
cardano-cli address build --testnet-magic 42 --payment-verification-key-file alice.sk | bech32
Managed to have a Direct component connected to the local cluster, had to tweak some parameters in the cardano-node.json to have protocols activated.
- We can see the transaction submitted but it fails, first with minimum output value, then with not being balanced -> need to wire wallet for balancing
- coverFee in Wallet takes a TxBody and not a ValidatedTx, so we need to adapt it
- Still missing computation of scriptIntegrityHash which should be done as part of tx balancing, fee payment and signing because it requires the final set of inputs to be defined
- Updated GHC to version 8.10.7
- Updated haskell.nix to version fd4d10efe278ba9ef26229a031f2b26b09ed83ff
- Removed nix dependency to cardanoPkgs
- It's probably what caused me a lot of trouble when I first tried to update haskell.nix a week ago
- local-cluster now uses build-tool-depends to pull cardano-node and cardano-cli in scope which guarantees we use the same version everywhere
- Updated plutus version to 5ffcfa6c0451b3b937c4b69d2575cd55adebe88b
- Updated ledger and node packages to suit plutus' dependencies
- Note this implies we temporarily depend on a fork of cardano-ledger-specs: https://github.com/raduom/cardano-ledger-specs, which does not yet contain major changes to directory structure
- Fixed minor changes in API following dependencies update
- Updated cabal.project's index-state to 2021-08-14T00:00:00Z
- This needs to be older than haskell.nix's version but I am really unsure how they relate to each other
- Ideally, we should need to change only one of those to ensure pinned dependencies
Goal for today:
- Have a working Alonzo cluster with funds in local-cluster package
- Rename local-cluster -> hydra-cluster and rename modules
I can see the cluster tries to start but there's no logs, need to store the logs as we do in the EndToEndSpec tests
- Added capture of logs to the CardanoNode process wrapper but seems like no logs are output => need to add backends to the node's configuration -> Now have logs activated for all cardano-nodes, in JSON
- Nodes are starting up and apparently succeeding in their connection, not sure why they are not producing logs? Our waitForNewBlock function is a bit crude => making it a bit smarter actively waiting for a new block to be produced
Adding a test checking we can make a simple payment transaction using initial funds:
- Test simply executes a script using cardano-cli, this sounded much easier than trying to replicate all commands within Haskell
- Struggling to get script to run correctly, it cannot find the cardano-cli executable in its PATH even though I pass the testing process' environment to it => I was incorrectly calling [ -x cardano-cli]
- I can see the transaction submitted to the node-1's mempool but it seems to never end in a block?
- The transaction appears to be rejected by the mempool:
{"thread":"73","loc":null,"data":{"tx":{"txid":"txid: TxId {_unTxId = SafeHash \"00943cd84146550c7162ba5fc9d2bdef940afddfc4712e916060af5373acefdb\"}"},"mempoolSize":{"numTxs":1,"bytes":234},"kind":"TraceMempoolRejectedTx","err":{"produced":{"policies":{},"lovelace":900000000000},"kind":"ValueNotConservedUTxO","consumed":{"policies":{},"lovelace":0},"badInputs":["2fec440b7b461450420820a57f913d17525bc915da37d86e0423775110a05683#0"],"error":"This transaction consumed Value 0 (fromList []) but produced Value 900000000000 (fromList [])"}},"sev":"Info","env" :"1.30.0:7ff91","msg":"","app":[],"host":"haskell-","pid":"696941","ns":["cardano.node.Mempool"],"at":"2021-10-15T08:10:56.70Z"}
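On the executable lookup problem noted above: a test such as [ -x cardano-cli ] only checks a file relative to the current directory (and needs a space before the closing bracket), so it says nothing about PATH. A minimal sketch of a PATH-aware check; require_executable is a hypothetical helper name:

```shell
#!/bin/sh
# require_executable NAME: succeed only if NAME resolves the same way the
# shell would resolve it when running a command, i.e. through PATH.
require_executable() {
  command -v "$1" >/dev/null 2>&1
}

# Report a missing tool without aborting the script.
if ! require_executable cardano-cli; then
  echo "cardano-cli not found in PATH" >&2
fi
```

command -v is specified by POSIX, so the same check should behave identically under plain /bin/sh and under bash.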
It takes time for the UTXO to be committed, going for an active loop in the script to check the newly created UTXO presence:
- One problem in the script: the -e option fails the entire script as soon as 1 sub-command fails, so grep failing meant the script exited immediately
- Another problem: I was using some syntax not available in plain /bin/sh so it was running somewhat differently inside the test and outside of it....
- It actually takes a while to have the transaction correctly submitted => Reduced slot length to 100ms and it's definitely much faster :)
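The active loop and the -e pitfall described above can be sketched together. This is an illustration in plain POSIX sh, not the script used in the test; wait_for is a hypothetical helper, and the one-second pause is an arbitrary choice:

```shell
#!/bin/sh
set -e

# wait_for PATTERN RETRIES COMMAND...: rerun COMMAND until its output
# contains PATTERN, giving up after RETRIES attempts.
wait_for() {
  pattern=$1
  retries=$2
  shift 2
  while [ "$retries" -gt 0 ]; do
    # A failing grep inside an `if` condition does not trigger `set -e`,
    # so a "not found yet" result no longer kills the whole script.
    if "$@" | grep -q -F -- "$pattern"; then
      return 0
    fi
    retries=$((retries - 1))
    sleep 1
  done
  return 1
}
```

Under these assumptions, waiting for a freshly created output could look like wait_for "$txid" 30 cardano-cli query utxo --whole-utxo --testnet-magic 42, where txid is a variable holding the transaction id being waited for.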
PR for AbortTx is green 🎉 🍾
Plans for today:
- merge PR to master
- send head -> abort transaction sequence to a local testnet
Now trying again to submit an aborttx to the testnet manually
Note: We should make available soon a Hydra Test Cluster "framework" or package that will make it easy for people to start a local cluster and interact with it programmatically, e.g. expose our HydraNode from the local-cluster package to be used downstream and not only the docker-compose file. When working on the cardano-node cluster, I have found that having scripts is great but having our local-cluster is even greater as it should now be straightforward to wrap any test within this cluster and connect our hydra-nodes to it.
Exposing this feature to other developers would be super-useful.
Discussing with MB issues with running scripts on-chain:
- configure local-cluster to start in Alonzo era with some funded UTXOs
- see https://github.com/input-output-hk/cardano-configurations to pull config
- fix how we create transactions:
- correctly assign ex units => need to evaluate the Tx and update ex units in redeemers
- script integrity hash -> see hashScriptIntegrity in Ledger
- add collateral input -> should be done in the tinyWallet reusing the one and only input we have in the wallet
- beware of redeemer pointers logic: balancing the tx adds a new input which can change the pointer logic
- see Shelley/Transaction.hs in cardano-wallet
- beware of error in execution units -> seems like one needs to take a large margin (like 2x)
- script execution takes maximum 6 ADA
- wire wallet in the Direct.Tx to cover fees
- should also assign redeemers?
To debug scripts failure need a cardano-node rebuilt with tweaked cardano-ledger-specs to provide logging output when evaluating scripts
- Wrote a simple script to submit transactions for a head until the abort tx.
- Managed to get some debugging output:
["L1","Ld","S5","PT5"]
- L1 -> Output constraint failed
- Ld is used for 2 things:
MustSatisfyAnyOf xs ->
    traceIfFalse "Ld" -- "MustSatisfyAnyOf"
    $ any (checkTxConstraint ctx) xs

{-# INLINABLE checkScriptContext #-}
-- | Does the 'ScriptContext' satisfy the constraints?
checkScriptContext :: forall i o. ToData o => TxConstraints i o -> ScriptContext -> Bool
checkScriptContext TxConstraints{txConstraints, txOwnInputs, txOwnOutputs} ptx =
    traceIfFalse "Ld" -- "checkScriptContext failed"
    $ all (checkTxConstraint ptx) txConstraints
    && all (checkOwnInputConstraint ptx) txOwnInputs
    && all (checkOwnOutputConstraint ptx) txOwnOutputs
- S5 -> "State transition invalid - constraints not satisfied by ScriptContext" from the StateMachine library
- PT5 -> generic error when a check is false
Getting an even more puzzling error when trying to submit txs with MockHead script:
cardano-cli transaction build --alonzo-era --cardano-mode --testnet-magic 42 --change-address $(cat alice/payment.addr) --tx-in 0e2c5cb64ca7012dd235bad5b00fc8bf86662172e8af600a35aa5d42e761e5c3#1 --tx-in-collateral 0e2c5cb64ca7012dd235bad5b00fc8bf86662172e8af600a35aa5d42e761e5c3#0 --tx-out $(cardano-cli address build --payment-script-file mockHeadScript.plutus --testnet-magic 42)+1000 --tx-out-datum-hash ff5f5c41a5884f08c6e2055d2c44d4b2548b5fc30b47efaa7d337219190886c5 --tx-in-script-file mockHeadScript.plutus --tx-in-datum-file headDatum.data --tx-in-redeemer-file headRedeemer.data --out-file abort.draft --protocol-params-file example/pparams.json
Command failed: transaction build Error: The transaction does not balance in its use of ada. The net balance of the transaction is negative: Lovelace (-21987700) lovelace. The usual solution is to provide more inputs, or inputs with more ada.
- This shows I don't provide enough input to pay for the scripts' execution according to the result of evaluating execution units -> providing more input makes the transaction valid and submitted successfully.
L1 failure code happens in checkOwnOutputConstraint which checks the datum hash of the outputs of the transaction. It's as if it was missing the abort datum, but the hash is actually present in the transaction.
- Trying to use the --tx-out-datum-file option which passes a hash extracted from datum file does not work either.
- 🎉 The trick is to use --tx-out-datum-embed-file which puts the data in the transaction!
- This means transactions working with the state machine always require the datums of their outputs to be included and not only the hashes, which increases the tx size requirements esp. in the case of Hydra where the state of the SM can potentially be large
Configuring our local-cluster to start in alonzo and have initial funds
- Extended LocalClusterSpec to check we can actually spend outputs from initial funds. Writing it all in Haskell would be a PITA so I will first try to do that using an embedded shell script.
- Got an error when starting the cluster, so need to not delete the created directory if the tests fail.
- LocalClusterSpec is failing with new configuration, but not sure if this is not just a timeout problem -> need to gather logs...
Managed to have the project build outside nix-shell, sidestepping some issue in the retrieval of blocks in Wallet module.
All but 3 tests pass:
- TxSpec test for aborttx fails, I suspect this is caused by the change to validatorHash now that the dependencies are updated
- it cannot find cardano-node nor cardano-cli executable -> they should be provided in build-tool-depends as suggested by MPJ
- prometheus monitoring fails, perhaps a side effect of the above
What I did:
- Install ghcup and set versions to GHC 8.10.7
$ curl --proto '=https' --tlsv1.2 -sSf https://get-ghcup.haskell.org | sh
$ ghcup set ghc 8.10.7
I had some troubles with my path because I asked to append ~/.ghcup/bin to PATH instead of prepend
- Install various system dependencies
$ sudo apt install -y build-essential curl libffi-dev libffi7 libgmp-dev libgmp10 libncurses-dev libncurses5 libtinfo5
$ sudo apt install -y libz-dev liblzma-dev libzmq3-dev pkg-config libtool
Do not confuse lzma with liblzma-dev, those are 2 existing packages
- Install forked libsodium
git clone https://github.com/input-output-hk/libsodium
cd libsodium/
git checkout 66f017f16633f2060db25e17c170c2afa0f2a8a1
./autogen.sh
./configure
make && sudo make install
- Build and test everything:
cabal build all && cabal test all
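Regarding the PATH trouble mentioned in the ghcup step: lookup stops at the first match, so an appended ~/.ghcup/bin loses to any GHC already installed system-wide. A minimal sketch of the prepend variant, assuming the default ghcup location:

```shell
#!/bin/sh
# Prepend so the ghcup-managed ghc and cabal shadow any system-wide ones;
# appending would leave an older GHC found first.
PATH="$HOME/.ghcup/bin:$PATH"
export PATH

# Show which ghc now wins the lookup (prints nothing if none is installed).
command -v ghc || true
```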
Trying to update haskell.nix at fd4d10efe278ba9ef26229a031f2b26b09ed83ff and using ghc8107 -> updated materialisation first -> nix-shell works fine and I can build hydra 🎉 locally on my VM.
It appears the datum file which the cardano-cli requires must be the JSON representation of the Data and not the TextEnvelope. => making the changes in the inspect-script code...
- Doing Aeson.encode $ toData mydata does not work, it generates some CBOR encoding and not JSON
- Turns out one must not use serialiseToTextEnvelope for datums but scriptDataToJson ScriptDataJsonDetailedSchema. But scriptDataToJson operates on a ScriptData which is a mirror in cardano-api of Plutus' Data so first one must convert toData and then use Cardano.Api.Shelley.scriptDataFromPlutus
From Duncan on slack:
if you're starting from a type from the underlying ledger rather than starting with the API types, then yes you can use the conversion functions to/from the underlying ledger types. The general principle of the API is that Cardano.Api exports everything at the level of the API types and Cardano.Api.Byron or Cardano.Api.Shelley exports and exposes all the underlying representations and conversion functions. So since you want to "lift the lid" and see all the representations (i.e. using fromPlutusData) then you want to import Cardano.Api.Shelley.
The generated datums should be OK for now, rebuilding cardano-node in order to be able to start a local cluster and retry submitting my script transactions.
- The generated TextEnvelope for scripts have an incorrect type but not sure why? -> Need to use a Ledger.Scripts.toCardanoApiScript on Plutus' scripts to "convert"
Trying to fix the last remaining important issue following up dependencies upgrade, namely the Point Block conversion problem: We have a Point (ShelleyBlock (AlonzoEra c)) and we want a Point (CardanoBlock c).
castPoint definition requires coercibility between 2 different HeaderHash type family instances:
castPoint :: Coercible (HeaderHash b) (HeaderHash b') => Point b -> Point b'
The concrete needed coercion looks like:
ShelleyHash StandardCrypto
-> OneEraHash
'[Ouroboros.Consensus.Byron.Ledger.Block.ByronBlock,
ShelleyBlock (Cardano.Ledger.Shelley.ShelleyEra StandardCrypto),
ShelleyBlock (Cardano.Ledger.Allegra.AllegraEra StandardCrypto),
ShelleyBlock (Cardano.Ledger.Mary.MaryEra StandardCrypto),
ShelleyBlock (Cardano.Ledger.Alonzo.AlonzoEra StandardCrypto)]
Following the chain of definitions and imports gives me:
newtype OneEraHash (xs :: [k]) = OneEraHash { getOneEraHash :: ShortByteString }
then
newtype ShelleyHash c = ShelleyHash {
unShelleyHash :: SL.HashHeader c
}
...
type instance HeaderHash (ShelleyBlock era) = ShelleyHash (EraCrypto era)
with SL.HashHeader being
newtype HashHeader crypto = HashHeader {unHashHeader :: Hash crypto (BHeader crypto)}
then
type Hash c = Hash.Hash (HASH c)
where module Hash is ultimately Cardano.Crypto.Hash.Class which does not export the constructor for newtype Hash
module Cardano.Crypto.Hash.Class
( HashAlgorithm (..)
, sizeHash
, ByteString
, Hash(UnsafeHash)
...
newtype Hash h a = UnsafeHashRep (PackedBytes (SizeHash h))
...
pattern UnsafeHash :: forall h a. HashAlgorithm h => ShortByteString -> Hash h a
pattern UnsafeHash bytes <- UnsafeHashRep (unpackBytes -> bytes)
where
UnsafeHash bytes = UnsafeHashRep (packBytes bytes :: PackedBytes (SizeHash h))
{-# COMPLETE UnsafeHash #-}
So if I cannot coerce I could just pattern match and get the underlying ShortByteString and rewrap it.
Right tip -> do
let blk = case tip of
GenesisPoint -> GenesisPoint
(BlockPoint slot h) -> BlockPoint slot (fromShelleyHash h)
fromShelleyHash (Ledger.unHashHeader . unShelleyHash -> UnsafeHash h) = coerce h
query = QueryIfCurrentAlonzo $ GetUTxOByAddress (Set.singleton address)
pure $ LSQ.SendMsgQuery (BlockQuery query) (clientStQueryingUtxo blk)
Now I need to fix the TxSpec test which fails, probably because serialisation has been fixed in Plutus
- ✅ Replaced convoluted Initial.Dependencies hash computation with validatorHash and script validates
Rebasing ch1bo/aborttx branch over master as it does not have some changes improving over flakiness of monitoring tests
Back to submitting transactions, restarting a cluster and recreating a user:
mkdir alice
cd alice
cardano-cli address key-gen --verification-key-file payment.vkey --signing-key-file payment.skey
cardano-cli address build --testnet-magic 42 --payment-verification-key-file payment.vkey > payment.addr
cd ..
cardano-cli query utxo --testnet-magic 42 --address $(cat alice/payment.addr)
I managed to build the aborttx transaction except the script validation failed before submission:
cardano-cli transaction build --alonzo-era --cardano-mode --testnet-magic 42 --change-address addr_test1vqpfgdh6ldx73nypc5hkur2wm2hpt0kx240qlxvykhy8efc74sfu5 --tx-in 6bd74fd0e48e6a35c4fd59ba474b671866f115bc67fc8d6d84259e45e229bf15#1 --tx-in-collateral 6bd74fd0e48e6a35c4fd59ba474b671866f115bc67fc8d6d84259e45e229bf15#0 --tx-out addr_test1wp3urt44rzvpsj2fu696su9ee573m6ne0ce4uydhcdnwhkshjamur+1000 --tx-out-datum-hash ff5f5c41a5884f08c6e2055d2c44d4b2548b5fc30b47efaa7d337219190886c5 --tx-in-script-file headScript.plutus --tx-in-datum-file headDatum.data --tx-in-redeemer-file headRedeemer.data --out-file abort.draft --protocol-params-file example/pparams.json
Command failed: transaction build Error: The following scripts have execution failures:
the script for transaction input 0 (in the order of the TxIds) failed with:
The Plutus script evaluation failed: An error has occurred: User error:
The provided Plutus code called 'erro
🤦 It's perfectly possible all this dance on dependencies upgrade was unneeded to submit transactions manually. I thought the formats had changed over the past few weeks but turns out I was using the wrong serialisation functions...
Today's goal:
- Spin-up an Alonzo (local) test network
- (optional) Send transactions from our Direct chain component to this network
The scripts directory in cardano-node repo contains mkfiles.sh script that does the necessary magic to create an Alonzo network, either transitioning all the way from Byron to Alonzo, or hardforking immediately at epoch 0.
Script needs to be run from top-level:
$ scripts/byron-to-alonzo/mkfiles.sh
~/cardano-node/example ~/cardano-node
scripts/byron-to-alonzo/mkfiles.sh: line 205: cardano-cli: command not found
and requires cardano-cli and cardano-node to be available in PATH
I need a recent version of cardano-node obviously to start an alonzo network, latest version in master is 1.30.0, but the version we have in scope in hydra-poc is 1.27.0
Activating nix inside cardano-node directory through echo use nix > .envrc and direnv allow .envrc -> perhaps we should upgrade our dependencies after all...
Other option suggested by MB: https://github.com/input-output-hk/cardano-wallet/blob/master/lib/shelley/exe/local-cluster.hs
- Wallet uses this version of cardano-node: https://github.com/input-output-hk/cardano-node/commits/0fb43f4e3da8b225f4f86557aed90a183981a64f
- Cardano-node (master) depends on this plutus version:
commit edc6d4672c41de4485444122ff843bc86ff421a0
Merge: 569f98402 63c6ca8ac
Author: Michael Peyton Jones <michael.peyton-jones@iohk.io>
Date: Fri Aug 20 10:43:53 2021 +0100
Merge pull request #3430 from input-output-hk/hkm/windows-cross
windows cross compile
Running scripts/byron-to-alonzo/mkfiles.sh alonzo "works": I can see 3 nodes up and running. Now need to understand how to post a transaction to them...
In the cardano-node scripts there's an initialFunds field which sets some lovelaces to some address, but in the wallet there's none and it says there can't be as it needs to transition from Byron to Shelley, but if we hard fork at epoch 0 immediately this should work?
cardano-cli can talk to the node and get some information:
$ CARDANO_NODE_SOCKET_PATH=example/node-bft1/node.sock cardano-cli query tip --cardano-mode --testnet-magic 42
{
"epoch": 5,
"hash": "48c4b8c546a0a9ffd0649a77b0926881e6e8869d83cb6da70f1d32ac9f936878",
"slot": 2700,
"block": 182,
"era": "Alonzo",
"syncProgress": "42.68"
}
I can also get the UTXO set of the network:
$ CARDANO_NODE_SOCKET_PATH=example/node-bft1/node.sock cardano-cli query utxo --cardano-mode --whole-utxo --testnet-magic 42
TxHash TxIx Amount
--------------------------------------------------------------------------------------
61fa39c2f3e110850c741da3a0f978bcee0fd9abfc7b0bce4df3ea047d61e824 0 5010000000 lovelace + TxOutDatumNone
704e1dc2f4dfcc44c0ba90978a0d58371b5f7ee1d3c47b1cedb52e1b1cb37b18 0 5010000000 lovelace + TxOutDatumNone
f2d04cab14eefbb0571e6a74b64b49453ac3312c20ce8fcda9d125c9020bd267 0 900000000000 lovelace + TxOutDatumNone
Leveraging SN's previous experiments to learn how to create a transaction to send some ADAs between addresses in the testnet
- How do I get the details of a transaction using cardano-cli?
- Going through https://github.com/input-output-hk/cardano-node/blob/master/doc/stake-pool-operations/simple_transaction.md to create a transaction to send some ADAs from the genesis funding transaction to some other user's address but it fails as I am missing the right key
- The key to use is example/shelley/utxo-keys/utxo1.skey which results in successfully submitting transactions. I can see the transaction is successfully submitted but it does not appear when querying the utxo set: Possibly because I set the validity interval too far in the future and I need to wait? The node is stale and does not make progress anymore => Restarting from scratch
I was able to submit a transaction!
$ cardano-cli query utxo --whole-utxo --testnet-magic 42
TxHash TxIx Amount
--------------------------------------------------------------------------------------
06a82e6521f8d88a9ffe082f66f9f2bb114c9145d3f13cbfb36a3facba8d4de9 0 5010000000 lovelace + TxOutDatumNone
6cafe0b8352fa6bf5c7433bb668bf675c220d27adb42e1da28ab25741290176e 0 899999999599 lovelace + TxOutDatumNone
c989e8557c10a2fee3de5d37bc3858e4a3f2629d07d897f39d8d9ddf631e0c0f 0 5010000000 lovelace + TxOutDatumNone
Now going to submit some plutus transactions and check how it goes... There are a bunch of examples in scripts/plutus that seem interesting
- I was able to run successfully:
$ scripts/plutus/example-txin-locking-plutus-script.sh guessinggame
TxHash TxIx Amount
--------------------------------------------------------------------------------------
9fd2c741e9b582328269dcd1ee5282625be36215126ae2ce0edc24f48de82057 1 10000000 lovelace + TxOutDatumNone
It does not work twice though, needed to do a minor change to retrieve the first Tx for given address -> updating script to select first transaction found
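Selecting the first entry found for an address can be sketched as a small filter over the table that cardano-cli query utxo prints, as shown in the outputs above. This is an illustration and not the actual change made to the cardano-node script; first_utxo is a hypothetical helper name:

```shell
#!/bin/sh
# first_utxo: read `cardano-cli query utxo` output on stdin and print the
# first entry as TxHash#TxIx. The first two lines are the column header
# and the dashed separator, so data starts at line 3.
first_utxo() {
  awk 'NR > 2 { print $1 "#" $2; exit }'
}
```

It would be used as cardano-cli query utxo --testnet-magic 42 --address $(cat alice/payment.addr) | first_utxo, and prints nothing when the address holds no output.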
Next step: Generating needed files for our own scripts and datums -> Reviving old executable from MB that outputs a script in serialised form, useful for manually testing SC on a network
- Some interesting and useful documentation available here on genesis configuration for Shelley
- Plutus provides ways to export data for consumption by cardano-cli: https://plutus.readthedocs.io/en/latest/plutus/howtos/exporting-a-script.html -> One needs to serialise with TextEnvelope apparently
Looking at what the plutus script in cardano-node does:
plutusscriptaddr=$($CARDANO_CLI address build --payment-script-file "$plutusscriptinuse" --testnet-magic "$TESTNET_MAGIC")
it constructs an address from the script file's content which indeed is an "enveloped" serialised script
Wrote an inspect-script executable that outputs scripts, datums and redeemers for init and abort transaction given a currency and a token. These are written using cardano-node's custom TextEnvelope format which is "semi-readable", now going to try to submit head then abort transaction.
Restarting network from scratch, creating 3 utxos for Alice to use in the head txs
Estimating fees:
$ cardano-cli transaction calculate-min-fee --tx-body-file tx.draft --tx-in-count 1 --tx-out-count 4 --witness-count 1 --byron-witness-count 0 --testnet-magic 42 --genesis example/shelley/genesis.json
I want to send the change back to the genesis utxo, so I need its address: how do I get that?
$ cardano-cli -- shelley address build \
--payment-verification-key-file example/shelley/utxo-keys/utxo1.vkey \
--testnet-magic 42
addr_test1vqcvgup2qg3uf525ln7xyj5ymenupyzq6shrwcq08nanm2s2708jd
then
$ cardano-cli transaction build-raw --tx-in 837b43e0ce1da9aabe9794a4c5f8e3da5fde73e5f24927a97862c776357790b3#0 --tx-out $(cat alice/payment.addr)+10000000 --tx-out $(cat alice/payment.addr)+10000000 --tx-out $(cat alice/payment.addr)+10000000 --tx-out addr_test1vqcvgup2qg3uf525ln7xyj5ymenupyzq6shrwcq08nanm2s2708jd+$((900000000000 - 10000000 - 10000000 - 10000000 - 601)) --invalid-hereafter 10000 --fee 601 --out-file tx.draf
signing and submission:
$ cardano-cli transaction sign --tx-body-file tx.draft --signing-key-file example/shelley/utxo-keys/utxo1.skey --testnet-magic 42 --out-file tx.signed
$ cardano-cli transaction submit --tx-file tx.signed --testnet-magic 42
Transaction successfully submitted.
I now have 3 UTXOs to spend in the scripts.
$ cardano-cli query utxo --whole-utxo --testnet-magic 42
TxHash TxIx Amount
--------------------------------------------------------------------------------------
8930182280603aab400a1856daf20c63a6376cae31f2be584f2493f13fba3b22 0 5010000000 lovelace + TxOutDatumNone
b4f31ac83344988b1cd7bcf8bb150b9e3b4aca519b7f6bc89bc09d545b343f6f 0 10000000 lovelace + TxOutDatumNone
b4f31ac83344988b1cd7bcf8bb150b9e3b4aca519b7f6bc89bc09d545b343f6f 1 10000000 lovelace + TxOutDatumNone
b4f31ac83344988b1cd7bcf8bb150b9e3b4aca519b7f6bc89bc09d545b343f6f 2 10000000 lovelace + TxOutDatumNone
b4f31ac83344988b1cd7bcf8bb150b9e3b4aca519b7f6bc89bc09d545b343f6f 3 899969999399 lovelace + TxOutDatumNone
f517a7081008aa3e658a7f88ad0458bda733d22307a6401a1881cc12ff199890 0 5010000000 lovelace + TxOutDatumNone
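The change output in the build-raw command above follows from value conservation: what is consumed must equal what is produced plus the fee. A small sketch of that arithmetic, using the amounts from this entry; change_for is a hypothetical helper and the shell is assumed to have 64-bit arithmetic:

```shell
#!/bin/sh
# change_for INPUT FEE OUTPUT...: lovelace left for the change output once
# every payment output and the fee are subtracted from the consumed input.
change_for() {
  remaining=$1
  fee=$2
  shift 2
  for output in "$@"; do
    remaining=$((remaining - output))
  done
  echo $((remaining - fee))
}

# Genesis UTXO of 900000000000 lovelace, fee 601, three outputs of 10 ADA.
change_for 900000000000 601 10000000 10000000 10000000
```

The result matches the 899969999399 lovelace entry at index 3 in the UTXO listing above. Getting this sum wrong is what produces a ValueNotConservedUTxO rejection.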
Trying to generate script's address fails with
$ cat alice/headScript.plutus
...
"description":"headScript","type":"PlutusV1Script"}curry@haskell-dev-vm-1:~/cardano-node$ cardano-cli address build --payment-script-file alice/headScript.plutus --testnet-magic 42
Command failed: address build Error: alice/headScript.plutus: Error decoding script: TextEnvelope type error: Expected one of: SimpleScriptV1, SimpleScriptV2, PlutusScriptV1 Actual: PlutusV1Script
The descriptor type has been changed in recent cardano-node versions, so I need to update cardano-node dependencies to have the proper tag. ATM, trying to simply change the type in the plutus files directly...
$ cardano-cli address build --payment-script-file alice/headScript.plutus --testnet-magic 42
addr_test1wq2rv89vr2mtkfmcqqpzwz0f88sv86h05cw8mz74vcyd9gclj6lqt
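The manual workaround of changing the type tag in the envelope file can be scripted. A minimal sketch, assuming the file is the single-line JSON shown above and that only the tag differs between versions; retag_envelope is a hypothetical helper name:

```shell
#!/bin/sh
# retag_envelope FILE: rewrite the "type" field of a text envelope from the
# tag our code generates (PlutusV1Script) to the tag this cardano-cli
# expects (PlutusScriptV1), leaving the rest of the file untouched.
retag_envelope() {
  sed 's/"type":[[:space:]]*"PlutusV1Script"/"type": "PlutusScriptV1"/' "$1" > "$1.tmp" &&
    mv "$1.tmp" "$1"
}
```

For example, retag_envelope alice/headScript.plutus before calling cardano-cli address build. This is only a stopgap until the dependencies are updated and the generated tag is correct.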
🤦 Actually I need the datum hash to build the tx, not the datum of course.
Creating draft tx for Head init tx without outputting any PTs
$ cardano-cli transaction build --alonzo-era --cardano-mode --testnet-magic 42 --change-address $(cat alice/payment.addr) --tx-in b4f31ac83344988b1cd7bcf8bb150b9e3b4aca519b7f6bc89bc09d545b343f6f#0 --tx-out $(cardano-cli address build --payment-script-file alice/headScript.plutus --testnet-magic 42)+1000 --tx-out-datum-hash a6196b078239886432cc8bb0f981cb9f7df54bcf2fb8951b01c6639104a10640 --out-file head.draft
I was finally able to submit the transaction successfully:
$ cardano-cli query utxo --whole-utxo --testnet-magic 42
TxHash TxIx Amount
--------------------------------------------------------------------------------------
6147dae7ecb37fc1ea0c34e32419c1cc5916244dfb94f1239622d65d0be0d23d 0 9998733 lovelace + TxOutDatumNone
6147dae7ecb37fc1ea0c34e32419c1cc5916244dfb94f1239622d65d0be0d23d 1 1000 lovelace + TxOutDatumHash ScriptDataInAlonzoEra "a6196b078239886432cc8bb0f981cb9f7df54bcf2fb8951b01c6639104a10640"
...
Now checking I can actually consume the transaction! Unfortunately, serialisation formats definitely have changed for datums too:
$ cardano-cli transaction build --alonzo-era --cardano-mode --testnet-magic 42 --change-address $(cat alice/payment.addr) --tx-in 6147dae7ecb37fc1ea0c34e32419c1cc5916244dfb94f1239622d65d0be0d23d#1 --tx-in-collateral 6147dae7ecb37fc1ea0c34e32419c1cc5916244dfb94f1239622d65d0be0d23d#0 --tx-out $(cardano-cli address build --payment-script-file alice/headScript.plutus --testnet-magic 42)+1000 --tx-out-datum-hash 08090cf3024c750773519501c52bec72749c28d8732dcafc3690c2f77793f84e --tx-in-script-file alice/headScript.plutus --tx-in-datum-file alice/headDatum.plutus --tx-in-redeemer-file alice/headRedeemer.plutus --out-file abort.draft
Command failed: transaction build Error: Error reading metadata at: "alice/headDatum.plutus"
JSON schema error within the script data: {"cborHex":"d8799fd8799f1b000000e8d4a51000ff80ff","description":"headDatum","type":"ScriptDatum"}
JSON object does not match the schema.
Expected a single field named "int", "bytes", "string", "list" or "map".
Unexpected object field(s): {"cborHex":"d8799fd8799f1b000000e8d4a51000ff80ff","description":"headDatum","type":"ScriptDatum"}
- When upgrading dependencies I have run into more nix/cabal/hackage issues and was unable to upgrade cabal.project alone.
- Now trying to build the project with updated plutus dependencies not using nix: downloaded and installed ghcup with ghc version 8.10.7
Goal: Upgrade dependencies to more recent Plutus, Ledger and Cardano-node
Try upgrading hydra-poc dependencies following https://github.com/CardanoSolutions/ogmios/blob/5048fb6cd9eb245b4062191220ad96e945d66258/server/cabal.project
- Hitting an issue with plutus-contract package not building, checking what the dependencies are in Plutus at this commit
- Dependencies in ogmios are actually too old for plutus. The revision pointed at dates back from 2 months ago:
commit edc6d4672c41de4485444122ff843bc86ff421a0
Merge: 569f98402 63c6ca8ac
Author: Michael Peyton Jones <michael.peyton-jones@iohk.io>
Date: Fri Aug 20 10:43:53 2021 +0100
Merge pull request #3430 from input-output-hk/hkm/windows-cross
windows cross compile
Starting from plutus' master which might be a better choice for us
-
Interestingly, plutus depends on a fork of cardano-ledger-specs:
source-repository-package
  type: git
  location: https://github.com/raduom/cardano-ledger-specs
  tag: ef6bb99782d61316da55470620c7da994cc352b2
The pointed at commit (https://github.com/raduom/cardano-ledger-specs/commit/ef6bb99782d61316da55470620c7da994cc352b2) says:
Make the code compile with a newer plutus version raduom/plutus-exbudget-error
-
Now trying to update cardano-ledger-specs following changes in the directory structure. Looks like updating those dependencies will be a nice 🐰 🕳️
-
Build fails because of missing liblzma dependency, added `pkgs.lzma` to the `shell.nix` file and now it's recompiling nix!
-
Reverting all my changes as it has become a mess. Starting over from the nix shell dependencies as it seems to be the root to be updated. I am in the situation I definitely would like to avoid: I need to update dependencies and for this I need to modify nix stuff, which means I need to understand what I am doing and what to update where. But I don't really know what I am doing and SN is away and has been the one updating dependencies and maintaining the nix infrastructure in the past -> Bus factor = 1
-
When he last updated dependencies, SN used `nix-shell -A cabalOnly` to not use haskell.nix, which seems to have helped him; will try this.
Adding lzma dependency to the `shell.nix`, also upgrading GHC to 8.10.7, haskell.nix archive to a more recent one and nixpkgs reference to 21.05
- Needed to materialise nix plan and it's now compiling all base dependencies
- Struggling with nix to get the update to 8.10.7 to pass, now I need to add some more libsodium configuration for some packages from https://github.com/input-output-hk/plutus/blob/master/nix/pkgs/haskell/haskell.nix#L256 => Just duplicated the `libsodium-vrf` declaration from `shell.nix` to `default.nix` and it now seems to work
Seems like upgrading Plutus won't be easy: MPJ failed to upgrade dependencies to cardano-ledger-specs in their repository, he had to create a separate branch to add some pending changes. It's probably safer to stay as we are now even with the issues we are having.
List of architectural katas: http://nealford.com/katas/list.html
Goal: Fix Direct on-chain component abort transaction validation failure
- Got failing test for init -> abort transaction logic, going to add traces to understand what's failing
- Checking the script and datum hashes to make sure the transaction provides all of them
- Modifying cardano-ledger-specs to add more verbose output when script validation fails. Trying to enter nix-shell in cardano-ledger-specs to be able to test my changes, took about 10 minutes to enter nix shell, now doing a `cabal build all` in the ledger specs directory to compile stuff
  - Depending on external dependencies like the ledger increases turnaround time to insane levels: Now need to check it works in the original repo before changing the reference in the hydra-poc repo, because otherwise every commit will require a full recompilation which is ridiculously expensive
  - cardano-api depends on the `ValidationFailed` error's structure so I need to also adapt the code there because I changed the error in alonzo => Rather than modifying the cardano-ledger-specs I am going to work at a lower level, namely testing the script directly with Plutus, as a unit test
- Just added `Verbose` logs with `Debug.Trace.trace` in ledger spec then update the dependency in hydra-poc
- Abort Tx test fails with
["mustRunContract: script not found.","Pd"]
which is exactly what MB was seeing the last time, which seems to imply the script cannot be found, either because the hash is invalid or some other reason. => looking at the source of the error
Trying to write a test using https://github.com/input-output-hk/plutus/blob/master/plutus-ledger-api/src/Plutus/V1/Ledger/Api.hs#L262
- This is hard because I need to build the `ScriptContext`, which requires a full transaction that is difficult to build by hand. Could build the transaction in the ledger and then use the functions to translate it to Plutus, but thought I might as well check first what the logs are saying
Trying to uncomment the `mustRunContract` function, which is the one resolving the contract references we need to validate that the output is correctly spent: This function fails to resolve the script; when replaced with a `const True` function the test does not pass but the scripts' execution succeeds
- display the `Dependencies` content and compare with what's in the transaction. Hashed dependencies show:
[581cf0bce8043dc5f9c32ebad31652e239a8f15d1bf01f4d8d1b9740f73f,581c2e95c0a89c450a245d3324d16260797b54f2010e2ea494e5214323c9]
but the scripts' hashes in the transaction are:
5d8dd23697de989275a58ef20edeacb320994f590cf0e10a0163cf3a f0bce8043dc5f9c32ebad31652e239a8f15d1bf01f4d8d1b9740f73f
Trying to simplify dependencies hash computation to use the `validatorHash` provided in the SC code. AFAICT the `validatorHash` ultimately uses the same hash function, the one from `Cardano.Ledger.Era.hashScript`
- Interestingly replacing the hash computation yielded the same hashes
- So I still have the same error... Now investigating what the hashes look like on both sides and trying to find how the `Credential` in the `TxInInfo` we are filtering is constructed on the ledger side
transCred :: Credential keyrole crypto -> P.Credential
transCred (KeyHashObj (KeyHash (UnsafeHash kh))) =
  P.PubKeyCredential (P.PubKeyHash (P.toBuiltin (fromShort kh)))
transCred (ScriptHashObj (ScriptHash (UnsafeHash kh))) =
  P.ScriptCredential (P.ValidatorHash (P.toBuiltin (fromShort kh)))
🍾 I managed to have the abortTx validate its scripts. The issue was indeed in the way we construct the hashes. It was unclear to me why we were seeing different hashes between `Plutus.validatorHash` and `Ledger.hashScript` but I finally found the reason: We are using an "old" version of Plutus.
Hash Computation
Here is the code that computes a `ValidatorHash` given a script:
validatorHash = ValidatorHash . scriptHash . getValidator
scriptHash :: Script -> Builtins.BuiltinByteString
scriptHash =
  toBuiltin
    . Cardano.Api.serialiseToRawBytes
    . Cardano.Api.hashScript
    . toCardanoApiScript
toCardanoApiScript :: Script -> Script.Script Script.PlutusScriptV1
toCardanoApiScript =
  Script.PlutusScript Script.PlutusScriptV1
    . Cardano.Api.PlutusScriptSerialised
    . SBS.toShort
    . BSL.toStrict
    . serialise
Then the code for `Cardano.Api.Script.hashScript`:
hashScript :: Script lang -> ScriptHash
hashScript (SimpleScript SimpleScriptV1 s) =
...
hashScript (PlutusScript PlutusScriptV1 (PlutusScriptSerialised script)) =
-- For Plutus V1, we convert to the Alonzo-era version specifically and
-- hash that. Later ledger eras have to be compatible anyway.
ScriptHash
. Ledger.hashScript @(ShelleyLedgerEra AlonzoEra)
$ Alonzo.PlutusScript script
Where `Cardano.Ledger.Era.hashScript` is a method of the `ValidateScript` typeclass with Era-dependent implementations.
The generic implementation says:
-- UNLESS YOU UNDERSTAND THE SafeToHash class, AND THE ROLE OF THE scriptPrefixTag
hashScript =
ScriptHash . Hash.castHash
. Hash.hashWith
(\x -> scriptPrefixTag @era x <> originalBytes x)
but the implementation for Alonzo says:
instance (CC.Crypto c) => Shelley.ValidateScript (AlonzoEra c) where
scriptPrefixTag script =
if isPlutusScript script
then "\x01"
else nativeMultiSigTag -- "\x00"
So it seems it hashes not only the script's serialised content but also a prefix tag of `0x01`!
In the https://github.com/input-output-hk/cardano-node/blob/master/cardano-api/src/Cardano/Api/Eras.hs#L336 file we have:
ShelleyLedgerEra AlonzoEra = Ledger.StandardAlonzo
with the latter being defined in https://github.com/input-output-hk/ouroboros-network/blob/master/ouroboros-consensus-shelley/src/Ouroboros/Consensus/Shelley/Eras.hs#L89 as
type StandardAlonzo = AlonzoEra StandardCrypto
So all in all the computed hash values should be equal!
It turns out it all makes sense: The hashes are actually consistent but in a more recent version of Plutus code than the one we are using! The version of plutus we use is at commit 36dcbb9140af0c9b5b741b6f7704497d901c9c65 which contains this code for hashing scripts:
scriptHash :: Serialise a => a -> Builtins.BuiltinByteString
scriptHash =
toBuiltin
. Crypto.hashToBytes
. Crypto.hashWith @Crypto.Blake2b_224 id
. Crypto.hashToBytes
. Crypto.hashWith @Crypto.Blake2b_224 id
. BSL.toStrict
. serialise
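To see why the two computations cannot agree, here is a minimal sketch contrasting them. The hash function is passed in as a parameter standing in for Blake2b-224; the names `ledgerStyleHash`, `oldPlutusStyleHash` and `toyHash` are hypothetical, and `toyHash` is a non-cryptographic placeholder that only exists so the example runs without extra packages.

```haskell
import Data.ByteString (ByteString)
import qualified Data.ByteString as BS

-- | Stand-in for a real hash function such as Blake2b-224 (supplied by the caller).
type HashFn = ByteString -> ByteString

-- | Ledger-style: hash the Plutus prefix tag 0x01 followed by the script bytes.
ledgerStyleHash :: HashFn -> ByteString -> ByteString
ledgerStyleHash h script = h (BS.cons 0x01 script)

-- | Old Plutus-style (as in the snippet above): hash twice, without any prefix.
oldPlutusStyleHash :: HashFn -> ByteString -> ByteString
oldPlutusStyleHash h = h . h

-- | Toy, NON-cryptographic hash (assumption for illustration only).
toyHash :: HashFn
toyHash bs =
  BS.pack [BS.foldl' (\acc w -> acc * 31 + w) 7 bs, fromIntegral (BS.length bs)]
```

Whatever hash function is plugged in, the two functions feed it different bytes, so equal results would be a coincidence.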
-
Discussing so-called “star-shaped head network” protocol draft:
- There's one server which is part of a Head, or even is running a Head alone
- There are many clients connected to the server
- Client <-> Server are connected through 2-parties isomorphic channels, eg. "mini-Heads" that should be simpler than full multiparty head but with the same properties: Isomorphic, safe, requiring being online to ensure progress.
- Transactions can flow from one client to the others through the pairwise channels mediated by the Head which acts effectively as a bridge
- This channel construction is similar to Perun/lightning channels and can easily be leveraged to give Virtual channels network,
- On-line requirement is needed to ensure safety without collateral from the server like in T2P2: When offline, channel with the server is stale so client needs at least to be periodically online. This could be good fit for light/mobile clients provided there's a way to have a safe access to the chain's state (Watchtowers?)
- This implies some form of multi-protocol support inside a single node is needed (relaying, different protocols between different parties)
-
Discussing a potential collaboration with Perun researchers/engineers for an alternative way to inter-connect Hydra Heads (using virtual perun channels?)
-
Some more discussion about NFTs on Hydra Heads:
- analogy with "Scotty, beam me up" -> do you transport the matter or destruct/reconstruct it somewhere else?
- NFTs in Head -> MintingPolicy allowing to remint the NFT on-chain in the fanout
Published Milestone report urbi et orbi. Highlights are:
- good feedback from Summit and outcome of team Workshop in Berlin leading to refined understanding of short term goals and use case
- while we managed to have a working demo in time for the summit, we did not "close the loop" and were not able to run Hydra Head cluster over an actual Cardano Alonzo testnet,
- we are on track to provide a roadmap and implementation plan for S1 2022 by end of October 2021.
I want to clean up the PR backlog before tackling the Direct transaction submission problem, going to fix log-filter tests and process wrapper to ensure we can merge that today.
- Monitoring test is flaky on CI, although I changed the way we allocate ports => Fuse the 2 unit tests in one because it does not make much sense to have 2 separate tests for the same "behavior"
Thinking it could be a "fun" side-project to implement Golomb-Rice set in Haskell: https://github.com/btcsuite/btcutil/blob/master/gcs/gcs.go
Added a section in demo/README.md to start the demo without docker as requested by someone on the discord channel.
Also all PRs are merged and the only one left is the "direct chain interaction" https://github.com/input-output-hk/hydra-poc/pull/90 to have an Init -> Abort sequence working properly on-chain (or at least with transactions validated by the ledger)
Working on Wallet PR KtorZ/ADP-919/sign-transaction to add more integration tests
https://cbor.me enables decoding base64-encoded CBOR data
- Got stuck with testing transaction signing with withdrawal(s), got an error with CBOR decoding of TX on the server side. Wrote a unit test at the deserialisation level for `sealedTxFromBytes`, which led us to realise the roundtrip test was not covering much of the structure of the sealedTx.
- Possible investigations: Cover more fields in the roundtrip, also try to get better error reports at CBOR level
- The problem was that we passed the serialised `sealedTx` into the quasiquoter constructing the payload to the sign transaction endpoint, instead of the value itself. Seems like there's an `instance ToJSON ByteString` in scope.
- A `SealedTx` is just a wrapper around the raw bytes of a cardano transaction, not clear what the other fields are used for. It can be produced by parsing some bytes as it is just a cardano transaction in CBOR encoding. The question is: How is the bytestring encoded in user requests?
  - In JSON, it is assumed to be a string containing the base64 encoding of the transaction, but the cardano-cli outputs base16 encoded raw transactions for signing.
  - So we should accept both encodings for an `ApiT SealedTx` in order to minimize friction for end users.
Not having haskell-language-server working is painful, tried to install HLS from nix and source but does not work.
Trying to use HLS in the cardano-wallet does not work out of the box, seems like one needs to do more work: using `scripts/gen-hie.sh`. Wallet has more than 100 modules in the core packages which leads to long compilation time esp. without immediate feedback from HLS, this is painful.
Not sure me helping the wallet for a couple of weeks is a very useful and productive use of my and Matthias' time: I won't be able to learn much of the codebase in the given time frame so won't be autonomous and will need to ask a lot of questions and get a lot of guidance, for a net ROI which is probably negative as I won't stay working on the wallet. I could pair and possibly contribute useful observations and a second pair of eyes but this would require pairing most of the time with different people so unsure if that's the goal.
- Adding a shell.nix to get build tools like clang in scope
- Libsodium: the `musig2_compat` branch is quite different to the one we use in the `hydra-node` -> make sure to rebase the necessary changes later
- Got the `musig2test` working -> `nix-shell --run "make && ./musig2test"` on https://github.com/ch1bo/musig2/tree/ch1bo/build-via-nix
- Plan: dump keys and signed message from `./musig2test`, load them in haskell and run them through Plutus' verifySignature -> no FFI required for now
- Writing keys and message was straight-forward
  - quite nice to do C for a change!
- What is the format of the signed message / envelope?
  - The example uses libsodium's combined mode
  - Separated signatures are more likely what we would be requiring when wrapping this into a library
  - Split it by hand after the fact for now
  - Signature length should be `64` bytes
- Realization: Protocol validators need to ensure that the tx spending the contract output and creating the next datum makes it possible for "others" to know the datum used for the datum hash. For example: The close tx validator needs to ensure that (at least) the snapshot number is included such that other participants can re-construct the datum using the number + stored snapshots in order to spend the output / contest.
- Making the datum simple helps in keeping the script output "spendable". In the end, the datum is a secret and if we want it easy to be able to contest, just having the snapshot number as datum is maybe enough?
- Walking through the on-chain validators, datums and redeemers again
- Fanout is quite complex, it might get expensive with many outputs
- Good thing: it is deterministic and costs would be known up-front and can be avoided before acknowledging them in the Head
- Optimizations to keep the utxos in the head small could be worthwhile
- Utxo set size and costs created by that need to be tractable and not hidden from users of hydra-node
- Applications / operators should be able to take action and decide on such things
- There are hard limits though based on main-chain protocol parameters
Looking at a document about Plutus extensions provided by MPJ, which refers to Hydra in a few places. Proposals are:
- Add reference inputs to transactions, eg. inputs which are not consumed by the transaction but whose datum/datum hash are available to scripts
- Use inline datum instead of only datum hashes.
- Provide script references which is a combination of the above 2 proposals, to remove the need to provide a script as witness in the consuming transaction every time it is used.
Found it difficult at first to understand what I had to do, and how to properly extend existing code to do what I want, eg. retrieve data from the `Init` transaction so that it can be consumed by the `Abort` transaction. It's actually hard to shape one's thoughts along the lines of another person's thoughts, esp. when in "experimentation" mode and we take a lot of shortcuts, or drift away from the actual goal because it's too complicated to do in one step. In this situation, probably everyone would take a slightly different step and a slightly different direction.
This sheds some light on the importance of pairing/mobbing to share the context of the code we write when it's not already obvious. An alternative is to be very explicit about the goals, the intermediate steps we are taking, the assumptions we make about the environment, and the shortcuts we take. They are here but disseminated across different functions and files, which makes building a big picture hard.
Trying to add logs to the direct chain component, seems however we don't have JSON instances for `ValidatedTx`? -> using only `Show` for now but should be fine
How do I make a `TxIn` from a `TxOut`? => hash the TxBody and add the index
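A minimal sketch of that idea, using simplified stand-in types rather than the real ledger ones. `TxId`, `TxIn`, `Tx` and `txInsFor` are hypothetical names for illustration, and the body hash function is passed as a parameter.

```haskell
-- Hypothetical, simplified stand-ins for the ledger types.
newtype TxId = TxId String deriving (Eq, Show)

data TxIn = TxIn TxId Word deriving (Eq, Show)

data Tx = Tx {txBody :: String, txOutputs :: [String]}

-- | All inputs that would spend the outputs of the given transaction:
-- the id is the hash of the body, the index is the output's position.
txInsFor :: (String -> String) -> Tx -> [TxIn]
txInsFor hashBody tx =
  [TxIn txId ix | (ix, _out) <- zip [0 ..] (txOutputs tx)]
 where
  txId = TxId (hashBody (txBody tx))
```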
Got to the point where I have only one failing test, namely the one about init -> abort dance which now makes sense.
- I need to properly observe the abort transaction from the chain and make sure it has the necessary inputs from the current head state.
Actually there's currently no way to link init to abort because:
- Init produces a single output which is the address of the main (SM) script with the parameters as datum
- Abort consumes an output for the validator Script with the pubkeyhash of the recipient
- Abort should also consume the SM output and pass the parameters as datum which is what we need to verify first
Going back to basics, here are next steps to be able to do the init -> abort sequence correctly:
- Add the output for the SM to the Init tx
- Make sure this output updates the on-chain state
- Have the Abort transaction consume this output
- Add the thread token inferred from the seed txin
- Add parties' verification keys to the head parameters
- Mint the PTs (using Thread token) and create one output per party with the PTs and their verification keys
- Consume those outputs in the abort tx
In parallel we need to write and check the validator scripts themselves as this is not really done in our tests because the mock ledger does not verify anything, of course.
Got the state change test green but now the abort tx unit test validation fails with:
Evaluation results: fromList [(RdmrPtr Spend 0,Left (ValidationFailed (CekError An error has occurred: User error:
The provided Plutus code called 'error'.)))]
- Recompiling ledger-specs setting the flag for evaluation to `Verbose` in order to get better logs
- It seems there's another problem in the Head validator's state machine as we don't pass any `ThreadToken`, or rather we pass one but do not use it to instantiate the SM hence it's more than probable the evaluator fails to find the SM -> MB adapted the interface in another PR
- Trying to switch the validator to a simpler one, and check I can build the aborttx, possibly also checking I can sequence the transactions and observe the aborttx. With a simple (parameterized) validator, script evaluation succeeds just fine even though the test fails because the count of results is incorrect, but next execution fails
- With a single script reference to the MockHead validator it passes, so I must be doing something wrong with the `RdmrPtr` logic.
There is an issue related to how it resolves its redeemers.
- I was directly using `RdmrPtr`, passing an incremented counter, but of course this does not make sense because `inputs` is a `Set`.
- One needs to either sort the inputs by `(TxId, TxIx)` order or use the `rdptr` function from the ledger API that does the right thing to associate the redeemer with the right input.
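The ordering issue can be sketched with `Data.Set` from `containers`. The types below are simplified stand-ins and `spendIndex` is a hypothetical helper, not the ledger's `rdptr`. The point is only that the index of a spending redeemer has to follow the set's ascending order, not the order in which inputs were added.

```haskell
import Data.Set (Set)
import qualified Data.Set as Set

-- Hypothetical, simplified stand-ins for the ledger types.
newtype TxId = TxId String deriving (Eq, Ord, Show)

-- | Derived 'Ord' compares the transaction id first, then the output index.
data TxIn = TxIn TxId Word deriving (Eq, Ord, Show)

-- | Position of an input in the ascending order of the input set, which is
-- the index a spending redeemer pointer would have to carry in this sketch.
spendIndex :: TxIn -> Set TxIn -> Maybe Int
spendIndex = Set.lookupIndex
```

For example, inserting `TxIn (TxId "bb") 0` first and `TxIn (TxId "aa") 1` second still yields index 1 for the former, so a counter incremented in insertion order would point at the wrong input.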
Topic: What to do about rollbacks?
Actually, these are not really rollbacks, it's just that a longer chain was found. It's important because this means we are not going back in time, eg. the new chain will be at the same "moment" in time as the old one.
- `CollectCom` is the most sensitive transaction, if rolled back it's as if nothing happened in the head which might be quite annoying if people are expecting head transactions to be "final" or "settled" and do side-effects depending on it.
- `Close` and `Contest` could also be problematic as they are time-bound (by the contestation period) and it could be the case some contests "disappear" for want of time to post them in case of a rollback
- Other mainchain transactions are less problematic, they can simply be resubmitted
Hydra users need to be aware of the settlement time on the mainchain, for example there is a 600s limit on Kraken for payments on Cardano to be considered final. To be safe against adversarial nodes, one needs to wait for some number of blocks (there is a document available providing some simple tables relating expected probability of "failure" to adversarial stake and number of blocks to wait, eg. for 5% adv stake, 0.01% failure, one has to wait 73 blocks or ~20 minutes)
- The contestation period needs to be set to some large enough value, eg. larger than expected time to get a rollback
- Validators do not get absolute time slots, they only get the validity range of the transaction which, in the case of `Close/Contest` transactions, includes the contestation period, and because scripts run after stage 1 validation, they can assume the range is valid. From there, the validator can check if the range falls within accepted bounds
- Also, `T_max` should not be so large as to prevent the head from making progress, but this can be verified by the validator too in the range
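A minimal sketch of such a range check, written as plain Haskell rather than on-chain Plutus code. `Slot`, `ValidityRange` and `rangeWithinBounds` are hypothetical names, and the assumption is that the validator is given an upper limit `tMax` on the width of the range.

```haskell
-- Hypothetical, simplified stand-ins for slots and validity ranges.
type Slot = Integer

data ValidityRange = ValidityRange {validFrom :: Slot, validUntil :: Slot}

-- | Accept a transaction only if its validity range is well-formed and
-- not wider than the maximum 'tMax' the validator is willing to tolerate.
rangeWithinBounds :: Slot -> ValidityRange -> Bool
rangeWithinBounds tMax (ValidityRange lo hi) =
  lo <= hi && hi - lo <= tMax
```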
Consequences for Hydra Head:
- `HeadLogic` needs to be aware that its state could be "rolled back", eg. an onchain transaction can reset the state to something else, even while the head is opened => This could be property tested, we had something similar at one point
- The settlement time should be a parameter of the node set by users, depending on how long/what risk they are willing to take w.r.t. rollbacks
- The contestation period should be set large enough, possibly in relationship to this settlement time?
- The OnChain component could be the one doing the wait, retaining `OnChainTx` until enough blocks have passed before notifying node
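The last point could look roughly like the following sketch, which keeps observed transactions in a buffer until they are buried under enough blocks. `Pending` and `settle` are hypothetical names, and block numbers are plain `Integer`s here.

```haskell
import Data.List (partition)

-- | An observed transaction together with the block number it was seen in.
data Pending tx = Pending {seenAt :: Integer, pendingTx :: tx}

-- | Given the required depth and the current tip, split the buffer into
-- transactions that can be reported to the node and those still waiting.
settle :: Integer -> Integer -> [Pending tx] -> ([tx], [Pending tx])
settle depth tip buffer = (map pendingTx settled, waiting)
 where
  (settled, waiting) = partition (\p -> tip - seenAt p >= depth) buffer
```

A rollback would additionally have to drop buffered entries seen after the rollback point, which this sketch does not handle.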
- Plutus validators are also `Blake2b_224`, but why did the `fromJust` not work before? case solved it, hash conversion works now
- Get a MissingScript error now
Falsified (after 1 test):
TxInCompact (TxId {_unTxId = SafeHash "03170a2e7597b7b7e3d84c05391d139a62b157e78786d8c082f29dcf4c111314"}) 180
Utxo: UTxO (fromList [(TxInCompact (TxId {_unTxId = SafeHash "03170a2e7597b7b7e3d84c05391d139a62b157e78786d8c082f29dcf4c111314"}) 180,(Addr Testnet (ScriptHashObj (ScriptHash "1302a9a442fa86e8e836aa39d961ec3e71f500f21a633ae0cf2b60b1")) StakeRefNull,Value 0 (fromList []),SJust (SafeHash "faa51ea0059e04224cc13da34b53bba807fb2affd71ee401e85dfa3f769081fd")))])
Tx: ValidatedTx {body = TxBodyConstr TxBodyRaw {_inputs = fromList [TxInCompact (TxId {_unTxId = SafeHash "03170a2e7597b7b7e3d84c05391d139a62b157e78786d8c082f29dcf4c111314"}) 180], _collateral = fromList [], _outputs = StrictSeq {fromStrict = fromList []}, _certs = StrictSeq {fromStrict = fromList []}, _wdrls = Wdrl {unWdrl = fromList []}, _txfee = Coin 0, _vldt = ValidityInterval {invalidBefore = SNothing, invalidHereafter = SNothing}, _update = SNothing, _reqSignerHashes = fromList [], _mint = Value 0 (fromList []), _scriptIntegrityHash = SNothing, _adHash = SNothing, _txnetworkid = SNothing}, wits = TxWitnessRaw {_txwitsVKey = fromList [], _txwitsBoot = fromList [], _txscripts = fromList [(ScriptHash "a893ca7be59f00c935c382fd8f8e515adcc9850f1ec5dbafbe99face",PlutusScript ScriptHash "a893ca7be59f00c935c382fd8f8e515adcc9850f1ec5dbafbe99face")], _txdats = TxDatsRaw (fromList []), _txrdmrs = RedeemersRaw (fromList [(RdmrPtr Spend 0,(DataConstr Constr 0 [],ExUnits {exUnitsMem = 0, exUnitsSteps = 0}))])}, isValid = IsValid True, auxiliaryData = SNothing}
Evaluation results: fromList [(RdmrPtr Spend 0,Left (MissingScript (RdmrPtr Spend 0)))]
- Reason: address script hash of utxo is different than the provided scripts
- Apparently `hashScript` is different than `hashFromBytes` via `ValidatorHash`
  - suspect: the double hashing in plutus' `scriptHash` is suspicious and maybe the ledger does not expect that?
  - use the same technique in both places to see whether it's just the hashing or the serialized script -> will it run?
- SN explained the current situation of transaction creation to AB
- Started off with the `MissingDatum` error when running the `initial` validator against an `abortTx`
- Solved it by providing the `PubKeyHash` to `abortTx`, but this is really only a quick fix. The `abortTx` should actually spend all the outputs which contain PTs.
- By providing the `PubKeyHash`, `abortTx` validates now against `initial`!
- Bad news: We realize that we need some "onchain state" which is tracking utxo + which datums would be able to spend these
  - e.g. for the abortTx we would need to have something like `[(TxIn, PTSpending)]` with
    data PTSpending = FromInitial PubKeyHash | FromCommit (UTxO Era)
    to keep track from where and how utxo's would need to be spent by this transaction
  - doing things the "Direct" way is hard!
- Give another shot on PAB, tests fail because of a missing nextTransactionsAt
- awaitUtxoProduced seems a good replacement, providing us with outputs and txs
- along with that there is `txOutRefMapForAddr` for filtering certain TxOut
- can't seem to find/import `ChainIndexTx` though
- `typeScriptTxOut` won't decode the datum because that `ChainIndexTxOut` is not necessarily containing the tx and thus it can't know the Datum (but just the hash)
  - `txOutRefMapForAddr` gives directly TxOuts which do only contain datum hashes
  - re-combine into a ChainIndexTxOut with ChainIndexTx or use `utxosTxOutTxFromTx` as `Plutus.Contract.StateMachine`
- Changed to just merge all utxos seen in `watchInit` together and try to decode the right datum from any output
  - It turns out we do not need the address of the state machine validator anymore
  - Maybe this is inefficient
- The Head statemachine contract can't be observed for `Final` state as it does not produce a txOut
  - re-defining `isFinal Final = False` works around this
- Idea: running the plutus validator in haskell against constructed transactions
- Which serialization of plutus scripts?
  - From TypedValidator we would use tvValidator / mkValidatorScript to get a `Validator`
  - This is an instance of `Serialise` (from `serialise` package)
  - However `Validator` is only a thin wrapper around `Script`, which has a `ToCBOR` where we can use `cardano-binary`'s `serialize`? -> opted for this one
- Debating on how to run the validator now
  - `evalScripts` takes the TxInfo already as `Data`, so maybe use `collectTwoPhaseScriptInputs` - this is from the ledger
  - there are other (more low-level) ways from `plutus-ledger-api` package
  - `evaluateTransactionExecutionUnits` is used by `cardano-api`, maybe also fine for us / our tests?
  - the `constructValidated` function looks promising for a template to collect+eval scripts, although this function seems not to be used anywhere
- Call into plutus `evaluateScriptCounting` directly as we want to evaluate a specific script on Tx
- Realized that validating the init tx with the initial contract does not make sense!
  - rather the commit tx or abort tx would be governed and thus need to validate with the Initial script
  - those txs would also include said script in order to spend the input, so using the plutus functions is too low level and we should be able to use the collect+eval scripts from ledger after all
- Refactor to use `evaluateTransactionExecutionUnits` from ledger
- What is that `ScriptIntegrityHash` used for?
- Converting from a plutus validatorHash/Address to the ledger's Addr is weird
  - also could not find whether / where plutus is doing the same thing?
  - BTW they also have exactly a mock chain sync client and server but this seems to be using their own `Tx` type (see `Plutus.V1.Ledger.Tx`)
It's not only weird, but it fails because plutus does not give us blake2b_224 hashes (likely sha256 instead)
We discussed the new approach for on-chain interaction as an alternative to the PAB.
-
We were able to complete a full round-trip using the Ouroboros mini-protocols, though the transactions being submitted and deserialized are not representative of the actual chain interactions. It still demonstrates how the setup works and, that it's possible to fully test the approach in isolation with a mock server.
-
We agree that we want to keep "wallet concerns" inside the component, and not leak it through the abstraction to keep the solution as much as possible close to what the PAB gives us. Ideally, we could swap one implementation with the other once done.
-
The last point means that we have to provide the direct-chain component with credentials to be used for (a) signing Hydra on-chain transactions and (b) tracking users' funds to pay for those transactions. A simple pub/prv key pair would do, from which we can derive a change address and track the UTXO set easily. This means that the Hydra node would initially require users to move some funds into a specific address that they control, but gives "custody" to the Hydra-node for running a head. It's important that funds at the address are sufficient to cover a full head lifecycle (init, close and more importantly, contest) and we should warn users accordingly if not (or even, refuse to init a head).
-
The current implementation of the chain-sync client is wrong (and we know it) as it will synchronize blocks from the origin always. What we want however is to start at a much later point, for example, the current tip and onwards. This is easily achieved with the chain-sync protocol itself but, comes with a limitation: all participants have to be online and observing the chain before any init transaction is submitted to the network. While it is okay-ish for now, if we persist with that approach, we'll have to provide some synchronization mechanisms between peers.
-
We are also currently over-simplifying the problem by considering that participants are only members of a single head at a time. Thus, looking for on-chain transactions does not currently check whether a transaction is indeed about a given head, but only whether it involves "us" as a participant. Later, we'll want to also recognize which head instance is concerned by an on-chain transaction, which can be done by means of the state-machine thread token.
-
The question of rollbacks was raised again. In principle, in Praos, a node can roll back up to 3k/f slots (~18h), so transactions only truly reach immutability after 18h. Yet, they reach a high enough probability (99.999%) well before that. Still, depending on the adversarial stake in the system, this can vary from a few minutes to hours. We want to bring this question to the next engineering meeting with the consensus team. There are really two types of rollbacks:
-
'organic' rollbacks, which can occur because of Praos and how the consensus sometimes elects two or more leaders for the same slot. This type is quite benign, and transactions lost by such rollbacks can simply be re-submitted if needed. Although this is annoying for the contestation, it can likely be managed gracefully.
-
'adversarial' rollbacks, which are induced by an adversarial party trying to double-spend. For example, one head participant could commit to a head some funds he/she is also trying to double-spend at the other edge of the network. This would result in head participants thinking they're indeed inside a head, whereas in practice, nothing really happened.
-
-
Work stream on the "round-trip" of InitTx by posting and observing it again on chain, analogously to the ExternalPAB
-
Start with storing/recovering the HeadParameters as Datum
- before diving into minting the thread token, participation tokens etc.
-
Managed to create and add a script output, BUT
- while I could check that there is an output with some datum hash (even checking the hash)
- cardano-api seems not to include the Datum when creating an output (in contrast to the plutus framework)
- So while the test would need to also signShelleyTransaction to see script witnesses in the transaction, the ones for "creating outputs" are not present
- This would make it impossible to deserialize the HeadParameters from an initTx
- ... back to cardano-ledger-specs for more control?
-
Switching to cardano-ledger-specs was not too hard
- Could construct the TxBody with the single input
- Now the question is how to assert that the datum is present?
- Return Tx without signatures and update that later or introduce an intermediate type just for this? e.g.
    data TxDraft = TxDraft
      { body :: TxBody (AlonzoEra StandardCrypto)
      , -- | Datums used by scripts in the body.
        dats :: TxDats (AlonzoEra StandardCrypto)
      }
- Opted for simply returning an unsigned (and unbalanced for that matter) ValidatedTx
-
Interesting observation: converting between HeadParameters and Initial / onchain representations is explicit now in the tests / not via JSON
-
observeTx could serve as the counterpart of constructTx
- Using Alternative Maybe we could provide for a nice interface
- Is this efficient?
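The Alternative Maybe idea can be sketched as follows. This is only an illustration with toy types: Tx, OnChainTx and the observeInitTx / observeCloseTx helpers are hypothetical stand-ins, not the actual Hydra or ledger types.

```haskell
import Control.Applicative ((<|>))

-- Toy stand-ins for the real types; every name here is hypothetical.
data Tx = Tx { txKind :: String, txDatum :: Maybe Int }

data OnChainTx = OnInitTx Int | OnCloseTx Int
  deriving (Eq, Show)

-- Each observer returns Nothing when the transaction is not of its kind.
observeInitTx :: Tx -> Maybe OnChainTx
observeInitTx (Tx "init" (Just d)) = Just (OnInitTx d)
observeInitTx _ = Nothing

observeCloseTx :: Tx -> Maybe OnChainTx
observeCloseTx (Tx "close" (Just d)) = Just (OnCloseTx d)
observeCloseTx _ = Nothing

-- Try the observers left to right; the first Just wins.
observeTx :: Tx -> Maybe OnChainTx
observeTx tx = observeInitTx tx <|> observeCloseTx tx

main :: IO ()
main =
  mapM_
    (print . observeTx)
    [Tx "init" (Just 10), Tx "close" (Just 3), Tx "other" Nothing]
```

On the efficiency question: `<|>` on Maybe is lazy in its second argument, so later observers are not evaluated once one has matched, and the cost stays linear in the number of observers.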
Trying to enhance the log-filter program to be able to fork a hydra-node program and filter the logs it produces directly from its stdout, rather than filtering logs on disk.
Ran into some problems:
- children are apparently not properly reaped when the parent dies
- passing arguments to the log-filter program is problematic when invoking hydra-node through cabal run because I need to pass -- twice
Continue working on implementing a mock node in order to test the Direct chain component, which will build transactions from PostTx messages and send back OnChainTx messages from transactions observed in blocks.
Implementing a mock TxSubmission server using a TQueue to hold the transactions, shortcutting the block construction with 1 block = 1 tx
- Find intersect: the client sends some points which are supposed to be in the ledger, the server responds with the latest point of its chain among the points sent
- The server maintains a cursor for each client and sends them updates on request, which could be backward/forward
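As a rough illustration of the "find intersect" step, here is a minimal sketch over a list-based chain. The Point type and both function names are assumptions made for this example; the real chain-sync protocol types in ouroboros-network are richer.

```haskell
import Data.List (find)

-- A chain point, simplified to a slot number and a block hash (hypothetical).
data Point = Point { slot :: Int, hash :: String }
  deriving (Eq, Show)

-- The mock chain, oldest block first.
type Chain = [Point]

-- Most recent point of the server's chain that the client also knows, if any.
findIntersect :: Chain -> [Point] -> Maybe Point
findIntersect chain known = find (`elem` known) (reverse chain)

-- Points the client still has to receive (roll forward) after the intersection.
rollForwardFrom :: Chain -> Point -> [Point]
rollForwardFrom chain p = drop 1 (dropWhile (/= p) chain)

main :: IO ()
main = do
  let chain = [Point 1 "a", Point 2 "b", Point 3 "c"]
      known = [Point 2 "b", Point 1 "a", Point 9 "z"]
  print (findIntersect chain known)
  print (fmap (rollForwardFrom chain) (findIntersect chain known))
```

A per-client cursor would then just be the last Point sent to that client, advanced on each request.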
Design discussion about next steps for "mock node", or how to write server and client in order to test that transactions are observed on both sides:
- We need one peer per client per protocol
- The client (production) and server-side (test) code should both be ouroboros applications
- We can build a pair of Channels to the client/server? => the Mux can use a Channel
  - Or we need to bite the bullet and use a Snocket
Next steps:
- Complete network layer
- Add smart constructors to create transactions for Hydra protocol
- Which API to use? Cardano-api or ledger-api? => Which one is easier
- We used the cardano-api in the tests to decouple from the hydra-node library code, and went for using the JSON schema API instead
- Goal: smart constructors for protocol txs, e.g. PostChainTx -> Tx AlonzoEra
- Start working on the "chain tx" constructors, created Hydra.Chain.Direct.Tx as a separate module as it provides for a good "seam" for unit tests
- Which Tx type to use? There are at least:
  - GenTx (CardanoBlock StandardCrypto) from ouroboros-consensus?
  - ValidatedTx (AlonzoEra StandardCrypto) from cardano-ledger-specs?
  - Tx AlonzoEra from cardano-api?
- Discussion with KtorZ on whether we'll have a Tx or a TxBody and "who signs transactions"?
  - Conclusion: Signing / keeping keys at the client would be morally the right thing, but we want to do it the same way as we expected the PAB to work, i.e. "the system" has the keys and does sign txs / spend
  - The smart constructors, though, should be producing TxBody and withDirectChain would have access to some SigningKey
- Started with using the cardano-api types as they were easy to handle in the e2e test
- All of these functions will be something like .. TxIn -> Either TxBodyError (TxBody AlonzoEra)
- For initTx we can use a single TxIn for paying fees & as the parameter of minting thread token & participation tokens
- Problem: No Arbitrary instance for cardano-api's TxIn
- Shift to using cardano-ledger-specs as it has Arbitrary TxIn in one of their test packages, BUT
  - it's easier to shoot yourself in the foot with this API, i.e. nothing prevents you from creating a TX without inputs
  - not sure if this is good!?
- Was pointed at hedgehog Gen TxIn for cardano-api -> switch back
- Down a rabbit hole on how to run a hedgehog Gen in a QuickCheck Gen
- Was pointed at hedgehog-quickcheck, where I had missed the hedgehog -> QuickCheck direction by being blind
- Back in the flow of creating tests against initTx :: ... -> Either TxBodyError TxBody -> success with a generated TxIn
- Adding TxOut parameter to calculate and add a change output
- Fails if fees /= 0 because there might not be enough in the generated txOut
- create a ==> implication to ensure enough lovelace
- hedgehog Gen TxIn seems not to be scale-able from QuickCheck
- Seeing the complexity of makeTransactionBodyAutoBalance had me pivot to change initTx to create TxBodyContent instead and have the balancing, change and fee calculation be done by this function
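The "enough lovelace" precondition on the change output can be sketched without any ledger types. In this minimal example mkChange and prop_changeBalances are hypothetical names, and the implication is written as a plain Bool predicate rather than with QuickCheck's `==>`, so it runs with base only.

```haskell
type Lovelace = Integer

-- Compute the change left after paying fees, or fail when the input is too small.
mkChange :: Lovelace -> Lovelace -> Either String Lovelace
mkChange fee available
  | fee < 0 = Left "negative fee"
  | available < fee = Left "not enough lovelace to cover fees"
  | otherwise = Right (available - fee)

-- Implication-style property: whenever the precondition holds,
-- change plus fee must give back the input value.
prop_changeBalances :: Lovelace -> Lovelace -> Bool
prop_changeBalances fee available =
  not (fee >= 0 && available >= fee)
    || fmap (+ fee) (mkChange fee available) == Right available

main :: IO ()
main = do
  print (mkChange 2 10)
  print (mkChange 5 3)
  -- Exhaustive check over a small grid instead of random generation.
  print (and [prop_changeBalances f a | f <- [-1 .. 5], a <- [0 .. 5]])
```

With QuickCheck, the same property would discard the cases where the generated output cannot cover the fee instead of counting them as trivially true.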
- Presentation of early benchmark results
- Evolution of plots and performance over time
- Effect of various dimension (number of nodes, generator type)
- We have 2 generators currently, a "standard" one which uses ledger's genTx producing "large" transactions and a growing UTXO set, and a "constant" one that produces micro-payment-like transactions (one input, one output) and keeps the UTXO set constant
- Discussion:
- Suggestion: Do a heap profile of each node, does not require recompilation with profiling enabled
- "Fractal benchmark": SN can't reproduce behaviour of simple txs over ouroboros network, so perhaps an artifact of a single run on a specific machine?
- Could be good to run on bare metal to get some real benchmarking, but virtualised environments are somewhat consistent with probable deployment model
- We should take some time with Marcin to check we are not doing something stupid in the protocol layer
- Question: Why does the performance seem to degrade over time? => we don't know (yet), requires more investigation
- Extract some info from RTS from the nodes and use embedded Prometheus server to get data at regular interval
- Discussion about PAB vs. direct interaction with the chain:
- Review with Marcin how to integrate with NodeClient protocol for the purpose of testing direct submission of txs to the chain
- Also have a look at how Plutus implements Mock chain server
- We want to get something running sooner, it's a tactical decision to unblock us in the short term
- PAB has taken a while to build because it's actually complex, and deals with a lot of intrinsic complexities of the chain (and node was also a moving target...)
- Once PAB is ready we will have an easier time because dev will be user-driven
- Even if we get it running now with the "direct" approach, it does not mean we won't be using the PAB (as-a-library) for production later
- Finish reading ACE Paper
I would like to count the UTXO set size after each transaction, then correlate that with the transaction confirmation time (or ReqTx time?) to check if this shows any correlation with the latency increase.
I now have a list of txId/utxo size pairs, now need to correlate that with the time the transaction was confirmed.
Need to have a table of tx ids with confirmation time from the SnapshotConfirmed message
Struggled a bit with jq syntax to extract the timestamp for each transaction in the confirmedTransactions field of a snapshot confirmed message, but it turns out to be pretty simple:
$ cat log.1 | grep ProcessedEffect | grep SnapshotConfirmed | jq -cr '{ts:.timestamp, txs: .message.effect.serverOutput.snapshot.confirmedTransactions[]}' | jq -cr '[.txs, .ts] | @csv'
The confirmedTransactions array is actually "flattened" and one record is generated per element.
Trying to map UTXO over time, I messed up the logic: I need to apply the list of transactions in the order of their confirmation to keep track of the growth of the UTXO set over time; instead I simply concatenated the list of transactions in the dataset, which does not make sense. The confirmation time for each tx is good however, I need to sort the file on this.
Wrote a small haskell program to extract the information I need from the dataset.json file and the confirmed-txs.json, namely the size of UTXO per transaction confirmed ordered by time.
- Then I can produce a time series for the size of UTXO set with:
cat utxo-size.json | jq -cr '.[] | @csv' > utxo-size.csv
join -t ',' utxo-size.csv confirmed-txs.csv | cut -d ',' -f 2- | tr -d \" | awk -F ',' '{ print $2, $1 }' | sort > utxo-time
Now trying to run a benchmark with a different transaction set, namely one where the UTXO set does not grow over time.
- Struggling to get the API right to generate a sequence of transactions that cycle among a UTXO set.
- In the Cardano module we use Crypto.Ledger.Keys but this is not in the cardano-api, which has different types; ended up exposing a function to extract the verification key from a keypair as I need the former to generate UTXO, and the latter to generate addresses in transactions.
- Trying to leverage what @KtorZ did for the TUI, generating keys, addresses and specific UTXO set, but working with the API to e.g. select a random UTXO from a list of UTXOs is very annoying
- Trying another approach: Start with a single UTXO, a single key pair, and then consume the UTXO producing the same UTXOs, sending to a new key pair. It seems to work but I still have errors when running the property test, seems like the values are not correctly conserved...
- Turns out the error was that I was trying to apply the transactions in the wrong order, using foldr instead of foldl'
- Now I get another error with some UTXO being too large, trying to trim that down to some manageable size.
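The foldr versus foldl' mix-up is easy to reproduce on a toy ledger. The sketch below uses a deliberately simplified model (integers as outputs, lists as UTXO sets); Tx, applyTx and the two applyAll variants are hypothetical names, not the benchmark code.

```haskell
import Data.List (foldl', (\\))

-- A toy transaction: it spends some outputs and creates new ones.
data Tx = Tx { inputs :: [Int], outputs :: [Int] }

type Utxo = [Int]

-- Apply one transaction, failing if it spends an output that is not in the set.
applyTx :: Either String Utxo -> Tx -> Either String Utxo
applyTx (Left e) _ = Left e
applyTx (Right utxo) (Tx ins outs)
  | all (`elem` utxo) ins = Right ((utxo \\ ins) ++ outs)
  | otherwise = Left ("missing inputs: " ++ show (ins \\ utxo))

-- foldl' visits the list left to right, so each tx sees the outputs of the previous one.
applyAll :: Utxo -> [Tx] -> Either String Utxo
applyAll utxo = foldl' applyTx (Right utxo)

-- foldr combines from the right, so the last tx is applied first.
applyAllBackwards :: Utxo -> [Tx] -> Either String Utxo
applyAllBackwards utxo = foldr (flip applyTx) (Right utxo)

main :: IO ()
main = do
  let txs = [Tx [1] [2], Tx [2] [3]]
  print (applyAll [1] txs)
  print (applyAllBackwards [1] txs)
```

In a chain of dependent transactions the second variant fails immediately, because the last transaction tries to spend an output that has not been created yet.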
Finally managed to run a benchmark with a constant UTXO set size:
Writing results to: test-bench-constant/results.csv
Confirmed txs: 18000
Average confirmation time: 6.724038065066666e-2
Confirmed below 1 sec: 100.0%
What's missing for demo?
- Sending native assets
- Showing the 3 clients opened in the head
- Explaining use of test faucet + use of addresses representing each party
- 2 lists of things in the Head: Connected nodes + hydra keys in hydra heads
- use address instead of peer to send UTXO to? + alias
- We send only a part of the UTXO
- Pb: There are 4 addresses instead of 3
- Show list of head participants using Hydra public keys w/ colors
- Showing the final UTXO set after closing
- Identifying parties, passing key pair to the exec for the head
- Initial addresses for the head are generated from the port only (which are all the same which is weird...)
-
Picked up on the "aliased Party" branch and implemented an optionally aliased Party type
- Ord instance problems + tests
- Show instance which prefixes alias@ and uses hex encoding on the verification key
- ToJSON dropping alias when Nothing
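One possible shape for such an aliased type is sketched below. It assumes the verification key is already a hex String and that Eq/Ord should ignore the alias; both are assumptions for illustration and not necessarily what the branch implements.

```haskell
import Data.Function (on)

-- A party identified by a verification key, with an optional human-readable alias.
-- The key is a plain String of hex characters here, which is an assumption.
data Party = Party { alias :: Maybe String, vkey :: String }

-- Equality and ordering look at the key only, so aliasing never changes identity.
instance Eq Party where
  (==) = (==) `on` vkey

instance Ord Party where
  compare = compare `on` vkey

-- Show prefixes "alias@" when an alias is present.
instance Show Party where
  show (Party a k) = maybe "" (++ "@") a ++ k

main :: IO ()
main = do
  print (Party (Just "alice") "0a1b")
  print (Party Nothing "0a1b")
  print (Party (Just "alice") "0a1b" == Party Nothing "0a1b")
```

Ignoring the alias in Ord keeps a Set Party stable whether or not a node happens to know the alias of its peers.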
-
-
Have the node alias me and otherParties when loading keys from file
- Heuristic to only alias when files start with a letter
- Leaves Party in end-to-end tests unaliased (otherwise we would need to carry around the names in our instrumentation)
-
Show the list of Party using the new format in the TUI now
-
Noticed that the TUI was refactored into data State = State { ..., clientState :: ClientState} with data ClientState = Connected | Disconnected, which is exactly what I avoided initially -> need to discuss this
-
How to get the me :: Party on the client side?
- Thought about different ways to do this:
  - Simply add a me :: Party to ReadyToCommit {parties :: Set Party, me :: Party} -> this provides us with enough info at the right point in time, which we can keep around while the head is open
  - Add a new ServerOutput NodeInfo { me :: Party } .. maybe also including a version :: Version etc.
    - This could be sent as some kind of "greeting" to each connected client as the first output in history
    - A corresponding GetNodeInfo input which fetches this information
- Initially thought of option one being easiest, but changing ReadyToCommit to be node-specific is a PITA for tests
- Option 2 with the latched greeting was easy to achieve and has a taste of old-school protocols
- API server tests are obviously failing, could fix most.. but one was puzzling me and no time to fix now (sorry)
-
Finally, the client-side can now store the public key of the connected node and we can generate Utxos and addresses from that, instead of a port or other info
- This essentially means we would want to Party -> CardanoKeyPair and Party -> Utxo
- Moved the "fauceting" and "credential conversion" into Hydra.Ledger.Cardano as it now requires deconstructing the Party's VerKeyMockDSIGN to get our hands on a suitable seed for crafting cardano credentials -> hydra-node seemed to be a better place for this than the TUI
-
Handling the Greeting was trivial, but the fact that me :: Maybe Party is messing up quite a lot of the code
-
Some more final polishing / experiments in highlighting "us" and "own addresses"
Extracting size of UTXO set from reduced log:
$ cat log.1 | grep HeadIsFinalized | jq '.message.effect.serverOutput.utxo | keys' | wc -l
Extracting length of processed snapshots:
cat log.1 | grep ReqSn | grep NetworkEvent | grep ProcessedEvent | jq -r '[.timestamp, (.message.event.message.transactions| length)] | @csv' | tr -d \" > snapshot-length.csv
Extracting timings for ReqSn start and stop:
$ cat log.1 | grep ReqSn | grep NetworkEvent | grep 'ProcessingEvent' | jq -r '[.message.event.message.snapshotNumber, .timestamp] | @csv' | tr -d \" > snapshot-processing.csv
$ cat log.1 | grep ReqSn | grep NetworkEvent | grep 'ProcessedEvent' | jq -r '[.message.event.message.snapshotNumber, .timestamp] | @csv' | tr -d \" > snapshot-processed.csv
$ join -t ',' snapshot-processing.csv snapshot-processed.csv > snapshot-time.csv
The idea is to compute the time span between the first processing event for an ack and the last processed:
$ for i in {1..6}; do cat log.1 | grep AckSn | grep NetworkEvent | grep ProcessingEvent | jq -r "select (.message.event.message.party == $i) | [.message.event.message.snapshotNumber, .timestamp] | @csv" | tr -d \" > processing-ack-$i.csv ; done
$ for i in {1..6}; do cat log.1 | grep AckSn | grep NetworkEvent | grep ProcessedEvent | jq -r "select (.message.event.message.party == $i) | [.message.event.message.snapshotNumber, .timestamp] | @csv" | tr -d \" > processed-ack-$i.csv ; done
join -t ',' processing-ack-1.csv processing-ack-2.csv | join -t ',' - processing-ack-3.csv
What I want is the ack-sn.csv file that looks like:
2021-09-10T16:33:28.664Z,118
2021-09-10T16:33:28.912Z,194
2021-09-10T16:33:29.141Z,136
2021-09-10T16:33:29.347Z,108
2021-09-10T16:33:29.598Z,149
2021-09-10T16:33:29.816Z,59
2021-09-10T16:33:30.066Z,145
2021-09-10T16:33:30.424Z,349
2021-09-10T16:33:30.629Z,186
2021-09-10T16:33:30.867Z,130
To produce it from the logs was somewhat painful though:
- Aggregate timings for AckSn events processed for each node, grouped by ProcessingEvent and ProcessedEvent, in a JSON array,
- Load the array (in my case using nodejs) and extract the minimum of start time and the maximum of stop time,
- Compute the difference between the two and produce a file with the x value being the end time, and the y value being the total latency for acking a snapshot.
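The min/max computation described in these steps can be written compactly. This sketch assumes timestamps are already parsed into milliseconds; the AckTimings record and ackLatency are illustrative names only.

```haskell
-- Timestamps are milliseconds since the epoch (already parsed); an assumption for brevity.
type Millis = Integer

-- One row per snapshot: its number, when each node started processing the AckSn,
-- and when each node finished processing it.
data AckTimings = AckTimings
  { snapshotNumber :: Int
  , processing :: [Millis]
  , processed :: [Millis]
  }

-- (end time, latency): latency spans from the earliest start to the latest stop.
-- Returns Nothing for rows without any timing.
ackLatency :: AckTimings -> Maybe (Millis, Millis)
ackLatency (AckTimings _ starts stops)
  | null starts || null stops = Nothing
  | otherwise = Just (end, end - minimum starts)
  where
    end = maximum stops

main :: IO ()
main = do
  print (ackLatency (AckTimings 1 [1000, 1004, 1010] [1090, 1118, 1100]))
  print (ackLatency (AckTimings 2 [] []))
```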
Joining all AckSn timings:
join -t ',' processed-ack-1.csv processed-ack-2.csv | join -t ',' - processed-ack-3.csv | join -t ',' - processed-ack-4.csv | join -t ',' - processed-ack-5.csv| join -t ',' - processed-ack-6.csv > processed-ack-all.csv
join -t ',' processing-ack-1.csv processing-ack-2.csv | join -t ',' - processing-ack-3.csv | join -t ',' - processing-ack-4.csv | join -t ',' - processing-ack-5.csv| join -t ',' - processing-ack-6.csv > processing-ack-all.csv
join -t ',' processing-ack-all.csv processed-ack-all.csv > ack-all.csv
Then I manually transformed this file back to JSON in Emacs :( before processing its content in node. Here is the JS script to produce a JSON that contains the above data:
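const fs = require('fs');
// Each row: [snapshotNumber, 6 ProcessingEvent timestamps, 6 ProcessedEvent timestamps]
const ack = fs.readFileSync('ack-sn.json');
const d = JSON.parse(ack);
// Convert every timestamp to epoch milliseconds, keeping the snapshot number as-is
const ts = d.map(arr => [arr[0]].concat(arr.slice(1).map(d => new Date(d).getTime())));
// Earliest start over all 6 nodes (columns 1..6) and latest stop (columns 7..12)
const minmax = ts.map(arr => [arr[0],Math.min(...arr.slice(1,7)), Math.max(...arr.slice(7))]);
// [end time, total latency in ms]
const sntime = minmax.map(arr => [new Date(arr[2]),arr[2] - arr[1]]);
fs.writeFileSync('ack-sn.json',JSON.stringify(sntime));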
Then producing a CSV for plotting with gnuplot amounts to:
cat ack-sn.json | jq -rc '.[] | @csv' | tr -d \" > ack-sn.csv
gnuplot is a bit quirky to work with if the data is not in the right format; doing computations and transformations on data is awkward, e.g. computing the moving average for receiving all AckSn in the following transcript:
set xdata time
set format x "%H:%M:%S"
set xtics out rotate
set title 'Snapshot acknowledgement time (ms) - 1389 snapshots'
samples(x) = $0 > 9 ? 10 : ($0+1)
avg10(x) = (shift10(x), (back1+back2+back3+back4+back5+back6+back7+back8+back9+back10)/samples($0))
shift10(x) = (back10 = back9, back9 = back8, back8 = back7, back7 = back6, back6 = back5, back5 = back4, back4 = back3, back3 = back2, back2 = back1, back1 = x)
init(x) = (back1 = back2 = back3 = back4 = back5 = back6 = back7 = back8 = back9 = back10 = sum = 0)
plot sum =init(0),\
'ack-sn.csv' u (timecolumn(1,"%Y-%m-%dT%H:%M:%SZ")):($2) w l t 'AckSn Processing time (ms)', \
'ack-sn.csv' u (timecolumn(1,"%Y-%m-%dT%H:%M:%SZ")):(avg10($2)) w l t 'Moving average (10 points)', \
'ack-sn.csv' u (timecolumn(1,"%Y-%m-%dT%H:%M:%SZ")):(sum = sum + $2, sum/($0+1)) w l t 'Cumulative mean'
Plotting moving average with gnuplot is clunky: http://skuld.bmsc.washington.edu/~merritt/gnuplot/canvas_demos/running_avg.html
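For comparison, the same two series (moving average and cumulative mean) are short to compute outside gnuplot. A minimal sketch, assuming the latencies are already loaded as a list of Double and the window size is at least 1; the function names are made up for this example.

```haskell
import Data.List (inits)

-- Moving average over the last n points; early points average over what is
-- available so far, like the samples() helper in the gnuplot script.
movingAverage :: Int -> [Double] -> [Double]
movingAverage n xs =
  [ sum window / fromIntegral (length window)
  | prefix <- drop 1 (inits xs)
  , let window = drop (length prefix - n) prefix
  ]

-- Cumulative mean of all points seen so far.
cumulativeMean :: [Double] -> [Double]
cumulativeMean xs = zipWith (/) (scanl1 (+) xs) (map fromIntegral [1 :: Int ..])

main :: IO ()
main = do
  let latencies = [1, 2, 3, 4]
  print (movingAverage 2 latencies)
  print (cumulativeMean latencies)
```

This version recomputes each window from scratch, which is quadratic in the worst case but fine for a few thousand snapshots.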
What I would like to do tomorrow is to map the UTXO set over time, checking if there's a correlation between the UTXO set and the time it takes to produce snapshots after a while. This does not seem to be the case as we can see the ReqSn processing time does not significantly change over time. Note this could also be tested experimentally by running a benchmark with a synthetic transaction sequence that does not increase the UTXO size, something like ping-pong style transactions which keep sending the same amount back and forth.
Plan for today:
- MB: Got simulation work to do for researchers, fixing "bug" in UTXO display and polishing TUI
- AB: Add more concurrency and parallelism to benchmark (run n clients per node, run more than 3 nodes)
Going to work on having multiple clients per node first, so that we can see the effect of submitting parallel non conflicting transactions to same node
- Also need to find a way to make the contestationPeriod dynamic because otherwise the test can fail as it times out waiting for finalisation
- There is a single registry for confirmation time of all transactions on all sequences, but this does not really make sense and increases contention as each thread competes for the same piece of data -> split registry among all threads and combine them at end of run
Adding the ability to increase concurrency above number of nodes, eg. have more than one client per node
- Solution is rather simple: Just extract the startConnect function to the toplevel, it's the one responsible for opening the connection to the hydra node.
Introduce withCluster function that "folds" withHydraNode over an arbitrary non-zero expected number of nodes
- Got an error at startup so it seems there's some 3 hardcoded somewhere...
- I can see the 4 hydra nodes starting up but the test fails on an expectation:
waitFor... timeout!
nodeId: 4
expected: {"parties":[1,2,3,4],"tag":"ReadyToCommit"}
seen messages: {"parties":[4,1,2,3],"tag":"ReadyToCommit"}
- Turning the list of parties into a Set does the trick -> ordering is guaranteed as Set needs Ord instances and maintains a deterministic ordering of nodes
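The difference between comparing lists and sets of parties is easy to check in isolation. In this small sketch parties are plain integers, as in the failing expectation; it relies only on Data.Set from the containers package.

```haskell
import qualified Data.Set as Set

-- Parties are plain integers here, as in the benchmark output.
type Party = Int

main :: IO ()
main = do
  let expected = [1, 2, 3, 4] :: [Party]
      seen = [4, 1, 2, 3] :: [Party]
  -- Lists compare element by element, so the order matters.
  print (expected == seen)
  -- Sets ignore insertion order.
  print (Set.fromList expected == Set.fromList seen)
  -- Elements come back sorted according to the Ord instance.
  print (Set.toList (Set.fromList seen))
```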
Running a benchmark with 6 clients, 4 nodes gives me:
Confirmed txs: 481
Average confirmation time: 0.4695817474137214
Confirmed below 1 sec: 100.0%
With 3 nodes:
Confirmed txs: 571
Average confirmation time: 0.4129197640157618
Confirmed below 1 sec: 100.0%
with 2 nodes:
Confirmed txs: 533
Average confirmation time: 0.3144588079249531
Confirmed below 1 sec: 100.0%
Spent some time fixing a mistake I pushed to master: Changed the type of parties committing from [Party] to Set Party and this had ripples I did not notice
Trying to run a 10 nodes simulation => waitFor startup times out, need to increase it
Trying with smaller number of nodes, say 6
- Got a 6-nodes benchmark running, I guess startup timeout should be increased to something like 20 seconds per node
- Benchmark still running after 1.5 hours, it's really hard to say how much has been done -> need better reporting than dumping transaction ids and snapshots
Trying to run a benchmark with more "reasonable" values: 10 concurrent clients, 6 nodes, scaling factor of 50, and added some progress report:
Client 5 (node 2): 17/249 (6.83)
Client 1 (node 6): 13/1366 (0.95)
Client 3 (node 4): 15/1112 (1.35)
Client 4 (node 3): 14/642 (2.18)
Client 2 (node 5): 14/901 (1.55)
Client 6 (node 1): 17/1081 (1.57)
Client 7 (node 6): 13/361 (3.60)
Client 8 (node 5): 14/154 (9.09)
Client 9 (node 4): 15/571 (2.63)
Client 10 (node 3): 14/1452 (0.96)
Also set the number of transactions to be the same for all clients, otherwise we might get artifacts in the numbers we extract from the run if some clients stop sending transactions before others.
Benchmark TODOs:
- Run the benchmark over some period of time to get steady state behaviour => trim down the logs?
- Add validation time and throughput
- Add metrics internal to nodes (event queue size, confirmation time)
- Increase the number of nodes
- Spread the load on different clients
- We have the worst case => increase the parallelism of transactions generated
TUI TODOs:
- commit from faucet
- new transaction
- UTXO set visualisation
Note: We can easily DoS nodes spamming them with invalid transactions, as demonstrated by performance drop when submitting a lot of invalid transactions. We need some rate limiting and/or caching of validation to reduce the load on the nodes
Enhancing plot:
- Looking at https://torbiak.com/post/histogram_gnuplot_vs_matplotlib/#gnuplot to plot throughput of txs in benchmark
- Trying to plot a histogram of transactions throughput, this link is actually simpler and works: http://www.gnuplotting.org/calculating-histograms/
- Managed to plot confirmation time and throughput on the same graph. Obviously confirmation time follows an inverse trend with throughput which follows from Little's Law
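For reference, Little's Law relates the average number of items in a system L, the throughput λ and the average time W an item spends in the system. Reading L as the number of in-flight transactions, and assuming it stays roughly constant because each client waits for a confirmation before submitting its next transaction (an assumption about the benchmark, not something measured here), the inverse trend follows directly:

```latex
L = \lambda W
\quad\Longrightarrow\quad
W = \frac{L}{\lambda}
```

So with L fixed, halving the throughput λ (tx/s) doubles the average confirmation time W.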
Now adding more parallelism in the data set so that we can observe how the nodes behave with non conflicting transaction sets
- First step is generating n non-conflicting transaction sequences that will each be handled by one client thread sending to one node
- Got caught in a rabbit hole transforming the way we store and pass data to EndToEnd in order to introduce potential concurrency
- Added a parameter to the benchmark and created a Dataset type to map a Utxo set and the sequence of transactions generated for it, so that we can have multiple sequences and distribute them among many clients
- Implemented parallel submission and confirmation of transactions, with one thread per generated dataset and the threads distributed over the various clients available
The run deadlocks pretty quickly and TxInvalid messages show up, which shouldn't be the case: It seems the first transaction submitted in the second thread is not the first transaction from the dataset?
- Of course: There is a single submission queue for all the threads -> removing it from the registry and creating it in each thread should work better. Each pair of submitter/confirmer now has its own queue but I am still running into a deadlock with concurrency > 2
- Trying to unblock the submitter when the transaction is invalid; this works in the sense that the process does not deadlock but transactions stay invalid "forever"
- I have got the explanation: All the UTXO sets generated have the same TxId as reference, so the transaction consumes the wrong txout and everything goes awry => Need to make sure the UTXO sets are completely different...
  - Updated genUtxo function to use an arbitrary genesis TxId which hopefully should be fine?
- When trying to increase the concurrency level over 3, e.g. one client per node, I am running into troubles of course because all clients for the same node share a connection to the node, which does not work! => Need to make sure each submitter has its own connection, which might be slightly annoying given the way we structured HydraNode
All nodes are now busy, each with its own dedicated client:
28553 curry 20 0 1024.8g 95716 23184 S 101.0 0.6 0:56.91 hydra-node
28585 curry 20 0 1024.8g 85788 23120 S 98.7 0.5 0:55.95 hydra-node
28574 curry 20 0 1024.8g 92324 23104 S 95.0 0.6 0:57.10 hydra-node
Tip: Counting the number of transactions in a dataset:
cat bench-parallel-test-2/dataset.json | jq -c .[].transactionsSequence[].id | wc -l
Just realised it's not possible to run 2 benchmarks in parallel: I killed a long running one because I was oblivious to this :( Now adding validation time to the plot so that we can check how this evolves over time. I suspect validation time accounts for a significant fraction of the time spent processing a transaction as the size of the UTXO set grows
It's slightly annoying but due to the way the reader is coded for DiffTime one has to add an s suffix to the number of seconds for the timeout:
$ cabal bench local-cluster --benchmark-options '--scaling-factor 20 --concurrency 3 --output-directory bench-test --timeout 1000s'
Trying to run the benchmark with nullTracer just to make sure the output of the logs is not impacting the performance of the nodes significantly: I suspect JSON (de)serialisation might contribute significantly to bad performance of nodes
Final plot of the day, with a null tracer:
Goal for today:
- Write the snapshot decider "by the book", eg. independently from the rest of the protocol and as described by the paper
- Remove the current ReqSn production and wire the newSn
Removing SnapshotStrategy which is really not used
Tried to wire the new snapshot decision logic into the Node, simply enhancing the effects with a ReqSn when we decide we should snapshot:
- Lot of tests are failing/hanging now -> Trying to debug the tests one by one as we have a lot of them around
- Also removed RequestedSnapshot from the ADT
- Removed all snapshot emission tests from HeadLogicSpec -> Those are now covered by SnapshotStrategySpec
NodeSpec tests are failing for rather obvious reasons => We don't want to emit a ReqSn if seenTxs is empty while we ShouldSnapshot
Problem now is that we emit a ReqSn upon every ReqTx so one NodeSpec test fails because we add more ReqSn than expected
- emitSnapshot now changes the state so that we don't emit multiple snapshots while processing a batch of events
- We also make newSn work on CoordinatedState directly instead of HeadState to remove some cases
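A standalone snapshot decision along these lines could look like the sketch below. The record fields, the Decision constructors' payloads and the round-robin leader rule are all assumptions made for illustration; only the names newSn, CoordinatedState, seenTxs and ShouldSnapshot come from the notes above.

```haskell
-- A trimmed-down coordinated state; field names are illustrative assumptions.
data CoordinatedState = CoordinatedState
  { confirmedSnapshot :: Int -- number of the last confirmed snapshot
  , snapshotInFlight :: Bool -- True while a ReqSn awaits acknowledgements
  , seenTxs :: [String] -- transactions seen but not yet snapshotted
  }

data Decision
  = ShouldSnapshot Int [String]
  | ShouldNotSnapshot String
  deriving (Eq, Show)

-- Round-robin leader (an assumption): with n > 0 parties, party i (1-based)
-- leads snapshot sn when (sn mod n) + 1 == i.
isLeader :: Int -> Int -> Int -> Bool
isLeader nParties party sn = sn `mod` nParties + 1 == party

-- Pure decision, independent from the processing of ReqTx / ReqSn events.
newSn :: Int -> Int -> CoordinatedState -> Decision
newSn nParties party st
  | not (isLeader nParties party next) = ShouldNotSnapshot "not leader"
  | snapshotInFlight st = ShouldNotSnapshot "snapshot in flight"
  | null (seenTxs st) = ShouldNotSnapshot "no seen txs"
  | otherwise = ShouldSnapshot next (seenTxs st)
  where
    next = confirmedSnapshot st + 1

main :: IO ()
main = do
  print (newSn 3 2 (CoordinatedState 0 False ["tx1"]))
  print (newSn 3 2 (CoordinatedState 0 False []))
  print (newSn 3 2 (CoordinatedState 0 True ["tx1"]))
```

Keeping the decision pure makes the "no ReqSn on empty seenTxs" and "no second ReqSn while one is in flight" rules directly unit-testable.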
Got all tests to pass but benchmarks are still livelocking so something's still wrong in the snapshotting/confirmation logic:
- AB going to finish log-filter program to be able to analyse logs more easily
- MB to troubleshoot issues with bench
Back working on the log-filter process, got an issue with transforming the list of confirmedTransactions: what I get is not a list of ids but a single id, so I assume my traversal is not working
Interesting question is: How to test this log filter and ensure it stays in sync with the structure of the log entries?
- MB suggests to generate random logs, and check compression achieved by log-filter against some expected threshold
- Wrote unit tests to assert LogFilter properly transforms an Array of transactions into an Array of TxIds. Wasted 10 minutes troubleshooting the test which failed for the wrong reason because a keyword used was invalid => test is useful!
Impact of log-filter, before:
-rw-rw-r-- 1 curry curry 184567326 Sep 8 10:12 /run/user/1001/bench-67fb0dac6531c6bf/1
after:
-rw-rw-r-- 1 curry curry 5595672 Sep 8 13:40 filtered
Trying to extract a test case from the failed benchmark run of yesterday
Started writing a "log-filter" program whose purpose is to filter and trim down the logs, removing unessential details at this stage in order to better understand the flow of transactions and messages. It currently replaces full tx with txid and removes long map of UTXO from a few messages; there is still some work to do to make it usable and remove most of the noise from the logs. Note that it also removes all log entries which are not from the Node (e.g. ouroboros network messages). I have used lens-aeson but am unsure whether I am really harnessing the full power of the library and lenses in general, but it seems to work fine so far.
Discussing issues about snapshots with MB
- Our current approach seems to be flawed: It's the 3rd time we are having issues with snapshots while we are trying to produce them synchronously with other events
- In the original formulation of the paper, the decision to create a new snapshot is independent from processing of txs and snapshots, and has been translated in hydra-sim as a SnapshotStrategy that drives a snapshot thread which injects the ReqSn
Plans for tomorrow:
- Write a better tester for our protocol, possibly using some kind of events generator. One approach would be to consider individual transactions "validation journey" and then compose the needed events/messages to produce an arbitrary interleaving and test that
- Another solution would be to just randomly generate possible messages coming from the network, e.g. consider the network abstractly without taking into account the other nodes and just observe messages coming and act on them. Then we could generate sequences of messages from the point of view of the network to try to trigger unexpected behaviour and interleaving (if it sounds unclear it's because it is...)
- Remove the snapshotting logic from the HeadLogic's function update and write a Snapshotter independently, with proper tests and specified behaviour, then plug it in the node
Recreating Dev VM, trying to see if I could get something faster to improve turnaround time for compilation, but seems like C2 machines are the fastest available CPU wise
I have managed to get the autotest.sh script to work again, thus will see over the next weeks/month whether this has an impact on the cabal time.
I expect it should because I should be able to run the tests using autotest.sh rather than cabal. Ideally I would need a small wrapper program to tally timings between compilation cycles within ghcid, which means I should probably look into ghcid's source code
When compiling master, I am having problems with missing git reference from dependencies, which are addressed by SN's PR:
$ cabal build all & cabal test all
[1] 5621
fatal: reference is not a tree: 09433fe537a4ab57df19e70be309c48d832f6576
fatal: reference is not a tree: 09433fe537a4ab57df19e70be309c48d832f6576
Working on merging this PR, then trying to fix compiler errors stemming from upgrade in dependencies.
ContractTest is failing with
Contract instance stopped with error: ConstraintResolutionError (DatumNotFound ca54c8836c475a77c6914b4fd598080acadb0f0067778773484d2c12ae7dc756)
src/Plutus/Contract/Test.hs:241:
Init > Commit > CollectCom: CollectCom is not allowed when not all parties have committed
The `collectCom` test is expected to fail but it seems this error is not expected? Actually it fails in the `commit` call.
Rebasing update deps PR on master before pushing, contract tests are now pending but we should revisit Plutus anyway so 🤷
Deciding to work on bench as MB has not touched it a lot and it's broken. Writing a function to submit transactions in parallel with confirmation
- When we get a `TxInvalid` we want to wait for the next confirmed snapshot to resubmit the transaction
Turning the `Registry` into a record that contains a queue of txs to resubmit
Slight problem for pairing: AB has not the missing reference from master anymore so cannot work on it, but MB has not built the update-deps branch so has no dependencies and the build is not finished yet so we don't have a cache available -> we don't rotate for the moment and wait for PR to finish building
MB got troubles compiling after merge of update deps PR: Problems with `happy` dependency from plutus-core -> program could not be found ??!!
- There were 2 different installations of happy and alex, removing one of them fixed the issue
We now put transactions in a queue and then repush them when they are invalid, then we resubmit a transaction only if the snapshot number has changed
- Our transaction submitter gets stuck reenqueueing transactions and never stops, which hogs the process at 100% CPU
- Trying to simplify code by extracting a "decision function" running in STM that returns an `Outcome` which says what to do with the function
- Got a successful run with 5 snapshots and 10 txs so I suspect there is a race condition
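A minimal sketch of what such a "decision function" could look like. All names here (`Outcome` and its constructors, `Registry` and its fields, `decide`, `demo`) are hypothetical and not taken from the benchmark code; it assumes each rejected transaction is queued together with the snapshot number that was confirmed when it was rejected.

```haskell
import Control.Concurrent.STM (STM, TVar, atomically, newTVarIO, readTVar, retry, writeTVar)

-- | Hypothetical outcome of one decision step.
data Outcome tx
  = Resubmit tx -- a newer snapshot was confirmed, try this tx again
  | Done -- nothing left to resubmit
  deriving (Eq, Show)

-- | Hypothetical registry: last confirmed snapshot number plus a queue of
-- rejected txs, each tagged with the snapshot number at rejection time.
data Registry tx = Registry
  { confirmedSnapshot :: TVar Int
  , resubmitQueue :: TVar [(Int, tx)]
  }

-- | Decide what to do next. When the head of the queue cannot be resubmitted
-- yet, 'retry' blocks the thread until one of the TVars read here changes,
-- instead of re-enqueueing in a busy loop.
decide :: Registry tx -> STM (Outcome tx)
decide registry = do
  sn <- readTVar (confirmedSnapshot registry)
  queue <- readTVar (resubmitQueue registry)
  case queue of
    [] -> pure Done
    (rejectedAt, tx) : rest
      | sn > rejectedAt -> do
          writeTVar (resubmitQueue registry) rest
          pure (Resubmit tx)
      | otherwise -> retry

-- | Small usage example: snapshot 2 is confirmed, "tx1" was rejected at 1.
demo :: IO (Outcome String)
demo = do
  sn <- newTVarIO 2
  queue <- newTVarIO [(1, "tx1")]
  atomically (decide (Registry sn queue))
```

Keeping the decision in a single STM transaction means the snapshot number and the queue are always read consistently, which might also help narrow down the suspected race condition.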
I think the snapshot process "loses" transactions:
- A tx received through a ReqTx should be snapshotted by node 2 which will be the leader, but there is still a snapshot in progress so no new snapshot is started
- The snapshot is emitted, but then no new tx can be submitted because of the "blocked tx" which is not confirmed -> All subsequent txs that depend on it will fail to submit
- As no new tx can be submitted, no new `ReqTx` is produced which prevents the production of a new snapshot
This is exactly the problem for which we introduced `DoSnapshot` initially: We cannot link the production of a new snapshot to `ReqTx` because if we don't produce it immediately, and there is no more `ReqTx` to trigger the check, the transaction gets "lost" and never confirmed.
Originally not planned to do this today, but I realized that our `master` cannot be built from scratch in a fresh working copy because we seem to be referring to a now gone `cardano-ledger-specs` commit (we took that one from the plutus we tracked so far).
So I set out to update the `source-repository-package` in `cabal.project` to match the most recent `plutus` `master`.
Using a `nix-shell -A cabalOnly` this was quite rapid to set up and then later materialize into a "proper" haskell.nix shell.
Three API changes seem to have happened:
- `utxoAt -> utxosAt` renamed and now returns a `ChainIndexTxOut` instead of a `type UtxoMap = Map TxOutRef TxOutTx`. Was not much of a problem to us, but I changed the code slightly to more often use `TxOut` instead of the `Tx`-referencing `TxOutTx`.
- `ScriptLookups` is now taking `ChainIndexTxOut` for `slTxOutputs`, which can be created from `txOut` using `fromTxOut :: TxOut -> Maybe ChainIndexTxOut`, and `slOtherScripts` is now taking `ValidatorHash` instead of addresses; this was also easy to change.
- `nextTransactionsAt` seems to have been deleted! This is more of a problem and was not yet solved. Was it replaced? None of the functions in Request.hs seem to be doing this.
Until the last API change has been fixed, these things can be found on this branch: https://github.com/input-output-hk/hydra-poc/tree/ch1bo/update-deps
Having a look at glean from FB, a tool to explore code bases as advertised on the web site
Looks like there are only indexers (e.g. parsers that generate predicates and facts from source code) for JS/Flow and Hack, the 2 "proprietary" languages owned by FB. Shouldn't be too hard to make some for Haskell code based on HLS provided tooling?
Discussing in detail the Hydra demo and improvements to TUI we want to make
- Use UTxO and TX, do not try to abstract away the details of the ledger (eg. use values only)
- Show the TxRef + the address the values are sent to by the UTxO
- Detail: How do you know which output is owned by whom? Can we derive the pubkey of owner from the address used?
- when using Ledger generation machinery, keys are set as part of `Constants`, we could use the same key for everyone
- How do you create a transaction?
- Select from your unspent outputs
- Accumulate available values from selected outputs
- One dialog screen per step: Select inputs, create outputs, confirm
Planning and aligning on work to do in the next couple of weeks:
- Flesh out the TUI
- Adding metrics to the benchmark and making it more useful (run with more nodes and different configurations). Also scrape internal metrics from each node and output them as part of the benchmark's results
Using the innovation/learning budget to see how to use Plutus w/o PAB => Run a spike to craft txs by hand and use compiled validators
Updated Logbook
Worked on 2 PRs still in flight: Log API documentation and schema testing, and round-robin leader "election" in the protocol.
Design discussion about how the `prop_specIsComplete` property is defined and how to use it:
- It currently takes a `SpecificationSelector` which is really a lens selecting some part of the provided schema
- This lens should point to a schema fragment which is an array of `object`s having a `title` field, which we use to compare to the list of constructors extracted from arbitrary data and find discrepancies
- Unfortunately, the lens or some other kind of expression is needed, and not only the name of the field we are interested in, because of differences in structures in the schemas
- We should document this test and property a bit more as they are not really obvious, and also help users by leaving the common parts out of the provided lens (the part extracting a list of `title`s from a `Value`)
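The comparison at the heart of this property can be sketched as a plain set difference between schema titles and constructor names. This is only an illustration under assumptions: `Discrepancy`, `specDiscrepancy` and `isComplete` are hypothetical names, and the extraction of titles through the lens and of constructor names from arbitrary data is left out.

```haskell
import Data.List (nub, sort, (\\))

-- | Hypothetical result of comparing a schema fragment with a data type.
data Discrepancy = Discrepancy
  { missingInSpec :: [String] -- constructors with no matching title
  , unknownInSpec :: [String] -- titles with no matching constructor
  }
  deriving (Eq, Show)

-- | Compare the titles found in the schema with the constructor names
-- observed in generated values. Both lists are deduplicated and sorted
-- first, so that '\\' behaves like a set difference.
specDiscrepancy :: [String] -> [String] -> Discrepancy
specDiscrepancy titles constructors =
  Discrepancy
    { missingInSpec = cs \\ ts
    , unknownInSpec = ts \\ cs
    }
 where
  ts = sort (nub titles)
  cs = sort (nub constructors)

-- | The specification is complete when there is no discrepancy either way.
isComplete :: [String] -> [String] -> Bool
isComplete titles constructors =
  specDiscrepancy titles constructors == Discrepancy [] []
```

Reporting both directions separately would make a failure more readable than a bare `False`, which seems in line with the wish to make the test less obscure.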
There was also a meta-discussion about whether or not it's ok for someone to add commits to someone else's PR to "fix" it. Seems like we agree this is all fine and good as long as the changes are motivated, but then one could ask: Why not simply add more commits on `master` directly, either pairing or ensembling, or discussing them at start of ensemble session?
Worked together on the PR to "complete" it as we agreed the test written in the `HeadLogicSpec` was not satisfying as it is: It would better fit as a `NodeSpec` test which is better suited to express the expected output of a `Node` given a sequence of events, without having to care about the details of the state.
Writing the NodeSpec test was not straightforward but led us to uncover an issue:
- If a node receives a ReqTx while a snapshot is being acknowledged, but before it's confirmed, and this node would be the next leader, then we should trigger a snapshot emission otherwise we run the risk of losing the transaction if no other tx is submitted => no snapshot is triggered until another transaction appears.
- We added a unit test in `HeadLogic` to ensure a leader emits a `ReqSn` when its turn comes
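To make the issue above more concrete, here is a rough sketch of the decision a node could take whenever it checks whether to request a snapshot. Everything in it is an assumption made for illustration: the names (`isLeader`, `Decision`, `decideSnapshot`), the round-robin formula and the representation of state as plain numbers are not taken from `HeadLogic`.

```haskell
-- | Round-robin leadership: the party at (0-based) index @i@ among @n@
-- parties leads snapshot number @sn@ when @sn `mod` n == i@.
-- The exact formula is an assumption of this sketch.
isLeader :: Int -> Int -> Int -> Bool
isLeader numParties partyIndex sn = sn `mod` numParties == partyIndex

-- | Hypothetical decisions for the snapshot check.
data Decision
  = EmitReqSn Int -- request the snapshot with this number now
  | WaitForConfirmation -- our turn, but re-check once the current snapshot is confirmed
  | NothingToDo -- not our turn, or nothing to snapshot
  deriving (Eq, Show)

-- | Decide given the number of parties, our index, the latest snapshot
-- number seen so far, whether a snapshot is still being acknowledged, and
-- how many seen transactions are not yet part of a snapshot.
decideSnapshot :: Int -> Int -> Int -> Bool -> Int -> Decision
decideSnapshot numParties partyIndex latestSn inFlight pendingTxs
  | not (isLeader numParties partyIndex nextSn) = NothingToDo
  | pendingTxs == 0 = NothingToDo
  | inFlight = WaitForConfirmation
  | otherwise = EmitReqSn nextSn
 where
  nextSn = latestSn + 1
```

The `WaitForConfirmation` case is the one that matters here: the check has to run again when the in-flight snapshot gets confirmed, not only when another `ReqTx` arrives, otherwise the pending transaction stays unconfirmed.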
Discussing PRs in flight:
- https://github.com/input-output-hk/hydra-poc/pull/69:
  - Need to move `Enveloppe` type to Logging module as its use in tests makes api doc and code inconsistent
  - There's something fishy going on with the tests as they should not pass because we don't have the files in the data-files, plus there's a `namespace` which should be used
- Implement ADTArbitrary as orphan instances in tests to make sure we cover constructors in aeson's roundtrip tests
- https://github.com/input-output-hk/hydra-poc/pull/70 => merging, using gnuplot is fine and simple enough
  - SN made some changes to make the script more portable => use `/usr/bin/env` to find `bash` executable
- https://github.com/input-output-hk/hydra-poc/pull/72
  - Discussion about the use of `contestationDeadline` in the `OnCloseTx`
  - Seems like we need the deadline anyway in various places, not only in the client
  - We store the "on-chain" transactions in the mock chain because we want to calculate time at posting time and not at consumption time
Trying to display the IP address of connected hosts in the TUI
- The `Heartbeat` answers a `Party` but we really need a `Host`. We can simply encode the `Host` in the `Data` constructor of the `Heartbeat`.
- Got into troubles with the APISpec saying `Committed` is wrong -> we had a comment saying it should be a `Party` but it was really a `Peer`
Demoing the TUI:
- Mock chain is confusing name as it's already used by Plutus -> Stub chain or Proxy chain
- Make it clear what the limits of the demo are, what's available or not (crypto primitives, main chain, contracts...)
- Would be great to have PAB with (actual) mock-chain as its release is due mid-Sep but seems unrealistic
Demoing the benchmark:
- Does not work out-of-the-box as we made some breaking changes
Puzzled by the behaviour of the `APISpec` and `LoggingSpec`, esp. how the `namespace` is used to check some properties
The `classify` function only works on a specific structure, namely one where we have the following tree from the root:
properties:
  <namespace>:
    type: array
    items:
      oneOf:
        - title: <property>
          ...
but the `utxo` and `txs` are defined as:
properties:
  utxo:
    type: array
    items:
      $ref: "#/definitions/Utxo"
  txs:
    type: array
    items:
      $ref: "#/definitions/Transaction"
and of course `logs` is not defined anywhere.
The intent of the property is pretty clear, namely to check the completeness of the specification against generated values, but it is very inflexible, tied to the precise structure of the `api.yaml` file and not suited for anything but having top-level properties with a specific structure
Rewriting the property to accept some arbitrary selector which makes it possible to adapt to specific structure of schema and tested data type.
- Current goal: be able to iterate the full life-cycle of a Head
- but keep commands static and only later make the client aware which is possible when.
- Committing some value in Hydra-TUI could act as some kind of "faucet", but I opted for simply committing `mempty`
- Maybe we could have a brick dialog to ask users how much ADA (or other assets) they would like to commit?
- Adding command and server output handlers was really easy and quickly done. Although I refrained from rendering `Utxo` sets.
- When head was closed, client does not really know when the contestation period ends and this felt very unresponsive
  - The `ServerOutput` should provide a point in time when this (roughly) ends
  - The UI can then show a countdown or so
  - So when the `HeadIsClosed` should hold a `contestationDeadline`, the `OnChainTx` handling in `Hydra.HeadLogic` needs to know the current time.. or is given the deadline as well.
  - The latter seems to be easier as the chain client would also know best about "what time really means" on the respective blockchain
...And then add some tests for it
- Got failures in the APISpec tests, unsurprisingly. Seems like `AuxiliaryData` produces a `None` when not present which is unexpected?
- Do `StrictMaybe` fields whose value is `SNothing` generate `None` instead of `null`? Strict maybe's `ToJSON` instance is defined here: https://github.com/input-output-hk/cardano-base/blob/eb58eebc16ee898980c83bc325ab37a2c77b2414/strict-containers/src/Data/Maybe/Strict.hs#L91 and it's defined in terms of `Maybe`'s instance which must produce a `null` if `Nothing`: https://github.com/haskell/aeson/blob/master/src/Data/Aeson/Types/ToJSON.hs#L1244
- Trying to generate a transaction and check manually the validity against api.json => Surprisingly, generated transaction is valid against schema. Trying to generate more but if I try to validate the generated tx from `encode` it works fine.
- Trying to save the input file to see if there's a discrepancy. The list of `CardanoTx` in the temporary directory is empty, seems like its content is not correctly updated upon shrinks perhaps?
- Saw https://github.com/Julian/jsonschema/issues/623: Json schema outputs the cryptic `None: None is not of type 'string'` when a `string` field has a `null` value, which is really not obvious from the output. Seems like a PR fixed it but unsure if we have it in our version
- => Found the solution to allow `null` values for `auxiliaryData` and `auxiliaryDataHash`:
  - For the hash, it's simply an enumeration of possible types:
        auxiliaryDataHash:
          type: [ "string", "null"]
          description: >-
            Hex-encoding of the hash of auxiliary data section of the transactions.
          examples:
            - "9b258583229a324c3021d036e83f3c1e69ca4a586a91fad0bc9e4ce79f7411e0"
  - For the data, I had to resort to use the `oneOf` keyword to either have a `Cbor` value or `null`:
        auxiliaryData:
          description: >-
            Hex-encoding of CBOR encoding of auxiliary data attached to this transaction. Can be null if there's no auxiliary data
          oneOf:
            - type: "null"
            - $ref: "#/definitions/Cbor"

  Got bitten by the fact jsonschema is implemented in Python and actually relies on a mapping between the JSON schema specification and the Python type system. The value `null` is mapped to the value and type (?) `None` in Python, leading to some cryptic error messages. I am mildly convinced by the model-first approach especially if the tooling trips us. Also, tests are somewhat intricate as we need to pass through a layer of transformation from YAML to JSON, then call an external process to validate a schema.
- While working on adding a validation test of log entries against JSON schema, I am hitting a snag: Importing both the `Logging` module and the `Cardano` module leads to conflicting JSON instances on `UTxO`, which has a JSON instance defined in `Cardano.Api.Orphans`.
Instead of custom types for string encoding, we should use media-types to represent various encoded pieces of data: https://json-schema.org/understanding-json-schema/reference/non_json_data.html
Start documenting in more detail the structure of the Cardano transactions as exposed by Hydra node API.
- Got a bit puzzled by how to represent dynamic keys which are needed for assets' representation.
Playing with better formatting of errors in the benchmark, using hspec => We can use `runSpec` to run the bench, making it a `Spec` simply using `it`
Goals for today:
- Validate `NewTx` against confirmed snapshot and not submit `ReqTx` if it fails
- Let the client (benchmark) handle resubmission
We simply drop the transaction if it cannot be submitted by the benchmark => if it happens early then a lot of transactions will be dropped later
- We see the error message for the TxInvalid and the benchmark keeps running but we don't see any snapshot confirmed
- Node 1 sends an `AckSn` for its signature but it does not get processed. Seems like we don't process our own `ReqSn`??
- We need to improve our tooling for exploring the logs
Trying to modify wait so that we don't throw away messages => we can simply consume messages and dump them
- We still don't see snapshot confirmed messages
- Just happens we forgot to loop in the `TxInvalid` case 🤦
Benchmark succeeds but only 16 out of 526 transactions succeeded
- When having an InvalidTx we simply resubmit it. Resubmitting transaction immediately hits hard on the node, so trying to increase the delay between initial submission => much better, see a lot of snapshots
We managed to run benchmark to completion with all the transactions by delaying submission time => now plotting and interpreting the results
Week's progress:
- We are getting closer to a real ledger, no real crypto but a real ledger
- It's not about TPS but about latency => we need to plot distribution of latencies providing some kind of guarantees
- We also want to test with more nodes, seeing how the cluster behaves with more participants
- Load testing = saturate resources (CPU/Memory) and observe response time => need to be able to tune throughput to saturate the nodes
- Need to trim down the logs:
- remove some network logs which are very verbose ? => need to confirm if the network logs are actually a problem
- do not log full event/effect, log the end events/effects using ids
- Logs are written in a tmpfs now, we should parameterize it to be able to store more of them. tmpfs is limited in size. Later on, use some cloud storage or log ingestion system
There are infinitely many possibilities with the logs, what do we really need now?
- Confirmation of simulation?
- Is latency increasing when adding more nodes in an exponential/quadratic way?
- Keep a transaction set around that we can use as reference, rather than generating one on the fly every time. We need 2 different tools, we can have 2-3 different scenarios to benchmark
- Extension to load tester: make the number of nodes dynamic, submit transactions to multiple nodes instead of a single one
- We also want to check CPU/RAM load of each node to ensure they are saturated (also network bandwidth?)
- Do some cleanup work and make tests green again
- Debugging APISpec failures is somehow possible by temporarily adding more specs for sub-types and corresponding top-level properties to the schema, e.g. "utxo"
specify "Utxo" $ \(specs, tmp) ->
  property $ prop_validateToJSON @(Utxo CardanoTx) specs "utxo" (tmp </> "Utxo")
and
utxo:
  type: array
  items:
    $ref: "#/definitions/Utxo"
  additionalItems: false
- I kept these ☝️ entries for `Utxo CardanoTx` and `CardanoTx` to differentiate test failures
- Realized that a `HUnitFailure` is not properly formatted in `BehaviorSpec` -> red bin
- Found the bug in `NewTx` for the failing `BehaviorSpec`:
case canApply ledger utxo tx of
  Valid -> [ClientEffect $ TxValid tx, NetworkEffect $ ReqTx party tx]
  Invalid err -> [ClientEffect $ TxInvalid{utxo = utxo, transaction = tx, validationError = err}]
We had been validating against the confirmed ledger, but not reporting it being invalid using the seenUTxo, so the expectation was wrong.
This now also requires the test to wait for a `SnapshotConfirmed` before re-submitting the `secondTx`.
And the benchmark should likely do the same.
- ToJSON should not contain empty objects, e.g. assets in a value
  - We did remove it for the `assets`, but there are others
  - Maybe tackle this when also documenting the API format for txs
- Is the benchmark really a load-test?
  - Using `hspec`/`runSpec` would also handle and render `HUnitFailures` properly
  - This turns out to be quite simple: `runSpec (it "some context" action) defaultConfig >>= evaluateSummary` (although this kills the process)
- Created a log capture template to easily capture entries like this
(setq org-capture-templates
'(("l" "Log" entry
(file+headline org-default-notes-file "Log")
"* %? %T\n%a"))
;; other templates
)
We start writing a "missing" unit test for broadcast-to-self in NodeSpec but we realise this is not possible as it's not directly observable -> just remove the comment about the implementation details and rely on indirect observations
We switch to complete serialisation of cardano transactions, working on adding minted values to the JSON format
- Added `mint` but it seems transactions are not generated with minted value in `genTx` for Mary => check what's going on to Alonzo
Running the benchmark we got errors in the validation of transactions. First error is about wrong script witness, then other errors about invalid key witnesses. Some transactions are valid, and we see 18 being processed as `TxSeen`, with 12 reported as `TxValid`
Error reporting in the benchmark is painful:
- we should stop as soon as we get a `TxInvalid` report
- We are missing some information when we get a validation error, namely the details of the transaction that failed and the UTxO set to which the transaction was applied -> add it to `TxInvalid` and then we can use it as regression tests when we get a failure
Adding unit test harvesting output from the Benchmark (need some love to be more usable), got the following failures:
ApplyTxError [UtxowFailure (MissingScriptWitnessesUTXOW (fromList [])),
UtxowFailure (UtxoFailure (ValueNotConservedUTxO (Value 55938162 (fromList [])) (Value 107981334 (fromList [])))),
UtxowFailure (UtxoFailure (BadInputsUTxO (fromList [TxInCompact (TxId {_unTxId = SafeHash \"d2635419a791eef0ba694bbcb66de7c7e76a865a493e7d2cc46f5c6b1ecb7b8d\"}) 3])))]")
`MissingScriptWitnessesUTXOW` shows an empty difference between needed and provided
Looking at how transactions are generated, we replace the property test on single transactions with one on a sequence of transactions -> The property fails, reusing the example to check why it fails
Trying to shrink the examples we have -> works for txs but not for utxos because the shrinking is not done in relationship with the UTXO
How can we generate a valid sequence of transactions that then fails to validate against the very same UTxO set used for generating the transactions? What is the shrinker for lists doing by default?
Answer: The reason applying several transactions at once vs. applying one by one fails is that we throw away the delegation pool state between each application when we generate them, we only keep the changed UTxO set. This is not the case in the `applyTxsTransition` which carries over both the UTxO set and the DPState, so transactions can now fail.
Not applying the transactions as a list but one by one works!
- Seems there's something wrong with the way we are applying the transactions to the ledger?
- => changing the interface of the `Ledger` to only apply one transaction at a time
- It's probable the failures we are seeing in the benchmark are caused by reordering of transactions?
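The discrepancy explained above can be reproduced with a toy ledger. This is a sketch under heavy assumptions: `LedgerState`, `Tx`, `applyTx` and friends are made up for illustration, and a single `registered` flag stands in for the delegation pool state; none of this is the real ledger API.

```haskell
import Control.Monad (foldM)

-- | Toy ledger state: unspent output ids plus a flag standing in for the
-- delegation/pool state (DPState).
data LedgerState = LedgerState
  { utxo :: [Int]
  , registered :: Bool
  }
  deriving (Eq, Show)

-- | Toy transaction: spends one output, produces one, may register a key.
data Tx = Tx
  { spends :: Int
  , produces :: Int
  , registers :: Bool
  }
  deriving (Eq, Show)

-- | Apply a single transaction, checking both parts of the state.
applyTx :: LedgerState -> Tx -> Either String LedgerState
applyTx st tx
  | spends tx `notElem` utxo st = Left ("bad input: " ++ show (spends tx))
  | registers tx && registered st = Left "already registered"
  | otherwise =
      Right
        ( st
            { utxo = produces tx : filter (/= spends tx) (utxo st)
            , registered = registered st || registers tx
            }
        )

-- | Apply a sequence while carrying over the whole state.
applyTxs :: LedgerState -> [Tx] -> Either String LedgerState
applyTxs = foldM applyTx

-- | Apply a sequence but reset the non-UTxO part after every step, the way
-- the generation kept only the changed UTxO set between applications.
applyForgetting :: LedgerState -> [Tx] -> Either String LedgerState
applyForgetting = foldM step
 where
  step st tx = fmap (\st' -> st'{registered = False}) (applyTx st tx)
```

With `[Tx 0 1 True, Tx 1 2 True]` starting from `LedgerState [0] False`, `applyForgetting` accepts both transactions while `applyTxs` rejects the second one, which mirrors a sequence being "valid" at generation time and invalid when applied as a whole.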
We see more `InvalidWitnessesUTXOW` failures, with a list of public keys
- Issue probably comes from serialisation of keys (and possibly script) witnesses, investigating from a failure, using the Haskell show instance to compare how it's serialised to JSON and back.
Managed to have a WitVKey constructed as:
key :: CryptoFailable (WitVKey 'Witness StandardCrypto)
key = do
  pubkey <- publicKey @ByteString "\150\f[\192l\179\v\136%\182%\137 \STX\215\229up\228$V\157?F\151i\236\144\SI;e\142"
  sig <- signature @ByteString "\160r\240\221\191\ACK\221*\193\178>\SUB\USL\252HAID0\DC1\NUL~\131\&0\DLEy\188\187\197u\236\&8\201\175aNK\150\141\224\190\EM\141\129\STX\155\231\226N'E\DLEZ\249\131,ao\156\156\CANA\t"
  pure $ Cardano.WitVKey (VKey $ VerKeyEd25519DSIGN pubkey) (SignedDSIGN $ SigEd25519DSIGN sig)
Writing a ToJSON/FromJSON instance for WitVKey, unpacking what we had in Witnesses before => The `WitVKey` is correctly encoded and decoded
Trying to chase the source of the error we are seeing from a failing transaction, deserialising the JSON witnesses and checking if they match the input transaction's => they do
However, this transaction contains minted values with a `ScriptHash` value as policy ID, could it be the case that we get a missing script witness because we don't pass down the mints in the body?
We are minting the value
mint = Value 0 (fromList [(PolicyID {policyID = ScriptHash "42c7a014a4cd5537f64e5ae8ec7349db3d8603e16765dc37f8fb6e67"},fromList [("yellow0",134392),("yellow5",368980)])])}
which matches a script hash provided as part of the witnesses. Could it be we get a witness error because there are too many witnesses? OR a script provided is not matched in the body of the transaction? => yes
- The verification of script hashes checks that all script witnesses are used, and all required scripts are present
- Added JSON instance for assets in `Value` so that we complete the TxBody, bar the PP updates
Still having an error in the benchmark
ApplyTxError [UtxowFailure (UtxoFailure (ValueNotConservedUTxO (Value 309051813 (fromList [])) (Value 333277734 (fromList [])))),UtxowFailure (UtxoFailure (BadInputsUTxO (fromList [TxInCompact (TxId {_unTxId = SafeHash \"5e2921b6a85257bcdb0f2c5e9d96f0e5ed7cf199a646ce4d5d8961fa939bb126\"}) 2])))]")
It's perfectly possible for a submitted transaction to not be applicable at `NewTx` time, but in the `HeadLogic` we still submit it as a `ReqTx` and report a `TxInvalid` to the client
- In the original paper, transactions are required to apply to the confirmed UTxO set before being propagated
- Changing the logic of `NewTx` to:
  - Validate transaction against confirmed set (from latest snapshot)
  - Not send a `ReqTx` if the transaction does not apply

The behaviorSpec test now fails, which is to be expected:
FailureException (HUnitFailure (Just (SrcLoc {srcLocPackage = "main", srcLocModule = "Hydra.BehaviorSpec", srcLocFile = "test/Hydra/BehaviorSpec.hs", srcLocStartLine = 248, srcLocStartCol = 15, srcLocEndLine = 248, srcLocEndCol = 31})) (Reason "Test timed out after 1s seconds"))
Two things for tomorrow/next session:
- Change in the HeadLogic and adapt the tests
- Adapt the benchmark to use hspec to run it so that we get better error reporting. It's not really a benchmark anyway, it's more a load test.
We should change the names of the `witnesses` fields:
- `scripts` is fine
- `addresses` -> `keys` (and break it down into a `vkey` and a `signature` part)
We discussed the perceived awkwardness of the `NodeSpec` test as it is now:
- We should test at the boundaries and stub the effects, no more, and use the same `createHydraNode` function for all tests
- This means we should move the `BroadcastToSelf` wrapper into the node and not configure it outside as it is an integral part of the behaviour of the node
- Alternative would be to bake reinjection of `Event` from `Effect` into the `HeadLogic` protocol itself
- Writing a test exposing `Wait` of some event: We inject out of order AckSn/ReqSn and expect to see our own AckSn. But we see the AckSn with a weird signature... => Refactoring Node code to have a dedicated createHydraNode function that does the wrapping
- There is a problem in our `createHydraNode` function: `withHeartbeat` and `withBroadcastToSelf` require and produce `NetworkComponents` which contain both sending and receiving part of the network. Solution is to refactor `createHydraNode` to `withHydraNode` as a with pattern => Let's take a step back and not focus too much on this refactoring => keep the test pending for now and refactor later in solo mode
We then turn our attention towards benchmark errors again:
- Fixing the test output to be less verbose so that we get better error reporting. Pondering if we should not write the messages into a file, but it makes things more complicated, going for the simple thing of truncating the list of messages when displaying the error
We were waiting for `snapshotConfirmed` and we changed the serialisation format to use `Generic` which emits `SnapshotConfirmed` with a capital S 🤦
Scaling bench again we have the timeout on waiting for confirmations again, but with more snapshots produced => Extracting the snapshot number from the confirmed ones and reporting it, instead of throwing an error and giving information, let's report progress from the benchmark
- We should use our `Tracer` to show progress in the benchmark
Investigating another failure, we see that node-3 gets a `ReqSn` that it drops because it is still processing another one. This case is actually incorrect, we should `Wait` if we get a `ReqSn` that could be valid in the future:
Writing a unit test in HeadLogic:
- Send 2 ReqSn in a row, should wait the 2nd one
- Receiving a ReqSn which is from the past should fail
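The two cases above can be summarised in a small classification function. This is only a sketch: `ReqSnOutcome` and `classifyReqSn` are hypothetical names, and the state is reduced to the last confirmed snapshot number plus a flag telling whether a snapshot is currently being signed, which is a simplification of the real head state.

```haskell
-- | Hypothetical outcomes when receiving a snapshot request.
data ReqSnOutcome
  = Process -- the request is the next expected snapshot, handle it now
  | Wait -- the request could become valid later, put it back in the queue
  | Reject -- the request is from the past, it can never become valid
  deriving (Eq, Show)

-- | Classify a request for snapshot number @requestedSn@, given the last
-- confirmed snapshot number and whether another snapshot is in flight.
classifyReqSn :: Int -> Bool -> Int -> ReqSnOutcome
classifyReqSn confirmedSn inFlight requestedSn
  | requestedSn <= confirmedSn = Reject
  | inFlight = Wait
  | requestedSn == confirmedSn + 1 = Process
  | otherwise = Wait
```

For instance, with snapshot 3 confirmed and another one in flight, a request for snapshot 5 yields `Wait`, whereas a request for snapshot 2 yields `Reject` regardless of what is in flight.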
Added some tests to cover the case of snapshots "from the future", re-running the benchmarks works on AB's machine, yielding 50-70 snapshots whereas it fails on SN's machine which is faster
Node 1 ends up not emitting snapshots, seems like it is not emitting a network effect to send `ReqSn` message
Trying to drop the `null (seenTxs )` condition works => the condition should be at the level of the guard so that we don't "consume" the `DoSnapshot` event without either waiting or emitting a `ReqSn`
- There is a property waiting to be written there, expressing snapshot strategy invariants in terms of variation of state/sequence of events.
Benchmark now runs to completion without failing 🎉
Discussing the snapshot strategy as it's getting somewhat cumbersome now
(Cont'ed work on Tx generator)
Setting most values to 0 led to an error in the generator, with all frequencies set to 0. `QC.frequency` is used in different sections of the generator:
- To generate credentials registration => some `XXCred` must be non 0
- To generate delegation => `frequencyKeyCredDelegation` or `frequencyScriptCredDelegation` must be non 0, or there must not exist a stake pool to delegate to (in `DPState`?)
- To generate withdrawals => we use defaults so should be fine?
Got another error now:
ApplyTxError [UtxowFailure (MissingScriptWitnessesUTXOW
So the transactions are generated sometimes with script addresses which obviously require a script witness, which we simply drop when creating the witnesses... => Going to add handling of script witnesses in the `CardanoWitness` data structure
- The script witnesses field is actually a `Map ScriptHash Script`, trying to serialise it as an object? Interestingly there's a `ToJSON` instance for `ScriptHash` but no `ToJSONKey` which is somewhat sad
- => Added scripts witnesses to the `ToJSON` instance for witnesses, and `ToJSONKey/FromJSONKey` instances for `ScriptHash`, now testing to see if I get failures in golden and roundtrip for witnesses, which should be the case...
- Added JSON instances for `Timelock` so that the JSON instance for witnesses is simpler
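For reference, a `ToJSONKey` instance of this kind can be sketched with aeson's `toJSONKeyText`. The `ScriptHashHex` newtype and `encodeScripts` below are hypothetical stand-ins (the real `ScriptHash` comes from the ledger), and scripts are assumed to be already hex-encoded text.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Aeson (ToJSON (..), ToJSONKey (..), encode)
import Data.Aeson.Types (toJSONKeyText)
import Data.ByteString.Lazy (ByteString)
import Data.Map (Map)
import qualified Data.Map as Map
import Data.Text (Text)

-- | Hypothetical stand-in for the ledger's script hash: just its hex text.
newtype ScriptHashHex = ScriptHashHex Text
  deriving (Eq, Ord, Show)

instance ToJSON ScriptHashHex where
  toJSON (ScriptHashHex hex) = toJSON hex

-- | This instance is what lets a 'Map' keyed by script hashes be rendered
-- as a JSON object, with the hex text used as the object's keys.
instance ToJSONKey ScriptHashHex where
  toJSONKey = toJSONKeyText (\(ScriptHashHex hex) -> hex)

-- | Encode script witnesses (hash to hex-encoded script) as a JSON object.
encodeScripts :: Map ScriptHashHex Text -> ByteString
encodeScripts = encode

-- | A one-entry example map, using a shortened made-up hash and script.
exampleScripts :: Map ScriptHashHex Text
exampleScripts = Map.fromList [(ScriptHashHex "42c7", "8201")]
```

Decoding the same object back additionally needs a `FromJSONKey` instance, which is why both instances were added together.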
All serialisation tests now pass, now having an even more complex error with several issues:
Left (ValidationError {reason = "ApplyTxError [
UtxowFailure (MissingScriptWitnessesUTXOW (fromList [])),
UtxowFailure (InvalidWitnessesUTXOW [VKey (VerKeyEd25519DSIGN (PublicKey \"\\177g\\EM@R\\DC2\\251\\129\\GS\\175\\211+t\\146\\161\\205\\174\\138\\247\\154S\\244>\\r\\f%\\195U\\141\\166\\234\\&9\")),VKey (VerKeyEd25519DSIGN (PublicKey \"\\252]=\\212&\\139\\138\\240\\185\\ESC\\185\\GS\\186Dk\\164\\ESC`\\249I\\186\\163\\224K\\r\\SI\\192KT\\204\\160\\SO\")),VKey (VerKeyEd25519DSIGN (PublicKey \"\\150\\f[\\192l\\179\\v\\136%\\182%\\137 \\STX\\215\\229up\\228$V\\157?F\\151i\\236\\144\\SI;e\\142\")),VKey (VerKeyEd25519DSIGN (PublicKey \"o9\\174\\133\\251V\\252\\247\\210j\\187\\DC4\\178\\223@\\225\\182\\&9\\148\\a\\229\\\"4{\\185XR\\210<\\245\\154\\255\")),VKey (VerKeyEd25519DSIGN (PublicKey \"J\\219\\247.<\\203\\238\\216\\162\\EMhY{\\ESCk#\\214\\155\\170\\206J\\210\\FS\\206\\130\\209\\158s\\255\\&4\\255\\ETB\")),VKey (VerKeyEd25519DSIGN (PublicKey \"\\EOT0\\ETBo\\183\\n\\138\\182\\143\\192#\\172\\183\\243\\245\\215Sp\\201\\220\\DLE)\\SYNQ\\167\\ETB\\251\\218e\\ETX\\132\\196\")),VKey (VerKeyEd25519DSIGN (PublicKey \"h\\239\\210sTVfp\\NAK2-\\130\\STX\\253a\\DC2\\209\\204n\\245\\188\\213\\138cG\\136\\186I\\r\\249\\173\\143\"))]),
UtxowFailure (UtxoFailure (ValueNotConservedUTxO (Value 230733318 (fromList [])) (Value 230733318 (fromList [(PolicyID {policyID = ScriptHash \"42c7a014a4cd5537f64e5ae8ec7349db3d8603e16765dc37f8fb6e67\"},fromList [(\"yellow1\",729252),(\"yellow2\",901652),(\"yellow4\",871114),(\"yellow5\",127109)])]))))]"})
The transaction is relatively small:
{
"witnesses": {
"scripts": {
"42c7a014a4cd5537f64e5ae8ec7349db3d8603e16765dc37f8fb6e67": "820181820181820180",
"a3e84983320841577ac20d77058e440d7fb7e17e98659e921b1274a3": "83030383820282820182820181820518208200581c733aea10df2a2bb1d3019a7337b240ad64a174c919fc034fb372fdc9820182820181820418208200581c733aea10df2a2bb1d3019a7337b240ad64a174c919fc034fb372fdc9820282820182820181820518598200581c571758200680b643781738e0436291811be83c1707fc66edd4982b0e820182820181820418598200581c571758200680b643781738e0436291811be83c1707fc66edd4982b0e820282820182820181820518358200581cb9acb4b5682ddb6980f2471bbd13a3765e54d79ebf46417c850a609c820182820181820418358200581cb9acb4b5682ddb6980f2471bbd13a3765e54d79ebf46417c850a609c"
},
"addresses": [
"8200825820b16719405212fb811dafd32b7492a1cdae8af79a53f43e0d0c25c3558da6ea395840b8de36f9836332743d8068478fd5a1e93aeff12dfade0dedf86c74a252e23c1f7903b81d43a6a8b21e42b08fb531c2e9f6e78080aa71bf234e5117a7a1328a0f",
"8200825820fc5d3dd4268b8af0b91bb91dba446ba41b60f949baa3e04b0d0fc04b54cca00e5840ff802a5358b84e2110f981a697b60141dce0925f251a954c3c04877ea061083b9ddbfd40ce96a72e3600950bd4b866a49965480d70f0f45e186f8bcc8f9d130d",
"8200825820960c5bc06cb30b8825b625892002d7e57570e424569d3f469769ec900f3b658e5840cb9aa3267c3c7a05aabb7e3f57b62b238590922ff9e2b4d5965d4eab5a3fee92ccb1fd095c87ee7685f18a8704b04234ba56236adfb037ea157988aa8605e902",
"82008258206f39ae85fb56fcf7d26abb14b2df40e1b6399407e522347bb95852d23cf59aff58402e639dd813c0a9879366f7c9491ea95d70134be90687b7687308551488a556811c4fc0aced07af841f2e5cc0248af747cf3dfd506d5158d71592878800bce709",
"82008258204adbf72e3ccbeed8a21968597b1b6b23d69baace4ad21cce82d19e73ff34ff1758404c94c33d62fd704369c110cd010a4d6ea04eb002ae6c3fbdadb0d62a84b03c0e2be84696631d82e82eb3deec595c72e5b2d810f3a95909058bc2e82549a0a104",
"82008258200430176fb70a8ab68fc023acb7f3f5d75370c9dc10291651a717fbda650384c45840135bcc2f58a21a9273e8e7ca481a744aefadf12b48e9e8b5b3e5e6820e04bf74113d696cd45f5b5d17ade8d23e38522902dd4463852d17c9ce4c818e61c2c107",
"820082582068efd2735456667015322d8202fd6112d1cc6ef5bcd58a634788ba490df9ad8f5840fa9f0adb12e0ccea8ef31c656af30b473334c026228f223940e08f2ec344c6c9cc0d3a9b4ba0f597f79fb84bb885fd59089fc12890c3563b96a4121e68fb2701"
]
},
"body": {
"outputs": [
{
"address": "addr_test1qqzfllufs42yh9tz3j5zeqeh8v789hvzvz57kd4n5xez0pnc0554t3aspn7xrc0ekfq7he5gwwx935kc8yzx00znxr3shu4jta",
"value": {
"lovelace": 46146663
}
},
{
"address": "addr_test1qryc674js99w50kjf30heds8eqqe0vre3d8487swgrmd7q5a8uwp3k06h9vg32z7lrnzjvpey9eymx7zq8atvz755sjqcguqss",
"value": {
"lovelace": 46146663
}
},
{
"address": "addr_test1qz2y2kkjyhz4d5957msrgtesv85aerhynpgp29fnrxq676zpgz2ae0fmt5rr6kzr97c4e0qg8jvvnx4ktjnlx7unu4wsaad457",
"value": {
"lovelace": 46146663
}
},
{
"address": "addr_test1xzd9pvulknjv8x5fq6d9pluz3kccxw3l8rzvjk4tzdndug2900ddnvzdkregx3scav6qjjc0vq0l9apfh9sd8983zcesws0shy",
"value": {
"lovelace": 46146663
}
},
{
"address": "addr_test1xqs6qrhy9hu77wms0xcryy9tcnv32340gy63ejdz7zxeqj8kf86jrtuhuy6recsnpsn9gfen2u6uueqdljnsqlvu5kpsaldxc9",
"value": {
"lovelace": 46146663
}
},
{
"address": "addr_test1qqh8gkdmj6d8exd4tq65hql93dclhkfamh7dgr6etf8l4k5akymz4y4zwhufvwrymmy08acmy2ujkllln7jcs43k87ys04zqf8",
"value": {
"lovelace": 3
}
}
],
"inputs": [
"03170a2e7597b7b7e3d84c05391d139a62b157e78786d8c082f29dcf4c111314#20",
"03170a2e7597b7b7e3d84c05391d139a62b157e78786d8c082f29dcf4c111314#58",
"03170a2e7597b7b7e3d84c05391d139a62b157e78786d8c082f29dcf4c111314#61",
"03170a2e7597b7b7e3d84c05391d139a62b157e78786d8c082f29dcf4c111314#80"
]
},
"id": "f207b15e2b5691ce237d0b28e6e26cfc5f933281eb16c363458652d663b3dd29",
"auxiliaryData": null
}
So it is obviously missing outputs for the non-ADA tokens => We drop the mint
field, and a lot of other fields, from the generated body, which is a remnant of a previous attempt at dumbing down the Body. Just passing the body encoded as is should hopefully be fine, except that we are probably dropping a lot of fields in the serialisation process => The test for applying transactions succeeds, now going to wire that into the benchmark
ETE test is now failing which is expected as the format has changed.
- Network is wrong, should unify to
TestNet
for all components. Funnily, the Testnet
data constructor from cardano-api is different from the one in the Shelley ledger as it takes a Magic
argument, but in the conversion process it is simply dropped.
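A minimal sketch of that lossy conversion, using hypothetical stand-in types (`ApiNetworkId` and `LedgerNetwork` are illustrative names, not the real cardano-api or ledger definitions): the magic carried by the API-side constructor has no counterpart on the ledger side, so it is simply discarded.

```haskell
import Data.Word (Word32)

-- Hypothetical stand-in for the API-side network id: Testnet carries a magic.
data ApiNetworkId = ApiMainnet | ApiTestnet Word32
  deriving (Eq, Show)

-- Hypothetical stand-in for the ledger-side network: no magic at all.
data LedgerNetwork = Testnet | Mainnet
  deriving (Eq, Show)

-- The conversion is total but lossy: two different magics map to the same value.
toLedgerNetwork :: ApiNetworkId -> LedgerNetwork
toLedgerNetwork ApiMainnet = Mainnet
toLedgerNetwork (ApiTestnet _magic) = Testnet
```

Two test networks with different magics become indistinguishable after conversion, which is why all components need to agree on the network up front.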
As expected, when converting the benchmark to use CardanoTx, it fails to validate the transactions emitted because our serialisation is missing quite a lot of fields from the TxBody. We should either filter those, or complete the serialisation to handle more body fields => Covering JSON serialisation of missing fields in TX, in order to ensure we can properly encode/decode all kind of transactions, we'll deal with rejecting irrelevant transactions later on.
Continued working on generating transactions and checking roundtrip/golden serialisation.
Need to tweak the max transaction size parameter to find the right one:
- Default maximum tx size is set to 2048 (bytes?): https://github.com/input-output-hk/cardano-ledger-specs/blob/nil/shelley/chain-and-ledger/executable-spec/src/Shelley/Spec/Ledger/PParams.hs#L320 Setting it to 1MB for the moment
Got a different error this time:
ApplyTxError [UtxowFailure (UtxoFailure (UpdateFailure (NonGenesisUpdatePPUP (fromList [KeyHas
h \"099f27f2d9bc901017518ee78b9b12a52ce658142e255666e2ce0b9d\",KeyHash \"859e3a86e34626df256a84ee03d813819aa731e854b6e4034e7024e0\",KeyHash \"a
01f063c96ada95334fcdc7beb3a8fb2d0ff4ee8d206be17fa1becae\",KeyHash \"a94e6fffab278ffef8092918bc3ae6ac47d3cf8d9f4b923ecbfd8236\",KeyHash \"f90e54
1ed22c517263ab0885721c02f08a313b21de009efd3672afed\"]) (fromList []))))]"})
Trying to strip the update-related parts from the generated transactions' bodies, but it turns out it's not as simple as that: simply stripping the parts of the body we are not interested in leads to more errors:
[UtxowFailure (InvalidWitnessesUTXOW [VKey (VerKeyEd25519DSIGN (PublicKey \"J\\21
9\\247.<\\203\\238\\216\\162\\EMhY{\\ESCk#\\214\\155\\170\\206J\\210\\FS\\206\\130\\209\\158s\\255\\&4\\255\\ETB\")),VKey (VerKeyEd25519DSIGN (
PublicKey \"h\\239\\210sTVfp\\NAK2-\\130\\STX\\253a\\DC2\\209\\204n\\245\\188\\213\\138cG\\136\\186I\\r\\249\\173\\143\"))]),UtxowFailure (Miss
ingTxBodyMetadataHash (AuxiliaryDataHash {unsafeAuxiliaryDataHash = SafeHash \"8493f75f77f6f02b5342998e180be02fa132c26fba140bdb51026d9f1a2f6bce
\"}))]"})
As pointed out by Jared, we can tweak the generator by setting various parameters in the Constants argument
Reviewing TUI code written by SN
Trying to fix NodeSpec
test by adding a Wait
to replace error
-> of course, unit tests pass but the benchmark still fails. Note we have to revert to using SimpleTx
in node and mock-chain
- One problem is that we can emit 2 times the same snapshot
Writing a NodeSpec
test checking that we do not emit ReqSn
twice for the same snapshot
- We don't see any
ReqSn
after injecting a bunch of txs -> node is not the leader
- Also we do not handle effects, so we want to create the node with a list of events to prepopulate the queue and then process events until completion or quiescence, e.g. when the queue is empty
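The "process events until quiescence" idea can be sketched as a pure loop. This is only an illustration under assumptions: `runToQuiescence` and the shape of the step function are hypothetical, not the actual node code, and effects are modelled simply as follow-up events appended to the queue.

```haskell
-- Run a step function over a prepopulated queue of events until the queue
-- is empty. Each step returns the new state plus any follow-up events,
-- which are appended to the back of the queue.
-- Note: this only terminates if the step function eventually stops
-- producing follow-up events.
runToQuiescence :: (s -> e -> (s, [e])) -> s -> [e] -> s
runToQuiescence _ state [] = state
runToQuiescence step state (event : rest) =
  let (state', followUps) = step state event
   in runToQuiescence step state' (rest ++ followUps)
```

A test can then create the node state, inject a list of transactions as events, and assert on the final state without any threads or timing involved.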
Starting at ed8eeaba94efffca8596e2339e03b1852d3ce4aa BehaviorSpec
tests hang:
-
We spent some time troubleshooting an issue which ended up being caused by the following code:
runHydraNode :: Tracer m (HydraNodeLog tx) -> HydraNode tx m -> m ()
runHydraNode tracer node = forever . stepHydraNode
It happens that
forever
has type Applicative f => f a -> f b
, and ((->) a)
is actually an Applicative
so forever would endlessly evaluate a thunk which reduces to a function which cannot be evaluated further, which locks the process.
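To make the pitfall concrete, here is a small self-contained reproduction. The names `step`, `runBuggy` and `runFixed` are hypothetical stand-ins for the node code; `step` counts invocations and throws at a limit only so that the fixed variant can be observed to terminate.

```haskell
import Control.Exception (ErrorCall (..), throwIO)
import Control.Monad (forever, when)
import Data.IORef (IORef, atomicModifyIORef')

-- Hypothetical stand-in for a two-argument step function: it bumps a
-- counter and aborts with an exception once the limit is reached.
step :: IORef Int -> Int -> IO ()
step counter limit = do
  n <- atomicModifyIORef' counter (\k -> (k + 1, k + 1))
  when (n >= limit) $ throwIO (ErrorCall "limit reached")

-- Type-checks, but `forever` is instantiated at the function Applicative
-- ((->) Int), not at IO: `forever (step counter)` loops on a function.
-- Calling this diverges (it spins or is reported as <<loop>>) without
-- ever running a single step.
runBuggy :: IORef Int -> Int -> IO ()
runBuggy = forever . step

-- Fully applying the step first makes `forever` run in IO as intended.
runFixed :: IORef Int -> Int -> IO ()
runFixed counter limit = forever (step counter limit)
```

The compiler cannot help here because both definitions have the same type; only the instance picked for `forever` differs.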
Ended up having a finer grained description of the SeenSnapshot
than a Maybe
to distinguish the situation from the point of view of the leader and the followers so that we don't make too many snapshots.
There is an interesting micro-pattern here that also is prominent in cardano API which is to not have Maybe blindness: Use expressive and "domain relevant" ADTs to express the state.
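As an illustration of avoiding Maybe blindness, the sketch below contrasts with a plain `Maybe Snapshot`. The constructors and fields are assumptions made for this example and are not necessarily what the node defines.

```haskell
import Numeric.Natural (Natural)

-- Hypothetical, domain-specific replacement for `Maybe snapshot`:
-- each case names the situation instead of hiding it behind Nothing/Just.
data SeenSnapshot
  = NoSeenSnapshot
  -- ^ No snapshot is in flight.
  | RequestedSnapshot Natural
  -- ^ The leader asked for this snapshot number but has not seen it yet.
  | SeenSnapshot Natural [String]
  -- ^ The snapshot was seen; the list holds parties that signed so far.
  deriving (Eq, Show)

-- A leader should only request a new snapshot when none is in flight,
-- which prevents emitting a request twice for the same snapshot.
canRequestSnapshot :: SeenSnapshot -> Bool
canRequestSnapshot NoSeenSnapshot = True
canRequestSnapshot _ = False
```

With `Maybe`, the "requested but not yet seen" case would have to be encoded elsewhere or not at all, which is exactly the distinction between leader and followers that was missing.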
Re-running the benchmark (replacing ledger type) still fails in the followers with an out-of-order ReqSn
but it fails much later, like in snapshot number 39 which is some progress: We get ReqSn
for 40 but we are still processing 39 in node 2. We don't have any InvalidEvent
in the node 3 though.
-
Added command line parameter to decide to which node to connect using
--connect
- Uses the
Host
type of hydra-node
module Hydra.Network
- Needed to add more details to
ClientJSONDecodeError
to realize that the hydra-nodes in docker containers were still using the previous API JSON instances
- Rebuilt the docker images and re-started nodes using
docker-compose build
and docker-compose up -d
in demo/
-
Start adding commands to drive the Head lifecycle
- Rendering
[i]nit
and handling the KeyEvent
is quite easy with brick
- Now the tricky part is implementing
Client{sendInput}
for a not necessarily connected websocket
-
Switching to
CardanoTx
and adding [a]bort
was a breeze
-
I would like to test
handleEvent
as it gets quite complex, but I don't know how?
-
At some point I realized that the
hydra-node
containers are all at full CPU utilization - are we busy looping?
- Yes, in the recently introduced logging rewrite when flushing the queue: https://github.com/input-output-hk/hydra-poc/pull/63
Small note on the APISpec
which tests our types against api.yaml
:
test/Hydra/APISpec.hs:35:7:
1) Hydra.API, Validate JSON representations with API specification, ServerOutput
Assertion failed (after 1 test and 7 shrinks):
[PeerDisconnected {peer = VerKeyMockDSIGN 0}]
{'peer': 0, 'tag': 'PeerDisconnected'}: {'peer': 0, 'tag': 'PeerDisconnected'} is valid under each of {'required': ['tag', 'peer'], 'title': 'PeerDisconnected', 'properties': {'peer': {'$ref': '#/definitions/Peer'}, 'tag': {'enum': ['PeerDisconnected'], 'type': 'string'}}, 'type': 'object'}, {'required': ['tag', 'peer'], 'title': 'PeerConnected', 'properties': {'peer': {'$ref': '#/definitions/Peer'}, 'output': {'enum': ['PeerConnected'], 'type': 'string'}}, 'type': 'object'}
This does not necessarily mean that PeerDisconnected
is implemented wrongly; in this case the culprit was indeed the specification in api.yaml
of PeerConnected
!
It's a bit confusing that no PeerConnected
was in the failing list, although this might come from the batch-wise invocation of jsonschema
and shrinking?
Discussing strategies and options for roadmap of Hydra. Could it be interesting to frame this using Real Options?
Merging PR about bech32 addresses, seems like TH is a bit overkill but OTOH it's safer and less ugly than handling a Left
impossible. Also, handling of various addr
types is cumbersome and could be removed if we used cardano-api's functions -> put a red bin to refactor that later
Got a failure in the mock-chain serialisation of UTxO: It's because the mock-chain is using SimpleTx
and not CardanoTx
. Could it be made more polymorphic and agnostic in the type of transactions it transports? => Not easily
We don't report the error to the InvalidInput
which is annoying -> moving the InvalidClientInput
wrapper to the HeadLogic
module to have it available there, we realise this representation is actually too complex and does not roundtrip properly in JSON. We fixed the encoding of invalid input as just Text
and display it to the end user (in ServerOutput
).
- Looks like we did not do the right thing the first time, namely making sure we are reporting error to the user properly
We now have a proper error message with ETE test: The transaction fails to be deserialised properly, and it points to the witnesses not being properly encoded.
- Writing unit test with the faulty transaction as JSON
- Comparing ETE encoding with what we have in the node, seems like the ETE is encoding the witness set as a CBOR list and not a list of CBOR?
- We encode a list of
KeyWitness
on the client side, which seems ok, but the encoding of KeyWitness
is weird: depending on the type of witness, it is encoded as a 2-element list with the first element being a discriminator. We should do something symmetric in the ToJSON/FromJSON in the Cardano ledger
Serialisation of TX is working fine, and now it fails on applying the transaction to the ledger: We'll reuse existing functions from MaryTest
- Previous code for applying transactions directly used
Cardano.Tx
but we have wrapped it in our type: we need to convert a CardanoTx
to a Cardano.Tx
- We hit the same problem AB had for tx generation about
TxBody
: The one available in the API is not the correct one -> Changing for the proper one in ShelleyMA
We see the transaction is sent and injected into the ledger but it is invalid: We need to improve the reporting of errors about invalidity of transactions
- Make
ValidationError
more verbose, which showed us our addresses were incorrect, namely we sent Mainnet
and receive Testnet
{"transaction":{"witnesses":["8200825820db995fe25169d141cab9bbba92baa01f9f2e1ece7df4cb2ac05190f37fcc1f9d58400599ccd0028389216631446cf0f9a4b095bbed03c25537595aa5a2e107e3704a55050c4ee5198a0aa9fc88007791ef9f3847cd96f3cb9a430d1c2d81c817480c"],"body":{"outputs":[{"address":"addr1vx35vu6aqmdw6uuc34gkpdymrpsd3lsuh6ffq6d9vja0s6spkenss","value":{"lovelace":14}}],"inputs":["9fdc525c20bc00d9dfa9d14904b65e01910c0dfe3bb39865523c1e20eaeb0903#0"]},"id":"56b519a4ca907303c09b92e731ad487136cffaac3bb5bbc4af94ab4561de66cc"},"output":"transactionInvalid","validationError":{"reason":"ApplyTxError [UtxowFailure (UtxoFailure (WrongNetwork Testnet (fromList [Addr Mainnet (KeyHashObj (KeyHash \"a346735d06daed73988d5160b49b1860d8fe1cbe929069a564baf86a\")) StakeRefNull])))]"}}
Node now fails because of missing ToCBOR/FromCBOR instances for CardanoTx
which prevents proper communication with other nodes.
- We realise our
CardanoTx
type is problematic as the TxId
should always stay in sync with the TxBody
-> remove it from the data structure and recompute it every time
- We got stuck in decoding the
Annotator TxBody
as there is no FromCBOR TxBody
instance: The FromCBOR
class provides a Decoder
but runAnnotator
requires access to the underlying ByteString
which is annoying and prevents us from using a FromCBOR (Annotator a)
within a FromCBOR
instance needing a FromCBOR a
.
- Getting dragged into the weeds of how transactions get serialised inside a node....
- Solution is to
decodeBytes
then use those bytes as input to the decodeAnnotator
function. We write the ToCBOR/FromCBOR
instances of CardanoTx
using the underlying FromCBOR (Annotator Tx)
instances and reconstruct the txId using the body
Finally got a green ETE test! 🎉
Implemented a basic Arbitrary
instance for CardanoTx
to have a proper roundtrip and golden test.
I have made the instance use a genCardanoTx
function that will come in handy once we want to generate sequences of valid transactions, for example in the benchmark or end-to-end tests.
Now adding a ToCBOR/FromCBOR roundtrip test for completeness' sake.
Got failures when trying to quickcheck application of transactions to the Cardano ledger as the generated transactions run on the Testnet
and not the Mainnet
which changes the addresses used. -> Settle on using Testnet
everywhere
Then another failure: The problem is that generated transactions are more complex than what we cope with in the ledger apparently.
- BTW, it's really not obvious how to meaningfully shrink a transaction!
- The generated transactions actually use the auxiliary data, so when we simply drop it in the generator we possibly make the transaction invalid -> Add the auxiliary data as a field to the
CardanoTx
so that we have all the information for a real transaction.
Looking at writing a generator for our CardanoTx
. There is a genTx
function provided in ledger-specs
: https://github.com/input-output-hk/cardano-ledger-specs/blob/master/shelley/chain-and-ledger/shelley-spec-ledger-test/src/Test/Shelley/Spec/Ledger/Generator/Utxo.hs#L103 which produces valid transactions. Trying to salvage work I have done before on model based tester
Follow-up on update/abort in PAB: We can't both wait for updates and expose endpoints because the update resumption always is done from the tip of the chain apparently, which means that by the time the abort is done, the update will miss the abort transaction. => activate 2 different contracts in 2 different threads
We add more tickets to the backlog on Miro, filling in some gaps we perceive in what's needed for the milestone. We also agree on making fewer tickets "ensemble-only" to allow team members to pick more stuff when working alone.
We end the session closing the loop with the "real" cardano ledger in the Head:
- AB had prepared To/FromJSON instances for most of our types, so we could start by wiring up the
Hydra.Ledger.Cardano
- When simply using
cardanoLedger
in hydra-node
, the new Tx
and associated Utxo
types were used
- We still had an error when deserializing a
Commit
client input and added the aeson error to the APIInvalidInput
error trace - Finally we realized that it was not parsing because the address format we used in our e2e fixture is using
bech32
, while we were still de-/serializing a raw hex serialization for the address
-
Two solutions had been researched, but not audited yet
- One is implemented now by Inigo
- Both would be transparently supported by the node (when verifying)
-
Some little addition required in libsodium to make it possible to construct a multisig signature so that, on the verifier side no change would be needed and classic Ed25519 verification can be used.
-
Generating and not reusing nonces is a vital part (of engineering)
-
While it works in practice at the moment, the theory behind the MuSig2 needs to be validated from a mathematical standpoint.
-
Next steps
- Have it theoretically defined -> will yield a formal definition
- Implementation / changes to libsodium are checked against that
-
Really the most complicated part is managing nonces
- To produce a partial signature the signing party needs to have the nonce for it from all other signers
- Signers need to produce those nonces and need to keep a state of already produced nonces because reusing a nonce means disclosing the private key. This implies keeping state on nodes for generated nonces
- Aggregation can be done by anyone, it does not handle any secret
- Aggregation of public keys and conversion from prime order group can be done once, at startup time
-
Other projects at IOHK will be using a non-interactive multi-signature scheme which requires changes on the verifier
- They will require a hard-fork then?
-
Code is located at https://github.com/input-output-hk/musig2, not production ready but something we can iterate on.
Then discussion about non-custodial head deployment model with some delegation, something like lightning with watchtowers. Conclusion is that it seems feasible as long as lawyers agree this is indeed a non-custodial solution.
There's no generally agreed upon JSON representation of a cardano transaction, which is somewhat annoying: In the cardano-api there are ToJSON
instances but no FromJSON
, and only for some parts of the API.
Switch to representing the utxo as a map from TxIn
to TxOut
instead of an object with a ref field and an output field: This means we will be encoding the TxIn
as it's done in the cardano-api, namely as a string with transaction id plus index.
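A rough sketch of that string form, assuming the transaction id and the output index are joined with `#` as in the inputs of the JSON transactions shown in this logbook. The `TxIn` record and both functions are simplified, hypothetical stand-ins (the id is kept as a plain string) and are not the cardano-api implementation.

```haskell
import Numeric.Natural (Natural)
import Text.Read (readMaybe)

-- Simplified stand-in: the transaction id is kept as its hex string.
data TxIn = TxIn {txId :: String, txIx :: Natural}
  deriving (Eq, Ord, Show)

-- Render as "<transaction id>#<index>", usable as a JSON object key.
renderTxIn :: TxIn -> String
renderTxIn (TxIn i ix) = i <> "#" <> show ix

-- Parse the same form back; fails on a missing separator, an empty id
-- or an index that is not a number.
parseTxIn :: String -> Maybe TxIn
parseTxIn s = case break (== '#') s of
  (i, '#' : ix) | not (null i) -> TxIn i <$> readMaybe ix
  _ -> Nothing
```

Because JSON object keys are always strings, the index has to be parsed out of the key text rather than decoded as a JSON number, which is the kind of mismatch a roundtrip test catches early.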
We should write roundtrip serialisation tests for TxId
, using the Gen TxId
available in cardano-api perhaps?
There are actually 2 interesting properties:
- conformity with cardano-api
- roundtrip ToJSON/FromJSON
Quick reflection on the session: We lost track this morning of our TDD principles, did not run our ETE test once and got lost in the weeds of implementing serialisation code for the full transaction whereas we would only need the UTXO to make some progress (eg. be able to send commit
commands).
Discussing demo:
- what kind of audience are we targeting? devs/enthusiasts/SPOs.. -> interested in technical stuff
- summit is also a wider audience so having something graphical to show?
- the message is about scaling to global use cases
- is it the first time in Cardano we are "locking" fund?
- commits are mandatory, and an integral part of the head's capabilities
Do we really need a "lay people" demo? => Probably not, it might blur the message, letting people think this is all done and packaged when it really is not => something to leave for marketing people to work on, to carve the right message
No need to have the 3 of us working on finishing serialisation, AB is going to wrap up the Cardano
head ledger so that we can have ToJSON/FromJSON instances working and tested, then we pick up the ensemble session working on the validation and integration of actual ledger.
Managed to get a reduced out-of-order snapshot test case, after extracting an immutable prefix from the events stream so that keys and committed UTXOs are right. Trying to plug this reduced test case in the shrinker does not lead to more shrinks so it seems in a sense to be minimal.
Struggling with getting the ToJSON/FromJSON instances right. Wrote a roundtrip test for UTxO and then the JSON instances would seem easy enough, but there's this crypto
and era
type parameters which are pretty annoying.
- Solution is: Make
Crypto crypto
a constraint for all typeclasses and a parameter for types and this will make testing easier
Got everything to compile and test to run, but roundtrip is failing. Going to troubleshoot.
- Trying to reproduce the test failure in the REPL, but it does not work as easily: Several instances of FromJSON are in scope as the
cabal repl
command loads all the files from the component
- The issue is in the parsing of
TxIn
as key in maps
As expected, working with encoded txIn is a PITA...
Left "Error in $: parsing Natural failed, expected Number, but encountered String"
Making progress, can now properly serialise UTXO and witnesses:
Hydra.Ledger.Cardano
Cardano Head Ledger
JSON encoding of (UTxO (ShelleyMAEra 'Mary TestCrypto))
allows to encode values with aeson and read them back
+++ OK, passed 100 tests.
JSON encoding of (UTxO (ShelleyMAEra 'Mary TestCrypto))
produces the same JSON as is found in golden/UTxO (ShelleyMAEra 'Mary TestCrypto).json
JSON encoding of (WitnessSetHKD Identity (ShelleyMAEra 'Mary TestCrypto))
allows to encode values with aeson and read them back
+++ OK, passed 100 tests.
JSON encoding of (WitnessSetHKD Identity (ShelleyMAEra 'Mary TestCrypto))
produces the same JSON as is found in golden/WitnessSetHKD Identity (ShelleyMAEra 'Mary TestCrypto).json
Working again on signing a transaction to submit to the head, relevant function is signShelleyTransaction
from cardano-api, but unsure if this is the right way to go as it seems a bit hard to work with.
We start trying to salvage what we did for MaryTest in order to build the txBody
but give up after some efforts building the transaction: The ledger API used for MaryTest
is too low level, we should really start from what we need in cardano-api, namely signShelleyTransaction
and resolve issues from there.
Looking at how to build a transaction from the cardano-cli, what kind of data it provides. -> it uses TxBodyContent
passing the different bits of information. Seems like a good strategy is to use exclusively stuff from Cardano.Api
module which exposes the full node API.
For the moment, we use the getTxBody
and getTxWitnesses
to extract the data from the signed tx and just shove it as encoded strings into the JSON, but we want to have more details in the transactions.
We managed to build a complete transaction and print it in encoded form, now trying to format a transaction in JSON to send to the node. It's mildly annoying the cardano API does not provide default/empty values for a TxBodyContent
so that we could just update the parts we are interested in, but it's nice it provides explicit types (not Maybe x
) for all fields.
Also, we created a minimal JSON serialization of a Cardano Tx
by "viewing" the TxBodyContent
and using (partially) available ToJSON
instances for TxIn
and TxOut
.
{
"witnesses": [
"8200825820db995fe25169d141cab9bbba92baa01f9f2e1ece7df4cb2ac05190f37fcc1f9d58400599ccd0028389216631446cf0f9a4b095bbed03c25537595aa5a2e107e3704a55050c4ee5198a0aa9fc88007791ef9f3847cd96f3cb9a430d1c2d81c817480c"
],
"body": {
"outputs": [
{
"address": "addr1vx35vu6aqmdw6uuc34gkpdymrpsd3lsuh6ffq6d9vja0s6spkenss",
"value": {
"lovelace": 14
}
}
],
"inputs": [
"9fdc525c20bc00d9dfa9d14904b65e01910c0dfe3bb39865523c1e20eaeb0903#0"
]
},
"id": "56b519a4ca907303c09b92e731ad487136cffaac3bb5bbc4af94ab4561de66cc"
}
Now need to make the test pass!
-
Discussion on user interfaces and how or whether to split between a high-level "wallet" and more lower-level management UI
-
Set off to "use the cardano ledger"
- Revisited codebase and see what's in
MaryTest
, what would be missing and where we likely need to change things
-
What's our goal? Start from the outside! We want to have our end-to-end test be using cardano transactions
-
We use our "own" json format, but do intend to support the "serialized cbor" format for accepting transactions later on
- Rationale being, that some
ServerOutput
is showing full transactions and clients likely are interested in comparing sent / seen transaction outputs
etc.
-
We stopped at signing the transaction, which in particular is not provided by the
cardano-ledger-core
based API we had been using for constructing addresses, so we think about switching to using cardano-api
for constructing / signing a transaction
- Also, it seems to be the most "blessed" and somewhat high-level API for dealing with Cardano transactions
-
Few tools and documents mentioned that are quite useful when working with Cardano data:
- bech32: very simple yet powerful tool for converting strings to/from bech32
- cardano-addresses: handy command-line for creating, hashing and inspecting addresses and scripts. It has a nice(r than cardano-cli) interface.
- cbor.me: simple tool for inspecting hex-encoded CBOR content
- Mary CDDL & Alonzo CDDL: CBOR specifications for Cardano binary types.
- Creating a first draft for a terminal user interface using
brick
to "manage a hydra node"
- This will be a Hydra client, which connects to a (local) hydra-node
- Will focus on introspecting the hydra-node and the Head state, as well as opening and closing
- Start with static Brick UI which only shows
version
of the TUI and can be quit - Attaching it to the
hydra-node
using a Client
component (see ADRs) which opens a websocket connection to the hydra-node
- For now hard-coded host and port
- Deserialize
ServerOutput
and handle them as "application-specific events" using customMain
- For example
PeerConnected
updates a list of connectedPeers
in the State and draw
paints them
- Making the
hydra-node
connection robust is non-trivial though
- Connectivity should ideally be known to the UI
- Changing the State to
Disconnected | Connected {...}
to make "invalid states unrepresentable" - Extend event type to be something like
ClientConnected | ClientDisconnected | Update ServerOutput
- Retry connection upon
ConnectionException
of websockets
is not enough, need to catch and retry also on IOException
(initially)
- Next steps:
- Testing, any interesting properties in handling events / drawing?
- Command line parsing for picking
hydra-node
to connect to - Adding commands and conditional rendering on
HeadState
-> How to infer it and which commands are possible from ServerOutput
?
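The state and event types sketched in the notes above could look roughly as follows. This is an assumption-laden illustration: `ServerOutput` is reduced to two constructors carrying a peer name as a string, and `handleEvent` is a pure function here, without the `brick` plumbing of the real TUI.

```haskell
-- UI state: peers only exist while connected, so an "invalid state" such
-- as "disconnected but with peers" cannot be represented.
data State
  = Disconnected
  | Connected {connectedPeers :: [String]}
  deriving (Eq, Show)

-- Reduced, hypothetical version of the server outputs of interest.
data ServerOutput = PeerConnected String | PeerDisconnected String
  deriving (Eq, Show)

-- Application-specific events fed into the UI loop.
data Event = ClientConnected | ClientDisconnected | Update ServerOutput
  deriving (Eq, Show)

-- Pure state transition; outputs arriving while disconnected are ignored.
handleEvent :: State -> Event -> State
handleEvent _ ClientConnected = Connected []
handleEvent _ ClientDisconnected = Disconnected
handleEvent Disconnected (Update _) = Disconnected
handleEvent (Connected ps) (Update (PeerConnected p)) =
  Connected (p : filter (/= p) ps)
handleEvent (Connected ps) (Update (PeerDisconnected p)) =
  Connected (filter (/= p) ps)
```

Keeping the transition pure is one possible answer to the testing question: a property test can fold a list of events over `Disconnected` and check invariants on the resulting state without starting a terminal.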
Seems like the JSON logger we are using is actually unreliable, some messages appear truncated in the output, like:
{"thread":"41","loc":null,"data":{"network":{"data":{"trace":[["event","receive"],["agency","ClientAgency TokIdle"],["send",{"contents":{"tran\sactions":[{"outputs":[1797,1798,1799,1800,1801,1802,1803,1804,1805,1806],"id":329,"inputs":[1785,1795]},{"outputs":[1807,1808,1809],"id":330,\"inputs":[1801,1802,1803,1804,1806]},{"outputs":[1810,1811,1812,1813,1814,1815],"id":331,"inputs":[1798,1805,1807,1808]},{"outputs":[1816,1817\,1818],"id":332,"inputs":[1796,1800,1810,1813,1815]},{"outputs":[1819,1820,1821],"id":333,"inputs":[1797,1799,1809,1811,1812,1814,1816,1818]},\{"outputs":[1822,1823,1824,1825],"id":334,"inputs":[1819,1820,1821]},{"outputs":[1826,1827,1828,1829],"id":335,"inputs":[1817,1823,1824,1825]}\,{"outputs":[1830],"id":336,"inputs":[1826,1828]},{"outputs":[1831,1832,1833,1834,1835,1836,1837],"id":337,"inputs":[1822,1830]},{"outputs":[1\838,1839,1840,1841],"id":338,"inputs":[1829]},{"outputs":[1842,1843,1844,1845,1846,1847,1848],"id":339,"inputs":[1827,1831,1832,1834,1835,1836\,1837,1840]},{"outputs":[1849,1850,1851,1852,1853],"id":340,"inputs":[1833,1841,1843,1844,1847,1848]},{"outputs":[1854,1855],"id":341,"inputs"\:[1838,1839,1842,1846,1849,1850,1851,1852,1853]},{"outputs":[1856,1857,1858,1859,1860],"id":342,"inputs":[1845,1854,1855]},{"outputs":[1861,18\62,1863,1864,1865,1866,1867,1868,1869],"id":343,"inputs":[1858,1860]},{"outputs":[1870,1871,1872,1873,1874,1875,1876,1877,1878,1879],"id":344,\"inputs":[1856,1857,1862,1865,1866,1868]},{"outputs":[1880,1881],"id":345,"inputs":[1859,1870,1871,1874,1875,1876,1878,1879]},{"outputs":[1882\,1883,1884,1885,1886,1887],"id":346,"inputs":[1861,1864,1877]},{"outputs":[1888,1889,1890,1891,1892,1893,1894,1895,1896],"id":347,"inputs":[18\63,1873,1880,1883,1884,1885,1887]},{"outputs":[1897,1898,1899,1900,1901,1902],"id":348,"inputs":[1869,1872,1882,1886,1888,1890,1891,1894]},{"o\utputs":[1903],"id":349,"inputs":[1867,1889,1892,1893,1898,1899,1900,1901,1902]},{"outputs":[1904,1905],"id":350,"inputs":[1881,1895,1896]},{
"\outputs":[1906,1907,1908,1909,1910,1911,1912],"id":351,"inp]},{"outp
Simplifying logs using a simple queue to which log messages are written and from which they are read in another thread, which is responsible for dumping them as JSON to stdout, adding a timestamp and various metadata. Ended up not using Katip as it still adds some cruft on top of what we really need, just wrote a simple thread-based logger that pumps from a queue and writes to stdout.
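A minimal sketch of such a thread-based logger, using only `base`. The name `withQueueLogger` and the use of `Chan` with a `Nothing` sentinel are choices made for this example and are not taken from the actual Logging module; JSON encoding and timestamps are left to the `sink` argument.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.Chan (newChan, readChan, writeChan)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Exception (finally)

-- Run an action with a logging function that only enqueues messages.
-- A separate thread pumps the queue into the sink. On exit (normal or
-- exceptional) a sentinel is enqueued and we wait until the pump has
-- drained everything written before it.
withQueueLogger :: (String -> IO ()) -> ((String -> IO ()) -> IO a) -> IO a
withQueueLogger sink action = do
  queue <- newChan
  done <- newEmptyMVar
  let pump = do
        next <- readChan queue -- blocks when the queue is empty, no busy loop
        case next of
          Nothing -> putMVar done ()
          Just msg -> sink msg >> pump
  _ <- forkIO pump
  action (writeChan queue . Just)
    `finally` (writeChan queue Nothing >> takeMVar done)
```

Blocking on `readChan` rather than polling avoids spinning while the queue is empty, and the sentinel gives a well-defined flush point when the inner action ends.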
Added some simple test as I noticed this code is never tested directly.
Logging format is simpler, extracting events becomes:
$ cat /run/user/1001/bench-f531fd03b79aa8ca/1 | jq -c 'select((.message.tag == "Node") and (.message.node.tag|test("ProcessingEvent"))) | .message.node.event'
Rerunning the benchmark I still have an incorrectly formatted log entry for the first node, which seems to be the one generating the error, but it's unclear from the error message.
So the logs are truncated because of the error sent, which is ok but it's unclear why an entry in the middle of the file could be incorrect. Could be caused by the flushing of the logs I have added to Logging
as we can still write more logs even when the inner action is interrupted by an exception that prevents proper evaluation of the JSON data?
Dumped events from node 3 whose last action is a ReqSn
and which seems to crash too as its output is incomplete, trying to reproduce the failure using those logs. But I am still unable to reproduce the error
thrown from update
in HeadLogic, even though feedEvents
now just discards LogicError
s, probably because the logs are truncated when the exception is thrown.
Trying to remove the error
call and replace with a standard LogicError
specialised for InvalidSnapshot
. Still got a benchmark failure as some messages are not received.
I probably should give up for now... Cannot reproduce the failure using the logs which is really annoying, will try again later.
Back to work on the External PAB
What we are really interested in observing are the transactions that will be reflected to the Node as ChainTx
values. Can we observe the "redeemer", or we don't need to, we just need to observe the inputs of the transactions (eg. the AbortTx
Utxos come from the inputs)
We don't need much information from the txs coming back from the chain, because by definition they have been validated so they are correct. This implies we need to split the OnChainTx
in 2, one for sending txs and one for receiving them. By separating the OnChainTx
in 2 types, we add more logic to the head code but remove logic from the PAB/Contracts which is good.
For OnCloseTx
we only need the snapshot number, and then we can verify the number against our latest confirmed snapshot:
- If it's same => OK
- If it's lower => post a contestTx
- If it's greater => We have a problem
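The three cases above amount to a single comparison. The sketch below uses hypothetical names (`CloseDecision`, `decideOnClose`) and is not the head logic itself.

```haskell
import Numeric.Natural (Natural)

-- Possible reactions to an observed close transaction (names are made up).
data CloseDecision
  = AcceptClose
  -- ^ Closed with our latest confirmed snapshot: nothing to do.
  | PostContest
  -- ^ Closed with an older snapshot: post a contest transaction.
  | InconsistentClose
  -- ^ Closed with a snapshot we never confirmed: we have a problem.
  deriving (Eq, Show)

-- Compare the snapshot number seen on-chain with our latest confirmed one.
decideOnClose :: Natural -> Natural -> CloseDecision
decideOnClose closedWith latestConfirmed =
  case compare closedWith latestConfirmed of
    EQ -> AcceptClose
    LT -> PostContest
    GT -> InconsistentClose
```

This shows why only the snapshot number is needed from the observed transaction: everything else has already been validated on-chain.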
Adapting MockChain
to convert between on and posted transactions, so far it was only forwarding what it received.
In the PAB we now want to convert whatever we observe from the current state to an OnChainTx
which we'll send to the client. We cannot use the same types in the PAB and the client (Node) because that introduces coupling: the PAB shares types with the on-chain contract and we don't want to tie our Haskell code to Plutus code.
We are stuck on the abort
test, the endpoint is called but server returns an error 500 saying it's not there, which might come from incorrect body (but not the case here) or more simply from the fact the endpoint
promise is not called in the contract.
`select` in plutus says that:
-- | @select@ returns the contract that makes progress first, discarding the
-- other one
So if the first "progressing" contract is waiting on the chain for something, then you're stuck. We could use 2 different contracts activated, one for listening to transactions, another one for endpoints and interacting with the head, but the plutus team is working on another solution: passing a `Promise` to the `waitForUpdate` function that makes it possible to have endpoints active while waiting, leading to a `Timeout` in the waiter code.
We part ways for the week setting our goal for next week: Integrate real ledger into the head. We'll leave the PAB and Plutus stuff aside for the moment and focus a couple weeks on the Hydra node itself, possibly adding some frontend (wallet for end users, TUI for admins).
-
Discussing issue with ν_initial validator:
- We initially thought the PTs would be paid to public key of participants, but actually this does not work because we need to be able to post an abort transaction which implies any participant must be able to consume the multiple UTxO from the Init transaction.
- We need to have the output containing the PTs to be actually paid to a script, which is the ν_initial script, and have the verification key be passed as datum so that the commit transactions are valid iff their signing key matches verification key.
- Tricky thing about this - in order to "discover a Hydra head" the address to which the PT is paid is ideally known in advance
- Having a "single" validator for all Hydra head instances should be fine (according to researchers)
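To make the ν_initial discussion concrete, here is a pure model of the check, not actual Plutus code. All names (`VerificationKey`, `InitialRedeemer`, `TxInfo`, `initialValidator`) are hypothetical, and the unconditional `Abort` branch is an assumption: it presumes the legitimacy of an abort is enforced elsewhere, e.g. by the head state machine validator.

```haskell
-- | Stand-in for a participant's verification key (assumption).
newtype VerificationKey = VerificationKey String
  deriving (Eq, Show)

-- | How the output holding a participation token is being spent.
data InitialRedeemer = Commit | Abort
  deriving (Eq, Show)

-- | Heavily simplified view of the spending transaction: who signed it.
newtype TxInfo = TxInfo {signatories :: [VerificationKey]}

-- | The datum carries the verification key of the participant the PT was
-- issued for. A commit is valid iff that participant signed it.
initialValidator :: VerificationKey -> InitialRedeemer -> TxInfo -> Bool
initialValidator datumKey redeemer tx =
  case redeemer of
    Commit -> datumKey `elem` signatories tx
    -- ASSUMPTION: any participant may abort; whether the abort itself is
    -- legitimate is expected to be checked by another validator.
    Abort -> True
```

Since the script itself does not depend on the participant (the key lives in the datum), its address can be known in advance, which is what makes a "single" validator for all heads plausible.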
-
We currently rely on the fact that the datum of the statemachine validator ν_SM is included in the `init` transaction
- Manuel: This is not always the case! Datums do not *need* to be included in the transaction producing [the outputs holding] them.
-
How are we going to get the committed UTXO to pass to the collectCom endpoint to build the transaction? Right now in the test, they are known in advance but that won't be the case in real code, because the poster only has access to the chain? Or maybe not and just pass around the UTxO off-chain.
-
We need to pass in the ν_initial datum the parameter to reconstruct the address of the ν_sm state machine validator.
-
Added and vetted new coding standards
-
Started to consolidate `master` with `contract-sm` and `origin/KtorZ/experiment-move-lift-params-to-datums` work streams on our plutus contracts
- Individual modules for each of our three contracts (head statemachine, initial and commit validators)
- Offchain / PAB glue code into the `Hydra.Contract.PAB` module
-
SN discovered that `doom` emacs has a feature for yanking (and browsing) github at point using `SPC g y`
- Updating `plutus` and dependencies to continue investigation of weird behavior of smart contracts on semantically equivalent changes
  - New version of `plutus` changes how `endpoint` works: this function now takes a continuation
  - The default port and HTTP api paths have also changed
- After having it compile again, the changes which made it pass before do fail the test now whereas what was passing before does fail now!?
- Plutus team suspects it's ordering issues
Back to work after 2 weeks vacations, catching up.
Recreated yubikey after the first one got destroyed when I dropped the laptop on the wrong side, need to order a new one as spare. Fortunately, I had an encrypted volume containing the secret keys as backup so I was able to restore the keys on the new card relatively easily following Dr.Duh's guide https://github.com/drduh/YubiKey-Guide/#configure-smartcard.
The only snag I had was that gpg keeps the state of the key in its store, so reimporting it does not change the flag saying the key is on the card. I had to `--delete-secret-keys` manually to remove the key completely from the store, then reimport it, then move it to the smartcard.
Also recreating development VM. For some reason, the disks and FW rules still existed and were not completely destroyed when I used `terraform destroy` so I had to remove them from the console directly. Also, need to `gcloud config configurations activate default` to use the correct account settings. It's annoying gcloud does not allow per-directory configurations... Took about 1h8m to recreate haskell dev VM from scratch.
Reading layer2 market survey document coordinated by Ahmad. Seems like isomorphic transactions are really a distinguishing feature of Hydra compared to all other proposals, which either are limited to special transactions, e.g. payments, or rely on specialised contracts.
Checking Red bin to see if there's something useful to spend my time on:
- Working on cleaning up working directory for tests
- Exposing a test prelude in a new package.
MB
- Done some more testing and exploration with MPT, in particular, playing around with different alphabet sizes.
- Refined a bit the test suite with a more precise formula w.r.t. the computation of the 'average' proof size.
- Read up on Verkle Trees and vector hashes
- More discussions and meeting on the Adrestia's side.
MB
-
Continued the work on getting static addresses for init and commit contracts. This also included reworking a bit the existing contractSM so that it'd distribute the participation tokens to static init contracts, which can be observed from the `watchInit` endpoint. Doing so, I've also refactored a bit the module structure to more clearly separate on-chain from off-chain code.
-
Opened a PR on Plutus to add an extra `MustSatisfyAnyOf` primitive for the `TxConstraints`, necessary for expressing some of the conditions we have in our various Hydra contracts. (https://github.com/input-output-hk/plutus/pull/3706)
-
Investigated Plutus' contract size, and why they are so large. Opened an issue with some findings, and discussions with MPJ: https://github.com/input-output-hk/plutus/issues/3702
MB
- Mostly busy with Slack conversations and document reviews on various topics, but mainly, Adrestia, Plutus-core and the use of datum-parameterized contracts vs compilation-parameterized contracts.
MB
-
Stumbled upon https://vitalik.ca/general/2021/06/18/verkle.html, which I haven't read beyond the intro, which has enough to keep me captivated:
[Verkle Trees] serve the same function as Merkle trees. [...] The key property that Verkle trees provide, however, is that they are much more efficient in proof size.
-
Also discussed with Matthias Fitzi some possible improvements of the MPTs proofs:
- Each node could store their children hashes as Merkle Trees, this would allow reducing the overall proof size by a factor of 3.
- We may want to try a shorter alphabet, to create slightly longer proofs but with fewer neighbors on each level.
- We've seen in previous simulations that even with short concurrency factors (~20), head networks would still perform reasonably well. So there's a right limit to find which leads to satisfactory performance.
-
I also had a go at inspecting the sizes of our Hydra contracts. They are rather big: 11KB for the state-machine, 8KB for the close and initial. We may want to consider optimizations to make scripts smaller.
MB
-
Worked on an implementation of Merkle-Patricia Tree, including already a few of the necessary optimizations w.r.t. the storage of the prefixes.
-
Writing a few QuickCheck properties revealed that we may have a proof size problem. While it is true that the size of proofs is in `log(|U|)` (for `U` the UTxO set), each element in the proof may embed up to 15 hashes, so for a reasonably large UTxO set, we end up with proofs carrying 35/40 hashes! Since a proof is needed *per input and per output* for each transaction, we may rapidly consume all the available space.
- We have been observing weird / blocking issues with Plutus: https://github.com/input-output-hk/plutus/issues/3648
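As a back-of-the-envelope check of the proof size concern, the worst case can be estimated under a strong assumption: a perfectly balanced tree where every level of the proof carries all `alphabet - 1` sibling hashes. The function names are hypothetical and this is an upper bound sketch, not the formula used in the test suite (the 35/40 hashes quoted above are observed figures).

```haskell
-- | Number of levels needed to address 'size' elements with the given
-- alphabet, i.e. the smallest d such that alphabet ^ d >= size.
-- Assumes alphabet >= 2 and size >= 1.
proofDepth :: Integer -> Integer -> Int
proofDepth alphabet size =
  length (takeWhile (< size) (iterate (* alphabet) 1))

-- | Worst-case number of hashes in a proof: every level contributes all
-- the siblings of the node on the path (alphabet - 1 of them).
worstCaseProofHashes :: Integer -> Integer -> Int
worstCaseProofHashes alphabet size =
  proofDepth alphabet size * fromInteger (alphabet - 1)
```

For instance, with 1000 elements a hexadecimal alphabet gives 3 levels of up to 15 hashes each, whereas a binary alphabet gives 10 levels of a single hash: longer paths, but fewer hashes overall.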
-
Continued the work on the Hydra-Plutus-PAB integration to remove the hard-coded contestation period and have it part of the `init` transaction. Like a few other types (e.g. `HeadParameters`, `Party`) we have duplicated type definitions for similar concepts between the Plutus on-chain code and the rest of the application.
-
We discussed (and rejected) the idea of removing that duplication in favor of a single type definition at the root of the dependency tree, as this would be rather unsatisfactory because:
- Although data-types have the same names and represent similar concepts on-chain and in the application, they aren't necessarily an exact overlap. Thus, we would end up with types that are more complex than they need to be, because they need to satisfy more downstream consumers.
- More importantly, the Plutus machinery is more restrictive on what primitive types can be used. For example, there's no `Word` in Plutus, only `Integer`. No `DiffTime`, only `Integer`. These restrictions force data types defined for Plutus to lose a lot in expressiveness and safety compared to what we would normally do for a Haskell program. Thus, while some little duplication is unfortunate, it actually helps to get a nicer design for the main, off-chain, parts of the application while providing a minimal API surface for the on-chain code.
-
As a bridge between on-chain and off-chain types, we rely on JSON serialization (which fits nicely in the PAB design). Thus, PAB clients submit parameters as plain JSON, which gets deserialized into their on-chain compatible version using a more restricted set of primitives. Since this approach introduces some duplication in type definitions, it becomes very important to ensure through tests that it works as expected, and property-based roundtrip tests are a good fit for that. Our first property immediately caught a legitimate mistake of ours on its first execution.
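The shape of such a roundtrip property can be sketched as follows. To keep the example dependency-free it swaps the JSON bridge for a direct conversion between the two representations, and all names are hypothetical (a contestation period stored as a `Word` off-chain and as an `Integer` on-chain); with QuickCheck the property would be run over generated values rather than a fixed list.

```haskell
-- | Off-chain representation, using a more precise primitive (assumption).
newtype ContestationPeriod = ContestationPeriod Word
  deriving (Eq, Show)

-- | On-chain compatible representation: Plutus only offers 'Integer'.
newtype OnChainContestationPeriod = OnChainContestationPeriod Integer
  deriving (Eq, Show)

toOnChain :: ContestationPeriod -> OnChainContestationPeriod
toOnChain (ContestationPeriod w) = OnChainContestationPeriod (toInteger w)

-- | Going back can fail: not every 'Integer' fits the off-chain type.
fromOnChain :: OnChainContestationPeriod -> Maybe ContestationPeriod
fromOnChain (OnChainContestationPeriod i)
  | i >= 0 && i <= toInteger (maxBound :: Word) =
      Just (ContestationPeriod (fromInteger i))
  | otherwise = Nothing

-- | Roundtrip property: converting to the on-chain type and back is lossless.
prop_roundtrip :: ContestationPeriod -> Bool
prop_roundtrip p = fromOnChain (toOnChain p) == Just p
```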
Somewhat ad-hoc agenda on the Close-OCV again:
- MPT helps only for fan out txs .. but not on Close / Contest
- MPT would move the "space burden" into redeemers instead of datum, but still requires non-bound size for validating "hanging transactions", right?
- Applicability of Hanging transactions should be encodable by MPT insert / remove operations
- Sandro will check again
- We can/will walk through that with him
- Coordinated protocol (= no hanging transactions) might not be affected by this
- Signed snapshots would suffice to validate in close/contest
- Only fanout needs to check presence of utxos in datum
- Refactored `HeadState` and `HeadStatus` back into a single data type to make invalid states unrepresentable: `ReadyState` won't ever have a valid list of `parties` as no `InitTx`, which announces the list of participants, was observed before that state
- Reviewed status quo of `ExternalPAB` and discussed various aspects of it (like paying PTs to pubkeys or some quirks of the PAB)
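The `HeadState` refactoring mentioned above can be illustrated with a small sketch. Apart from `HeadState`, `ReadyState`, `OpenState` and `parties`, the names and payloads are assumptions made for the example.

```haskell
-- | Placeholder for a party identifier (assumption).
type Party = String

-- | A single sum type: each state only carries the data that exists in it.
-- With a separate status field next to a 'parties' field, a "ready" head
-- would still need some list of parties before any InitTx was observed.
data HeadState
  = ReadyState -- no InitTx observed yet, hence no parties to store
  | InitialState [Party]
  | OpenState [Party]
  | ClosedState [Party]
  deriving (Eq, Show)

-- | Total accessor: the absence of parties is visible in the type.
headParties :: HeadState -> Maybe [Party]
headParties st =
  case st of
    ReadyState -> Nothing
    InitialState ps -> Just ps
    OpenState ps -> Just ps
    ClosedState ps -> Just ps
```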
- As https://github.com/input-output-hk/plutus/issues/3546 was resolved, I set off to update `plutus` dependencies
- This time a simple bump of `plutus` and `cardano-ledger-specs` was enough to satisfy `cabal`
- Two additional changes required in the code though:
  - `IsData` is now three type classes `ToData`, `FromData` and `UnsafeFromData`
  - Boilerplate/gluecode for PAB is now using the `HasDefinitions` type class
- The reproducer does now work!
- We cannot update our code statemachine to use `Just threadToken` though as we do (currently) rely on forging our own tokens (including the PTs sharing the same currency symbol) and recent changes made `Plutus.Contract.StateMachine` forge thread token automagically.. which is not what we want 😿
- Talked about Tail simulation results, what would be a representative experiment for "general applicability" of the tail protocol, as well as future extensions / adaptations and how they would compare to something like Lightning network on Hydra Heads. Which also seems to be a good avenue to payment use cases for Hydra Heads.
- Read about and looked into the Raiden network
- This onboarding document seems to be a good "introduction" to their tech (stack)
- The spec is quite heavy-weight, but feels a bit "ad-hoc" or engineered rather than backed by research
- The raiden services seem to be "adding value" by short/cheap path finding and offline-capability (like LN watchtowers?) in exchange of some `RDN` token fee (their ROI?)
Solving together the issue with snapshots not being emitted for transactions once we run out of transactions to submit. Wrote failing behavioural test, solutions proposed:
1. have a concurrent snapshot thread like in hydra-sim
2. making sure we can have more than one snapshot in flight
Trying 2. as having multiple threads is unappealing. Test is still failing after changing `seenSnapshot :: Maybe Snapshot` -> `seenSnapshots :: Map SnapshotNumber Snapshot`
We need to debug what's going on as the failure message is unhelpful => dump IOSim traces when something goes wrong in BehaviorSpec
Taking a step back and thinking how we should solve the snapshot number problem.
We need to add a snapshot number in the state, storing the `nextSnapshotNumber` and updating it in 2 places: When one emits a `ReqSn` and when one receives it, the former happening only when the node is a leader
It's a monotonically increasing counter but it's redundant
Other solution is to store a `Maybe Snapshot` in the index, instead of a snapshot, so that the leader can use the index without constructing the snapshot
Test is failing because the snapshot 2 contains 2 confirmed txs instead of one => we should probably update the `seenTxs` as soon as we emit a `ReqSn`?
- If we remove the snapshotted txs from the seenTxs as soon as we emit a ReqSn => it does not work
Adding more edge cases for leader handling in snapshot emission, code seems more and more complicated
- We could also simply not handle `ReqTx` when there is a snapshot in flight in the leader?
Reverting back to where we had failing tests (And traces from IOSim logs), fiddling with merge/revert conflicts
Trying a mixed approach, not having a separate thread but having a separate event for requesting a new snapshot. The idea is that as we enqueue an event for each transaction anyway, we don't lose any of them: the new snapshot will be created with whatever exists at the time of its processing, and if there is another snapshot in flight, we will wait/discard it.
We managed to get "parallel" benchmark working by using a `NewSn` message that decorrelates the request for a new snapshot from the actual creation of the snapshot.
The `NewSn` message is enqueued and waited for if there's already a snapshot in flight, and discarded if there aren't any transactions to snapshot (`seenTxs` is empty).
This alleviates the need to have a separate thread running to trigger the snapshot, and it also works if we want a finer grained snapshot policy, like after N txs.
Then fixing HeadLogicSpec unit tests which are now failing because the snapshot logic has changed. Push to `master` was a bit too hasty...
- Read and discussed recent Hydra research
- Extended the visual roadmap in discussions with researchers and product manager
An interesting minor suggestion for improving code reviews, and commit messages: https://ncjamieson.com/conventional-comments/ If we insist on doing reviews that is...
An interesting scalability paper co-authored by C.Decker, the guy behind eltoo.
Writing a unit test exposing the problem we are seeing with our parallel benchmark, namely that we get a signature for a snapshot we have not seen yet:
(OpenState headState@CoordinatedHeadState{seenSnapshot, seenTxs}, NetworkEvent (AckSn otherParty snapshotSignature sn)) ->
case seenSnapshot of
Nothing -> error "TODO: wait until reqSn is seen (and seenSnapshot created)"
Just (snapshot, sigs)
Reverting back to when we had parallel confirmations to try to load test the cluster leads to another failure -> Try providing a more helpful message when a `waitMatch` fails in ETE tests, as the current one is not very useful
- Wait function was missing `HasCallStack` => no stack trace, wrong information from the failing tests
- The wait times out and there aren't any messages received, this is puzzling. I can see the snapshot being confirmed in the node's log and the `ClientEffect` trace, so could it be an artefact of deserialisation?
- It seems we never get more than one snapshot when submitting txs in parallel, which looks like an issue in the way we are doing the protocol
- There aren't any `Wait` effects in the logs, so this means we never get into the situation where a tx or snapshot cannot be handled
I think I understand what's going on: We only request a snapshot when processing a transaction and there's no snapshot currently being processed, but given we have a single queue, we end up submitting all txs, then doing one snapshot, but other transactions do not trigger a snapshot request because there is still an unconfirmed snapshot in flight. Then the snapshot ends up being confirmed, but there isn't any transaction left to trigger `ReqSn`.
Managed to edit the API documentation file with some descriptions, now trying to generate human-readable documentation from it. I can transform the document to JSON using
$ yq . hydra-node/api.yaml
which is a thin wrapper over jq and takes the same kind of expressions. Added nix expressions to the shell.nix for jq and yq, I guess only the latter is necessary as jq is a dependency of it. Trying to add a python3 package called [https://github.com/coveooss/json-schema-for-humans] but it does not work in nix: The package is not part of the nix database. Rodney and others have some pointers on how to add a python package to nix.
- To install non-standard python packages, follow instructions here: https://nixos.wiki/wiki/Python. This basically means writing a nix derivation that installs the package and invoking it in the shell...
- There's also mach-nix which provides tooling to produce nix derivations from python requirements.
Don't know why but I got a déjà vu feeling with those JSON Schemas, like I was back in the days of XML processing where XML was everything and everything was described in XML, with complex tools to parse, analyze, validate, merge/split/transform XML documents. It's not as bad with JSON but still feels quite similar. I guess the question is, as always, what's the best format for specifying interfaces and APIs: A pivot format from which to generate or verify code, or code from which to generate doc?
- Having a quick look at the generic Schema generator for Swagger in Haskell: It does not extract fields comments from data types records which is annoying as this means we'll need to repeat the same information twice.
Working on the ETE benchmark test, generating more transactions to input. We move the generator from the test package to the code package, which introduces a dependency on QC in hydra-node; this is probably fine as we already depend on it in the Prelude => add more stuff from QC to the Prelude?
We probably want to separate submission of transactions from confirmation in 2 different threads in order to make sure we observe confirmation as soon as possible, while loading the server with more transactions.
We are struggling to get a Set from a Vector of Value, until we realise the solution is simple: there is a Foldable `toList` method!
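The trick boils down to going through a list. A minimal sketch, where `toSet` is a hypothetical helper name; it works for any `Foldable` container, which includes `Vector`:

```haskell
import Data.Foldable (toList)
import qualified Data.Set as Set

-- | Collect the elements of any Foldable container into a Set.
toSet :: (Foldable f, Ord a) => f a -> Set.Set a
toSet = Set.fromList . toList
```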
After parallelizing submission and confirmation of txs, we get an error in the `waitForPeersConnected` function when running the test. We spend some time troubleshooting it:
- This is weird as the error seems to be happening at the beginning of the test but we can see the nodes get transactions and messages, so this means there is a thread that keeps running and times out somewhere.
- Something's fishy in the `waitForAll` function, adding traces to understand what's going on
- Adding more traces around the `timeout` call: Could it be triggered asynchronously somehow?
- Node 1 is starting to wait for peers connected again at some point after the initial head is open
- Adding more traces around various `waitXXX` functions
- It looks as if it was calling `waitForNodesConnected` a second time after the first round, like there was some thread running in parallel doing that only for node1
- Trying to reformat the code and use `concurrently` instead of `concurrently_`, explicitly discarding the result
- Turns out the `action` is actually run twice (or more): When we are disconnected, it throws an `IOException` and this is always handled by `tryConnect` as a connection failure, which triggers re-running the action. We should only catch exceptions thrown by `runClient` and not the other ones, but this is not possible as failure to connect and disconnection seem to be represented by the same exception type.
- There is a race condition in using the `race_` function between detecting the process failed with exit code <> 0 and failing to connect to it, which leads to non-deterministic test results
We work around the issue by ensuring we don't retry to connect once the action has started running, which is clunky and uses a `Bool` flag but works well
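A minimal sketch of this workaround, assuming a hypothetical `connect` argument standing in for something like `runClient` (it establishes the connection, then runs the given action). The function name, the retry limit and the lack of delay between attempts are simplifications for the example, not the actual test code.

```haskell
import Control.Exception (IOException, throwIO, try)
import Data.IORef (newIORef, readIORef, writeIORef)

-- | 'try' specialised to IOException, to avoid type annotations below.
tryIO :: IO a -> IO (Either IOException a)
tryIO = try

-- | Retry connecting on IOException, but only as long as the action has
-- not started running. Once it has, an IOException means we got
-- disconnected and it is rethrown instead of re-running the action.
connectWithRetry :: Int -> (IO a -> IO a) -> IO a -> IO a
connectWithRetry maxRetries connect action = do
  started <- newIORef False
  let attempt retriesLeft = do
        result <- tryIO (connect (writeIORef started True >> action))
        case result of
          Right a -> pure a
          Left e -> do
            hasStarted <- readIORef started
            if hasStarted || retriesLeft <= 0
              then throwIO e
              else attempt (retriesLeft - 1)
  attempt maxRetries
```

The flag is needed precisely because failure to connect and disconnection cannot be told apart by exception type.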
Converting the `SimpleTx` generator to use `getSize` to be able to generate more transactions -> we see the process crashing and the error about `reqSn` not being properly implemented
Switching to upgrading dependencies, making sure we can get the latest plutus stuff from SN's branch. Plutus tests are failing and unfortunately the error message is not very informative:
[WARNING] Slot 3: W1: Validation error: Phase2 3ffcc708303460d9cb6871495ae3391ad855745bcec9d5af02c662705eb29c74: ScriptFailure (EvaluationError [])
The `Init` transaction is failing so the `commit` is failing too as it does not have the participation token to spend. Following luigi's advice in issues we raised, all the tests in the upgrade dependencies branch are now passing. The issue was a mismatch in the type of the monetary policy validator: A new parameter was added in a PR recently, like 1 month ago.
Trying to troubleshoot our `close` contract again. Removing the call to `close` endpoint still shows `collectCom` failure.
`master` branch is passing the test, so perhaps the issue comes from our types?
Trying to add small changes to the types to see if tests still pass here. Might be an issue with `INLINABLE` but this should break at compile time.
Test still fails with only a change to the types, trying to just add a simple no-arg constructor => still fails
Changing the order of ctors with a no-arg constructor passes the test, but not with `Closed Snapshot`.
There seems to be an interaction between the order of constructors and the order of the case branches in the validator ??
Adding Close/Closed to state/transition at the end of datatypes makes the test fail on `close`... which is weird
Adding `traceIfFalse` statements to check what exactly is failing (not obvious from Plutus' emulator messages) -> not very conclusive either
We should probably try with a more recent version of Plutus and check if we have same errors/better error reporting. Plutus SC is on our critical path anyway so no point in side-stepping it, but the Plutus team is drowning under pressure and deadlines. Looks like we are hitting a wall, next step is:
- upgrade dependencies and see if we can move forward
- circle back with Plutus team for some help
Moving to implementing benchmark
Goal is to have a simple benchmark, running a number of nodes and hitting those nodes with transactions through their clients. Dimensions of the bench are: number of nodes, concurrency level, also structure of transaction.
Discussing the respective merits of monotonic time, clock time, `Data.Time` or `System.time` packages...
io-sim-classes uses a `DiffTime` to represent differences and also to represent monotonic time.
Monotonic time starts at an undefined moment in the past (start of system) but is a `Word64` in Haskell => No need to care about all this right now
Got JSON output of each transaction submission time and confirmation duration, in the form of a list. Now refactoring tx submission to actually confirm all txs that are returned by the snapshot confirmed message
Got benchmark compiling and outputting the confirmation time of a single transaction, extracting txs and txid from the JSON values we get from the server. Next step is to send more transactions from a single node, then send transactions in parallel, and finally send them to several nodes.
See Miro board
Viewing Testing smart contracts by John Hughes
Trying to start again implementing a proper model for the head smart contracts, based on https://alpha.marlowe.iohkdev.io/doc/plutus/tutorials/contract-testing.html and John Hughes video. I think this should be our very first next step because it will help us get a complete picture of the smart contracts we need to implement and guide the implementation whatever form it can take, individual validators or state machine based. I want to get back to the paper and formalise the SM specification there in code.
- Defining value dimensions for a Layer 2 solution:
- Speed
- Transaction cost
- Security model, custodial vs. non-custodial, level of trust required
- Decentralisation
- Ledger capability
- Scale of participants
- ...
- Map different solutions on a spider web chart (aka. radar chart)
- Do the same thing for technical parts/components needed, defining how feasible they are:
- The dimensions are the technical components of possible solution(s)
- The scale of the dimension is the maturity level of that particular technical part
- Solutions are composed of components
- We can then relate the "desirability"/value of a solution in front of its "feasibility"/maturity
- This should be done collaboratively with various stakeholders in order to foster discussions on values, solutions, dimensions
Idea: Create a Hydra testnet with several Hydra nodes connected together, that expose an API that can be used by clients, eg. a dedicated wallet for experimenting.
Detailed notes here along with link to Miro board.
- Added notes on eltoo to Lightning network page.
- Fun fact: The LND implementation of lightning network daemon has over 1000 files of Go source code
These slides from Orfeos are pretty much useless without accompanying talk
Following links from eltoo site, I found Yet another micropayment network proposal.
- Updating dependencies in `cabal.project` to build the minimal example from 2 days ago with most recent `plutus` version
- This is non-trivial! When simply bumping all the `source-repository-package` tags to the ones plutus is using, I get a conflict when resolving dependencies
- Disabling `hydra-node` (and `local-cluster`) helps, as the conflict is somehow because of our use of `cardano-node` / `ouroboros-network` vs. its usage via plutus in the `hydra-pab` exe?
- To get the nix-shell updated, this is also involving a lot of updating sha256 sums and taking some things from `cardano-node` and others from `plutus`, e.g.
  - if we use the same `haskell.nix` rev as `plutus`, but stick with `ghc8104`, this would have the compiler be built along with ALL the packages (plutus seems to be using a custom ghc)
  - using the `haskell.nix` rev from `cardano-node` requires us to be using a slightly older hackage index-state and use `ghc8105` to get at least SOME of the dependencies and not need to compile ghc
- An alternative is to use an ad-hoc shell to get my hands on ghc and cabal, e.g. `nix-shell -p ghc -p cabal-install -p pkgconfig -p zlib`
  - the packages are not enough and we need the patched libsodium
  - so I updated `shell.nix` with a "cabal-only" / "non-haskell.nix" variant of a shell derivation accessible by `nix-shell -A cabalOnly`
- Seems like multiple things have changed in recent `plutus`
  - `MonetaryPolicy` is now named `MintingPolicy`, wrapping / compiling it is different, several constraints have been renamed, `BlockChainActions` seems to have been removed etc.
  - Instead of fixing all this, I removed non-repro code from the cabal module list
- Finally, I needed to update the `SM.hs` code to use the new `Plutus.Contract.StateMachine.getThreadToken` function and could re-run the minimal reproducer example -> `runStep` still fails with `InsufficientFunds`
- Reported this as an issue
- Added analysis of how much of the confirmed transactions are within `1`, `10` and `0.1` slots and discussed that with researchers. Here are the previously recorded results for the `1000` slot compression (`25323` slots) results with `s=10800`, with pro-active snapshotting at `p=0.8`:
txs-1000clients-25323slots-10800s-0.8p.csv
Analyze
{ numberOfConfirmedTransactions = 17999
, averageConfirmationTime = 625.6537533845784
, percentConfirmedWithin1Slot = 0.9408300461136729
, percentConfirmedWithin10Slots = 0.9408856047558197
, percentConfirmedWithinTenthOfSlot = 0.9245513639646648
}
and without pro-active snapshotting
txs-1000clients-25323slots-10800s.csv
Analyze
{ numberOfConfirmedTransactions = 19761
, averageConfirmationTime = 747.7579325380825
, percentConfirmedWithin1Slot = 0.9313293861646678
, percentConfirmedWithin10Slots = 0.9313799908911492
, percentConfirmedWithinTenthOfSlot = 0.9168058296644906
}
- I also did perform simulation of the data with `100` slot compression (`253227` simulated slots) to see how a ~3h settlement delay `s=10800` would fare
txs-1000clients-253227slots-10800s-0.8p.csv
Analyze
{ numberOfConfirmedTransactions = 46952
, averageConfirmationTime = 980.7909634163951
, percentConfirmedWithin1Slot = 0.9134861134775941
, percentConfirmedWithin10Slots = 0.9135500085193389
, percentConfirmedWithinTenthOfSlot = 0.8959362753450332
}
- Surprisingly, the performance is not as good as on the `1000` slot compression data. We would have expected that the settlement delay would fall "in between" transactions more often with a data set with less traffic. In discussions we speculated that the data might be biased due to the nature of the simulation, where all txs are "fast" until `s` and in the end all txs require snapshots (as no funds are incoming)
  - Proposed solution: Add an option to `--discard-edges` on analysis, i.e. keep it for measuring confirmation times (influence), but not calculate it into the results.
  - This required a breaking change in how the results are written to disk: Previously the txref was "node + slot + .." packed as string and lexical sorting was not honoring "when" a transaction happened. So I do record now the slot + this label as a `TxRef`, making the type effectively ordered by slot, but factually invalidating all previous results (they were not ordered), so I just added the slot as another column in the CSV to avoid confusion
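The ordering problem with string-packed references, and why putting the slot first in a record fixes it, can be shown with a tiny example. The field names and label format are made up for the illustration, only the `TxRef` name comes from the notes.

```haskell
import Data.List (sort)

-- | Slot first, then the former string label: the derived Ord instance
-- compares fields in declaration order, so values sort by slot.
data TxRef = TxRef
  { slot :: Int
  , label :: String -- hypothetical: whatever was packed in the string before
  }
  deriving (Eq, Ord, Show)

-- | Lexical sorting puts slot 10 before slot 9.
sortedLabels :: [String]
sortedLabels = sort ["n1-9", "n1-10"]

-- | Sorting TxRefs honors "when" the transaction happened.
sortedSlots :: [Int]
sortedSlots = map slot (sort [TxRef 10 "n1-10", TxRef 9 "n1-9"])
```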
- Given another simulation run on the `20000` slot compression (`1266` slots) set with `s=100`, we expect that the "edges" in the range of `s` with slots 0-100 and 1166-1266 (or even more) are biased in one way or the other (i.e. all txs until `s` are "fast"). The non-discarded results are:
λ cabal exec hydra-tail-simulation -- analyze --discard-edges 0 txs-1000clients-1266slots-100s.csv
Analyze
{ numberOfConfirmedTransactions = 37448
, averageConfirmationTime = 10.018317012686664
, percentConfirmedWithin1Slot = 0.9032258064516129
, percentConfirmedWithin10Slots = 0.9038666951506088
, percentConfirmedWithinTenthOfSlot = 0.879619739371929
}
- Discarding edges for `100` and `200` confirms that it "settles" in the "center" of the simulation run, although only slightly in this example:
λ cabal exec hydra-tail-simulation -- analyze --discard-edges 100 txs-1000clients-1266slots-100s.csv
Analyze
{ numberOfConfirmedTransactions = 30787
, averageConfirmationTime = 10.968831519346255
, percentConfirmedWithin1Slot = 0.8934940072108357
, percentConfirmedWithin10Slots = 0.8942735570208205
, percentConfirmedWithinTenthOfSlot = 0.8709845064475266
}
λ cabal exec hydra-tail-simulation -- analyze --discard-edges 200 txs-1000clients-1266slots-100s.csv
Analyze
{ numberOfConfirmedTransactions = 26721
, averageConfirmationTime = 10.80831723318664
, percentConfirmedWithin1Slot = 0.8944276037573444
, percentConfirmedWithin10Slots = 0.8953257737360129
, percentConfirmedWithinTenthOfSlot = 0.872609558025523
}
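A minimal sketch of what discarding the edges amounts to, assuming each confirmed transaction is recorded with its slot and confirmation time. The `Confirmation` record and both function names are hypothetical and not taken from `hydra-tail-simulation`; treating the boundaries as inclusive is also an assumption:

```haskell
import Data.List (genericLength)

-- Hypothetical record: the slot a transaction happened in and how long
-- it took to confirm (illustrative names, not the simulation's types).
data Confirmation = Confirmation
  { slot :: Int
  , confirmationTime :: Double
  }

-- Keep only confirmations whose slot lies at least 'edge' slots away
-- from both ends of the run [0, lastSlot] (boundaries inclusive).
discardEdges :: Int -> Int -> [Confirmation] -> [Confirmation]
discardEdges edge lastSlot =
  filter (\c -> slot c >= edge && slot c <= lastSlot - edge)

-- Average confirmation time, or Nothing if no confirmation is left.
averageConfirmationTime :: [Confirmation] -> Maybe Double
averageConfirmationTime [] = Nothing
averageConfirmationTime cs =
  Just (sum (map confirmationTime cs) / genericLength cs)
```

With `edge = 100` and `lastSlot = 1266` this keeps slots 100 to 1166, matching the ranges mentioned above.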
-
Two-party payment channels secured via hashes + time locks
-
Networking effect comes from routing payments using layered hashes/secrets (similar to Tor / onion routing)
- natural consequence: Lightning only works with fungible tokens!
- great for privacy as each party only knows previous and next hop
-
Liquidity is a big problem
- "can't receive until sent", Wallets tackle this by providing channels on-demand, e.g. Phoenix's pay-to-open
- by default, each payment channel needs to be liquid enough to forward the transaction's value -> hard to pay $1M through lightning
- spreading payments over multiple channels is getting researched (and implemented?) recently
-
Lightning nodes need to be online to be safe .. right?
- there is this game-theoretic way of punishing if peer broadcasts old commitment txs (LN-penalty)
-
Watchtowers
- it seems like these are used to allow lightning nodes to be offline for a longer time without losing much safety
- they detect and ensure that no old states are posted on chain and do even dispute with more recent states of the payment channel
- typically implemented as a third party service to which lightning nodes send encrypted data with the tx id triggering the dispute being the encryption key
-
eltoo
- a not-yet-implemented way to enforce continuity of states without incorporating all of them (?)
- drop-in replacement of the penalty mechanism
- My thoughts on this: Are watchtowers a way to make the Head protocol somewhat offline capable as well? i.e. backup multi-signed snapshots for potential contestation before you go offline .. obviously this is a trade-off for privacy (the watchtower sees all intermediate snapshots) unless we can also make this in an encrypted fashion?
-
How are non-custodial lightning wallets possible?
- They seem to have a lightning node integrated / running
- But only use a "light" bitcoin node to interact with the main chain (primarily as the storage required is huge), e.g. neutrino
-
(AB) took some notes on Lightning Network paper
- Talked to a developer of Marlowe as I found out that they are looking into "Merkelization" of the interpreter AST, which seems to be quite similar to our (researchers) ideas of using MPTs for not needing to store the whole UTxO set in the Hydra mainchain tx.
- Besides multiple organisational things, I started to look into the issue of `Plutus.Contract.Statemachine`'s `runStep` erroring with `InsufficientFunds` when using `Just threadToken`
- I started by creating a minimal reproducer example which consists of a very simple plutus state machine with two states `data State = First | Second` and a single `data Input = Step` as well as this trivial `transition oldState _ = Just (mempty, oldState{stateData = Second})`
- The following code is then printing
SMCContractError (WalletError (InsufficientFunds \\\"Total: Value (Map [(,Map [(\\\\\\\"\\\\\\\",99985508)])]) expected: Value (Map [(858535eed6775064eed795dd9261d258dd97ad51983877cc3df52e3a10ed6108,Map [(\\\\\\\"thread token\\\\\\\",1)])])\\\"))
contract :: Contract () BlockchainActions String ()
contract = do
threadToken <- mapError show Currency.createThreadToken
logInfo @String $ "Forged thread token: " <> show threadToken
let client = stateMachineClient threadToken
void $ mapSMError $ SM.runInitialise client First mempty
logInfo @String $ "Initialized state machine"
res <- mapSMError $ SM.runStep client Step
case res of
SM.TransitionFailure (SM.InvalidTransition os i) -> logInfo @String $ "Invalid transition: " <> show (os, i)
SM.TransitionSuccess s -> logInfo @String $ "Transition success: " <> show s
where
mapSMError = mapError (show @String @SM.SMContractError)
- Next: updating `cabal.project` to a newer version of `plutus` and all our transitive dependencies
Working on PAB again, trimming down what we have done so far to the bare minimum.
The goal is to have the `InitTx` transaction with PTs posted and observed from the mainchain:
- `setup` contract creates the thread token and starts the SM
- We need to have the cardano pubkeys available to post the init transaction, because we want to pay PTs to those pubkeys
- We see the transaction that creates a UTXO for token creation purpose, then the transaction that starts the state machine, with the thread token being added to the initiator's wallet.
- We got stuck with explicit `threadToken` threading, seems like it's not actually implemented right now so passing the thread token to the `mkStateMachine` function does not work and makes the transactions unbalanced.
  - https://github.com/input-output-hk/plutus/pull/3452 is the PR that introduces auto-forging of ST
- We want to reuse the same `CurrencySymbol` for the ST and the PTs but this won't be possible after that PR is merged.
- Using `Currency.forgeContract` to forge both ST and PTs but ignoring the former at the moment because it makes the transition fail.
Added wallet identifier to run `withExternalPAB` and have a test with 2 parties so that we can actually be sure the party that checks the transaction has been posted is different from the one that actually initiates the head.
To observe the init transaction being posted, we listen to outputs paid to our pubkeyhash with some arbitrary currency symbol (the "unique id" of our head) and our pubkeyhash as a token name (and amount of 1).
The fact we have a `Party` in the node and another one in the contracts code is annoying => we have the same wire format so it's fine for the moment, but a `Party` should not be tied to specific types in `Crypto` module, it should just provide material to build keys which are just bytestrings
- Produced plots for multiple scenarios with high settlement times (as started yesterday)
- Obviously we have either very fast transactions or transactions just after the settlement delay, but there is also a noticeable set of txs at `2s`
- Investigating why this could be:
  - The tail server does decide when a client `NeedSnapshot`, based on the results of `matchBlocked`
  - There are multiple `withTMVar` blocks -> suspecting a race condition
  - Created a reproducer `events.csv`:
slot,clientId,event,size,amount,recipients
0,1,pull,,,
0,2,pull,,,
0,1,new-tx,297,53964900,2
1,1,pull,,,
1,1,new-tx,297,53964900,2
2,1,pull,,,
2,1,new-tx,297,53964900,2
results in these times with `s=3600`:
txId,confirmationTime
1053964900297[2],1.1441610433e-2
1153964900297[2],7200.044700244114
1253964900297[2],1.1418798826e-2
- After adding some `trace`s and above minimal `events.csv`, it can be found that the second tx is `reenqueue`d, but the third transaction actually gets handled before, thus delaying the second tx again -> double settlement delay on the confirmation time
- Refactoring the code to not re-enqueue, but handle the message directly on `SnapshotDone` improves this particular situation, but due to concurrency in the server there is still a chance of "new" `NewTx` being handled before the "re-handled" `NewTx` and even a confirmation of around `3s` was seen
- Implementing the `--pro-active-snapshot` was not trivial due to the lack of tests, but with a small / constructed set of events and some `trace` it could be done (.. should've really written tests myself this time :/)
- Preliminary results of the 1000 compression data set (24320 compressed slots) with `s=3600` and pro-active-snapshot limit of `p=0.8` show a `242.50993413052075` slot average confirmation time, which is quite a bit better than the `313.5453135554798` slots average confirmation without the pro-active snapshotting.
- There is a different number of `confirmedTransactions` when running the same dataset just with `--pro-active-snapshot` on and off, why?
- Likely caused by the way the simulation is structured:
  - Threads are forked for running the server and client loops, each client processing `[Event]` and reacting on messages from the server (e.g. blocking the loop while doing a snapshot)
  - Each client is having a local notion of `currentSlot` and gets delayed when blocking the event processing loop
  - The simulation is stopped after a pre-calculated time, currently `lastSlot of events + 2 * settlementDelay` converted to seconds
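The stopping rule from the notes above can be written down as a small function. This is only a sketch: `simulationDuration` and its argument names are hypothetical, and the slot length is assumed to be given in seconds.

```haskell
-- Seconds after which the simulation is stopped: the last slot seen in
-- the events plus twice the settlement delay, scaled by the slot length.
-- (Hypothetical helper, not the actual hydra-tail-simulation code.)
simulationDuration
  :: Double -- ^ slot length in seconds
  -> Int    -- ^ last slot of the events
  -> Int    -- ^ settlement delay in slots
  -> Double
simulationDuration slotLength lastSlot settlementDelay =
  slotLength * fromIntegral (lastSlot + 2 * settlementDelay)
```

Since clients get delayed while blocked, a transaction that would confirm after this cut-off is simply never counted, which may be one reason for the differing `confirmedTransactions` numbers.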
Topic: Discuss Custodial Hydra Head or whatever we should rather aim for as our MVP
- Provide some context and back story to Duncan
- There is always "something custodial"
- It's not like "this" or "that"
- Custodial systems raise regulatory obligations to the operating parties
- Hydra Pay (Tail) is also in this situation, the server is also a custodian
- If you are processing other people's payments, you need to register (in many jurisdictions)
- Are we facing these issues with all of our variants?
- Lightning is not having this problem? It's on a smaller scale (really?)
- Besides "custodial issue" the tail has some more risks
- It involves creation of a client as an additional component
- Research is still in very early stages, contracts seem complex
- Faces the same problem of "finding the right server" (vs. "finding the right head")
- After having a `cardano-node` and `ogmios` server running fully synched on the main chain, I could finally use `npm run pipeline` in `hydra-sim/scripts/tail` to download blocks and construct an `events.csv`
- I opted for `npm run pipeline 1000 1000 24320` to try to re-produce the 1000 node with 1000 slot compression dataset as it is also checked in to the repo, which has events of slot `24320`
- Although I see (some of) the same events being produced, I realized that the current `cardano-node-ogmios` instance I am using is still (or again) having issues to synchronize fully:
[f30db018:cardano.node.DnsSubscription:Error:6539] [2021-07-07 07:41:28.21 UTC] Domain: "relays-new.cardano-mainnet.iohk.io" Application Exception: 18.158.202.103:3001 InvalidBlock (At (Block {blockPointSlot = SlotNo 28173266, blockPointHash = e42dacc3f7406a85b2e561fffc84118c63a5b71d05d3cb0272dbc2d11c235d2c})) (ValidationError (ExtValidationErrorLedger (HardForkLedgerErrorFromEra S (S (S (Z (WrapLedgerErr {unwrapLedgerErr = BBodyError (BlockTransitionError [LedgersFailure (LedgerFailure (DelegsFailure (WithdrawalsNotInRewardsDELEGS (fromList [(RewardAcnt {getRwdNetwork = Mainnet, getRwdCred = KeyHashObj (KeyHash "47edd4e27aa5ef468603ded3c3250b3fd53ac196d9009c3a189e3f2a")},Coin 14393994)])))),LedgersFailure (LedgerFailure (DelegsFailure (WithdrawalsNotInRewardsDELEGS (fromList [(RewardAcnt {getRwdNetwork = Mainnet, getRwdCred = KeyHashObj (KeyHash "aac051310c2760fae362766ab5e7dd27404da3f72732d68ea7ec0c2a")},Coin 1296084)])))),LedgersFailure (LedgerFailure (DelegsFailure (WithdrawalsNotInRewardsDELEGS (fromList [(RewardAcnt {getRwdNetwork = Mainnet, getRwdCred = KeyHashObj (KeyHash "e7a92b469d4af1e2b70efc3638f084757655e99a954d48aae232d488")},Coin 12093106)])))),LedgersFailure (LedgerFailure (DelegsFailure (WithdrawalsNotInRewardsDELEGS (fromList [(RewardAcnt {getRwdNetwork = Mainnet, getRwdCred = KeyHashObj (KeyHash "4ddc9a17c1e23a56f1e01718387f45e646b3bf9f83c0ba285b04e347")},Coin 1781746)])))),LedgersFailure (LedgerFailure (DelegsFailure (WithdrawalsNotInRewardsDELEGS (fromList [(RewardAcnt {getRwdNetwork = Mainnet, getRwdCred = KeyHashObj (KeyHash "ee9345b6e27716c48d68abd805aaca347ba2a65060c47f3e46904320")},Coin 1297137)]))))])})))))))
- Having another go with a `1.25.1` `cardano-node` and a separate `ogmios` instance to get it fully synchronized while I continue running tail simulations on the part of the dataset that I have
- After confirming that I have somewhat similar data, I ran the simulation `cabal exec hydra-tail-simulation run -- --payment-window 100 --settlement-delay 120 datasets/events-clients:1000-compression:20000.csv` to see whether I also get somewhat similar results as in the paper
  - I got these results and the average confirmation time seems to somewhat correspond to the graph in the T2P2 paper for 1000 nodes (12.4 seconds)
RunOptions
{ slotLength = 1 s
, paymentWindow = Just
( Ada 100 )
, settlementDelay = SlotNo 120
, verbosity = Verbose
, serverOptions = ServerOptions
{ region = LondonAWS
, concurrency = 16
, readCapacity = 102400 KBits/s
, writeCapacity = 102400 KBits/s
}
}
SimulationSummary
{ numberOfClients = 1000
, numberOfEvents = 1035738
, numberOfTransactions = NumberOfTransactions
{ total = 517869
, belowPaymentWindow = 259512
, belowHalfOfPaymentWindow = 207355
, belowTenthOfPaymentWindow = 123539
}
, averageTransaction = Ada 24
, lastSlot = SlotNo 1184
}
[...]
Analyze
{ numberOfConfirmedTransactions = 28346
, maxThroughput = 218.99746835443037
, actualThroughput = 23.92067510548523
, averageConfirmationTime = 10.123735995981168
}
- Just realized that if I run the simulations with an events file I got from MB for 1000 nodes and 20000 compression, I get the same `12.4` seconds average confirmation time.
  - I wonder though why that `events-clients_1000-compression_20000.csv` only has `519` compressed slots, while my recreated `20000` compression of a shorter block chain has `1184` compressed slots?
-
Possible next steps:
- Store all confirmation times and plot them (likely will show that the average is strongly biased by a small number of slow txs which required a snapshot)
- Increase settlement delay to 3h and re-run a simulation
- Add pro-active snapshotting when reaching a certain window limit (without lookahead)
- Only do pro-active snapshotting when sender knows not to have another tx anytime soon (> settlement delay)
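A minimal sketch of the "window limit" check from the next steps above, assuming the client's balance is tracked relative to the centre of its payment window `[-w, w]` and that a fraction `p` of `w` acts as the limit. The function name and the exact semantics of the window are assumptions, not what the simulation necessarily implements:

```haskell
-- Decide whether a client should pro-actively request a snapshot.
-- p       : fraction of the payment window used as limit (e.g. 0.8)
-- w       : payment window, i.e. the balance may move within [-w, w]
-- balance : current balance relative to the centre of the window
-- (Hypothetical helper; no lookahead into future events is done.)
needsProActiveSnapshot :: Double -> Integer -> Integer -> Bool
needsProActiveSnapshot p w balance =
  fromIntegral (abs balance) >= p * fromIntegral w
```

For example, with `p = 0.8` and `w = 100` a balance of `90` or `-85` would trigger a snapshot, while `50` would not.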
-
Using the same data as above, but with a 500 slot settlement delay, we get double the average confirmation time and about half of `confirmedTransactions`:
Analyze
{ numberOfConfirmedTransactions = 14048
, maxThroughput = 218.99746835443037
, actualThroughput = 11.854852320675105
, averageConfirmationTime = 22.912008787073507
}
- Compression 1000 and settlement delay 3600 slots (~1h):
Analyze
{ numberOfConfirmedTransactions = 22672
, maxThroughput = 11.628016940092923
, actualThroughput = 0.9321985115743596
, averageConfirmationTime = 278.11186776079404
}
- Compression 1000 and settlement delay 10800 slots (~3h):
Analyze
{ numberOfConfirmedTransactions = 12842
, maxThroughput = 11.628016940092923
, actualThroughput = 0.5280210517659636
, averageConfirmationTime = 484.0768607394298
}
Trying (again) to complete the first test of PAB, passing some parties' keys to the `Init` transaction and have it recorded and observed on the chain through smart contracts and PAB.
We are making too many shortcuts in the PAB/Main thing so things don't make sense to me...
There's confusion between the contract activation logic which basically instantiates a contract and returns a contract identifier that can later be used to invoke endpoints on it, and the actual endpoints handling. To watch transactions from the state machine one needs to run a contract that waits for state changes which requires the thread token (or a state machine client which is created through the thread token)
There is a logical problem: We cannot start the state machine until we have the `initTx` command, that's what got me confused. Also, how does the endpoint mapping work through the webserver? Seems like we are using `Builtin` and calling `init` endpoint but it does not seem to be declared anywhere => This is the case in our present incarnation, as we have not declared any endpoint so this can't possibly work.
How can I observe the state of a SM if I don't know its thread token? And then how do I know its thread token if I did not create the SM in the first place?
As shown in the `Auction` contract's test, the `buyer` needs an external way of getting the thread token to observe the SM progression:
auctionTrace1 :: Trace.EmulatorTrace ()
auctionTrace1 = do
sellerHdl <- Trace.activateContractWallet w1 seller
_ <- Trace.waitNSlots 3
currency <- extractAssetClass sellerHdl
hdl2 <- Trace.activateContractWallet w2 (buyer currency)
_ <- Trace.waitNSlots 1
Trace.callEndpoint @"bid" hdl2 trace1WinningBid
void $ Trace.waitUntilTime $ apEndTime params
void $ Trace.waitNSlots 2
In all instances of `StateMachine` I could find, this is done through forging a currency which can then be used as a unique identifier that's either used directly or as part of some larger initial state.
Here in the `TokenSale` example from `week08` of PPP, the token is part of the `TokenSale` initialiser.
tsStateMachine :: TokenSale -> StateMachine (Maybe Integer) TSRedeemer
tsStateMachine ts = mkStateMachine (Just $ tsNFT ts) (transition ts) isNothing
This implies that a node that's not initiating a head won't be able to know what the head identifier is unless it can get it through other means: Either out-of-band, through the Head members network, or by watching a specific contract's address which is only dependent on the `HeadParameters` and hence knowable by all parties. Could also be some other address where the participation tokens are forged, with the PTs being defined with a currency symbol which is exactly the thread token.
Init for SM should then:
- Create a unique thread token for the head -> this will be the head identifier
- Post a transaction with outputs containing PTs for each head participant sent to their HeadParameter's pub keys -> like what we do now
All parties observe this known address and retrieve the PT sent to them to know what is the SM thread token -> then they can start monitoring the SM and observe its state changes specifically.
We need to pass both `HydraKey` and `CardanoKey` in the `InitTx` so that listeners can retrieve Participation tokens and then the state machine's token.
Listening should happen in 2 stages:
- listen to the `InitTx` by listening to PTs being paid to one's pubkey
- listen to the state machine's changes using the PT's currency symbol as key to the state machine's instance
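The first stage can be sketched in plain Haskell, without any Plutus dependency. All types below are simplified stand-ins (strings and association lists) and `findHeadId` is a hypothetical name; the only thing taken from the notes is the rule that a PT is an output paid to our pubkey hash, under some currency symbol, with our pubkey hash as token name and an amount of 1:

```haskell
import Data.Maybe (listToMaybe)

-- Simplified stand-ins for the ledger types (assumptions for this sketch).
type PubKeyHash = String
type CurrencySymbol = String
type TokenName = String
type Value = [(CurrencySymbol, [(TokenName, Integer)])]

data TxOut = TxOut
  { paidTo :: PubKeyHash
  , outValue :: Value
  }

-- Stage 1: look for a participation token paid to us and return its
-- currency symbol, which then serves as the head identifier for stage 2.
findHeadId :: PubKeyHash -> [TxOut] -> Maybe CurrencySymbol
findHeadId me outs =
  listToMaybe
    [ cs
    | TxOut{paidTo = addr, outValue = val} <- outs
    , addr == me
    , (cs, tokens) <- val
    , not (null cs) -- skip ada, which has the empty currency symbol
    , lookup me tokens == Just 1
    ]
```

The second stage would then use the returned currency symbol to instantiate the state machine client and wait for state changes.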
-
Had multiple meetings with researchers today
-
First it was about doing some additional Tail simulations:
- Using shelley data
- Focus on (optimistic) latency and ‘window recycling’
- Do some kind of "pro-active" snapshotting when reaching a certain window watermark, e.g. 0.8[-w,w]
  - Ideally when the sender knows it is offline for at least `s` (settlement delay)
- Increase `s` to a more realistic length of ~500 slots / ~3h
- We are most interested in confirmation times (not really throughput) -> plot each tx individually as pointcloud? with time and value as axes?
-
But originally, this is motivated by "prioritization" of Hydra Tail / Head, which was then discussed in the full Research Meeting
- Pointed out that the Hydra Head is more realistic to be implemented any time soon
- Maybe "prioritization" issue is about just the wrong appearance of Tail being the only solution to (micro-)payments?
- Eventually pitched our MVP idea for a delegated hydra head
- Was somewhat well received and also incremental approach on creating this made sense to most
-
After Aggelos talked to Charles though, the "delegated hydra head" seemed to be a non-solution because of regulatory obligations being implied by it being actually a "custodial hydra head"
- In order to be able to redo some of the simulations, I started by following these instructions
- First I was using the combined Docker image for `cardano-node-ogmios` against a somewhat old `db` of a `cardano-node` and invoking `npm run pipeline` against the ogmios server running with that state
  - For this I had to add `nodejs-14_x` to the `shell.nix` of `hydra-sim`
  - Also the download first fails because of `TypeError: reader.end is not a function`, but restarting the pipeline picks up the downloaded `blocks.json`
  - Contrary to the `README`, there is a third parameter which seems to be limiting the `maxSlot` (after compression?)
- When seeing that the data is not complete (slot no is way too low), I realized my `cardano-node` had problems extending the chain
- After trying several different tags and also re-synching completely from scratch, I found that there seems to be an issue with recent `cardano-node` versions and the allegra hard fork (in retrospect)
- This was also observed by others on slack
- Synching with the `1.25.1` node (using a docker image) seems to work now
While fixing PR review's issues, I noticed one problem with the use of JUnit formatter: It does not output anything anymore, the output is sent to the XML file but not to the console which is annoying. Trying to find a way to configure format and get both outputs. Managed to combine the 2 formatters, JUnit XML file generator and console reporter. Seems like there could be a generic function to define there, something like:
both :: Monad m => (a -> m ()) -> (a -> m ()) -> (a -> m ())
both one two a = one a >> two a
This is aptly defined as tee.
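A small usage sketch of the combinator, assuming two reporters that each consume a message. `toConsole` and `toXml` are hypothetical stand-ins that write into `IORef`s; they are not the hspec formatter API:

```haskell
import Data.IORef (modifyIORef, newIORef, readIORef)

-- Run two consumers on the same input, one after the other.
both :: Monad m => (a -> m ()) -> (a -> m ()) -> (a -> m ())
both one two a = one a >> two a

-- Feed two messages through a combined reporter and return what each
-- (stand-in) reporter collected, in the order the messages were sent.
demo :: IO ([String], [String])
demo = do
  consoleLog <- newIORef []
  xmlLog <- newIORef []
  let toConsole msg = modifyIORef consoleLog (msg :)
      toXml msg = modifyIORef xmlLog (("<testcase name=\"" <> msg <> "\"/>") :)
      report = both toConsole toXml
  mapM_ report ["first", "second"]
  (,) <$> (reverse <$> readIORef consoleLog) <*> (reverse <$> readIORef xmlLog)
```

Both reporters see every message, which is the behaviour wanted from combining the console and JUnit outputs.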
Trying to understand why our `close` contract fails to validate properly, and at which points validation fails.
We get an error about some signature not being done but it seems we add the constraints and lookups that are needed. Trying to dump the generated transactions using `logInfo` to see what they look like.
- Trying to use `ownPubKey` to sign the transaction instead of the pub key we pass in the contract but to no avail, it's still failing.
- Trying to add `traceIfFalse` in the `OnChain` code => now fails in the `collectCom` transaction.
- Trying to trim down the `close` validator to bare minimum.
- Turnaround time is long: > 1 minute per compilation cycle which is horribly slow
- Why is `collectCom` sometimes failing while we are changing seemingly unrelated code?
Going through Plutus code that constructs a transaction, trying to understand where it's validated and what each part is doing.
Trying to trace the transactions that are posted. It's not possible to get the ones that are not validated by the wallet, seems like only the failing on ledger ones are dumped.
Plan for afternoon:
- Remove mock-chain -> complete PAB with lightweight contract logic so that we get a complete `OnChain` client talking to PAB
- Good to have a look at the plutus pioneer program again
Keep troubleshooting `close` contract failure: Seems pretty clear the failure is in the amounts but what's unclear is why adding traces can have side effects that make the `collectCom` transaction validation fail. Going to try to validate the transaction and then investigate the effects of traces.
Just adding the single validation `mustBeSignedByOneOf` makes the test fail at the `collectCom` call which does not really make sense.
Trying to remove `and` and the list of constraints makes the test pass!
- Adding check on amounts with `&&` operator (which is supposed to fail) makes the test fail and the error message is cryptic
- Trying to add `&& True` makes the test pass -> Seems like the problem is not the operator, but the operand?
It's hard to troubleshoot errors when one cannot print/trace values: `traceIfFalse` takes a `String` but one cannot pass `show` from values on chain apparently?
- Trying to remove check on equality of committed values/closed values which seems to break things a lot, focusing on the correct transitioning to `Closed` state
- Trying to replace the amount computed from the inputs with constant lovelace value of 1 and pass that off-chain

  With a constant `adaLovelaceValue 1` the transaction successfully completes but the test now fails with the wallet's balance not being the one expected. Alice's wallet should have changed by -1000 but it actually changed by 999 which probably comes from the fact it submitted the close transaction that only output 1 lovelace and the rest of the inputs went to Alice's wallet. Changing the test to have the transaction posted by Bob gives the same result, plus the state is not changed!!
- The `payToTheScript` constraint in off-chain correctly generates a value that contains the committed ADAs and the participation tokens. Need to add that constraint in the on-chain validator which seems to be what we were doing before but maybe not?

  Putting an incorrect value in validator's verification raises an error in the `close` contract as expected.
- Ended up submitting a Plutus Bug to try to have a better understanding of what's going on with our failures in the `close` contract's invocation.
Going to beef up Dev VM with C2 instance to have a faster CPU => does not significantly change turnaround time.
Changing the way the amount is computed in the validator changes the outcome of the test: Now I can see the transaction validation failing on-chain and the transaction is dumped to the console, which does not help that much in troubleshooting it as it's huge, but it's progress nevertheless.
It seems the transaction has no input with value, I can see only a single input which is the script address with datum and redeemer.
The inputs and outputs of the CollectCom
transaction are:
{inputs:
- 268bf918b954642de3e4a1b2d108dee48f2ed4a0f9c974b35c6291b60070ab54!1
Redeemer: <>
- 3832e5b62e1bf8df95054f42d522ec24388b407652dd8564281a30367dcac0ad!1
Redeemer: <>
- 63fcd1840fb27fa8eef570f6f8ea42f1309518c17d4e252f3ee2ddc4c4492848!1
Redeemer: <>
collateral inputs:
outputs:
- Value (Map [(,Map [("",2000)]),(1dd1049cbd0ff6c47602ac8ce76d9d0558edec1c8f417ad6fd4a2111d8b10f10,Map [(0x21fe31dfa154a261626bf854046fd2271b7bed4b6abe45aa58877ef47f9721b9,1),(0x39f713d0a644253f04529421b9f51b9b08979d08295959c4f3990ee617f5139f,1)])]) addressed to
addressed to ScriptCredential: 43071a376b64c7e305c2fedff9125e3d498f2d00ab0f486fdcaad81baa99dbe1 (no staking credential)
The I/Os for the close transaction are:
{inputs:
- 4fb4fd36491388958f3299ea539b574fdca790e8b5668599e99cac6ba5a39fac!1
Redeemer: <<1,
[<<<"2\130\148\246\255\SUB;X\235\236\191j\224\fZj\137\221\SOs i\EM\235\137\FS\199H(\192\178\178">,
<>>,
[<"", [<"", 1480>]>],
<>>,
<<<"\FS|\224\211\DC3\244\DEL\141R\144\229^i\171B\170E\DC2+E\180\186k\171\154>]T\252\171ub">,
....
outputs:
- Value (Map [(,Map [("",2000)]),(1dd1049cbd0ff6c47602ac8ce76d9d0558edec1c8f417ad6fd4a2111d8b10f10,Map [(0x21fe31dfa154a261626bf854046fd2271b7bed4b6abe45aa58877ef47f9721b9,1),(0x39f713d0a644253f04529421b9f51b9b08979d08295959c4f3990ee617f5139f,1)])]) addressed to
addressed to ScriptCredential: 43071a376b64c7e305c2fedff9125e3d498f2d00ab0f486fdcaad81baa99dbe1 (no staking credential)
Logging the `utxoAt` result in the `close` endpoint that gives the `TxOutTx` attached to the script's address:
Contract log: String "State machine UTxO: fromList [
(TxOutRef {txOutRefId = 4fb4fd36491388958f3299ea539b574fdca790e8b5668599e99cac6ba5a39fac, txOutRefIdx = 1}
,TxOutTx {txOutTxTx = Tx {
txInputs = fromList [ TxIn {txInRef = TxOutRef {txOutRefId = 268bf918b954642de3e4a1b2d108dee48f2ed4a0f9c974b35c6291b60070ab54, txOutRefIdx = 1}
, txInType = Just (ConsumeScriptAddress Validator { <script> } (Redeemer {getRedeemer = Constr 0 []}) (Datum {getDatum = Constr 0 [Constr 0 [Constr 0 [B \"9\\247\\DC3\\208\\166D%?\\EOTR\\148!\\185\\245\\ESC\\155\\b\\151\\157\\b)YY\\196\\243\\153\\SO\\230\\ETB\\245\\DC3\\159\"],Constr 1 []],List [Constr 0 [B \"\",List [Constr 0 [B \"\",I 1000]]]],Constr 1 []]}))}
,TxIn {txInRef = TxOutRef {txOutRefId = 3832e5b62e1bf8df95054f42d522ec24388b407652dd8564281a30367dcac0ad, txOutRefIdx = 1}
, txInType = Just (ConsumeScriptAddress Validator { <script> } (Redeemer {getRedeemer = Constr 0 []}) (Datum {getDatum = Constr 0 []}))}
,TxIn {txInRef = TxOutRef {txOutRefId = 63fcd1840fb27fa8eef570f6f8ea42f1309518c17d4e252f3ee2ddc4c4492848, txOutRefIdx = 0}
, txInType = Just ConsumePublicKeyAddress}
,TxIn {txInRef = TxOutRef {txOutRefId = 63fcd1840fb27fa8eef570f6f8ea42f1309518c17d4e252f3ee2ddc4c4492848, txOutRefIdx = 1}
, txInType = Just (ConsumeScriptAddress Validator { <script> } (Redeemer {getRedeemer = Constr 0 []}) (Datum {getDatum = Constr 0 [Constr 0 [Constr 0 [B \"!\\254\\&1\\223\\161T\\162abk\\248T\\EOTo\\210'\\ESC{\\237Kj\\190E\\170X\\135~\\244\\DEL\\151!\\185\"],Constr 1 []],List [Constr 0 [B \"\",List [Constr 0 [B \"\",I 1000]]]],Constr 1 []]}))}]
, txCollateral = fromList [TxIn {txInRef = TxOutRef {txOutRefId = 63fcd1840fb27fa8eef570f6f8ea42f1309518c17d4e252f3ee2ddc4c4492848, txOutRefIdx = 0}, txInType = Just ConsumePublicKeyAddress}]
, txOutputs = [
TxOut {txOutAddress = Address {addressCredential = PubKeyCredential 21fe31dfa154a261626bf854046fd2271b7bed4b6abe45aa58877ef47f9721b9
, addressStakingCredential = Nothing}
, txOutValue = Value (Map [(,Map [(\"\",99915670)])])
, txOutDatumHash = Nothing}
, TxOut {txOutAddress = Address {addressCredential = ScriptCredential 43071a376b64c7e305c2fedff9125e3d498f2d00ab0f486fdcaad81baa99dbe1, addressStakingCredential = Nothing}
, txOutValue = Value (Map [(,Map [(\"\",2000)]),(1dd1049cbd0ff6c47602ac8ce76d9d0558edec1c8f417ad6fd4a2111d8b10f10,Map [(0x21fe31dfa154a261626bf854046fd2271b7bed4b6abe45aa58877ef47f9721b9,1),(0x39f713d0a644253f04529421b9f51b9b08979d08295959c4f3990ee617f5139f,1)])])
, txOutDatumHash = Just 2b0b15c43ac83cb6d7a68f7ed516e3017964b30f44d2d828408dd9559c8df82d
}]
, txForge = Value (Map [])
, txFee = Value (Map [(,Map [(\"\",52304)])])
, txValidRange = Interval {ivFrom = LowerBound NegInf True, ivTo = UpperBound PosInf True}
, txForgeScripts = fromList []
, txSignatures = fromList [(d75a980182b10ab7d54bfed3c964073a0ee172f3daa62325af021a68f707511a,2dba3d2cc78c83aef5e080be8dbf85645f90a44edf596913abe466b8cd0634a4250239a127792629702cd8cc4178360999699590e05b38ad2cee9eed12d9bb01)]
, txData = fromList [(2b0b15c43ac83cb6d7a68f7ed516e3017964b30f44d2d828408dd9559c8df82d,Datum {getDatum = Constr 1 ...[]]]]})
,(2cdb268baecefad822e5712f9e690e1787f186f5c84c343ffdc060b21f0241e0,Datum {getDatum = Constr 0 []}),(d38a1142ade90b55793912774ec6b633b03b810ce2f7513b9776d628a5387aa5,Datum {getDatum = Constr 0 [....]]})
,(f37dfa2dac3e68fad98162f5fe2db3ea5e253dccad695ba540b16cbcdc486ece,Datum {getDatum = Constr 0...]})]}
, txOutTxOut = TxOut {txOutAddress = Address {addressCredential = ScriptCredential 43071a376b64c7e305c2fedff9125e3d498f2d00ab0f486fdcaad81baa99dbe1
, addressStakingCredential = Nothing}
, txOutValue = Value (Map [(,Map [(\"\",2000)]),(1dd1049cbd0ff6c47602ac8ce76d9d0558edec1c8f417ad6fd4a2111d8b10f10,Map [(0x21fe31dfa154a261626bf854046fd2271b7bed4b6abe45aa58877ef47f9721b9,1),(0x39f713d0a644253f04529421b9f51b9b08979d08295959c4f3990ee617f5139f,1)])])
, txOutDatumHash = Just 2b0b15c43ac83cb6d7a68f7ed516e3017964b30f44d2d828408dd9559c8df82d}})]"
Submitted TX in the close:
Tx {
txInputs = fromList [ TxIn {txInRef = TxOutRef {txOutRefId = 4fb4fd36491388958f3299ea539b574fdca790e8b5668599e99cac6ba5a39fac, txOutRefIdx = 0}
, txInType = Just ConsumePublicKeyAddress}
, TxIn { txInRef = TxOutRef {txOutRefId = 4fb4fd36491388958f3299ea539b574fdca790e8b5668599e99cac6ba5a39fac, txOutRefIdx = 1}
, txInType = Just (ConsumeScriptAddress Validator { <script> } (Redeemer {getRedeemer = Constr 1 [Constr...}))}]
, txCollateral = fromList [ TxIn {txInRef = TxOutRef {txOutRefId = 4fb4fd36491388958f3299ea539b574fdca790e8b5668599e99cac6ba5a39fac, txOutRefIdx = 0}
, txInType = Just ConsumePublicKeyAddress}]
, txOutputs = [ TxOut {txOutAddress = Address {addressCredential = PubKeyCredential 21fe31dfa154a261626bf854046fd2271b7bed4b6abe45aa58877ef47f9721b9, addressStakingCredential = Nothing}
, txOutValue = Value (Map [(,Map [(\"\",99894975)])])
, txOutDatumHash = Nothing}
,TxOut {txOutAddress = Address {addressCredential = ScriptCredential 43071a376b64c7e305c2fedff9125e3d498f2d00ab0f486fdcaad81baa99dbe1, addressStakingCredential = Nothing}
, txOutValue = Value (Map [(,Map [(\"\",2000)]),(1dd1049cbd0ff6c47602ac8ce76d9d0558edec1c8f417ad6fd4a2111d8b10f10,Map [(0x21fe31dfa154a261626bf854046fd2271b7bed4b6abe45aa58877ef47f9721b9,1),(0x39f713d0a644253f04529421b9f51b9b08979d08295959c4f3990ee617f5139f,1)])])
, txOutDatumHash = Just 98f5b7eed56b55ca67fb14a2f90708dc7e4939bdc424af280ad934d8343388fb}]
, txForge = Value (Map [])
, txFee = Value (Map [(,Map [(\"\",20695)])])
, txValidRange = Interval {ivFrom = LowerBound NegInf True, ivTo = UpperBound PosInf True}
, txForgeScripts = fromList []
, txSignatures = fromList [(d75a980182b10ab7d54bfed3c964073a0ee172f3daa62325af021a68f707511a,bd8f627fef117528a32c8a48c0fa7992e2bbd03fee8b219c72b3a1f95ea8ec97140875009f4bfd7f1e9da7f263c619ff556a202a1d0d2cc9173b10a2445b8b01)]
, txData = fromList [(2b0b15c43ac83cb6d7a68f7ed516e3017964b30f44d2d828408dd9559c8df82d
,Datum {getDatum = Constr 1 [...]]})
,(98f5b7eed56b55ca67fb14a2f90708dc7e4939bdc424af280ad934d8343388fb
,Datum {getDatum = Constr 2 [Constr 0 [...]})]}"
Transaction seems correct, though!
Deciding what to do next in the aftermath of first milestone meeting and update:
- Protocol is still not complete: there are a bunch of TODOs and Contest is not implemented at all. This is rather straightforward, so better to do it in solo mode
- PAB integration: we'll wait for SN to do this together
- Continuing smart contracts: We are not handling all transitions in the Head SCs and there's still a failing test (commented out)
Fixing commented test in `ContractTest`: adding a `tryCallEndpoint` that returns something if an error is thrown within the contract endpoint, so that we can assert on the return value. However, there is an `assertContractError` function that asserts a predicate over a `ContractError` supposedly thrown by a contract's instances.
Implemented basic endpoints logic for close:
- Add a `Snapshot` type containing a list of UTxO and a snapshot number. We should add the multisig later.
- Add `OnChain` validator: It's pretty straightforward as there's not much to check.
- Add `OffChain` code to submit transaction for the close -> test fails with a mysterious `[WARNING] Slot 7: 00000000-0000-4000-8000-000000000000 {Contract instance for wallet 1}: Contract instance stopped with error: WalletError (ValidationError (ScriptFailure (EvaluationError ["Missing signature","checkScriptContext failed"])))`
Note: In the OCV algorithms, there's no mention of checking equality between the amount(s) initially committed and the amount of each transition in the SM, nor with the UTxO decommitted. This is implicit in the fact that the committed snapshot is valid and signed, hence has been produced by a valid ledger, yet it would probably be better to check it in the OCV?
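To make the question above more concrete, here is a minimal sketch of what an explicit value-preservation check could look like. The `UTxO` and `Snapshot` types are deliberately simplified stand-ins (an identifier plus a single integer amount, no multisig), and `totalValue` and `preservesValue` are hypothetical names, not the actual on-chain code.

```haskell
-- Hypothetical, simplified UTxO: a reference and a single lovelace amount.
data UTxO = UTxO
  { utxoRef :: String
  , utxoValue :: Integer
  }
  deriving (Eq, Show)

-- A snapshot as described in this entry: a snapshot number and a list of
-- UTxO. The multisignature is left out, as in the notes.
data Snapshot = Snapshot
  { snapshotNumber :: Integer
  , snapshotUTxO :: [UTxO]
  }
  deriving (Eq, Show)

-- Sum of all amounts held in a list of UTxO.
totalValue :: [UTxO] -> Integer
totalValue = sum . map utxoValue

-- The explicit check under discussion: the value carried by the snapshot
-- used to close must equal the value initially committed.
preservesValue :: [UTxO] -> Snapshot -> Bool
preservesValue committed snapshot =
  totalValue committed == totalValue (snapshotUTxO snapshot)
```

With this sketch, a snapshot that redistributes the committed amounts across different UTxO passes the check, while one that creates or loses value does not. A real validator would have to compare multi-asset values rather than a single integer.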
Bandwidth is fixed to 2000MB/s, all nodes are colocated in same DC, transactions are assumed to be always non-conflicting.
Nr. nodes | concurrency | tps (snapshot) | snap size |
---|---|---|---|
20 | 10 | 685 | 100 |
20 | 1 | 259 | 10 |
50 | 10 | 709 | 250 |
50 | 1 | 296 | 25 |
100 | 10 | 717 | 500 |
100 | 1 | 314 | 50 |
To compare with Simple Protocol's results.
We should make an ADR for hiding technical layers behind modules, e.g. `Hydra.Network` encapsulates and re-exports everything network-related.
What happens when CollectComTx and AbortTx happen concurrently?
- We should not observe both coming back from the chain, but this assumes the chain is safe
- We have this property stating we can receive messages in any state, but it's probably wrong
- We should guard the `OnChainEvent` handlers too with the state
We should take care of mainchain rollbacks at some point:
- Chain can be rolled back up to 36 hours in the past
- This means our whole state could disappear, with the rug pulled under our feet while we run the head
- => we need to wait for opening the head until we get sufficient confidence it cannot be rolled back?
- Also relevant for contestation => if you have enough stake you could succeed in forcing a rollback which means you could cheat...
- => delays full finalization even further
- Heads us in the direction of long-running heads, with incremental commits/decommits (with same problem of rollbacks)
- Ouroboros Genesis should solve part of the issue
There's a tradeoff between acceptable risk and head duration -> TINSTAAFL
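The tradeoff can be illustrated with a small sketch: a participant picks how deep a transaction must be buried before treating it as final. Everything here is an assumption made for illustration: `Slot`, `WaitPolicy`, `fullySafe` and `settled` are hypothetical names, and the conversion of 36 hours into slots assumes one slot per second.

```haskell
type Slot = Integer

-- Worst-case rollback window from the notes above (36 hours), in slots.
-- Assumption: one slot lasts one second.
maxRollbackSlots :: Slot
maxRollbackSlots = 36 * 60 * 60

-- Hypothetical policy: how many slots we require on top of a transaction
-- before we consider it final. Smaller values mean more risk, less waiting.
newtype WaitPolicy = WaitPolicy {requiredDepth :: Slot}

-- Waiting for the full rollback window removes the risk entirely.
fullySafe :: WaitPolicy
fullySafe = WaitPolicy maxRollbackSlots

-- True once the transaction observed at 'txSlot' is at least
-- 'requiredDepth' slots behind the current chain tip.
settled :: WaitPolicy -> Slot -> Slot -> Bool
settled policy tip txSlot = tip - txSlot >= requiredDepth policy
```

Under `fullySafe`, opening the head and finalizing a contestation each cost an extra 36 hours, which is what pushes towards long-running heads. A shorter `requiredDepth` shortens the wait at the price of accepting some rollback risk.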
There is an error in our code: we transition `Abort` to `ClosedState`, which is wrong, but we cannot really observe that.
- The only way is to make sure there are some actions we can or cannot do. Also, states naming is not consistent with the paper -> renaming `InitState` to `ReadyState`
- Also, there is no need for a `FinalState` as there's only one head. When we `Abort` or finalize the head, we move back to `ReadyState`, which means we can test the correctness of the abort by sending `Init` again.
- Test fails because we had some uniqueness requirement on the txs in mocked chain in `BehaviorSpec`, so sending `InitTx [1,2]` twice fails -> remove the uniqueness requirements
Removing our property stating we should handle on-chain transactions in all states -> it's not true anymore. Then adding some unit tests in HeadLogicSpec to assert `CollectComTx` and `AbortTx` are exclusive of each other.