Logbook 2023 H1
- Started looking into this since our release will be out today and I can finally work on something new.
- The external commits feature seems to have taken forever; we really need to practise making a bunch of smaller PRs instead of one big one that is hard to review and work on.
- Anyway, this issue is about signing network messages (`ReqTx`, `ReqSn` and `AckSn`) so that a party knows they came from a trusted source.
- I would like to start by writing a test, but I probably need to read a bit more about what we are trying to do so that I don't miss the scope of this task.
- It seems we will need a `SignableRepresentation` instance for `Tx`, since this type is present in the network messages we are sending, so let's start there.
- Actually, we might want to sign the complete message, so the instance should be for the `Message` type? I will try to implement this instance and see how it goes.
- Okay, writing an instance for the `Message` type didn't seem hard at all, but now I need to check that it is doing the correct thing. I'll write a test that signs some network messages and tries to verify the signature.
- For now I don't need to add the signed contents field to the `Message` constructors, since this test can be a really low-level one.
- As we are opening a head on `preview` right now, I was trying to use the new external commits feature for it.
- The output of `cardano-cli query utxo --testnet-magic 2 --tx-in 8da5f321ea588edffe10e5d905e098fe241f93bb615e672e19513be615530c91#3 --out-file /dev/stdout` should be usable for the request body directly. That is: just a `UTxO` object without the wrapping object.
- Using the latest `master` makes things a bit simpler as there is no "tag" on the request payload.
- Trying to use Eternl as the external wallet:
  - As I want to use the Eternl wallet key to commit from external, I switched it to "single address mode" to make the address/key setup simpler and then used the single receive address to query the UTxOs.
  - Eternl seemingly uses the `TextEnvelope` format for exporting/importing transactions.
  - Although the Eternl import works after crafting such an envelope JSON, it shows an error about TTL ("Unable to submit! Transaction life time has expired. Please recreate the transaction with a new expiration date (TTL)."). Seemingly they don't allow importing non-timed transactions. Too bad. Maybe we can just add an upper validity bound of 2-3 hours in the future to satisfy that?
  - Using the BIP39 seed / extended key directly is not straightforward. At least I didn't know how to convert it into something that `cardano-cli` can use (to sign and submit the tx that way).
- Using a separate key pair to send funds to first and then commit from there worked just fine, as we demonstrated in the monthly.
AB + PG on #904
- We wrote a test on Thursday that highlights the fact that, under the proposed change to only circulate txIds in snapshots and wait for those transactions to have been seen by the node in order to be able to resolve them, conflicting txs (e.g. transactions spending the same UTxO) could easily lead to a deadlock of the Head
- We discussed the issue throughout the day and realised this was already discussed a few months ago in the spec and with researchers: being able to resolve any transaction proposed by the leader in a `ReqSn` requires maintaining a proper map of all transactions ever seen by the node, and not only the ones that apply cleanly on the current state (`seenUTxO`).
- This does not solve the problem of the leader being able to front-run transactions proposed by other nodes, which was somehow the intention behind this development
- This problem could be "solved" by assuming all submitted transactions must be non-conflicting, which is the initial formulation in the paper, and accepting that the only way out of such a situation would be to close the head
- Or we could solve it by preventing conflicting transactions altogether by serialising all transactions and snapshots
- We decided to make the failing test pass first, before discussing its implications in more detail
- In doing so, we realised we needed to refactor the `Outcome` and how it's handled, because we need to be able to have both a `NewState` and a `Wait` outcome.
  - We introduced a `Combined` outcome constructor which offers the nice property of simplifying `NewState`
- Then we made the test pass by introducing an `allTxs` map in the `CoordinatedHeadState` that indexes all transactions received through a `ReqTx`.
- We still need to clean up the map...
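To make the `allTxs` idea above more concrete, here is a minimal sketch, not the actual hydra-node code. `Tx`, `TxId`, `AllTxs`, `recordTx`, `onReqTx` and `resolveTxs` are hypothetical names, transactions are reduced to an id plus the outputs they spend, and `Outcome` only has the three constructors mentioned above.

```haskell
import Data.Map.Strict (Map)
import qualified Data.Map.Strict as Map

-- Toy stand-ins (assumptions): a transaction is an id plus the outputs it spends.
type TxId = Int

data Tx = Tx {txId :: TxId, inputs :: [Int]}
  deriving (Eq, Show)

-- | Every transaction received through a ReqTx, whether or not it applies.
type AllTxs = Map TxId Tx

-- | Simplified outcome type with only the constructors discussed above.
data Outcome
  = NewState AllTxs
  | Wait TxId
  | Combined Outcome Outcome
  deriving (Eq, Show)

recordTx :: Tx -> AllTxs -> AllTxs
recordTx tx = Map.insert (txId tx) tx

-- | Handle a ReqTx: always remember the transaction, and additionally wait
-- when some of its inputs are not in the current UTxO set.
onReqTx :: [Int] -> AllTxs -> Tx -> Outcome
onReqTx utxo allTxs tx
  | all (`elem` utxo) (inputs tx) = NewState allTxs'
  | otherwise = Combined (NewState allTxs') (Wait (txId tx))
 where
  allTxs' = recordTx tx allTxs

-- | Resolve the ids listed in a snapshot request. 'Left' carries the ids that
-- have not been seen yet, 'Right' the transactions in the requested order.
resolveTxs :: AllTxs -> [TxId] -> Either [TxId] [Tx]
resolveTxs allTxs ids =
  case filter (`Map.notMember` allTxs) ids of
    [] -> Right [tx | i <- ids, Just tx <- [Map.lookup i allTxs]]
    missing -> Left missing
```

In this sketch two conflicting transactions (both spending the same output) are both kept in the map, so a snapshot request naming either id can still be resolved. Nothing is ever removed from the map, which is the clean-up that is still missing.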
- Should we try to compute costs for script utxos too?
- Not in scope of the external commit work
- Good idea, maybe for some well-known (always true) scripts
- Maybe not so useful? Scripts are so different.
- If we do this, we should keep the results separate from the more comprehensible ADA-only results.
- If we do this, we could/should also do multi-asset outputs (less variable than scripts)
- Maybe add a github discussion (idea) on this
- What is going to happen with CommitTx post tx? Are we going to deprecate it or expand it to allow committing scripts?
- Planned to be deprecated in 0.12.0 right now
- Should we deprecate code? No, it would raise warnings and we don't publish a library (right now)
- Be very clear on Release notes and API reference though!
- This repository holds some latex markup about tracking all transactions on L2 somewhere in history: https://github.com/ch1bo/hydra-spec
- Defining `Arbitrary` instances for newtypes can lead to more expressive code, because the name of the newtype makes it clearer what we are generating.
- Never create an `Arbitrary ByteString` instance, as it could lead to confusion. Orphans are never okay (only sometimes acceptable) and should be avoided. `ByteString` is such a ubiquitous type that having a "random" instance in scope is annoying. It's good to have `Arbitrary` instances for our own types because that eases writing properties. Not so much for "generic" types.
- If you define an `Arbitrary` instance for a recursive data structure, beware: `roundtripAndGoldenSpecs` may never end and take all your memory. Moreover, even if you kill the terminal running the spec, a process by the name of `test` will continue running in the background until it takes all your swap. To avoid this, when defining the recursive `Arbitrary` instance we need to create the recursive generator using:
    sized $ \n -> do
      k <- choose (0, n)
      -- Use `resize k` or `resize (n - 1)` to build the recursive data structure in a sized, arbitrary way.
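A fuller version of the fragment above, as a minimal sketch using plain QuickCheck. The `Tree` type, `genTree` and `nodes` are made up for illustration and do not come from the Hydra code base. The point is that every recursive call runs at a strictly smaller size, so generation terminates and the number of `Node` constructors is bounded by the size parameter.

```haskell
import Test.QuickCheck

-- Hypothetical recursive type, only for illustration.
data Tree = Leaf Int | Node Tree Tree
  deriving (Eq, Show)

instance Arbitrary Tree where
  arbitrary = sized genTree

-- | Size-bounded generator: at size 0 only leaves are produced, otherwise the
-- remaining budget (n - 1) is split between the two subtrees.
genTree :: Int -> Gen Tree
genTree n
  | n <= 0 = Leaf <$> arbitrary
  | otherwise = do
      k <- choose (0, n - 1)
      oneof
        [ Leaf <$> arbitrary
        , Node <$> resize k arbitrary <*> resize (n - 1 - k) arbitrary
        ]

-- | Number of 'Node' constructors, at most the size the tree was generated with.
nodes :: Tree -> Int
nodes (Leaf _) = 0
nodes (Node l r) = 1 + nodes l + nodes r
```

With an instance written this way, specs that generate many values get bounded inputs: `resize 20 arbitrary :: Gen Tree` never yields a tree with more than 20 `Node`s.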
- Yesterday I ran into a problem: we can't just issue an HTTP request from the TUI client code since the transaction type is abstract there (`UTxOType tx`), so I need to find a workaround.
- So after banging my head for 30 minutes I decided to call my superhero colleague AB and see if he could help out. Our problem, which we have run into a couple of times, is that GHC can't tell whether the `tx` in `IsTx tx` is the same one as in `UTxOType tx`. When we call functions that are abstract over the `tx` type variable we need to fix the type to be either `@Tx` or `@SimpleTx`. In this case GHC could not unify the `tx` type and I would get errors similar to those seen when `ScopedTypeVariables` is not enabled (`could not match the type tx with tx0...`).
- One solution is to add an `IsTx tx` constraint to all the functions needed and fix the type to `@tx` at the call site (note the lowercase `tx` for the type variable).
- Another, more general one is to make the `UTxOType tx` type family injective, which is exactly what we did.
- All code compiles, and I am left, as always, with this weird feeling: I like to play around with abstract stuff like this in Haskell, but is it worth it in the long run? IMO if the code is just simple, plain value-level Haskell then you can train or hire people more easily, whereas abstract code is hard to follow and maintain.
- Anyway I am happy to have this thing working now.
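The effect of making the type family injective can be sketched as follows. This only illustrates the general GHC mechanism (`TypeFamilyDependencies`) and is not the definition used in hydra-node: `SimpleUTxO`, `CardanoUTxO` and `describeUTxO` are hypothetical, and `SimpleTx` and `Tx` are empty stand-in types.

```haskell
{-# LANGUAGE TypeFamilyDependencies #-}

-- Hypothetical stand-ins for the two transaction types and their UTxO types.
data SimpleTx
data Tx

newtype SimpleUTxO = SimpleUTxO [Int]
newtype CardanoUTxO = CardanoUTxO [String]

-- | The annotation @utxo -> tx@ declares the family injective: knowing the
-- UTxO type determines the transaction type.
type family UTxOType tx = utxo | utxo -> tx

type instance UTxOType SimpleTx = SimpleUTxO
type instance UTxOType Tx = CardanoUTxO

class IsTx tx where
  -- 'tx' only occurs under the type family here. Without the injectivity
  -- annotation GHC rejects this signature as ambiguous (unless
  -- AllowAmbiguousTypes is enabled).
  describeUTxO :: UTxOType tx -> String

instance IsTx SimpleTx where
  describeUTxO (SimpleUTxO outs) = "simple utxo with " ++ show (length outs) ++ " outputs"

instance IsTx Tx where
  describeUTxO (CardanoUTxO outs) = "cardano utxo with " ++ show (length outs) ++ " outputs"

-- No type application needed: GHC infers tx ~ SimpleTx from the argument type.
example :: String
example = describeUTxO (SimpleUTxO [1, 2, 3])
```

The trade-off is that an injective family forbids two transaction types from sharing the same UTxO type, which is why the sketch uses a distinct newtype per instance.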
- External commit using script utxo PR has become hard to review: let's meet this afternoon to walk through the changes
- This is one of the outstanding tasks to get Commit from external wallet done.
- I removed the `c` command for committing, but we have one test that wants to do a full head lifecycle, so I either come up with a workaround or just nuke the test, since these tests are flaky anyway.
- Ok, turns out I misunderstood what the task is about. I should be supporting the new way of committing in the TUI instead of removing it from the TUI.
- Back to the drawing board: we use `ClientInput` messages from the TUI and they are sent over websockets. I need to add support for a REST HTTP call to get a draft commit transaction and sign it.
This run failed for an obscure reason and passed when run again: https://github.com/input-output-hk/hydra/actions/runs/5343637049/jobs/9687064065?pr=923
To rerun use: --match "/Hydra.Logging/HydraLog/" --seed 1184570036
- We should open a head on preview with external commits before the release
- We have not dogfooded it yet
When running `cabal haddock hydra-node` on an already built hydra-node, it needs to rebuild 369 dependencies which have already been compiled, so it takes forever:
cabal haddock hydra-node
Build profile: -w ghc-8.10.7 -O1
In order, the following will be built (use -v for more details):
- Only-0.1 (lib) (requires build)
- PyF-0.11.1.1 (lib) (requires build)
- SHA-1.6.4.4 (lib) (requires build)
- StateVar-1.2.2 (lib) (requires build)
- Win32-network-0.1.1.1 (lib) (requires build)
- algebraic-graphs-0.7 (lib) (requires build)
- appar-0.1.8 (lib:appar) (requires build)
- attoparsec-0.14.4 (lib:attoparsec-internal) (requires build)
- auto-update-0.1.6 (lib) (requires build)
- base-compat-0.12.2 (lib) (requires build)
- base-deriving-via-0.1.0.1 (lib) (requires build)
- base-orphans-0.9.0 (lib) (requires build)
- base16-bytestring-1.0.2.0 (lib) (requires build)
- base58-bytestring-0.1.0 (lib) (requires build)
- base64-bytestring-1.2.1.0 (lib) (requires build)
- basement-0.0.15 (lib) (requires build)
- bimap-0.4.0 (lib) (requires build)
- blaze-builder-0.4.2.2 (lib) (requires build)
- bsb-http-chunked-0.0.0.4 (lib) (requires build)
- byteorder-1.0.4 (lib:byteorder) (requires build)
- bytestring-builder-0.10.8.2.0 (lib) (requires build)
- cabal-doctest-1.0.9 (lib) (requires build)
- call-stack-0.4.0 (lib) (requires build)
- canonical-json-0.6.0.1 (lib) (requires build)
- cereal-0.5.8.3 (lib) (requires build)
- clock-0.8.3 (lib) (requires build)
- colour-2.3.6 (lib) (requires build)
- composition-prelude-3.0.0.2 (lib) (requires build)
- contra-tracer-0.1.0.1 (lib) (requires build)
- data-array-byte-0.1.0.1 (lib) (requires build)
- data-default-class-0.1.2.0 (lib:data-default-class) (requires build)
- digest-0.0.1.7 (lib) (requires build)
- dlist-1.0 (lib) (requires build)
- dom-lt-0.2.3 (lib) (requires build)
- double-conversion-2.0.4.2 (lib) (requires build)
- entropy-0.4.1.10 (lib:entropy) (requires build)
- erf-2.0.0.0 (lib:erf) (requires build)
- filelock-0.1.1.6 (lib) (requires build)
- fingertree-0.1.5.0 (lib) (requires build)
- fmlist-0.9.4 (lib) (requires build)
- generic-monoid-0.1.0.1 (lib) (requires build)
- generically-0.1.1 (lib) (requires build)
- ghc-paths-0.1.0.12 (lib:ghc-paths) (requires build)
- gray-code-0.3.1 (lib) (requires build)
- groups-0.5.3 (lib) (requires build)
- half-0.3.1 (lib) (requires build)
- happy-1.20.1.1 (exe:happy) (requires build)
- haskell-lexer-1.1.1 (lib) (requires build)
- hostname-1.0 (lib:hostname) (requires build)
- hourglass-0.2.12 (lib) (requires build)
- hsc2hs-0.68.9 (exe:hsc2hs) (requires build)
- hspec-discover-2.10.10 (lib) (requires build)
- indexed-traversable-0.1.2.1 (lib) (requires build)
- int-cast-0.2.0.0.0.0.0.0.1 (lib) (requires build)
- integer-logarithms-1.0.3.1 (lib) (requires build)
- logict-0.8.0.0 (lib) (requires build)
- microlens-0.4.13.1 (lib) (requires build)
- mime-types-0.1.1.0 (lib) (requires build)
- monoidal-synchronisation-0.1.0.2 (lib) (requires build)
- mtl-compat-0.2.2 (lib) (requires build)
- multiset-0.3.4.3 (lib) (requires build)
- network-byte-order-0.1.6 (lib) (requires build)
- newtype-0.2.2.0 (lib) (requires build)
- non-integral-1.0.0.0 (lib) (requires build)
- old-locale-1.0.0.7 (lib) (requires build)
- parallel-3.2.2.0 (lib) (requires build)
- parser-combinators-1.3.0 (lib) (requires build)
- partial-order-0.2.0.0 (lib) (requires build)
- prettyprinter-1.7.1 (lib) (requires build)
- quiet-0.2 (lib) (requires build)
- readable-0.3.1 (lib) (requires build)
- reflection-2.1.7 (lib) (requires build)
- regex-base-0.94.0.2 (lib) (requires build)
- safe-0.3.19 (lib) (requires build)
- safe-exceptions-0.1.7.3 (lib) (requires build)
- selective-0.6 (lib) (requires build)
- semigroups-0.20 (lib) (requires build)
- setenv-0.1.1.3 (lib) (requires build)
- some-1.0.4.1 (lib) (requires build)
- sop-core-0.5.0.2 (lib) (requires build)
- split-0.2.3.5 (lib) (requires build)
- splitmix-0.1.0.4 (lib) (requires build)
- string-conv-0.2.0 (lib) (requires build)
- syb-0.7.2.3 (lib) (requires build)
- tagged-0.8.7 (lib) (requires build)
- th-abstraction-0.4.5.0 (lib) (requires build)
- th-compat-0.1.4 (lib) (requires build)
- time-units-1.0.0 (lib:time-units) (requires build)
- transformers-compat-0.6.6 (lib) (requires build)
- transformers-except-0.1.3 (lib) (requires build)
- type-equality-1 (lib) (requires build)
- unbounded-delays-0.1.1.1 (lib) (requires build)
- unix-bytestring-0.3.7.8 (lib) (requires build)
- unix-compat-0.7 (lib) (requires build)
- unliftio-core-0.2.1.0 (lib) (requires build)
- utf8-string-1.0.2 (lib) (requires build)
- void-0.7.3 (lib) (requires build)
- wl-pprint-annotated-0.1.0.1 (lib) (requires build)
- word8-0.1.3 (lib) (requires build)
- zlib-0.6.3.0 (lib) (requires build)
- contravariant-1.5.5 (lib) (requires build)
- time-manager-0.0.0 (lib) (requires build)
- gitrev-1.3.1 (lib) (requires build)
- measures-0.1.0.1 (lib) (requires build)
- memory-0.18.0 (lib) (requires build)
- foundation-0.0.29 (lib) (requires build)
- HUnit-1.6.2.0 (lib) (requires build)
- extra-1.7.13 (lib) (requires build)
- ansi-terminal-types-0.11.5 (lib) (requires build)
- primitive-0.8.0.0 (lib) (requires build)
- hashable-1.4.2.0 (lib) (requires build)
- data-default-instances-containers-0.0.1 (lib:data-default-instances-containers) (requires build)
- cookie-0.4.6 (lib) (requires build)
- data-default-instances-dlist-0.0.1 (lib:data-default-instances-dlist) (requires build)
- pretty-show-1.10 (lib) (requires build)
- terminal-size-0.3.4 (lib) (requires build)
- network-3.1.2.8 (lib:network) (requires build)
- code-page-0.2.1 (lib) (requires build)
- old-time-1.1.0.3 (lib:old-time) (requires build)
- data-default-instances-old-locale-0.0.1 (lib:data-default-instances-old-locale) (requires build)
- regex-posix-0.96.0.1 (lib) (requires build)
- tracer-transformers-0.1.0.2 (lib) (requires build)
- validation-selective-0.2.0.0 (lib) (requires build)
- random-1.2.1.1 (lib) (requires build)
- distributive-0.6.2.1 (lib) (requires build)
- boring-0.2.1 (lib) (requires build)
- th-lift-0.8.3 (lib) (requires build)
- th-extras-0.0.0.6 (lib) (requires build)
- th-expand-syns-0.4.11.0 (lib) (requires build)
- microlens-th-0.4.3.12 (lib) (requires build)
- generic-deriving-1.14.3 (lib) (requires build)
- th-env-0.1.1 (lib) (requires build)
- network-uri-2.6.4.2 (lib) (requires build)
- transformers-base-0.4.6 (lib) (requires build)
- mmorph-1.2.0 (lib) (requires build)
- microlens-mtl-0.2.0.3 (lib) (requires build)
- deriving-compat-0.6.3 (lib) (requires build)
- zlib-bindings-0.1.1.5 (lib) (requires build)
- pem-0.2.4 (lib) (requires build)
- cryptonite-0.30 (lib) (requires build)
- asn1-types-0.3.4 (lib:asn1-types) (requires build)
- hspec-expectations-0.8.2 (lib) (requires build)
- bech32-1.1.2 (lib) (requires build)
- ansi-terminal-0.11.5 (lib) (requires build)
- vector-0.12.3.1 (lib) (requires build)
- typerep-map-0.3.0 (lib) (requires build)
- resourcet-1.3.0 (lib) (requires build)
- cborg-0.2.8.0 (lib) (requires build)
- atomic-primops-0.8.4 (lib) (requires build)
- unordered-containers-0.2.19.1 (lib) (requires build)
- time-compat-1.9.6.1 (lib) (requires build)
- text-short-0.1.5 (lib) (requires build)
- scientific-0.3.7.0 (lib) (requires build)
- psqueues-0.2.7.3 (lib) (requires build)
- data-fix-0.3.2 (lib) (requires build)
- constraints-0.13.4 (lib) (requires build)
- case-insensitive-1.2.1.0 (lib) (requires build)
- async-2.2.4 (lib) (requires build)
- OneTuple-0.3.1 (lib) (requires build)
- socks-0.6.1 (lib) (requires build)
- simple-sendfile-0.2.30 (lib) (requires build)
- recv-0.1.0 (lib) (requires build)
- iproute-1.7.12 (lib) (requires build)
- doctest-0.21.1 (lib) (requires build)
- unix-time-0.4.9 (lib:unix-time) (requires build)
- data-default-0.7.1.1 (lib:data-default) (requires build)
- uuid-types-1.0.5 (lib) (requires build)
- tf-random-0.5 (lib) (requires build)
- mersenne-random-pure64-0.2.2.0 (lib:mersenne-random-pure64) (requires build)
- QuickCheck-2.14.2 (lib) (requires build)
- MonadRandom-0.6 (lib) (requires build)
- comonad-5.0.8 (lib) (requires build)
- barbies-2.0.4.0 (lib) (requires build)
- dec-0.0.5 (lib) (requires build)
- th-reify-many-0.1.10 (lib) (requires build)
- monad-control-1.0.3.1 (lib) (requires build)
- streaming-0.2.3.1 (lib) (requires build)
- cardano-crypto-1.1.1 (lib) (requires build)
- asn1-encoding-0.9.6 (lib) (requires build)
- prettyprinter-ansi-terminal-1.1.3 (lib) (requires build)
- optparse-applicative-fork-0.16.1.0 (lib) (requires build)
- ansi-wl-pprint-0.6.9 (lib) (requires build)
- vector-th-unbox-0.2.2 (lib) (requires build)
- vector-binary-instances-0.2.5.2 (lib) (requires build)
- th-lift-instances-0.1.20 (lib) (requires build)
- nothunks-0.1.4 (lib) (requires build)
- nonempty-vector-0.2.2.0 (lib) (requires build)
- math-functions-0.3.4.2 (lib) (requires build)
- heapwords-0.1.0.1 (lib) (requires build)
- bitvec-1.1.4.0 (lib) (requires build)
- ListLike-4.7.8 (lib) (requires build)
- vault-0.3.1.5 (lib) (requires build)
- relude-1.2.0.0 (lib) (requires build)
- ekg-core-0.1.1.7 (lib) (requires build)
- charset-0.3.9 (lib) (requires build)
- Unique-0.4.7.9 (lib) (requires build)
- formatting-7.1.3 (lib) (requires build)
- attoparsec-0.14.4 (lib) (requires build)
- constraints-extras-0.4.0.0 (lib) (requires build)
- megaparsec-9.2.1 (lib) (requires build)
- http-types-0.12.3 (lib) (requires build)
- unliftio-0.2.24.0 (lib) (requires build)
- streaming-commons-0.2.2.6 (lib) (requires build)
- protolude-0.3.3 (lib) (requires build)
- concurrent-output-1.10.17 (lib) (requires build)
- universe-base-1.1.3.1 (lib) (requires build)
- indexed-traversable-instances-0.1.1.2 (lib) (requires build)
- base-compat-batteries-0.12.2 (lib) (requires build)
- prettyprinter-configurable-1.1.0.0 (lib:prettyprinter-configurable) (requires build)
- quickcheck-io-0.2.0 (lib) (requires build)
- generic-random-1.5.0.1 (lib) (requires build)
- random-shuffle-0.0.4 (lib:random-shuffle) (requires build)
- bifunctors-5.5.15 (lib) (requires build)
- th-orphans-0.13.14 (lib) (requires build)
- lifted-base-0.2.3.12 (lib) (requires build)
- streaming-bytestring-0.2.4 (lib) (requires build)
- asn1-parse-0.9.5 (lib:asn1-parse) (requires build)
- optparse-applicative-0.17.0.0 (lib) (requires build)
- orphans-deriving-via-0.1.0.1 (lib) (requires build)
- mwc-random-0.15.0.2 (lib) (requires build)
- vector-algorithms-0.9.0.1 (lib) (requires build)
- process-extras-0.7.4 (lib) (requires build)
- parsers-0.12.11 (lib) (requires build)
- io-streams-1.5.2.2 (lib) (requires build)
- http-date-0.0.11 (lib) (requires build)
- dns-3.0.4 (lib) (requires build)
- cassava-0.5.3.0 (lib) (requires build)
- dependent-sum-0.7.2.0 (lib) (requires build)
- wai-3.2.3 (lib) (requires build)
- http2-4.1.2 (lib) (requires build)
- websockets-0.12.7.3 (lib) (requires build)
- http-client-0.7.13.1 (lib) (requires build)
- fin-0.3 (lib) (requires build)
- witherable-0.4.2 (lib) (requires build)
- attoparsec-iso8601-1.1.0.0 (lib) (requires build)
- hspec-core-2.10.10 (lib) (requires build)
- moo-1.2.0.0.0.0.1 (lib) (requires build)
- semigroupoids-5.3.7 (lib) (requires build)
- profunctors-5.6.2 (lib) (requires build)
- assoc-1.0.2 (lib) (requires build)
- th-utilities-0.2.5.0 (lib) (requires build)
- lifted-async-0.10.2.4 (lib) (requires build)
- streaming-binary-0.2.2.0 (lib) (requires build)
- x509-1.7.7 (lib) (requires build)
- tasty-1.4.3 (lib) (requires build)
- pretty-simple-4.1.2.0 (lib:pretty-simple) (requires build)
- mono-traversable-1.0.15.3 (lib) (requires build)
- dense-linear-algebra-0.1.0.0 (lib) (requires build)
- snap-core-1.0.5.1 (lib) (requires build)
- io-streams-haproxy-1.0.1.0 (lib) (requires build)
- dependent-sum-template-0.1.1.1 (lib) (requires build)
- wai-websockets-3.0.1.2 (lib) (requires build)
- bin-0.1.3 (lib) (requires build)
- http-api-data-0.5.1 (lib) (requires build)
- hspec-2.10.10 (lib) (requires build)
- strict-list-0.1.7 (lib) (requires build)
- modern-uri-0.3.4.4 (lib) (requires build)
- invariant-0.6.1 (lib) (requires build)
- free-5.2 (lib) (requires build)
- foldl-1.4.14 (lib) (requires build)
- either-5.0.2 (lib) (requires build)
- these-1.1.1.1 (lib) (requires build)
- hedgehog-1.2 (lib) (requires build)
- async-timer-0.1.4.1 (lib) (requires build)
- x509-store-1.6.9 (lib) (requires build)
- warp-3.3.25 (lib) (requires build)
- tasty-quickcheck-0.10.2 (lib) (requires build)
- tasty-hunit-0.10.0.3 (lib) (requires build)
- tasty-expected-failure-0.12.3 (lib) (requires build)
- word-array-1.1.0.0 (lib) (requires build)
- conduit-1.3.4.3 (lib) (requires build)
- snap-server-1.1.2.1 (lib) (requires build)
- deque-0.4.4 (lib) (requires build)
- recursion-schemes-5.2.2.4 (lib) (requires build)
- adjunctions-4.4.2 (lib) (requires build)
- list-t-1.0.5.6 (lib) (requires build)
- strict-0.4.0.1 (lib) (requires build)
- semialign-1.2.0.1 (lib) (requires build)
- tasty-hedgehog-1.4.0.1 (lib) (requires build)
- hedgehog-quickcheck-0.1.1 (lib) (requires build)
- x509-validation-1.6.12 (lib) (requires build)
- x509-system-1.6.7 (lib) (requires build)
- libyaml-0.1.2 (lib) (requires build)
- io-classes-0.3.0.0 (lib) (requires build)
- ral-0.2.1 (lib) (requires build)
- kan-extensions-5.2.5 (lib) (requires build)
- flat-0.6 (lib) (requires build)
- serialise-0.2.6.0 (lib) (requires build)
- quickcheck-instances-0.3.29.1 (lib) (requires build)
- aeson-2.1.2.1 (lib) (requires build)
- tls-1.6.0 (lib) (requires build)
- typed-protocols-0.1.0.3 (lib) (requires build)
- strict-stm-0.2.0.0 (lib) (requires build)
- io-sim-0.4.0.0 (lib) (requires build)
- plutus-core-1.1.1.0 (lib:index-envs) (requires build)
- lens-5.2.2 (lib) (requires build)
- yaml-0.11.11.0 (lib) (requires build)
- tree-diff-0.3.0.1 (lib) (requires build)
- statistics-0.16.2.0 (lib) (requires build)
- katip-0.8.7.4 (lib) (requires build)
- ekg-json-0.1.0.7.0.0.0.0.1 (lib) (requires build)
- deriving-aeson-0.2.9 (lib) (requires build)
- cardano-prelude-0.1.0.1 (lib) (requires build)
- cardano-binary-1.5.0.1 (lib) (requires build)
- base64-bytestring-type-1.0.1 (lib) (requires build)
- aeson-pretty-0.8.9 (lib) (requires build)
- connection-0.3.1 (lib:connection) (requires build)
- typed-protocols-cborg-0.1.0.2 (lib) (requires build)
- monoidal-containers-0.6.4.0 (lib) (requires build)
- lens-aeson-1.2.2 (lib) (requires build)
- goblins-0.2.0.1 (lib) (requires build)
- statistics-linreg-0.3 (lib:statistics-linreg) (requires build)
- ekg-0.4.0.15 (lib) (requires build)
- vector-map-0.1.1.2 (lib) (requires build)
- strict-containers-0.1.0.0 (lib) (requires build)
- cardano-strict-containers-0.1.1.0 (lib) (requires build)
- cardano-slotting-0.1.0.2 (lib) (requires build)
- cardano-crypto-wrapper-1.4.2 (lib) (requires build)
- hydra-prelude-0.10.0 (lib) (configuration changed)
- cardano-prelude-test-0.1.0.1 (lib) (requires build)
- http-client-tls-0.3.6.1 (lib) (requires build)
- network-mux-0.3.0.0 (lib) (requires build)
- iohk-monitoring-0.1.11.1 (lib) (requires build)
- small-steps-0.1.1.2 (lib) (requires build)
- cardano-crypto-class-2.0.0.1 (lib) (requires build)
- cardano-ledger-byron-0.1.1.2 (lib) (requires build)
- cardano-binary-test-1.3.0.1 (lib) (requires build)
- prometheus-2.2.3 (lib) (requires build)
- ouroboros-network-testing-0.2.0.1 (lib) (requires build)
- ouroboros-network-api-0.1.0.0 (lib) (requires build)
- small-steps-test-0.1.1.2 (lib) (requires build)
- plutus-core-1.1.1.0 (lib) (requires build)
- cardano-data-0.1.1.2 (lib) (requires build)
- cardano-crypto-praos-2.1.0.0 (lib) (requires build)
- cardano-crypto-test-1.4.2 (lib) (requires build)
- ouroboros-network-protocols-0.3.0.0 (lib) (requires build)
- ouroboros-network-mock-0.1.0.0 (lib) (requires build)
- ouroboros-network-framework-0.3.0.0 (lib) (requires build)
- plutus-tx-1.1.1.0 (lib) (requires build)
- set-algebra-1.0.0.0 (lib) (requires build)
- byron-spec-ledger-0.1.1.2 (lib) (requires build)
- ouroboros-consensus-0.3.0.0 (lib) (requires build)
- ouroboros-network-0.4.0.1 (lib) (requires build)
- plutus-tx-plugin-1.1.0.0 (lib) (requires build)
- plutus-ledger-api-1.1.1.0 (lib) (requires build)
- cardano-ledger-core-0.1.1.2 (lib) (requires build)
- byron-spec-chain-0.1.1.2 (lib) (requires build)
- ouroboros-consensus-byron-0.3.0.0 (lib) (requires build)
- ouroboros-consensus-diffusion-0.3.0.0 (lib) (requires build)
- plutus-ledger-api-1.1.1.0 (lib:plutus-ledger-api-testlib) (requires build)
- cardano-ledger-shelley-0.1.1.2 (lib) (requires build)
- cardano-ledger-byron-test-1.4.2 (lib) (requires build)
- cardano-ledger-shelley-ma-0.1.1.2 (lib) (requires build)
- cardano-ledger-alonzo-0.1.1.2 (lib) (requires build)
- cardano-ledger-babbage-0.1.1.2 (lib) (requires build)
- cardano-ledger-conway-0.1.1.2 (lib) (requires build)
- cardano-protocol-tpraos-0.1.1.2 (lib) (requires build)
- ouroboros-consensus-protocol-0.3.0.0 (lib) (requires build)
- cardano-ledger-pretty-0.1.1.2 (lib) (requires build)
- ouroboros-consensus-shelley-0.3.0.0 (lib) (requires build)
- cardano-ledger-shelley-test-0.1.1.2 (lib) (requires build)
- ouroboros-consensus-cardano-0.3.0.0 (lib) (requires build)
- cardano-ledger-shelley-ma-test-0.1.1.2 (lib) (requires build)
- cardano-api-1.36.0 (lib) (requires build)
- cardano-ledger-alonzo-test-0.1.1.2 (lib) (requires build)
- cardano-ledger-babbage-test-0.1.1.2 (lib) (requires build)
- hydra-cardano-api-0.10.0 (lib) (configuration changed)
- hydra-plutus-0.11.0 (lib) (configuration changed)
- hydra-node-0.11.0 (lib) (configuration changed)
SH points us to a related cabal haddock rebuilds issue. Looking at it, it appears that one can solve this problem by adding the quite surprising `--disable-documentation` option to haddock. With the following command, haddock does not rebuild all the dependencies and we're happy (although a bit puzzled by what seems to be a poorly documented option):
cabal haddock --disable-documentation --haddock-tests all
https://github.com/input-output-hk/hydra/actions/runs/5326013941/jobs/9657973116?pr=934
Failures:
test/Hydra/LoggingSpec.hs:28:5:
1) Hydra.Logging HydraLog
Assertion failed (after 1 test and 9 shrinks):
[Envelope {timestamp = 1864-05-09 12:15:01.628863580524 UTC, threadId = 0, namespace = "", message = Node {node = BeginEvent {by = Party {vkey = HydraVerificationKey (VerKeyEd25519DSIGN "e5ed627ae3b2ca9805a93398a57d8106073576b0b25c99ace6d100b6f1aef4ad")}, eventId = 0, event = OnChainEvent {chainEvent = Observation {observedTx = OnCloseTx {headId = HeadId "\NUL\SOH\NUL\SOH\NUL\NUL\NUL\NUL\SOH\NUL\SOH\NUL\NUL\NUL\SOH\NUL", snapshotNumber = UnsafeSnapshotNumber 0, contestationDeadline = 1864-05-09 22:41:38.390801215733 UTC}, newChainState = ChainStateAt {chainState = Closed (ClosedState {closedThreadOutput = ClosedThreadOutput {closedThreadUTxO = (TxIn "895b6a1d8ee17b34c545302dbf852a456e12ff48318dad657a6b3dad8e4f74fd" (TxIx 0),TxOut (AddressInEra (ShelleyAddressInEra ShelleyBasedEraBabbage) (ShelleyAddress Testnet (ScriptHashObj (ScriptHash "8ae095dca4d14a1b8edffb37faa6c84ec60340fbf389a62f027e0b76")) StakeRefNull)) (TxOutValue MultiAssetInBabbageEra (valueFromList [(AdaAssetId,6000000),(AssetId "950ab6be42af2f48c80cc83fa1ecd96684a3f848fd9e12d85ece169d" "'\234\DLER\191k\144\221\133\197\SYNh\EM[\SI\159\150\r,\219\206\&2\211c\235ls\n",1),(AssetId "950ab6be42af2f48c80cc83fa1ecd96684a3f848fd9e12d85ece169d" "HydraHeadV1",1),(AssetId "950ab6be42af2f48c80cc83fa1ecd96684a3f848fd9e12d85ece169d" "\\\250\246\ETB\175_/\196\227\"\NUL\233\148\162y\143\132\204E\ETB@k\STX\222\205\161r\208",1)])) (TxOutDatumHash ScriptDataInBabbageEra "80f1afc8499f888418b63a5064fff5cbafe4f7cd15786552a8d164ea7676dc51") ReferenceScriptNone,HashableScriptData "\216{\159\159X `\173\f\SO6O\200\253#'\173\165\204\&1m\168C\156\155\214\ENQ\199\229o\134\182\188t\241\226\&7NX \148\231\148\220\146\199B\225\CAN\216\NUL\n\241\157\171NZZ\by\DEL\191\133`w2b\249\RS\177\NULl\255\SOHX 
\170\182B\235\&5\EM\170r\144\173\217K\246\194\233\231qc\135N\SOH\190\226V\r\179\227c\177\162h@\SUB\EOT\252\215\CAN\216y\159\SUB\EOT\EM\143\128\255X\FS\149\n\182\190B\175/H\200\f\200?\161\236\217f\132\163\248H\253\158\DC2\216^\206\SYN\157\128\255" (ScriptDataConstructor 2 [ScriptDataList [ScriptDataBytes "`\173\f\SO6O\200\253#'\173\165\204\&1m\168C\156\155\214\ENQ\199\229o\134\182\188t\241\226\&7N",ScriptDataBytes "\148\231\148\220\146\199B\225\CAN\216\NUL\n\241\157\171NZZ\by\DEL\191\133`w2b\249\RS\177\NULl"],ScriptDataNumber 1,ScriptDataBytes "\170\182B\235\&5\EM\170r\144\173\217K\246\194\233\231qc\135N\SOH\190\226V\r\179\227c\177\162h@",ScriptDataNumber 83679000,ScriptDataConstructor 0 [ScriptDataNumber 68784000],ScriptDataBytes "\149\n\182\190B\175/H\200\f\200?\161\236\217f\132\163\248H\253\158\DC2\216^\206\SYN\157",ScriptDataList []])), closedParties = ["`\173\f\SO6O\200\253#'\173\165\204\&1m\168C\156\155\214\ENQ\199\229o\134\182\188t\241\226\&7N","\148\231\148\220\146\199B\225\CAN\216\NUL\n\241\157\171NZZ\by\DEL\191\133`w2b\249\RS\177\NULl"], closedContestationDeadline = POSIXTime {getPOSIXTime = 83679000}, closedContesters = []}, headId = HeadId "\149\n\182\190B\175/H\200\f\200?\161\236\217f\132\163\248H\253\158\DC2\216^\206\SYN\157", seedTxIn = TxIn "0101000100010101000100000101010100000101000001010100000000010101" (TxIx 66)}), recordedAt = Nothing, previous = Nothing}}}}}}]
{'node': {'by': {'vkey': 'e5ed627ae3b2ca9805a93398a57d8106073576b0b25c99ace6d100b6f1aef4ad'}, 'event': {'chainEvent': {'newChainState': {'chainState': {'contents': {'closedThreadOutput': {'closedContestationDeadline': 83679000, 'closedContesters': [], 'closedParties': [{'vkey': '60ad0c0e364fc8fd2327ada5cc316da8439c9bd605c7e56f86b6bc74f1e2374e'}, {'vkey': '94e794dc92c742e118d8000af19dab4e5a5a08797fbf8560773262f91eb1006c'}], 'closedThreadUTxO': ['895b6a1d8ee17b34c545302dbf852a456e12ff48318dad657a6b3dad8e4f74fd#0', {'address': 'addr_test1wz9wp9wu5ng55xuwmlan074xep8vvq6ql0ecnf30qflqkasj32pkc', 'datum': None, 'datumhash': '80f1afc8499f888418b63a5064fff5cbafe4f7cd15786552a8d164ea7676dc51', 'inlineDatum': None, 'referenceScript': None, 'value': {'950ab6be42af2f48c80cc83fa1ecd96684a3f848fd9e12d85ece169d': {'27ea1052bf6b90dd85c51668195b0f9f960d2cdbce32d363eb6c730a': 1, '4879647261486561645631': 1, '5cfaf617af5f2fc4e32200e994a2798f84cc4517406b02decda172d0': 1}, 'lovelace': 6000000}}, 'd87b9f9f582060ad0c0e364fc8fd2327ada5cc316da8439c9bd605c7e56f86b6bc74f1e2374e582094e794dc92c742e118d8000af19dab4e5a5a08797fbf8560773262f91eb1006cff015820aab642eb3519aa7290add94bf6c2e9e77163874e01bee2560db3e363b1a268401a04fcd718d8799f1a04198f80ff581c950ab6be42af2f48c80cc83fa1ecd96684a3f848fd9e12d85ece169d80ff']}, 'headId': '950ab6be42af2f48c80cc83fa1ecd96684a3f848fd9e12d85ece169d', 'seedTxIn': '0101000100010101000100000101010100000101000001010100000000010101#66'}, 'tag': 'Closed'}, 'previous': None, 'recordedAt': None}, 'observedTx': {'contestationDeadline': '1864-05-09T22:41:38.390801215733Z', 'headId': '00010001000000000100010000000100', 'snapshotNumber': 0, 'tag': 'OnCloseTx'}, 'tag': 'Observation'}, 'tag': 'OnChainEvent'}, 'eventId': 0, 'tag': 'BeginEvent'}, 'tag': 'Node'}: {'node': {'by': {'vkey': 'e5ed627ae3b2ca9805a93398a57d8106073576b0b25c99ace6d100b6f1aef4ad'}, 'event': {'chainEvent': {'newChainState': {'chainState': {'contents': {'closedThreadOutput': {'closedContestationDeadline': 
83679000, 'closedContesters': [], 'closedParties': [{'vkey': '60ad0c0e364fc8fd2327ada5cc316da8439c9bd605c7e56f86b6bc74f1e2374e'}, {'vkey': '94e794dc92c742e118d8000af19dab4e5a5a08797fbf8560773262f91eb1006c'}], 'closedThreadUTxO': ['895b6a1d8ee17b34c545302dbf852a456e12ff48318dad657a6b3dad8e4f74fd#0', {'address': 'addr_test1wz9wp9wu5ng55xuwmlan074xep8vvq6ql0ecnf30qflqkasj32pkc', 'datum': None, 'datumhash': '80f1afc8499f888418b63a5064fff5cbafe4f7cd15786552a8d164ea7676dc51', 'inlineDatum': None, 'referenceScript': None, 'value': {'950ab6be42af2f48c80cc83fa1ecd96684a3f848fd9e12d85ece169d': {'27ea1052bf6b90dd85c51668195b0f9f960d2cdbce32d363eb6c730a': 1, '4879647261486561645631': 1, '5cfaf617af5f2fc4e32200e994a2798f84cc4517406b02decda172d0': 1}, 'lovelace': 6000000}}, 'd87b9f9f582060ad0c0e364fc8fd2327ada5cc316da8439c9bd605c7e56f86b6bc74f1e2374e582094e794dc92c742e118d8000af19dab4e5a5a08797fbf8560773262f91eb1006cff015820aab642eb3519aa7290add94bf6c2e9e77163874e01bee2560db3e363b1a268401a04fcd718d8799f1a04198f80ff581c950ab6be42af2f48c80cc83fa1ecd96684a3f848fd9e12d85ece169d80ff']}, 'headId': '950ab6be42af2f48c80cc83fa1ecd96684a3f848fd9e12d85ece169d', 'seedTxIn': '0101000100010101000100000101010100000101000001010100000000010101#66'}, 'tag': 'Closed'}, 'previous': None, 'recordedAt': None}, 'observedTx': {'contestationDeadline': '1864-05-09T22:41:38.390801215733Z', 'headId': '00010001000000000100010000000100', 'snapshotNumber': 0, 'tag': 'OnCloseTx'}, 'tag': 'Observation'}, 'tag': 'OnChainEvent'}, 'eventId': 0, 'tag': 'BeginEvent'}, 'tag': 'Node'} is not valid under any of the given schemas
{"messages":[{"message":{"node":{"by":{"vkey":"e5ed627ae3b2ca9805a93398a57d8106073576b0b25c99ace6d100b6f1aef4ad"},"event":{"chainEvent":{"newChainState":{"chainState":{"contents":{"closedThreadOutput":{"closedContestationDeadline":83679000,"closedContesters":[],"closedParties":[{"vkey":"60ad0c0e364fc8fd2327ada5cc316da8439c9bd605c7e56f86b6bc74f1e2374e"},{"vkey":"94e794dc92c742e118d8000af19dab4e5a5a08797fbf8560773262f91eb1006c"}],"closedThreadUTxO":["895b6a1d8ee17b34c545302dbf852a456e12ff48318dad657a6b3dad8e4f74fd#0",{"address":"addr_test1wz9wp9wu5ng55xuwmlan074xep8vvq6ql0ecnf30qflqkasj32pkc","datum":null,"datumhash":"80f1afc8499f888418b63a5064fff5cbafe4f7cd15786552a8d164ea7676dc51","inlineDatum":null,"referenceScript":null,"value":{"950ab6be42af2f48c80cc83fa1ecd96684a3f848fd9e12d85ece169d":{"27ea1052bf6b90dd85c51668195b0f9f960d2cdbce32d363eb6c730a":1,"4879647261486561645631":1,"5cfaf617af5f2fc4e32200e994a2798f84cc4517406b02decda172d0":1},"lovelace":6000000}},"d87b9f9f582060ad0c0e364fc8fd2327ada5cc316da8439c9bd605c7e56f86b6bc74f1e2374e582094e794dc92c742e118d8000af19dab4e5a5a08797fbf8560773262f91eb1006cff015820aab642eb3519aa7290add94bf6c2e9e77163874e01bee2560db3e363b1a268401a04fcd718d8799f1a04198f80ff581c950ab6be42af2f48c80cc83fa1ecd96684a3f848fd9e12d85ece169d80ff"]},"headId":"950ab6be42af2f48c80cc83fa1ecd96684a3f848fd9e12d85ece169d","seedTxIn":"0101000100010101000100000101010100000101000001010100000000010101#66"},"tag":"Closed"},"previous":null,"recordedAt":null},"observedTx":{"contestationDeadline":"1864-05-09T22:41:38.390801215733Z","headId":"00010001000000000100010000000100","snapshotNumber":0,"tag":"OnCloseTx"},"tag":"Observation"},"tag":"OnChainEvent"},"eventId":0,"tag":"BeginEvent"},"tag":"Node"},"namespace":"","threadId":0,"timestamp":"1864-05-09T12:15:01.628863580524Z"}]}
To rerun use: --match "/Hydra.Logging/HydraLog/"
Randomized with seed 642259793
Finished in 385.7940 seconds
266 examples, 1 failure, 6 pending
- Signing the CBOR returned by Hydra API for external commit
- Hydra API returns only a CBOR-formatted transaction
- If you want to sign it with cardano-cli, need to wrap it in TextEnvelope format
- Can try type: “Tx AlonzoEra” (coming from our API on NewTx input)
- Shouldn’t this be “Tx BabbageEra” in our API?
- Follow the error message
- Off-chain benchmarks / on website
- Avg. Confirmation Time displayed on website: (ms) 300.036740977
- Locally Arnaud got 2ms
- The github runners are slow and… that’s fine, is it?
- Make it clear, with a big warning, that it’s slow because of infrastructure
- Use the dedicated runner?
- Git revision injection
- Figure out that only the git commit hash has changed and do not run nix build then
- ok but how to do that?
- Cardano-node patches the binary
I'm trying to reduce the cache size by using the same cache for dist-newstyle in the build all job and the haddock job but, for some reason, the haddock job does not seem to benefit from reusing the cache produced by build all:
https://github.com/pgrange/hydra/actions/runs/5324124624
This is very surprising. To double check what's happening, I create a simple Dockerfile to build all, then run haddock and look at the docker diff to see what haddock does differently from build all in terms of cache.
#> cat Dockerfile
FROM ghcr.io/pgrange/cabal_test:main
RUN apt-get update && \
apt-get install -y git && \
rm -rf /var/lib/apt/lists/*
WORKDIR /srv/
RUN git clone https://github.com/input-output-hk/hydra.git
WORKDIR /srv/hydra
RUN cabal update
RUN cabal build all
We build the docker image:
#> docker build . -t cabal_build_all
After quite some time, the docker image is ready, we can now try to run haddock inside the container and see what happens:
#> docker run -it cabal_build_all bash
#> time .github/workflows/ci-haddock.sh
...
real 5m0.656s
user 33m59.674s
sys 5m7.751s
So it only took 5 minutes... why on earth does it take more than an hour on CI? 😭
Actually it only took 5 minutes to fail:
Error: cabal: Failed to build documentation for basement-0.0.15 (which is
required by test:tests from hydra-node-0.11.0, test:tests from
hydra-tui-0.10.0 and others).
Also, looking at the docker diff to see what haddock added to the image:
A /root/.cache/cabal/logs/ghc-8.10.7/...
A /root/.local/state/cabal/store/ghc-8.10.7/Only-0.1-1b6b00348980b37f27c03470dd49b6d3cec33bfbf017c05d75ca42a754286229
A /root/.local/state/cabal/store/ghc-8.10.7/PyF-0.11.1.1-f7b884af3bd2f3a631729f62b50ded16e5b3f2a5db3826d30a10a0fb4c34cfd7
A /root/.local/state/cabal/store/ghc-8.10.7/SHA-1.6.4.4-7180beab073e8125f6c0749ab81b6360a1e0e9a71f99abb4df429c0bb95e0456
A /root/.local/state/cabal/store/ghc-8.10.7/StateVar-1.2.2-7bbc80d828f3a79f3ac17d41957642b3b2e4712ebecb5ac1a078738ebab07234
A /root/.local/state/cabal/store/ghc-8.10.7/Win32-network-0.1.1.1-98c8666f4d26d4357c32837e983b80dfadf38db49c2f57e8f5a06c9c0f2e8310
A /root/.local/state/cabal/store/ghc-8.10.7/algebraic-graphs-0.7-95d5cfe05a50808d43f9dd58fa66f2437b63698b90ac997aa4b37f6d476dcc6d
A /root/.local/state/cabal/store/ghc-8.10.7/ap-normalize-0.1.0.1-01f35de951df01f0af9d28a2467aefde706939e34dd646ddc85ecbec64fe07d6
A /root/.local/state/cabal/store/ghc-8.10.7/appar-0.1.8-619cb8646c6c1ba6be33a2cee713adcb5912d7ee94e9076a29d2f1272700ad8d
A /root/.local/state/cabal/store/ghc-8.10.7/attoparsec-0.14.4-l-attoparsec-internal-619e18e1462007a7e694b37b147e4d3420c2fbdb90a3f860fff4a19f0a273897
A /root/.local/state/cabal/store/ghc-8.10.7/auto-update-0.1.6-dd353beb21984aa51e89e1df95473b87971dcac47c8ed01b6b25f25eee7892f3
A /root/.local/state/cabal/store/ghc-8.10.7/base-compat-0.12.2-537639753b74d27969b47ba29817bc21f5e2e4d46fd9df73d7f4b6dc4e5a7832
A /root/.local/state/cabal/store/ghc-8.10.7/base-deriving-via-0.1.0.1-f23fbb645c44b17be22ff8c5524393796618e729ed675ba205b3329a634843d1
A /root/.local/state/cabal/store/ghc-8.10.7/base-orphans-0.9.0-6cf14148c7b3a5a478ef25d60ef3d73bff3c7a351efaa9397dcf1cab0da9a94c
A /root/.local/state/cabal/store/ghc-8.10.7/base16-bytestring-1.0.2.0-e63e5cdbd927d035764e7b47abfb88dea94a0ad4ebc41e58fc7e8d0dfba19d66
A /root/.local/state/cabal/store/ghc-8.10.7/base58-bytestring-0.1.0-2149eceb4eb584908808145d5ae6f8c027b056f4edce4eb4467b00fb0a0f3c99
A /root/.local/state/cabal/store/ghc-8.10.7/base64-bytestring-1.2.1.0-54b176608f0772c4e696f85bad2e2354709fe6743fd92ffe2068a98fbbb5eed2
A /root/.local/state/cabal/store/ghc-8.10.7/bimap-0.4.0-b413f716f2d1d7f01e13bb6c2fb0363ff551f72fb38e19f802edd5b7df913953
A /root/.local/state/cabal/store/ghc-8.10.7/blaze-builder-0.4.2.2-27e5f0e6e69f3e271ae479f27f487f78f0fe414918fc1ee329bcaa05c8705650
A /root/.local/state/cabal/store/ghc-8.10.7/bsb-http-chunked-0.0.0.4-0f12e0a0b0986eb6abcbaa8783a81dd9188bc3f1edfafa92a8f06c3b1b2a8357
A /root/.local/state/cabal/store/ghc-8.10.7/byteorder-1.0.4-86ff8dbecc6bebe393e1961163b72fae617ee1b84973b55ac98bc91e31e358d0
A /root/.local/state/cabal/store/ghc-8.10.7/bytestring-builder-0.10.8.2.0-93fc4e4e67c7891d3804143fea69ea530162f60ef1bd260d89f6d6cb1b1b0a38
A /root/.local/state/cabal/store/ghc-8.10.7/cabal-doctest-1.0.9-aa22ff86258ea01bc63a68a60c9927060e57323e45a68b69bac7bdbad4f5188d
A /root/.local/state/cabal/store/ghc-8.10.7/call-stack-0.4.0-a6cdbe3e7e45ec14f55840c1db99e4eb19c411ad750eec2a944ca44f537ca649
A /root/.local/state/cabal/store/ghc-8.10.7/canonical-json-0.6.0.1-818ee9b4d23feb4ac41f5c6f401f44a9b577d7eee1e63db13dad558e2a0d25e0
A /root/.local/state/cabal/store/ghc-8.10.7/cereal-0.5.8.3-34ddb94eb4256c80b52c843e9729c62d5199e31897bde5797bb6d209276b588a
A /root/.local/state/cabal/store/ghc-8.10.7/clock-0.8.3-45c2cadfc10f24bdedbf2e3bb1edaa3fc14cd1b9c1e2ef8984b6b2128ecdbbc2
A /root/.local/state/cabal/store/ghc-8.10.7/colour-2.3.6-fd57b1caf6a422f8a8ff7c3132d4740e9268baadaf789222608b1cc433061da3
A /root/.local/state/cabal/store/ghc-8.10.7/composition-prelude-3.0.0.2-fce5371f271022fc832491977e3c344bc88a750102a3b1b6112a4ba662fb12d3
A /root/.local/state/cabal/store/ghc-8.10.7/contra-tracer-0.1.0.1-b40cbb524a233a136ff86c8ff3aee651d1127d821b82cc31150c298471411d4c
A /root/.local/state/cabal/store/ghc-8.10.7/data-array-byte-0.1.0.1-2775f75d7211204cc60f677f9777540c03e30eb4f547f6729975f049bfa90204
A /root/.local/state/cabal/store/ghc-8.10.7/data-clist-0.2-ba36f8b9a590de9928d5606b9e36d2db006b9192bb0d2842dd6babf98131a669
A /root/.local/state/cabal/store/ghc-8.10.7/data-default-class-0.1.2.0-c02137f3b596e844bef779396d9ce63da3b73a41d5018ec31fdc9e34ace10009
A /root/.local/state/cabal/store/ghc-8.10.7/digest-0.0.1.7-719c89885126b2dc445603580f657c1954e1ea9614ec666e8c9377c108ad7f31
A /root/.local/state/cabal/store/ghc-8.10.7/dlist-1.0-1b907d72630b46587835cdd3dadcee4e36dba1bc204164c0e449574d533ea76e
A /root/.local/state/cabal/store/ghc-8.10.7/dom-lt-0.2.3-27ce1ab2dfe87e572922632992725ba0872a58fde94977df7cf3dc83b41b27bd
A /root/.local/state/cabal/store/ghc-8.10.7/double-conversion-2.0.4.2-a52c89a070fedcf368f04ce7084733898b8e7b6795e36896718f51bb47624a74
A /root/.local/state/cabal/store/ghc-8.10.7/entropy-0.4.1.10-adee53226aa4394d659385688944b29caff384ad2b48823c1d8550e4e7c7e855
A /root/.local/state/cabal/store/ghc-8.10.7/erf-2.0.0.0-1404960a345707cc92c0554052613e35fb196ba63898bfeb8bfb2c5b883872cd
A /root/.local/state/cabal/store/ghc-8.10.7/fgl-5.8.1.1-f0773d931687a88918222dd6e0d09add2fdf9f5afa44fa18e02f22af104dbc54
A /root/.local/state/cabal/store/ghc-8.10.7/file-embed-0.0.15.0-a0a6ff78362a398f746d181caabef8c625e13df3413f62d208fcb55d9af3c342
A /root/.local/state/cabal/store/ghc-8.10.7/filelock-0.1.1.6-cd2104e2364ff9c3354c7199c9f58312dc683ae6be175d18e7f9494214126204
A /root/.local/state/cabal/store/ghc-8.10.7/fingertree-0.1.5.0-0e4159df2ad80eb94d83537ba807fdc7b2d76a247c969aae9b7d1dba8ab1bc12
A /root/.local/state/cabal/store/ghc-8.10.7/fmlist-0.9.4-9a883fbdef4d8259926b9f13405f18e325d7f127bebd1777cac40f7d8f9a9bde
A /root/.local/state/cabal/store/ghc-8.10.7/generic-monoid-0.1.0.1-62990c66611ec2f4d87ad813fee49b94427a502bccdd2d76eb78e36c6d8ffad9
A /root/.local/state/cabal/store/ghc-8.10.7/generically-0.1.1-db49175065e9321265e674974022ee9fe897fa91aba054df2334a54795115cdd
A /root/.local/state/cabal/store/ghc-8.10.7/ghc-paths-0.1.0.12-18295b069666c79a0b146e1bf31eaee5697bdc027871023b447a4520f1848eb1
A /root/.local/state/cabal/store/ghc-8.10.7/gray-code-0.3.1-6abf5e2912448ccc2314a71304e855f38b048dfedf816420f78d42ced485bafa
A /root/.local/state/cabal/store/ghc-8.10.7/groups-0.5.3-557e984b1e234f786a9b5fa322de6af682fb950e1963ce2611671459f9b472d6
A /root/.local/state/cabal/store/ghc-8.10.7/half-0.3.1-3f0e04aa34cd7da8061416bcdf9362fc6bb68a8570f84f7114e3430332c8543c
A /root/.local/state/cabal/store/ghc-8.10.7/happy-1.20.1.1-e-happy-1827178849a289c055041380f2a439501e91a78a690911998c9b2de93ea571e3
A /root/.local/state/cabal/store/ghc-8.10.7/haskell-lexer-1.1.1-699211d62dc6a5f625823e19692c6ff9fc6523ab34ce4cd813f753206cae4e5c
A /root/.local/state/cabal/store/ghc-8.10.7/hostname-1.0-2f7c0926eced36da11d613a0d7038cd6fccf067a802d95ff69458f53bea2232b
A /root/.local/state/cabal/store/ghc-8.10.7/hourglass-0.2.12-f91601479c62ede4529316ead38ec7000466e0ebf7d741a4897daa6daa4ba6e6
A /root/.local/state/cabal/store/ghc-8.10.7/hsc2hs-0.68.9-e-hsc2hs-e6e9d096fdcc6fdb4eb2ed9f9e77f4599b973ebe93600ae5df9294340e005b78
A /root/.local/state/cabal/store/ghc-8.10.7/incoming
A /root/.local/state/cabal/store/ghc-8.10.7/package.db
-
After a really fun pairing session on Friday with SB I am finalizing the deprecation of fuel in the Hydra codebase.
-
I need to write a unit test for findFuelOrLargestUTxO to assert things are working as expected.
-
The test itself is pretty simple:
- Generate a normal UTxO and a UTxO containing the correct marker datum
- Merge the UTxOs and pass them on to the findFuelOrLargestUTxO function
- Assert that the function correctly picked the UTxO containing the datum
-
Now I'll add one more test to assert the largest value UTxO is picked if there is no UTxO with marker datum.
-
What I notice is that we have a bug in the implementation: when picking the largest UTxO we need to compare the Lovelace value, not just any value. If there is a token with a large quantity we would pick that one by mistake!
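To make the bug concrete, here is a minimal sketch using simplified, hypothetical stand-ins (`Value`, `Output`, `findFuelOrLargestBy`, `totalQuantity`) rather than the real cardano-api types or the actual `findFuelOrLargestUTxO` signature. It only illustrates how the choice of measure changes which output wins when no marker datum is present.

```haskell
import Data.List (find, maximumBy)
import Data.Ord (comparing)

-- Hypothetical, simplified stand-ins for the real cardano-api types.
data Value = Value
  { lovelace :: Integer
  , tokens :: [(String, Integer)] -- (asset name, quantity)
  }
  deriving (Eq, Show)

data Output = Output
  { outputId :: String
  , value :: Value
  , hasMarkerDatum :: Bool
  }
  deriving (Eq, Show)

-- Buggy measure: adds up every quantity, so a large token amount wins.
totalQuantity :: Value -> Integer
totalQuantity v = lovelace v + sum (map snd (tokens v))

-- Prefer an output carrying the marker datum; otherwise take the
-- largest output according to the given measure.
findFuelOrLargestBy :: (Value -> Integer) -> [Output] -> Maybe Output
findFuelOrLargestBy _ [] = Nothing
findFuelOrLargestBy measure outputs =
  case find hasMarkerDatum outputs of
    Just fuel -> Just fuel
    Nothing -> Just (maximumBy (comparing (measure . value)) outputs)

-- Made-up example outputs, neither of which carries the marker datum.
exampleOutputs :: [Output]
exampleOutputs =
  [ Output "ada-rich" (Value 50000000 []) False
  , Output "token-rich" (Value 2000000 [("SomeToken", 1000000000)]) False
  ]
```

With these made-up outputs, `findFuelOrLargestBy totalQuantity exampleOutputs` selects the "token-rich" output, while `findFuelOrLargestBy lovelace exampleOutputs` selects "ada-rich", which is the behaviour we want.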
- Now as we have a nice nix expression to build all the haddocks, let's revisit building a versioned docusaurus website from multiple artifacts in a single job.
- Making the hydra-spec versioned does not sit well with wanting a nice link like hydra.family/head-protocol/hydra-spec.pdf
- Similarly, html files or a directory of html files are not automatically versioned by docusaurus, which is a challenge for the /haddock documentation. This should definitely be versioned, but it is not clear how best to do it. Maybe just copy it into the final build/ directory still?
Changing the wallet to select the largest UTxO is problematic whenever we have some to-be-committed output in the wallet when we initialize: the wallet selects it as the seed, and when it comes to committing, the UTxO is not there anymore.
Failure with segfault on this attempt to test the tui:
Hydra.TUI
end-to-end smoke tests
/home/runner/work/_temp/d9ec2df2-1735-46ac-9f6b-995341d559ef: line 2: 2280 Segmentation fault (core dumped) nix develop .?submodules=1#tests.hydra-tui --command tests
Error: Process completed with exit code 139.
Just taking note here of something strange I observed today on CI with this job: hydra-cluster did not compile on attempt 1 and then compiled on attempt 2.
Both runs used the same source code version:
attempt 1:
/usr/bin/git log -1 --format='%H'
'142478d6e1a68465cf7b397561611d4c5c67810c'
attempt 2:
/usr/bin/git log -1 --format='%H'
'142478d6e1a68465cf7b397561611d4c5c67810c'
The first attempt did not compile with the following error:
> Building test suite 'tests' for hydra-cluster-0.10.0..
> [ 1 of 11] Compiling Paths_hydra_cluster ( dist/build/tests/autogen/Paths_hydra_cluster.hs, dist/build/tests/tests-tmp/Paths_hydra_cluster.o )
> [ 2 of 11] Compiling Test.CardanoNodeSpec ( test/Test/CardanoNodeSpec.hs, dist/build/tests/tests-tmp/Test/CardanoNodeSpec.o )
> [ 3 of 11] Compiling Test.DirectChainSpec ( test/Test/DirectChainSpec.hs, dist/build/tests/tests-tmp/Test/DirectChainSpec.o )
> [ 4 of 11] Compiling Test.EndToEndSpec ( test/Test/EndToEndSpec.hs, dist/build/tests/tests-tmp/Test/EndToEndSpec.o )
>
> test/Test/EndToEndSpec.hs:425:51: error:
> • No instance for (Arbitrary
> (Hydra.Chain.ChainStateType
> (cardano-api-1.36.0:Cardano.Api.Tx.Tx
> Hydra.Cardano.Api.Prelude.Era)))
> arising from a use of ‘arbitrary’
> • In the first argument of ‘generate’, namely ‘arbitrary’
> In a stmt of a 'do' block:
> openState :: OpenState Tx <- generate arbitrary
> In the expression:
> do hydraScriptsTxId <- publishHydraScriptsAs node Faucet
> let persistenceDir = dir </> "persistence"
> let cardanoSK = dir </> "cardano.sk"
> let hydraSK = dir </> "hydra.sk"
> ....
> |
> 425 | openState :: OpenState Tx <- generate arbitrary
> | ^^^^^^^^^
>
> test/Test/EndToEndSpec.hs:444:59: error:
> • No instance for (ToJSON
> (Hydra.Chain.ChainStateType
> (cardano-api-1.36.0:Cardano.Api.Tx.Tx
> Hydra.Cardano.Api.Prelude.Era)))
> arising from a use of ‘Aeson.encode’
> • In the second argument of ‘BSL.writeFile’, namely
> ‘(Aeson.encode alteredNodeState)’
> In a stmt of a 'do' block:
> BSL.writeFile
> (persistenceDir </> "state") (Aeson.encode alteredNodeState)
> In the second argument of ‘($)’, namely
> ‘do createDirectoryIfMissing True persistenceDir
> BSL.writeFile
> (persistenceDir </> "state") (Aeson.encode alteredNodeState)’
> |
> 444 | BSL.writeFile (persistenceDir </> "state") (Aeson.encode alteredNodeState)
> | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> [ 6 of 11] Compiling Test.GeneratorSpec ( test/Test/GeneratorSpec.hs, dist/build/tests/tests-tmp/Test/GeneratorSpec.o )
> [ 7 of 11] Compiling Test.Hydra.Cluster.FaucetSpec ( test/Test/Hydra/Cluster/FaucetSpec.hs, dist/build/tests/tests-tmp/Test/Hydra/Cluster/FaucetSpec.o )
> [ 8 of 11] Compiling Test.Ledger.Cardano.ConfigurationSpec ( test/Test/Ledger/Cardano/ConfigurationSpec.hs, dist/build/tests/tests-tmp/Test/Ledger/Cardano/ConfigurationSpec.o )
> [ 9 of 11] Compiling Test.LogFilterSpec ( test/Test/LogFilterSpec.hs, dist/build/tests/tests-tmp/Test/LogFilterSpec.o )
For full logs, run 'nix log /nix/store/fkjlmfybgp6kvy9z58spyi4sz5z455ag-hydra-cluster-test-tests-0.10.0.drv'.
error: 1 dependencies of derivation '/nix/store/dx1kgc160c99icxm646vlwd1kvkja7xc-hydra-cluster-tests-env.drv' failed to build
Error: Process completed with exit code 1.
-
When working on versioned docs and building the haddocks via nix, I realize that the cache misses come from a “too big” source derivation given to haskell.nix. Filtering the source and removing some of the often-changing files that do not impact the Haskell build should help.
-
To further speed up the documentation build in CI we could stick to building an English website only.
- Resetting seenSnapshot to {"tag": "LastSeenSnapshot","lastSeen": 5}, i.e. the last confirmed snapshot number.
- seenTxs are the transactions requested, but not yet snapshotted.. we can make it empty
- Finally, the seenUTxO cannot be just cleared like seenTxs, but they basically should just represent the same utxo as the last confirmed snapshot.
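The three steps above can be sketched as a single pure function. This is only an illustration under stated assumptions: `HeadState`, `ConfirmedSnapshot`, `SeenSnapshot` and `resetToLastConfirmed` are simplified, hypothetical types and names, not the persisted state format of hydra-node, and the example state is made up.

```haskell
type TxId = String

-- (output reference, lovelace); a toy stand-in for a real UTxO set.
type UTxO = [(String, Integer)]

data ConfirmedSnapshot = ConfirmedSnapshot
  { number :: Int
  , utxo :: UTxO
  }
  deriving (Eq, Show)

-- Simplified: only the two cases needed for this illustration.
data SeenSnapshot
  = LastSeenSnapshot Int
  | RequestedSnapshot Int
  deriving (Eq, Show)

data HeadState = HeadState
  { confirmedSnapshot :: ConfirmedSnapshot
  , seenSnapshot :: SeenSnapshot
  , seenTxs :: [TxId]
  , seenUTxO :: UTxO
  }
  deriving (Eq, Show)

-- Forget everything that was seen after the last confirmed snapshot.
resetToLastConfirmed :: HeadState -> HeadState
resetToLastConfirmed st =
  st
    { seenSnapshot = LastSeenSnapshot (number confirmed)
    , seenTxs = [] -- requested but not yet snapshotted transactions
    , seenUTxO = utxo confirmed -- same UTxO as the confirmed snapshot
    }
 where
  confirmed = confirmedSnapshot st

-- Made-up example of a state stuck on an unconfirmed snapshot.
stuckState :: HeadState
stuckState =
  HeadState
    { confirmedSnapshot = ConfirmedSnapshot 5 [("out-a", 10), ("out-b", 20)]
    , seenSnapshot = RequestedSnapshot 6
    , seenTxs = ["tx-1", "tx-2"]
    , seenUTxO = [("out-c", 30)]
    }
```

Applying `resetToLastConfirmed stuckState` leaves the confirmed snapshot untouched and brings the three "seen" fields back in line with it.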
-
Looking for AckSn in logs
-
Checking the persisted head state to see who signed the last seenSnapshot. Turns out, only arnaud's and sebastian’s nodes signed it. Signatures of franco and sasha are missing.
-
Both franco's and sasha's nodes produce errors like the following on AckSn received from sebastian/arnaud:
{"timestamp":"2023-06-13T09:30:46.245123067Z","threadId":72,"namespace":"HydraNode-\"Sasha\"","message":{"node":{"by":{"vkey":"a65dd1af5d938e394b05198b97870a33343d3dd2ec8feb8906a2cbbd2cc0ac96"},"outcome":{"error":{"contents":"requireValidAckSn","tag":"RequireFailed"},"tag":"Error"},"tag":"LogicOutcome"},"tag":"Node"}}
-
Overly long log lines which are not JSON-parsable messed up my analysis. Should really get rid of the scripts in the chainstate
-
Sasha's node was in a fresh open state at 11:10, so naturally it did not accept the ReqSn or any AckSn at 11:30
-
There were multiple rollbacks, and a node restart at 6:00
-
The node is using --start-chain-from, and everything re-observed from L1 was ignored by the head logic with InvalidEvent
-
All but the CloseTx.. in fact, it was an old head that has been re-observed and the close, finalize and later correct observation of the actual head did result in a wiped L2 state.
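As a small aid for the signature check noted earlier in this list: a snapshot only gets confirmed once every party of the head has acknowledged it, so listing who is still missing is a simple set difference. The sketch below assumes parties are identified by plain names (a hypothetical simplification; real nodes use verification keys), and the function names are made up for illustration.

```haskell
import Data.List ((\\))

-- Hypothetical simplification: parties identified by name.
type Party = String

-- Parties of the head whose AckSn has not been seen yet.
missingSignatures :: [Party] -> [Party] -> [Party]
missingSignatures parties signed = parties \\ signed

-- A snapshot is only confirmed once nobody is missing.
isConfirmed :: [Party] -> [Party] -> Bool
isConfirmed parties signed = null (missingSignatures parties signed)

headParties :: [Party]
headParties = ["arnaud", "franco", "sasha", "sebastian"]
```

For the situation found in the persisted state, `missingSignatures headParties ["arnaud", "sebastian"]` evaluates to `["franco","sasha"]`.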
I explore the logs in grafana (time is UTC).
Filtering on message_node_outcome_reason_tag=WaitOnNotApplicableTx we can see that all the nodes complain about this transaction not being applicable starting at 9:30:46:
{"timestamp":"2023-06-13T09:30:51.458073539Z","threadId":77,"namespace":"HydraNode-\"sebastian-node\"","message":{"node":{"by":{"vkey":"43d8ac3fb8e39b355005ed546c8fc9cd80e4909cb254835513a542b1c9a2e3ad"},"outcome":{"reason":{"tag":"WaitOnNotApplicableTx","validationError":{"reason":"ApplyTxError [UtxowFailure (UtxoFailure (AlonzoInBabbageUtxoPredFailure (ValueNotConservedUTxO (MaryValue 0 (fromList [])) (MaryValue 9832651 (fromList []))))),UtxowFailure (UtxoFailure (AlonzoInBabbageUtxoPredFailure (BadInputsUTxO (fromList [TxIn (TxId {_unTxId = SafeHash \"7c4ce7f10c6f40d4515e59f45aa8139846129a21c1959dfdcaf6e9b1ac91a4d6\"}) (TxIx 0)]))))]"}},"tag":"Wait"},"tag":"LogicOutcome"},"tag":"Node"}}
{"timestamp":"2023-06-13T09:30:51.485568246Z","threadId":79,"namespace":"HydraNode-\"arnaud\"","message":{"node":{"by":{"vkey":"286b3cbf98f62a4f9f3368dc7ddbc02001b2580ee5e3720cb7619e2492d1bb77"},"outcome":{"reason":{"tag":"WaitOnNotApplicableTx","validationError":{"reason":"ApplyTxError [UtxowFailure (UtxoFailure (AlonzoInBabbageUtxoPredFailure (ValueNotConservedUTxO (MaryValue 0 (fromList [])) (MaryValue 9832651 (fromList []))))),UtxowFailure (UtxoFailure (AlonzoInBabbageUtxoPredFailure (BadInputsUTxO (fromList [TxIn (TxId {_unTxId = SafeHash \"7c4ce7f10c6f40d4515e59f45aa8139846129a21c1959dfdcaf6e9b1ac91a4d6\"}) (TxIx 0)]))))]"}},"tag":"Wait"},"tag":"LogicOutcome"},"tag":"Node"}}
{"timestamp":"2023-06-13T09:30:46.181156071Z","threadId":72,"namespace":"HydraNode-\"franco\"","message":{"node":{"by":{"vkey":"935df47d312772df391cc7a934702e13e9f4263c611cc275b93f76ec6d31d02c"},"outcome":{"reason":{"tag":"WaitOnNotApplicableTx","validationError":{"reason":"ApplyTxError [UtxowFailure (UtxoFailure (AlonzoInBabbageUtxoPredFailure (ValueNotConservedUTxO (MaryValue 0 (fromList [])) (MaryValue 9832651 (fromList []))))),UtxowFailure (UtxoFailure (AlonzoInBabbageUtxoPredFailure (BadInputsUTxO (fromList [TxIn (TxId {_unTxId = SafeHash \"7c4ce7f10c6f40d4515e59f45aa8139846129a21c1959dfdcaf6e9b1ac91a4d6\"}) (TxIx 0)]))))]"}},"tag":"Wait"},"tag":"LogicOutcome"},"tag":"Node"}}
{"timestamp":"2023-06-13T09:30:46.18118879Z","threadId":72,"namespace":"HydraNode-\"Sasha\"","message":{"node":{"by":{"vkey":"a65dd1af5d938e394b05198b97870a33343d3dd2ec8feb8906a2cbbd2cc0ac96"},"outcome":{"reason":{"tag":"WaitOnNotApplicableTx","validationError":{"reason":"ApplyTxError [UtxowFailure (UtxoFailure (AlonzoInBabbageUtxoPredFailure (ValueNotConservedUTxO (MaryValue 0 (fromList [])) (MaryValue 9832651 (fromList []))))),UtxowFailure (UtxoFailure (AlonzoInBabbageUtxoPredFailure (BadInputsUTxO (fromList [TxIn (TxId {_unTxId = SafeHash \"7c4ce7f10c6f40d4515e59f45aa8139846129a21c1959dfdcaf6e9b1ac91a4d6\"}) (TxIx 0)]))))]"}},"tag":"Wait"},"tag":"LogicOutcome"},"tag":"Node"}}
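The validation error in these logs combines two ledger failures: `ValueNotConservedUTxO` with 0 consumed versus 9832651 produced, and `BadInputsUTxO` for output 0 of transaction 7c4c…a4d6. A toy ledger model shows why both appear together when the spent input is not in the node's UTxO set. The types and the `validate` function are hypothetical simplifications, not the cardano-ledger rules.

```haskell
import Data.Maybe (isNothing, mapMaybe)

type TxIn = String

-- (input reference, lovelace); a toy stand-in for the ledger UTxO set.
type UTxO = [(TxIn, Integer)]

data Tx = Tx
  { txInputs :: [TxIn]
  , txOutputs :: [Integer] -- lovelace per output
  }

data ValidationError
  = ValueNotConserved Integer Integer -- consumed, produced
  | BadInputs [TxIn]
  deriving (Eq, Show)

-- Report unknown inputs and any mismatch between consumed and produced
-- value. Unknown inputs contribute nothing to the consumed amount.
validate :: UTxO -> Tx -> [ValidationError]
validate utxo tx = valueErrors ++ inputErrors
 where
  missing = [i | i <- txInputs tx, isNothing (lookup i utxo)]
  consumed = sum (mapMaybe (`lookup` utxo) (txInputs tx))
  produced = sum (txOutputs tx)
  valueErrors = [ValueNotConserved consumed produced | consumed /= produced]
  inputErrors = [BadInputs missing | not (null missing)]

-- A transaction shaped like the rejected one: it spends output 0 of
-- 7c4c…a4d6 and produces 9832651 lovelace.
rejectedTx :: Tx
rejectedTx =
  Tx
    { txInputs = ["7c4ce7f10c6f40d4515e59f45aa8139846129a21c1959dfdcaf6e9b1ac91a4d6#0"]
    , txOutputs = [9832651]
    }
```

Validating `rejectedTx` against a UTxO set that lacks the input (for example `validate [] rejectedTx`) yields both errors, mirroring the log, whereas a UTxO set that contains the input with 9832651 lovelace yields none.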

It seems that we are trying to spend a UTxO from transaction 7c4ce7f10c6f40d4515e59f45aa8139846129a21c1959dfdcaf6e9b1ac91a4d6. Looking for this pattern in the logs, the first occurrence we can find is from yesterday at 9:37:04 (we were probably checking the head at that point in time):
{"timestamp":"2023-06-12T09:37:04.488733275Z","threadId":3591499,"namespace":"HydraNode-\"sebastian-node\"","message":{"api":{"receivedInput":{"tag":"NewTx","transaction":{"auxiliaryData":"d90103a100a10e8507181b0718ff00","body":{"auxiliaryDataHash":"e6051ae982c21571dbdbea7b9a86db24dfa3d133af6478a49990491b74b003cf","fees":0,"inputs":["b82eba7d97c463f79d5ec97102c64bd3f3032386707d1d7a907cb3f9ba049a0b#0"],"outputs":[{"address":"addr1v92l229athdj05l20ggnqz24p4ltlj55e7n4xplt2mxw8tqsehqnt","datum":null,"datumhash":null,"inlineDatum":null,"referenceScript":null,"value":{"lovelace":9832651}}]},"id":"7c4ce7f10c6f40d4515e59f45aa8139846129a21c1959dfdcaf6e9b1ac91a4d6","isValid":true,"witnesses":{"keys":["8200825820b35c9fb66fe460e137e511282c8a3dbb4ed301f7132abefb0fe59966ceb720235840af4724fa2daaabcc9706c790ff1ce2fac2f2bba0566f5510bca1224d7d46724f8e2ce1e4266b6ace449af4f286b17054b2b6130c7ae57f8df54834497d9db00c"]}}},"tag":"APIInputReceived"},"tag":"APIServer"}}

All nodes receive the ReqTx message at 09:37:04:
{"timestamp":"2023-06-12T09:37:04.488945604Z","threadId":77,"namespace":"HydraNode-\"sebastian-node\"","message":{"node":{"by":{"vkey":"43d8ac3fb8e39b355005ed546c8fc9cd80e4909cb254835513a542b1c9a2e3ad"},"event":{"message":{"party":{"vkey":"43d8ac3fb8e39b355005ed546c8fc9cd80e4909cb254835513a542b1c9a2e3ad"},"tag":"ReqTx","transaction":{"auxiliaryData":"d90103a100a10e8507181b0718ff00","body":{"auxiliaryDataHash":"e6051ae982c21571dbdbea7b9a86db24dfa3d133af6478a49990491b74b003cf","fees":0,"inputs":["b82eba7d97c463f79d5ec97102c64bd3f3032386707d1d7a907cb3f9ba049a0b#0"],"outputs":[{"address":"addr1v92l229athdj05l20ggnqz24p4ltlj55e7n4xplt2mxw8tqsehqnt","datum":null,"datumhash":null,"inlineDatum":null,"referenceScript":null,"value":{"lovelace":9832651}}]},"id":"7c4ce7f10c6f40d4515e59f45aa8139846129a21c1959dfdcaf6e9b1ac91a4d6","isValid":true,"witnesses":{"keys":["8200825820b35c9fb66fe460e137e511282c8a3dbb4ed301f7132abefb0fe59966ceb720235840af4724fa2daaabcc9706c790ff1ce2fac2f2bba0566f5510bca1224d7d46724f8e2ce1e4266b6ace449af4f286b17054b2b6130c7ae57f8df54834497d9db00c"]}}},"tag":"NetworkEvent","ttl":5},"tag":"BeginEvent"},"tag":"Node"}}
{"timestamp":"2023-06-12T09:37:04.508282583Z","threadId":71,"namespace":"HydraNode-\"franco\"","message":{"node":{"by":{"vkey":"935df47d312772df391cc7a934702e13e9f4263c611cc275b93f76ec6d31d02c"},"event":{"message":{"party":{"vkey":"43d8ac3fb8e39b355005ed546c8fc9cd80e4909cb254835513a542b1c9a2e3ad"},"tag":"ReqTx","transaction":{"auxiliaryData":"d90103a100a10e8507181b0718ff00","body":{"auxiliaryDataHash":"e6051ae982c21571dbdbea7b9a86db24dfa3d133af6478a49990491b74b003cf","fees":0,"inputs":["b82eba7d97c463f79d5ec97102c64bd3f3032386707d1d7a907cb3f9ba049a0b#0"],"outputs":[{"address":"addr1v92l229athdj05l20ggnqz24p4ltlj55e7n4xplt2mxw8tqsehqnt","datum":null,"datumhash":null,"inlineDatum":null,"referenceScript":null,"value":{"lovelace":9832651}}]},"id":"7c4ce7f10c6f40d4515e59f45aa8139846129a21c1959dfdcaf6e9b1ac91a4d6","isValid":true,"witnesses":{"keys":["8200825820b35c9fb66fe460e137e511282c8a3dbb4ed301f7132abefb0fe59966ceb720235840af4724fa2daaabcc9706c790ff1ce2fac2f2bba0566f5510bca1224d7d46724f8e2ce1e4266b6ace449af4f286b17054b2b6130c7ae57f8df54834497d9db00c"]}}},"tag":"NetworkEvent","ttl":5},"tag":"BeginEvent"},"tag":"Node"}}
{"timestamp":"2023-06-12T09:37:04.508472784Z","threadId":72,"namespace":"HydraNode-\"Sasha\"","message":{"node":{"by":{"vkey":"a65dd1af5d938e394b05198b97870a33343d3dd2ec8feb8906a2cbbd2cc0ac96"},"event":{"message":{"party":{"vkey":"43d8ac3fb8e39b355005ed546c8fc9cd80e4909cb254835513a542b1c9a2e3ad"},"tag":"ReqTx","transaction":{"auxiliaryData":"d90103a100a10e8507181b0718ff00","body":{"auxiliaryDataHash":"e6051ae982c21571dbdbea7b9a86db24dfa3d133af6478a49990491b74b003cf","fees":0,"inputs":["b82eba7d97c463f79d5ec97102c64bd3f3032386707d1d7a907cb3f9ba049a0b#0"],"outputs":[{"address":"addr1v92l229athdj05l20ggnqz24p4ltlj55e7n4xplt2mxw8tqsehqnt","datum":null,"datumhash":null,"inlineDatum":null,"referenceScript":null,"value":{"lovelace":9832651}}]},"id":"7c4ce7f10c6f40d4515e59f45aa8139846129a21c1959dfdcaf6e9b1ac91a4d6","isValid":true,"witnesses":{"keys":["8200825820b35c9fb66fe460e137e511282c8a3dbb4ed301f7132abefb0fe59966ceb720235840af4724fa2daaabcc9706c790ff1ce2fac2f2bba0566f5510bca1224d7d46724f8e2ce1e4266b6ace449af4f286b17054b2b6130c7ae57f8df54834497d9db00c"]}}},"tag":"NetworkEvent","ttl":5},"tag":"BeginEvent"},"tag":"Node"}}
{"timestamp":"2023-06-12T09:37:04.520700198Z","threadId":79,"namespace":"HydraNode-\"arnaud\"","message":{"node":{"by":{"vkey":"286b3cbf98f62a4f9f3368dc7ddbc02001b2580ee5e3720cb7619e2492d1bb77"},"event":{"message":{"party":{"vkey":"43d8ac3fb8e39b355005ed546c8fc9cd80e4909cb254835513a542b1c9a2e3ad"},"tag":"ReqTx","transaction":{"auxiliaryData":"d90103a100a10e8507181b0718ff00","body":{"auxiliaryDataHash":"e6051ae982c21571dbdbea7b9a86db24dfa3d133af6478a49990491b74b003cf","fees":0,"inputs":["b82eba7d97c463f79d5ec97102c64bd3f3032386707d1d7a907cb3f9ba049a0b#0"],"outputs":[{"address":"addr1v92l229athdj05l20ggnqz24p4ltlj55e7n4xplt2mxw8tqsehqnt","datum":null,"datumhash":null,"inlineDatum":null,"referenceScript":null,"value":{"lovelace":9832651}}]},"id":"7c4ce7f10c6f40d4515e59f45aa8139846129a21c1959dfdcaf6e9b1ac91a4d6","isValid":true,"witnesses":{"keys":["8200825820b35c9fb66fe460e137e511282c8a3dbb4ed301f7132abefb0fe59966ceb720235840af4724fa2daaabcc9706c790ff1ce2fac2f2bba0566f5510bca1224d7d46724f8e2ce1e4266b6ace449af4f286b17054b2b6130c7ae57f8df54834497d9db00c"]}}},"tag":"NetworkEvent","ttl":5},"tag":"BeginEvent"},"tag":"Node"}}
Just after that, all nodes receive the ReqSn message:
{"timestamp":"2023-06-12T09:37:04.555649141Z","threadId":77,"namespace":"HydraNode-\"sebastian-node\"","message":{"node":{"by":{"vkey":"43d8ac3fb8e39b355005ed546c8fc9cd80e4909cb254835513a542b1c9a2e3ad"},"event":{"message":{"party":{"vkey":"43d8ac3fb8e39b355005ed546c8fc9cd80e4909cb254835513a542b1c9a2e3ad"},"snapshotNumber":5,"tag":"ReqSn","transactions":[{"auxiliaryData":"d90103a100a10e8507181b0718ff00","body":{"auxiliaryDataHash":"e6051ae982c21571dbdbea7b9a86db24dfa3d133af6478a49990491b74b003cf","fees":0,"inputs":["b82eba7d97c463f79d5ec97102c64bd3f3032386707d1d7a907cb3f9ba049a0b#0"],"outputs":[{"address":"addr1v92l229athdj05l20ggnqz24p4ltlj55e7n4xplt2mxw8tqsehqnt","datum":null,"datumhash":null,"inlineDatum":null,"referenceScript":null,"value":{"lovelace":9832651}}]},"id":"7c4ce7f10c6f40d4515e59f45aa8139846129a21c1959dfdcaf6e9b1ac91a4d6","isValid":true,"witnesses":{"keys":["8200825820b35c9fb66fe460e137e511282c8a3dbb4ed301f7132abefb0fe59966ceb720235840af4724fa2daaabcc9706c790ff1ce2fac2f2bba0566f5510bca1224d7d46724f8e2ce1e4266b6ace449af4f286b17054b2b6130c7ae57f8df54834497d9db00c"]}}]},"tag":"NetworkEvent","ttl":5},"tag":"BeginEvent"},"tag":"Node"}}
{"timestamp":"2023-06-12T09:37:04.570118348Z","threadId":71,"namespace":"HydraNode-\"franco\"","message":{"node":{"by":{"vkey":"935df47d312772df391cc7a934702e13e9f4263c611cc275b93f76ec6d31d02c"},"event":{"message":{"party":{"vkey":"43d8ac3fb8e39b355005ed546c8fc9cd80e4909cb254835513a542b1c9a2e3ad"},"snapshotNumber":5,"tag":"ReqSn","transactions":[{"auxiliaryData":"d90103a100a10e8507181b0718ff00","body":{"auxiliaryDataHash":"e6051ae982c21571dbdbea7b9a86db24dfa3d133af6478a49990491b74b003cf","fees":0,"inputs":["b82eba7d97c463f79d5ec97102c64bd3f3032386707d1d7a907cb3f9ba049a0b#0"],"outputs":[{"address":"addr1v92l229athdj05l20ggnqz24p4ltlj55e7n4xplt2mxw8tqsehqnt","datum":null,"datumhash":null,"inlineDatum":null,"referenceScript":null,"value":{"lovelace":9832651}}]},"id":"7c4ce7f10c6f40d4515e59f45aa8139846129a21c1959dfdcaf6e9b1ac91a4d6","isValid":true,"witnesses":{"keys":["8200825820b35c9fb66fe460e137e511282c8a3dbb4ed301f7132abefb0fe59966ceb720235840af4724fa2daaabcc9706c790ff1ce2fac2f2bba0566f5510bca1224d7d46724f8e2ce1e4266b6ace449af4f286b17054b2b6130c7ae57f8df54834497d9db00c"]}}]},"tag":"NetworkEvent","ttl":5},"tag":"BeginEvent"},"tag":"Node"}}
{"timestamp":"2023-06-12T09:37:04.570511992Z","threadId":72,"namespace":"HydraNode-\"Sasha\"","message":{"node":{"by":{"vkey":"a65dd1af5d938e394b05198b97870a33343d3dd2ec8feb8906a2cbbd2cc0ac96"},"event":{"message":{"party":{"vkey":"43d8ac3fb8e39b355005ed546c8fc9cd80e4909cb254835513a542b1c9a2e3ad"},"snapshotNumber":5,"tag":"ReqSn","transactions":[{"auxiliaryData":"d90103a100a10e8507181b0718ff00","body":{"auxiliaryDataHash":"e6051ae982c21571dbdbea7b9a86db24dfa3d133af6478a49990491b74b003cf","fees":0,"inputs":["b82eba7d97c463f79d5ec97102c64bd3f3032386707d1d7a907cb3f9ba049a0b#0"],"outputs":[{"address":"addr1v92l229athdj05l20ggnqz24p4ltlj55e7n4xplt2mxw8tqsehqnt","datum":null,"datumhash":null,"inlineDatum":null,"referenceScript":null,"value":{"lovelace":9832651}}]},"id":"7c4ce7f10c6f40d4515e59f45aa8139846129a21c1959dfdcaf6e9b1ac91a4d6","isValid":true,"witnesses":{"keys":["8200825820b35c9fb66fe460e137e511282c8a3dbb4ed301f7132abefb0fe59966ceb720235840af4724fa2daaabcc9706c790ff1ce2fac2f2bba0566f5510bca1224d7d46724f8e2ce1e4266b6ace449af4f286b17054b2b6130c7ae57f8df54834497d9db00c"]}}]},"tag":"NetworkEvent","ttl":5},"tag":"BeginEvent"},"tag":"Node"}}
{"timestamp":"2023-06-12T09:37:04.582423459Z","threadId":79,"namespace":"HydraNode-\"arnaud\"","message":{"node":{"by":{"vkey":"286b3cbf98f62a4f9f3368dc7ddbc02001b2580ee5e3720cb7619e2492d1bb77"},"event":{"message":{"party":{"vkey":"43d8ac3fb8e39b355005ed546c8fc9cd80e4909cb254835513a542b1c9a2e3ad"},"snapshotNumber":5,"tag":"ReqSn","transactions":[{"auxiliaryData":"d90103a100a10e8507181b0718ff00","body":{"auxiliaryDataHash":"e6051ae982c21571dbdbea7b9a86db24dfa3d133af6478a49990491b74b003cf","fees":0,"inputs":["b82eba7d97c463f79d5ec97102c64bd3f3032386707d1d7a907cb3f9ba049a0b#0"],"outputs":[{"address":"addr1v92l229athdj05l20ggnqz24p4ltlj55e7n4xplt2mxw8tqsehqnt","datum":null,"datumhash":null,"inlineDatum":null,"referenceScript":null,"value":{"lovelace":9832651}}]},"id":"7c4ce7f10c6f40d4515e59f45aa8139846129a21c1959dfdcaf6e9b1ac91a4d6","isValid":true,"witnesses":{"keys":["8200825820b35c9fb66fe460e137e511282c8a3dbb4ed301f7132abefb0fe59966ceb720235840af4724fa2daaabcc9706c790ff1ce2fac2f2bba0566f5510bca1224d7d46724f8e2ce1e4266b6ace449af4f286b17054b2b6130c7ae57f8df54834497d9db00c"]}}]},"tag":"NetworkEvent","ttl":5},"tag":"BeginEvent"},"tag":"Node"}}
The last `AckSn` messages we can see from the four nodes are from yesterday at 9:30:47, for snapshot number 6:
{"timestamp":"2023-06-13T09:30:47.429549058Z","threadId":77,"namespace":"HydraNode-\"sebastian-node\"","message":{"node":{"by":{"vkey":"43d8ac3fb8e39b355005ed546c8fc9cd80e4909cb254835513a542b1c9a2e3ad"},"event":{"message":{"party":{"vkey":"43d8ac3fb8e39b355005ed546c8fc9cd80e4909cb254835513a542b1c9a2e3ad"},"signed":"15163cc08e4345c9687f93fa87f9df01e72d25fd63c55c8256a79b7404eb4baaf160ae0bb61f4b00311210727aa72750cab473546604a336cc7da2be99269c00","snapshotNumber":6,"tag":"AckSn"},"tag":"NetworkEvent","ttl":5},"tag":"BeginEvent"},"tag":"Node"}}
{"timestamp":"2023-06-13T09:30:47.371260571Z","threadId":79,"namespace":"HydraNode-\"arnaud\"","message":{"node":{"by":{"vkey":"286b3cbf98f62a4f9f3368dc7ddbc02001b2580ee5e3720cb7619e2492d1bb77"},"event":{"message":{"party":{"vkey":"43d8ac3fb8e39b355005ed546c8fc9cd80e4909cb254835513a542b1c9a2e3ad"},"signed":"15163cc08e4345c9687f93fa87f9df01e72d25fd63c55c8256a79b7404eb4baaf160ae0bb61f4b00311210727aa72750cab473546604a336cc7da2be99269c00","snapshotNumber":6,"tag":"AckSn"},"tag":"NetworkEvent","ttl":5},"tag":"BeginEvent"},"tag":"Node"}}
{"timestamp":"2023-06-13T09:30:47.362421729Z","threadId":72,"namespace":"HydraNode-\"franco\"","message":{"node":{"by":{"vkey":"935df47d312772df391cc7a934702e13e9f4263c611cc275b93f76ec6d31d02c"},"event":{"message":{"party":{"vkey":"43d8ac3fb8e39b355005ed546c8fc9cd80e4909cb254835513a542b1c9a2e3ad"},"signed":"15163cc08e4345c9687f93fa87f9df01e72d25fd63c55c8256a79b7404eb4baaf160ae0bb61f4b00311210727aa72750cab473546604a336cc7da2be99269c00","snapshotNumber":6,"tag":"AckSn"},"tag":"NetworkEvent","ttl":5},"tag":"BeginEvent"},"tag":"Node"}}
{"timestamp":"2023-06-13T09:30:47.358664501Z","threadId":72,"namespace":"HydraNode-\"Sasha\"","message":{"node":{"by":{"vkey":"a65dd1af5d938e394b05198b97870a33343d3dd2ec8feb8906a2cbbd2cc0ac96"},"event":{"message":{"party":{"vkey":"43d8ac3fb8e39b355005ed546c8fc9cd80e4909cb254835513a542b1c9a2e3ad"},"signed":"15163cc08e4345c9687f93fa87f9df01e72d25fd63c55c8256a79b7404eb4baaf160ae0bb61f4b00311210727aa72750cab473546604a336cc7da2be99269c00","snapshotNumber":6,"tag":"AckSn"},"tag":"NetworkEvent","ttl":5},"tag":"BeginEvent"},"tag":"Node"}}
Then we only see the node receiving `ReqTx`, starting yesterday at 9:30:51:
{"timestamp":"2023-06-13T09:30:51.458071014Z","threadId":77,"namespace":"HydraNode-\"sebastian-node\"","message":{"node":{"by":{"vkey":"43d8ac3fb8e39b355005ed546c8fc9cd80e4909cb254835513a542b1c9a2e3ad"},"event":{"message":{"party":{"vkey":"43d8ac3fb8e39b355005ed546c8fc9cd80e4909cb254835513a542b1c9a2e3ad"},"tag":"ReqTx","transaction":{"auxiliaryData":"d90103a100a10e85050318ff0000","body":{"auxiliaryDataHash":"737e5712b5847e26385909d746ff33da0f4d93bb69a2ff6ab178efb4bc9bbc06","fees":0,"inputs":["7c4ce7f10c6f40d4515e59f45aa8139846129a21c1959dfdcaf6e9b1ac91a4d6#0"],"outputs":[{"address":"addr1v92l229athdj05l20ggnqz24p4ltlj55e7n4xplt2mxw8tqsehqnt","datum":null,"datumhash":null,"inlineDatum":null,"referenceScript":null,"value":{"lovelace":9832651}}]},"id":"99de349c37b7d0a1bb18b61fcd393694ea19062c35c3d788b715af2865245d5f","isValid":true,"witnesses":{"keys":["8200825820b35c9fb66fe460e137e511282c8a3dbb4ed301f7132abefb0fe59966ceb7202358409e61e29bfab8b6dbeaba08916b06f79234fc235acf24a6076e56446394b112a4ca6537b5ff1a77e5bb1836760e17adbf1791dd33d953ab5097a90b5a49e4f90c"]}}},"tag":"NetworkEvent","ttl":5},"tag":"BeginEvent"},"tag":"Node"}}
- Come to think of it, maybe it was a bit rushed to remove all fuel occurrences from the code. We probably want to deprecate fuel instead of forcing new changes on the users. My rationale was that you can still use fuel, but all the code around it was removed, so users who have test code that depends on hydra functions would need to adjust it.
- Anyway, one last thing I wanted to do was to resolve a fixme in the wallet code since I am already touching code around it. The fixme is about including the min utxo amount when calculating the delta between input and output value when specifying a change output.
- I already wrote code that compiles and supposedly does the right thing, but one wallet test was failing with:
uncaught exception: ErrorCall
Illegal value in txout
- After tracing, I found out it is coming from the usage of our own `minUTxOValue`, which internally calls `evaluateMinLovelaceOutput` of cardano-api, and that one throws.
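A pure function that calls `error` can be handled at the call site by forcing its result inside `try`. The following is only a minimal sketch of that pattern: `unsafeMinValue` is a hypothetical stand-in for a throwing pure function (it is not the cardano-api function, and its floor value is made up), and `safeMinValue` is a name introduced here for illustration.

```haskell
import Control.Exception (ErrorCall (..), evaluate, try)

-- Hypothetical stand-in for a pure function that calls 'error' on bad input.
-- The 1000000 floor is an arbitrary illustration value.
unsafeMinValue :: Integer -> Integer
unsafeMinValue lovelace
  | lovelace < 0 = error "Illegal value in txout"
  | otherwise = max 1000000 lovelace

-- Force the result inside 'try' so that the 'ErrorCall' surfaces as a 'Left'
-- instead of blowing up later, wherever the value happens to be demanded.
safeMinValue :: Integer -> IO (Either String Integer)
safeMinValue lovelace = do
  result <- try (evaluate (unsafeMinValue lovelace))
  pure $ case result of
    Left (ErrorCall msg) -> Left msg
    Right value -> Right value
```

Whether catching the exception or avoiding the illegal value in the first place is the better fix depends on where the bad output is constructed.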
- link to async api docs
- when we run `yarn validate` this script is executed under the hood and requires a TAG to be present on every sample to work.
- link to the async api validator we are using.
- logs are verified against the logs.yaml spec using a plain json schema validation tool called jsonschema.
- The next thing I'd like to do is to make fueling optional in the Hydra internal wallet.
- So far we needed fuel to make sure there are some utxos available for paying the Hydra transactions (e.g. collectCom, abort), but as we are moving to external commits it is not strictly mandatory to mark anything as fuel. We just need to make sure there is enough ADA available for driving the Hydra Head transactions as well as transacting with other participants.
- I guess I'll see the implications of this change as I go, since I am not really sure of all of them. It seems that if you mark some utxos as fuel, then you don't need to worry about what to show in the client app for the users to pick when sending a new Hydra tx, since you know there is some ada left to pay for Head state transitions.
- With fuel removed we are probably going to show all available utxos, but then how do we make sure there is enough ada to do the abort tx, for example? Maybe we just need to make sure there is always some ada left to drive Head txs? Or do we make it the user's responsibility to take care of that?
- I will proceed with removing the mandatory fuel (still keeping the option to accommodate the old workflow). It seems there is not much to do in case Head txs don't have enough outputs to spend, apart from returning `NotEnoughFuel`.
- My plan is to first remove the constructor named `NoFuelUTXOFound`, which would reveal all places in the code where we handle this type of error. This should no longer be an error since it is not mandatory to mark some utxos as fuel.
- I also renamed `NotEnoughFuel` to `NotEnoughFunds`.
- This change highlighted the need to provide the inputs and outputs in `ErrNotEnoughFunds` since it wraps `ChangeError`. We really need to do some refactoring to get away from translating errors from one component to another.
- Renaming `findFuelUTxO` to `findLargestUTxO`. We are selecting the largest utxo to be used to pay the fees here. Wondering if this is the right approach...
- The next changes in line are the ones to actually purge all usages of `Fuel` in the codebase. My plan is to add a test that proves that we still support this feature, but that it is not mandatory and that future Hydra does not require any fuel marking.
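To make the "largest utxo pays the fees" idea concrete, here is a small sketch. It assumes a UTxO set modelled as a plain `Map` from a textual input reference to a lovelace amount; the `TxIn` and `Lovelace` aliases and the `findLargest` name are simplifications introduced here, not the types or the function used in the wallet code.

```haskell
import Data.List (maximumBy)
import qualified Data.Map as Map
import Data.Ord (comparing)

-- Simplified, hypothetical types: the real wallet works on cardano-api types.
type TxIn = String
type Lovelace = Integer

-- Pick the entry holding the most lovelace, or Nothing for an empty wallet.
findLargest :: Map.Map TxIn Lovelace -> Maybe (TxIn, Lovelace)
findLargest utxo
  | Map.null utxo = Nothing
  | otherwise = Just (maximumBy (comparing snd) (Map.toList utxo))
```

Returning a `Maybe` keeps the "nothing to pay with" case explicit, which is where an error such as `NotEnoughFunds` would be raised.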
- I am very sorry I didn't keep the logbook entry while working on this task :( There were many transaction errors we experienced while making it work and this would probably be useful to someone, but here we are.
- Continuing work, one of the tasks is to prevent the user from committing a utxo addressed to the internal wallet.
- Since the user and the operator of the hydra node are assumed to be the same person (at least for the time being), we want to prevent spending, in the commit tx, a utxo that is supposed to be used to pay for the L1 Hydra transactions.
- In the chain handle we do have access to the wallet, so it looks like we could easily get the wallet utxos and make sure the user-provided utxos are not among them.
- This worked nicely. I feel guilty for jumping into code immediately, but hopefully the test gods will forgive me since I wrote the test right after I made everything compile.
- I see in the test that the api server correctly returns `FailedToDraftTxWalletUtxoDetected`, which is the new error I added.
- Writing such an expectation was not so straightforward, so I stole some test code from the req lib we are using for http requests.
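The check described above boils down to an intersection between the user-provided utxos and the wallet utxos. The sketch below assumes both are plain `Map`s keyed by a textual input reference; `checkCommit` and `WalletUtxoDetected` are hypothetical names chosen for illustration and do not mirror the actual chain handle code.

```haskell
import qualified Data.Map as Map

-- Simplified, hypothetical types for illustration only.
type TxIn = String
type UTxO = Map.Map TxIn Integer

newtype CommitError = WalletUtxoDetected [TxIn]
  deriving (Eq, Show)

-- Reject a commit if any of the user's inputs also belongs to the wallet.
checkCommit :: UTxO -> UTxO -> Either CommitError UTxO
checkCommit walletUTxO userUTxO
  | Map.null overlap = Right userUTxO
  | otherwise = Left (WalletUtxoDetected (Map.keys overlap))
 where
  -- Entries of the user's UTxO whose keys also appear in the wallet.
  overlap = Map.intersection userUTxO walletUTxO
```

Reporting the offending inputs in the error makes it easier for an API client to tell the user which utxo to leave out.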
AB & PG on #186
Trying to generate a CPU profile for the hydra-node process spawned by the benchmarks, but this does not work. There is this SO question that seems to imply SIGINT could work, but it doesn't.
Trying the simple way: Add a timeout around hydra-node main to exit "gracefully" but this does not work either => No easy way to get CPU profiling for the node
void $ timeout 60_000_000 (run (identifyNode options))
Notes on how to build everything with profiling:
nix develop .#cabalOnly
cabal build all --enable-profiling
or
cabal bench hydra-cluster --enable-profiling
to just build the dependencies of bench with profiling
After a while we realise we are missing an important parameter: Our code does not run on more than 1 core because, even though we compiled it with `-threaded`, we have not activated `+RTS -N` to leverage SMP!
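One way to catch this kind of misconfiguration early is to ask the runtime how many capabilities it was given. The snippet below is a sketch using `getNumCapabilities` and `getNumProcessors` from base; the `describeParallelism` and `checkParallelism` helpers are hypothetical and not part of hydra-node. To bake the setting into an executable, GHC accepts `-threaded -rtsopts -with-rtsopts=-N` as build options.

```haskell
import Control.Concurrent (getNumCapabilities)
import GHC.Conc (getNumProcessors)

-- Describe how many cores the RTS may use versus how many the machine has.
describeParallelism :: Int -> Int -> String
describeParallelism capabilities processors
  | capabilities < processors =
      "using " <> show capabilities <> " of " <> show processors
        <> " cores: consider running with +RTS -N"
  | otherwise = "using all " <> show processors <> " cores"

-- Query the running RTS. Without +RTS -N this reports a single capability,
-- even when the program was compiled with -threaded.
checkParallelism :: IO String
checkParallelism = describeParallelism <$> getNumCapabilities <*> getNumProcessors
```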
Then, staring at the code, we find another "culprit" for the surprisingly low performance: saving the state.
Removing state persistence divides timing by an order of magnitude (more or less)...
{"event":"NewTx","id":"aa4c8568d7a2665f41387793aea3f5f1b9f185eedf2d715ec84e7d137f375429","tag":"TraceEvent","timestamp":"2023-06-08T08:23:17.047801498Z","us":16.332}
{"event":"ReqTx","id":"aa4c8568d7a2665f41387793aea3f5f1b9f185eedf2d715ec84e7d137f375429","tag":"TraceEvent","timestamp":"2023-06-08T08:23:17.047845564Z","us":1093.13}
{"event":"ReqSn","id":"aa4c8568d7a2665f41387793aea3f5f1b9f185eedf2d715ec84e7d137f375429","tag":"TraceEvent","timestamp":"2023-06-08T08:23:17.048943092Z","us":441.386}
{"event":"NewTx","id":"7c17bbfa52458f8c7c0fe9afd32e1212471377e254f96154dd561946c7316800","tag":"TraceEvent","timestamp":"2023-06-08T08:23:17.054012977Z","us":16.381}
{"event":"ReqTx","id":"7c17bbfa52458f8c7c0fe9afd32e1212471377e254f96154dd561946c7316800","tag":"TraceEvent","timestamp":"2023-06-08T08:23:17.054057723Z","us":2621.656}
{"event":"ReqSn","id":"7c17bbfa52458f8c7c0fe9afd32e1212471377e254f96154dd561946c7316800","tag":"TraceEvent","timestamp":"2023-06-08T08:23:17.056683677Z","us":441.336}
{"event":"NewTx","id":"8da1a94337061b996e609317e76b2ca9ed8dd6cb1942a19badd5de093c951015","tag":"TraceEvent","timestamp":"2023-06-08T08:23:17.062108269Z","us":17.844}
{"event":"ReqTx","id":"8da1a94337061b996e609317e76b2ca9ed8dd6cb1942a19badd5de093c951015","tag":"TraceEvent","timestamp":"2023-06-08T08:23:17.062161843Z","us":997.384}
{"event":"ReqSn","id":"8da1a94337061b996e609317e76b2ca9ed8dd6cb1942a19badd5de093c951015","tag":"TraceEvent","timestamp":"2023-06-08T08:23:17.063163505Z","us":509.228}
Trying to remove persistence in API server too does lead to the same impact.
Let's evaluate the impact of a combination of variations:
| Persistence | Logs | Multicore | Conf time |
|---|---|---|---|
| N | Y | N | 18ms |
| N | Y | Y | 10ms |
| Y | Y | N | 30ms |
| N | N | N | 5ms |
| N | N | Y | 2.15ms |
| Y | Y | Y | 40ms |
Then we realise the profiling is still on, let's try to compute performance without profiling enabled and see if it makes a difference.
Logging has a significant impact too, but it's actually because of the lame `LogicOutcome` log entry, which is way too large. Removing persistence (saving state) and logging of `LogicOutcome` already yields a significant improvement:
{"event":"NewTx","id":"bb1d045c609a9e91e0c40f6eb567b0b71bd939954e1402060e5c5224f496e2fa","tag":"TraceEvent","timestamp":"2023-06-08T10:03:19.530643018Z","us":5.179}
{"effect":"TxValid","id":"bb1d045c609a9e91e0c40f6eb567b0b71bd939954e1402060e5c5224f496e2fa","tag":"TraceEffect","timestamp":"2023-06-08T10:03:19.530795207Z","us":70.614}
{"event":"ReqTx","id":"bb1d045c609a9e91e0c40f6eb567b0b71bd939954e1402060e5c5224f496e2fa","tag":"TraceEvent","timestamp":"2023-06-08T10:03:19.530656443Z","us":210.43}
{"event":"ReqSn","id":"bb1d045c609a9e91e0c40f6eb567b0b71bd939954e1402060e5c5224f496e2fa","tag":"TraceEvent","timestamp":"2023-06-08T10:03:19.530868226Z","us":128.805}
{"effect":"SnapshotConfirmed","id":"bb1d045c609a9e91e0c40f6eb567b0b71bd939954e1402060e5c5224f496e2fa","tag":"TraceEffect","timestamp":"2023-06-08T10:03:19.531200748Z","us":179.562}
{"effect":"ReqTx","id":"e40ec70102ba99608e4ede86592c02773cf6fb5c671eabe4d01ad735fa2c2492","tag":"TraceEffect","timestamp":"2023-06-08T10:03:19.531624615Z","us":1.933}
{"event":"NewTx","id":"e40ec70102ba99608e4ede86592c02773cf6fb5c671eabe4d01ad735fa2c2492","tag":"TraceEvent","timestamp":"2023-06-08T10:03:19.531623082Z","us":3.857}
{"effect":"TxValid","id":"e40ec70102ba99608e4ede86592c02773cf6fb5c671eabe4d01ad735fa2c2492","tag":"TraceEffect","timestamp":"2023-06-08T10:03:19.531758179Z","us":68.69}
{"event":"ReqTx","id":"e40ec70102ba99608e4ede86592c02773cf6fb5c671eabe4d01ad735fa2c2492","tag":"TraceEvent","timestamp":"2023-06-08T10:03:19.531633331Z","us":194.64}
{"event":"ReqSn","id":"e40ec70102ba99608e4ede86592c02773cf6fb5c671eabe4d01ad735fa2c2492","tag":"TraceEvent","timestamp":"2023-06-08T10:03:19.531829134Z","us":217.674}
{"effect":"SnapshotConfirmed","id":"e40ec70102ba99608e4ede86592c02773cf6fb5c671eabe4d01ad735fa2c2492","tag":"TraceEffect","timestamp":"2023-06-08T10:03:19.532247429Z","us":85.944}
{"effect":"ReqTx","id":"5fe5c7d87e657d4c035cc4f8f66e2cb65d11ca796f1472a6e235e1b78a8881f7","tag":"TraceEffect","timestamp":"2023-06-08T10:03:19.53259484Z","us":1.914}
{"event":"NewTx","id":"5fe5c7d87e657d4c035cc4f8f66e2cb65d11ca796f1472a6e235e1b78a8881f7","tag":"TraceEvent","timestamp":"2023-06-08T10:03:19.532593127Z","us":4.007}
{"effect":"TxValid","id":"5fe5c7d87e657d4c035cc4f8f66e2cb65d11ca796f1472a6e235e1b78a8881f7","tag":"TraceEffect","timestamp":"2023-06-08T10:03:19.532727873Z","us":63.681}
{"event":"ReqTx","id":"5fe5c7d87e657d4c035cc4f8f66e2cb65d11ca796f1472a6e235e1b78a8881f7","tag":"TraceEvent","timestamp":"2023-06-08T10:03:19.532603997Z","us":188.469}
{"event":"ReqSn","id":"5fe5c7d87e657d4c035cc4f8f66e2cb65d11ca796f1472a6e235e1b78a8881f7","tag":"TraceEvent","timestamp":"2023-06-08T10:03:19.532793738Z","us":116.662}
Running the benchmark without profiling, saving state nor logging the logic outcome, but with other logging on gives:
Starting benchmark
Seeding network
Initializing Head
Comitting initialUTxO from dataset
HeadIsOpen
[2023-06-08 10:08:35.810846141 UTC] Client 1 (node 0): 0/3000 (0.00%)
[2023-06-08 10:08:40.812716298 UTC] Client 1 (node 0): 2737/3000 (91.23%)
All transactions confirmed. Sweet!
[2023-06-08 10:08:45.813345134 UTC] Closing the Head
Finalizing the Head
Writing results to: /tmp/bench-39dec0ad3b3ab69d/results.csv
Confirmed txs: 3000
Average confirmation time (ms): 1.6892676483333335
which is pretty good and actually in our target of 500-1000tps
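As a rough sanity check of the tps claim, the numbers in the output above can be turned into rates. This is only back-of-the-envelope arithmetic: the two progress lines give a coarse observed throughput (roughly 547 tx/s), while inverting the average confirmation time gives the rate one would get if transactions were confirmed strictly one after another (about 592 tx/s), which is an assumption and not something the benchmark states. The function names are introduced here for illustration.

```haskell
-- Transactions per second, given a count and an elapsed time in seconds.
throughput :: Int -> Double -> Double
throughput confirmed seconds = fromIntegral confirmed / seconds

-- Rate implied by a per-transaction latency in milliseconds, assuming
-- transactions are confirmed strictly one after another.
sequentialRate :: Double -> Double
sequentialRate confirmationMs = 1000 / confirmationMs

-- 2737 txs confirmed between 10:08:35.810846141 and 10:08:40.812716298.
observedRate :: Double
observedRate = throughput 2737 (40.812716298 - 35.810846141)

-- Derived from the reported average confirmation time.
rateFromAverage :: Double
rateFromAverage = sequentialRate 1.6892676483333335
```

Both figures land inside the 500-1000 range mentioned above.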
With multithreading on by default (in the build), we get the following results:
log outcome | persistence (state) | persistence (output) | time (ms) |
---|---|---|---|
N | Y | Y | 30 |
Y | Y | Y | 31 |
Y | N | Y | 4 |
Y | N | N | 2.5 |
N | N | N | 1.2 |
Confirmation time without any logging nor persistence: 0.75 ms
Here is a summary of this effort:
- We noticed very early that the CPU usage when running the benchmarks (~50%) did not make sense and was too low. It turns out we had not turned on effective use of multi-core CPUs for running the benchmark and the hydra-node. While we compiled the code with SMP support (`-threaded`), we did not activate it with `+RTS -N`, which made all our code single-threaded.
- Evolving the log-filter, we were able to extract the durations of the main events and effects from the logs, which showed the timings for ReqTx, ReqSn, or SnapshotConfirmed to be very long (~10ms). Looking at the code in `Hydra.Node`, one of the obvious time sinks seemed to be the persistence: We `save` the state upon nearly each event, as most of them entail a state change, and we do this very naively by dumping the whole state into a file.
- Logs should not have an impact in production but of course, when execution is single-threaded, they do! Also, we are logging a very large structure when we log the `LogicOutcome` with the full state of the head, whose serialisation is very heavy.
- We ran benchmarks with various combinations of logging/persistence/multi-threading to assess the impact of all these elements and identify the main bottlenecks to work on.
Ultimately, here are the outcomes from these experiments:
1. Make state persistence smarter, using an append-only mechanism just like we do for server output (the impact of storing server output is minimal on performance)
2. Simplify the content logged for state changes (`LogicOutcome`)
3. Ensure SMP is active (although it can be detrimental with a lot of IO)
We have started working on 1. with a simple goal: Persist the stream of events in an append-only way, then replay them at load time. While this is fine, it's still annoying that we have to go through the `update` function to restore the state, so we think a better solution would be to have the `NewState` outcome contain a diff, i.e. an actual event in the Event Sourcing sense of the word, that contains very few semantics and can be applied straightforwardly to update the state. This will then make it clear in the logs what has changed in the state, and what "event" (or rather a command, actually) caused it.
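The idea of persisting small diffs and folding them back into a state can be sketched as follows. Everything here is hypothetical and heavily simplified: the `StateChanged` constructors, the `HeadState` fields and the line-based `Show`/`Read` file format are assumptions made for the example, not the actual hydra-node types or persistence format.

```haskell
import Data.List (foldl')

-- Hypothetical events: each one is a small diff to the state.
data StateChanged
  = TxAdded String
  | SnapshotConfirmed Int
  deriving (Eq, Show, Read)

data HeadState = HeadState
  { pendingTxs :: [String]
  , confirmedSnapshot :: Int
  }
  deriving (Eq, Show)

initialState :: HeadState
initialState = HeadState{pendingTxs = [], confirmedSnapshot = 0}

-- Applying an event involves no decision logic, only a state update.
apply :: HeadState -> StateChanged -> HeadState
apply st (TxAdded tx) = st{pendingTxs = pendingTxs st <> [tx]}
apply st (SnapshotConfirmed n) = st{pendingTxs = [], confirmedSnapshot = n}

-- Append one event per line instead of rewriting the whole state.
persistEvent :: FilePath -> StateChanged -> IO ()
persistEvent path event = appendFile path (show event <> "\n")

-- Rebuild the state at load time by replaying all persisted events.
loadState :: FilePath -> IO HeadState
loadState path = do
  contents <- readFile path
  let st = foldl' apply initialState (map read (lines contents))
  st `seq` pure st -- force the fold so the file is fully read here
```

With this shape, each write costs one short line, and the log of persisted events doubles as a readable record of what changed.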
Experimenting with CI
What if we setup a dev environment on CI and just build and test on it?
This script sets up a dev environment over Ubuntu (cardano not installed yet, though): https://github.com/input-output-hk/hydra/blob/alternative_ci/.github/workflows/bin/setup_dev_ubuntu
This job was successful in building the hydra project with this setup: https://github.com/input-output-hk/hydra/actions/runs/5210718929/jobs/9402138030
It took 1h39m and, of course, the GitHub cache setup did not work so can’t benefit from it in another run :)
Note: recommended GitHub-cache setup: https://github.com/actions/cache/blob/main/examples.md#haskell---cabal
I created this project to better understand how the gh cache works: https://github.com/pgrange/to_remove_cache
The issue is reproduced in this build: https://github.com/pgrange/to_remove_cache/actions/runs/5215793729/jobs/9413776978
Actually cabal does not store stuff where I thought it was and, by the way, the GitHub doc is wrong.
At the end of day, the best I can get for compiling a dummy project when nothing changed since the last run is 26": https://github.com/pgrange/to_remove_cache/actions/runs/5216337562/jobs/9414956883
- Using the internal utxos of the node as collateral
- Are we ok with anyone being able to spend money from the node?
- The node can check that the transaction “passes” so it does look safe to use hydra-node funds as collateral as this collateral should not be spent at the end of the day.
- Also, the client having access to the hydra-node might be legit to spend fuel from the node.
- Switching to openapi?
- Hard to make regular HTTP Rest request into async api tooling
- Maybe would be easier to insert websocket approach into open API tooling
- github discussion + user feedback + grooming to address that
- How far to go on commit from scripts? Native, multiple plutus versions, from reference inputs?
- Multiple PR with reduced scope
- Incrementally increase features on this aspect should not be so much of breaking changes
- Have multiple small PR
- Should we close external commit normal utxo pr?
- Would not close the feature but would be releasable though
- The api is still going to change, probably
- It might be valuable enough to merge it even though the api will change later
- Let’s merge it!
- What about event store alternatives: https://redpanda.com/ and https://www.scylladb.com/
- Could improve operations (persistence, backups, analytics, debugging, etc)
- Provide views
- Concerns about website management
- Error handling in draft tx
- Should we have dedicated exceptions for draftTx, different from the ones from postTx?
- MonadThrow vs. Either
We have a check-documentation workflow which seems to have never worked in the CI. Quickly fixing it and let’s see if it can find problems on our website: https://github.com/input-output-hk/hydra/actions/runs/5185911492
Anyway, CI is still exposing strong performance discrepancies, in particular for the haddock job. It seems that it’s coming from the Prepare tools step. Let’s compare two runs:
Both runs seem to do pretty much the same so it should come either from a runner performance issue or bandwidth problem with the nix cache.
Looking at just the first 9 lines of output which download the same things:
We can try to reduce that by generating the doc with nix instead of cabal, hence downloading less stuff from the nix cache.
Doing so for hydra-node seems to work out of the box:
nom build .?submodules=1#hydraProject.x86_64-linux.hsPkgs.hydra-node.components.library.doc --print-out-paths
But, for some reason, we have a failure with hydra-plutus (although calling `cabal haddock hydra-plutus` works):
nom build .?submodules=1#hydraProject.x86_64-linux.hsPkgs.hydra-plutus.components.library.doc
…
hydra-plutus-lib-hydra-plutus-0.11.0-haddock> GHC Core to PLC plugin: E043:Error: Reference to a name which is not a local, a builtin, or an external INLINABLE function: Variable Plutus.Extras.wrapValidator
...
There is a very similar issue in Haskell.nix that's been open for a while. We may be stuck until it's fixed, at least for hydra-plutus.
Trying to replace `sendTextData` with `sendBinaryData` in the `HydraNode` WS client: Perhaps the encoded message breaks the sending in some way?
- Sending binary data does not help, still having the same issue
It's as if the `Close` message never reached the server... Perhaps the message killed the thread on the other side?
- Trying to trace the raw messages received from the client by the server to see where it fails
- Also ensure that we use binary data on the server side, replacing `sendTextData` with `sendBinaryData`.
- Does not seem to have much effect though...
Trying to use low-level socket handling and pass it to the WS client to have better control over what's happening in the network communication. By using a socket directly, I can see the following log entries:
{"timestamp":"2023-06-06T14:19:51.043198764Z","threadId":395,"namespace":"Test","message":{"contents":"<socket: 13>","tag":"SocketClosed"}}
{"timestamp":"2023-06-06T14:19:51.043205638Z","threadId":395,"namespace":"Test","message":{"contents":"<socket: 13>","tag":"SocketClosed"}}
{"timestamp":"2023-06-06T14:19:51.043534025Z","threadId":387,"namespace":"Test","message":{"message":{"tag":"Close"},"nodeId":0,"tag":"SentMessage"}}
So this seems to indicate that somehow the `action` is ending?
So indeed the action wrapped with the connection terminates:
{"timestamp":"2023-06-06T15:08:05.321220113Z","threadId":2514,"namespace":"Test","message":{"nodeId":0,"tag":"NodeStopped"}}
{"timestamp":"2023-06-06T15:08:05.32125931Z","threadId":2514,"namespace":"Test","message":{"contents":"<socket: 13>","tag":"SocketClosed"}}
{"timestamp":"2023-06-06T15:08:05.321576676Z","threadId":2506,"namespace":"Test","message":{"message":{"tag":"Close"},"nodeId":0,"tag":"SentMessage"}}
Looking at the result of the action: It's a huge map from txids to events, i.e. the output of `processTransactions`. Why is this the "result" of the action?
We just realised that there's actually 2 connections open to the node: One for the main body of the test, and one for each client that submits tx!
When `processTransactions` stops, the other thread has to reread the whole list of server outputs, which takes a while. Yet it's not clear why the test crashes 🤔
- With a single connection, it works!
With the old version of the code (eg. 2 clients) I can see the data is accumulated in socket buffers:
$ ss | grep 4000
tcp ESTAB 87167 0 127.0.0.1:59504 127.0.0.1:4000
tcp ESTAB 0 253903 127.0.0.1:4000 127.0.0.1:59504
tcp ESTAB 0 0 127.0.0.1:59520 127.0.0.1:4000
tcp ESTAB 0 248 127.0.0.1:4000 127.0.0.1:59520
Lessons learnt:
- You should consume the data from an open connection/Socket!
- If you're having "random" connection crashes, check accumulation of unread/buffered data on both ends
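One way to follow the first lesson is to keep a background thread reading from any connection that the main code does not consume itself. The helper below is a generic sketch using only base: `withDrained` is a hypothetical name, and `receive` stands for whatever blocking read the connection offers (for example a websocket receive call). The `demo` function uses a fake receive action that merely counts calls.

```haskell
import Control.Concurrent (forkIO, killThread, modifyMVar_, newMVar, readMVar, threadDelay)
import Control.Exception (bracket)
import Control.Monad (forever, void)

-- Run 'action' while a background thread keeps calling 'receive' and
-- discarding the result, so unread data does not pile up in socket buffers.
-- Note: if 'receive' throws, the draining thread dies silently.
withDrained :: IO a -> IO b -> IO b
withDrained receive action =
  bracket (forkIO (forever (void receive))) killThread (const action)

-- Demo with a fake receive that counts how often it was called.
demo :: IO Int
demo = do
  received <- newMVar (0 :: Int)
  let fakeReceive = modifyMVar_ received (pure . (+ 1)) >> threadDelay 1000
  withDrained fakeReceive (threadDelay 50000)
  readMVar received
```

Using `bracket` ensures the draining thread is stopped when the main action finishes or fails.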
After a while, the socket with accumulated unread buffer gets closed:
$ ss | grep 4000
tcp ESTAB 87167 0 127.0.0.1:59504 127.0.0.1:4000
tcp ESTAB 0 0 127.0.0.1:59520 127.0.0.1:4000
tcp ESTAB 0 248 127.0.0.1:4000 127.0.0.1:59520
- Benchmarks and timeout issue on connection
- Pairing with someone else!
- Errors on large / long running benchmarks, connections get dropped
- Worked a bit on picking json instance in the api server (for tx and utxo exclusion) using reflection lib
- Opening a head?
- What do we want to do about it?
- Not sure what happened, let’s create a new head and see if the issue happens again
- Open a new head at 3pm
- Link url for async-api channel http bindings.
- The `yarn validate` command (under docs) needs to be extended every time we introduce a new channel to the api.
- Aeson adds a `tag` attribute on sum types by default.
- It seems the best last-resort troubleshooting to overcome a broken head is: everybody backs up and deletes their persistence state and starts the chain from a point in time in the past, before the head was initialised. This throws away all the off-chain operations, but is better than closing and re-opening the head later.
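The default `tag` behaviour of Aeson can be seen on a small example. This is a sketch assuming the aeson package with its default generic options; `ClientMessage` is a hypothetical type, only loosely modelled on the messages seen in the logs above.

```haskell
{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE OverloadedStrings #-}

import Data.Aeson (ToJSON, Value, object, toJSON, (.=))
import GHC.Generics (Generic)

-- Hypothetical sum type used only for this illustration.
data ClientMessage
  = NewTx {transaction :: String}
  | Close
  deriving (Show, Generic)

-- No methods given: aeson falls back to its Generic-based defaults.
instance ToJSON ClientMessage

-- The default encoding of 'NewTx "abc"', written by hand: the constructor
-- name goes under "tag" and the record fields sit alongside it.
expectedNewTx :: Value
expectedNewTx = object ["tag" .= ("NewTx" :: String), "transaction" .= ("abc" :: String)]

-- A constructor without fields only carries the tag.
expectedClose :: Value
expectedClose = object ["tag" .= ("Close" :: String)]
```

Comparing `toJSON (NewTx "abc")` with `expectedNewTx` (and `toJSON Close` with `expectedClose`) in GHCi shows why every sample needs the `tag` field.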
We saw that adding a new line in the right place fixes the issue. To see if I can file an issue on docusaurus, I try to reproduce on a brand new small project and it does reproduce.
https://github.com/facebook/docusaurus/issues/9036
Today the monthly seems broken again. Reloading the page, it was fixed. Probably a cache issue on my machine, but surprising because I was seeing the right page last week.
Looking into the details, SN notices that there is also a problem with one picture. This is not the picture we should see:
Reloading the page does not fix it this time. Looking at the html we can see that the html does, indeed, include the same picture twice:
<img loading="lazy" alt="CI perf" src="/head-protocol/assets/images/2023-05-ci-perf-6f40f4793f448feae776a3e87c1c4d59.png" width="1942" height="1142" class="img_ev3q">
...
<img loading="lazy" alt="CI perf" src="/head-protocol/assets/images/2023-05-ci-perf-6f40f4793f448feae776a3e87c1c4d59.png" width="1942" height="1142" class="img_ev3q">
Looking at the source code of the monthly on master we see, on the contrary, that we use two different pictures here:
![CI perf](./img/2023-05-ci-perf.png) <small><center>C.I. execution total execution time</center></small>
...
![Two instances of HydraNow (in browser) connected via a hydra-pay channel](./img/2023-05-hydra-now.png) <small><center>Two instances of HydraNow (in browser) connected via a hydra-pay channel</center></small>
Asking SB, he also sees the following:
- Wrong monthly page without may
- Reloading shows the right page with may
- Clicking on may he can see the right picture
- Reloading the page, now, it's the wrong picture
- The html code has changed between both page loadings
Looking at the html files on the gh-pages branch we can see the appropriate html code, referencing the right image, in both the stable and unstable monthly:
git switch gh-pages
grep images head-protocol/monthly/2023-05/index.html
grep images head-protocol/unstable/monthly/2023-05/index.html
In the browser, we disable caching and look at the network. We can see that the html does, indeed, point to the right picture. But the page displays the wrong one and, inspecting this wrong picture, we can see that, here, the html is wrong. There is something wrong with JS messing with the source page: we put a breakpoint to prevent Javascript from running and we see the appropriate picture. We let Javascript run and the picture changes.
Reproducing locally:
We download the docs-unstable artifact from master and run it locally. Loading the may monthly report we see the right picture first. Then, reloading the page, we see the wrong one.
Reproducing more locally:
git switch master
cd docs
yarn build-dev
yarn serve
We reproduce the problem here: right picture first, then wrong picture.
There is a docusaurus upgrade, let's see:
yarn upgrade @docusaurus/core@latest @docusaurus/plugin-content-docs@latest @docusaurus/preset-classic@latest @docusaurus/theme-mermaid@latest
This does not fix the problem.
Just in case, testing with another fresh browser (brave instead of safari) reproduces the issue.
Introducing a newline at a very different place of the file fixes the issue:
--- a/docs/monthly/2023-05-monthly.md
+++ b/docs/monthly/2023-05-monthly.md
@@ -16,7 +16,8 @@ This month the team released version 0.10.0 which includes many important featur
The project [roadmap](https://github.com/orgs/input-output-hk/projects/21) was
only slightly updated this month and already saw one more feature completed:
-![The roadmap without idea items](./img/2023-05-roadmap-ex-ideas.png) <small><center>The roadmap without idea items</center></small>
+![The roadmap without idea items](./img/2023-05-roadmap-ex-ideas.png)
+<small><center>The roadmap without idea items</center></small>
Pushing a quick fix commit to master to fix the monthly report.
- Investigating the benchmark failures with arnaud
- On my machine they are consistently failing with `--scaling-factor 100` after `~2` minutes
- We notice a significant delay between the server-sent `timestamp` and the one added on the client side for tracing? (maybe a red herring)
- Could confirm the exception visible is ultimately from the `receiveData` call in `waitNext`
- Trying inherited stdout and stderr to see if anything is printed by the `hydra-node` on error (which is not caught by the `proc` call)
- Nope. Just as before:
Closing the Head
{"timestamp":"2023-06-05T16:39:16.732920841Z","threadId":315,"namespace":"HydraNode-\"0\"","message":{"api":{"reason":"Warp: Client closed connection prematurely","tag":"APIConnectionError"},"tag":"APIServer"}}
Logfile written to: /tmp/bench-0f110ba838d65887/logs/hydra-node-0.log
Logfile written to: /tmp/bench-0f110ba838d65887/logs/cardano-node.log
Load test on 1 local nodes in /tmp/bench-0f110ba838d65887 [✘]
Failures:
  bench/Bench/EndToEnd.hs:115:29:
  1) Load test on 1 local nodes in /tmp/bench-0f110ba838d65887
       uncaught exception: ErrorCall
       receiveData failed: Network.Socket.recvBuf: resource vanished (Connection reset by peer)
- The exception gets thrown from around here: https://hackage.haskell.org/package/network-3.1.4.0/docs/src/Network.Socket.Buffer.html#recvBuf
- Looking at the server side, what happens before the Warp exception handler?
- The error text is coming from Warp and seems to be a `ConnectionClosedByPeer` exception.
- Try to trace before trying to send a response on server side to see if this is happening.
- Logging the request in the warp handler confirms it’s the websocket connection that saw the exception:
Just Request {requestMethod = "GET", httpVersion = HTTP/1.1, rawPathInfo = "/", rawQueryString = "", requestHeaders = [("Host","127.0.0.1:4000"),("Connection","Upgrade"),("Upgrade","websocket"),("Sec-WebSocket-Key","5gwg2rTMTrRpSaWexFTx4w=="),("Sec-WebSocket-Version","13")], isSecure = False, remoteHost = 127.0.0.1:36138, pathInfo = [], queryString = [], requestBody = <IO ByteString>, vault = <Vault>, requestBodyLength = KnownLength 0, requestHeaderHost = Just "127.0.0.1:4000", requestHeaderRange = Nothing} Warp: Client closed connection prematurely
- Searching the exception in the warp code base: https://github.com/search?q=repo%3Ayesodweb%2Fwai%20ConnectionClosedByPeer&type=code
Discussing again about our documentation and the state of the website, we realize that we do not need, nor want, to version the whole website but only part of it.
Should be versioned:
- API
- Benchmark plutus
- Specification
The rest should be up to date with master. Especially the demo part because, when you think about it, users will check out master and run the demo and, if it does not match master and fails, they will file an issue.
If we remove the `--help` output from the QuickStart guide, we can also stop versioning it without problem.
At the end of the day, we can make it work with docusaurus versioning.
It would look something like:
- Download spec from release and put it in docs/specification
- yarn docusaurus docs:version:specification 0.10.0
- Download spec from master and put it in docs/specification
- Download haddoc-and-benchmarks from 0.10.0
- Put benchmarks in docs/benchmarks
- yarn docusaurus docs:version:benchmarks 0.10.0
- Download benchmarks from master and put it in docs/benchmarks
- yarn build
To have a clean version dropdown, we had to hack a custom component.
Unfortunately, the next obstacle is the API. It's all handled by a haddock component, which makes the docusaurus versioning ineffective for this part.
Trying to generate the API doc in html and including it in docusaurus content fails with the following error:
7 |
8 | <h1 {...{"id":"api"}}>{`API`}</h1>
> 9 | <iframe style={{width: '100%', height: '100%'}} src=''@site/static/api/api.html' title="Api"></iframe>
| ^
10 | </MDXLayout>
11 | )
12 | };
Trying to generate the API doc in markdown and including it in docusaurus content fails with the following error:
434 | <tr parentName="tbody">
435 | <td parentName="tr" {...{"align":null}}>{`transaction.2.body.inputs`}</td>
> 436 | <td parentName="tr" {...{"align":null}}>{`array`}<string></td>
| ^
437 | <td parentName="tr" {...{"align":null}}>{`A list of inputs for this transaction. Technically, this is actually a Set, eg. the order of elements does not matter and they must be unique.`}</td>
438 | <td parentName="tr" {...{"align":null}}>{`-`}</td>
439 | <td parentName="tr" {...{"align":null}}>{`-`}</td>
To unblock the current situation, the double doc generation is reactivated with #902.
Monthly is not published on the website: https://hydra.family/head-protocol/monthly
Monthly is published on the unstable version instead: https://hydra.family/head-protocol/unstable/monthly/
Looking at the CI I can see it is moved from the unstable to the stable doc.
Looking at the gh-pages branch I can see that the monthly for May is indeed in the appropriate directory.
But it appears that it’s not enough for it to be visible on the website.
I try to reproduce locally:
#> git switch gh-pages
#> python3 -m http.server 8000
#> (xdg-)open http://localhost:8000/head-protocol/monthly/
Expected: the website opens and I see the monthly with the missing May edition.
Actual: 404 error
I click on the monthly link on this page.
Expected: 404 error because I’m staying on this page
Actual: I’m sent to http://localhost:8000/head-protocol/unstable/monthly and can see the May monthly, but in the unstable website.
We used to build two docs for master, one not having the /unstable in its path: https://github.com/input-output-hk/hydra/pull/880/files
We don’t anymore. So, just moving the monthly from unstable to / breaks docusaurus javascript.
** FIX **
#> cd docs
#> yarn build
…
Copy the monthly directory to gh-pages, including assets according to what the CI currently does.
Oops, we’re also missing the Spanish monthly, so we need to fix the CI for that too.
Our hypothesis: AB's node restarted with both a persisted state and the `--start-chain-from` option. Replaying the logs from the cardano ledger somehow messed with the state of AB's node, which would explain why it rejected snapshot number 7 in the past.
We reproduce this setup on a devnet with Alice, Bob and Carol from the demo page (which needs to be fixed by the way):
- Open a head
- Make a transaction -> observe Snapshot 1 is confirmed by all nodes
- Restart Alice from some old block (before the head has been opened) 3.3361699055b62eaca1744a9a73a77f3e96ef116005c334d1f76d2bf7901207d6
- Make a transaction
Expected: the snapshot 2 for the last transaction is not confirmed by Alice because it's in the same problematic state as AB's node.
Actual: the snapshot 2 is confirmed by all the nodes, everything works, thank you very much.
We need to find a better hypothesis.
- Latest findings from broken head investigation
- it appears that using `--start-chain-from` with a persisted state + the node restarting was problematic
- arnaud's node did not confirm the snapshot because it had an initial off-chain state (got reset)
- Not fully confirmed yet, but very likely
- Shall we take action on this?
- Not allow both “persisted state” and --start-chain-from at the same time
- Draft an issue and directly work on that
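The proposed action (refuse to start when both a persisted state and `--start-chain-from` are given) could be sketched as a small pure check. This is only an illustration under assumptions: `ChainPoint`, `StartupError` and `checkStartOptions` are hypothetical stand-ins, not the actual hydra-node types or functions.

```haskell
-- Hypothetical stand-in for a chain point (slot number and block hash).
data ChainPoint = ChainPoint Integer String
  deriving (Eq, Show)

-- Hypothetical error reported when the two options conflict.
newtype StartupError = PersistedStateWithStartChainFrom ChainPoint
  deriving (Eq, Show)

-- Reject the combination of an existing persisted state and an explicit
-- start point; every other combination is passed through unchanged.
-- First argument: True if a persisted head state was found on disk.
-- Second argument: the value of --start-chain-from, if given.
checkStartOptions :: Bool -> Maybe ChainPoint -> Either StartupError (Maybe ChainPoint)
checkStartOptions True (Just point) = Left (PersistedStateWithStartChainFrom point)
checkStartOptions _ startFrom = Right startFrom
```

Failing early like this trades convenience for safety: the operator has to either wipe the persisted state or drop the option, instead of ending up with a silently reset off-chain state.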
Our head is broken, so we collect the logs of all nodes and explore. First problem: everything is in JSON. Some log lines are more than 2048 characters long and, depending on the system (when logging to journald), they are split across multiple lines.
Now we have the logs... what are we looking for?
The logs are a tree structure, which makes them hard to explore. Maybe we want to have something more flat, or at least flatten the things that would allow us to easily filter in/out what we want to see. Right now, `grep` is easier to use than `jq`, which indicates it's a bit complicated to read the path to find the tag you want to filter on.
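To get a feel for what "more flat" could mean, here is a minimal sketch that turns a nested entry into dotted paths. It deliberately uses a tiny hand-rolled `Value` type (an assumption for the sake of a self-contained example, not aeson and not our actual log types), and the `example` entry is an abridged log entry of the shape hydra-node produces.

```haskell
import Data.List (intercalate)

-- Minimal JSON-like tree, just enough for the illustration (no arrays).
data Value
  = Object [(String, Value)]
  | Text String
  | Number Integer
  deriving (Eq, Show)

-- Flatten nested objects into (dotted path, leaf value) pairs.
flatten :: Value -> [(String, Value)]
flatten = go []
 where
  go path (Object fields) = concatMap (\(k, v) -> go (k : path) v) fields
  go path leaf = [(intercalate "." (reverse path), leaf)]

-- Abridged log entry with the same nesting as a LogicOutcome error.
example :: Value
example =
  Object
    [ ("threadId", Number 79)
    , ( "message"
      , Object
          [ ( "node"
            , Object
                [ ( "outcome"
                  , Object
                      [ ("error", Object [("contents", Text "requireReqSn"), ("tag", Text "RequireFailed")])
                      , ("tag", Text "Error")
                      ]
                  )
                , ("tag", Text "LogicOutcome")
                ]
            )
          , ("tag", Text "Node")
          ]
      )
    ]
```

With this, `lookup "message.node.outcome.error.contents" (flatten example)` evaluates to `Just (Text "requireReqSn")`, so filtering becomes a lookup on a single key instead of a walk through the tree.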
After several hours we can track it down to snapshot 7 being rejected by AB's node, but why?
May 23 09:36:02 Debian-1106-bullseye-amd64-base hydra-node[1794778]: {"timestamp":"2023-05-23T09:36:02.778494815Z","threadId":79,"namespace":"HydraNode-\"arnaud\"","message":{"node":{"by":{"vkey":"286b3cbf98f62a4f9f3368dc7ddbc02001b2580ee5e3720cb7619e2492d1bb77"},"outcome":{"error":{"contents":"requireReqSn","tag":"RequireFailed"},"tag":"Error"},"tag":"LogicOutcome"},"tag":"Node"}}
Added the following configuration to collect metrics from the Hydra node:
metrics:
  configs:
  - name: integrations
    remote_write:
    - basic_auth:
        password: <password>
        username: <username>
      url: https://prometheus-prod-24-prod-eu-west-2.grafana.net/api/prom/push
    scrape_configs:
      # Add here any snippet that belongs to the `metrics.configs.scrape_configs` section.
      # For a correct indentation, paste snippets copied from Grafana Cloud at the beginning of the line.
      - job_name: agent
        static_configs:
          - targets: ['127.0.0.1:6001']
  global:
    scrape_interval: 60s
  wal_directory: /tmp/grafana-agent-wal
Some notes on things to improve in logs and hydra-node operations:
- Remove duplication of `BeginEffect/EndEffect` (already done for events)
- Add more information on events' errors and delays (esp. for `AckSn` and `ReqSn` as they are the most sensitive ones). We currently just return something like `Error $ RequireFailed "requireValidAckSn"` which is not very useful when troubleshooting issues
- Simplify logs to contain only the "relevant" information, esp. the logs about the `HeadState`
- Our initial log implementation directly used "business domain objects" to be logged, which was fine as it enabled us to have something faster and remove the need for custom log items, but this is not tenable in the long run
- We want log entries to be mostly "flat" in order to ease analysis, filtering, correlation, etc.
- In general, we need to find a strategy for an orderly reset to some known correct state should something go wrong, eg. being able to "rollback" all heads to some earlier snapshots. The cost of a stuck head is too high to be generally acceptable as a failure recovery strategy
- Even more generally, we need to secure network communications in order to provide and enforce some guarantees on message delivery. Ideally we would like exactly-once semantics, which implies some form of persistence, tracking and acking of messages delivered to various parties, resending lost messages, etc. A Hydra network should be resilient to (at least) fail-stop failures and possibly to byzantine failures.
I want to write a test that will prove that one can not use the `--start-chain-from` option if a state already exists.
To set this option, I want to generate a random chain point but I need this chain point to not be the Genesis block or I’ll get a runtime error from the node so:
startChainPoint <- generate arbitrary `suchThat` (/= ChainPointAtGenesis)
let nodeArgs =
toArgs
defaultRunOptions
{ …
, defaultChainConfig
{ …
, startChainFrom = Just startChainPoint
…
So I need to import this `ChainPointAtGenesis` but hls, for some reason, does not help me today…. Ok.
But ghc is in a good mood:
test/Test/EndToEndSpec.hs:494:66: error:
• Data constructor not in scope:
ChainPointAtGenesis
:: cardano-api-1.36.0:Cardano.Api.Block.ChainPoint
• Perhaps you want to add ‘ChainPointAtGenesis’ to the import list
in the import of ‘Hydra.Cardano.Api’
(test/Test/EndToEndSpec.hs:(26,1)-(43,2)).
It’s a bit funny that it will tell me that ChainPointAtGenesis is part of cardano-api but that I should import it from Hydra.Cardano.Api 🤯 ... anyway, doing this I then get the following error:
test/Test/EndToEndSpec.hs:489:46: error:
• Couldn't match type ‘Gen’ with ‘IO’
Expected type: IO ChainPoint
Actual type: Gen ChainPoint
• In a stmt of a 'do' block:
startChainPoint :: ChainPoint <- generate arbitrary
`suchThat` (/= ChainPointAtGenesis)
In the expression:
do hydraScriptsTxId <- publishHydraScriptsAs node Faucet
let persistenceDir = dir </> "persistence"
let cardanoSK = dir </> "cardano.sk"
let hydraSK = dir </> "hydra.sk"
....
In the second argument of ‘($)’, namely
‘\ node@RunningNode {nodeSocket}
-> do hydraScriptsTxId <- publishHydraScriptsAs node Faucet
let ...
....’
|
489 | startChainPoint :: ChainPoint <- generate arbitrary `suchThat` (/= ChainPointAtGenesis)
If I comment out `suchThat`, then all is fine so, for some reason, `suchThat` is moving me from the `Gen` to the `IO` monad and I’ve no idea why... and I know, it's probably not what is happening because it makes no sense, but that's the best I can find to interpret the ghc message right now. Once again: GHC 1 - Pascal 0
Some minutes later, doing something else and forgetting about the ghc message, the light comes... I'm missing a `$`, of course:
startChainPoint <- generate $ arbitrary `suchThat` (/= ChainPointAtGenesis)
I'm this close to thinking that ghc should just quit trying to give error messages. It's not that they are unhelpful in general. In this case the message is just plain misleading for an obvious one-character typo.
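For the record, the reason behind the confusing message is operator precedence: function application binds tighter than any infix operator, so without the `$` the expression is read as `(generate arbitrary) `suchThat` (...)`. A minimal sketch of the working form, assuming QuickCheck is available and using a made-up `Point` type as a stand-in for cardano-api's `ChainPoint`:

```haskell
import Test.QuickCheck (Arbitrary (..), generate, oneof, suchThat)

-- Hypothetical stand-in for cardano-api's ChainPoint.
data Point = AtGenesis | AtSlot Int
  deriving (Eq, Show)

instance Arbitrary Point where
  arbitrary = oneof [pure AtGenesis, AtSlot <$> arbitrary]

-- Without ($) the line would parse as
--   (generate arbitrary) `suchThat` (/= AtGenesis)
-- which hands an IO action to suchThat where a Gen is expected.
-- With ($), the whole filtered generator is passed to generate.
nonGenesisPoint :: IO Point
nonGenesisPoint = generate $ arbitrary `suchThat` (/= AtGenesis)
```

Writing `generate (arbitrary `suchThat` (/= AtGenesis))` with explicit parentheses is equivalent and makes the grouping visible.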
- While trying to run our ETE benchmark I spent a few hours scratching my head to understand why the benchmark was failing with an `IOException` of type `ResourceVanished`. I even went as far as firing up Wireshark to capture the TCP traffic between the client and the hydra-node, and try to understand what was causing this exception from the protocol exchange.
- It turned out the exception I was seeing was masking the actual cause of the problem, namely a `timeout` while waiting for some message to be received by the client. Increasing the timeout value fixed the problem and I was able to run the benchmark with higher load.
- I am clueless on how to really solve the problem which involves complex low-level machinery of Haskell's exception handling, network connections, multithreading...
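One possible direction, sketched here with plain `base` and under heavy assumptions, is to turn the timeout into its own exception at the place where we wait, so that it is reported before any connection teardown. `WaitTimedOut`, `waitNextOrFail` and `demo` are hypothetical names, and an `MVar` stands in for the websocket connection used by the benchmark client.

```haskell
import Control.Concurrent.MVar (MVar, newEmptyMVar, takeMVar)
import Control.Exception (Exception, throwIO, try)
import System.Timeout (timeout)

-- Hypothetical exception that names the real cause of the failure.
newtype WaitTimedOut = WaitTimedOut Int
  deriving (Eq, Show)

instance Exception WaitTimedOut

-- Wait for the next message for at most the given number of microseconds
-- and fail with WaitTimedOut rather than returning silently.
waitNextOrFail :: Int -> MVar a -> IO a
waitNextOrFail micros inbox = do
  result <- timeout micros (takeMVar inbox)
  case result of
    Just msg -> pure msg
    Nothing -> throwIO (WaitTimedOut micros)

-- Nobody ever writes to this inbox, so the wait times out after 10 ms
-- and the result is Left (WaitTimedOut 10000).
demo :: IO (Either WaitTimedOut String)
demo = do
  inbox <- newEmptyMVar
  try (waitNextOrFail 10000 inbox)
```

This does not address the underlying interplay between exceptions, sockets and threads. It only makes the failure say "timed out" instead of "resource vanished".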
-
PG: Plutus-merkle-tree benchmarks observation #879
- We use the produced outputs in the website
- It’s puzzling why leaving out `--output-directory` does not result in an error
-
FT: how to decode redeemer from client input?
- JSON String containing Hex-encoded CBOR should be the minimum viable
- What does the cardano-cli do? Could be a source of inspiration
- Ask users
-
FT: More questions around external commits -> let’s have a dedicated session
- support multiple utxo?
- are the following specified in plural for a reason? datums, scripts, redeemers
- in commitTx initialInput and initialScriptRef are both TxIn, what is the diff?
- how can we check if the provided user script utxo can harm the head?
-
Continuing I want to add a generator for the test case that produces a script utxo.
-
I used a silly trick to generate a script utxo like this:
genTxOut `suchThat` (\(TxOut addr _ _ rs) -> rs == ReferenceScriptNone && not (isKeyAddress addr))
for lack of a better way. `suchThat` is very slow so I'll need to come up with an alternative solution.
-
Realizing I need to submit this utxo to the network I change plans to alter our own `publishScript` function for this use case. This function is something we use for new releases when we need to publish altered hydra scripts.
-
The script itself might not be enough since what I produce is just the basic one that doesn't use datum and redeemer but it is a first step.
-
After fiddling with cardano-api types left and right as usual we are successfully seeing this error:
uncaught exception: SubmitTransactionException
SubmitTxValidationError (TxValidationErrorInMode (ShelleyTxValidationError ShelleyBasedEraBabbage (ApplyTxError [UtxowFailure (AlonzoInBabbageUtxowPredFailure (ShelleyInAlonzoUtxowPredFailure (MissingScriptWitnessesUTXOW (fromList [ScriptHash "42d53b506eb20afa5593373c1567e494da5375552cfcf32b2cb47d71"]))))])) BabbageEraInCardanoMode)
-
This is nice since it seems we are not including a script witness to our transaction and it is a step forward.
-
My plan is to copy and alter the `commit` function so that I can include the new script, datum and redeemer.
-
FT and I were looking at the `Tx` module to figure out what we will need to be able to ingest a user script utxo. Seems like `ScriptData` is the appropriate type here but we will double check - what we need is a type that can be easily sent across the network and decoded back to the representation we need to commit it to a head.
It's not clear if previous work will bring much improvement but, anyway, the doc step is taking as long as half an hour so that's probably the next thing to address.
In this build, for instance, we can see that the Documentation step takes 9 minutes and almost all the time is spent in Creating an optimized production build...
I try to run it locally to see but, unfortunately, we introduced a bunch of prerequisites to run the documentation step and that leads to missing stuff, like the spec pdf file, or broken links errors. All in all it's a bit painful to set everything up to run it locally.
We reverted all the changes related to putting files needed by tests in cabal data-files or data-dir and just decided to change to the module directory before running the tests. That solves all the problems.
Still hard to figure out how better we perform with all the buffers involved in the process.
For instance, in this sheet it is hard to compare both runs that use nix. It's also hard to compare with the version where we don't use nix but just build everything once to populate the cache first.
-
FT and me kicked off this work last Friday when we came up with the initial plan to tackle things.
-
We started with a test, just a modified end-to-end spec where we used one party, initialized the Head and then, instead of committing as a client, we send out an http request (to the non-existing http server) to produce a draft commit tx. Then we sign and submit the transaction and wait to see the head is opened.
-
For the http server and in general all written code, for now we opted for a dirty solution where we decided to not care about the code quality at all and have a working solution. Afterwards we will take care of improving the code once we know we have implemented all necessary changes.
-
One thing we discovered to be a PITA is that our server side code is abstract over the transaction type so it is a bit hard to work with it.
-
Initially we thought that introducing a callback (handle pattern) would be a solution for the abstract tx type in the server code, so we introduced it, but we quickly saw that our initial plan to extend the `Chain` handle was exactly what we needed.
-
Our plan to extend the `Chain` handle seems to work. We added another field next to `postTx` called `draftTx`. At this level we have access to all the necessary ingredients to call the `commitTx` function to obtain the commit transaction that the user needs to sign and submit.
-
Currently we are at the point in our test where the http request/response seems to work and the final thing to do is to sign the transaction (on behalf of the user), submit it and wait to see the `HeadIsOpen` message.
-
There were some troubles with balancing the transaction (I was torn between using our wallet's `coverFee_` and `buildTransaction` functions as well as cardano-api functions for balancing) so after using `coverFee_` the test complained about the missing input.
-
The missing input was related to the generated user utxo, so it seemed like this utxo was missing in `coverFee_`. This was indeed the case and at the same time I realized that, instead of calling `commitTx` from `draftTx` to get the drafted commit transaction, I should be calling the `commit` function, which does some sanity checking and rejects wrong txs.
-
Adding the utxo to the `HeadState` was one attempt that didn't pay off and while working on this I noticed `getKnownUtxo` in `finalizeTx` just before calling the `coverFee` function of the wallet handle.
-
So we are passing the head utxo + the utxos coming from the `Initial` state and I realize the user utxo needs to be added to these.
-
After this change we got green test!
-
The next part of the work is to introduce the option for the users to commit script utxos.
When running hydra-cluster tests directly from the tests binary, we have failures because the config files are not found. It happens that we were not using the `Paths_` mechanism in those tests, so we were not taking advantage of cabal path finding stuff. See Fix.
Then we have the same kind of issues with the golden files not being found but, this time, we rely on hspec-golden-aeson to open the files so we'll have to find a way. Also, these golden files are not part of the data-files in cabal, which makes sense since we don't want them in our final binary.
Current experiments:
- P.R. #857 we add a cabal build all step so that cache is populated before all the other build steps
- P.R. #867 we use the nix binaries to run the tests and not cabal test
It’s hard to figure out if we get any gain from one or the other, or what gain we get, so I drew this sheet (see C.I. perfs) to compare runs between master and both P.R.s. We can see that both P.R.s perform better than master and the one with nix seems to be the most performant.
But it’s strange… just adding the cache population step doubles the total performance compared to master. So I take a look at master's performance history (see C.I. schedule perfs in the same sheet). We can see there that master can sometimes take 30 minutes and sometimes 1 hour. It could be related to issues with the nix cache?
At the end of the day, I must admit that, since we’re relying on a cache to avoid building as many things as we can, execution times suffer from a huge variability (50% or so) which makes measurement of improvement very hard to achieve. To improve the process, we would need a stable process, which usually means removing all the buffers.
-
Finally this task is going to be tackled - seems like interesting chunk of work.
-
Plans are to talk to the users and let them decide on these two options:
- clients are responsible for submitting the commit tx
- hydra-node is responsible for submitting the commit tx
-
We distinguish these two because when we open up the commit tx to the users they could have the option to commit a script UTxO, which is a bit more complex than a pub key UTxO in the sense that we would need to add datum, redeemers etc., which leads to a different api endpoint. So it might be easier if clients directly submitted the tx instead of adding a signature and returning the tx contents back to hydra-node for submission.
-
OTOH we could also accept all necessary parameters for committing a script output and let the hydra-node submit a commit tx, so this is something we need to check with our users.
-
I wanted to start early experiments to come up with the optimal code changes for this task. The main problem is that we would like to connect two layers (api and direct chain) to enable the api layer to return a draft commit tx without involving the head logic layer. Problem is - we also need to know our head state since commit should only be possible if the head is in the `Initializing` state.
-
So after looking at the code it seems there will be no major issues in implementing this change.
-
On the api level we will add a new client input for getting the draft tx and this will include the utxo the client wants to commit.
-
On this level we can also get access to the current head state (we already have projections related to snapshot confirmed/greetings)
-
On the chain handle we can add a new field `draftTx`. We already have access here to `LocalChainState` from which we can get the latest `ChainStateAt` which contains the chain state needed to construct the commit tx (we need `InitialState`)
-
The last step is to connect our api layer and the chain layer. Since `withAPIServer` starts after the `withDirectChain` we can pass the `Chain` type as the argument and have access to call the new `draftTx` function from the chain handle.
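The plan above can be sketched as a record of functions (the handle pattern). Everything in this block is a simplified, hypothetical stand-in: the real `Chain` handle, `Tx` and `UTxO` types in hydra-node are richer, and `mkChain`, `HeadPhase` and `handleDraftCommit` exist only for this illustration.

```haskell
-- Simplified, hypothetical stand-ins for the real types.
newtype UTxO = UTxO [String]
  deriving (Eq, Show)

newtype Tx = Tx String
  deriving (Eq, Show)

data HeadPhase = Idle | Initializing | Open
  deriving (Eq, Show)

-- Handle through which the other layers reach the chain layer.
data Chain m = Chain
  { postTx :: Tx -> m ()
  , draftTx :: UTxO -> m (Either String Tx)
  }

-- Build a handle that reads the current phase from the given action,
-- standing in for the lookup of the latest chain state.
mkChain :: Applicative m => m HeadPhase -> Chain m
mkChain getPhase =
  Chain
    { postTx = \_ -> pure () -- submission elided in this sketch
    , draftTx = \utxo -> draft utxo <$> getPhase
    }
 where
  -- Drafting a commit is only allowed while the head is initializing.
  draft (UTxO outs) Initializing = Right (Tx ("commit " <> unwords outs))
  draft _ phase = Left ("cannot commit in state " <> show phase)

-- The API layer only needs the handle, not the head logic.
handleDraftCommit :: Chain m -> UTxO -> m (Either String Tx)
handleDraftCommit chain = draftTx chain
```

Passing the handle to the API server keeps the server abstract over how the draft is built: it just calls `draftTx` and relays either the transaction or the error to the client.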
-
-
After working on this previously and talking to SB the plan changed to split work into two pieces:
-
Have a pr that just queries the cardano-node to get the `GenesisParameters` since this is needed in another line of work already
-
Proceed with removing the `--ledger-genesis` hydra-node argument and, instead of `GenesisParameters -> ShelleyGenesis`, have a function that goes from `GenesisParameters` to ledger `Globals` immediately. Reasoning is that we don't need anything special that exists in `ShelleyGenesis` and have all we need in the L2 ledger from `Globals`.
-
-
Seems like going from `GenesisParameters -> Globals` is not going to be hard work since all the information is there.
-
After this conversion all that is left to do is to query to get `GenesisParameters` and convert those to `Globals` using the new code. Of course I also had to update a bunch of files to remove the now outdated genesis-shelley parameter but that is pretty much just manual work.
I draft P.R. #867. It's full of issues (see TODO in the commit) but can give us an idea of execution time.
I'm disappointed that we are still downloading an utterly big amount of nix derivations :(
Comparing execution time of jobs from another C.I. run with this one, we don't see much difference:
- This branch with everything pre-built in cache
- Another branch with everything pre-built in cache also
#857 optimizes the build stage of our C.I. by building everything once and just running all the tests later in a second stage. But it happens that just downloading all the nix derivations needed to run cabal test takes several minutes already.
Another thing we try is to build the test executable and just run it in the test stage. That way, the set of nix derivations needed is way smaller and the download could go faster, saving our C.I. minutes.
This might be quite simple as we just need to build and run the test. For instance, adding this in `packages.nix`:
tests = {
hydra-node = nativePkgs.hydra-node.components.tests.tests;
}
And then build and run:
nix build .?submodules=1#tests.hydra-node
./result/bin/tests
For hydra-cluster it's a bit more complicated because of external binaries dependencies.
We create a derivation that composes the hydra-cluster tests and all its dependencies:
+ hydra-cluster = pkgs.buildEnv {
+ name = "integration";
+ paths = [
+ nativePkgs.hydra-cluster.components.tests.integration
+ cardano-node.packages.${system}.cardano-node
+ hydra-node
+ ];
+ };
+ };
We create a derivation which is a script that will run the test with the appropriate PATH:
+ hydra-cluster = pkgs.writeScriptBin "hydra-cluster-integration-test"
+''
+PATH=${cardano-node.packages.${system}.cardano-node}/bin
+PATH=$PATH:${hydra-node}/bin
+PATH=$PATH:${nativePkgs.hydra-cluster.components.tests.integration}/bin
+exec integration
+'';
+ hydra-cluster = pkgs.mkShell {
+ name = "";
+ buildInputs =
+ [
+ nativePkgs.hydra-cluster.components.tests.integration
+ hydra-node
+ cardano-node.packages.${system}.cardano-node
+ ];
+ };
At this stage, this looks like the best option.
We sometimes can observe, in C.I., the ci/eval step failing. When it’s the case, we don’t get a lot of feedback.
But why do we use this step in the first place? It seems that it’s used to build nix derivations and cache them in cache.zw3rk.com, but it happens that we also build derivations and push them to hydra-node.cachix.org. Why would we need both?
cache.zw3rk.com is supposed to build derivation for Darwin machines that we don’t build otherwise. But do we use that? Yes, FT seems to take benefit of it but still need to check more.
for i in /nix/store/*plutus-tx-lib-plutus-tx-1.0.0.0/envDep; do curl -fsS https://cache.zw3rk.com/$(readlink -f $i | cut -c12-43).narinfo >/dev/null && echo OK: $i || echo $i ; done
OK: /nix/store/gg13n4rkgqjj7734zh1sywbkswvrc4sn-plutus-tx-lib-plutus-tx-1.0.0.0/envDep
I need to get an idea of what is cached where so I cleanup my nix store first:
nix-collect-garbage --delete-old
[…]
30979 store paths deleted, 61174.35 MiB freed
Then, I enter `nix develop` to fetch what's missing (if anything):
#> nix develop
...
And then I build this Google doc with all the items in the nix store and in which cache they are stored. Only done for x86_64-linux for now.
We can observe that all that is stored in hydra-node.cachix.org is also stored in cache.zw3rk.com and even in cache.iog.io (except for loop-lib-loop-0.3.0).
- AB
- Clean up red bin items
- how-to-use-red-bins
- Should contain issue that we see and next thing to do is to fix it
- How to add git revision to nix-built binaries?
- There is a solution using env variables if git is not available. SB might have an idea.
Incremental commits/decommits:
- This is issue number #199
- This is pictured in the original paper also (extension)
- Open a head right away and do incremental commits
- Add two more transactions increment and decrement
- Both change number of utxos in the head state
- Decommit requires certificate and all participants need to agree on taking something out of the head state
- Sandro: this is a rough sketch/doesn't include anything related to L2 code
- We can start from the paper and build intuition further using knowledge from research team (pick their brain not do any additional work)
- Some questions:
- txs on L1 -> how would they look?
- What does this mean - request saying some participant is adding utxos?
- Matthias: We would extend snapshots to include new information.
- We need to make sure the head is not stalled (commits you want to include may not exist anymore)
- How do you combine this certificate related to decommits?
- How do you know when the head is ready to be opened?
- How to check head state on-chain to make sure utxo is included there?
- Related to MPT's? We need it in order for validator to be able to check that new state includes new utxos. We don't want to store utxo's on-chain
- We need an inclusion proof for the on-chain code
- Currently utxos are represented as a hash in the on-chain code
- Idea: We could provide a snapshot that would include a utxo hash + all utxos hash
- Part of it would be present in the off-chain
- Provide a multisig certificate that every party agrees on the state and then check that the new state has the correct hash
- What we need to sign in the multisig is also the utxos we want to commit
- This off-chain solution should be cheaper than the on-chain one
- Decommit/Decrement
- What is fanned out should correspond to what was decommitted plus what is still remaining
- FT: Shouldn't this idea of certificates be part of collect-com? Why do we need to agree on L2 what will be committed on L1? Response: it is cheaper
- For this we don't even need tree data structure. We currently use a flat list and that can suffice.
- Sandro: We should think more and not try to implement this initial brainstorm
-
We have the possibility to remove the ledger genesis file from hydra-node and query it from cardano-node. IMO the fewer arguments the better - one more thing that can be improved.
-
I am starting from the cardano-node code to see how ledger globals or shelley genesis can be queried.
-
After poking around in cardano-node and cardano-ledger code I couldn't find a query ready to be used immediately. Nothing is easy on cardano! There is a query to obtain `GenesisParameters` but the function to go from `GenesisParameters -> ShelleyGenesis` is missing.
-
There is a function `fromShelleyGenesis :: Shelley.ShelleyGenesis era -> GenesisParameters` that does the reverse, so I can write the inverse in the hydra code base, but the overall feeling is this should exist already so I am proceeding with a slightly bitter taste.
-
When updating the hydra-tui on my hydra instance, I needed to build it... even though CI should have built and pushed the static binaries into the cachix caches
-
I suspect the hydra-node.cachix.org cache is getting garbage collected
-
Building hydra-tui-static on my dev machine and pushing into a new cachix cache to confirm
-
Pushing things into cachix:
- Runtime dependencies:
nix-store -qR $(nix build .#hydra-tui-static --print-out-paths) | cachix push <cache>
- Build & runtime dependencies:
nix-store -qR --include-outputs $(nix-store -qd $(nix build .#hydra-tui-static --print-out-paths)) | cachix push <cache>
-
Confirm locally by garbage collecting my nix store and re-building with only downloading from cache (`-j 0` enforces this):
rm result # was holding the garbage collect root
nix-collect-garbage
nix build .#hydra-tui-static -j 0
-
Using the standard caches this fails with
these 12 derivations will be built:
/nix/store/f2kgsxvb8226jvpcdsw6fi6cxskkcnw2-cardano-ledger-shelley-test-lib-cardano-ledger-shelley-test-x86_64-unknown-linux-musl-0.1.1.2.drv
/nix/store/19jfbfhpclibvy7487ba8h6ldm1j9klj-cardano-ledger-shelley-ma-test-lib-cardano-ledger-shelley-ma-test-x86_64-unknown-linux-musl-0.1.1.2.drv
/nix/store/4p8wch1r35yiqlib6r0353r64b5w3wgm-cardano-ledger-alonzo-test-lib-cardano-ledger-alonzo-test-x86_64-unknown-linux-musl-0.1.1.2.drv
/nix/store/6vd8bn3vqhykswzhjjzyh2gsl3ki80gz-ouroboros-consensus-shelley-lib-ouroboros-consensus-shelley-x86_64-unknown-linux-musl-0.3.0.0.drv
/nix/store/9wgcvhab87wn0injq5xa6g6q5hwc5nnd-cardano-ledger-babbage-test-lib-cardano-ledger-babbage-test-x86_64-unknown-linux-musl-0.1.1.2.drv
/nix/store/snsr9vy9b1ng0yii2sk1rghizz9sm74b-ouroboros-consensus-cardano-lib-ouroboros-consensus-cardano-x86_64-unknown-linux-musl-0.3.0.0.drv
/nix/store/n6rfzmcwz5kzgchifkqpr3bknhv7bv90-cardano-api-lib-cardano-api-x86_64-unknown-linux-musl-1.36.0.drv
/nix/store/d97iv5m2d2jf4kv1xqbwv48z7gwamaya-hydra-cardano-api-lib-hydra-cardano-api-x86_64-unknown-linux-musl-0.10.0.drv
/nix/store/mjipxj81fh4qhxxlq507ag7c0pyw47iz-hydra-plutus-lib-hydra-plutus-x86_64-unknown-linux-musl-0.10.0.drv
/nix/store/gc4mdm2s4fkmnq97wkgjmx2q51z1kbi1-hydra-node-lib-hydra-node-x86_64-unknown-linux-musl-0.10.0.drv
/nix/store/qqw9a23ajlkj6zkjqszxsfiip9gi1xd3-hydra-tui-lib-hydra-tui-x86_64-unknown-linux-musl-0.10.0.drv
/nix/store/q8fdyhmnpy212xxl7qaqf9v265fi6shl-hydra-tui-exe-hydra-tui-x86_64-unknown-linux-musl-0.10.0.drv
error: unable to start any build; either increase '--max-jobs' or enable remote builds.
https://nixos.org/manual/nix/stable/advanced-topics/distributed-builds.html
-
But with the new, just populated cache it works instantly (also using `nix-output-monitor`):
rm result # was holding the garbage collect root
nix-collect-garbage
nom build .#hydra-tui-static -j 0 --substituters https://cardano-scaling.cachix.org --trusted-public-keys cardano-scaling.cachix.org-1:RKvHKhGs/b6CBDqzKbDk0Rv6sod2kPSXLwPzcUQg9lY=
Using saved setting for 'allow-import-from-derivation = true' from ~/.local/share/nix/trusted-settings.json. Using saved setting for 'extra-substituters = https://cache.iog.io https://hydra-node.cachix.org https://cache.zw3rk.com' from ~/.local/share/nix/trusted-settings.json. Using saved setting for 'extra-trusted-public-keys = hydra.iohk.io:f/Ea+s+dFdN+3Y/G+FDgSq+a5NEWhJGzdjvKNGv0/EQ= hydra-node.cachix.org-1:vK4mOEQDQKl9FTbq76NjOuNaRD4pZLxi1yri31HHmIw= loony-tools:pr9m4BkM/5/eSTZlkQyRt57Jz7OMBxNSUiMC4FkcNfk=' from ~/.local/share/nix/trusted-settings.json. copying path '/nix/store/26vdkq8lq42wf8ca3y8shn8g6fkygyyv-binary-0.8.8.0-r1.cabal' from 'https://hydra-node.cachix.org' copying path '/nix/store/agk17iz90d9z3fzsmn3lxax1l2r1203h-filepath-1.4.2.1-r2.cabal' from 'https://hydra-node.cachix.org' copying path '/nix/store/s4ra7j0lss5dxs28g4kpsc9rgd8hdj13-unix-2.7.2.2-r8.cabal' from 'https://hydra-node.cachix.org' copying path '/nix/store/l6w4lqw2i0i72wa9r6xw6xarfpw7vc5r-parsec-3.1.14.0-r4.cabal' from 'https://hydra-node.cachix.org' copying path '/nix/store/991vsfbfcpnsfmi8dwdks5jxfh6sl5g9-exceptions-0.10.4-r3.cabal' from 'https://cache.iog.io' copying path '/nix/store/rbad9xdhbzx0ypqycfkzx9z770qp6lda-haskell-project-plan-to-nix-pkgs' from 'https://hydra-node.cachix.org' copying path '/nix/store/4myd12sgqpzxg2fw6pp8ig36rzkx50az-hackage-to-nix-cardano-haskell-packages' from 'https://hydra-node.cachix.org' copying path '/nix/store/wb7r7c945s2zj9ffy7whwwwxnxz2lxzs-haskell-project-plan-to-nix-pkgs' from 'https://hydra-node.cachix.org' copying path '/nix/store/89j034a72q3q9zi48in6vbgln31rqvg4-process-1.6.13.2-r1.cabal' from 'https://hydra-node.cachix.org' copying path '/nix/store/mggnnhhcb0x2w07mcl8fcjimykfkjq1n-terminfo-0.4.1.4-r1.cabal' from 'https://hydra-node.cachix.org' copying path '/nix/store/7fka2r75x0g652mf1djv1cabk48bvqi6-Cabal-3.2.1.0-r1.cabal' from 'https://cache.iog.io' these 3 paths will be fetched (8.62 MiB download, 34.59 MiB unpacked): 
/nix/store/i526pxpyg0g8s8gms9iwizxjh6hfn6l7-ncurses-x86_64-unknown-linux-musl-6.4 /nix/store/phx9zj6kx3bvph61zjkqf8kwvpjv7pc7-hydra-tui-exe-hydra-tui-x86_64-unknown-linux-musl-0.10.0 /nix/store/z49dzh19f5d1kpz55vv6gx17gznp4ayh-musl-x86_64-unknown-linux-musl-1.2.3 copying path '/nix/store/z49dzh19f5d1kpz55vv6gx17gznp4ayh-musl-x86_64-unknown-linux-musl-1.2.3' from 'https://cache.zw3rk.com' copying path '/nix/store/i526pxpyg0g8s8gms9iwizxjh6hfn6l7-ncurses-x86_64-unknown-linux-musl-6.4' from 'https://cache.zw3rk.com' copying path '/nix/store/phx9zj6kx3bvph61zjkqf8kwvpjv7pc7-hydra-tui-exe-hydra-tui-x86_64-unknown-linux-musl-0.10.0' from 'https://cardano-scaling.cachix.org' ┏━━━ Downloads │ Host ┃ │ │ │ localhost ┃ │ ↓︎ 2 │ │ [1]: https://cache.iog.io ┃ │ ↓︎ 2 │ │ [2]: https://cache.zw3rk.com ┃ │ ↓︎ 1 │ │ [3]: https://cardano-scaling.cachix.org ┃ │ ↓︎ 9 │ │ [4]: https://hydra-node.cachix.org ┗━ ∑︎ ↓︎ 0 │ ↓︎ 14 │ ⏳︎ 0 │ Finished at 11:48:27 after 6s
Exploring how we could update our dependencies, I was surprised by cabal's index-state management.
We have this cabal.project file which states that the index-state for CHaP should be 2023-03-22.
I run cabal update, which says it updates CHaP to 2023-05-08 (more recent).
But then I run cabal build and, for some reason, it uses neither of those index states but a third one, 2023-03-21, which is older than the one in cabal.project.
And it explains that it's doing so because 2023-03-22 is newer than cardano-haskell-packages, which I know for a fact is false because I had just updated it to 2023-05-08:
#> grep -A2 index-state cabal.project
index-state:
, hackage.haskell.org 2023-04-21T02:04:45Z
, cardano-haskell-packages 2023-03-22T09:20:07Z
#> cabal update
Downloading the latest package lists from:
- hackage.haskell.org
- cardano-haskell-packages
Package list of cardano-haskell-packages has been updated.
The index-state is set to 2023-05-08T23:12:59Z.
To revert to previous state run:
cabal v2-update 'cardano-haskell-packages,2023-05-03T10:02:50Z'
Package list of hackage.haskell.org has been updated.
The index-state is set to 2023-05-09T05:00:23Z.
To revert to previous state run:
cabal v2-update 'hackage.haskell.org,2023-05-04T14:59:37Z'
#> cabal build all
Warning: Requested index-state 2023-03-22T09:20:07Z is newer than
'cardano-haskell-packages'! Falling back to older state
(2023-03-21T14:44:35Z).
After some conversation with Haskell veterans I got the following answer:
- cabal update will fetch the latest index in existence as of this moment and print it (2023-05-08)
- then in your project you specified “I want a state at 2023-03-22T09:20:07”
- Cabal sees that there is no index exactly at the time stamp you requested
- Cabal finds that the youngest “older or equal than the one you asked for” index is 2023-03-21T14:44:35
- The warning error is just super confusing. I think it means “the requested index is newer than CHaP if I would trim the history of indices to this time stamp, because one at this time stamp doesn’t exist, only older ones”. But it is the error that is just confusing
- In particular, you could change the index-state in your cabal.project to 2023-03-21T14:44:35Z and the build plan will be exactly the same as the one you are getting with the index-state you currently have declared
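The resolution rule described in the answer above can be sketched in a few lines. This is only an illustration of "pick the youngest index that is older than or equal to the requested one", not cabal's actual implementation; `IndexState` and `resolveIndexState` are hypothetical names, and the list of known states is simply taken from the timestamps visible in the output above.

```haskell
import Data.List (sort)

-- | Timestamps in ISO-8601 UTC form (e.g. "2023-03-21T14:44:35Z") sort
-- lexicographically in chronological order, so plain 'String' comparison is
-- enough for this sketch.
type IndexState = String

-- | Pick the newest known index state that is older than or equal to the
-- requested one, or 'Nothing' if every known state is newer.
-- (Hypothetical helper, not part of cabal.)
resolveIndexState :: [IndexState] -> IndexState -> Maybe IndexState
resolveIndexState known requested =
  case filter (<= requested) (sort known) of
    [] -> Nothing
    candidates -> Just (last candidates)

-- | Index states of cardano-haskell-packages seen in the log above.
knownStates :: [IndexState]
knownStates =
  [ "2023-03-21T14:44:35Z"
  , "2023-05-03T10:02:50Z"
  , "2023-05-08T23:12:59Z"
  ]
```

In GHCi, `resolveIndexState knownStates "2023-03-22T09:20:07Z"` evaluates to `Just "2023-03-21T14:44:35Z"`, which matches the fallback reported by the warning.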
Then I decided to open P.R. #853 to actually use index 2023-03-21T14:44:35Z to get rid of the warning and never think about it again. It works... but the C.I. failed and I need to figure out why, although the message does not help much:
in job ‘devShells.default’:
error: 1 dependencies of derivation '/nix/store/a8pslf2sd8i06cp1lncqpys39azcv905-haskell-project-plan-to-nix-pkgs.drv' failed to build
in job ‘packages.hydraw’:
error: 1 dependencies of derivation '/nix/store/a8pslf2sd8i06cp1lncqpys39azcv905-haskell-project-plan-to-nix-pkgs.drv' failed to build
in job ‘packages.hydra-node’:
error: 1 dependencies of derivation '/nix/store/a8pslf2sd8i06cp1lncqpys39azcv905-haskell-project-plan-to-nix-pkgs.drv' failed to build
in job ‘devShells.ci’:
error: 1 dependencies of derivation '/nix/store/a8pslf2sd8i06cp1lncqpys39azcv905-haskell-project-plan-to-nix-pkgs.drv' failed to build
in job ‘packages.hydra-tui’:
error: 1 dependencies of derivation '/nix/store/a8pslf2sd8i06cp1lncqpys39azcv905-haskell-project-plan-to-nix-pkgs.drv' failed to build
Troubleshooting why my node cannot connect to the head. Here is the configuration of the node. Obviously, it's missing one key 🤦
{
"timestamp": "2023-05-07T14:36:37.611577721Z",
"threadId": 4,
"namespace": "HydraNode-\"arnaud\"",
"message": {
"runOptions": {
"apiHost": {
"ipv4": "0.0.0.0",
"tag": "IPv4"
},
"apiPort": 4001,
"chainConfig": {
"cardanoSigningKey": "keys/arnaud.sk",
"cardanoVerificationKeys": [
"keys/sebastian.cardano.vk",
"keys/sasha.cardano.vk"
],
"contestationPeriod": 60,
"networkId": {
"tag": "Mainnet"
},
"nodeSocket": "./node.socket",
"startChainFrom": {
"blockHash": "86302487af13b92b0f1e7536172b3d49f33ca609580174c6dcda8fb75514fbb4",
"slot": 91712195,
"tag": "ChainPoint"
}
},
"host": {
"ipv4": "0.0.0.0",
"tag": "IPv4"
},
"hydraScriptsTxId": "4a4f3e25887b40f1575a4b53815996145c994559bac1b5d85f7de0f82b8f4ed7",
"hydraSigningKey": "keys/arnaud-hydra.sk",
"hydraVerificationKeys": [
"keys/sebastian.hydra.vk",
"keys/sasha.hydra.vk"
],
"ledgerConfig": {
"cardanoLedgerGenesisFile": "cardano-configurations/network/mainnet/genesis/shelley.json",
"cardanoLedgerProtocolParametersFile": "protocol-parameters.json"
},
"monitoringPort": 6001,
"nodeId": "arnaud",
"peers": [
{
"hostname": "fk.ncoding.at",
"port": 5001
},
{
"hostname": "13.37.150.125",
"port": 5001
}
],
"persistenceDir": "hydra-data",
"port": 5001,
"verbosity": {
"contents": "HydraNode-\"arnaud\"",
"tag": "Verbose"
}
},
"tag": "NodeOptions"
}
}
Checking the node's config as reported by the node itself should be the very first step. Perhaps this is something that's missing from the API?
-
Currently Hydra uses the internal wallet to sign and submit Hydra transactions. This is not optimal since the fee calculation is not working properly and we are overestimating fees (manually adding 2 ada).
-
There is a todo related to the current work already in a PR that enables us to use the proper ProtocolParameters, and since we have those in place, all we should need to do to get the correct fees is to call cardano-api for the calculation.
-
There are problems when one wants to alter the transaction body (this is something IOG wants to discourage), which is why TxBodyContent BuildTx is the most flexible type to work with, and initially I wanted to pass it to the coverFee_ internal function. The problem with this is that our tests for the coverFee_ function need to generate TxBodyContent, but there is no exposed Arbitrary instance.
-
The workaround (for now) is to pass in the cardano-api transaction and the non-byron witnesses needed for fee calculation, and later on convert the transaction to a cardano-ledger one (needed for updating certain tx fields).
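To get a feeling for how much a flat 2 ada padding overshoots, here is a minimal sketch of the size-based part of the Cardano fee formula (a per-byte coefficient times the serialised size, plus a constant). All names are hypothetical stand-ins, the parameter values are assumed to be the mainnet ones, and script execution costs are deliberately ignored; the real calculation should go through cardano-api as noted above.

```haskell
-- | Hypothetical stand-in for the two size-related fee protocol parameters.
data FeeParams = FeeParams
  { feePerByte :: Integer -- ^ lovelace per byte of serialised transaction
  , feeFixed :: Integer -- ^ constant lovelace amount per transaction
  }

-- | Assumed mainnet values; check the actual protocol parameters before
-- relying on them.
assumedMainnetParams :: FeeParams
assumedMainnetParams = FeeParams{feePerByte = 44, feeFixed = 155381}

-- | Size-based minimum fee in lovelace (ignores script execution units).
linearFee :: FeeParams -> Integer -> Integer
linearFee params sizeInBytes =
  feePerByte params * sizeInBytes + feeFixed params

-- | The same fee with a flat 2 ada (2,000,000 lovelace) added on top, as an
-- illustration of the overestimation described above.
paddedFee :: FeeParams -> Integer -> Integer
paddedFee params sizeInBytes = linearFee params sizeInBytes + 2000000
```

For a 300 byte transaction, `linearFee assumedMainnetParams 300` is 168581 lovelace, so under these assumptions the padding is more than ten times the size-based fee.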
- SB
- commits vs rollbacks
- MockChain rollback code - optional
- FT
- Timed txs
- Franco will make sure the communication is available to other team members
- Keep the interface simple, just give a means of constructing the timed txs (get the current slot and slot length)
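As an illustration of why "current slot and slot length" is enough to construct timed transactions, here is a minimal sketch of converting between slots and wall-clock time. It assumes a constant slot length since a known start time, which is a simplification (a real chain needs the era history); all names are hypothetical and not part of the hydra-node API.

```haskell
import Data.Time (NominalDiffTime, UTCTime (..), addUTCTime, diffUTCTime, fromGregorian)
import Data.Word (Word64)

-- | Wall-clock time at which a slot starts, assuming a constant slot length
-- since 'systemStart'.
slotToUTCTime :: UTCTime -> NominalDiffTime -> Word64 -> UTCTime
slotToUTCTime systemStart slotLength slot =
  addUTCTime (fromIntegral slot * slotLength) systemStart

-- | Slot containing the given time, or 'Nothing' if the time is before
-- 'systemStart' or the slot length is not positive.
slotFromUTCTime :: UTCTime -> NominalDiffTime -> UTCTime -> Maybe Word64
slotFromUTCTime systemStart slotLength time
  | slotLength <= 0 = Nothing
  | time < systemStart = Nothing
  | otherwise = Just (floor (diffUTCTime time systemStart / slotLength))

-- | Hypothetical start time, only used to try the functions out.
exampleStart :: UTCTime
exampleStart = UTCTime (fromGregorian 2023 1 1) 0
```

With a one second slot length, `slotToUTCTime exampleStart 1 60` is one minute after `exampleStart`, and feeding that time back into `slotFromUTCTime exampleStart 1` gives `Just 60`. A client would use the first function to pick validity bounds from the slot reported by the node.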
-
The smoke tests did not run properly today with error:
error: unable to download 'https://api.github.com/repos/input-output-hk/cardano-node/tarball/30d62b86e7b98da28ef8ad9412e4e00a1ba1231d': Problem with the SSL CA cert (path? access rights?) (77)
-
Suspect it is the cabal build inside preparing the nix develop shell for the smoke test run, as we are using a source-repository-package.
-
Logging in to the dedicated runner, I could download this tarball just normally.
-
However, running nix run nixpkgs#hello was also failing, but slightly differently:
error: unable to download 'https://api.github.com/repos/NixOS/nixpkgs/commits/nixpkgs-unstable': HTTP error 401 response body: { "message": "Bad credentials", "documentation_url": "https://docs.github.com/rest" }
-
Tried removing the access-tokens for github from the nix.conf, did not help directly .. or I don't know.
-
Cloning hydra and doing a nix develop was yielding the same error.
-
Restarting the nix daemon with systemctl restart nix-daemon.service seemed to work.. at least on the interactive nix develop?
-
After this the github runner also worked, but maybe only because I did fetch sources interactively?
-
Started to work on this fixme item last week but noticed my approach is not ideal.
-
What I wanted to do is set the protocol parameters in the txProtocolParams tx field as late as possible, but I could not find an easy way to just update the Tx type in place.
-
Got a way forward from @ch1bo by passing along the TxBodyContent BuildTx type instead of Tx, so that prepareTxToPost receives this type and can easily set the real protocol params and then call unsafeBuildTransaction.
-
Introducing unsafeBuildWithDefaultPParams that should be used in tests only. It builds a tx using default protocol parameters for testing, which brings me to the question: can we try to use real protocol parameters in the tx-cost executable to be able to calculate costs more accurately?
-
After altering a lot of code because of postponing the tx building, everything compiles.
-
I noticed a comment about improving the fee calculation in calculateNeedlesslyHighFee, so this one can be fixed as part of the next PR.
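The idea of postponing the build can be sketched with simplified stand-in types. None of the names below are the real cardano-api or hydra-node definitions: `TxDraft` plays the role of TxBodyContent BuildTx (still editable), `Tx` the role of a built transaction, and the default parameter value is just a placeholder.

```haskell
-- | Hypothetical stand-in for protocol parameters.
newtype PParams = PParams {maxTxSize :: Integer}
  deriving (Eq, Show)

-- | Hypothetical stand-in for an editable transaction body.
data TxDraft = TxDraft
  { draftInputs :: [String]
  , draftOutputs :: [String]
  , draftPParams :: Maybe PParams
  }
  deriving (Eq, Show)

-- | Hypothetical stand-in for a built transaction that is no longer meant
-- to be modified.
newtype Tx = Tx TxDraft
  deriving (Eq, Show)

-- | Set the real protocol parameters at the last moment, then build.
finalizeTx :: PParams -> TxDraft -> Tx
finalizeTx realPParams draft = Tx draft{draftPParams = Just realPParams}

-- | Placeholder defaults, meant for tests only.
defaultPParams :: PParams
defaultPParams = PParams{maxTxSize = 16384}

-- | Test-only helper that builds with the placeholder defaults.
buildWithDefaultPParams :: TxDraft -> Tx
buildWithDefaultPParams = finalizeTx defaultPParams
```

The point of the design is that every function before the final step works on the draft, so the parameters only need to be known by the code that actually posts the transaction.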
Looking for a solution to make the C.I. help us keep our dependencies up to date, it seems that neither cabal nor stack is supported by these tools yet, given that the corresponding P.R./issues are still open: dependabot, renovate.
Also, we're using flake so maybe that's yet another level of support that would be needed rather than just cabal. And, indeed, flake is not supported either by these tools.
I stumbled upon this GitHub action: update-flake-lock. I've been able to integrate it into our repository quite smoothly:
- P.R. #848 introduces this
- P.R. #847 has been created by this C.I. job
- A github token has been created and added to the project secrets to permit execution of the C.I. on the P.R. created by the C.I.
Setting up fourmolu version to 0.9.0 on my machine exposes a strange behavior. I'm expecting the following file to already be formatted according to 0.9.0 rules but these are the generated changes when I apply this formatting to the file:
#> fourmolu --version
fourmolu 0.9.0.0 UNKNOWN UNKNOWN
using ghc-lib-parser 9.2.5.20221107
#> fourmolu -i hydra-node/src/Hydra/HeadLogic.hs
Loaded config from: /Users/pascal/git/github.com/input-output-hk/hydra/fourmolu.yaml
#> git diff
diff --git a/hydra-node/src/Hydra/HeadLogic.hs b/hydra-node/src/Hydra/HeadLogic.hs
index 4add6a5b4..577a6d096 100644
--- a/hydra-node/src/Hydra/HeadLogic.hs
+++ b/hydra-node/src/Hydra/HeadLogic.hs
@@ -242,15 +242,15 @@ instance (IsTx tx, Arbitrary (ChainStateType tx)) => Arbitrary (OpenState tx) wh
-- | Off-chain state of the Coordinated Head protocol.
data CoordinatedHeadState tx = CoordinatedHeadState
- { -- | The latest UTxO resulting from applying 'seenTxs' to
- -- 'confirmedSnapshot'. Spec: L̂
- seenUTxO :: UTxOType tx
- , -- | List of seen transactions pending inclusion in a snapshot. Spec: T̂
- seenTxs :: [tx]
- , -- | The latest confirmed snapshot. Spec: U̅, s̅ and σ̅
- confirmedSnapshot :: ConfirmedSnapshot tx
- , -- | Last seen snapshot and signatures accumulator. Spec: Û, ŝ and Σ̂
- seenSnapshot :: SeenSnapshot tx
+ { seenUTxO :: UTxOType tx
+ -- ^ The latest UTxO resulting from applying 'seenTxs' to
+ -- 'confirmedSnapshot'. Spec: L̂
+ , seenTxs :: [tx]
+ -- ^ List of seen transactions pending inclusion in a snapshot. Spec: T̂
+ , confirmedSnapshot :: ConfirmedSnapshot tx
+ -- ^ The latest confirmed snapshot. Spec: U̅, s̅ and σ̅
+ , seenSnapshot :: SeenSnapshot tx
+ -- ^ Last seen snapshot and signatures accumulator. Spec: Û, ŝ and Σ̂
}
deriving stock (Generic)
@@ -277,8 +277,8 @@ data SeenSnapshot tx
| -- | ReqSn for given snapshot was received.
SeenSnapshot
{ snapshot :: Snapshot tx
- , -- | Collected signatures and so far.
- signatories :: Map Party (Signature (Snapshot tx))
+ , signatories :: Map Party (Signature (Snapshot tx))
+ -- ^ Collected signatures and so far.
}
deriving stock (Generic)
@@ -306,9 +306,9 @@ data ClosedState tx = ClosedState
{ parameters :: HeadParameters
, confirmedSnapshot :: ConfirmedSnapshot tx
, contestationDeadline :: UTCTime
- , -- | Tracks whether we have informed clients already about being
- -- 'ReadyToFanout'.
- readyToFanoutSent :: Bool
+ , readyToFanoutSent :: Bool
+ -- ^ Tracks whether we have informed clients already about being
+ -- 'ReadyToFanout'.
, chainState :: ChainStateType tx
, headId :: HeadId
, previousRecoverableState :: HeadState tx
@@ -388,8 +388,8 @@ instance Arbitrary WaitReason where
arbitrary = genericArbitrary
data Environment = Environment
- { -- | This is the p_i from the paper
- party :: Party
+ { party :: Party
+ -- ^ This is the p_i from the paper
, -- NOTE(MB): In the long run we would not want to keep the signing key in
-- memory, i.e. have an 'Effect' for signing or so.
signingKey :: SigningKey HydraKey
@@ -489,8 +489,8 @@ onInitialChainCommitTx ::
Outcome tx
onInitialChainCommitTx st newChainState pt utxo =
NewState newState $
- notifyClient :
- [postCollectCom | canCollectCom]
+ notifyClient
+ : [postCollectCom | canCollectCom]
where
newState =
Initial
@@ -619,7 +619,7 @@ onOpenNetworkReqTx env ledger st ttl tx =
case applyTransactions currentSlot seenUTxO [tx] of
Left (_, err)
| ttl <= 0 ->
- OnlyEffects [ClientEffect $ TxInvalid headId seenUTxO tx err]
+ OnlyEffects [ClientEffect $ TxInvalid headId seenUTxO tx err]
| otherwise -> Wait $ WaitOnNotApplicableTx err
Right utxo' ->
NewState
@@ -870,14 +870,14 @@ onOpenChainCloseTx ::
Outcome tx
onOpenChainCloseTx openState newChainState closedSnapshotNumber contestationDeadline =
NewState closedState $
- notifyClient :
- [ OnChainEffect
- { -- REVIEW: Was using "old" chainState before
- chainState = newChainState
- , postChainTx = ContestTx{confirmedSnapshot}
- }
- | doContest
- ]
+ notifyClient
+ : [ OnChainEffect
+ { -- REVIEW: Was using "old" chainState before
+ chainState = newChainState
+ , postChainTx = ContestTx{confirmedSnapshot}
+ }
+ | doContest
+ ]
where
doContest =
number (getSnapshot confirmedSnapshot) > closedSnapshotNumber
@@ -916,16 +916,16 @@ onClosedChainContestTx ::
Outcome tx
onClosedChainContestTx closedState snapshotNumber
| snapshotNumber < number (getSnapshot confirmedSnapshot) =
- OnlyEffects
- [ ClientEffect HeadIsContested{snapshotNumber, headId}
- , OnChainEffect{chainState, postChainTx = ContestTx{confirmedSnapshot}}
- ]
+ OnlyEffects
+ [ ClientEffect HeadIsContested{snapshotNumber, headId}
+ , OnChainEffect{chainState, postChainTx = ContestTx{confirmedSnapshot}}
+ ]
| snapshotNumber > number (getSnapshot confirmedSnapshot) =
- -- TODO: A more recent snapshot number was succesfully contested, we will
- -- not be able to fanout! We might want to communicate that to the client!
- OnlyEffects [ClientEffect HeadIsContested{snapshotNumber, headId}]
+ -- TODO: A more recent snapshot number was succesfully contested, we will
+ -- not be able to fanout! We might want to communicate that to the client!
+ OnlyEffects [ClientEffect HeadIsContested{snapshotNumber, headId}]
| otherwise =
- OnlyEffects [ClientEffect HeadIsContested{snapshotNumber, headId}]
+ OnlyEffects [ClientEffect HeadIsContested{snapshotNumber, headId}]
where
ClosedState{chainState, confirmedSnapshot, headId} = closedState
@@ -983,18 +983,18 @@ onCurrentChainRollback currentState slot =
rollback rollbackSlot hs
| chainStateSlot (getChainState hs) <= rollbackSlot = hs
| otherwise =
- case hs of
- Idle{} -> hs
- Initial InitialState{previousRecoverableState} ->
- rollback rollbackSlot previousRecoverableState
- Open OpenState{previousRecoverableState, currentSlot} ->
- case previousRecoverableState of
- Open ost ->
- rollback rollbackSlot (Open ost{currentSlot})
- _ ->
- rollback rollbackSlot previousRecoverableState
- Closed ClosedState{previousRecoverableState} ->
- rollback rollbackSlot previousRecoverableState
+ case hs of
+ Idle{} -> hs
+ Initial InitialState{previousRecoverableState} ->
+ rollback rollbackSlot previousRecoverableState
+ Open OpenState{previousRecoverableState, currentSlot} ->
+ case previousRecoverableState of
+ Open ost ->
+ rollback rollbackSlot (Open ost{currentSlot})
+ _ ->
+ rollback rollbackSlot previousRecoverableState
+ Closed ClosedState{previousRecoverableState} ->
+ rollback rollbackSlot previousRecoverableState
-- | The "pure core" of the Hydra node, which handles the 'Event' against a
-- current 'HeadState'. Resulting new 'HeadState's are retained and 'Effect'
@@ -1050,9 +1050,9 @@ update env ledger st ev = case (st, ev) of
onClosedChainContestTx closedState snapshotNumber
(Closed cst@ClosedState{contestationDeadline, readyToFanoutSent, headId}, OnChainEvent (Tick chainTime _))
| chainTime > contestationDeadline && not readyToFanoutSent ->
- NewState
- (Closed cst{readyToFanoutSent = True})
- [ClientEffect $ ReadyToFanout headId]
+ NewState
+ (Closed cst{readyToFanoutSent = True})
+ [ClientEffect $ ReadyToFanout headId]
(Closed closedState, ClientEvent Fanout) ->
onClosedClientFanout closedState
(Closed closedState, OnChainEvent Observation{observedTx = OnFanoutTx{}, newChainState}) ->
@@ -1099,17 +1099,17 @@ newSn :: Environment -> HeadParameters -> CoordinatedHeadState tx -> SnapshotOut
newSn Environment{party} parameters CoordinatedHeadState{confirmedSnapshot, seenSnapshot, seenTxs} =
if
| not (isLeader parameters party nextSn) ->
- ShouldNotSnapshot $ NotLeader nextSn
+ ShouldNotSnapshot $ NotLeader nextSn
| -- NOTE: This is different than in the spec. If we use seenSn /=
-- confirmedSn here, we implicitly require confirmedSn <= seenSn. Which
-- may be an acceptable invariant, but we have property tests which are
-- more strict right now. Anyhow, we can be more expressive.
snapshotInFlight ->
- ShouldNotSnapshot $ SnapshotInFlight nextSn
+ ShouldNotSnapshot $ SnapshotInFlight nextSn
| null seenTxs ->
- ShouldNotSnapshot NoTransactionsToSnapshot
+ ShouldNotSnapshot NoTransactionsToSnapshot
| otherwise ->
- ShouldSnapshot nextSn seenTxs
+ ShouldSnapshot nextSn seenTxs
where
nextSn = confirmedSn + 1
I'm a bit puzzled that we would find so many instabilities in a code formatting tool. Each version seems to introduce quite dramatic format changes :(
- While working on improving the PR for publishing the architecture page, CI failed on the Documentation step with a very odd error
- I was able to reproduce the error locally by running a simple yarn validate:inputs but it took me a while to track it down, and I had to resort to debugging the validate-api.js script:
  - The problem stemmed from an update in @asyncapi/specs to a new minor version that was actually a breaking change. The validator we use, asyncapi-validator, depends on @asyncapi/parser that has an upper bound of ^4.1.1
  - yarn install therefore resolved this bound to be 4.3.3, and although I added an overrides: {..} section to the package.json, this had no effect on yarn's resolution mechanism
  - I ended up manually changing the yarn.lock file to link ^4.1.1 to 4.2.1 and not 4.3.3, which seems gross: There has to be a better way.
- Now, the documentation is failing on docusaurus because of broken links I need to fix 🔥
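To see why the ^4.1.1 bound happily resolves to 4.3.3, here is a small sketch of the usual semver caret-range rule (compatible up to the next change of the left-most non-zero component). It is an illustration only, not yarn's resolver; `SemVer` and `satisfiesCaret` are hypothetical names.

```haskell
-- | A version as (major, minor, patch); tuples compare lexicographically.
type SemVer = (Int, Int, Int)

-- | Does the candidate satisfy the caret range built from the lower bound?
-- For example, ^4.1.1 means >=4.1.1 and <5.0.0.
satisfiesCaret :: SemVer -> SemVer -> Bool
satisfiesCaret lower@(major, minor, patch) candidate =
  candidate >= lower && candidate < upper
  where
    upper
      | major > 0 = (major + 1, 0, 0)
      | minor > 0 = (0, minor + 1, 0)
      | otherwise = (0, 0, patch + 1)
```

Both `satisfiesCaret (4, 1, 1) (4, 2, 1)` and `satisfiesCaret (4, 1, 1) (4, 3, 3)` are `True`, so a resolver picking the newest match ends up on 4.3.3 unless the lock file pins something older.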
-
Unattended upgrades - how to make dependencies upgrade less painful (When it's painful, do it more often)
- Do not fix any dependency version
- Upgrade every day (automatically)
- Fix what’s broken (including fixing some dependencies version because life sucks)
- Rinse and repeat
- practically:
- Bump cabal indexes (then cabal update will pick up new deps) automatically
- Bump cardano source dependencies
- Where are we with 0.10.0?
- Cut it when mainnet compat + commits/Rollbacks are merged
-
Architecture page - KISS and nicely looking?
- Mermaid was just an idea, no need to go to great length to use it
- We could just keep the diagram we have so as not to duplicate sources of truth and representations to update?
- Goal: Easier to find and approach? No need to use C4/standardized tool really. OTOH, Miro is not great for sharing outside of the core team, so having an authoritative source that anyone could update would be useful
- Conclusion: Do something then review but not make it overcomplicated
-
Supported timed tx
- Currently I have a draft e2e spec failing for the right reason: attempting to submit a new tx using validity lower bound leads to tx invalid validation error. Now I need to fix the ledger to make the tx valid. Hints?
- Let's pair on this after the grooming session
After the tremendous job from @ch1bo to update our dependencies, let’s check if we can safely upgrade our Haskell build tools.
Only successful combination:
- Cabal 3.10.1.0
- GHC 8.10.7
So it seems we can safely upgrade our cabal version but are stuck with the old GHC.
#> cabal --version
cabal-install version 3.10.1.0
compiled using version 3.10.1.0 of the Cabal library
#> ghc --version
The Glorious Glasgow Haskell Compilation System, version 9.6.1
#> cabal build hydra-node
…
Resolving dependencies...
Error: cabal: Could not resolve dependencies:
[__0] trying: cardano-api-1.36.0 (user goal)
[__1] next goal: base (dependency of cardano-api)
[__1] rejecting: base-4.18.0.0/installed-4.18.0.0 (conflict: cardano-api =>
base>=4.14 && <4.17)
[__1] skipping: base-4.18.0.0, base-4.17.1.0, base-4.17.0.0 (has the same
characteristics that caused the previous version to fail: excluded by
constraint '>=4.14 && <4.17' from 'cardano-api')
[__1] rejecting: base-4.16.4.0, base-4.16.3.0, base-4.16.2.0, base-4.16.1.0,
base-4.16.0.0, base-4.15.1.0, base-4.15.0.0, base-4.14.3.0, base-4.14.2.0,
base-4.14.1.0, base-4.14.0.0, base-4.13.0.0, base-4.12.0.0, base-4.11.1.0,
base-4.11.0.0, base-4.10.1.0, base-4.10.0.0, base-4.9.1.0, base-4.9.0.0,
base-4.8.2.0, base-4.8.1.0, base-4.8.0.0, base-4.7.0.2, base-4.7.0.1,
base-4.7.0.0, base-4.6.0.1, base-4.6.0.0, base-4.5.1.0, base-4.5.0.0,
base-4.4.1.0, base-4.4.0.0, base-4.3.1.0, base-4.3.0.0, base-4.2.0.2,
base-4.2.0.1, base-4.2.0.0, base-4.1.0.0, base-4.0.0.0, base-3.0.3.2,
base-3.0.3.1 (constraint from non-upgradeable package requires installed
instance)
[__1] fail (backjumping, conflict set: base, cardano-api)
After searching the rest of the dependency tree exhaustively, these were the
goals I've had most trouble fulfilling: base, cardano-api
Same conflict issue as with ghc 9.6.1 between base and cardano-api.
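The conflict boils down to the base version shipped with each compiler falling outside the bound declared by cardano-api (base >=4.14 && <4.17 in the solver output above). A minimal sketch of that check, using Data.Version from base; `withinCardanoApiBounds` is a hypothetical helper, and the GHC 8.10.7 to base 4.14.3.0 mapping is stated from memory rather than from the logs.

```haskell
import Data.Version (Version, makeVersion)

-- | Check a base version against the bound reported by the solver:
-- base >=4.14 && <4.17
withinCardanoApiBounds :: Version -> Bool
withinCardanoApiBounds baseVersion =
  baseVersion >= makeVersion [4, 14] && baseVersion < makeVersion [4, 17]

-- | base shipped with GHC 9.6.1, as shown in the solver output.
baseOfGhc961 :: Version
baseOfGhc961 = makeVersion [4, 18, 0, 0]

-- | base shipped with GHC 8.10.7 (assumption, not taken from the logs).
baseOfGhc8107 :: Version
baseOfGhc8107 = makeVersion [4, 14, 3, 0]
```

`withinCardanoApiBounds baseOfGhc961` is `False` while `withinCardanoApiBounds baseOfGhc8107` is `True`. Passing this check is necessary but not sufficient, since other dependencies can still fail to resolve or compile.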
#> cabal --version
cabal-install version 3.10.1.0
compiled using version 3.10.1.0 of the Cabal library
#> ghc --version
The Glorious Glasgow Haskell Compilation System, version 9.2.7
#> cabal build hydra-node
…
Building library for strict-containers-0.1.0.0..
[1 of 4] Compiling Data.Maybe.Strict ( src/Data/Maybe/Strict.hs, dist/build/Data/Maybe/Strict.o, dist/build/Data/Maybe/Strict.dyn_o )
src/Data/Maybe/Strict.hs:64:3: error: [-Wnoncanonical-monad-instances, -Werror=noncanonical-monad-instances]
Noncanonical ‘return’ definition detected
in the instance declaration for ‘Monad StrictMaybe’.
‘return’ will eventually be removed in favour of ‘pure’
Either remove definition for ‘return’ (recommended) or define as ‘return = pure’
See also: https://gitlab.haskell.org/ghc/ghc/-/wikis/proposal/monad-of-no-return
|
64 | return = SJust
| ^^^^^^^^^^^^^^
[2 of 4] Compiling Data.Unit.Strict ( src/Data/Unit/Strict.hs, dist/build/Data/Unit/Strict.o, dist/build/Data/Unit/Strict.dyn_o )
[3 of 4] Compiling Data.Sequence.Strict ( src/Data/Sequence/Strict.hs, dist/build/Data/Sequence/Strict.o, dist/build/Data/Sequence/Strict.dyn_o )
[4 of 4] Compiling Data.FingerTree.Strict ( src/Data/FingerTree/Strict.hs, dist/build/Data/FingerTree/Strict.o, dist/build/Data/FingerTree/Strict.dyn_o )
Error: cabal: Failed to build strict-containers-0.1.0.0 (which is required by
test:tests from hydra-node-0.10.0, bench:tx-cost from hydra-node-0.10.0 and
others). See the build log above for details.
Same compilation error as with ghc 9.2.7 on strict-containers.
#> cabal --version
cabal-install version 3.10.1.0
compiled using version 3.10.1.0 of the Cabal library
#> ghc --version
The Glorious Glasgow Haskell Compilation System, version 9.0.2
#> cabal build hydra-node
…
Error: cabal: Could not resolve dependencies:
[__0] trying: cardano-api-1.36.0 (user goal)
[__1] next goal: cardano-ledger-alonzo-test (dependency of cardano-api)
[__1] rejecting: cardano-ledger-alonzo-test-1.0.0.0 (conflict: cardano-api =>
cardano-ledger-alonzo-test^>=0.1)
[__1] trying: cardano-ledger-alonzo-test-0.1.1.2
[__2] trying: plutus-tx-1.3.0.0 (dependency of cardano-ledger-alonzo-test)
[__3] trying: plutus-core-1.3.0.0 (dependency of plutus-tx)
[__4] trying: hydra-plutus-0.10.0 (user goal)
[__5] trying: plutus-tx-plugin-1.3.0.0 (dependency of hydra-plutus)
[__6] trying: plutus-tx-plugin:-use-ghc-stub
[__7] next goal: ghc (dependency of plutus-tx-plugin -use-ghc-stub)
[__7] rejecting: ghc-9.0.2/installed-9.0.2 (conflict: plutus-tx-plugin
-use-ghc-stub => ghc>=9.2 && <9.4)
[__7] rejecting: ghc-9.6.1, ghc-9.4.5, ghc-9.4.4, ghc-9.4.3, ghc-9.4.2,
ghc-9.4.1, ghc-9.2.7, ghc-9.2.6, ghc-9.2.5, ghc-9.2.4, ghc-9.2.3, ghc-9.2.2,
ghc-9.2.1, ghc-9.0.2, ghc-8.10.7, ghc-8.10.2, ghc-8.10.1, ghc-8.8.3,
ghc-8.8.1, ghc-8.6.5, ghc-8.6.4, ghc-8.6.1, ghc-8.4.4, ghc-8.4.3, ghc-8.4.1,
ghc-8.2.2, ghc-8.2.1, ghc-9.2.3.20220620 (constraint from non-upgradeable
package requires installed instance)
[__7] fail (backjumping, conflict set: ghc, plutus-tx-plugin,
plutus-tx-plugin:use-ghc-stub)
After searching the rest of the dependency tree exhaustively, these were the
goals I've had most trouble fulfilling: ghc, cardano-api, plutus-tx-plugin,
plutus-core, plutus-tx, plutus-tx-plugin:use-ghc-stub, hydra-plutus,
cardano-ledger-alonzo-test
Success
PostTxOnChainFailed user issue - what to do?
- no bug per se but a not cool user experience
- we'll groom it next Tuesday to see what we can do to improve it
Running hydra-cluster tests locally on M2
- will need some help with that
- will share the actual error message in slack first
CI issues while Logging in to ghcr.io
- Github outage in progress so just have to wait
About supporting timed transactions:
- auction people seem aligned on the issue description
- voting is using posix time to build transactions
- about granularity:
- One server output every L1 slot should be good
- Maybe more granular support in the future would be cool for short living auctions?
- First draft of a test sketched
Current head on mainnet liveness issue
- The leader didn't see ackSn from 2 of the peers
- need to check the other peers logs
About Benchmarks:
- Couple of questions yesterday about benchmarks from the community
- We'll mature/prioritize the benchmark issue already present in the roadmap
The Rollback event now contains two pieces of information:
- the ChainPoint we were rolled back to
- the ChainState
The ChainPoint is not really used now by the HeadLogic but it makes it more and more clear to me that this could become, at some point, the only chain state information to persist.
We can imagine then that, on startup, the node will inject to the chain component the chainpoint from which to start. Then, it will record the chainpoint for which it has handled the corresponding roll forward or backward events... and that's it.
That would pave the way to a more generic chain component.
For now, we still need the chain state as the chain component uses it to do some smart observation.
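A minimal sketch of what "only persist the chain point" could look like, under the assumption that both roll forward and roll backward events carry the point they refer to. The types below are simplified, hypothetical stand-ins and not the actual hydra-node definitions.

```haskell
import Data.List (foldl')

-- | Hypothetical, simplified chain point: genesis, or a slot number with a
-- block hash.
data ChainPoint
  = ChainPointAtGenesis
  | ChainPoint Integer String
  deriving (Eq, Show)

-- | Hypothetical chain events, reduced to the point they refer to.
data ChainEvent
  = RollForward ChainPoint
  | Rollback ChainPoint
  deriving (Eq, Show)

-- | The only state to persist in this sketch: the point of the last handled
-- event, which is also the point to resume from on startup.
lastHandledPoint :: ChainPoint -> [ChainEvent] -> ChainPoint
lastHandledPoint = foldl' step
  where
    step _ (RollForward point) = point
    step _ (Rollback point) = point
```

On startup the node would read the persisted point, hand it to the chain component, and keep overwriting it as events are handled. This leaves out the chain state that the observation code still needs today.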
When removing the Rollback ServerOutput some tests fail to tell me that, oh wait, I should update the log schema. The test output is huge and it's hard to figure out what I'm supposed to change.
In the end it was just that the previousRecoverableState was still present in logs.yaml, but the output was really useless: I had to review, point by point, the json object and compare it with the schema. I would have appreciated a regular compiler-like message such as missing mandatory property.
That could have saved me at least 30 minutes.
Exploring the logs from #832 is really complicated.
One of the issues, I think, comes from the fact that each and every log line has its own type/schema, which makes it very hard to filter the logs.
For instance, if I want to get the failureReason of a post of a transaction, depending on the log line, I will need a different filter, which makes it really hard and, I think, barely usable in regular web-based log management tools.
Discussing with AB, we realized that the failureReason should appear once and only once in the logs, but it actually appears 8 times and in 5 different places inside the log, meaning 5 different patterns to extract it:
.message.directChain.postTxError.failureReason
.message.node.event.postTxError.failureReason
.message.node.outcome.effects[0].serverOutput.postTxError.failureReason
.message.node.effect.serverOutput.postTxError.failureReason
.message.api.sentOutput.postTxError.failureReason
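Until the logs are cleaned up, one workaround is to search for the key at any depth instead of maintaining five path patterns. The sketch below uses a tiny hand-rolled JSON type to stay self-contained; `Json` and `findKey` are hypothetical names, and in practice one would decode the log lines with a real JSON library first.

```haskell
-- | Minimal JSON value type, just enough for the sketch.
data Json
  = JObject [(String, Json)]
  | JArray [Json]
  | JString String
  | JNumber Double
  | JBool Bool
  | JNull
  deriving (Eq, Show)

-- | Collect every value stored under the given key, at any nesting depth.
findKey :: String -> Json -> [Json]
findKey key json = case json of
  JObject fields ->
    [value | (name, value) <- fields, name == key]
      ++ concatMap (findKey key . snd) fields
  JArray items -> concatMap (findKey key) items
  _ -> []

-- | Helper to build a nested object from a path and a leaf value.
nest :: [String] -> Json -> Json
nest path leaf = foldr (\name inner -> JObject [(name, inner)]) leaf path
```

For instance, `findKey "failureReason" (nest ["message", "directChain", "postTxError", "failureReason"] (JString "boom"))` returns `[JString "boom"]`, and the same call works unchanged for the other four paths, including the one going through the effects array.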
It looks like the problem comes from the fact that an error appears and is handled but, instead of letting the last layer of code log the event, all layers log it when they see it. In my previous experience with that log pattern, this led to very complicated operations.
We should ensure we only show information once, unless we're in debug mode (which we don't want) where we just want to trace that this line of code has been executed.
Here are these 8 lines of logs for this failure:
{"timestamp":"2023-04-20T23:33:44.269539705Z","threadId":80,"namespace":"HydraNode-\"1\"","message":{"directChain":{"postTxError":{"failureReason":"HardForkApplyTxErrFromEra S (S (S (S (S (Z (WrapApplyTxErr {unwrapApplyTxErr = ApplyTxError [UtxowFailure (FromAlonzoUtxowFail (NonOutputSupplimentaryDatums (fromList [SafeHash \"94823cbee8d16b5154da8f6e57cbd6a9de165ffef1299fa8520d3ca6acd36da5\"]) (fromList []))),UtxowFailure (FromAlonzoUtxowFail (ExtraRedeemers [RdmrPtr Spend 0])),UtxowFailure (UtxoFailure (FromAlonzoUtxoFail (UtxosFailure (CollectErrors [BadTranslation (TranslationLogicMissingInput (TxIn (TxId {_unTxId = SafeHash \"0b7ada4617bf2a7180f9eac4ea60a7a95f5c90c853af8bb309f37b4d2a2092c7\"}) (TxIx 0)))])))),UtxowFailure (UtxoFailure (FromAlonzoUtxoFail (ValueNotConservedUTxO (Value 96471200 (fromList [(PolicyID {policyID = ScriptHash \"6b4a78a641be2867925749129f97f8a5965631b9c80a222f82188355\"},fromList [(14b045cc21d4316f6ac896a438ac194fc19573619181c6a44a540215,-1),(4879647261486561645631,-1),(c6813d435dfed1f0d6dda71ebf97285ad1214a6dfd3abcfa4b740ab1,-1),(ddde64bb6ae682b34b69ad651375b6b23abfa9860d46329f4444fd31,-1)])])) (Value 856471200 (fromList [(PolicyID {policyID = ScriptHash \"2035da628491aba52c1ab37ff68124d40fd49c0e5f258992229f0da7\"},fromList [(566f7563686572,1)])]))))),UtxowFailure (UtxoFailure (FromAlonzoUtxoFail (BadInputsUTxO (fromList [TxIn (TxId {_unTxId = SafeHash \"0b7ada4617bf2a7180f9eac4ea60a7a95f5c90c853af8bb309f37b4d2a2092c7\"}) (TxIx 
0)]))))]}))))))","tag":"FailedToPostTx"},"tag":"PostingFailed","tx":{"body":{"collateral":["0b7ada4617bf2a7180f9eac4ea60a7a95f5c90c853af8bb309f37b4d2a2092c7#1"],"fees":3528800,"inputs":["0b7ada4617bf2a7180f9eac4ea60a7a95f5c90c853af8bb309f37b4d2a2092c7#0","0b7ada4617bf2a7180f9eac4ea60a7a95f5c90c853af8bb309f37b4d2a2092c7#1"],"mint":{"6b4a78a641be2867925749129f97f8a5965631b9c80a222f82188355":{"14b045cc21d4316f6ac896a438ac194fc19573619181c6a44a540215":-1,"4879647261486561645631":-1,"c6813d435dfed1f0d6dda71ebf97285ad1214a6dfd3abcfa4b740ab1":-1,"ddde64bb6ae682b34b69ad651375b6b23abfa9860d46329f4444fd31":-1}},"outputs":[{"address":"addr_test1vq2tq3wvy82rzmm2ezt2gw9vr98ur9tnvxgcr34yff2qy9g8ny6m3","datum":null,"datumhash":null,"inlineDatum":null,"referenceScript":null,"value":{"lovelace":250000000}},{"address":"addr_test1wzmgf6tetqh88fgxwj088v222wkcxkvgchngs60xuzx2g2grsaafx","datum":null,"inlineDatum":{"constructor":0,"fields":[{"constructor":0,"fields":[{"list":[]},{"constructor":0,"fields":[{"constructor":0,"fields":[{"bytes":"1052386136b347f3bb7c67fe3f2ee4ef120e1836e5d2707bb068afa6"},{"int":8000000000000}]}]}]},{"constructor":0,"fields":[{"bytes":"2035da628491aba52c1ab37ff68124d40fd49c0e5f258992229f0da7"}]}]},"inlineDatumhash":"cfc8aa1b3344cbb0544b1bee9fb7add731ce5b14c36f41429065d4c158e1ee0c","referenceScript":null,"value":{"2035da628491aba52c1ab37ff68124d40fd49c0e5f258992229f0da7":{"566f7563686572":1},"lovelace":2000000}},{"address":"addr_test1vrrgz02rthldruxkmkn3a0uh9pddzg22dh7n4086fd6q4vgeumv4j","datum":null,"datumhash":null,"inlineDatum":null,"referenceScript":null,"value":{"lovelace":500000000}},{"address":"addr_test1vrwaue9mdtng9v6tdxkk2ym4k6er40afscx5vv5lg3z06vgqehfxu","datum":null,"datumhash":"a654fb60d21c1fed48db2c320aa6df9737ec0204c0ba53b9b94a09fb40e757f3","inlineDatum":null,"referenceScript":null,"value":{"lovelace":100942400}}],"referenceInputs":["9b6787a1aecb93f748d0498222e31d6f41f68948f69bc1b6b8d94cd9649fda34#2"],"scriptIntegrityHash":"6cc695500d48b1d96b6cffa
2ef42a936a3ac6e511c3b4cc586654b697d5dabea","validity":{"notAfter":null,"notBefore":13151}},"id":"daec36ab16422c3214489266812cc66ba0a2f32a5ce53af7eb1193f3287861e9","isValid":true,"witnesses":{"datums":{"94823cbee8d16b5154da8f6e57cbd6a9de165ffef1299fa8520d3ca6acd36da5":"d87b9f9f58207abcda7de6d883e7570118c1ccc8ee2e911f2e628a41ab0685ffee15f39bba965820b37aabd81024c043f53a069c91e51a5b52e4ea399ae17ee1fe3cb9c44db707eb5820f68e5624f885d521d2f43c3959a0de70496d5464bd3171aba8248f50d5d72b41ff0158200f77911d83755729182157cddcfb716a749c702e46e7374e7d858280bab779681b00000187a10443c0d8799f1903e8ff581c6b4a78a641be2867925749129f97f8a5965631b9c80a222f8218835580ff"},"keys":["8200825820e122239580c539cd211ed1bd789a9b4b3dfdf69cff82dd1a8e79bd73442e339c5840d918eac9cc0f590f213d5c0ecaeb67907d085cfa0242d16e2a6d5d02a80b049376a9204b86c01e5fb491c8dbaf3ac5d4dd644201073ebcde869a2d2d4110ce08"],"redeemers":"82840000d87d9f03ff821a007681a61b000000016b953070840100d87a80821a005f1dd91ae876b38f","scripts":{"6b4a78a641be2867925749129f97f8a5965631b9c80a222f82188355":""}}}},"tag":"DirectChain"}}
{"timestamp":"2023-04-20T23:33:44.271499168Z","threadId":75,"namespace":"HydraNode-\"1\"","message":{"node":{"by":{"vkey":"b37aabd81024c043f53a069c91e51a5b52e4ea399ae17ee1fe3cb9c44db707eb"},"event":{"postChainTx":{"contestationDeadline":"2023-04-20T23:33:44Z","tag":"FanoutTx","utxo":{"8e8bf7795d7c3ed786758032085e14fe3127ffff302b519462984e0247d7f416#0":{"address":"addr_test1vq2tq3wvy82rzmm2ezt2gw9vr98ur9tnvxgcr34yff2qy9g8ny6m3","datum":null,"datumhash":null,"inlineDatum":null,"referenceScript":null,"value":{"lovelace":250000000}},"99cf7222af2f656b00240b61fe25d2aa6f9e09966d71754a522f123f72f4d80d#0":{"address":"addr_test1wzmgf6tetqh88fgxwj088v222wkcxkvgchngs60xuzx2g2grsaafx","datum":null,"inlineDatum":{"constructor":0,"fields":[{"constructor":0,"fields":[{"list":[]},{"constructor":0,"fields":[{"constructor":0,"fields":[{"bytes":"1052386136b347f3bb7c67fe3f2ee4ef120e1836e5d2707bb068afa6"},{"int":8000000000000}]}]}]},{"constructor":0,"fields":[{"bytes":"2035da628491aba52c1ab37ff68124d40fd49c0e5f258992229f0da7"}]}]},"inlineDatumhash":"cfc8aa1b3344cbb0544b1bee9fb7add731ce5b14c36f41429065d4c158e1ee0c","referenceScript":null,"value":{"2035da628491aba52c1ab37ff68124d40fd49c0e5f258992229f0da7":{"566f7563686572":1},"lovelace":2000000}},"99cf7222af2f656b00240b61fe25d2aa6f9e09966d71754a522f123f72f4d80d#1":{"address":"addr_test1vrrgz02rthldruxkmkn3a0uh9pddzg22dh7n4086fd6q4vgeumv4j","datum":null,"datumhash":null,"inlineDatum":null,"referenceScript":null,"value":{"lovelace":500000000}}}},"postTxError":{"failureReason":"HardForkApplyTxErrFromEra S (S (S (S (S (Z (WrapApplyTxErr {unwrapApplyTxErr = ApplyTxError [UtxowFailure (FromAlonzoUtxowFail (NonOutputSupplimentaryDatums (fromList [SafeHash \"94823cbee8d16b5154da8f6e57cbd6a9de165ffef1299fa8520d3ca6acd36da5\"]) (fromList []))),UtxowFailure (FromAlonzoUtxowFail (ExtraRedeemers [RdmrPtr Spend 0])),UtxowFailure (UtxoFailure (FromAlonzoUtxoFail (UtxosFailure (CollectErrors [BadTranslation (TranslationLogicMissingInput (TxIn (TxId 
{_unTxId = SafeHash \"0b7ada4617bf2a7180f9eac4ea60a7a95f5c90c853af8bb309f37b4d2a2092c7\"}) (TxIx 0)))])))),UtxowFailure (UtxoFailure (FromAlonzoUtxoFail (ValueNotConservedUTxO (Value 96471200 (fromList [(PolicyID {policyID = ScriptHash \"6b4a78a641be2867925749129f97f8a5965631b9c80a222f82188355\"},fromList [(14b045cc21d4316f6ac896a438ac194fc19573619181c6a44a540215,-1),(4879647261486561645631,-1),(c6813d435dfed1f0d6dda71ebf97285ad1214a6dfd3abcfa4b740ab1,-1),(ddde64bb6ae682b34b69ad651375b6b23abfa9860d46329f4444fd31,-1)])])) (Value 856471200 (fromList [(PolicyID {policyID = ScriptHash \"2035da628491aba52c1ab37ff68124d40fd49c0e5f258992229f0da7\"},fromList [(566f7563686572,1)])]))))),UtxowFailure (UtxoFailure (FromAlonzoUtxoFail (BadInputsUTxO (fromList [TxIn (TxId {_unTxId = SafeHash \"0b7ada4617bf2a7180f9eac4ea60a7a95f5c90c853af8bb309f37b4d2a2092c7\"}) (TxIx 0)]))))]}))))))","tag":"FailedToPostTx"},"tag":"PostTxError"},"tag":"BeginEvent"},"tag":"Node"}}
{"timestamp":"2023-04-20T23:33:44.27150021Z","threadId":75,"namespace":"HydraNode-\"1\"","message":{"node":{"by":{"vkey":"b37aabd81024c043f53a069c91e51a5b52e4ea399ae17ee1fe3cb9c44db707eb"},"outcome":{"effects":[{"serverOutput":{"postChainTx":{"contestationDeadline":"2023-04-20T23:33:44Z","tag":"FanoutTx","utxo":{"8e8bf7795d7c3ed786758032085e14fe3127ffff302b519462984e0247d7f416#0":{"address":"addr_test1vq2tq3wvy82rzmm2ezt2gw9vr98ur9tnvxgcr34yff2qy9g8ny6m3","datum":null,"datumhash":null,"inlineDatum":null,"referenceScript":null,"value":{"lovelace":250000000}},"99cf7222af2f656b00240b61fe25d2aa6f9e09966d71754a522f123f72f4d80d#0":{"address":"addr_test1wzmgf6tetqh88fgxwj088v222wkcxkvgchngs60xuzx2g2grsaafx","datum":null,"inlineDatum":{"constructor":0,"fields":[{"constructor":0,"fields":[{"list":[]},{"constructor":0,"fields":[{"constructor":0,"fields":[{"bytes":"1052386136b347f3bb7c67fe3f2ee4ef120e1836e5d2707bb068afa6"},{"int":8000000000000}]}]}]},{"constructor":0,"fields":[{"bytes":"2035da628491aba52c1ab37ff68124d40fd49c0e5f258992229f0da7"}]}]},"inlineDatumhash":"cfc8aa1b3344cbb0544b1bee9fb7add731ce5b14c36f41429065d4c158e1ee0c","referenceScript":null,"value":{"2035da628491aba52c1ab37ff68124d40fd49c0e5f258992229f0da7":{"566f7563686572":1},"lovelace":2000000}},"99cf7222af2f656b00240b61fe25d2aa6f9e09966d71754a522f123f72f4d80d#1":{"address":"addr_test1vrrgz02rthldruxkmkn3a0uh9pddzg22dh7n4086fd6q4vgeumv4j","datum":null,"datumhash":null,"inlineDatum":null,"referenceScript":null,"value":{"lovelace":500000000}}}},"postTxError":{"failureReason":"HardForkApplyTxErrFromEra S (S (S (S (S (Z (WrapApplyTxErr {unwrapApplyTxErr = ApplyTxError [UtxowFailure (FromAlonzoUtxowFail (NonOutputSupplimentaryDatums (fromList [SafeHash \"94823cbee8d16b5154da8f6e57cbd6a9de165ffef1299fa8520d3ca6acd36da5\"]) (fromList []))),UtxowFailure (FromAlonzoUtxowFail (ExtraRedeemers [RdmrPtr Spend 0])),UtxowFailure (UtxoFailure (FromAlonzoUtxoFail (UtxosFailure (CollectErrors [BadTranslation 
(TranslationLogicMissingInput (TxIn (TxId {_unTxId = SafeHash \"0b7ada4617bf2a7180f9eac4ea60a7a95f5c90c853af8bb309f37b4d2a2092c7\"}) (TxIx 0)))])))),UtxowFailure (UtxoFailure (FromAlonzoUtxoFail (ValueNotConservedUTxO (Value 96471200 (fromList [(PolicyID {policyID = ScriptHash \"6b4a78a641be2867925749129f97f8a5965631b9c80a222f82188355\"},fromList [(14b045cc21d4316f6ac896a438ac194fc19573619181c6a44a540215,-1),(4879647261486561645631,-1),(c6813d435dfed1f0d6dda71ebf97285ad1214a6dfd3abcfa4b740ab1,-1),(ddde64bb6ae682b34b69ad651375b6b23abfa9860d46329f4444fd31,-1)])])) (Value 856471200 (fromList [(PolicyID {policyID = ScriptHash \"2035da628491aba52c1ab37ff68124d40fd49c0e5f258992229f0da7\"},fromList [(566f7563686572,1)])]))))),UtxowFailure (UtxoFailure (FromAlonzoUtxoFail (BadInputsUTxO (fromList [TxIn (TxId {_unTxId = SafeHash \"0b7ada4617bf2a7180f9eac4ea60a7a95f5c90c853af8bb309f37b4d2a2092c7\"}) (TxIx 0)]))))]}))))))","tag":"FailedToPostTx"},"tag":"PostTxOnChainFailed"},"tag":"ClientEffect"}],"tag":"OnlyEffects"},"tag":"LogicOutcome"},"tag":"Node"}}
{"timestamp":"2023-04-20T23:33:44.271501929Z","threadId":75,"namespace":"HydraNode-\"1\"","message":{"node":{"by":{"vkey":"b37aabd81024c043f53a069c91e51a5b52e4ea399ae17ee1fe3cb9c44db707eb"},"effect":{"serverOutput":{"postChainTx":{"contestationDeadline":"2023-04-20T23:33:44Z","tag":"FanoutTx","utxo":{"8e8bf7795d7c3ed786758032085e14fe3127ffff302b519462984e0247d7f416#0":{"address":"addr_test1vq2tq3wvy82rzmm2ezt2gw9vr98ur9tnvxgcr34yff2qy9g8ny6m3","datum":null,"datumhash":null,"inlineDatum":null,"referenceScript":null,"value":{"lovelace":250000000}},"99cf7222af2f656b00240b61fe25d2aa6f9e09966d71754a522f123f72f4d80d#0":{"address":"addr_test1wzmgf6tetqh88fgxwj088v222wkcxkvgchngs60xuzx2g2grsaafx","datum":null,"inlineDatum":{"constructor":0,"fields":[{"constructor":0,"fields":[{"list":[]},{"constructor":0,"fields":[{"constructor":0,"fields":[{"bytes":"1052386136b347f3bb7c67fe3f2ee4ef120e1836e5d2707bb068afa6"},{"int":8000000000000}]}]}]},{"constructor":0,"fields":[{"bytes":"2035da628491aba52c1ab37ff68124d40fd49c0e5f258992229f0da7"}]}]},"inlineDatumhash":"cfc8aa1b3344cbb0544b1bee9fb7add731ce5b14c36f41429065d4c158e1ee0c","referenceScript":null,"value":{"2035da628491aba52c1ab37ff68124d40fd49c0e5f258992229f0da7":{"566f7563686572":1},"lovelace":2000000}},"99cf7222af2f656b00240b61fe25d2aa6f9e09966d71754a522f123f72f4d80d#1":{"address":"addr_test1vrrgz02rthldruxkmkn3a0uh9pddzg22dh7n4086fd6q4vgeumv4j","datum":null,"datumhash":null,"inlineDatum":null,"referenceScript":null,"value":{"lovelace":500000000}}}},"postTxError":{"failureReason":"HardForkApplyTxErrFromEra S (S (S (S (S (Z (WrapApplyTxErr {unwrapApplyTxErr = ApplyTxError [UtxowFailure (FromAlonzoUtxowFail (NonOutputSupplimentaryDatums (fromList [SafeHash \"94823cbee8d16b5154da8f6e57cbd6a9de165ffef1299fa8520d3ca6acd36da5\"]) (fromList []))),UtxowFailure (FromAlonzoUtxowFail (ExtraRedeemers [RdmrPtr Spend 0])),UtxowFailure (UtxoFailure (FromAlonzoUtxoFail (UtxosFailure (CollectErrors [BadTranslation (TranslationLogicMissingInput 
(TxIn (TxId {_unTxId = SafeHash \"0b7ada4617bf2a7180f9eac4ea60a7a95f5c90c853af8bb309f37b4d2a2092c7\"}) (TxIx 0)))])))),UtxowFailure (UtxoFailure (FromAlonzoUtxoFail (ValueNotConservedUTxO (Value 96471200 (fromList [(PolicyID {policyID = ScriptHash \"6b4a78a641be2867925749129f97f8a5965631b9c80a222f82188355\"},fromList [(14b045cc21d4316f6ac896a438ac194fc19573619181c6a44a540215,-1),(4879647261486561645631,-1),(c6813d435dfed1f0d6dda71ebf97285ad1214a6dfd3abcfa4b740ab1,-1),(ddde64bb6ae682b34b69ad651375b6b23abfa9860d46329f4444fd31,-1)])])) (Value 856471200 (fromList [(PolicyID {policyID = ScriptHash \"2035da628491aba52c1ab37ff68124d40fd49c0e5f258992229f0da7\"},fromList [(566f7563686572,1)])]))))),UtxowFailure (UtxoFailure (FromAlonzoUtxoFail (BadInputsUTxO (fromList [TxIn (TxId {_unTxId = SafeHash \"0b7ada4617bf2a7180f9eac4ea60a7a95f5c90c853af8bb309f37b4d2a2092c7\"}) (TxIx 0)]))))]}))))))","tag":"FailedToPostTx"},"tag":"PostTxOnChainFailed"},"tag":"ClientEffect"},"tag":"BeginEffect"},"tag":"Node"}}
{"timestamp":"2023-04-20T23:33:44.271816487Z","threadId":75,"namespace":"HydraNode-\"1\"","message":{"node":{"by":{"vkey":"b37aabd81024c043f53a069c91e51a5b52e4ea399ae17ee1fe3cb9c44db707eb"},"effect":{"serverOutput":{"postChainTx":{"contestationDeadline":"2023-04-20T23:33:44Z","tag":"FanoutTx","utxo":{"8e8bf7795d7c3ed786758032085e14fe3127ffff302b519462984e0247d7f416#0":{"address":"addr_test1vq2tq3wvy82rzmm2ezt2gw9vr98ur9tnvxgcr34yff2qy9g8ny6m3","datum":null,"datumhash":null,"inlineDatum":null,"referenceScript":null,"value":{"lovelace":250000000}},"99cf7222af2f656b00240b61fe25d2aa6f9e09966d71754a522f123f72f4d80d#0":{"address":"addr_test1wzmgf6tetqh88fgxwj088v222wkcxkvgchngs60xuzx2g2grsaafx","datum":null,"inlineDatum":{"constructor":0,"fields":[{"constructor":0,"fields":[{"list":[]},{"constructor":0,"fields":[{"constructor":0,"fields":[{"bytes":"1052386136b347f3bb7c67fe3f2ee4ef120e1836e5d2707bb068afa6"},{"int":8000000000000}]}]}]},{"constructor":0,"fields":[{"bytes":"2035da628491aba52c1ab37ff68124d40fd49c0e5f258992229f0da7"}]}]},"inlineDatumhash":"cfc8aa1b3344cbb0544b1bee9fb7add731ce5b14c36f41429065d4c158e1ee0c","referenceScript":null,"value":{"2035da628491aba52c1ab37ff68124d40fd49c0e5f258992229f0da7":{"566f7563686572":1},"lovelace":2000000}},"99cf7222af2f656b00240b61fe25d2aa6f9e09966d71754a522f123f72f4d80d#1":{"address":"addr_test1vrrgz02rthldruxkmkn3a0uh9pddzg22dh7n4086fd6q4vgeumv4j","datum":null,"datumhash":null,"inlineDatum":null,"referenceScript":null,"value":{"lovelace":500000000}}}},"postTxError":{"failureReason":"HardForkApplyTxErrFromEra S (S (S (S (S (Z (WrapApplyTxErr {unwrapApplyTxErr = ApplyTxError [UtxowFailure (FromAlonzoUtxowFail (NonOutputSupplimentaryDatums (fromList [SafeHash \"94823cbee8d16b5154da8f6e57cbd6a9de165ffef1299fa8520d3ca6acd36da5\"]) (fromList []))),UtxowFailure (FromAlonzoUtxowFail (ExtraRedeemers [RdmrPtr Spend 0])),UtxowFailure (UtxoFailure (FromAlonzoUtxoFail (UtxosFailure (CollectErrors [BadTranslation (TranslationLogicMissingInput 
(TxIn (TxId {_unTxId = SafeHash \"0b7ada4617bf2a7180f9eac4ea60a7a95f5c90c853af8bb309f37b4d2a2092c7\"}) (TxIx 0)))])))),UtxowFailure (UtxoFailure (FromAlonzoUtxoFail (ValueNotConservedUTxO (Value 96471200 (fromList [(PolicyID {policyID = ScriptHash \"6b4a78a641be2867925749129f97f8a5965631b9c80a222f82188355\"},fromList [(14b045cc21d4316f6ac896a438ac194fc19573619181c6a44a540215,-1),(4879647261486561645631,-1),(c6813d435dfed1f0d6dda71ebf97285ad1214a6dfd3abcfa4b740ab1,-1),(ddde64bb6ae682b34b69ad651375b6b23abfa9860d46329f4444fd31,-1)])])) (Value 856471200 (fromList [(PolicyID {policyID = ScriptHash \"2035da628491aba52c1ab37ff68124d40fd49c0e5f258992229f0da7\"},fromList [(566f7563686572,1)])]))))),UtxowFailure (UtxoFailure (FromAlonzoUtxoFail (BadInputsUTxO (fromList [TxIn (TxId {_unTxId = SafeHash \"0b7ada4617bf2a7180f9eac4ea60a7a95f5c90c853af8bb309f37b4d2a2092c7\"}) (TxIx 0)]))))]}))))))","tag":"FailedToPostTx"},"tag":"PostTxOnChainFailed"},"tag":"ClientEffect"},"tag":"EndEffect"},"tag":"Node"}}
{"timestamp":"2023-04-20T23:33:44.271819339Z","threadId":75,"namespace":"HydraNode-\"1\"","message":{"node":{"by":{"vkey":"b37aabd81024c043f53a069c91e51a5b52e4ea399ae17ee1fe3cb9c44db707eb"},"event":{"postChainTx":{"contestationDeadline":"2023-04-20T23:33:44Z","tag":"FanoutTx","utxo":{"8e8bf7795d7c3ed786758032085e14fe3127ffff302b519462984e0247d7f416#0":{"address":"addr_test1vq2tq3wvy82rzmm2ezt2gw9vr98ur9tnvxgcr34yff2qy9g8ny6m3","datum":null,"datumhash":null,"inlineDatum":null,"referenceScript":null,"value":{"lovelace":250000000}},"99cf7222af2f656b00240b61fe25d2aa6f9e09966d71754a522f123f72f4d80d#0":{"address":"addr_test1wzmgf6tetqh88fgxwj088v222wkcxkvgchngs60xuzx2g2grsaafx","datum":null,"inlineDatum":{"constructor":0,"fields":[{"constructor":0,"fields":[{"list":[]},{"constructor":0,"fields":[{"constructor":0,"fields":[{"bytes":"1052386136b347f3bb7c67fe3f2ee4ef120e1836e5d2707bb068afa6"},{"int":8000000000000}]}]}]},{"constructor":0,"fields":[{"bytes":"2035da628491aba52c1ab37ff68124d40fd49c0e5f258992229f0da7"}]}]},"inlineDatumhash":"cfc8aa1b3344cbb0544b1bee9fb7add731ce5b14c36f41429065d4c158e1ee0c","referenceScript":null,"value":{"2035da628491aba52c1ab37ff68124d40fd49c0e5f258992229f0da7":{"566f7563686572":1},"lovelace":2000000}},"99cf7222af2f656b00240b61fe25d2aa6f9e09966d71754a522f123f72f4d80d#1":{"address":"addr_test1vrrgz02rthldruxkmkn3a0uh9pddzg22dh7n4086fd6q4vgeumv4j","datum":null,"datumhash":null,"inlineDatum":null,"referenceScript":null,"value":{"lovelace":500000000}}}},"postTxError":{"failureReason":"HardForkApplyTxErrFromEra S (S (S (S (S (Z (WrapApplyTxErr {unwrapApplyTxErr = ApplyTxError [UtxowFailure (FromAlonzoUtxowFail (NonOutputSupplimentaryDatums (fromList [SafeHash \"94823cbee8d16b5154da8f6e57cbd6a9de165ffef1299fa8520d3ca6acd36da5\"]) (fromList []))),UtxowFailure (FromAlonzoUtxowFail (ExtraRedeemers [RdmrPtr Spend 0])),UtxowFailure (UtxoFailure (FromAlonzoUtxoFail (UtxosFailure (CollectErrors [BadTranslation (TranslationLogicMissingInput (TxIn (TxId 
{_unTxId = SafeHash \"0b7ada4617bf2a7180f9eac4ea60a7a95f5c90c853af8bb309f37b4d2a2092c7\"}) (TxIx 0)))])))),UtxowFailure (UtxoFailure (FromAlonzoUtxoFail (ValueNotConservedUTxO (Value 96471200 (fromList [(PolicyID {policyID = ScriptHash \"6b4a78a641be2867925749129f97f8a5965631b9c80a222f82188355\"},fromList [(14b045cc21d4316f6ac896a438ac194fc19573619181c6a44a540215,-1),(4879647261486561645631,-1),(c6813d435dfed1f0d6dda71ebf97285ad1214a6dfd3abcfa4b740ab1,-1),(ddde64bb6ae682b34b69ad651375b6b23abfa9860d46329f4444fd31,-1)])])) (Value 856471200 (fromList [(PolicyID {policyID = ScriptHash \"2035da628491aba52c1ab37ff68124d40fd49c0e5f258992229f0da7\"},fromList [(566f7563686572,1)])]))))),UtxowFailure (UtxoFailure (FromAlonzoUtxoFail (BadInputsUTxO (fromList [TxIn (TxId {_unTxId = SafeHash \"0b7ada4617bf2a7180f9eac4ea60a7a95f5c90c853af8bb309f37b4d2a2092c7\"}) (TxIx 0)]))))]}))))))","tag":"FailedToPostTx"},"tag":"PostTxError"},"tag":"EndEvent"},"tag":"Node"}}
{"timestamp":"2023-04-20T23:33:44.272494594Z","threadId":257,"namespace":"HydraNode-\"1\"","message":{"api":{"sentOutput":{"postChainTx":{"contestationDeadline":"2023-04-20T23:33:44Z","tag":"FanoutTx","utxo":{"8e8bf7795d7c3ed786758032085e14fe3127ffff302b519462984e0247d7f416#0":{"address":"addr_test1vq2tq3wvy82rzmm2ezt2gw9vr98ur9tnvxgcr34yff2qy9g8ny6m3","datum":null,"datumhash":null,"inlineDatum":null,"referenceScript":null,"value":{"lovelace":250000000}},"99cf7222af2f656b00240b61fe25d2aa6f9e09966d71754a522f123f72f4d80d#0":{"address":"addr_test1wzmgf6tetqh88fgxwj088v222wkcxkvgchngs60xuzx2g2grsaafx","datum":null,"inlineDatum":{"constructor":0,"fields":[{"constructor":0,"fields":[{"list":[]},{"constructor":0,"fields":[{"constructor":0,"fields":[{"bytes":"1052386136b347f3bb7c67fe3f2ee4ef120e1836e5d2707bb068afa6"},{"int":8000000000000}]}]}]},{"constructor":0,"fields":[{"bytes":"2035da628491aba52c1ab37ff68124d40fd49c0e5f258992229f0da7"}]}]},"inlineDatumhash":"cfc8aa1b3344cbb0544b1bee9fb7add731ce5b14c36f41429065d4c158e1ee0c","referenceScript":null,"value":{"2035da628491aba52c1ab37ff68124d40fd49c0e5f258992229f0da7":{"566f7563686572":1},"lovelace":2000000}},"99cf7222af2f656b00240b61fe25d2aa6f9e09966d71754a522f123f72f4d80d#1":{"address":"addr_test1vrrgz02rthldruxkmkn3a0uh9pddzg22dh7n4086fd6q4vgeumv4j","datum":null,"datumhash":null,"inlineDatum":null,"referenceScript":null,"value":{"lovelace":500000000}}}},"postTxError":{"failureReason":"HardForkApplyTxErrFromEra S (S (S (S (S (Z (WrapApplyTxErr {unwrapApplyTxErr = ApplyTxError [UtxowFailure (FromAlonzoUtxowFail (NonOutputSupplimentaryDatums (fromList [SafeHash \"94823cbee8d16b5154da8f6e57cbd6a9de165ffef1299fa8520d3ca6acd36da5\"]) (fromList []))),UtxowFailure (FromAlonzoUtxowFail (ExtraRedeemers [RdmrPtr Spend 0])),UtxowFailure (UtxoFailure (FromAlonzoUtxoFail (UtxosFailure (CollectErrors [BadTranslation (TranslationLogicMissingInput (TxIn (TxId {_unTxId = SafeHash 
\"0b7ada4617bf2a7180f9eac4ea60a7a95f5c90c853af8bb309f37b4d2a2092c7\"}) (TxIx 0)))])))),UtxowFailure (UtxoFailure (FromAlonzoUtxoFail (ValueNotConservedUTxO (Value 96471200 (fromList [(PolicyID {policyID = ScriptHash \"6b4a78a641be2867925749129f97f8a5965631b9c80a222f82188355\"},fromList [(14b045cc21d4316f6ac896a438ac194fc19573619181c6a44a540215,-1),(4879647261486561645631,-1),(c6813d435dfed1f0d6dda71ebf97285ad1214a6dfd3abcfa4b740ab1,-1),(ddde64bb6ae682b34b69ad651375b6b23abfa9860d46329f4444fd31,-1)])])) (Value 856471200 (fromList [(PolicyID {policyID = ScriptHash \"2035da628491aba52c1ab37ff68124d40fd49c0e5f258992229f0da7\"},fromList [(566f7563686572,1)])]))))),UtxowFailure (UtxoFailure (FromAlonzoUtxoFail (BadInputsUTxO (fromList [TxIn (TxId {_unTxId = SafeHash \"0b7ada4617bf2a7180f9eac4ea60a7a95f5c90c853af8bb309f37b4d2a2092c7\"}) (TxIx 0)]))))]}))))))","tag":"FailedToPostTx"},"seq":14,"tag":"PostTxOnChainFailed","timestamp":"2023-04-20T23:33:44.271502914Z"},"tag":"APIOutputSent"},"tag":"APIServer"}}
{"timestamp":"2023-04-20T23:33:44.272791175Z","threadId":261,"namespace":"HydraNode-\"1\"","message":{"api":{"sentOutput":{"postChainTx":{"contestationDeadline":"2023-04-20T23:33:44Z","tag":"FanoutTx","utxo":{"8e8bf7795d7c3ed786758032085e14fe3127ffff302b519462984e0247d7f416#0":{"address":"addr_test1vq2tq3wvy82rzmm2ezt2gw9vr98ur9tnvxgcr34yff2qy9g8ny6m3","datum":null,"datumhash":null,"inlineDatum":null,"referenceScript":null,"value":{"lovelace":250000000}},"99cf7222af2f656b00240b61fe25d2aa6f9e09966d71754a522f123f72f4d80d#0":{"address":"addr_test1wzmgf6tetqh88fgxwj088v222wkcxkvgchngs60xuzx2g2grsaafx","datum":null,"inlineDatum":{"constructor":0,"fields":[{"constructor":0,"fields":[{"list":[]},{"constructor":0,"fields":[{"constructor":0,"fields":[{"bytes":"1052386136b347f3bb7c67fe3f2ee4ef120e1836e5d2707bb068afa6"},{"int":8000000000000}]}]}]},{"constructor":0,"fields":[{"bytes":"2035da628491aba52c1ab37ff68124d40fd49c0e5f258992229f0da7"}]}]},"inlineDatumhash":"cfc8aa1b3344cbb0544b1bee9fb7add731ce5b14c36f41429065d4c158e1ee0c","referenceScript":null,"value":{"2035da628491aba52c1ab37ff68124d40fd49c0e5f258992229f0da7":{"566f7563686572":1},"lovelace":2000000}},"99cf7222af2f656b00240b61fe25d2aa6f9e09966d71754a522f123f72f4d80d#1":{"address":"addr_test1vrrgz02rthldruxkmkn3a0uh9pddzg22dh7n4086fd6q4vgeumv4j","datum":null,"datumhash":null,"inlineDatum":null,"referenceScript":null,"value":{"lovelace":500000000}}}},"postTxError":{"failureReason":"HardForkApplyTxErrFromEra S (S (S (S (S (Z (WrapApplyTxErr {unwrapApplyTxErr = ApplyTxError [UtxowFailure (FromAlonzoUtxowFail (NonOutputSupplimentaryDatums (fromList [SafeHash \"94823cbee8d16b5154da8f6e57cbd6a9de165ffef1299fa8520d3ca6acd36da5\"]) (fromList []))),UtxowFailure (FromAlonzoUtxowFail (ExtraRedeemers [RdmrPtr Spend 0])),UtxowFailure (UtxoFailure (FromAlonzoUtxoFail (UtxosFailure (CollectErrors [BadTranslation (TranslationLogicMissingInput (TxIn (TxId {_unTxId = SafeHash 
\"0b7ada4617bf2a7180f9eac4ea60a7a95f5c90c853af8bb309f37b4d2a2092c7\"}) (TxIx 0)))])))),UtxowFailure (UtxoFailure (FromAlonzoUtxoFail (ValueNotConservedUTxO (Value 96471200 (fromList [(PolicyID {policyID = ScriptHash \"6b4a78a641be2867925749129f97f8a5965631b9c80a222f82188355\"},fromList [(14b045cc21d4316f6ac896a438ac194fc19573619181c6a44a540215,-1),(4879647261486561645631,-1),(c6813d435dfed1f0d6dda71ebf97285ad1214a6dfd3abcfa4b740ab1,-1),(ddde64bb6ae682b34b69ad651375b6b23abfa9860d46329f4444fd31,-1)])])) (Value 856471200 (fromList [(PolicyID {policyID = ScriptHash \"2035da628491aba52c1ab37ff68124d40fd49c0e5f258992229f0da7\"},fromList [(566f7563686572,1)])]))))),UtxowFailure (UtxoFailure (FromAlonzoUtxoFail (BadInputsUTxO (fromList [TxIn (TxId {_unTxId = SafeHash \"0b7ada4617bf2a7180f9eac4ea60a7a95f5c90c853af8bb309f37b4d2a2092c7\"}) (TxIx 0)]))))]}))))))","tag":"FailedToPostTx"},"seq":14,"tag":"PostTxOnChainFailed","timestamp":"2023-04-20T23:33:44.271502914Z"},"tag":"APIOutputSent"},"tag":"APIServer"}}
C.I. fixed (back from docker to nix 🙄)
Demo video produced.
Public testnet tests in progress, should work smoothly.
Setting no lower bound and slot 1 upper bound solved their former issue.
They are requesting the capability to commit multiple UTxOs. Without that, they rely on the following workaround:
- One delegate commits the stuff to bid on
- Another delegate commits double the collateral, then splits it on layer 2 and shares it with the first.
This workaround is a bit problematic: the delegates should be independent of one another. External commits could help and are at an experimental stage in #774.
They’ll evaluate that PR instead, so that they can drop the workaround.
- Now that we implemented the 'Hydra.API.Projection' and added 'HeadStatus' to the 'Greetings' message, there is a user request to also add the snapshot UTxO to the same message.
- Starting with the test as usual: I created a failing test, then did the most minimal thing to make it pass, and then rinse and repeat. Add more to the test so we are actually testing what we want and then work gradually on the implementation until we are happy. Nice mantra!
- The initial test, where we just check that the field `snapshotUtxo` is present in the `Greetings` message, is passing now.
- I want to go further and say that if the API server emits a `SnapshotConfirmed` message, our `snapshotUtxo` should reflect it and contain the same UTxO set.
- Connecting the dots: introducing a new projection in the `Server` module and connecting the get and update functions. One concern: how to pass these projections (and there can be many) around easily and not as separate arguments?
- All works after plugging in the new projection for `snapshotUtxo`. To ensure it all works with persistence too, I added another test.
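As a rough sketch of the projection idea and of one possible answer to the "how to pass many projections around" concern: a projection folds events into a read model, and several projections can be bundled in one record. This is a simplified illustration, not the actual `Hydra.API.Projection` module; the types `UTxO`, `HeadStatus`, `ServerOutput` and `Greetings` below are reduced stand-ins and `Projections` is a hypothetical name.

```haskell
import Control.Concurrent.STM (STM, atomically, modifyTVar', newTVarIO, readTVar)

-- A projection keeps a read model up to date by folding events into it.
data Projection event model = Projection
  { getLatest :: STM model
  , update :: event -> STM ()
  }

mkProjection :: model -> (model -> event -> model) -> IO (Projection event model)
mkProjection start project = do
  var <- newTVarIO start
  pure Projection
    { getLatest = readTVar var
    , update = \event -> modifyTVar' var (`project` event)
    }

-- Simplified stand-ins for the real API types.
type UTxO = [String]
data HeadStatus = Idle | Open deriving (Eq, Show)
data ServerOutput = HeadIsOpen | SnapshotConfirmed UTxO | OtherOutput

projectHeadStatus :: HeadStatus -> ServerOutput -> HeadStatus
projectHeadStatus _ HeadIsOpen = Open
projectHeadStatus status _ = status

projectSnapshotUtxo :: Maybe UTxO -> ServerOutput -> Maybe UTxO
projectSnapshotUtxo _ (SnapshotConfirmed utxo) = Just utxo
projectSnapshotUtxo latest _ = latest

-- Hypothetical bundle: one argument instead of one per projection.
data Projections = Projections
  { headStatusP :: Projection ServerOutput HeadStatus
  , snapshotUtxoP :: Projection ServerOutput (Maybe UTxO)
  }

mkProjections :: IO Projections
mkProjections =
  Projections
    <$> mkProjection Idle projectHeadStatus
    <*> mkProjection Nothing projectSnapshotUtxo

-- Feed one server output to every projection in the bundle.
updateAll :: Projections -> ServerOutput -> STM ()
updateAll ps out = update (headStatusP ps) out >> update (snapshotUtxoP ps) out

data Greetings = Greetings
  { headStatus :: HeadStatus
  , snapshotUtxo :: Maybe UTxO
  }
  deriving (Eq, Show)

-- Build the greeting from the latest value of each projection.
greetings :: Projections -> STM Greetings
greetings ps = Greetings <$> getLatest (headStatusP ps) <*> getLatest (snapshotUtxoP ps)

main :: IO ()
main = do
  ps <- mkProjections
  atomically $ mapM_ (updateAll ps) [HeadIsOpen, SnapshotConfirmed ["txin#0"]]
  atomically (greetings ps) >>= print
```

Adding another projected field then means adding one record field and one fold function, while the server keeps receiving a single `Projections` value.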
- Adding the spec to the repo: the commit history is not so interesting, apart from the last bit. Let’s start by dumping `master` completely and do a test PR that unmarks things from #777 on top of this.
- Adding a `nix build .#spec` was a bit finicky, but with some templates it was okay to do. Currently this pulls in a big texlive distribution though. Maybe we can make this smaller if the CI builds are getting too slow.
- As we have the spec buildable now, let’s integrate it into the CI
Removing the rollback logic is concerning, so we want to exercise at least the close in the model spec. This was not exercised before.
When introducing it in f7b286, we discovered that the code does not support closing a head in the presence of a rollback. So we need to figure out what to do about that:
- block 1: initTx -- state initializing
- block 2: commit
- block 3: collectCom -- state open
- rollback -- state initializing
- post closeTx -- FAILURE
This confirms that you cannot close a head safely if it has been rolled back to before the open point.
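The scenario above can be replayed with a toy state machine. This is only a sketch under simplifying assumptions (one state per block, a rollback simply drops the newest states); the names `HeadState`, `ChainEvent`, `observe`, `apply` and `rollback` are illustrative and not taken from the Hydra code base.

```haskell
data HeadState = Idle | Initializing | Open | Closed
  deriving (Eq, Show)

data ChainEvent = InitTx | CommitTx | CollectComTx | CloseTx
  deriving (Eq, Show)

-- Which transaction can be applied in which head state.
observe :: HeadState -> ChainEvent -> Either String HeadState
observe Idle InitTx = Right Initializing
observe Initializing CommitTx = Right Initializing
observe Initializing CollectComTx = Right Open
observe Open CloseTx = Right Closed
observe st ev = Left ("cannot apply " <> show ev <> " in state " <> show st)

-- Head states after each block, newest first; the initial state stays last.
type History = [HeadState]

apply :: History -> ChainEvent -> Either String History
apply [] _ = Left "empty history"
apply hist@(st : _) ev = (: hist) <$> observe st ev

-- Roll back the given number of blocks, never dropping the initial state.
rollback :: Int -> History -> History
rollback n hist = drop (min n (length hist - 1)) hist

scenario :: Either String History
scenario = do
  h1 <- apply [Idle] InitTx -- block 1: state initializing
  h2 <- apply h1 CommitTx -- block 2: commit
  h3 <- apply h2 CollectComTx -- block 3: state open
  let h4 = rollback 1 h3 -- rollback: state initializing again
  apply h4 CloseTx -- posting the close now fails

main :: IO ()
main = print scenario
```

In this toy model `scenario` ends in a `Left`, because `CloseTx` is not applicable once the rollback has put the head back into `Initializing`.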
We do not handle rollbacks correctly:
- a client would not know what to do with a rollback event because there's pretty much no information in it
- we do not handle these events internally correctly either
https://github.com/input-output-hk/hydra/issues/185 is about dealing appropriately with the rollback.
With https://github.com/input-output-hk/hydra/issues/185 we will be able to really solve the issues in an appropriate way for both the client using a hydra-node and the internals of our system. In the meantime, d0b8df6 removes the rollback logic from the Hydra head so that:
- we do not send rollback events to the client
- we bet that the seen transactions will be seen again in the new fork of the chain
TODO:
- remove rollback and forward function from BehaviorSpec as it's not used
- add the chainState to the rollback event so that we remove all the rollback logic from HeadLogic
Open a head
- We want to use a commit BEFORE commit 6a49fc0
- Update your node with image `ghcr.io/input-output-hk/hydra-node@sha256:aade79384db54674c4f534265d78be31e3d2336c3d5efae3071ea4016939881d`
- Slack thread in progress with the hydra script tx id also
-
Status on commits vs. rollbacks
- Debugged the MockChain / Model / ModelSpec to see why we have timeouts and stuff
- Realized that the “waiting code” in the Model thought it had already seen the initialized, open, etc. states and continued “performing”. However, if a rollback happens, one should NOT look at the whole history but only ahead when waiting for a specific state, e.g. wait for the Head to “re-initialize” if there was a rollback after commit
- Changed the model to wait after perform, e.g. wait for Commit after performing a commit to increase atomicity AND wait by only looking at next messages (not into the history)
- Goes well until we have a rollback “past open”, because this throws away the L2 state (which UTxO were already spent) and the following NewTx actions would fail.
- This is a problem: we would need to extend the model-based test significantly to react correctly to rollbacks during execution, which is not “in scope” right now even in the production code. Only something like Rollbacks II would allow that.
- Next step is a bit unclear. Either:
- Make rollbacks explicit in Model and do not generate them in an open head
- Try to make the BehaviorSpec incorporate more of the system (like the MockChain) and write unit tests instead
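The difference between "looking into the history" and "only looking at next messages" can be sketched with a cursor over the received outputs. This is a pure toy model, not the Model / ModelSpec code; `Output`, `sawInHistory` and `waitForNext` are hypothetical names.

```haskell
data Output = HeadIsInitializing | Committed | HeadIsOpen | RolledBack
  deriving (Eq, Show)

-- Problematic check: searches everything ever received, so an output seen
-- before a rollback still counts as "seen".
sawInHistory :: Output -> [Output] -> Bool
sawInHistory = elem

-- Only inspect outputs received after the given cursor position. Returns the
-- new cursor when the expected output is found, Nothing if we must keep waiting.
waitForNext :: Output -> Int -> [Output] -> Maybe Int
waitForNext expected cursor outputs =
  case break (== expected) (drop cursor outputs) of
    (_, []) -> Nothing
    (skipped, _ : _) -> Just (cursor + length skipped + 1)

main :: IO ()
main = do
  let received = [HeadIsInitializing, Committed, RolledBack]
      cursor = length received -- everything so far has been consumed
  print (sawInHistory Committed received) -- wrongly satisfied by the old commit
  print (waitForNext Committed cursor received) -- still waiting
  print (waitForNext Committed cursor (received <> [Committed])) -- found after re-observation
```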
-
Port issue sharing knowledge
- Flaky tests in CI
- Improving logging was the first and most important step!
- Added a sanity test case to check how re-used ports behave
- Picked a port selection approach that also works across processes (letting the operating system find a free port by binding to port number 0)
- Still had red tests sometimes, but the traces already indicated that it was NOT related to binding to ports!
- Adding a threadDelay before calling the action in withAPIServer helps!
- Hypothesis: Race condition when starting servers!?
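For reference, asking the operating system for a free port by binding to port 0 can look like the sketch below. It assumes the `network` package is available and is not the helper used in the Hydra test suite; `findFreePort` is a hypothetical name. Note that the port is released again before it is returned, so another process could grab it in between.

```haskell
import Control.Exception (bracket)
import Network.Socket

-- Bind to port 0 on localhost so that the operating system picks a free port,
-- read back the chosen port number and close the socket again.
findFreePort :: IO PortNumber
findFreePort =
  bracket (socket AF_INET Stream defaultProtocol) close $ \sock -> do
    bind sock (SockAddrInet 0 (tupleToHostAddress (127, 0, 0, 1)))
    socketPort sock

main :: IO ()
main = do
  port <- findFreePort
  putStrLn ("free port: " <> show port)
```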
- Latest unstable image has issues finding the vInitial script.
  - That is because we have just merged a breaking change on `hydra-plutus` to master
  - We should not use unreleased versions of scripts on mainnet
  - For opening the head
    - We can use the newest master (== unstable) to open a head on preview/preprod
    - If we want to do mainnet, we should pick a version prior to 6a49fc0
- We got a request from a client to include the hydra-node state in the `Greetings` server output, or alternatively to provide a way for a Hydra client to know which state the hydra-node is currently in.
- The solution requires drafting a new ADR to make sure everything is accounted for and we have a clear plan on what we want to do and support in the future.
- But there is also a `quick-and-dirty` solution @ch1b0 suggested here, and this is what I'll work on for a start.
- The plan is to keep the hydra-node state in a TVar, read it from the API layer and extend whatever `ServerOutput` we need to include the wanted state.
- Let's start by introducing a test case first.
- Discussed ouroboros-network and what it was like to port it to Golang
- Lack of API documentation and of standardization (before implementation)
- Now there are CDDL specs of the ouroboros network protocol messages: https://github.com/input-output-hk/ouroboros-network/tree/master/ouroboros-network-protocols/test-cddl/specs
- What would be desirable (thinking about Hydra network protocol):
- CDDL at least if using CBOR
- gRPC? or at least Protobufs?
- Some are also building on a gRPC interface to the cardano-node (dolos)
- Checking ST / PT present in input or output?
- What is more intuitive? .. performant?
- Discussed & decided https://github.com/input-output-hk/hydra/pull/777/files#r1171123423
- Consider also running the existing tests in `KupoSpec` for Hydra where it makes sense.
- Setting normal env variables in an `.envrc` which already has `use flake` seems not to work.
- Should add `hpack-0.35.0` to the kupo shell
- Hash the entire snapshot to come up with a block header hash equivalent.
- Get the transactions into the chain sync.. matthias likely continues on this. Could use the cbor output.
- Checkpoint is used for intersection to continue sync. Can skip history just using this. Use the `seq` or `snapshotNumber`?
  - This goes a bit against "not keeping history" in Hydra.
  - Maybe just always start from latest state which the `hydra-node` already provides?
- Maybe make the response about no metadata (FetchBlock) for better UX
We've identified the problem when rollback happens in presence of commits.
We've managed to build a model test that reproduces the issue.
The problem is due to concurrent access to the chain state. In the following picture, we can see that when the event has been processed by the head logic, the previous chain state will be `chainState 1` instead of `chainState 0`, making a sane rollback impossible:
Now that we've discovered all that, we're implementing the following strategy where we duplicate the chain state. The chain component will make its state evolve independently of the head state. The head state will store the last chain state it has seen so that, when we restart the node, it knows where to restart the chain component from:
For some reason the ModelSpec fails. A participant tries to commit while the head is still Idle, which makes no sense. It's a bit complicated to spot the reason for this. Trying to find the commit which introduced the issue is complicated because the C.I. is barely green on this branch :(
First, we think the other nodes don't see the transactions.
Then, we think the other nodes do see the transaction but don't observe it.
The problem came from the fact that all nodes were sharing the same chainState. It explains why only one node would observe an init transaction (kudos to SN):
- node A observes the init transaction and makes the state evolve
- node B sees the init transaction but does not observe it because the state makes it wait for a commit instead.
Moving the chainState creation a few lines above solves the problem.
We should extract the node initialization function so that we avoid that kind of trap in the future. Right now, when we are looking at this line of code it is not that obvious that we are actually in another function than the surrounding `mockChainAndNetwork` function.
It's been a debugging week about how to make L1 stuff work in L2:
- Commits
- Collateral
- CommandFailed was not very informative
SN suggests the team should open an issue but it’s not always clear to them if it’s a problem worth raising an issue or if it’s just them doing something wrong… This is valuable feedback, I guess: Hydra is still complicated to use.
We discuss this fueled UTxO stuff:
- coin selection could be better but here we are explicit and do not accidentally use a UTxO that was supposed to be used for something else
- look at how other wallets deal with Ada only UTxO?
The team is asking that because they happened to have committed the fuel into the head accidentally.
- After refactoring the chain sync to use cardano-api types, the `onRollForward` callback with `Block Era` is harder to use than it should be when wanting to create blocks in `MockChain`. Moving to a `onRollForward :: BlockHeader -> [Tx] -> m ()` callback should make our lives easier.
- On the model based testing suite, the default implementation of `HasVariables` is problematic as it percolates `Generic` constraints everywhere while it is not useful at all?
- JSON regression: Script envelope type `SimpleScriptV2` is not supported anymore, it’s just `SimpleScript` now.
- On `hydra-node` the model tests don’t pass. Maybe it’s something wrong in the chain layer which only appears when using the MockChain and not within the HandlerSpec. I’ll revisit it after the e2e tests.
- Model/Spec is annoying to debug, but it seems that the `HeadIsInitializing` server output is not produced.
- The time conversion fails. Likely because we generate nonsense slots or the arbitrary time handle cannot convert the slot to time. Also weird.. the exception does not kill the test.
  - Using `link` after spawning threads with `async` is required to make the model based tests fail if the node fails.
- Finally, after making the tx-cost benchmark compile & run .. I can see that all execution units are 0 here as well.
- After fixing the cost model problems I could finally assess whether we get any size improvements from newer `plutus-tx` versions on the `fix-scripts` work. On `master` we have `4621 + 2422 + 8954 + 4458` total script size. After updating dependencies with the same scripts we have `4686 + 2513 + 8967 + 4537` .. so a slight regression even. The script fixes come in at `4727 + 2513 + 9492 + 4537` .. way too big.
- MPJ is surprised the scripts even got bigger between 1.0.0.0 and 1.1.1.0 versions of plutus.
- Using a custom script context I could remove 70 bytes again from the commit script.
- Not using the `POSIXTimeRange` constructor in the custom script context removed ~300 bytes.
- Will move forward with using a custom script context on commit & initial to unblock the statemachine fixes branch.
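To put numbers on the size comparison above, a small Python tally of the per-script sizes quoted in this entry. The dictionary keys are just labels chosen here; only the byte counts come from the log.

```python
# Script sizes in bytes, copied from the entry above (four scripts each).
sizes = {
    "master": [4621, 2422, 8954, 4458],
    "updated dependencies": [4686, 2513, 8967, 4537],
    "script fixes": [4727, 2513, 9492, 4537],
}

totals = {name: sum(values) for name, values in sizes.items()}
baseline = totals["master"]

for name, total in totals.items():
    # Show each total and its difference to master.
    print(f"{name}: {total} bytes ({total - baseline:+d} vs master)")
```

The totals come out at 20455, 20703 and 21269 bytes, so the dependency update alone costs 248 bytes and the script fixes add another 566 on top.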
- Enabling the haskell.nix shell to get hoogle and hls for easier error fixing (alternating shells if need be)
- Updating haskell.nix (for hackage index) and CHaP as flake inputs was simple
nix flake lock --update-input haskellNix
nix flake lock --update-input CHaP
- What followed was a 900MB download of the hackage index, and several packages being rebuilt (as expected) - `nix-output-monitor` was great to see what’s going on.
- Building hls 1.8.0.0 seems not to work via the haskell.nix `tools` way. I checked what `plutus` does and they are building it “from source” using haskell.nix.
- Start work on updating hydra-plutus
- Is using the `SerialisedScript` the right interface at the `Hydra.Contract.XXX` module level?
- Surprised that the `scriptOutputsAt` and `valueLockedBy` helper functions got removed in `plutus-ledger-api` and are not in `plutus-apps`
- When making the `hydra-node` package compile, the `tx-cost` benchmark is a good starting point. This will avoid making the tests and other benchmarks compile, but already can be used to see whether our transactions validate.
- The ledger-interfacing functions are a challenge.. somehow `PParams` is missing a `HasField` instance for `_costmdls` now
- The nodeToClient protocol somehow got updated and I cannot understand our direct chain connection. Something related to node to client versions and now block protocol versions. Should move to using the cardano-api IPC module and let them figure it out.
- The `Hydra.Chain.Direct.Wallet` was also annoying to update as it is still using the ledger types.
- Updating dependencies workflow
- Use a cabalOnly shell .. it removes one variable from the equation (nix)
- Disable most packages in cabal.project and start with one with fewest dependencies (e.g. hydra-cardano-api)
- Bump index state and/or source-repository-package in cabal.project
- Cardano-node is a good starting point for determining constraints or allow-newer entries
- Quite some surprises in latest updates to ledger and cardano-api:
  - `ValidatedTx` is now called `AlonzoTx` .. also for `Babbage` era
  - `HashedScriptData` is a new original-bytes-carrying indirection on the cardano-api `ScriptData` -> annoying and not clear when to use what?
- After asking the `ledger` team: one should be using the `Core.Tx` et al type families and lenses to introspect/create ledger data structures going forward. They are working on a `cardano-ledger-api` package. For now, I’ll try to stick with the generic `Core` data types where possible.. no lenses in `hydra-cardano-api` just yet.
- Finally `hydra-cardano-api` compiles.. `hydra-test-utils` is the next in line.
- I’m a bit surprised by the amount of dependencies which were built after adding `hydra-test-utils`?
- The `Test.Plutus.Validator` module in `hydra-test-utils` is a bit odd as it has many overlaps with `Plutus.Extras` in `hydra-plutus` and `Hydra.Ledger.Cardano.Evaluate` in `hydra-node`. There must be a better way to do this.
- `hydra-test-utils` was a bit of a pain in the neck.. but the new `cardano-api` functions for evaluating scripts appear to be simpler!
- Many dependencies in `plutus-merkle-tree` et al could be cleaned up!
- Benchmarks are easy to overlook .. never forget to `cabal bench`!
- Something is wrong with running validators. Is it the refactor in `hydra-test-utils` or is it something in the dependencies? Need to investigate…
-
Plutonomy ? Would we be fine with it?
- Linked to https://github.com/input-output-hk/hydra/pull/777
- Optimisations would require heavy refactoring, what could we get by plugging-in plutonomy?
- Why do ETE tests need longer timeouts? Plutonomy might increase compilation time which impacts test execution
- Plutonomy is an optimiser, so something you apply as a last resort, the last step in your toolchain. Using Plutonomy might just be sweeping our problems under the rug?
- Some stuff will be upstreamed into Plutus-core in the "short term", we might just get some benefits by upgrading our compiler toolchain, let's first try bumping GHC/toolchain?
- FT
- What should we do with the draft opened PRs?
- Are they assumed to be continued by the owner?
- What is the plan for versioned docs?
- Can we add a REST api to the server?
- The code is already prepared to be extended in this direction but currently there is no need for this.
- Any weekly learning topic? This idea is still under development.
- FYI some hydra comms to come: Cardano360 recording, SPO call and next week a twitter space
- Capturing user feedback: we get a lot of feedback and questions from users these days, let's not lose them
- Use the questions to populate the F.A.Q
- Checking priorities. Where are we with 0.10.0? How much user feedback do we want to include in this next release?
- Remove #696 from 0.10.0
- We have 13 draft prs in the project right now
- Let’s be reasonable and not have too many draft PRs in the repo
- two new opened issues:
- https://github.com/input-output-hk/hydra/issues/813
-
https://github.com/input-output-hk/hydra/issues/812
- SB will ask how to reproduce it
-
Documentation generation is a bit messy
- First, publishing did not work, had to fix it
- Then, the monthly review was not published (versioned but we want it to always be fresh)
- FYI there is a release branch that is the current version of the site
- Red-bin item has been created to keep track of it
- TL;DR Our document pipeline is a bit complicated
-
Schedule a node revival?
- PG will fix his node this afternoon so to be ready tomorrow morning to open a new head
-
We want to continue exploring work started this morning in branch ensemble/client-side-utxo-configuration
-
We've made it possible to configure the format of transactions to be either cbor or json, but some parts of the API do not use this parameter right now... we should fix this
-
Race condition issue with the chainState
- let's take some time tomorrow morning to share the discovery about some race conditions that we have in handling the chainState
- Where are we at?
- Overleaf cancelled on 9-th of April (only Seb and one more collaborator will be possible)
- We used overleaf for our internal audit also
- For external audit nothing is scheduled
- Spec will be on github
- Overall the spec is quite stable
- Matthias did some work on spec (securities part and messages between the nodes)
- We'll take the changes as is and call it done
- Document still has some red marked text (eg. Sign with eta-0) and one blue part about the minting policy
- In the on-chain protocol section and then in the Init tx part there is an explanation about the minting policy
- Blue changes are what we implemented already which differs from the google document about the full minting policy
- In burning case (using burn redeemer) all tokens need to have negative quantity
- In minting part we are counting all tokens with quantity 1 and expect them to be equal to number of parties + 1 for ST
- Sandro would like to make sure there is nothing else apart from what we want in the minting field (no extra tokens)
- In the interest of safety we should be explicit and worry about performance later
- Main things to check for:
- Check that all quantities of minting policy are in the mint fields and have quantity 1 (correct cardinality, n + 1 for ST)
- Check that ST needs to be in the head output (sent to the head validator)
- All PTs are sent to the initial validator (as an initial output)
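The mint-field checks discussed above can be sketched as follows, in Python rather than on-chain Plutus code. Here `minted` is assumed to be a mapping from token name to quantity for the head policy only, and the token names (`"ST"`, `"pt-alice"`, ...) are hypothetical. The output checks (ST in the head output, PTs in initial outputs) are not covered.

```python
def check_mint(minted, n_parties, st_name="ST"):
    """Minting case: exactly n + 1 tokens, each with quantity 1, one of them the ST."""
    all_ones = all(quantity == 1 for quantity in minted.values())
    right_count = len(minted) == n_parties + 1
    has_st = st_name in minted
    return all_ones and right_count and has_st


def check_burn(minted):
    """Burning case: every token in the mint field has a negative quantity."""
    return all(quantity < 0 for quantity in minted.values())


good = {"ST": 1, "pt-alice": 1, "pt-bob": 1}
with_extra_token = {**good, "extra": 1}
double_st = {"ST": 2, "pt-alice": 1, "pt-bob": 1}

mint_ok = check_mint(good, n_parties=2)
mint_extra = check_mint(with_extra_token, n_parties=2)
mint_double = check_mint(double_st, n_parties=2)
burn_ok = check_burn({"ST": -1, "pt-alice": -1, "pt-bob": -1})
burn_mixed = check_burn({"ST": -1, "pt-alice": 1})
```

Being explicit about the count is what rules out the extra token: quantity checks alone would accept it.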
- Many nix rebuilds in CI - cachix cache full?
- Using 5.9GB
- 44Gb bandwidth last month
- need more exploration
- Red tests on master
- SB is looking at it
- Flakiness in websocket tests reproduced locally
- mainnet issue (see #784) reproduced by introducing rollbacks in model spec:
- Still subtle race condition present with the chainState
- As a client, in case of a rollback, wtf do I do? 🙀
- What should be our recommendation to users on how to deal with rollbacks
- See #185
- Setup and schedule a design session for each item tomorrow
- SN
- Mainnet compatibility status - known issues documentation
- add a banner to the readme to warn users about the possible pitfalls. Also on our website. (next to installation, move it somewhere where it is more visible)
- TL;DR
- Open a head with people you trust (for now)
- You may lock your funds (for now)
- Reach out to hydra-comms?
- Open a head again? - Let’s do it!
- Save history?
-
SB
- Monthly report https://github.com/input-output-hk/hydra/pull/798/
-
AB
- Client API - follow up with a discussion with Matty (CF) - decide on what to do. Arnaud will put something on the agenda next week.
- Mock Chain - plutus-apps are doing something similar. Useful for a lot of projects.
-
PG
- FYI WIP board in Miro
- Merge!
  1. Rebase
  2. Wait an hour for the tests to run
  3. Come back three hours later
  4. Merge branch outdated, go to 1. -> postponed for Tuesday
- Yun meetings
- Last feedbacks - we are aware
- Next meetings - overleaf subscription canceled. What do we do next week? We don’t need to meet unless she finds something off in the spec.
Log exploration can be a bit complicated right now without some infrastructure. The best I've been able to do for now:
Let's say the logs are in hydra.log file:
jq -s . hydra.log > logs.json # builds an actual JSON array from the list of JSON objects in the logs
fx logs.json
I'm using fx
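For cases where neither `jq` nor `fx` is at hand, the same slurp-then-filter idea can be sketched in Python. This assumes one JSON object per line, and the sample entries and their field names (`tag`, `n`) are made up, not actual hydra-node log fields.

```python
import json


def slurp_json_lines(text):
    """Parse one JSON object per non-empty line into a list (similar to `jq -s`)."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


# Made-up sample standing in for the contents of hydra.log.
sample = '{"tag": "A", "n": 1}\n{"tag": "B", "n": 2}\n\n{"tag": "A", "n": 3}\n'

entries = slurp_json_lines(sample)
only_a = [entry for entry in entries if entry["tag"] == "A"]
```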
-
SB
- Monthly report? Franco and me can work on it https://github.com/input-output-hk/hydra/pull/795
-
PG
- Engineering meeting? Cancelled
- One mockchain to rule them all? Not really tied to ModelSpec, perhaps it can be re-used? Would be good not to duplicate code.
- Mainnet runner 🙁 - let's continue when we find some time
-
Wanted to take a peek at why this PR is red and noticed a timeout error in the end-to-end tests
-
Sprinkled some traces only to realize this:
cardano-node: thread killed
-
So for some reason our cardano-node process is killed and there are no changes to this part of the code in the PR
-
I tried to catch the exception and got an `AsyncCancelled` exception.
-
Cardano-node logs don't reveal anything major but I do see a `DBClosed` log and I suspect this has to do with the database access somehow. Perhaps two threads trying to access a db at the same time?
-
Turns out I just need to increase the fail timeout. So it seems our tests take a bit longer to run and we should investigate.
Current status is mainly about technical issues from migrating to L2 and also upgrading to Hydra 0.9.0.
The team uses hydra-cluster a lot and have some suggestions on it. They understand we mainly use them for internal test but believe it could be useful for people to code over Hydra or even to deploy Hydra for testing. One feedback in this area is that it’s not possible to inject one’s protocol parameters or genesis block as it’s hard-coded in hydra cluster.
One of the developers was having a hard time compiling on Mac (Intel). They worked a bit with nix to cross-compile things but gave up and set up a development machine... we can feel that too.
Delegate server with API in progress.
They worked on a Monad to make things isomorphic between hydra and L1 and plan to share with Hydra. See if we want to include that in Hydra.
The writing of some architecture documentation on different modules is in progress. Some questions about plutus serialization of integer, just to know if we had any advice, which we don't.
-
I noticed we get an arithmetic underflow exception when running the smoke-tests with already preserved state.
-
The way forward is to sprinkle some traces around the code and try to figure out the exact place where this happens.
-
Seems like the function where this happens is `queryTip`. That one calls out to cardano-api, namely a function called `getLocalChainTip`, and it seems like something in there is throwing the arithmetic underflow. Needs further investigation.
We discuss the following issues:
- #735 Ability to read protocol parameters via websocket.
- #789 GetUTxO should provide ability to pass address as a filter
Asking for the protocol parameters to use Cardano-api `makeTransactionBodyAutoBalance` introduces other drawbacks and might not be needed as we have less logic to implement to balance transactions on L2 when the fees are 0. Although these two requests make us think: maybe we need to introduce a REST endpoint focused on request/response patterns and not put everything in the web socket.
If so, there's probably an ADR to specify and that would imply:
- move getUTxO to a REST API endpoint
- try to mimic what exists as much as possible, taking inspiration from Blockfrost for instance so that we don't re-invent yet another way of fetching Cardano data.
-
FT
- Do we want to keep the site artifact as part of the ci result? If so, should we add a scheduled action to clean artifacts? Artifacts are removed by github (probably after 90 days)
- Currently CI runs on schedule, should we remove it? Nuke it and we can add it back if needed
- Too many drafts: let’s start closing stuff
-
SB
- Distribute Sebastian’s tasks?
- PG will attend auction
- SB will check the researcher meeting
- We should do the monthly
-
PG
- What about next head? Stop checking head status and re-open a new one. Keep this one as a history.
- Stop checking head on preview - we already know it is broken.
-
https://github.com/input-output-hk/hydra/issues/735
- Let’s groom it today
- Hydra comm channel on discord? Yes, we all have access.
-
FT Discussion on how to fix the link checker for published docs - we have a red bin item for the link checking
-
SB Idea on websockets testing - Should send another message and ensure previous messages are not seen by the client
-
PG Rollback in modelspec - Wants to start to work on this item
-
Added test case to check for ignoring of historical messages: Strategy was to:
- output server messages before client connects
- connect client and make sure it sees the greeting message
- send one more message from the server
- assert this message is seen and historical messages are ignored
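The test strategy above, replayed against a tiny in-memory model in Python. This is not the websocket test itself: the `Server` class and the message strings are hypothetical and only capture the ordering we care about.

```python
class Server:
    """Toy server keeping a history and pushing messages to connected inboxes."""

    def __init__(self):
        self.history = []
        self.clients = []

    def send(self, message):
        self.history.append(message)
        for inbox in self.clients:
            inbox.append(message)

    def connect(self, replay_history):
        # A client skipping history starts with an empty inbox.
        inbox = list(self.history) if replay_history else []
        inbox.append("Greetings")
        self.clients.append(inbox)
        return inbox


server = Server()
server.send("old-1")  # output server messages before the client connects
server.send("old-2")
inbox = server.connect(replay_history=False)  # client sees the greeting
server.send("new")  # one more message from the server
```

After these steps the client inbox holds only the greeting and the new message, which is exactly what the assertion in the last step checks.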
-
Second part of the task is to display tx's in cbor format if client requests so.
-
Strategy is to just monkey-patch the output to find the transaction field and overwrite it.
-
We are doing this because picking appropriate json instance is currently hard.
The demo pages do not instruct the user to check out a stable version before running it. As a consequence, the user would run the demo on unstable software. We should instruct them to check out a release first. And we also align the demo scripts with that (see above).
Also, the demo scripts do not specify any docker image version, meaning they will always use the latest version. But master might have evolved in a way that the scripts could not run with the latest docker image but only with the unstable one (matching the latest master commit). We should explicitly set the `unstable` version for these scripts on master. We should also explicitly specify the image versions on released tags.
This would impact our release process to update these scripts accordingly.
-
Ok, seems like the issue is now pretty clear on what we need to do
-
Idea is that we want to use websocket path, using query params to specify if a client wants to skip history display or wants his txs to be served as cbor instead of json.
-
I was already looking at websockets Haskell package so I know this should not be too hard
-
After plugging things in - using the path to determine what a client wants - I created a function that overwrites the tx json field and replaces json with cbor.
-
More elegant solution is needed here but in the current state of code it is hard to come up with one.
-
Ideally we create a newtype wrapper around abstract `tx` and then pick the instance we want.
-
For now the `monkey-patch` solution will work just fine.
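A sketch of both halves of this approach in Python: reading the client's wishes from the websocket path, then overwriting the transaction field of an output. The query parameter names (`history`, `tx-output`), the `transaction` field name and the sample values are assumptions made for this illustration, not taken from the code.

```python
import json
from urllib.parse import parse_qs, urlsplit


def client_options(path):
    """Read client preferences from the query part of the request path."""
    query = parse_qs(urlsplit(path).query)
    return {
        "skip_history": query.get("history", ["yes"])[0] == "no",
        "tx_as_cbor": query.get("tx-output", ["json"])[0] == "cbor",
    }


def patch_transaction(output_json, cbor_hex):
    """Overwrite the "transaction" field (if present) with a CBOR hex string."""
    output = json.loads(output_json)
    if "transaction" in output:
        output["transaction"] = cbor_hex
    return json.dumps(output)


options = client_options("/?history=no&tx-output=cbor")
defaults = client_options("/")
patched = json.loads(
    patch_transaction('{"tag": "TxValid", "transaction": {"id": "abc"}}', "84a3")
)
untouched = json.loads(patch_transaction('{"tag": "Greetings"}', "84a3"))
```

Patching the already-encoded JSON sidesteps the instance selection problem, at the price of decoding and re-encoding each output.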
- https://github.com/input-output-hk/hydra/issues/380 Don't shy away from editing the issue body while an issue is in draft state?
We added some entries in github .env config file:
#> cat actions-runner/.env
LANG=C.UTF-8
PATH="/home/admin/.nix-profile/bin:/nix/var/nix/profiles/default/bin:/usr/local/bin:/usr/bin:/bin"
NIX_PROFILES="/nix/var/nix/profiles/default /home/admin/.nix-profile"
NIX_SSL_CERT_FILE="/etc/ssl/certs/ca-certificates.crt"
Solved the nix install action but then we have a problem with cachix. We just remove the cachix upload capability from the smoke test and the tests now just run smoothly.
Now we run the smoke test on mainnet so that we can sync our node but get this error:
hydra-cluster: /nix/store/xvjwbmcblwc0rrzv596ax1wc658by2pb-hydra-cluster-lib-hydra-cluster-0.10.0-data/share/ghc-8.10.7/x86_64-linux-ghc-8.10.7/hydra-cluster-0.10.0-INsc4XJNfsF2bUTsPLKlrI/config/cardano-configurations/network/mainnet/genesis/shelley.json: openBinaryFile: does not exist (No such file or directory)
71
We needed to add these files to the cabal setup.
Everything looks fine. Writing a brand new install procedure, testing it on a new server and submitting #775
I'm exploring the idea of having a self-hosted GitHub runner to save the cardano database so that we keep a decently synchronized cardano-node.
Things are quite smooth to add the runner and run the tests on it. By specifying an absolute path, we get the result we want. Cool.
I'm adding a concurrency constraint so that we avoid having two cardano-nodes accidentally sharing the same database at the same time. But, in a way, that's not mandatory since it appears that we cannot run more than one job at a time on a runner unless we add more runners on the runner (I know, I know). See https://github.com/orgs/community/discussions/26769#discussioncomment-3539575
A few things we need to do on the server:
- install git
- prepare /srv/var/cardano directory
- We should use an unprivileged user
For some reason, if the GitHub runner is run as a service and not from a tty, it will not detect that nix is installed and will try to install it every time (and fail). Adding the following to the environment does not help:
cat /etc/systemd/system/actions.runner.input-output-hk-hydra.to-remove.service.d/environment.conf
[Service]
Environment=NIX_PROFILES=/nix/var/nix/profiles/default /home/admin/.nix-profile
Environment=NIX_SSL_CERT_FILE=/etc/ssl/certs/ca-certificates.crt
Environment=PATH=/home/admin/.nix-profile/bin:/nix/var/nix/profiles/default/bin:/usr/local/bin:/usr/bin:/bin
-
Discuss about using docusaurus versioning feature to tackle
- We would need to keep two versions at all times
- We should build the versions we want and also the latest one and have a simple switcher between the versions
-
#735: is it really what the users need? Yes, do it Pascal!
- Exploring how to make `commitTx` take multiple UTxO for the auctions project
- Initial redeemer is a `[TxOutRef]` and Commit datum needs to hold a list now `[Commit]`
- I ran into a weird plutus-tx error message, likely caused by an introduced `foldMap` in vInitial:
tx-cost: Error: Reference to a name which is not a local, a builtin, or an external INLINABLE function: Variable GHC.Base.map
No unfolding
Context: Compiling definition of: GHC.Base.$fFunctor[]
Context: Compiling definition of: Hydra.Contract.Initial.checkCommit
Context: Compiling definition of: Hydra.Contract.Initial.validator
Context: Compiling expr at “hydra-plutus-0.9.0-inplace:Hydra.Contract.Initial:(161,6)-(161,45)”
- The issue was that I used `<&>` from `PlutusPrelude`. Probably not intended to be used on-chain.
- Using a list or full `UTxO` makes a lot of code nicer. However, addressing compilation issues alone will not be enough. The tests do still assume that it’s only one committed UTxO.
- All in all though.. that was quite painless. All tests were 💚 on first try. Finally that Haskell mantra is really true: “When it compiles, it works” 😅
- What do we demo next week?
- Running Hydraw on mainnet would be cool
- Lower the expectation in the meantime (mention known limitations)
- Prepare cardano-node on mainnet
- Creating a golden test suite to ensure we do not accidentally change scripts and also persist the binary scripts (in case we cannot reproduce them).
- Getting the `Plutus.Script` from a `CompiledCode` can be done also using `fromCompiledCode` (not only the `mkMintingPolicyScript` function).
- First execution:
Hydra.Plutus.Golden
  μHead script has not changed
    First time execution. Golden file created.
  νHead script has not changed
    First time execution. Golden file created.
  νInitial script has not changed
    First time execution. Golden file created.
  νCommit script has not changed
    First time execution. Golden file created.
- When flipping a `$` in the code:
test/Hydra/Plutus/GoldenSpec.hs:37:3:
1. Hydra.Plutus.Golden μHead script has not changed
  expected: "6683772ef9e41b05e3ae8255644fcc56ea839bb1daeb8ca872bb4886"
  but got: "2e1755eb17febb4f71bdcd2ca1c81ae5f80fa98bb03a987f8f09597e"
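The golden-file behaviour seen above (created on first run, compared afterwards) can be sketched in Python. This is not the hspec golden machinery: `check_golden` is a hypothetical helper, and the digest is a plain 28-byte BLAKE2b over arbitrary bytes, not an actual Cardano script hash computation.

```python
import hashlib
import os
import tempfile


def check_golden(path, script_bytes):
    """Create the golden file on first execution, compare against it afterwards."""
    actual = hashlib.blake2b(script_bytes, digest_size=28).hexdigest()
    if not os.path.exists(path):
        with open(path, "w") as handle:
            handle.write(actual)
        return "created"
    with open(path) as handle:
        expected = handle.read().strip()
    if expected == actual:
        return "unchanged"
    return f"changed: expected {expected} but got {actual}"


with tempfile.TemporaryDirectory() as tmp:
    golden = os.path.join(tmp, "head.golden")
    first = check_golden(golden, b"script v1")   # first time execution
    second = check_golden(golden, b"script v1")  # same script again
    third = check_golden(golden, b"script v2")   # script was modified
```

Committing the golden files is what turns an accidental script change into a failing test instead of a silent hash change.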
We figured out why we had problems returning the funds to the faucet.
The code is using `makeTransactionBodyAutoBalance` to build the transaction to return the funds. This function expects UTxOs to spend, the receiving address, the amount and a change address. It automatically computes the fees and sends the remaining funds from the UTxOs, minus the fees, to the change address. The fees are always taken from the change. So in our case, it will build a transaction with 3 outputs (the fees are not an output but you get the idea):
- amount sent back to the faucet
- fees
- change sent back to Alice
We can't just send all the money back to the faucet then, we need to leave at least enough funds for the fees. But, using this function, we also need to leave at least fees + minimum Lovelace value (1 Ada) or the change output will be too small and make the transaction fail.
For now, we've decided to send back to the faucet everything Alice owns minus 1,200,000 Lovelace as 200,000 Lovelace should be a bit more than the fees and 1,000,000 is the minimum for the change.
At some point we could revise this to really sending everything back to the faucet by computing the fees ourselves.
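The rule we settled on, as a small Python calculation. The two constants are the figures from this entry; the function name and the 100 Ada example balance are made up for illustration.

```python
MIN_CHANGE_LOVELACE = 1_000_000  # minimum for the change output (from the entry above)
FEE_MARGIN_LOVELACE = 200_000    # a bit more than the expected fees (from the entry above)


def refund_amount(owned_lovelace):
    """Amount to send back to the faucet, leaving room for fees and change."""
    reserve = MIN_CHANGE_LOVELACE + FEE_MARGIN_LOVELACE
    if owned_lovelace <= reserve:
        raise ValueError("not enough funds to cover fees and minimum change")
    return owned_lovelace - reserve


# Hypothetical balance of 100 Ada.
example = refund_amount(100_000_000)
```

With 100 Ada owned, 98.8 Ada go back to the faucet and 1.2 Ada stay behind to cover the fees and the change output.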
The team has managed to work around Hydra API commit limitations for adding a redeemer, by writing their own code and publishing the transaction themselves (mostly copy/paste and patch of Hydra code). It might make sense at some point to upstream some of that to Hydra.
They also need the protocol parameters. We asked them to react on #735 and mention that they do know them already. The whole discussion made PG think that they would have expected the Cardano-api to work seamlessly with Hydra. But they understand that Cardano-api is for L1 stuff and hydra Cardano api for L2 stuff. This reminds PG of some feedback from Pi about them already using Ogmios in their dApp and wanting to access Hydra the same way they used to access L1. There might be something here about the adoption path.
We talked a bit about Hydra API breaking changes and we all understand that we can break things between releases in our current 0.x stage. The team suggested that having non-trivial applications like auctions working with Hydra could be an indicator that Hydra would be mature enough to start being stable.
Some conversation about how to deal with collateral on Hydra.
- New script problems found
- We can steal funds with the newfound vulnerability
- Head liveness session (open a couple heads and play around with it)
- Smoke-tests fees problem? We need to calculate the fees and use min utxo amount when constructing the tx.
- When reviewing the spec I realized we do not check that Head scripts do pay to head scripts again.
- Writing a mutation for this is fairly simple, we can just generate any address and set it as the output address at index 0.
- Not all value collected is a tougher problem. This will become expensive! Also my first try is not correct.. need to debug.
- We had a technique (but removed it?) where we would only look for other input value and change to ensure they are consistent. These input/outputs are more constant in number and complexity, so this is bounded.
To explore the return-the-funds failure, I try to manually build the transaction with `cardano-cli`.
First, I start a Cardano node on preview:
cardano-node run --topology state-testnet/cardano-node/topology.json --config state-testnet/cardano-node/config.json --database-path state-testnet/db/ --socket-path state-testnet/cardano.socket
Alice owns these two UTxOs:
CARDANO_NODE_SOCKET_PATH=state-testnet/cardano.socket cardano-cli query utxo --address addr_test1vru2drx33ev6dt8gfq245r5k0tmy7ngqe79va69de9dxkrg09c7d3 --testnet-magic 2
TxHash TxIx Amount
--------------------------------------------------------------------------------------
09e76c21dea8dd6a5e5de77fab98bf16eea1f479d9b50ca03911ca686095ce0e 1 1820023 lovelace + TxOutDatumNone
4464c922d22146dc5600e034ef6cfdd886689f4310e49375817f497481d56ace 0 82356000 lovelace + TxOutDatumHash ScriptDataInBabbageEra "a654fb60d21c1fed48db2c320aa6df9737ec0204c0ba53b9b94a09fb40e757f3"
I compute the fees for the transaction I plan to build:
cardano-cli transaction calculate-min-fee --tx-body-file dtc.draft --tx-in-count 2 --tx-out-count 1 --witness-count 1 --byron-witness-count 0 --testnet-magic 2 --protocol-params-file state-testnet/protocol.json
173113
I build a transaction with all these parameters:
cardano-cli transaction build-raw --tx-in 09e76c21dea8dd6a5e5de77fab98bf16eea1f479d9b50ca03911ca686095ce0e#1 --tx-in 4464c922d22146dc5600e034ef6cfdd886689f4310e49375817f497481d56ace#0 --tx-out ${FAUCET}+84002910 --fee 173113 --out-file tx.raw
I make Alice sign the transaction:
cardano-cli transaction sign --tx-body-file tx.raw --signing-key-file hydra-cluster/config/credentials/alice.sk --testnet-magic 2 --out-file tx.signed
I submit the transaction:
CARDANO_NODE_SOCKET_PATH=state-testnet/cardano.socket cardano-cli transaction submit --tx-file tx.signed --testnet-magic 2
All works fine and I can see the transaction on preview.
And we can compare with this transaction that did work with the hacked version of the code.
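As a sanity check on the manual walkthrough above: the amount passed to `--tx-out` is simply the sum of Alice's two UTxOs minus the computed fee. In Python, with all figures copied from the commands and outputs above:

```python
utxos = [1_820_023, 82_356_000]  # lovelace in Alice's two UTxOs
fee = 173_113                    # result of calculate-min-fee

# Single output to the faucet: everything Alice owns minus the fee.
output = sum(utxos) - fee
```

This gives 84002910 lovelace, the value used in the `build-raw` command, so the hand-built transaction has no change output at all.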
-
In order to run the smoke-tests on mainnet, one of the tasks is to refund the faucet after a successful test run
-
Made some changes and refactored the code to de-duplicate some functions in the Faucet.hs module.
-
Found a bug `hydra-cluster:arithmetic underflow` and created a red-bin item for it so I can take a look later on.
-
Got `FailedToBuildTx {reason = TxBodyErrorAdaBalanceNegative (Lovelace (-270397))}` when trying to return funds back to the faucet key. Probably related to the fact that I use all utxos from the `Actor` and try to send them to the faucet.
-
Hmm it seems like the fees are not accounted for. Leaving todo in code.
-
With this todo in place all works and I'll do a draft pr just to gather reviews and tackle the fees.
-
Hydraw broken
- Not only tag missing, but docker image not serving static assets
-
Weekly summary
- Fixed reference script usage with Hydra head
- Prepare usage of Hydra-node on mainnet (network flag)
- Improved our Mutation test suite for expected errors
- Re-opened our persistent demo head on preprod using version 0.9.0
- Updated tx traces in the specification
Next week:
-
Complete mainnet capability task
-
Address all todo’s in the Hydra spec
-
Prepare for Hydra workshop and prepare demo
-
Fix Hydraw issue
-
Changes the script hashes https://github.com/input-output-hk/hydra/pull/768
- Need to introduce a test that checks if the script hashes are changed (encodes a script hash as a “golden” value)
Open a head today with whoever is ready and see later:
- There should be 4 nodes ready
- SB
- SN
- PG
- AB
-
Continuing on this - I have green tests and now want to make sure the checks are all in place.
-
I am noticing couple of red test runs in hydra-cluster - let's investigate.
-
Ok, the problem was that I was producing a non-empty list for the misconfiguration errors since I prepended it with the type of state (InitialState, OpenState etc.). Fixed it now, added the entry in the logs.yaml and introduced a record accessor just so it looks a bit nicer in the output.
-
Now the only problem with the tests is `restartedNodeCanObserveCommitTx`. In this scenario we are restarting two nodes and want to make sure we can still observe the commit tx. The problem is likely that we are reusing the temp directory for both nodes and share state, which yields `PersistenceException "InitialState: Parties mismatch. "`.
-
After taking a deeper look we were just missing to sort both lists before comparing.
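The fix can be sketched as an order-insensitive comparison that returns an empty list when nothing is misconfigured. `Party`, `checkParties` and the message text are hypothetical stand-ins, not the actual hydra-node definitions:

```haskell
import Data.List (sort)

-- Hypothetical stand-in for a Hydra party; the real type carries a verification key.
newtype Party = Party String
  deriving (Eq, Ord, Show)

-- | Compare configured parties with those loaded from state, ignoring order.
-- An empty result means there is no misconfiguration to report.
checkParties :: [Party] -> [Party] -> [String]
checkParties configured persisted
  | sort configured == sort persisted = []
  | otherwise = ["Parties mismatch."]
```

Sorting both sides first means the same set of parties given in a different order is not reported as a mismatch.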
-
Let's go through comments from Matthias
- comments in blue are either noted incorrectly?
- finds txs immediately applicable to T-hat -> this is what we do right now
- very hard to take notes about this!
- if we do greedy snapshots it is not a problem
- maybe we need to remove txs that are snapshotted?
- then we would still be fine?
- not doing it by applicability but removing snapshotted txs
- that is true for the snapshot - not necessarily true for the non snapshots
- T-hat might be inconsistent? why?
- Your T-hat is not necessarily consistent with the T-hat for the snapshot
- Maybe we'll have to re-do some stuff
- We might consider at some point using the Merkle tree snapshot of all txs - suggestion
- We already need a plan how to make this auditable
- Keeping this as a note
- Record what was snapshotted (we now take U-hat) but if we have Merkle tree we can prove existence of a tx
- This is an addition: we carry the M-tree and later we can prove payment by showing an old snapshot
- This is a good point
- We have no more T, only T-bar (not all T-bar's are consistent in the spec)
- We could request tx ids instead of complete txs for snapshots
- Ok, can't keep track with the conversation now :( (two different conflicting txs in the local view)
- People will be angry on obvious misbehavior of the hydra-node?
- We also make the people mad if we build a protocol that is never ready?
- If we don't have Tall we can't do conflict resolution
- Snapshot requestor does this (using their T-hat)
- Right now we don't do Treq hash
- If we request all txs there is no point of doing reqTx? NO we need to broadcast
- Inconsistencies in the spec need to be sorted.
- Either we change the implementation or take note about the optimization options we have (related to this topic)
- Pascal: second option makes sense
- Optimization being - request txids instead of whole txs
- This would require changing the spec anyway
- We need to keep track of Tall to do this properly
- If we don't do this optimisation - we don't need to worry if we see all txs
- Let's alter the spec
- (Sebastian types some latex that we don't understand but related to future optimization)
- Related issue #728 - ReqSn only sends transaction ids
- Filling out some details on the issue
- Record all txs irrespective of their applicability in the local ledger state
- Aligning the current spec with what we actually have in the code
- How do we prune the local state?
- Matthias: Right the spec is consistent but not complete
- There is no correctness problem
- Taking note: should do something more intelligent than doing multiple passes
- Dropped txs should be resubmitted (inform the users about it? - we currently don't)
-
Talk about the security properties of the protocol
-
Matthias got stuck here - didn't know what the protocol was like
-
It depends on the protocol itself so let's not talk about the concrete things
-
What we can do right now?
-
If you can collect you can also fanout - something we worked on in the last few weeks
-
Question: Is this a property: Whatever you lock into a head (collect) you can fanout on L1?
-
This is implied by the completeness? NO
-
Security conditions are all about txs on L2 - should we just collect more of these properties?
-
It should follow from already specified properties
-
If you don't sign any tx then soundness tells you that you get U0 out again
-
Maybe this is a different class of properties? There is a practical shell to all of this
-
Maybe there are more relevant properties (practical ones) e.g. I can always abort a head
-
Users care about this
-
Securities we have are more about the offchain ledger and tx handling
-
Maybe we start by expressing some properties needed for alternative implementations (hydra-node in Rust?)
-
Matthias: Sees it as a part of the documentation/user manual
-
Sebastian: Sees it as RFC/specification
-
We want to convince ourselves that what is in the spec is good (not a paper)
-
Matthias will take a look at blue sections
-
Maybe we should meet again when we have something new to discuss?
We need a review on #719
- PG partially reviewed it but take another look
Who is hydra-node "12043"?
- PG's node :)
Before opening the new head with 0.9.0 we have to check:
- You’re on pre-prod
- Everybody is connected to your node
- You have fuel
Opening a hydra-head with 0.9.0
- PG's node must have a problem with peers configuration and has to check
-
I have the code that does the check already and want to write a test case for it.
-
Inspecting the code to see if I can reuse something
-
Seems like one of the scenarios where we restart the hydra-node could be useful.
-
In the copied test case we use different number of peers on restart to trigger the misconfiguration logging.
-
I see the log message in the test driver output but I can't match on it in the test. The reason is that the `EndToEnd` tracer doesn't know anything about the `HydraLog` messages, so I would probably need to add a constructor for that. The problem is that `HydraLog` has two type variables and I would need to change a lot of code. Use existentials? Maybe just set concrete types in there like `FromHydraLog (HydraLog Tx ())`?
-
Second option looks better and less involved.
-
It seems that it is not that trivial to observe the logs at this level. We are not polymorphic on the log type and I don't want to duplicate functions. Maybe look at BehaviourSpec?
-
Seems like the easiest option is to spawn the node process in end-to-end spec but beforehand save some state. Then when providing arguments for the hydra-node make sure to use different ones than the state to trigger the error.
- Now the fanout fails although the others pass. Seemingly because hashes don’t align. Looking at the (off-chain) observed committed / initial UTxO vs. the fanout UTxO I cannot spot a difference. Let’s look at the hashes.
- Could narrow down the problem to an inconsistency between the off-chain, observed UTxO and the hash actually put into the datum.
- Found the culprit in `fromPlutusTxOut`
- So the problem is that reference scripts are dropped when converting a cardano-api `TxOut` to a plutus `TxOut`. The latter only contains a reference script hash. Our current code uses the serialized `Plutus.TxOut` to keep track of what was committed and of course it is also used to match the fanout outputs on-chain (these outputs are of course already `Plutus.TxOut`).
- I was thinking whether to not put the full description of a `Commit` on-chain in the first place. This would require our `observeCommitTx` to have access to the resolved inputs of the transaction though.. which is not straightforward, as we only keep track of the `ChainState` and not arbitrary `UTxO` (which might exist on the chain for way longer already).
- We could think about requesting the transaction inputs `[TxIn]` when calling back from the chain layer, and the continuation gets passed the corresponding `UTxO`, but that would complicate things even further? OTOH though.. this is mostly what the chain state holds, and this could make things less stateful?
- So what if we have this signature: `type ChainCallback tx m = [TxInType tx] -> (UTxOType tx -> Maybe (ChainEvent tx)) -> m ()`?
- This is a 🐰 🕳️
- Going back to the commit problem.. maybe it’s better to just disallow `ReferenceScript` for now and provide a clear error before opening a head with them. Just like byron outputs.
Maybe we could protect the signing key of the faucet on mainnet using GitHub ci secrets. That could be enough given for that kind of amount. Let's explore this.
Opening a hydra-head with 0.9.0
- Some maintenance to perform on AWS before
- We need to close the Head (it’s a breaking change after all)
- Let’s close the head today and reconfigure through the afternoon until tomorrow
-
One thing we noticed when working on some test code is that you can get to a position to restart the hydra-node using different parameters (for CP and peers) than what is already present in the node state.
-
We would like to have a way of notifying the users about that.
-
Inspecting the code reveals we can use `HeadParameters` to look at the configured parameters and then compare those to what is present in the state.
-
In case there is a mismatch we will log it.
-
We need a test case that verifies this of course.
- When using `genUTxOSized` in the acceptance test, I see a `TxBodyOutputOverflow 72482138201786137481 (TxOutInAnyEra...` error in the `unsafeBuildTransaction` of `collectComTx`.
- Seems like it’s coming from too big lovelace values. Are our generators broken? We are using the `cardano-ledger` generator.
- We need to scale the value generator down so we are not hitting `Word64` bounds when collecting from multiple big commits.
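To make the overflow concrete: the value in the error, 72482138201786137481, is larger than the `Word64` maximum of 18446744073709551615. A small sketch of the bound and of one possible scaling rule follows; `fitsInWord64` and `maxPerCommit` are hypothetical helpers, not the generators we actually use:

```haskell
import Data.Word (Word64)

-- | Largest lovelace amount representable in a 'Word64'.
maxLovelace :: Integer
maxLovelace = toInteger (maxBound :: Word64)

-- | True when the sum of all committed values still fits into a 'Word64'.
fitsInWord64 :: [Integer] -> Bool
fitsInWord64 values = sum values <= maxLovelace

-- | Upper bound for each generated value so that collecting from
-- @n@ commits can never overflow (hypothetical scaling rule).
maxPerCommit :: Int -> Integer
maxPerCommit n = maxLovelace `div` toInteger (max 1 n)
```

Capping each generated commit at `maxPerCommit n` keeps the collected total within bounds for up to `n` commits.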
- Exploring why fanout / life-cycle with non-ada-only UTxO would fail.
- When making generated utxo in genFanoutTx more arbitrary, the tests quickly start to fail because utxo’s are very big.
- It’s also annoying that propIsValid is only failing if the budget is overspent in a single validator.. not all of them. Let’s fix this first (we have a red bin item for it).
- Introducing a new `EvaluationError` to express the `evaluateTx` outcome where the overall budget was overspent.
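The overall check can be sketched as summing the units used by every validator and comparing the total against the transaction limit. The types below are simplified, hypothetical stand-ins for the cardano-api execution units and for the new `EvaluationError` constructor:

```haskell
-- Hypothetical, simplified execution units; the real ones come from cardano-api.
data ExUnits = ExUnits {mem :: Integer, cpu :: Integer}
  deriving (Eq, Show)

-- Hypothetical error for the case where the sum over all validators is too high.
data BudgetError = TotalBudgetOverspent {used :: ExUnits, available :: ExUnits}
  deriving (Eq, Show)

-- | Add up the units consumed by all validators of a transaction.
totalUnits :: [ExUnits] -> ExUnits
totalUnits xs = ExUnits (sum (map mem xs)) (sum (map cpu xs))

-- | Fail when the combined usage exceeds the limit in memory or CPU,
-- even if every single validator stays below it.
checkTotalBudget :: ExUnits -> [ExUnits] -> Either BudgetError ()
checkTotalBudget maxUnits perValidator
  | mem total > mem maxUnits || cpu total > cpu maxUnits =
      Left (TotalBudgetOverspent total maxUnits)
  | otherwise = Right ()
 where
  total = totalUnits perValidator
```

Two validators that each use 60% of the budget pass a per-validator check but fail here.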
-
Started with specifying two issues we need to groom further in the next session. One is running the smoke-tests on mainnet and the other one is about introducing a hardcoded limit of 100 ada for commit tx.
-
Since there are some unknowns related to running the smoke-tests (already in the issue to be groomed) I am starting with the hardcoded limit issue.
-
Seems like it would be good to have a failing test but I am not really sure if we want to error out or just log in case somebody wants to commit more than 100 ada.
-
Logging seems like the better option to me at least so let's proceed with inspecting the code.
-
Seems like the `commit` function is a nice place since we return `Either` from it so it would be pretty easy to add this check here.
-
I added the necessary check that depends on the mainnet and realized that writing the test for this would only work if we would actually run our tests on the mainnet.
-
Should I remove the network param and limit all of the tests to 100 ada commits?
-
I'll check the code to see if we can do this without affecting other parts of the code.
-
It is nice that we can just alter the network for the test runs.
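A minimal sketch of such a network-dependent check, assuming simplified types. `Network`, `CommitError` and `checkCommitLimit` are hypothetical names and not the actual `commit` code; whether the caller fails or only logs on `Left` is left open, as in the notes above:

```haskell
-- Hypothetical, simplified types; the real code works with cardano-api values.
data Network = Mainnet | Testnet
  deriving (Eq, Show)

newtype CommitError = CommitLimitExceeded Integer
  deriving (Eq, Show)

-- | Hard-coded limit from the issue: 100 ada, expressed in lovelace.
maxMainnetCommit :: Integer
maxMainnetCommit = 100 * 1000000

-- | Only restrict commits on mainnet; test networks stay unlimited.
checkCommitLimit :: Network -> Integer -> Either CommitError ()
checkCommitLimit network lovelace
  | network == Mainnet && lovelace > maxMainnetCommit =
      Left (CommitLimitExceeded lovelace)
  | otherwise = Right ()
```

Because the network is just an argument, a test can pass `Mainnet` directly without connecting to the real mainnet.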
- PG: Is marked fuel really necessary?
- Legacy: https://github.com/input-output-hk/hydra/issues/553 https://github.com/input-output-hk/hydra/issues/570
- Commit from external wallet would remove the need for it https://github.com/input-output-hk/hydra/issues/215
- Why not just do coin selection? Yes, we could improve the business logic of our internal wallet.
- Coin selection is non trivial. Maybe it exists as a library, but likely not or not easily integratable.
-
We want to be able to run our hydra-node, tui and hydraw on mainnet
-
What is missing is the network flag that would enable us to connect to the cardano-node running on mainnet.
-
This task is a part of 714
-
That means before having this PR reviewed I also need to experiment if I can actually run it on mainnet!
-
Syncing the mainnet cardano-node is a bit of a pain.
-
Realizing that `hydra-cluster` had a `when` in the code to prevent it from running on mainnet - fixing this.
-
Published current scripts with tx id `4a4f3e25887b40f1575a4b53815996145c994559bac1b5d85f7de0f82b8f4ed7`.
-
In order to fund my address I realize how annoying `Fuel` is. Hopefully we will be able to get rid of it soon.
-
Altering `fuelTestnet.sh` and `squashUtxo` so that I can get my utxos in the right shape to run the protocol.
-
`hydra-node` connects to the mainnet and I can run the tui without any peers.
-
Init the head, commit some funds, and Hydra Head is opened.
-
Time to continue in getting this PR into shape.
-
Conclusion is that not a lot of changes were needed to actually run an experiment on mainnet.
-
PG: How to keep users / dependencies on the radar? esp. as we are about to release 0.9.0
- e.g. https://github.com/cardano-foundation/hydra-java-client
- Can we keep track of downstream projects and ideally notify them when we do breaking changes?
- Manual process should be sufficient for now
- Bare minimum: Notify them that there is a new release with the following breaking changes -> be a good citizen (at least until we are more established)
- For now / for this release: Collect the projects which are somehow dependent on us and put them already on a “dependency map”
- Dependency map: Let’s start a miro board to put the projects/components as stickies/links and draw arrows by hand for now
-
PG: Do we ensure that all head members have the same head parameter?
- No, they would just disagree on whether a transaction is valid or not if not using the same protocol parameters.
- Related: https://github.com/input-output-hk/hydra/issues/735
- Related (but different): https://github.com/input-output-hk/hydra/issues/195
- Review spec notes
- Looking at the Init tx
- Looking at the miro board to show new diagrams for the txs (CEM replacement)
- Init datum contains the out-ref now
- Explaining further the new graphs we use and notation (eg. ST present in the value)
- Interesting part: The minting policy
- removing all constraints when burning is not enough
- we need to check that all quantities are negative
- we don't constrain when to burn but need to ensure every token is burnt
- this is something we already implemented
- we also have to check the number of minted tokens
- what would happen when we mint tokens with quantity == 2?
- we already implemented the check for quantity
- we check that ST is paid to the head output
- we check that PTs are in the corresponding initial outputs
- so the checks we have prevent minting PTs/ST with quantity /= 1 and we also count tokens
- should align the spec with google doc (PTs related part - all initial outputs hold PT)
- should be more explicit that cid is minting policy hash
- maybe we don't need to have on-chain check for the out-ref but we decide to keep it since it is implemented already
- looking at the important note that participants still need to check PTs and that the initial state is consistent with the agreed parameters
- adding a note to see the `onInit` part of the spec
- maybe we should check off-chain in the init tx the n - length of the participants configured against the datum
- the minting policy does not allow this - checking n is implicit
- Offchain part of the spec
- we implemented on ackSn exactly as in spec
- Matthias will take a look when he gets back
- on newTx we actually don't check anything anymore (we don't want to do it)
- on reqTx we wait instead
- this is irrelevant for our protocol
- altered the spec to include tx pruning we do
- need to check with Matthias if it is ok to prune seen txs on reqSn (we guess it is ok)
- question about output seen - is seeing a tx happen on reqTx or reqSn? Important for the clients
- for all txs requested in a snapshot we would emit seenTx
- should not emit seen for conflicting txs
- we will ask Matthias about the two open points in slack
- when and with whom will we work on the securities section? Maybe we don't need to do it?
- Sandro and Matthias need to review this section of the spec probably
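The minting policy checks from the review above (all quantities negative when burning, quantity 1 per token and a token count when minting) can be sketched over a plain list. This is not the on-chain Plutus code: `Minted`, `validBurn` and `validMint` are hypothetical, and the expected count of one ST plus one PT per party is an assumption based on the notes:

```haskell
-- Hypothetical: minted value as (token name, quantity) pairs under the head's policy id.
type Minted = [(String, Integer)]

-- | On burning, every quantity must be negative.
-- Note that an empty value trivially passes this check.
validBurn :: Minted -> Bool
validBurn minted = all ((< 0) . snd) minted

-- | On minting, every token has quantity 1 and the number of tokens
-- is one ST plus one PT per party (assumed from the notes).
validMint :: Int -> Minted -> Bool
validMint nParties minted =
  all ((== 1) . snd) minted && length minted == nParties + 1
```

Minting a token with quantity 2 fails the first conjunct of `validMint`, while an extra or missing PT fails the count.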
- We realized execution budgets are not correctly checked in StateSpec `propIsValid`
- tx-cost benchmark is correctly detecting it
- new red bin item: either change semantics or claim
- important to not re-use the generators (we did this in the past) as they have different goals
- tx cost: find maximum / limits with somewhat deterministic generation
- state spec: find bugs by trying arbitrary / challenging values (not necessarily expensive)
-
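sha2_256 benchmarks for collect/abort:

Collect:

| Parties | UTxO (bytes) | Tx size | % max Mem | % max CPU | Min fee ₳ |
| --- | --- | --- | --- | --- | --- |
| 4 | 228 | 1786 | 80.89 | 33.41 | 1.12 |

Abort:

| Parties | Tx size | % max Mem | % max CPU | Min fee ₳ |
| --- | --- | --- | --- | --- |
| 4 | 6283 | 98.59 | 43.13 | 1.53 |
-
sha3_256 results for collect/abort:

Collect:

| Parties | UTxO (bytes) | Tx size | % max Mem | % max CPU | Min fee ₳ |
| --- | --- | --- | --- | --- | --- |
| 4 | 227 | 1784 | 80.96 | 33.46 | 1.12 |

Abort:

| Parties | Tx size | % max Mem | % max CPU | Min fee ₳ |
| --- | --- | --- | --- | --- |
| 4 | 6144 | 93.28 | 40.66 | 1.47 |
-
blake2b_256 results for collect/abort:

Collect:

| Parties | UTxO (bytes) | Tx size | % max Mem | % max CPU | Min fee ₳ |
| --- | --- | --- | --- | --- | --- |
| 4 | 227 | 1782 | 80.26 | 33.16 | 1.12 |

Abort:

| Parties | Tx size | % max Mem | % max CPU | Min fee ₳ |
| --- | --- | --- | --- | --- |
| 4 | 6282 | 98.95 | 43.24 | 1.54 |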
- Plutus error codes: shall we use TH or not yet?
- PR already valuable in its current state
- Also we have another work depending on it (all reference scripts branch)
- Let’s get this merged first and then we can remove more boilerplate or use TH to not have the string literals twice
- New Tx graph design
- Inspired by tx traces and the paper - focuses on individual transactions
- Behavior change of not doing checks in L2: NewTx
- We would not have the feedback TxValid/TxInvalid in the API anymore
- But what does TxValid mean currently?
- Maybe txSeen and TxExpired are enough?
- Need to see how to make code and spec consistent on this
We observe strange output that looks like random crashes while running the hydra-node tests.
After a bunch of trial and error, we managed to identify this specific test in `Hydra.API.ServerOutputSpec` as one that makes the test suite crash abruptly:
roundtripAndGoldenSpecsWithSettings
settings
(Proxy @(ReasonablySized (ServerOutput Tx)))
Removing (temporarily) the cardano dependencies does help. Although it's been running for more than four hours. Maybe related to the following warning:
warning: error: unable to download 'https://cache.zw3rk.com/nar/1ipr17w9698w14g9nx8j62jsksad9nqg8d4j4hd38wasab78hpdd.nar.zst': Timeout was reached (28); retrying in 256 ms
Anyway, I'm in the shell, let's test:
#> time cabal test hydra-node
...
test/Hydra/API/ServerSpec.hs:61:3:
1) Hydra.API.Server sends all sendOutput history to all connected clients after a restart
uncaught exception: IOException of type PermissionDenied
/tmp/ServerSpec-6e2fa8b859a638a5/history: openBinaryFile: permission denied (Permission denied)
Oops, we have a problem there. By removing the `Durable` part of `withBinaryFileDurable`, the test passes:
--- a/hydra-node/src/Hydra/Persistence.hs
+++ b/hydra-node/src/Hydra/Persistence.hs
@@ -9,7 +9,7 @@ import qualified Data.ByteString as BS
import qualified Data.ByteString.Char8 as C8
import System.Directory (createDirectoryIfMissing, doesFileExist)
import System.FilePath (takeDirectory)
-import UnliftIO.IO.File (withBinaryFileDurable, writeBinaryFileDurableAtomic)
+import UnliftIO.IO.File (withBinaryFile, writeBinaryFileDurableAtomic)
newtype PersistenceException
= PersistenceException String
@@ -63,7 +63,7 @@ createPersistenceIncremental fp = do
PersistenceIncremental
{ append = \a -> do
let bytes = toStrict $ Aeson.encode a <> "\n"
- liftIO $ withBinaryFileDurable fp AppendMode (`BS.hPut` bytes)
+ liftIO $ withBinaryFile fp AppendMode (`BS.hPut` bytes)
, loadAll =
liftIO (doesFileExist fp) >>= \case
False -> pure []
Research meeting notes:
Agenda:
-
Authenticate network messages
- Had some ideas on how we could implement this but need to check what pairwise authenticated channels mean?
- It is essentially just key exchange. A correct implementation should use symmetric authentication but we could also just sign the messages.
- We don't mean privacy here just authentication
- To summarize - protocol relies on authenticity not confidentiality
- It is not really necessary that ALL messages need to be authenticated. We could ignore or detect wrong messages.
- If we don't sign messages it is only DoS problem not security! For example:
- If you can impersonate someone else, you can request a snapshot in place of the current leader. Other parties can see this, so we would have two potentially conflicting requests; we could detect the cheating and stop interacting, which kills the liveness property.
- Page 23 (reqSn) in the spec we could detect cheating (if message was not authenticated)
- We have asymmetric keys already and have the idea to sign messages (not alter transport layer)
- Conclusion: This is something we still need to do but would rather deal with this later down the line since we don't care that much about liveness right now.
-
Minting policy (new) parameters
-
We wanted to check if we can parameterize a MP with all params (embed all in minting policy so we can check all on-chain)
-
This is fine and should simplify off-chain code.
-
Head logic (track all txs, transaction pruning and require/wait flip)
-
We have comments from Matthias and would like to address some of them:
-
Check that we only allow the current and next snapshot (reqSn not allowed for 5 snapshots into the future)
- Before we checked that requested Snapshot is just the future one
- As this is more restrictive, it is fine
-
It seems our code would actually need require and wait to be flipped for AckSn: we can only check whether we saw a signature once we have the corresponding seen snapshot
- Problem can be that we see ackSn before reqSn
- Can we flip wait and require?
- We can flip/split it - should be ok. We shouldn't see two same ackSn's
-
On verifying signatures:
- We have a key for party so we could verify individual signature parts, not only the multisignature, but we could also verify both.
- If we do it individually, we could spot somebody cheating, but that is not really important
- Matthias doesn't see a reason to change anything, whatever suits us best is ok
-
-
What does valid-tx actually mean?
- We can drop it from the spec - this means tx is well formed
-
Should we track all transactions we see (not signed)
-
Whenever we see reqTx and put the tx in this pool of all txs, it would not matter whether the tx is applicable or not when receiving the ReqTx.
-
We only wait limited amount of time (practical issue, queue can't be infinite)
-
If we pick from this pool after seen a ReqSn, what's the point of waiting for applicability in ReqTx?
-
This is up to interpretation of the spec - it is a bit weird
-
Also, it is not clear when we can clean `T_all`? We may want to keep them for different reasons (reqSn). We also may want to remember important txs so nothing gets lost (audit trail).
- This is against keeping storage requirements small
- Clients would see all transactions / snapshots anyways
-
Will this yield behavior comparable to the Cardano node / mempool? We can go to any complexity here, but maybe something simple is good enough?
-
The current implementation drops the txs in reqSn, we also do prune txs by their id (not by their applicability)
-
Prune by applicability is ok, but we are concerned if we do it for ALL txs we see (that are in our pool)
-
We have to be careful of what we drop
-
For this issue let's reflect the current implementation in the spec and talk next week.
-
-
- Continuing with the full minting policy work we had some luck and fixed two bugs in test code that were causing failing tests.
-
First bug was related to the new check for parties, where we need to check if parties match the parties from the datum. Basically there was a place in the code where we didn't pass in the correct number of peers to `observeInitTx` (we need to pass all other peers excluding ourselves, see `pickOtherParties`).
-
Second bug was present in the ModelSpec and again it is related to the same peer check. Basically we need to make sure at all times that we are passing our party and the rest of the parties correctly to `observeInitTx`.
-
Let's check if we can get away with passing in ALL parties and filtering in `observeInitTx`
- Seems to work - only two failing tests:
- One related to tx size for the abort
- One from MBT which I need to fix now
I'm looking at this strange test output again:
#>cabal test hydra-node
Build profile: -w ghc-8.10.7 -O1
In order, the following will be built (use -v for more details):
- hydra-node-0.9.0 (test:tests) (ephemeral targets)
Preprocessing test suite 'tests' for hydra-node-0.9.0..
Building test suite 'tests' for hydra-node-0.9.0..
Running 1 test suites...
Test suite tests: RUNNING...
Hydra.API.ClientInput
JSON encoding of (ReasonablySized (ClientInput SimpleTx))
allows to encode values with aeson and read them back
+++ OK, passed 100 tests.
JSON encoding of (ReasonablySized (ClientInput SimpleTx))
produces the same JSON as is found in golden/ReasonablySized (ClientInput SimpleTx).json
{"timestamp":"2023-02-22T15:01:45.149653Z","threadId":328,"namespace":"","message":{"listeningPort":50266,"tag":"APIServerStarted"}}
JSON encoding of (ReasonablySized (ClientInput (Tx BabbageEra)))
WARNING: Encoding new random samples do not match golden/ReasonablySized (ServerOutput SimpleTx).json.
Testing round-trip decoding/encoding of golden file.
74/100Test suite tests: FAIL
Test suite logged to:
/Users/pascal/git/github.com/input-output-hk/hydra_bare/dist-newstyle/build/aarch64-osx/ghc-8.10.7/hydra-node-0.9.0/t/tests/test/hydra-node-0.9.0-tests.log
0 of 1 test suites (0 of 1 test cases) passed.
This is really strange behavior. But I've noticed the following error when trying to build hydra-plutus and I suspect some relationship here:
#> cabal build hydra-plutus
Build profile: -w ghc-8.10.7 -O1
In order, the following will be built (use -v for more details):
- hydra-plutus-0.9.0 (exe:inspect-script) (first run)
- hydra-plutus-0.9.0 (test:tests) (first run)
Configuring executable 'inspect-script' for hydra-plutus-0.9.0..
Configuring test suite 'tests' for hydra-plutus-0.9.0..
Warning: Packages using 'cabal-version: >= 1.10' and before 'cabal-version:
3.4' must specify the 'default-language' field for each component (e.g.
Haskell98 or Haskell2010). If a component uses different languages in
different modules then list the other ones in the 'other-languages' field.
Warning: Packages using 'cabal-version: >= 1.10' and before 'cabal-version:
3.4' must specify the 'default-language' field for each component (e.g.
Haskell98 or Haskell2010). If a component uses different languages in
different modules then list the other ones in the 'other-languages' field.
Preprocessing test suite 'tests' for hydra-plutus-0.9.0..
Preprocessing executable 'inspect-script' for hydra-plutus-0.9.0..
Building executable 'inspect-script' for hydra-plutus-0.9.0..
Building test suite 'tests' for hydra-plutus-0.9.0..
[1 of 1] Compiling Main ( exe/inspect-script/Main.hs, /Users/pascal/git/github.com/input-output-hk/hydra_bare/dist-newstyle/build/aarch64-osx/ghc-8.10.7/hydra-plutus-0.9.0/x/inspect-script/build/inspect-script/inspect-script-tmp/Main.o )
[1 of 3] Compiling Hydra.Data.ContestationPeriodSpec ( test/Hydra/Data/ContestationPeriodSpec.hs, /Users/pascal/git/github.com/input-output-hk/hydra_bare/dist-newstyle/build/aarch64-osx/ghc-8.10.7/hydra-plutus-0.9.0/t/tests/build/tests/tests-tmp/Hydra/Data/ContestationPeriodSpec.o )
[2 of 3] Compiling Spec ( test/Spec.hs, /Users/pascal/git/github.com/input-output-hk/hydra_bare/dist-newstyle/build/aarch64-osx/ghc-8.10.7/hydra-plutus-0.9.0/t/tests/build/tests/tests-tmp/Spec.o )
[3 of 3] Compiling Main ( test/Main.hs, /Users/pascal/git/github.com/input-output-hk/hydra_bare/dist-newstyle/build/aarch64-osx/ghc-8.10.7/hydra-plutus-0.9.0/t/tests/build/tests/tests-tmp/Main.o )
Linking /Users/pascal/git/github.com/input-output-hk/hydra_bare/dist-newstyle/build/aarch64-osx/ghc-8.10.7/hydra-plutus-0.9.0/t/tests/build/tests/tests ...
Linking /Users/pascal/git/github.com/input-output-hk/hydra_bare/dist-newstyle/build/aarch64-osx/ghc-8.10.7/hydra-plutus-0.9.0/x/inspect-script/build/inspect-script/inspect-script ...
I've seen that the cardano install doc has been updated to suggest downgrading llvm to 13 and I use version 14, so I'll downgrade my version and see what happens: it works!
Let's retry the tests now: same strange behavior.
- What is failing on aarch64?
- Temporarily removing cardano-node dependencies solves the issue
- The nix expression to get a cardano-node is not aarch64-darwin enabled in the version we use
-
Transaction pruning
- Moving the transaction pruning to `onReqSn` was actually painless.. it’s just the same semantics?
-
When debugging hydra node logs, it is very handy to shorten / alias the long hex strings of public keys and signatures.
-
Not validating txs in `NewTx`
- If we always emit `TxValid`.. what does `valid-tx` actually mean in the spec? Shall we remove TxValid/TxInvalid?
-
Track `allTxs` and "prune" `seenTxs` from there
- To collect `allTxs`, we need to be able to update the state on `Wait` outcomes.
- Investigating how the transaction pruning changes behavior if we draw from `allTxs` vs. `seenTxs`.
- As expected, the two tests with depending transactions fail.. but unexpectedly with `arithmetic underflow`?
- Turns out.. it’s the TTL decreasing again and `ReqSn` is resulting in `Wait` and its ttl is not decreased.
- Also the ReqSn is including the same transaction multiple times!?
- Of course.. we are always adding the tx on every `ReqTx`, the re-enqueuing is problematic here. Do we actually need it if we do re-consider transactions later from `allTxs` anyways?!
- Tracking `allTxs` and using it to "prune", or rather re-enqueue transactions is quite involved. It seems to be possible, but substantially changes the semantics of the protocol. Also there would be no `TxExpire`!?
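On the `arithmetic underflow`: in GHC, subtracting from a `Natural` that is already 0 throws exactly this exception. Assuming the TTL is represented as a `Natural` (an assumption, the notes do not say), a guarded decrement could look like the sketch below; `decreaseTTL` is a hypothetical name:

```haskell
import Numeric.Natural (Natural)

-- | Decrement a time-to-live without underflowing.
-- 'Nothing' signals that the TTL is already exhausted.
decreaseTTL :: Natural -> Maybe Natural
decreaseTTL 0 = Nothing
decreaseTTL n = Just (n - 1)
```

Callers then have to handle the exhausted case explicitly instead of crashing when an event is re-enqueued once too often.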
-
Verifying multi signatures
- On verifying signatures: we agreed with researchers that it would not matter whether we check individual or the aggregate signature. However, the aggregate is what we would need to use to close the head .. so it better be correct.
I install nix as described on the official site and reboot. After the reboot, nix is not working. I think I already experienced the same issue some months ago.
I uninstall nix as described on the non official site and reboot so that I can retry the whole process.
I install nix again and reboot. It works. I update the nix.conf according to our contributing guide and reboot again. It still works.
I now run `nix develop` in the Hydra repository, according to our contributing guide, but it fails:
➜ hydra git:(master) ✗ nix develop
do you want to allow configuration setting 'allow-import-from-derivation' to be set to 'true' (y/N)? y
do you want to permanently mark this value as trusted (y/N)? y
do you want to allow configuration setting 'extra-substituters' to be set to 'https://cache.iog.io https://hydra-node.cachix.org' (y/N)? y
do you want to permanently mark this value as trusted (y/N)? y
do you want to allow configuration setting 'extra-trusted-public-keys' to be set to 'hydra.iohk.io:f/Ea+s+dFdN+3Y/G+FDgSq+a5NEWhJGzdjvKNGv0/EQ= hydra-node.cachix.org-1:vK4mOEQDQKl9FTbq76NjOuNaRD4pZLxi1yri31HHmIw=' (y/N)? y
do you want to permanently mark this value as trusted (y/N)? y
warning: ignoring untrusted substituter 'https://hydra-node.cachix.org'
error: flake 'git+file:///Users/pascal/git/github.com/input-output-hk/hydra' does not provide attribute 'devShells.aarch64-darwin.default', 'devShell.aarch64-darwin', 'packages.aarch64-darwin.default' or 'defaultPackage.aarch64-darwin'
Did you mean devShells?
After adding my user in nix.conf as trusted and rebooting the machine (restarting the nix daemon did not work) I no longer see any warning about ignored untrusted substituters.
My current /etc/nix/nix.conf:
trusted-users = root pascal
build-users-group = nixbld
substituters = https://cache.iog.io https://cache.nixos.org
trusted-public-keys = hydra.iohk.io:f/Ea+s+dFdN+3Y/G+FDgSq+a5NEWhJGzdjvKNGv0/EQ= cache.nixos.org-1:6NCHdD59X431o0gWypbMrAURkbJ16ZPMQFGspcDShjY=
experimental-features = nix-command flakes
After a long time of it running I get the following error:
#> nix develop
trace: haskell-nix.haskellLib.cleanGit: /nix/store/qkra2hh7ghzfdm1nxmqa0n8gvkzjk659-source does not seem to be a git repository,
assuming it is a clean checkout.
error: attribute 'aarch64-darwin' missing
at /nix/store/d0lsdhfkqgvc21aznd7vjh1w5agc19y0-source/default.nix:19:15:
18| with (import ./nix/flake-compat.nix customConfig);
19| defaultNix // defaultNix.packages.${system} // {
| ^
20| private.project = defaultNix.legacyPackages.${system};
(use '--show-trace' to show detailed location information)
I try to see what happens if I just try to compile Hydra without installing nix. I want to do that because I had a lot of trouble previously when doing it with nix. One of the main issues could be that nix-installed stuff does not play well with OS X-installed stuff, vscode being a good example.
We have a (maybe not so well maintained) wiki page for maintainers explaining how to install Hydra without nix.
I followed the instructions from the documentation on how to compile cardano without nix.
Then, trying to compile Hydra, I first stumble upon some issues with llvm options updates not handled by haskell compilation stack. To solve that, I downgraded the version of llvm:
#> brew install llvm@14
Then, I've been able to compile Hydra without any problem:
cabal build all 10948,05s user 1608,11s system 539% cpu 38:48,28 total
Running the tests is more problematic. They fail because of, I guess, some dependency that might not be satisfied:
cabal test hydra-node
Build profile: -w ghc-8.10.7 -O1
In order, the following will be built (use -v for more details):
- hydra-node-0.9.0 (test:tests) (ephemeral targets)
Preprocessing test suite 'tests' for hydra-node-0.9.0..
Building test suite 'tests' for hydra-node-0.9.0..
Running 1 test suites...
Test suite tests: RUNNING...
Hydra.API.ClientInput
JSON encoding of (ReasonablySized (ClientInput SimpleTx))
allows to encode values with aeson and read them back
+++ OK, passed 100 tests.
JSON encoding of (ReasonablySized (ClientInput SimpleTx))
produces the same JSON as is found in golden/ReasonablySized (ClientInput SimpleTx).json
JSON encoding of (ReasonablySized (ClientInput (Tx BabbageEra)))
WARNING: Encoding new random samples do not match golden/ReasonablySized (ServerOutput SimpleTx).json.
Testing round-trip decoding/encoding of golden file.
61/100Test suite tests: FAIL
Test suite logged to:
/Users/pascal/git/github.com/input-output-hk/hydra/dist-newstyle/build/aarch64-osx/ghc-8.10.7/hydra-node-0.9.0/t/tests/test/hydra-node-0.9.0-tests.log
0 of 1 test suites (0 of 1 test cases) passed.
cabal: Tests failed for test:tests from hydra-node-0.9.0.
While looking for a solution to the tests problem, I start vscode and get the following error:
hspec-discover: runInteractiveProcess: posix_spawnp: illegal operation (Inappropriate ioctl for device)
This is due to the fact that we use the following line in Spec.hs but I don't have hspec-discover available in vscode:
{-# OPTIONS_GHC -F -pgmF hspec-discover -optF --module-name=Spec #-}
So I install hspec-discover globally so that vscode can access it:
cabal install hspec-discover
SN
- Arbitrary HydraKey is broken
- Quite likely conflicting public and private keys / not very collision resistant
- We should fix this in any case (was impacting minting policy work)
- Test this like we do the cardano public keys generator
- Acceptance test fails with non-ada UTxO
- #724 fails if we would use more arbitrary UTxO (not ada-only)
- This is likely a problem in our code and we don’t usually test with non ada-only UTxO -> our test coverage sucks in this regard
- Create a bug issue item to address this dedicatedly
- Protocol alignment
- How far to push this? Also how hard to push it?
- Change the implementation to match the spec vs. change the spec?
- Maybe this is a moot point.. We don’t REALLY know whether implementation is good enough. Try to align it with the spec will show potential behavior changes and only THEN we can decide to not do them.
- In the end, use cases count and we should make sure we can fulfill them.
AB
-
#722
- Hydra-api package
- Concrete users: black jack app, auctions with MLabs? Marlowe was also one user
- First round of feedback would be good and have discussion if necessary
- One of the nodes is down and we have some troubles restarting it properly with the exact same unstable version as the one running on the other nodes. Anyway, we should quickly cut a release.
- What to do about the hydra Head explorer down?
- This was an experiment and we have an open issue about having a more reliable approach to tracking hydra heads
- We’re ok with stopping this experiment until we address the open issue
-
Continuing work from Friday
-
Updating haddocks to include the wording we groomed in the issue itself and further explain our intentions.
-
I would like to address one TODO in the Init mutation where we don't know exactly which error message to expect in case we drop one initial output.
-
We get a `ScriptErrorEvaluationFailed` with PT5 in this case.
-
There was a check in v_initial that checks the number of outputs against the number of parties and it didn't have appropriate error specified.
-
This should be all related to on-chain part of the minting policy (until we have a chat with the researchers) and now there are still couple of checks that we need to do in the off-chain part, namely:
- the pub key hashes of all configured participants == the token names of PTs.
- the contestation period in the datum matches the configured one.
- the Hydra keys (parties) in the datum matches the configured keys.
-
For the pub key hashes of all configured participants == the token names of PTs - we already construct the PT tokens using preconfigured cardano keys so there is no need to check for anything additional.
-
The same goes for the contestation period in the datum matches the configured one since we are using the hydra-node contestation period param while constructing the datum.
-
Lastly, for the third check the Hydra keys (parties) in the datum matches the configured keys we don't need to touch anything since we are already doing the check.
-
These checks are all present when constructing the transaction but I noticed that we also need to ensure the same when observing the init transaction.
-
In order to check if hydra parties are matched when observing the init tx we need to pass in the list of parties as a parameter.
-
This has a rippling effect throughout the codebase so many places need to be updated.
-
After everything compiles I am noticing the tx size increase for fanout and abort as well as some other failures when running hydra-node tests.
-
Added some new cases for errors when observing init tx:
| PartiesMissmatch
| OwnPartyMissing
| CPMissmatch
...
-
After adding these explicit errors I am observing 27 failures so the updated test code seems to be doing the wrong thing.
-
Introducing `pickOtherParties` function to filter out our own party from the list of parties in tests.
- Looking into when to snapshot and how the require statements are different
- I realize that we don't have the "last seen snapshot number" just after confirming a snapshot
- When introducing `LastSeenSnapshot` to have more control over intermediate states, I encounter a nasty `ArithException` of `arithmetic underflow`. Is it caused by a `Natural` being subtracted below `0`?
- At least `naturalNegate` might throw this exception here https://hackage.haskell.org/package/ghc-bignum-1.3/docs/src/GHC.Num.Natural.html#naturalNegate .. but which Natural?
- Seems like it was the `decreaseTTL` function in `Hydra.Node` paired with not handling / continuing to wait with `ttl == 0` in `HeadLogic`!
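The underflow described above is easy to reproduce in isolation. The following is a minimal sketch, not the actual `Hydra.Node` code: the `TTL` alias and both function bodies are assumptions made for illustration. It contrasts a partial decrement with a total one built on `minusNaturalMaybe` from `base`.

```haskell
import Numeric.Natural (Natural, minusNaturalMaybe)

-- | Hypothetical stand-in for the TTL carried by a re-enqueued event.
type TTL = Natural

-- | Partial version: evaluating @unsafeDecreaseTTL 0@ throws an
-- 'ArithException' (arithmetic underflow), because 'Natural' cannot
-- represent negative numbers.
unsafeDecreaseTTL :: TTL -> TTL
unsafeDecreaseTTL ttl = ttl - 1

-- | Total version: 'Nothing' signals that the TTL is exhausted, so the
-- caller has to decide what to do instead of decrementing again.
decreaseTTL :: TTL -> Maybe TTL
decreaseTTL ttl = ttl `minusNaturalMaybe` 1
```

With the total variant, the `ttl == 0` case shows up in the types and cannot be forgotten by the code that decides whether to keep waiting.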
-
After ensemble work this morning one of the things left to do is to prevent abusing the `Burn` redeemer in our `mu_head` to actually mint tokens.
-
We need to add a check where we assert that we have only negative quantities in the tx mint field.
-
There is already a failing mutation from this morning where the message should be minting not allowed and now we actually need to write some code to exercise this.
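The intended check can be sketched in plain Haskell. This is only an illustration under assumed types: `MintEntry` is a hypothetical flattened view of the mint field, not the Plutus `Value` type, and `onlyBurns` is not the name used in the validator.

```haskell
-- | Hypothetical flattened view of a transaction's mint field:
-- (token name, quantity) pairs.
type MintEntry = (String, Integer)

-- | True when every entry has a strictly negative quantity, i.e. the
-- transaction only burns tokens. An empty mint field also passes.
onlyBurns :: [MintEntry] -> Bool
onlyBurns = all ((< 0) . snd)
```

A mutation that flips a single quantity to a positive number should make this check fail, which is what the "minting not allowed" error is meant to capture.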
- Encountered problems with the `hydra-node-static` hydra job and some of the shell derivations.
- Keeping them in to troubleshoot with Moritz. If not resolved, could strip down the `hydraJobs` to at least have the other things built on Hydra CI.
- Switch focus to other tasks for now.
- We do verify the individual signatures. Do we need to verify the aggregate as well?
- The AckSn handling function is pretty much unreadable :/
-
Continuing the refactor of HeadState into sub-types and see whether I can use that in the protocol logic TODOs
-
Make the change easy, then make the easy change
-
While refactoring, I observe that we could maybe put some type information into `Outcome` to group everything. Maybe have an `Outcome s tx`, instantiate as `Outcome InitialState tx` and then function(s) to convert `Outcome InitialState tx -> Outcome HeadState tx`?
-
Also, what if `Outcome` is a `State`-like monad which we can make more similar to the spec?
-
After refactoring, I also had a look at the snapshot emission. It’s different than in the spec, let’s try to inline the decision.
-
I realize our smoke tests are not running anymore and dev shell is broken because of the ’filepath wildcard’ problem when building `hydra-cluster` via nix.
-
This was also an error I saw in Hydra CI before.
-
Indeed we cannot enter the `exes` `devShell` locally even. The missing files are the ones from a submodule.
-
I got pointed to use a query flag on the `nix` invocation: `nix develop ".?submodules=1#exes"` does work.. but this is super annoying.
-
When using this in the smoke test workflow, it still timed out.. `waitFor` timeouts with `60` seconds are too low. Who changed them back from `600`?
-
Now with a working shell and fixed timeouts the smoke tests proceeds.. but fails on close. With a validity interval of only one slot!? How could that happen:
"timestamp":"2023-02-16T16:45:17.563229772Z","threadId":61,"namespace":"hydra-cluster","message":{"message":{"postChainTx":{"confirmedSnapshot":{"initialUTxO":{},"tag":"InitialSnapshot"},"tag":"CloseTx"},"postTxError":{"failureReason":"HardForkApplyTxErrFromEra S (S (S (S (S (Z (WrapApplyTxErr {unwrapApplyTxErr = ApplyTxError [UtxowFailure (UtxoFailure (FromAlonzoUtxoFail (OutsideValidityIntervalUTxO (ValidityInterval {invalidBefore = SJust (SlotNo 9909917), invalidHereafter = SJust (SlotNo 9909918)}) (SlotNo 9909918))))]}))))))","tag":"FailedToPostTx"},"seq":5,"tag":"PostTxOnChainFailed","timestamp":"2023-02-16T16:45:17.56124487Z"},"nodeId":1,"tag":"ReceivedMessage"}}
SN on #724
- When putting together the transactions + observations for the full life cycle test, I got quickly stuck on the slot/point in time parameters. This should be a testament that these are not intuitive right now.
- Also annoying is that we need to do a `UTCTime -> SlotNo` conversion, even in tests.
AB - #722
Working on extracting a usable hydra-api package that (Haskell) client applications could easily import and use.
-
Ideally, this API should be transaction agnostic but it's hard now as there's a few ties to Cardano API
-
Removing the `IsTx` typeclass from the API could alleviate that issue, however it's not large and even its `Tx` instance is pretty small, e.g. no more than a dozen LoCs
-
Strategy was to create the package, then move the `ClientInput` and `ServerOutput` modules, then move the constituents of those data structures to make it compile properly
-
I have kept the original `Hydra.XXX` modules in place and re-exported the `Hydra.API.XXX` modules in order to minimise the impact of the changes on the `hydra-node` package
-
This led to the hydra-api being comprised of 9 modules
Hydra.API.Chain
Hydra.API.ClientInput
Hydra.API.ContestationPeriod
Hydra.API.Crypto
Hydra.API.Ledger
Hydra.API.Network
Hydra.API.Party
Hydra.API.ServerOutput
Hydra.API.Snapshot
-
Having compiled the hydra-api and hydra-node packages, I did another pass on the other packages
-
Ideally, I wanted to get rid of the dependency on hydra-node in `hydra-tui` and `hydraw` to make it clear those were client applications that do not rely on the node's internals
  - this is slightly more involved than I had hoped for as those depend on the `IsTx` and `IsChainState` instances
  - The former could easily be solved by moving the `IsTx Tx` instance to the `hydra-api`: It already depends on `hydra-cardano-api` anyway
  - The latter is a bit more annoying, I think it would be better to remove this constraint from the API and provide some independent representation
  - The `IsChainState` constraint is only needed because it appears in one constructor of the `PostTxError` type which is exposed to the clients
  - It would probably be just fine to return some arbitrary `Text` there as there's really no way a client can be interested in this internal state representation
Trying to move the `IsTx` instance for Cardano to the API is problematic: The code depends on the Plutus `hashTxOuts` function!
- I don't think it's a good idea to tie the 2 worlds: The Plutus contracts and the actual transactions manipulated off-chain. Or at least not in that direction. Testing Plutus code with Haskell is fine, but having Haskell production code depend on Plutus code introduces coupling at compile time which requires pulling in the whole Plutus dependencies.
- Perhaps it would be better to try to go back to having different implementations that are checked with a property?
All of a sudden, while travelling through the git history of my project, nix decided it needed to rebuild 214 packages!
- Lost about 1 hour waiting for nix to compile all dependencies for an unknown reason.
Now trying to implement an off-chain `hashTxOuts` function that yields identical results to the on-chain counterpart. Luckily, we have tests!
Actually, getting back to using off-chain hash/serialisation is harder than expected. Test is failing with this example:
utxo =
[
( TxIn "0001000100000000000001000001000001010101000101000001000000000100" (TxIx 55)
, TxOut
(AddressInEra (ShelleyAddressInEra ShelleyBasedEraBabbage) (ShelleyAddress Testnet (ScriptHashObj (ScriptHash "0d94e174732ef9aae73f395ab44507bfa983d65023c11a951f0c32e4")) StakeRefNull))
(TxOutValue MultiAssetInBabbageEra (valueFromList []))
TxOutDatumNone
( ReferenceScript
ReferenceTxInsScriptsInlineDatumsInBabbageEra
( ScriptInAnyLang
(SimpleScriptLanguage SimpleScriptV2)
( SimpleScript
SimpleScriptV2
( RequireMOf
0
[ RequireAnyOf
[ RequireSignature "3542acb3a64d80c29302260d62c3b87a742ad14abf855ebc6733081e"
, RequireMOf 0 [RequireSignature "3542acb3a64d80c29302260d62c3b87a742ad14abf855ebc6733081e", RequireSignature "a646474b8f5431261506b6c273d307c7569a4eb6c96b42dd4a29520a", RequireSignature "a646474b8f5431261506b6c273d307c7569a4eb6c96b42dd4a29520a", RequireSignature "76e607db2a31c9a2c32761d2431a186a550cc321f79cd8d6a82b29b8"]
, RequireSignature "b16b56f5ec064be6ac3cab6035efae86b366cc3dc4a0d571603d70e5"
]
, RequireMOf 2 [RequireAllOf [RequireSignature "a646474b8f5431261506b6c273d307c7569a4eb6c96b42dd4a29520a", RequireSignature "b5ae663aaea8e500157bdf4baafd6f5ba0ce5759f7cd4101fc132f54", RequireSignature "76e607db2a31c9a2c32761d2431a186a550cc321f79cd8d6a82b29b8"], RequireTimeAfter TimeLocksInSimpleScriptV2 (SlotNo 41075)]
, RequireMOf 0 [RequireAllOf [], RequireTimeAfter TimeLocksInSimpleScriptV2 (SlotNo 66291)]
, RequireAnyOf []
, RequireTimeBefore TimeLocksInSimpleScriptV2 (SlotNo 90040)
]
)
)
)
)
)
]
plutus =
[ TxOut
{ txOutAddress =
Address
{ addressCredential = ScriptCredential 0 d94e174732ef9aae73f395ab44507bfa983d65023c11a951f0c32e4
, addressStakingCredential = Nothing
}
, txOutValue = Value (Map [])
, txOutDatum = NoOutputDatum
, txOutReferenceScript = Just aafa897dd5ecb9d839a504f3ce458179545f95a234805c6a9142f029
}
]
I need to investigate what we had in place before.
There's a deep problem with the hashing of UTxO: There's no reason the representations should be identical, of course, which is why we ended up relying on one hashing function usable on-chain.
-
Shame on us for not recording our thoughts on this issue before as we were able to work in ensemble quite nicely.
-
Big chunk of work is already done in ensemble, mainly off-chain work about observing the init tx but we still have quite some code to alter.
-
I am continuing by picking up work on the minting policy and making sure we don't check for burning there.
-
We already check the token burning (`checkAbort`) so there is no need to do it in the minting policy (simplification).
-
Removal of the burning checks from the minting policy was quite easy and now we need to continue with the on-chain part that needs to be discussed further with the researchers.
- We would want a 1.8.0.x or 1.9.0.0 version of the `haskell-language-server` to be able to disable the STAN plugin (it's quite noisy)
- `1.8.0.0` and `1.9.0.0` are on hackage and can be built via `haskell.nix`'s `devTools`. But `1.9.0.0` requires `ghc >= 9`
- Other `1.8.x` versions of HLS are not on hackage.
- Just depending on it as a `buildInput` is also not easy as the `flake.nix` on https://github.com/haskell/haskell-language-server/ is not exposing builds for ghc `8.10.x`
-
Investigating what improvement we could see if we hash outputs on `commitTx` directly.
-
One consequence: Cannot just deserialize the whole `TxOut` from the datum. But this should be fine as we can just lookup the UTxO from the inputs in `observeCommitTx`.
-
Cost of commit is unchanged:

| UTxO | Tx size | % max Mem | % max CPU | Min fee ₳ |
| :--- | ------: | --------: | --------: | --------: |
| 1 | 578 | 21.03 | 8.51 | 0.41 |
-
Hm.. not being able to decode it from the datum is indeed a problem. The signature of `observeCommit` would need to include the `UTxO` context of a transaction. This should be possible, but a major change in the `Hydra.Chain.Direct` signatures.
-
Let's keep the information redundant and see if it makes a difference already.
-
Quite obviously not.. `collectCom` is about the same cost (although we keep a hash + the original data).
-
Let's put the serialized output into tx metadata for now (as hack). Only the off-chain code needs to put it / read it back.
-
Hitting the metadata max length of `64` bytes.
-
With ada-only UTxOs it fits in metadata. The `collectCom` benchmark is not much different than on `master` though:

| Parties | Tx size | % max Mem | % max CPU | Min fee ₳ |
| :------ | ------: | --------: | --------: | --------: |
| 1 | 13055 | 21.78 | 8.71 | 0.96 |
| 2 | 13278 | 34.84 | 14.07 | 1.12 |
| 3 | 13719 | 53.43 | 21.75 | 1.34 |
| 4 | 14049 | 73.85 | 30.22 | 1.58 |
| 5 | 14381 | 96.25 | 39.58 | 1.85 |
-
This is not even faster.. with `adaOnly` commits, the `master` version has comparable performance:

| Parties | Tx size | % max Mem | % max CPU | Min fee ₳ |
| :------ | ------: | --------: | --------: | --------: |
| 1 | 13136 | 22.07 | 8.83 | 0.97 |
| 2 | 13489 | 36.51 | 14.75 | 1.15 |
| 3 | 13841 | 54.24 | 22.09 | 1.36 |
| 4 | 14193 | 74.79 | 30.62 | 1.60 |
| 5 | 14545 | 97.62 | 40.15 | 1.87 |
- PR is fine in general but there are some comments that need addressing.
- Before merging this we also want to make sure we don't provide a way for our users to yolo run hydra-node on mainnet and risk losing their funds.
- Addressed all of the PR comments and now this is ready for whenever we want to get it merged.
- Continuing with this since I am not convinced the corresponding Mutation works
- Changing the input datum to use arbitrary contestationDeadline produces still green mutation
- Changing the output datum to use arbitrary contestationDeadline produces still green mutation
- Changing both the input and the output datum to use arbitrary contestationDeadline produces still green mutation
- Seems like this mutation is not doing its job
- Ok, I am noticing we are using the same label for both related mutations. Introducing a new label for when we want to alter the contesters so that there is one left to contest in the output datum and the deadline should not get pushed.
- Ok, seeing green with the only contesters changed.
- Come to think of it, it makes sense. We don't need to alter the deadline since off-chain code should take care of that and we just need to make sure contesters are updated appropriately in the mutation test.
Making some progress on #705 through P.R. 706.
It's been a very useful task and it will soon be worthwhile to generalize it to the other mutations:
- helped me to really understand what the intention of each of these mutations was;
- helped fix actual issues on some mutations introduced by some kind of drift between the way the validator has evolved and the assumptions at the time when the mutations were written.
So kudos for having come up with this expected error message solution to secure these tests 💪
SN: Meeting tomorrow to share learnings about how to profile our plutus application
SN: What would be the next step for the full minting?
- Need to specify what’s to be done and not
- SB plans to write an issue specifying it
SB: Should we make renderTx available more widely?
- Let's first move it to the Hydra cardano-api in our code base
- Then submit a P.R. to the cardano-api project to include it there if they're interested
- Continuation from yesterday - I am examining where Franco left off.
- Seems like what is left to be done is to create a mutation that exercises the statement that in case a party is the last one to contest then the deadline should not get pushed further.
- Fixed a bug where we should have all contesters be current contester + all others already contested and compare length with the list of parties.
- For the mutation exposing the behavior where deadline should not get pushed I wanted to just alter the number of contesters in the output datum to be (length parties - 1) - seems that should work fine.
- One thing I am forgetting is I also need to change the input datum.
- Mutation is green but I am not convinced the work is done here.
Sharing the results about estimating complexity:
- See the above entry from today in the logbook.
- collectCom is more expensive than fanout, meaning we should always be able to fanout things that we've been able to collect.
- Might be useful to write a test to encode the findings “always can close and fanout an initial UTxO hash from collect”
Trying to profile plutus code by the book has been quite involved:
- See the above entry from today in the logbook.
- See also AB’s logbook notes from his past experiment.
- Current way of doing things: run the bench, make a change, run the bench and compare
Discussions about the change in deadline handling for contest:
- Implement as described in the spec and see later for possible optimizations
Issue #713 has been drafted, need feedback from everyone to improve it.
- Time to continue where I left off. What I know is that the deadline should get pushed forward on each contest except if it is the last contest to be made. Examining the code...
- Ok it looks like we should add contestation period to the datum (`Closed` state) so that we can push the contestation deadline further upon each contest.
- If the peer contesting is the last one then the deadline should stay the same.
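The deadline rule described above can be written down as a small pure function. This is a sketch under simplifying assumptions: `Party`, `Time` and `nextDeadline` are hypothetical names, times are plain integers, and the real check lives in the on-chain validator and off-chain transaction construction.

```haskell
-- | Hypothetical simplified types for illustration only.
type Party = String
type Time = Integer

-- | Compute the contestation deadline after a contest.
-- The deadline is pushed by the contestation period, unless the current
-- contester plus all previous contesters cover every party in the head.
nextDeadline :: [Party] -> [Party] -> Party -> Time -> Time -> Time
nextDeadline parties alreadyContested contester period deadline
  | length (contester : alreadyContested) == length parties = deadline
  | otherwise = deadline + period
```

For three parties, the first and second contest each push the deadline by one period, while the third contest leaves it unchanged.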
- After fixing the options parser for hydra-node and writing some tests for it I am noticing a related FIXME in the v_commit.
- Seems easy to tackle so I just added a param to `deserialiseCommit` to accept the `NetworkId` as the argument and have a helper function `networkIdToNetwork` to convert between networks in cardano and ledger api.
- Continuing I realize the hydra-tui options parser needs to change too so let's tackle that one next.
- This was easy since I already have a parser to do the same thing (DRY?).
- Profiling validators on a transaction using these instructions. First: isolate a single transaction.
- Using the StateSpec tests we can run only a single passing `evaluateTx` with: `main --match "Direct.State/collectCom/validates" --qc-max-success=1`. However, this does not give us much control about the number of parties or UTxO used. Likely better to use the `tx-cost` benchmark instead.
- In tx-cost we need to modify the code because the parameters are not exposed, but we can easily limit the `computeCollectComCost` to a value we care about and start investigating, let’s say `4` parties.
- Nicer output via `putTextLn =<< costOfCollectCom`
- Enable profiling in `vHead` script using `{-# OPTIONS_GHC -fplugin-opt PlutusTx.Plugin:profile-all #-}`
- Now we need to “Acquiring an executable script”. Let’s see if we can extract it from the evaluated `Tx`
  - Profiling made the tx bigger, so we need to disable the `guard $ txSize < maxTxSize`
  - Scripts are in the witnesses of the tx, we cannot get hold of them via `cardano-api`’s standard `Tx` constructor/pattern. But we need to “dive down” to the ledger types.
  - Just accessing the scripts, datums and redeemers maybe is too cumbersome. Can we get a hold of the applied script / re-use ledger code to get that applied script?
  - In fact, we want to do the same as the `evaluateTransactionExecutionUnits` function does. So vendoring / splitting off the preparation is what we need.
    - We would need to redo the ledger code in our code.. including resolving redeemers etc.
  - Found function `collectTwoPhaseScriptInputs` in ledger.. seems like it gives us exactly what we need!
- Before continuing `prepareTxScripts`, let’s find out what we actually need as return type.
  - Seems like we need applied scripts serialized in “flat” encoding.
  - Applying arguments to a script to get a closed term is done using `mkTermToEvaluate` in the `plutus-ledger-api`
  - This function is not exported?
  - Only in the latest.. in our older `plutus-ledger-api`, the functions were actually exposed! We should make that stay that way. 🍀
- Running `evaluateScriptCounting` to verify script execution should work before tying applying
- Worked flawlessly, now I have flat-encoded fully applied UPLC programs with profiling information and can store them on disk
- Using `uplc evaluate -i plutus.5.flat --if flat-namedDeBruijn --trace-mode LogsWithBudgets -t -o logs` we get a first overview of the budget and tallied primitives:
CPU budget: 2456848390
Memory budget: 7385623
Const 89056000 387200
Var 324714000 1411800
LamAbs 271377000 1179900
Apply 511474000 2223800
Delay 81995000 356500
Force 249274000 1083800
Builtin 133492000 580400
startup 100 100
compute 1661382000 7223400
AST nodes 13105
BuiltinApp 795466290 162123
Time spent executing builtins: 32.38%
FstPair 18017664 7168
UnBData 4245920 4352
UnListData 322470 320
HeadList 14834407 10976
TailList 19808542 15392
Sha2_256 1812896 4
UnMapData 1034478 864
ChooseData 2500736 4096
IfThenElse 31094616 386
LessThanEqualsByteString 1977690 10
AppendByteString 55961 91
UnIData 1907708 1408
EqualsByteString 35793984 163
SndPair 35833227 13344
AddInteger 4129540 40
Trace 514292324 77504
EqualsInteger 44502729 213
UnConstrData 17884712 17504
ChooseList 45416686 8288
Total budget spent: 2456848390 7385623
Predicted execution time: 2.457 ms
-
Following the instructions requires `uplc` and `traceToStacks` from plutus and `flamegraph.pl`. We can get them using `nix`: `nix shell nixpkgs#flamegraph github:input-output-hk/plutus#x86_64-linux.plutus.library.plutus-project-924.hsPkgs.plutus-core.components.exes.traceToStacks github:input-output-hk/plutus#x86_64-linux.plutus.library.plutus-project-924.hsPkgs.plutus-core.components.exes.uplc`
-
The end result - flamegraphs on memory and cpu usage of the Head validator on a `collectCom` transaction 🎉
CPU: (flamegraph image)
Memory: (flamegraph image)
- To determine if serializing the `Value` and comparing is the most performant thing to do I came up with an alternative implementation and wanted to measure the benefits.
- Implementation:
(===) :: Value -> Value -> Bool
(===) val val' =
and $ foldr' currencyExists [] (flattenValue val)
where
currencyExists (cs,tn,i) res =
(valueOf val' cs tn == i) :res
- Note that here I am using foldr' but I actually switched between foldr, foldr' and foldl' to compare the results.
- Perhaps I should not have used `flattenValue` and `valueOf` that come from `PlutusTx`, to detect if their implementations might be slow?
- Results:
foldr results :
"headScriptSize": 9945
## Cost of Close Transaction
| Parties | Tx size | % max Mem | % max CPU | Min fee ₳ |
| :------ | ------: | --------: | --------: | --------: |
| 1| 10543 | 16.33 | 6.43 | 0.79 |
| 2| 10744 | 18.65 | 7.44 | 0.83 |
| 3| 10912 | 21.08 | 8.50 | 0.86 |
| 5| 11302 | 26.37 | 10.79 | 0.94 |
| 10| 12091 | 36.03 | 15.16 | 1.08 |
| 36| 14052 | 97.61 | 36.89 | 1.82 |
## Cost of Contest Transaction
| Parties | Tx size | % max Mem | % max CPU | Min fee ₳ |
| :------ | ------: | --------: | --------: | --------: |
| 1| 10569 | 16.95 | 6.64 | 0.80 |
| 2| 10735 | 18.80 | 7.48 | 0.83 |
| 3| 10958 | 22.45 | 9.00 | 0.88 |
| 5| 11289 | 26.68 | 10.89 | 0.94 |
| 10| 12046 | 35.05 | 14.77 | 1.07 |
| 32| 15793 | 99.06 | 42.36 | 1.95 |
=========================================================
foldl' results:
"headScriptSize": 9959
## Cost of Close Transaction
| Parties | Tx size | % max Mem | % max CPU | Min fee ₳ |
| :------ | ------: | --------: | --------: | --------: |
| 1| 10556 | 16.23 | 6.40 | 0.79 |
| 2| 10721 | 17.75 | 7.11 | 0.82 |
| 3| 10953 | 21.49 | 8.68 | 0.87 |
| 5| 11212 | 23.76 | 9.81 | 0.91 |
| 10| 11443 | 30.86 | 11.69 | 0.99 |
| 32| 15785 | 97.12 | 41.71 | 1.93 |
## Cost of Contest Transaction
| Parties | Tx size | % max Mem | % max CPU | Min fee ₳ |
| :------ | ------: | --------: | --------: | --------: |
| 1| 10616 | 17.60 | 6.89 | 0.81 |
| 2| 10749 | 18.37 | 7.32 | 0.82 |
| 3| 10946 | 20.98 | 8.45 | 0.86 |
| 5| 11279 | 25.13 | 10.31 | 0.92 |
| 10| 12106 | 35.68 | 15.03 | 1.08 |
| 32| 15796 | 95.69 | 41.12 | 1.91 |
============================================================
foldr' results :
"headScriptSize": 9965,
## Cost of Close Transaction
| Parties | Tx size | % max Mem | % max CPU | Min fee ₳ |
| :------ | ------: | --------: | --------: | --------: |
| 1| 10558 | 16.24 | 6.40 | 0.79 |
| 2| 10731 | 18.06 | 7.23 | 0.82 |
| 3| 10926 | 20.55 | 8.31 | 0.86 |
| 5| 11292 | 25.57 | 10.51 | 0.93 |
| 10| 12114 | 35.83 | 15.11 | 1.08 |
| 32| 15789 | 98.32 | 42.19 | 1.94 |
## Cost of Contest Transaction
| Parties | Tx size | % max Mem | % max CPU | Min fee ₳ |
| :------ | ------: | --------: | --------: | --------: |
| 1| 10590 | 16.56 | 6.50 | 0.80 |
| 2| 10755 | 18.69 | 7.44 | 0.83 |
| 3| 10954 | 21.47 | 8.64 | 0.87 |
| 5| 11353 | 27.71 | 11.30 | 0.96 |
| 10| 12135 | 37.21 | 15.62 | 1.10 |
| 32| 15809 | 98.22 | 42.12 | 1.94 |
=========================================================
serialize data results:
"headScriptSize": 9722,
## Cost of Close Transaction
| Parties | Tx size | % max Mem | % max CPU | Min fee ₳ |
| :------ | ------: | --------: | --------: | --------: |
| 1| 10319 | 15.62 | 6.66 | 0.78 |
| 2| 10521 | 17.82 | 7.99 | 0.81 |
| 3| 10653 | 18.60 | 8.27 | 0.83 |
| 5| 11047 | 22.32 | 10.51 | 0.89 |
| 10| 11873 | 30.17 | 14.86 | 1.02 |
| 49| 15116 | 68.12 | 31.23 | 1.59 |
## Cost of Contest Transaction
| Parties | Tx size | % max Mem | % max CPU | Min fee ₳ |
| :------ | ------: | --------: | --------: | --------: |
| 1| 10347 | 16.24 | 6.87 | 0.79 |
| 2| 10512 | 17.73 | 7.68 | 0.81 |
| 3| 10711 | 19.79 | 8.88 | 0.85 |
| 5| 11041 | 22.76 | 10.49 | 0.90 |
| 10| 11864 | 29.85 | 14.38 | 1.02 |
| 36| 16223 | 70.03 | 36.42 | 1.69 |
- The results clearly show that using folds is much more memory intensive and when serialising the data we are hitting close to max transaction size.
- Maybe there is an alternative implementation of walking through the `Value` map items I could come up with?
Morning's finding: Equality check (== from PlutusTx.Eq class) on values is expensive. This is due to the way values are represented in PlutusTx:
newtype Value = Value { getValue :: Map.Map CurrencySymbol (Map.Map TokenName Integer) }
These map types here are not actually maps!
They are associative lists and when we compare them, the comparison function makes values with the same contents equal even if their contents are shuffled. E.g. value with two nfts: [(pid1, [(t1, 1)]), (pid2, [(t2, 1)])]
is equal to [(pid2, [(t2, 1)]), (pid1, [(t1, 1)])]
This is very expensive, especially on the memory budget as the maps are copied and unioned etc. Doing a more strict way of equality would be fine for us! I.e. simply comparing the exact contents with no re-shuffling allowed.
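A minimal sketch of the difference between the two notions of equality, assuming simplified stand-in types (plain strings and association lists instead of the real PlutusTx `Value`). The names `looseEq` and `strictEq` are hypothetical and only mimic the semantics described above, not the PlutusTx implementation:

```haskell
import Data.List (sort)

-- Simplified stand-ins for the PlutusTx types (assumption: strings and lists).
type CurrencySymbol = String
type TokenName = String
type Value = [(CurrencySymbol, [(TokenName, Integer)])]

-- Order-insensitive equality: both sides are normalised (sorted) first.
-- This costs extra work and allocations on every comparison.
looseEq :: Value -> Value -> Bool
looseEq a b = normalise a == normalise b
 where
  normalise = sort . map (fmap sort)

-- Strict equality: same entries in the same order, one linear pass.
strictEq :: Value -> Value -> Bool
strictEq = (==)

-- The two-NFT example from the text, in both orderings.
v1, v2 :: Value
v1 = [("pid1", [("t1", 1)]), ("pid2", [("t2", 1)])]
v2 = [("pid2", [("t2", 1)]), ("pid1", [("t1", 1)])]
```

Here `looseEq v1 v2` holds while `strictEq v1 v2` does not, which is acceptable whenever the validator can expect the ordering to be preserved.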
-
Start with fanout on complexity estimation. Add the token burn check to see if number of parties are relevant.
-
Added serialized UTxO size to see how other generated UTxOs will affect the tx cost. With ada-only outputs the previous results are comparable (of course):
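
| Parties | UTxO | UTxO (bytes) | Tx size | % max Mem | % max CPU | Min fee ₳ |
| :------ | ---: | -----------: | ------: | --------: | --------: | --------: |
| 5 | 1 | 57 | 14755 | 12.93 | 5.47 | 0.94 |
| 5 | 5 | 285 | 14897 | 18.98 | 8.97 | 1.02 |
| 5 | 10 | 570 | 15078 | 26.35 | 13.27 | 1.12 |
| 5 | 20 | 1139 | 15436 | 41.60 | 22.07 | 1.32 |
| 5 | 46 | 2620 | 16374 | 81.26 | 44.99 | 1.85 |

-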
When using a more arbitrary generator which also produces `TxOut` with reference scripts, the number of outputs becomes less important and the number of bytes is indeed a better metric.
-
Also, the fanout transaction is size-bound. Do we see an effect if "big outputs" have a bigger cost or is it really proportional to the serialized size of the outputs, no matter their type?
-
Try to find an upper bound with `suchThat` and `error`
-
Rebased on all reference scripts to have a lower baseline on the tx size
-
Using linear increases of UTxO size (via ada-only outputs), we can see that both CPU and memory costs are proportional to the UTxO size: on top of a base offset of about `13%`, we pay about `15%` per `570` bytes (10 ada-only UTxOs) in memory, and roughly `9%` per `570` bytes in CPU with a `5.5%` baseline.

| Parties | UTxO | UTxO (bytes) | Tx size | % max Mem | % max CPU | Min fee ₳ |
| :------ | ---: | -----------: | ------: | --------: | --------: | --------: |
| 5 | 0 | 0 | 5071 | 13.36 | 5.44 | 0.52 |
| 5 | 1 | 57 | 5110 | 14.04 | 5.98 | 0.53 |
| 5 | 5 | 284 | 5253 | 20.14 | 9.50 | 0.61 |
| 5 | 10 | 570 | 5435 | 28.35 | 14.14 | 0.72 |
| 5 | 20 | 1137 | 5792 | 43.30 | 22.82 | 0.92 |
| 5 | 30 | 1706 | 6153 | 59.20 | 31.90 | 1.13 |
| 5 | 40 | 2276 | 6513 | 73.76 | 40.43 | 1.32 |
| 5 | 50 | 2845 | 6872 | 88.77 | 49.15 | 1.52 |
| 5 | 57 | 3248 | 7127 | 99.75 | 55.44 | 1.67 |

-
Interestingly.. it is not the bytes. When using more arbitrary (babbage) tx outs, the maximum memory budget is reached at around `10kB` with `14` outputs or less.
-
Looking at `collectCom`: after using non-trivial UTxOs to commit (more bytes), it's evident that collectCom scales badly with the number of parties, but okayish with how much was committed:

| Parties | UTxO (bytes) | Tx size | % max Mem | % max CPU | Min fee ₳ |
| :------ | -----------: | ------: | --------: | --------: | --------: |
| 1 | 1466 | 3429 | 40.24 | 16.34 | 0.74 |
| 1 | 252 | 1171 | 27.41 | 11.09 | 0.50 |
| 1 | 103 | 852 | 25.30 | 10.23 | 0.47 |
| 1 | 1054 | 2789 | 34.72 | 14.12 | 0.66 |
| 2 | 2400 | 5675 | 78.10 | 31.84 | 1.26 |
| 2 | 1344 | 3620 | 65.15 | 26.50 | 1.03 |
| 2 | 885 | 2721 | 55.87 | 22.74 | 0.89 |
| 2 | 1217 | 3388 | 59.41 | 24.22 | 0.95 |
| 3 | 2045 | 5001 | 95.97 | 39.32 | 1.43 |

-
Committing more than one UTxO per party is not quickly done (changes in off-chain tx construction required)
-
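Using complex UTxOs, we can be quite certain that the number of collected UTxOs can always be fanned out. `collectCom` just scales so badly right now:

## Cost of CollectCom Transaction
| Parties | UTxO (bytes) | Tx size | % max Mem | % max CPU | Min fee ₳ |
| :------ | -----------: | ------: | --------: | --------: | --------: |
| 1 | 783 | 2273 | 31.79 | 12.91 | 0.60 |
| 1 | 1288 | 3253 | 40.81 | 16.55 | 0.74 |
| 1 | 1278 | 3270 | 37.85 | 15.40 | 0.71 |
| 1 | 454 | 1551 | 29.40 | 11.91 | 0.54 |
| 2 | 895 | 2677 | 53.23 | 21.70 | 0.85 |
| 2 | 1617 | 4140 | 63.71 | 26.00 | 1.03 |
| 2 | 1633 | 3975 | 65.98 | 26.90 | 1.05 |
| 2 | 1614 | 4225 | 67.81 | 27.61 | 1.08 |

## Cost of FanOut Transaction
| Parties | UTxO | UTxO (bytes) | Tx size | % max Mem | % max CPU | Min fee ₳ |
| :------ | ---: | -----------: | ------: | --------: | --------: | --------: |
| 3 | 0 | 0 | 4942 | 11.48 | 4.70 | 0.49 |
| 3 | 1 | 240 | 5161 | 14.33 | 6.47 | 0.54 |
| 3 | 5 | 5495 | 11145 | 51.80 | 31.27 | 1.28 |
| 3 | 10 | 9576 | 14677 | 84.14 | 53.10 | 1.86 |
| 3 | 17 | 9837 | 14915 | 90.06 | 57.44 | 1.95 |

-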
Using ada-only outputs, this is not as evident as we cannot test substantial sizes of UTxOs (a single ada-only TxOut has 57 bytes serialized).
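The proportionality noted earlier for ada-only fanout outputs (about 13% base plus 15% per 570 bytes in memory, 5.5% base plus 9% per 570 bytes in CPU) can be turned into a rough estimate of where the budget runs out. This is only a back-of-the-envelope sketch: the function names are hypothetical and the coefficients are the rounded figures from the notes, not fitted values.

```haskell
-- Rough linear estimates, in percent of the maximum execution budget,
-- as a function of the serialized UTxO size in bytes (ada-only outputs).
memPercent :: Double -> Double
memPercent bytes = 13 + 15 * bytes / 570

cpuPercent :: Double -> Double
cpuPercent bytes = 5.5 + 9 * bytes / 570

-- Largest serialized UTxO size for which both estimates stay within 100%.
maxUTxOBytes :: Double
maxUTxOBytes = min (solve 13 15) (solve 5.5 9)
 where
  -- Solve base + slope * bytes / 570 = 100 for bytes.
  solve base slope = (100 - base) * 570 / slope
```

With these coefficients memory is the binding constraint and the estimate lands at roughly 3300 bytes, in line with the 3248 bytes measured at 99.75% memory.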
- As the morning findings revealed, comparing `Value`s is very expensive.
- Also, when trying to preserve v_head values, we know the order of the currency ids and tokens should be the same between head input and output.
- I introduced the `(===!)` function in the `Util` module to do this so that we can call it from multiple places (2 for now)
- That gives us leverage we can use to just compare the serialized representation of `Value` instead of using `(==)` from PlutusTx
- Looks like the close transaction works now with 46 parties and contest can handle 37 (in this test run)
- Also I bumped the number of outputs for the fanout transaction to 45
- I will do one more PR that will introduce a custom `ScriptContext` with only the values we actually use (and have them be `BuiltinData`) if we can get away with not decoding them back to Plutus values.
-
Ok, time to tackle another gap: deadline handling changed between the paper and the V1 spec -> the deadline gets "pushed out" on each contest. This makes the contestation period easier to determine and it scales with the number of parties.
-
In the specification this is mentioned in section 5.6 Contest Transaction, point no. 6:
Contestation deadline is updated correctly

$$
T'_{final} = \begin{cases} T_{final} & \text{if } |C'| = n, \\ T_{final} + T & \text{otherwise} \end{cases}
$$
- I start by examining the `checkContest` state transition to inspect the code and see what we are currently doing with the deadline.
- Ok, it looks like contestationDeadline is part of the datum and we need to make sure that it gets pushed forward each time somebody contests BUT only if there are parties left to contest. If we are the last one contesting, the value should stay the same.
- Now that we are familiar with what we need to do, let's start with the mutation test for this case.
- In order to write a successful mutation we need to alter the contestation deadline but also take into account the number of contested parties. If our party is the last to contest then the deadline should not get pushed further.
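The deadline rule from the spec can be sketched as a small pure function. This is an illustration only, assuming hypothetical stand-in types (`Party` as a string, `Time` as a plain integer) and a hypothetical name `pushDeadline`; it is not the on-chain `checkContest` code:

```haskell
-- Hypothetical stand-ins for the real on-chain types.
type Party = String
type Time = Integer

-- New contestation deadline after a contest:
--   the deadline stays as is when every party has contested (|C'| = n),
--   otherwise it is pushed out by one contestation period T.
pushDeadline
  :: Int     -- n, the number of parties in the head
  -> Time    -- T, the contestation period
  -> Time    -- the current deadline
  -> [Party] -- C', the contesters including the current one
  -> Time
pushDeadline n period deadline contesters'
  | length contesters' == n = deadline
  | otherwise = deadline + period
```

A mutation for this check would then produce a deadline different from `pushDeadline n period deadline contesters'`, for example pushing it when the last party contests.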
- To get mainnet ready, one of the first tasks I wanted to do is command line options parsing.
- Currently we only allow hydra-node to accept the `--network-id` flag, which captures the `NetworkMagic` number for `Testnet`.
- My idea was to not break backwards compatibility and add another string option (the string "Mainnet", "mainnet" or "m") and alter the parser to accept those too in case we want to use the Mainnet network.
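A minimal sketch of that backwards-compatible parsing idea, written as a pure function rather than an actual command line parser. The `NetworkId` type below is a local stand-in and `parseNetworkId` is a hypothetical name; this is not the hydra-node option parser:

```haskell
import Data.Char (toLower)
import Data.Word (Word32)
import Text.Read (readMaybe)

-- Local stand-in for the network identifier (assumption for illustration).
data NetworkId = Mainnet | Testnet Word32
  deriving (Eq, Show)

-- Accept "Mainnet", "mainnet" or "m" for mainnet, and keep accepting
-- a plain number as the testnet magic like before.
parseNetworkId :: String -> Either String NetworkId
parseNetworkId s
  | map toLower s `elem` ["mainnet", "m"] = Right Mainnet
  | otherwise =
      case readMaybe s of
        Just magic -> Right (Testnet magic)
        Nothing -> Left ("cannot parse network id: " <> s)
```

Existing invocations such as `--network-id 42` would keep working, while `--network-id mainnet` selects mainnet.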
What shall we do about the logging changes?
- Original problem was not seeing a message when shutting down nodes (#690)
- Removed a queue without seeing the purpose but then introduced another regression with some logs on the same line
- Next proposal is to just use a mutex to protect access to stdout, but wasn't it better to use communication as a means of synchronization?
- Did it avoid blocking in the writing / producing thread?
- Shall we just use a proper framework to handle log?
- Meet at 16h30 to agree on a course of action
-
When adding the script registry and making vHead referenced, I encountered failing mutation tests.. without changing the script code or anything else in the off-chain construction. Also, it is quite deterministically failing?
-
In fact, the mutated transaction should have failed but it seemingly passes??
SomeMutation {expectedError = Just "invalid snapshot signature", label = MutateSignatureButNotSnapshotNumber, mutation = ChangeHeadRedeemer (Close {signature = ["\192\221\NAK\SO\USm\135\198\227\ETX\252/>\201*\136\163\175\156\RSC~\215\206\142S\183\132\187D\GSaV\193\ENQ\169q\187~2\215K~\244\136\DC4\161{N.\243e\244|A\245\STX#YC\148\RS\143\f","[\SI\232\204\135wGEQ\237\ETX>HTfXTq4U\239\202\&4\212\&8\248\DC4\213\244}\231\213+\203\225\SUB\144\&6m\159|*\245\RS\156t`\139e$\240FNc\190\245S\143\SYN\213\a\133\196\r"]})} Phase-2 validation should have failed with error message: "invalid snapshot signature" Redeemer report: fromList [(ScriptWitnessIndexTxIn 0,Right (ExecutionUnits {executionSteps = 755532047, executionMemory = 2596341}))]
-
Maybe the `ChangeHeadRedeemer` mutation is not working as we expect?
-
Yes.. the `headOutputIndices` of `ChangeHeadRedeemer` is completely broken!? It seemingly resolves redeemer pointers by index into the UTxO set, where it should resolve them in the inputs of the tx.
-
Makes sense now: it only tripped when we had more than one UTxO to spend from in the mutation tests -> the referenced outputs from the script registry now.
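To illustrate the bug: a spending redeemer points at an input by its position in the sorted inputs of the transaction, so looking the position up in the whole UTxO set only gives the same answer when nothing else is in that set. The sketch below uses a hypothetical `TxIn` stand-in and hypothetical function names, not the actual mutation framework code:

```haskell
import Data.List (elemIndex, sort)

-- Hypothetical stand-in: (transaction id, output index).
type TxIn = (String, Int)

-- Correct: position of the input among the sorted inputs of the tx.
redeemerIndexInInputs :: [TxIn] -> TxIn -> Maybe Int
redeemerIndexInInputs txInputs txIn = elemIndex txIn (sort txInputs)

-- Broken: position among all known UTxO entries, including outputs
-- that are only referenced (e.g. from the script registry).
redeemerIndexInUTxO :: [TxIn] -> TxIn -> Maybe Int
redeemerIndexInUTxO utxo txIn = elemIndex txIn (sort utxo)

-- Example: one referenced registry output that sorts before the head input.
registryRef, headInput :: TxIn
registryRef = ("aa", 0)
headInput = ("bb", 0)
```

With `[headInput]` as the only spent input the correct index is 0, whereas the lookup in `[registryRef, headInput]` yields 1, so the mutation would change the wrong redeemer (or none at all).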
- Re-investigating the HLS problems with `lsp-log-io` to see the request/responses received. They seem not too bad to mpj at least.
- Before wanting to open an issue on `lsp-mode`, I bumped the package again to the latest commit: With doom emacs I would need to change the original `(package! lsp-mode :pin ...)` or override it in my custom `packages.el` with `(package! lsp-mode :pin "06c28ca780dedea32e150462f1b9aa9c910c2952442fc51a2db09e503ca3b39f3d")`. Also don’t forget `doom sync -u` after that to have the package cloned and installed.
- And it turns out.. it’s better now and the delay is gone!?
- Opened a PR on doom emacs with these bumped packages (feel free to use my fork until merged): https://github.com/doomemacs/doomemacs/pull/7070
- Start work on aligning off-chain code with spec/gdoc
- Working on HeadLogic is very hard with the many arguments. Let’s try dedicated types.
- Subtyping the `HeadState` into individual types like `IdleState` feels good. But generic handling of `chainState` and `previousRecoverableState` feels odd.
- There should be a check in some head state transitions that prevents arbitrary minting or burning.
- The check seems simple: just use the `txInfoMint` field to determine if the `Value` is empty.
- The docs for `txInfoMint` say: `txInfoMint :: Value` The Value minted by this transaction. So it looks like if I just use `getValue :: Map CurrencySymbol (Map TokenName Integer)` and check if the corresponding `Map` is empty this check would work -> `Map.empty == getValue (txInfoMint (scriptContextTxInfo context))`.
- What we experience is that the check fails for some reason, which is puzzling! If the tx is not minting/burning anything we should get the empty `Map` here, right?
- Altering the check to do a lookup of the exact `CurrencySymbol` we are interested in works. Hmm...
- Just for the experiment I tried looking up `adaSymbol` instead of our head currency symbol and was confused that this produces an invalid tx in our mutation tests and a `PT5` error in the validator. Wat?
- In order to figure out what is going on we need to take a look at how the ledger tx is being translated to the Plutus `TxInfo` and why the `adaSymbol` is actually present in the `Value` that comes out of `txInfoMint`.
- In the ledger code we see this line `, PV1.txInfoMint = transMultiAsset (txBody ^. mintTxBodyL)` so we are using `transMultiAsset` to translate the minted values to the Plutus representation of the minted `Value`, and the implementation is folding using `PV1.singleton PV1.adaSymbol PV1.adaToken n` as the default value! So this means you always get the ADA currency symbol as a key with value 0 and this is what caused the unexpected behavior for that validator check. (Not sure about the `PT5` error tho)
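The effect can be reproduced off-chain with plain `Data.Map`. The sketch below only mimics the behaviour described above under simplified assumptions (strings for symbols, a hypothetical `translateMint` instead of the ledger's `transMultiAsset`, and a hypothetical `noMintOrBurn` check); it is not the validator code:

```haskell
import qualified Data.Map as Map

-- Simplified stand-ins for the Plutus types (assumption: plain strings).
type CurrencySymbol = String
type TokenName = String
type Value = Map.Map CurrencySymbol (Map.Map TokenName Integer)

adaSymbol :: CurrencySymbol
adaSymbol = ""

-- Mimics the translation: minted assets are folded onto a value that
-- already contains a zero ada entry.
translateMint :: [(CurrencySymbol, TokenName, Integer)] -> Value
translateMint = foldr insertAsset (Map.singleton adaSymbol (Map.singleton "" 0))
 where
  insertAsset (cs, tn, n) = Map.insertWith (Map.unionWith (+)) cs (Map.singleton tn n)

-- Naive check: never true, because the ada entry is always present.
naiveNoMint :: Value -> Bool
naiveNoMint = Map.null

-- Check that ignores zero quantities instead of relying on emptiness.
noMintOrBurn :: Value -> Bool
noMintOrBurn = all (all (== 0))
```

Even for a transaction that mints nothing, `naiveNoMint (translateMint [])` is `False`, while `noMintOrBurn` accepts it and still rejects any non-zero minted or burnt quantity.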
-
It is very useful to debug what is happening on-chain. As useful as it is, we lack the proper tools to do this since you can't trace `show` values in PlutusTx; only some limited tracing functionality is possible (traceIfFalse, traceError).
-
In order to see trace messages and values we get in the validator context we can patch cardano-ledger code locally.
-
Steps:
- Make sure to reset the local cardano-ledger clone to whatever version your project is using.
- Find out which part of the ledger code you want to add the trace to and add it.
- In your local project `cabal.project` file, in the packages section, specify the absolute path of the local clone instead of relying on the published cardano-ledger version, like so:

      packages:
        hydra-cluster
        hydra-node
        hydra-plutus
        hydra-prelude
        hydra-test-utils
        hydra-tui
        hydra-cardano-api
        plutus-cbor
        plutus-merkle-tree
        hydraw
        ../cardano-ledger/eras/impl/babbage

- Run your tests and after some recompilation you should see your traces.
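For the second step, the trace itself can be as simple as wrapping the value of interest with `traceShow` from `Debug.Trace`. The function below is a hypothetical stand-in for whatever ledger function is being inspected, not actual cardano-ledger code:

```haskell
import Debug.Trace (traceShow)

-- Hypothetical stand-in for a ledger translation function under inspection.
translateAssets :: [(String, Integer)] -> [(String, Integer)]
translateAssets assets =
  let result = ("", 0) : assets
   in -- Prints the labelled value to stderr when the result is demanded,
      -- then returns the result unchanged.
      traceShow ("translateAssets result", result) result
```

Since the trace fires lazily, it only shows up for code paths that are actually evaluated by the failing test.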
-
-
Continuing with the PR, now there is a missing check in the v_commit
-
Added a `mustNotMintOrBurn` to the commit validator and an appropriate mutation, but the test case is still red.
-
Looks like the check should be part of the v_initial since that is where we check participant commits and this script is referenced in the commit tx.
-
Bam it worked! I need to spend some time studying v_initial piece of code.
-
All that is left is to generate better `Value`s for the tests so that we can mimic minting/burning of arbitrary and hydra tokens.
- Practical aspects of creating a publicly available specification -> #448
- How about CIP? Would it be relevant to this work to publish as a CIP? :question_mark:
- ADR vs. feature (idea) issues
- Comes from the fact there are bigger open points from discussion w/ researchers about the protocol itself, not only gaps to be closed
- How do we materialise the work to be done and the decisions taken on these items?
- Write it down as new feature requests, possibly with different "polish" level (eg. hardcoding limits of head size vs. implementing incremental fanout)
- Should we write ADRs or Feature Ideas?
- Creating an ADR to document a decision is in scope, part of the thing we end up doing as part of each "extension"
- Spec is more abstract vs. ADR being more concrete/decision for this implementation
- ➡️ Draft features (SN)
- Don't forget to populate logbook entries for:
- Values (mint) are weird on-chain
- How to patch / debug ledger
- Information like this is valuable to be recorded in the logbook for future selves or other contributors
- We (AB) would like to experiment with building a Hydra-api package/tool/spec, there's already a discussion about this https://github.com/input-output-hk/hydra/discussions/432
- ➡️ Outcome: a clear understanding in the form of a “feature” on our roadmap
- Also ➡️ Do the same for explorer -> prepare a feature on the roadmap
- About the review process
- Rule: If a PR is in review, it needs to be reviewed, if it's in draft it needs not
- We want to avoid noise from GH's notifications
- What's the best way to communicate about PRs?
- ❓ Automate the “request review”? Github CODEOWNERS would have that effect.
‼️ Convert PRs in review to draft when they need rework
- We had 2 more heads initialised!
- PG: What kind of regression is there in #677?
- The issue is that changes to the close tx lead to removal of checks ensuring the pid and parties stay constant
- We want a specific property mutation exposing that issue
- SN: What to do with the pictures in the spec?
- We need to be consistent with mathematical notation
- The figures are not consistent with the traces we use in Miro but this ship has sailed!
- Hand-written is fine for now, but Miro could fit the bill https://community.miro.com/inspiration-and-connection-67/latex-and-math-editor-tool-8698 (TikZ is annoying...)
- AB
- Please take a look at the discussion I started on github
- Known experiments were in Haskell and had chosen to depend on the full hydra-node
- FT
- Does it make sense to add an off-chain logic to prevent users from posting ContestTx twice? If so, how? HeadLogic is missing a match case for this scenario.
- Off-chain will automatically contest and the current implementation should not try to contest more than once, but this is not tested
- It might be involved to create an explicit test for that?
- Will take 30' together to explore that this afternoon.
I'm having problems with my dev vm. Almost no lsp features are working and I can't figure out what happened. I need to understand how validity intervals work but I can't access any hoogle documentation. I've tried this setup for several weeks without observing more efficiency from using doom emacs than I can get with vscode.
I go back to trying to work on my laptop again, although it has been problematic in the past:
- check-out the code
- run `nix develop` and... see tomorrow if it has been able to build everything ;)
In the meantime, I make some progress on fixing a suspicious mutation test about validity bounds. There is a problem with the way we test validity bounds mutations. Most of the arbitrarily generated values will make the validator fail, but there is no reason why we could not generate a valid transaction with what we had.
To fix that, we distinguish between 3 situations:
- The lower bound is infinite
- The upper bound is infinite
- The upper bound is too high given the contestation deadline
Note: case 3. is not yet implemented, we need to do it and remove the corresponding FIXME in the code.
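The three situations can be sketched as a classification of a validity interval. Everything here is a simplified assumption: the `Bound` type, the names, and in particular `maxUpper`, which stands in for whatever limit on the upper bound follows from the contestation deadline:

```haskell
-- Hypothetical stand-in for a validity bound (slot number or unbounded).
data Bound = Infinite | Finite Integer
  deriving (Eq, Show)

-- The three situations in which the validator is expected to fail.
data Rejection
  = LowerBoundInfinite
  | UpperBoundInfinite
  | UpperBoundTooHigh
  deriving (Eq, Show)

-- Classify an interval; Nothing means the interval could still be valid,
-- so it must not be used as a mutation.
classify :: Integer -> (Bound, Bound) -> Maybe Rejection
classify _ (Infinite, _) = Just LowerBoundInfinite
classify _ (_, Infinite) = Just UpperBoundInfinite
classify maxUpper (_, Finite hi)
  | hi > maxUpper = Just UpperBoundTooHigh
  | otherwise = Nothing
```

A mutation generator would then keep only the intervals for which `classify` returns a `Just`, instead of relying on arbitrary values happening to be invalid.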
- We notice that the `MutateValidityInterval` close mutation is using “too arbitrary values”. That is, it could hit a valid range (bounded by the contestation period) and if the upper bound then happens to be exactly the healthy upper bound (slot), then the mutation would fail because the tx will remain valid. -> It is tricky to write actually helpful mutations!
Building Hydra (or probably any other cardano project for that matter) on a new MacBook Pro equipped with ARM 64 M2 chips is straightforward but for one small quirk: one has to ensure GHC uses LLVM 12, whereas it's LLVM 13 that's installed by default on the machine.
Here are the steps I followed, based on this reddit post:
- Install LLVM 12
brew install llvm@12
- Install ghcup
curl --proto '=https' --tlsv1.2 -sSf https://get-ghcup.haskell.org | sh
- Select GHC 8.10.7 and link to LLVM 12 toolchain:
OPT=/opt/homebrew/opt/llvm@12/bin/opt LLC=/opt/homebrew/opt/llvm@12/bin/llc ghcup install ghc 8.10.7 --force
ghcup set ghc 8.10.7
- Follow instructions on installing cardano-node from source, to install the forked libsodium and libsecp256k1
- Install some more standard tools for building:
brew install libtool autoconf automake pkg-config
- Then building should be as simple as:
git clone https://github.com/input-output-hk/hydra
cd hydra
cabal update
cabal build all
- HLS in emacs is hanging on code actions (e.g. fixing imports)
- Same version (1.8.0.0) on different project is having similar problems, although less (not always times out)
- Using vscode on that same other project is just fine (with 1.8.0.0)
- Updating the `lsp-haskell` emacs package had no effect
- Disabling tactics (wingman) makes it less of a problem
- Compiling newer HLS 1.9.0.0 fails with GHC 8.10.7
- Trying to upgrade GHC to 9.2.4 on `hydra` (would also benefit other things)
  - GHC and cabal required a download of 2.5GB (?)
  - Then it fails because of an incompatible `base` version
- Building a HLS 1.9 externally also does not work as it requires the project to be compilable with the same GHC (e.g. 9.2.4)
- Somehow got `hydra-prelude` to compile against GHC 9.2.4 and a 1.9 HLS paired up, but the same timeout / slow behavior is present there.
- Reduce the code duplication between Close.hs and CloseInitial.hs fixtures
- inline as much healthySomething functions as possible
- undo the inline and instead merge both files together, comparing the common versus differing functions
- extracting a few functions to reduce duplication even though there is still some remaining
- extracting replaceSomething function to update a state so that we can refactor between Contest and Close
- a few puzzles to solve:
- how to use replaceSnapshotNumber in MutateSnapshotToIllFormedValue
- I'm not sure I understand when we build mutation data structure and when we actually mutate things (how ChangeOutput and replaceSnapshotNumber articulate with one another)
- I believe some of the new tests fail for the wrong reason but don't know how to fix yet
- PG: ADR for where we document our processes / the fact that we want to migrate from the wiki
- It’s not really architecture about the software itself
- Decision records are a good point though
- Where do we (at least) record the decision of processes in the repository itself (and the process of changing processes)
- “Governance Decision Records”? Is there something similar existing in the realm of open source (foundations). If yes, does it fit or is it too heavy-weight?
- Dedicated repository?
- What’s the scope of this? Already “Open Governance” of the project?
- SN: Did re-arrange the logbook yesterday
- SN: Not introduce any TODO/FIXMEs with PRs
- Not executed today and even only vaguely defined
- Motivation is still there.. each PR making the repository more mature / less brittle
- Maybe sufficient visibility of introduced FIXMEs is also fine? So we don’t lose track of things
- One thought: FIXME/TODO items, PRs and backlog items are “inventory”
- Fixed some mutation test failing for the wrong reason
- In fact the slot for calculating a `healthyContestationDeadline` in the close mutation was wrong (not healthy)
- We also made mutations more "generic": relying less on fixture knowledge (the `healthy...` variables)
- This already goes into the direction of eventually hopefully reducing duplication. For example the close:
- has at least two interesting scenarios: closing with the initial and with a signed snapshot
- we would like to re-use mutations between them, as most of them should be orthogonal to this
- we do rely a lot on the `healthy..` fixtures right now though
- We also need to balance progressing to this end with "getting things done" .. so we ended up keeping only the mutations for the `CloseInitial` scenario which were wrong to be at least safe against regressions with minimal duplication
Fixing the test in the P.R. we discover that the initialSnapshot close case is not covered by the tests. We introduce an explicit healthyInitialCloseTx fixture to mutate and assess properties again. We do that through some code duplication that will have to be reduced before we can merge this P.R.
Notes about things we discussed in our tactical today.
- PG: Mob retrospective
- Some frustration or loose ends
- Would be great to use some time to reflect at the end of a session
- Make it happen and maybe even bring the conclusion to the next tactical
- PG: What’s the difference between one list of mutation or one test per mutation type?
- i.e. the difference between the sum type of mutations vs. multiple explicit properties for each mutation
- Actually, not much.
- Maybe it’s even better to have a list of things we want to check, rather than the generator + coverage
- Because the coverage mechanism is also not greatly tuned and we sometimes see more than 100 tests per property to reach coverage.
- What is actually “the mutation” - terms are maybe a bit overloaded and we could also call one thing the scenario.
- Also: how could we re-use mutations in different transaction / scenarios
- FT: should we write down our findings yesterday in logbook or somewhere
- We had a situation / confusion yesterday, but did not write it down
- And today we have found solutions for some things .. it would be great to match them up, to have a common learning written down?
- PG and FT will exchange findings this afternoon and put notes in the logbook
Investigating why our mutation validator fails, we stumble upon this issue:
serialising: (TxIn "9d682c3584482cfc0f476cda13b8898b53d5a833fb3a304c8e873314b6840fb3" (TxIx 31),TxOut (AddressInEra (ShelleyAddressInEra ShelleyBasedEraBabbage) (ShelleyAddress Testnet (KeyHashObj (KeyHash "66ea4f9b49f1a455de9629b974e51e7a58cab3992d0e8cfc77f5c877")) StakeRefNull)) (TxOutValue MultiAssetInBabbageEra (valueFromList [(AdaAssetId,1)])) TxOutDatumNone ReferenceScriptNone) to "\216y\159\216y\159\216y\159X\FSf\234O\155I\241\164U\222\150)\185t\229\RSzX\202\179\153-\SO\140\252w\245\200w\255\216z\128\255\161@\161@\SOH\216y\128\216z\128\255"
serialising: (TxIn "6fae592acd3722b8d9ae55f8d28bb7c54050ce38e03ffdd8456314625a8f25a8" (TxIx 86),TxOut (AddressInEra (ShelleyAddressInEra ShelleyBasedEraBabbage) (ShelleyAddress Testnet (KeyHashObj (KeyHash "f1aba6238afa8efde1d1596b7866b14ba4c9ec5cff7354c5b432e431")) StakeRefNull)) (TxOutValue MultiAssetInBabbageEra (valueFromList [(AdaAssetId,1)])) TxOutDatumNone ReferenceScriptNone) to "\216y\159\216y\159\216y\159X\FS\241\171\166#\138\250\142\253\225\209Ykxf\177K\164\201\236\\\255sT\197\180\&2\228\&1\255\216z\128\255\161@\161@\SOH\216y\128\216z\128\255"
The serialised forms of the 2 distinct txOuts look the same at first glance, but they are actually different 😌 We spent a few minutes scratching our heads on this.
"\216y\159\216y\159\216y\159X\FSf\234O\155I\241\164U\222\150)\185t\229\RSzX\202\179\153-\SO\140\252w\245\200w\255\216z\128\255\161@\161@\SOH\216y\128\216z\128\255"
"\216y\159\216y\159\216y\159X\FS\241\171\166#\138\250\142\253\225\209Ykxf\177K\164\201\236\\\255sT\197\180\&2\228\&1\255\216z\128\255\161@\161@\SOH\216y\128\216z\128\255"
We think the `hashTxOuts` function does not care about the ordering of the elements it hashes
- We are puzzled by this ordering problem. There seemed to be a problem in the way we compute the hash, so we wrote a test checking that reshuffling a list of TxOuts and hashing gives different results, and that is the case
- So our assumption was of course wrong => we add a property test to assert that
Tracing the tx before mutation shows that the outputs are correctly shuffled
- Trying to not reorder the outputs and see if it fails => OK
When we observe the commit tx, we record the commit output in the `initialCommits`:
- A triple `(txIn, txOut, scriptData)` -> scriptData contains what's committed, which is created in the `observeCommitTx` function
- in the `observeCommit` we don't store the `committed` in the state, we use it only to report the `OnCommitTx`
- in the `abortTx` function we consume this map to create the reimbursed outputs -> reimbursement outputs are ordered by the txIn of the commit output, not of the committed output
Sorting the abort tx's outputs by the initial TxIn of the committed TxOut fixes the issue
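A minimal sketch of the ordering fix, assuming hypothetical stand-in types and names (`Commit`, `reimbursedByCommitIn`, `reimbursedByCommittedIn`); it only shows that the two sort keys can yield different output orders and is not the actual `abortTx` code:

```haskell
import Data.List (sortOn)

-- Hypothetical stand-ins: (transaction id, output index) and an opaque output.
type TxIn = (String, Int)
type TxOut = String

data Commit = Commit
  { commitIn :: TxIn -- TxIn of the commit output (at the commit script)
  , committedIn :: TxIn -- initial TxIn of the committed UTxO
  , committedOut :: TxOut -- the committed output to reimburse
  }

-- Previous behaviour: order follows the commit outputs.
reimbursedByCommitIn :: [Commit] -> [TxOut]
reimbursedByCommitIn = map committedOut . sortOn commitIn

-- Fixed behaviour: order follows the committed UTxO, as the validator expects.
reimbursedByCommittedIn :: [Commit] -> [TxOut]
reimbursedByCommittedIn = map committedOut . sortOn committedIn

exampleCommits :: [Commit]
exampleCommits =
  [ Commit ("c1", 0) ("zz", 0) "alice's output"
  , Commit ("c2", 0) ("aa", 1) "bob's output"
  ]
```

For `exampleCommits` the first ordering yields alice's output first, the second yields bob's first, which is exactly the kind of mismatch that changes the hash of the outputs.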
Working on making sure committed outputs are properly reimbursed in the `Abort` transaction. This is made somewhat tricky by the fact this tx can consume UTxO locked by either the initial or the commit scripts.
What we want to check is that the hash of the list of serialised committed UTxO, ordered by their `TxOutRef`, is the same as the hash of the list of the first `m` outputs of the transaction, where `m` is the number of committed UTxO.
This is more general than what we currently do and would allow commits to have 0, 1 or more UTxO.
Introduced mutation `IdenticalCommits` does not work -> the mutation test is failing (i.e. the validation succeeds)
- Removing it from the base `ensemble/mutation-testing-framework` to start from a clean slate
Struggling to get our mutation test for reordering outputs right:
- The mutation test is failing even though it should not, because we are reordering the outputs and hashing them, which should produce a different hash
- One explanation is that there's only one UTxO committed so it does not matter if we reorder the outputs!
- But it seems the outputs are also wrong, esp. as the values are not preserved even though the outputs have > 1 lovelace -> seems like nu_commit inputs are not built correctly. They should have a value equal to 2 ADA + the locked value in the committed UTxO
- Moreover, the scripts are not all referenced in the tx, which might or might not be an issue in our case because we are validating at phase 2 ??
All fork and no merge makes Jack a dull boy.
This week we've tried to work in parallel on both #658 and #670.
Unfortunately these two tasks were touching the very same code base. In particular, 658 changed how the commit validator works and 670 was moving this validation logic to the head validator.
This has led to some painful rebases from the 658 branch to the 670 branch to keep up with the new changes made there. We had several issues with tests working and then not working anymore after a merge, which were caused by code disappearing from the branch because of errors during conflict resolution.
We also suffered from context switching, with errors like committing to the wrong branch.
At some point, we just stopped work on 670 to focus, ensemble, on 658 so that we finish one thing, fully, before moving forward with the other task and that made things smoother.
It appears that the tasks we have to do to fix the gaps identified in 452 require too much synchronization to perform them in parallel. Horizontal scalability not possible here, need vertical scalability ;)
Also, not proud of it, but we shortened some error trace messages just to make our test transaction stay under the transaction size limit. We keep actual optimizations for later, as every gap fix has an impact on performance and handling that globally in the end will probably be better.
- Trying to tie together things left of from last Friday session
- Most of the changes from the PR review are implemented
- Looking at CollectCom tests to see why ST is missing in the outputs
- I am taking a look at the Miro board and realized these outputs only need to contain PTs and the head validator is the one we can detect with the ST.
- Continuing further I also need to check for PT burning in the `v_commit` in the `ViaAbort` redeemer. We think the check should be there and afterwards we can decide to move it if we need to reduce script size since we already check for that in `v_head`.
- One tricky test to fix was the mutation one for the `abort` txs. It was failing because of a check in the v_commit script since we failed to generate party commits that use the correct policy id for the tokens.
- Removing the `v_head` datum checks from the `v_commit` script since the ST check is enough to determine we are a part of the correct head instance.
- Removing the `TxSpec` tests since they are highly specialized/crafted and we check the same properties already when testing the generated txs.
- Removing specific mutation tests for `MutateHeadScriptInput` since `v_commit` does use `v_head` input for the checks
The branch has some tests which are failing. It turns out that the validator was checking the outputs instead of the inputs to find the commits.
Fixing that led the normal abort test case to fail. That is because we only match on inputs that spend a PT before analysing them, not explicitly checking it's a commit and not an initial. But an initial does not have any datum, which made the validator fail. We now just ignore the case where an input would spend a PT but does not have a datum.
Just in case, we secured the collectCom validator for the case where one would try to spend an initial instead of a commit in it with a new `MutateCommitToInitial`.
Looking at the tx traces for the abort scenario, we compare the traces with the code and figure out that we may miss some checks in the initial validator for the abort case. Adding that to the gaps identified in #452.
I was able to open a head on a multi-headed node:
{"headId":"b3f6c710be970384fa3f3b163df90a62dbec7ffdbd0664e41cff5ba7","tag":"HeadStarted"}
{"headId":"b3f6c710be970384fa3f3b163df90a62dbec7ffdbd0664e41cff5ba7","serverOutput":{"tag":"RolledBack"},"tag":"HeadOutput"}
{"headId":"b3f6c710be970384fa3f3b163df90a62dbec7ffdbd0664e41cff5ba7","serverOutput":{"headId":"b3f6c710be970384fa3f3b163df90a62dbec7ffdbd0664e41cff5ba7","tag":"HeadInitialized"},"tag":"HeadOutput"}
{"headId":"b3f6c710be970384fa3f3b163df90a62dbec7ffdbd0664e41cff5ba7","serverOutput":{"parties":[{"vkey":"b37aabd81024c043f53a069c91e51a5b52e4ea399ae17ee1fe3cb9c44db707eb"},{"vkey":"f68e5624f885d521d2f43c3959a0de70496d5464bd3171aba8248f50d5d72b41"}],"tag":"ReadyToCommit"},"tag":"HeadOutput"}
{"headId":"b3f6c710be970384fa3f3b163df90a62dbec7ffdbd0664e41cff5ba7","serverOutput":{"party":{"vkey":"f68e5624f885d521d2f43c3959a0de70496d5464bd3171aba8248f50d5d72b41"},"tag":"Committed","utxo":{"d2c050f2d6f4eb5f1ba4749d472402eeeae89f36e2d33fec86858231424d5e35#0":{"address":"addr_test1vqg9ywrpx6e50uam03nlu0ewunh3yrscxmjayurmkp52lfskgkq5k","datum":null,"datumhash":null,"inlineDatum":null,"referenceScript":null,"value":{"lovelace":500000000}}}},"tag":"HeadOutput"}
{"tag":"HeadInput", "headId": "b3f6c710be970384fa3f3b163df90a62dbec7ffdbd0664e41cff5ba7", "clientInput": { "tag":"Commit", "utxo": {"8ff987ce9f271a24d341156e5ca3541ac010d3876b2c4a6a12aeb5cbe6a12cc4#1":{"address":"addr_test1vru2drx33ev6dt8gfq245r5k0tmy7ngqe79va69de9dxkrg09c7d3","datum":null,"datumhash":"a654fb60d21c1fed48db2c320aa6df9737ec0204c0ba53b9b94a09fb40e757f3","inlineDatum":null,"referenceScript":null,"value":{"lovelace":51298400}}}}}
{"headId":"b3f6c710be970384fa3f3b163df90a62dbec7ffdbd0664e41cff5ba7","tag":"InputSent"}
{"headId":"b3f6c710be970384fa3f3b163df90a62dbec7ffdbd0664e41cff5ba7","serverOutput":{"party":{"vkey":"b37aabd81024c043f53a069c91e51a5b52e4ea399ae17ee1fe3cb9c44db707eb"},"tag":"Committed","utxo":{"8ff987ce9f271a24d341156e5ca3541ac010d3876b2c4a6a12aeb5cbe6a12cc4#1":{"address":"addr_test1vru2drx33ev6dt8gfq245r5k0tmy7ngqe79va69de9dxkrg09c7d3","datum":null,"datumhash":"a654fb60d21c1fed48db2c320aa6df9737ec0204c0ba53b9b94a09fb40e757f3","inlineDatum":null,"referenceScript":null,"value":{"lovelace":51298400}}}},"tag":"HeadOutput"}
{"headId":"b3f6c710be970384fa3f3b163df90a62dbec7ffdbd0664e41cff5ba7","serverOutput":{"tag":"HeadIsOpen","utxo":{"8ff987ce9f271a24d341156e5ca3541ac010d3876b2c4a6a12aeb5cbe6a12cc4#1":{"address":"addr_test1vru2drx33ev6dt8gfq245r5k0tmy7ngqe79va69de9dxkrg09c7d3","datum":null,"datumhash":"a654fb60d21c1fed48db2c320aa6df9737ec0204c0ba53b9b94a09fb40e757f3","inlineDatum":null,"referenceScript":null,"value":{"lovelace":51298400}},"d2c050f2d6f4eb5f1ba4749d472402eeeae89f36e2d33fec86858231424d5e35#0":{"address":"addr_test1vqg9ywrpx6e50uam03nlu0ewunh3yrscxmjayurmkp52lfskgkq5k","datum":null,"datumhash":null,"inlineDatum":null,"referenceScript":null,"value":{"lovelace":500000000}}}},"tag":"HeadOutput"}
The process is annoyingly involved because I had to manually craft the commits to put in the head, but it kind of works!
The code is definitely not pretty and not worthy of publication at the moment, and it's very imperative too, with a lot of plumbing and wiring to connect the various parts and fly messages around.
Conceptually, it's rather simple: when the multi-head node is requested to start a new head, it spins up a single `Hydra.Node` with the proper configuration (cardano keys, hydra keys, network connections...). A client can then interact with any of the open heads, including operating on the heads like committing/closing/fanning out, etc.
For this to work, the multi-head node needs to do some multiplexing of the underlying nodes:
- The `MultiHeaded` node is configured with secret keys for both Hydra and Cardano, which will be used for all heads and all nodes. This works because there's only one shared wallet, so as long as the wallet has enough fuel to pay for the transactions, it's ok
- It starts a `Network` server listening on a UDP port, which will multiplex incoming messages and wrap outgoing messages annotated with the `HeadId`. UDP seems like an interesting option there because it does not require maintaining connections to remote nodes
- There's supposed to be some mechanism (resolver) to identify and locate `Remote` peers. Currently these are just JSON-formatted files in a well-known location, but it's not far-fetched to think of some other form of directory for nodes, like DNS TXT entries, or even a hosted service
- It also starts a `ChainObserver` which is used to observe the chain for `InitTx` which are of interest to our node
  - This is currently done in a very ad hoc way as we distinguish the case of the initiating node and the other cases
- There is a client interface which is JSON based, exposed through the configured API port. It provides some commands on top of `ClientInput` and some additional outputs over `ServerOutput`, but the client is notified of all the events occurring in any one of the started nodes, annotated with the `HeadId`
- Starting a new head is triggered either by a client command or by observing an `InitTx` on-chain
  - This triggers starting a new `Hydra.Node` with a complete stack of services tailored for the particular situation
  - Each node has its own `Direct` chain connection which is started from the point before the `InitTx` so that the tx can be observed and the state updated accordingly. This is annoying to do but is needed because, while it's easy to know the `ChainPoint` for a tx, it's not possible to retrieve an arbitrary `Block` from the node given such a `ChainPoint`: one has to follow the chain and only gets the `next` blocks
  - The `API` part is rewired to connect to the enclosing multi-head node's client interface
  - The `Network` part encapsulates a UDP component and a `Hydra.Network.MultiHead` component that annotates each message with the `HeadId`
  - The new node is passed a list of `Remote` representing the other parties of the head with all the required information (host:port, vkeys). This list is constructed by resolving "symbolic" names (e.g. reading some files)
  - Note that in the case of starting a node from observing a Head, the only identifier we have is the `Party`, which is a `VerificationKey HydraKey`. We need to look through all the known nodes to resolve that key to the `Remote` structure, which is probably something that would be cumbersome IRL. It would probably make sense to add some symbolic name in the datum of the `Initial` UTxO created for the party, so that all parties can more easily infer other parties' information from the chain
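The `HeadId` multiplexing and the `Party` to `Remote` lookup described in the list above can be sketched in a few lines. This is only an illustration under simplifying assumptions, not the experimental code itself: `Enveloped`, `Inboxes`, `wrapOutgoing`, `routeIncoming` and `resolveParty` are hypothetical names, the `Remote` fields are guessed from the "host:port, vkeys" description, keys are plain strings, and per-head inboxes are modelled as a pure map rather than real UDP sockets and node queues.

```haskell
import Data.List (find)
import qualified Data.Map.Strict as Map

-- Hypothetical simplified identifiers; the real ones wrap hashes and keys.
newtype HeadId = HeadId String
  deriving (Eq, Ord, Show)

newtype Party = Party String -- stands in for a VerificationKey HydraKey
  deriving (Eq, Ord, Show)

-- | What the resolver is assumed to know about a peer.
data Remote = Remote
  { remoteName  :: String -- "symbolic" name, e.g. derived from a file name
  , remoteHost  :: String
  , remotePort  :: Int
  , remoteParty :: Party
  }
  deriving (Eq, Show)

-- | A network message annotated with the head it belongs to.
data Enveloped msg = Enveloped
  { envHeadId  :: HeadId
  , envPayload :: msg
  }
  deriving (Eq, Show)

-- | Outgoing side: annotate a message with its HeadId before sending.
wrapOutgoing :: HeadId -> msg -> Enveloped msg
wrapOutgoing = Enveloped

-- | One inbox per running head, oldest message first.
type Inboxes msg = Map.Map HeadId [msg]

-- | Incoming side: deliver the payload to the inbox of the matching head,
-- or report the unknown HeadId so the caller can decide what to do.
routeIncoming :: Enveloped msg -> Inboxes msg -> Either HeadId (Inboxes msg)
routeIncoming (Enveloped hid payload) inboxes
  | Map.member hid inboxes = Right (Map.adjust (++ [payload]) hid inboxes)
  | otherwise = Left hid

-- | Resolve a Party observed on-chain to a known Remote by scanning all
-- known peers, which is the cumbersome lookup mentioned in the notes.
resolveParty :: [Remote] -> Party -> Maybe Remote
resolveParty remotes party = find ((== party) . remoteParty) remotes
```

The linear scan in `resolveParty` is what a symbolic name stored in the `Initial` datum would avoid: with such a name, the lookup could go through the directory directly instead of comparing verification keys against every known node.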