
Feat: decouple CFE from SC events #4382

Merged: 10 commits from feat/cfe-events-pallet into main, Jan 18, 2024
Conversation

@msgmaxim (Contributor) commented Jan 4, 2024

Pull Request

Closes PRO-1091

Checklist

Please conduct a thorough self-review before opening the PR.

  • I am confident that the code works.
  • I have updated documentation where appropriate.

Summary

After discussion with @dandanlen, I decided to create a separate pallet for CFE events, so that all events live in one place and we don't need to repeat the mechanism for cleaning up events etc. (the alternative I first considered was having the CFE read storage items from separate pallets).

All events are stored in the CfeEvents storage item, which can be queried by block number. In the SC Observer, once we receive a new finalized header, we now query all events for that block. Events are stored for 20 blocks, after which they are removed; this should give the CFE enough time to retrieve them.
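For illustration, a minimal sketch of the shape described here, with assumed names, hasher, and hooks; the real pallet differs, and the per-block index is in fact removed later in this thread:

#[frame_support::pallet]
pub mod pallet {
	use frame_support::pallet_prelude::*;
	use frame_system::pallet_prelude::*;

	#[pallet::config]
	pub trait Config: frame_system::Config {
		/// Stand-in for the CfeEvent enum.
		type CfeEvent: Member + Parameter;
	}

	#[pallet::pallet]
	#[pallet::without_storage_info]
	pub struct Pallet<T>(_);

	/// CFE events keyed by the block in which they were emitted.
	#[pallet::storage]
	pub type CfeEvents<T: Config> =
		StorageMap<_, Twox64Concat, BlockNumberFor<T>, Vec<T::CfeEvent>, ValueQuery>;

	/// Retention window taken from the description above.
	pub const EVENT_RETENTION_BLOCKS: u32 = 20;

	impl<T: Config> Pallet<T> {
		/// Called by other pallets to notify the CFE.
		pub fn emit(event: T::CfeEvent) {
			CfeEvents::<T>::append(frame_system::Pallet::<T>::block_number(), event);
		}
	}

	#[pallet::hooks]
	impl<T: Config> Hooks<BlockNumberFor<T>> for Pallet<T> {
		fn on_initialize(now: BlockNumberFor<T>) -> Weight {
			// Prune events that have fallen out of the retention window,
			// so the CFE has ~20 blocks to query a block's events.
			if now > EVENT_RETENTION_BLOCKS.into() {
				CfeEvents::<T>::remove(now - EVENT_RETENTION_BLOCKS.into());
			}
			Weight::zero()
		}
	}
}

The SC Observer side is then just a storage query of CfeEvents at the finalized block it received.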

All events are defined in the CfeEvent enum. I imagine we can create simple tests that decode/encode each variant from/to some hardcoded values to ensure that we don't accidentally make incompatible changes (probably in a separate PR).
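Something like the following, perhaps; this sketch assumes CfeEvent derives Encode, Decode, PartialEq and Debug, and uses a placeholder payload rather than real hardcoded bytes:

#[test]
fn cfe_event_encoding_is_stable() {
	use codec::{Decode, Encode};

	// One hardcoded value per variant in the real test; a placeholder here.
	let event = CfeEvent::EthThresholdSignatureRequest(Default::default());

	// These bytes would be generated once and committed, so that any
	// incompatible change to the encoding fails the assertion below.
	let frozen: Vec<u8> = event.encode();

	assert_eq!(CfeEvent::decode(&mut frozen.as_slice()).expect("valid encoding"), event);
}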

I created a few traits, some of which are generic over chain or crypto, so that our generic code (broadcaster, threshold signing, etc.) can use them while running slightly different code for each. For the most part I was able to avoid duplicating code.

I'm pretty sure this doesn't require a runtime storage migration, but let me know if I'm wrong. I tested upgrading the runtime from main to this branch and it worked as expected, initialising CfeEvents with the default value (an empty container).

@msgmaxim msgmaxim requested review from dandanlen and kylezs January 4, 2024 07:03
match_event! {event, {
CfeEvent::EthThresholdSignatureRequest(req) => {
handle_signing_request::<_, _, _, EthereumInstance>(
scope,
A reviewer (Contributor) commented on the snippet above:

formatting

@msgmaxim msgmaxim force-pushed the feat/cfe-events-pallet branch from 571210a to 77e996c Compare January 9, 2024 05:59
codecov bot commented Jan 9, 2024

Codecov Report

Attention: 209 lines in your changes are missing coverage. Please review.

Comparison is base (08a1a47) 72% compared to head (191434d) 72%.

Files Patch % Lines
engine/src/state_chain_observer/sc_observer/mod.rs 20% 97 Missing and 4 partials ⚠️
state-chain/pallets/cf-cfe-interface/src/lib.rs 57% 21 Missing and 10 partials ⚠️
state-chain/cfe-events/src/lib.rs 19% 1 Missing and 16 partials ⚠️
...chain/pallets/cf-cfe-interface/src/benchmarking.rs 0% 16 Missing ⚠️
state-chain/cf-integration-tests/src/network.rs 92% 11 Missing and 1 partial ⚠️
state-chain/traits/src/mocks/cfe_interface_mock.rs 80% 7 Missing and 4 partials ⚠️
...tate-chain/pallets/cf-cfe-interface/src/weights.rs 50% 8 Missing ⚠️
state-chain/pallets/cf-broadcast/src/tests.rs 84% 5 Missing ⚠️
state-chain/traits/src/lib.rs 0% 3 Missing ⚠️
state-chain/chains/src/btc.rs 0% 1 Missing ⚠️
... and 4 more
Additional details and impacted files
@@          Coverage Diff          @@
##            main   #4382   +/-   ##
=====================================
- Coverage     72%     72%   -0%     
=====================================
  Files        396     401    +5     
  Lines      67561   67659   +98     
=====================================
+ Hits       48756   48798   +42     
- Misses     16363   16392   +29     
- Partials    2442    2469   +27     


@msgmaxim msgmaxim force-pushed the feat/cfe-events-pallet branch 5 times, most recently from fcdb6df to 39098a5 Compare January 11, 2024 05:00
@msgmaxim msgmaxim force-pushed the feat/cfe-events-pallet branch from 39098a5 to b546f91 Compare January 11, 2024 05:18
@msgmaxim msgmaxim marked this pull request as ready for review January 11, 2024 05:19
@msgmaxim msgmaxim changed the title WIP: cfe event emitter pallet Feat: decouple CFE from SC events Jan 11, 2024
@kylezs (Contributor) left a comment:

🙌 product team requests to update events can be accommodated much more easily now

@@ -756,6 +770,7 @@ construct_runtime!(
{
System: frame_system,
Timestamp: pallet_timestamp,
CfeEventEmitter: pallet_cf_cfe_event_emitter,
@msgmaxim (author) commented on this line:

Moved back to the end of the list (now using custom order of pallet execution instead)

@msgmaxim (Contributor, author) commented:

So just to summarise: in this PR we need a way to clear events at an exact block boundary (so that if we request storage for a block hash, we only get events generated in that block). The current solution is/was to put the CFE events pallet before any pallet that might emit a CFE event and clear the events in on_initialize. However, as discussed on Discord, this is problematic because we don't want to change the indexes of the existing pallets. I see a few options for how to proceed:

a) Go back to the original approach where we store events in a map so that events from different blocks can be queried separately (i.e. reverting 9760930)

b) Clear events in the earliest pallet that currently emits CFE events, so that we can be sure that no new events are lost. @AlastairHolmes also suggested modifying the existing system pallet to do something similar (or even moving the storage for the CFE events there).

Any other suggestions? Kyle suggested that you might have some thoughts @dandanlen.

@dandanlen (Collaborator) commented Jan 15, 2024

We can customise the execution order of the hooks.

In the runtime.rs we have:

pub type Executive = frame_executive::Executive<
	Runtime,
	Block,
	frame_system::ChainContext<Runtime>,
	Runtime,
	AllPalletsWithSystem,
	PalletMigrations,
>;

We can replace AllPalletsWithSystem with our own custom tuple with the pallets in any order we please (same thing as we do with PalletMigrations). This will then execute all the hooks in that order.
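For illustration, that might look roughly like this (PalletExecutionOrder is an invented name, the tuple is abbreviated, and the pallet name follows the later rename to cfe-interface):

/// Custom hook-execution order; must be kept in sync with construct_runtime!.
pub type PalletExecutionOrder = (
	System,
	Timestamp,
	CfeInterface, // cleared first, before any other pallet can emit CFE events
	Environment,
	Flip,
	Emissions,
	// ...the remaining pallets, in the existing declaration order
);

pub type Executive = frame_executive::Executive<
	Runtime,
	Block,
	frame_system::ChainContext<Runtime>,
	Runtime,
	PalletExecutionOrder,
	PalletMigrations,
>;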

The only downside I can think of is that we need to remember to maintain this, i.e. when we add a new pallet, its hooks need to be added to the custom tuple. This also applies to e.g. integration tests: we need to replicate the same execution order instead of relying on the default.

edit: another downside is that the hooks are executed after runtime upgrades, meaning we would be unable to emit any CFE events during a runtime upgrade. For this reason, the system events are in fact deleted before anything else (before any hooks and before the runtime upgrade). We should consider doing the same for the CFE events, as Alastair suggested, but this would involve editing either frame_executive::Executive or the system pallet itself. I'm happy to use the hooks for now, but in the long run changing the Executive/System pallet seems like the more robust approach.

The benefit is that we don't need to worry about re-ordering pallets in the runtime declaration. As you mentioned @msgmaxim, this will continue to cause issues whenever we want to add e.g. a new chain.


A third option would be to make the breaking change and re-order the pallets, but to use pre-defined explicit indices with some gaps for future additions, i.e.:

construct_runtime!(
	pub struct Runtime
	{
		System: frame_system = 0,
		Timestamp: pallet_timestamp,
		CfeEventEmitter: pallet_cf_cfe_event_emitter,
		// 3: TBC
		// 4: TBC
		Environment: pallet_cf_environment = 5,
		Flip: pallet_cf_flip,
		Emissions: pallet_cf_emissions,
		// etc..
	}
);

This would buy us some time but will eventually also run into the same limitation (at some point we might run out of free indices between pallets).

@dandanlen (Collaborator) left a comment:

FWIW I would prefer a less restrictive pallet name, for example cfe-interface. Long term, we might use this pallet for things other than events, for example to group CFE-specific storage values, or to host CFE-specific extrinsics (for example the version number for compatibility checks, or submitting/storing the peer info, which is used only by the CFE, not by the runtime).


benchmarks! {

clear_events {
@dandanlen (Collaborator) commented on the snippet above:
I think this will break when we generate new benchmarks: it needs to be called remove_events_for_block, or alternatively the method in WeightInfo needs to be renamed to clear_events. (I actually prefer the latter, given that you removed the per-block index.)
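For context: with the FRAME benchmarks! macro, the block's name is what the generated WeightInfo method is called, so the two have to line up. A rough sketch, not the PR's code:

benchmarks! {
	// This name becomes `fn clear_events() -> Weight` on the generated
	// WeightInfo trait, which the pallet calls when weighing its hooks.
	clear_events {
		// setup: pre-populate the events storage
	}: {
		// code being measured: clearing the stored events
	}
}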

@msgmaxim (author) replied:
Updated in 0b710a2

payload: broadcast_attempt.transaction_payload.clone(),
});

// TODO: consider removing this
@dandanlen (Collaborator) commented:
Why would we not remove it? I guess polkaJS, debugging?

The same reviewer later added:

edit - I saw there was another comment thread about this, think I agree with keeping it for now

}
}

pub trait CfeEventEmitterT<T: Chainflip> {
@dandanlen (Collaborator) commented:

Why not something like CfePeerRegistration?

We don't need to assume that the underlying implementation emits events.

Similarly for the other traits. Could be CfeMultisigRequest, CfeBroadcastRequest.
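For illustration, the granular shape being suggested might look roughly like this; only the trait names come from this thread, while the generics and method signatures are invented:

// Sketch only: one narrow trait per kind of CFE notification.
pub trait CfeMultisigRequest<T: Chainflip, C: CryptoScheme> {
	fn keygen_request(req: KeygenRequest<C>);
	fn signature_request(req: SigningRequest<C>);
}

pub trait CfeBroadcastRequest<T: Chainflip, Ch: Chain> {
	fn broadcast_request(tx: TransactionFor<Ch>);
}

pub trait CfePeerRegistration<T: Chainflip> {
	fn peer_registered(validator_id: T::ValidatorId);
	fn peer_deregistered(validator_id: T::ValidatorId);
}

Pallets would then depend only on these traits, so the runtime can wire them to the CFE events pallet while tests plug in simple mocks.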

@msgmaxim (author) replied:

My thinking was that the same trait would be used for many potentially unrelated events (ideally all of them, but because some pallets are generic over either Crypto or Chain, I had to create separate traits for those). I can see that CfeMultisigRequest is likely to cover any current/future Crypto-related events, but I wonder whether we would want to add more events from non-generic pallets to CfeEventEmitterT. Would we then prefer separate traits for different events? (I don't really mind either way.)

I will rename the traits as you suggest (we can easily rename them again if needed).

@dandanlen (Collaborator) replied:

FWIW I would prefer more granular traits over a single 'god trait'. Makes it easier to mock, makes the intent clearer etc.

I also feel it's worth distinguishing the interface from the implementation. It doesn't matter that it emits an event; for example, you mentioned we might remove peer registration events in favour of storage writes. What matters is that when you use an implementation of this trait, you expect to notify the engine of a peer update. Whether it's an event, a storage write, or some mock that simply writes to a log is not really relevant.

@msgmaxim (author) replied:

Renamed traits in d2fdc66

@msgmaxim (Contributor, author) commented:

> The only downside I can think of is that we need to remember to maintain this

If we forget, does this mean the hooks for that pallet won't be executed? If that's the case, then I don't think it's a big deal, since we will certainly notice that something doesn't work.

> another downside is that the hooks are executed after runtime upgrades, meaning we would be unable to emit any CFE events during a runtime upgrade

Do you mean in migration code? It seems unlikely that there will be a need for that. Even if we do need something like that, I imagine it wouldn't be too difficult to come up with a workaround.

> We can replace AllPalletsWithSystem with our own custom tuple with the pallets in any order we please (same thing as we do with PalletMigrations). This will then execute all the hooks in that order.

Sounds like a reasonable approach. Should I go ahead with this, or do we want to think about it more? @dandanlen

@dandanlen (Collaborator) commented:

I think a custom tuple gives us the quickest win right now. Also, we'll need this anyway whenever we add a new chain.

We will need to remain aware of the runtime-upgrade weirdness: say we want to trigger a threshold signature or a broadcast during a runtime upgrade; the current approach would not allow this.

I'll open a Linear issue to customize the executor/system pallet events reset mechanism.

@msgmaxim (author) commented:

> I think a custom tuple gives us the quickest win right now. Also, we'll need this anyway whenever we add a new chain.

Made this change here: 0b710a2. Tested on localnet; it seems to work as expected. The order in the tuple is exactly what was used before, with the CFE interface pallet moved up to right after the Timestamp pallet.

@dandanlen (Collaborator) commented:

It looks like some integration tests are failing. It could be related to the change in ordering... either way, all the tests are failing at the same assertion, so it should (hopefully!) be a simple fix. @syan095 might be able to help out here; he's quite familiar with the integration tests. (Also, it looks like his recent PR conflicts with this one.)

@dandanlen (Collaborator) added:

LGTM otherwise 🎉

@msgmaxim (author) commented:

Fixed the integration tests in 71b2087 and resolved the conflict with main, @dandanlen.

@dandanlen (Collaborator) left a comment:

@kylezs still needs to approve.

@dandanlen dandanlen mentioned this pull request Jan 18, 2024
@kylezs (Contributor) left a comment:

🙌

@msgmaxim msgmaxim merged commit acc6698 into main Jan 18, 2024
42 of 43 checks passed
@msgmaxim msgmaxim deleted the feat/cfe-events-pallet branch January 18, 2024 11:32
syan095 added a commit that referenced this pull request Jan 19, 2024
…itness-swap

* origin/main: (24 commits)
  fix: restore correct restriction on redemption expiry (PRO-1072)
  Feat/migrate-to-polkadot-sdk-repo (#4361)
  chore: fix RUSTSEC-2024-0003 (#4426)
  Feat: decouple CFE from SC events (#4382)
  chore: add docker tag prefix to `chainflip-ingress-egress-tacker` 🏷️ (#4427)
  refactor/PRO-1101: rid of swapping minimum swap amount (#4419)
  refactor(ingress-egress-tracker): remove unnecessary fields (#4425)
  Improved the retry queue storage (#4420)
  fix: bump spec version command only bumps when necessary (#4422)
  refactor: use yargs for all try_runtime_command options (#4423)
  fix: await finalisation before starting broker (#4412)
  chore: debug logs to see get_raw_txs query (#4413)
  doc: correct env vars (#4416)
  fix: pool_orders rpc filters empty orders (PRO-1039) (#4377)
  Produce an event in case the swap yields a zero egress amount (#4410)
  fix: don't have conflicting redis port with localnet (#4415)
  Ability to specify output for the subcommands, other than `/dev/null` (#4411)
  chore: increase limit on max number of bitcoin payloads in a ceremony to theoretical maximum (#4396)
  refactor(ingress-egress-tracker): configurable expiry times [WEB-761] (#4406)
  fix: use existing script for upgrade job (#4403)
  ...

# Conflicts:
#	Cargo.lock
#	state-chain/primitives/Cargo.toml