-
Notifications
You must be signed in to change notification settings - Fork 10
Conversation
0503c71
to
b610349
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall LGTM, but I feel we may need a special case for the bottom-up interaction from L2 -> root
. Filecoin mainnet won't be running Fendermint, but Lotus, so the only way to submit information on-chain is through an eth_sendTransaction
. We could delegate some of the logic for message relaying and resolution to the IPC agent, but we may be over-complicating it when the subnet validators could make the multisig for the checkpoint available for relayers to submit as a transaction in the parent.
Let me know if this makes sense, we can discuss sync if it is easier.
ChainMessage::ForExecution(_) | ChainMessage::ForResolution(_) => { | ||
// Users cannot send these messages, only validators can propose them in blocks. | ||
ChainMessage::Ipc(IpcMessage::BottomUpResolve(_msg)) => { | ||
// TODO: Check the relayer signature and nonce. Don't have to check the quorum certificate, if it's invalid, make the relayer pay. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This TODO
confused me a bit. Are we expecting to have the figure of a relayer in our initial implementation?
How are BottomUpResolve
messages arriving to our peer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, gotcha. We are calling a relayer the entity submitting the bottom-up checkpoint in the parent. It is clear from the comments in the IPC message definitions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's right, I thought it's just a modified version of the IPC Agent: where it currently has to sign a checkpoint and then send the transaction to the parent, now it has to listen to an event emitted from a smart contract on the child, accumulating the signatures without the help of the agent, and then take that checkpoint, wrap it in a special relay transaction (for CID visibility) and send it to the parent.
Ostensibly we could send it as an FVM transaction and inspect the to
, and parse it, instead of having a special IPC transaction 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
take that checkpoint, wrap it in a special relay transaction (for CID visibility) and send it to the parent.
Let's discuss this in today's sync, I am having a hard time understanding the end-to-end flow you are imagining of how the checkpoint signatures are collected, and the actual submission in the parent.
/// The address (public key) of the relayer in the current subnet. | ||
pub relayer: Address, | ||
/// The nonce of the relayer in the current subnet. | ||
pub sequence: u64, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This sequence
is the increasing nonce for the address in the network, right? Meaning, if the user has sent some raw subnet transaction with nonce n
, and then they want to send a RelayedMessage
, the sequence of the relayed message should be n+1
,
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it's just the sequence of the actor doing the relaying. I am not happy that I had to copy all these fields (and the gas fields in the next PR), so if you think this should be a regular FVM message that we know how to parse to get the CID out, I am open for that too.
/// a block height (for sequencing), any potential handover to the next validator set, and a pointer | ||
/// to the messages that need to be resolved and executed by the parent validators. | ||
#[derive(Clone, Debug, Serialize, Deserialize, PartialEq, Eq)] | ||
pub struct BottomUpCheckpoint { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need a few more fields as part of the BottomUpCheckpoint
: https://github.com/consensus-shipyard/ipc-solidity-actors/blob/cc109a57d62755cf849028a366258c38c04e49c8/src/structs/Checkpoint.sol#L8. Mainly:
fee
: Fees to be distributed in the parent (this may change for Fendermint).prevHash
: The hash of the previous checkpoint to build a chain of checkpoints that can be persisted and discoverable externally (IPFS/FIL sector).proof
-->snapshot
: We should rename it to snapshot, but whatever state snapshot or proof of the state a subnet wants to propagate to its parent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for commenting on this, it is something I needed your input on.
I see that what you are pointing at is in Solidity, so I was wondering if we can put that behind the CID, and keep only the fields here that we need in order to validate in the SubnetActor that this checkpoint meta-data is correct, and it can be resolved into the full thing.
For example the fields you listed:
fee
: I assume each transaction pays its own fee in our chosen default execution model, where everyone pays for their own stuff. Is this fee a general fee to pay for the checkpoint itself? See if we can leave it behind the CID. I thought that we can "execute" this meta-data message by sending it to the SubnetActor for i) validation and ii) taking payment from the relayer; if fees has to be sent from the common pot to some parent validators, indeed this would be a good time to do it, and for that it should be visible in this record. Could it not live in the SubnetActor instead, so that it automatically does it, instead of the subnet validators sending a different value each time?prevHash
: Do we really need this to be in the meta-data for validation? We have a height and if it's signed by the right validator set at the right height, should we not accept that this is what the committee is saying the next checkpoint is? I'm not saying we don't need it, but can it be behind the CID?snapshot
: This sounds like it can be behind the CID, that it's just like cross-messages, but different. One ramification of lifting it into the meta-data is that I built the resolver in FM-191: Execute BottomUpResolve #201 by assuming a single CID to be resolved, not a list - I assumed we can always create a struct to hold multiple CIDs, which makes it easier to track its completion. We can still know about it, and treat it in a special way during the execution of the checkpoint, when this is already available, otherwise we would just stuff it in the blockstore without even charging for it.
I am also wary about the fact that the checkpoint is in Solidity but we use the IPLD resolver to grab it, which means if was part of the EVM data structures then it's only possible to interpret it by the EVM actor, so it would have to be shoveled over and stored in the BlockStore as IPLD, not as parts of the EVM KVstore, to make it accessible to the IPLD resolver. This would be trivial if we used built-in actors, but with EVM actors it's not.
|
||
/// Relayed messages are signed by the relayer, so we can rightfully charge them message inclusion costs. | ||
#[derive(Clone, Debug, Serialize, Deserialize, PartialEq, Eq)] | ||
pub struct SignedRelayedMessage<T> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How do we expect this SignedRelayedMessage
to be triggered by users? I think we may have an issue here, as any bottom-up protocol we come up with would have to be compatible with Filecoin mainnet, i.e. Lotus.
How I was imagining this to work, is that users directly send the CertifiedMessage
to the parent inside an eth_sendTransaction
, and the verification is performed on-chain in the subnet actor, which means that all the information required for the verification should be already included in the message. This means that the child subnets validators would have to tailor the CertifiedMessage
in their subnet before it can be propagated up (maybe this is what you are thinking of doing and I am missing it from the type definitions).
For L2+ we can use this approach of relay+execution, but in L2->L1 bottom up interactions we may just be able to send a raw transaction against the ETH RPC to submit the checkpoint.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How do we expect this SignedRelayedMessage to be triggered by users?
It would be sent by the IPC Actor iff it knows the parent is a Fendermint node as well, because these messages don't work for Lotus at all. But with Lotus you also can't use the IPLD Resolver, so it's completely different.
This is unfortunate, and I agree this whole thing with IPLD Resolution and CIDs is a bit forced in order to make use of IPLD in the first place. It's possible that if message size is a problem we should just have gone with a Merkle root in the checkpoint and Merkle proof in relayed messages, simples, at the cost of multiple relayers including the same transaction payload multiple times.
So, we have the well-known ID of the gateway actor where these messages are going, so in theory Fendermint could inspect all FVM messages going to that actor and try to deserialize them as the EVM calldata payload that should contain an ethers
generated DTO. It would have to be a different one from the BottomUpCheckpoint
in Checkpoint.sol
you pointed at, because that one already has fully messages inside; if we want to use CIDs, that is. So it could be the equivalent of a CertifiedMessage
, ready to be validated, with an opaque CID field to be resolved by the time of execution, at which point it can be turned into the Solidity BottomUpCheckpoint
.
That way we can drop this SignedRelayedMessage
. It's existence is to make it clear that we are separating the act of relaying from the act of execution, which forces us to deal with the problem of who should pay for which part - the relayer pays for the validation, the inner message is paid for by its own sender(s). It should be possible to turn it around and lift the FVM message handling into Fendermint by parsing as well.
In any case, Lotus would have a different message format there which includes payloads.
/// The parent subnet already expects the last validator set to sign this one. | ||
pub next_validator_set_id: u64, | ||
/// Pointer at all the bottom-up messages included in this checkpoint. | ||
pub bottom_up_messages: Cid, // TODO: Use TCid |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We may need to include the list of messages here and not just the cid for L2->root interactions as described above
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This data structure doesn't have to be the be-all, end-all data structure. It is for bottom-up messages arriving into Fendermint, and that is all we care about. How it is managed with Lotus is the concern of those relayers which send this to Lotus. For example they can do the equivalent of this:
pub struct RelayedCheckpoint {
pub checkpoint: CertifiedMessage<BottomUpCheckpoint>,
/// The CID of these messages have to match the CID in the checkpoint.
pub bottom_up_messages: Vec<CrossMsg>
}
And then part of validation is to check that.
I know it sucks that you have a single set of Solidity actors and two different ideas for relaying the stuff 😞
#[derive(Clone, Debug, Serialize, Deserialize, PartialEq, Eq)] | ||
pub struct SignedRelayedMessage<T> { | ||
/// The relayed message with the relayer identity. | ||
pub message: RelayedMessage<T>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Man, I hate that Filecoin uses message
for transaction
🤦, it is so confusing sometimes. Here we are sending an arbitrary message, but then in the bottom_up_messages
of the checkpoint we have the Cid
of the bottom-up transactions being propagated, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's right, this is any arbitrary message that can be relayed, but in the point of inclusion in IpcMessage
we know what to expect actually, so it's generic in order to be able to potentially reuse it and to express the nature of relaying stuff. The bottom_up_messages
are pointing at a CID that we have to retrieve and execute in a different context, not paid for by the relayer. We can change it point at a CID of a different thing which has a snapshot in it as well.
The goal there was to be able to process the message in stages:
- Relayer sends something: we should be able to validate the thing before resolving it, so as not to start pulling unlimited data, which we do by at least validating the checkpoint meta-data against the subnet committee. The relayer pays for this validation work, and will be rewarded either now or later, when we execute.
- We resolve the content in the background.
- We execute the meta-data and the content together (each paid by someone else); we can reward the relayers (all of them) now.
This would not be possible if we pass all messages as a single FVM message to Solidity.
@adlrocha I don't fully understand this comment. In the data type I am proposing in this PR, this is exactly how it's working. I understand that with Lotus the actual To be honest the interaction with the Solidity actors is what I am most uncomfortable with, and wanted to discuss with you what it unfolds into. |
Co-authored-by: adlrocha <6717133+adlrocha@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After discussing sync, we agree on what needs to be improved and what our constraints are. Merge time :)
For the record, what we discussed about bottom-up checkpointing is that we basically have two approaches:
Alfonso is okay with adding 2 different methods to the Gateway for the above cases, and we'll have a switch in Fendermint to tell which kind of checkpoint to construct: a fat one with messages included, or a thin one with just a CID, depending on what kind of parent it has. |
Closes consensus-shipyard/ipc#315
Creates just the message definition for relayed bottom-up multisig checkpoints:
ChainMessage::Ipc
for IPC related messagesIpcMessage::BottomUpResolve
for relayed checkpoints. The checkpoint has a quorum certificate (multi-sig) from the subnet, and a signature from the relayer.IpcMessage::BottomUpExec
for resolved checkpoints ready for execution. This is where ABCI++ will come in; the checkpoint is stripped from the relayer and this transaction is proposed by the parent validators directly.