This repository has been archived by the owner on Jan 11, 2024. It is now read-only.

Top down discussion #181

Closed · wants to merge 7 commits
15 changes: 15 additions & 0 deletions Cargo.lock

Some generated files are not rendered by default.

4 changes: 3 additions & 1 deletion Cargo.toml
@@ -9,6 +9,7 @@ members = [
"fendermint/testing",
"fendermint/testing/*-test",
"fendermint/vm/*",
"fendermint/ipc",
]

# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
@@ -56,7 +57,7 @@ serde_tuple = "0.5"
serde_with = "2.3"
tempfile = "3.3"
thiserror = "1"
tokio = {version = "1", features = ["rt-multi-thread", "macros"]}
tokio = {version = "1", features = ["rt-multi-thread", "macros", "time"]}
tracing = "0.1"
tracing-subscriber = "0.3"

@@ -104,3 +105,4 @@ tendermint-proto = { version = "0.31" }

# Using the IPC SDK without the `fil-actor` feature so as not to depend on the actor `Runtime`.
ipc-sdk = { git = "https://github.com/consensus-shipyard/ipc-actors.git", default-features = false, branch = "main" }
ipc-gateway = { git = "https://github.com/consensus-shipyard/ipc-actors.git", features = [] }
1 change: 1 addition & 0 deletions fendermint/app/Cargo.toml
@@ -41,6 +41,7 @@ fendermint_vm_core = { path = "../vm/core" }
fendermint_vm_interpreter = { path = "../vm/interpreter", features = ["bundle"] }
fendermint_vm_message = { path = "../vm/message" }
fendermint_vm_genesis = { path = "../vm/genesis" }
fendermint_ipc = { path = "../ipc" }

cid = { workspace = true }
fvm = { workspace = true }
71 changes: 71 additions & 0 deletions fendermint/app/src/app.rs
@@ -6,8 +6,11 @@ use std::sync::{Arc, Mutex};

use anyhow::{anyhow, Context, Result};
use async_trait::async_trait;
use bytes::Bytes;
use cid::Cid;
use fendermint_abci::{AbciResult, Application};
use fendermint_ipc::pof::ProofOfFinality;
use fendermint_ipc::IPCMessage;
use fendermint_storage::{
Codec, Encode, KVCollection, KVRead, KVReadable, KVStore, KVWritable, KVWrite,
};
@@ -24,6 +27,7 @@ use fendermint_vm_interpreter::signed::InvalidSignature;
use fendermint_vm_interpreter::{
CheckInterpreter, ExecInterpreter, GenesisInterpreter, QueryInterpreter,
};
use fendermint_vm_message::chain::ChainMessage;
use fvm::engine::MultiEngine;
use fvm_ipld_blockstore::Blockstore;
use fvm_shared::chainid::ChainID;
@@ -125,6 +129,8 @@ where
///
/// Zero means unlimited.
state_hist_size: u64,
/// The top down parent finality.
parent_finality: ProofOfFinality,
Contributor:

We could maybe convert this into a ParentViewManager responsible for acting as the peer's interface to the parent. As discussed in Slack:

  • It keeps track of the latest heights seen as final in the parent.
  • It exposes an interface to query from the parent.

Contributor:

I see you have a ParentSyncer abstraction below. I think what we need here is something like this with the aforementioned functionality.
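
A rough sketch of what such an interface could look like, based purely on the two points above (the ParentViewManager name and all method signatures are hypothetical, not part of this PR):

// Hypothetical ParentViewManager sketch; illustrative only.
use async_trait::async_trait;

#[async_trait]
pub trait ParentViewManager {
    /// The latest parent height this peer currently regards as final.
    fn latest_final_height(&self) -> u64;

    /// Record a newly observed final parent height (e.g. from the polling loop).
    fn record_final_height(&mut self, height: u64);

    /// Query the block hash of the parent at the given height.
    async fn block_hash(&self, height: u64) -> anyhow::Result<Vec<u8>>;
}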

}

impl<DB, SS, S, I> App<DB, SS, S, I>
@@ -157,6 +163,7 @@ where
interpreter: Arc::new(interpreter),
exec_state: Arc::new(Mutex::new(None)),
check_state: Arc::new(tokio::sync::Mutex::new(None)),
parent_finality: (),
};
app.init_committed_state()?;
Ok(app)
@@ -291,6 +298,36 @@ where
let state = self.committed_state()?;
Ok((state.state_params, state.block_height))
}

async fn verify_ipc_message(&self, ipc_message: IPCMessage) -> bool {
match ipc_message {
IPCMessage::TopDown(finality) => self.parent_finality.check_finality(&finality).await,
IPCMessage::BottomUp => unimplemented!(),
}
}

/// Verifies the list of transactions to see if they are valid. Returns true if all the txns
/// are valid and false if any one of the txns is invalid.
Contributor:

Like we discussed, let's add a reminder here to explore an alternative serialization that would allow us to inspect the first few bytes of a transaction to determine their type without having to deserialize all of them. No need to address it now, but definitely something worth exploring.
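
As a reminder of the idea, here is a minimal sketch of one such scheme (purely illustrative, not part of this PR): prefix each serialized transaction with a tag byte so its type can be checked by inspecting the first byte only.

// Illustrative tag-byte scheme; constants and helpers are hypothetical.
const TAG_SIGNED: u8 = 0;
const TAG_IPC_TOP_DOWN: u8 = 1;

fn encode_with_tag(tag: u8, payload: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(1 + payload.len());
    out.push(tag);
    out.extend_from_slice(payload);
    out
}

/// Cheap type check: only the first byte is inspected, nothing is deserialized.
fn is_ipc_top_down(tx: &[u8]) -> bool {
    tx.first() == Some(&TAG_IPC_TOP_DOWN)
}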

async fn verify_messages(&self, txns: &Vec<Bytes>) -> bool {
for tx in txns {
let is_ok = match serde_json::from_slice::<ChainMessage>(tx) {
Contributor:

If you take a look at the BytesMessageInterpreter you will see that we use the IPLD binary format for messages, not JSON. The Cosmos SDK uses JSON, which may be why you thought Tendermint does as well, but it's actually agnostic, and I thought we should stick to the Filecoin convention of IPLD (not that JSON wouldn't be more convenient).

So this check should not be here, or exist in this format, because checking this is exactly the purpose of BytesMessageInterpreter. I added this check in #185.
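
For comparison, decoding with the IPLD (DAG-CBOR) convention would look roughly like this (a sketch, assuming ChainMessage derives Serialize/Deserialize and the fvm_ipld_encoding crate is available):

// Sketch: IPLD/CBOR decoding instead of serde_json.
fn decode_chain_message(tx: &[u8]) -> anyhow::Result<ChainMessage> {
    let msg: ChainMessage = fvm_ipld_encoding::from_slice(tx)?;
    Ok(msg)
}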

Err(_) => {
tracing::info!("cannot deserialized transaction: {tx:?}");
continue;
}
Ok(message) => match message {
ChainMessage::IPC(ipc) => self.verify_ipc_message(ipc).await,
_ => continue,
},
};

if !is_ok {
return false;
}
}

true
}
}

// NOTE: The `Application` interface doesn't allow failures at the moment. The protobuf
@@ -490,6 +527,40 @@ where
Ok(response)
}

async fn prepare_proposal(
&self,
request: request::PrepareProposal,
) -> AbciResult<response::PrepareProposal> {
let max_tx_bytes: usize = request.max_tx_bytes.try_into().unwrap();
let mut size: usize = 0;
let mut txs = Vec::new();
for tx in request.txs {
if size.saturating_add(tx.len()) > max_tx_bytes {
Contributor:

I have a question here (probably for @aakoshh): when a transaction is disregarded for a proposal, it is still kept in the mempool for a future proposal, right?

Contributor:

I would think so.

break;
}
size += tx.len();
txs.push(tx);
}
Comment on lines +545 to +547
Contributor:

I added this as a utility function to #185 so there's no need to repeat it in app.rs.
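
For reference, such a helper might look roughly like this (a sketch only; the actual utility in #185 may differ):

// Sketch: keep transactions in order until the size budget would be exceeded.
fn take_until_max_size(txs: Vec<bytes::Bytes>, max_tx_bytes: usize) -> Vec<bytes::Bytes> {
    let mut size: usize = 0;
    let mut out = Vec::new();
    for tx in txs {
        if size.saturating_add(tx.len()) > max_tx_bytes {
            break;
        }
        size += tx.len();
        out.push(tx);
    }
    out
}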


let proof = self.parent_finality.get_finality();
Contributor:

I think this is not correct. The view of the parent finality currently seen should be kept IMO by the ParentSyncer, and in this PoF data structure we should only keep the latest finality committed (I think we are merging both concepts here and it is a bit confusing).
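
A minimal sketch of the split suggested here (names hypothetical, not in this PR): the syncer keeps the moving view of the parent, while the finality structure only records what has last been committed.

// Hypothetical: store only the last committed finality here, and leave the
// evolving parent view to the ParentSyncer/cache.
pub struct LastCommittedFinality {
    /// Parent height the subnet has already committed as final.
    pub height: u64,
    /// Block hash of the parent at that height.
    pub block_hash: Vec<u8>,
}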

let bytes = serde_json::to_vec(&ChainMessage::IPC(IPCMessage::TopDown(proof)))
.expect("should not have failed");
txs.push(bytes.into());

Ok(response::PrepareProposal { txs })
}

async fn process_proposal(
&self,
request: request::ProcessProposal,
) -> AbciResult<response::ProcessProposal> {
if !self.verify_messages(&request.txs).await {
Ok(response::ProcessProposal::Reject)
} else {
Ok(response::ProcessProposal::Accept)
}
}

/// Signals the beginning of a new block, prior to any `DeliverTx` calls.
async fn begin_block(&self, request: request::BeginBlock) -> AbciResult<response::BeginBlock> {
let db = self.state_store_clone();
15 changes: 15 additions & 0 deletions fendermint/ipc/Cargo.toml
@@ -0,0 +1,15 @@
[package]
name = "fendermint_ipc"
version = "0.1.0"
edition = "2021"
description = "Interfacing with IPC"
# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html

[dependencies]
tokio = { workspace = true }
anyhow = { workspace = true }
ipc-sdk = { workspace = true }
async-trait = { workspace = true }
tracing = { workspace = true }
serde = { workspace = true }
serde_json = { workspace = true }
18 changes: 18 additions & 0 deletions fendermint/ipc/src/lib.rs
@@ -0,0 +1,18 @@
// Copyright 2022-2023 Protocol Labs
// SPDX-License-Identifier: Apache-2.0, MIT
//! Interfacing with IPC, provides utility functions

mod message;
mod parent;
pub mod pof;

pub use message::IPCMessage;

#[derive(Debug, Clone)]
pub struct Config {
/// The number of blocks to delay reporting when creating the proof of finality
Contributor:

👍

chain_head_delay: u64,

/// Parent syncing cron period, in seconds
polling_interval: u64,
}
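
A small sketch of how these two settings might drive the syncer (illustrative only; it assumes the tokio "time" feature enabled in the workspace Cargo.toml above, and query_parent_head is a hypothetical stand-in for a real query to the parent):

// Sketch: poll the parent every `polling_interval` seconds and lag the chain
// head by `chain_head_delay` blocks before treating a height as final.
async fn run_polling_loop(config: Config) -> anyhow::Result<()> {
    let mut ticker =
        tokio::time::interval(std::time::Duration::from_secs(config.polling_interval));
    loop {
        ticker.tick().await;
        let head = query_parent_head().await?;
        let final_height = head.saturating_sub(config.chain_head_delay);
        tracing::debug!("parent height {final_height} considered final");
    }
}

// Hypothetical stand-in for an actual query via the IPC agent.
async fn query_parent_head() -> anyhow::Result<u64> {
    Ok(0)
}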
16 changes: 16 additions & 0 deletions fendermint/ipc/src/message.rs
@@ -0,0 +1,16 @@
//! IPC messages

use crate::pof::IPCParentFinality;
use serde::{Deserialize, Serialize};

#[derive(Debug, Clone, Serialize, Deserialize)]
pub enum IPCMessage {
Contributor:

Let's move the IPC related messages (and messages only) into the fendermint_vm_message crate, so that it doesn't have to include references to implementation crates such as the IPC Agent proxy. It should be a crate that anyone can add as a dependency to be able to inspect chain messages, until we figure out a way to make them more generic.

Contributor:

On a side note: I always struggle to decide between using acronyms in PascalCasing names, but I usually go with only capitalising the first letter, e.g. ProductId, XmlHttpRequest, JsonRpcServer, rather than ProductID, XMLHTTPRequest, JSONRPCServer. I see there are all sorts of variations out there, but I found this to be a perhaps less aesthetically pleasing, but at least consistent way.

Have you guys established the IPC* prefix?

Contributor:

I added the equivalent of this enum in the crate I mentioned above in #187 with a placeholder for TopDown.

TopDown(IPCParentFinality),
Contributor:

Ah, yes, almost forgot: because every validator has the actual content of this from its own parent, there is no need to propose it in its entire glory.

The IPCParentFinality (isn't there a better name? IpcTopDownBundle?) can have an unknown number of messages. Imagine the subnet nodes restarting after a relatively long hiatus: the proposal now covers multiple heights finalized on the parent, hours' worth or more, and we end up pushing thousands of cross messages to each other, timing out, and screaming "I know!" because we already have them in memory locally too!

Instead of including everything in the proposal, we can include just the height/nonce of the messages up to which point we want to execute, and the CID of the resulting data structure that has all those messages in it, including the membership set up to that point (not everyone's different idea of latest).

Then everyone can construct their view of what that finality cutoff should contain and what its CID should be, and decide whether to vote on it or not.

This will also require a way to actually retrieve that CID from the others in case you don't have it during execution, so it's a good idea to save it into the blockstore, once it's agreed.

Contributor (Author):

I think using a Cid to replace all the top down messages will be a speed boost. But the issue now, as you mentioned, is that if someone does not have this Cid stored, there needs to be a way to retrieve it from other peers. Currently this mechanism is missing. Or I misunderstood.
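
A sketch of the compact proposal shape discussed in this thread (hypothetical, not in this PR; assumes the cid crate's serde support):

use cid::Cid;
use serde::{Deserialize, Serialize};

// Hypothetical compact proposal: only the cutoff height and the CID of the
// locally constructed bundle of top-down messages are proposed, not the
// messages themselves. Peers rebuild the bundle and compare CIDs when voting.
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct CompactTopDown {
    /// Execute parent messages up to and including this height.
    pub parent_height: u64,
    /// CID of the data structure holding those messages (and the membership
    /// set) up to that height; saved to the blockstore once agreed.
    pub bundle_cid: Cid,
}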

BottomUp,
}

impl From<IPCMessage> for Vec<u8> {
fn from(value: IPCMessage) -> Self {
serde_json::to_vec(&value).expect("should not happen")
Contributor:

Are we sure this should not happen? What if someone by mistake does this over a Vec<u8> that is not an IPCMessage? I don't think panicking is the smartest approach. What if we implement try_from instead?
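
A sketch of the try_from direction (parsing bytes back into an IPCMessage, which is the operation that can actually fail), keeping the PR's serde_json encoding purely for illustration:

// Sketch: fallible parsing instead of a panicking conversion.
impl TryFrom<&[u8]> for IPCMessage {
    type Error = serde_json::Error;

    fn try_from(bytes: &[u8]) -> Result<Self, Self::Error> {
        serde_json::from_slice(bytes)
    }
}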

Contributor:

Ah, this is super confusing actually. I read it as parsing Vec<u8> like 3 times already.

So yes, this will (probably) not fail, because we can always convert to JSON, and this isn't a case where we're running the risk of a malicious validator passing us untrustworthy byte content. However, I'd argue this implementation should not exist: it is not an inherent property of IPCMessage that it's transformed to binary via JSON. Case in point: we actually use IPLD. Let's not decide for everyone what they should use, and leave it up to the application implementer.

}
}
29 changes: 29 additions & 0 deletions fendermint/ipc/src/parent.rs
@@ -0,0 +1,29 @@
// Copyright 2022-2023 Protocol Labs
// SPDX-License-Identifier: Apache-2.0, MIT

//! Parent syncing related functions

use async_trait::async_trait;

/// Obtain the latest state information required for IPC from the parent subnet
#[async_trait]
pub trait ParentSyncer {
/// Get the latest height
async fn latest_height(&self) -> anyhow::Result<u64>;
/// Get the block hash at the target height
async fn block_hash(&self, height: u64) -> anyhow::Result<Vec<u8>>;
}

/// Constantly syncing with parent through polling
pub struct PollingParentSyncer {}
Contributor:

It would be great if we could have an implementation of ParentSyncer that periodically (and in parallel) populates a cache with the latest final information it has from the parent, and includes an option to explicitly query the IPC agent on a cache miss, for when a peer is syncing from scratch.

Maybe we could include a flag to determine whether a call to the ParentSyncer should be done with_fallback (triggering a query to the IPC agent if there is a cache miss) or without it.
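
A sketch of that caching behaviour (hypothetical; the cache, the with_fallback flag, and CachingParentSyncer are not part of this PR):

use std::collections::HashMap;
use std::sync::{Arc, Mutex};

// Hypothetical wrapper: a background task fills `cache`, while reads consult
// the cache first and optionally fall back to a direct query on a miss.
pub struct CachingParentSyncer<S> {
    inner: Arc<S>,
    cache: Arc<Mutex<HashMap<u64, Vec<u8>>>>,
}

impl<S: ParentSyncer + Send + Sync> CachingParentSyncer<S> {
    pub async fn block_hash(&self, height: u64, with_fallback: bool) -> anyhow::Result<Vec<u8>> {
        if let Some(hash) = self.cache.lock().unwrap().get(&height).cloned() {
            return Ok(hash);
        }
        if with_fallback {
            // Cache miss: query the underlying syncer (e.g. the IPC agent) directly.
            let hash = self.inner.block_hash(height).await?;
            self.cache.lock().unwrap().insert(height, hash.clone());
            return Ok(hash);
        }
        Err(anyhow::anyhow!("no cached block hash for height {height}"))
    }
}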


#[async_trait]
impl ParentSyncer for PollingParentSyncer {
async fn latest_height(&self) -> anyhow::Result<u64> {
todo!()
}

async fn block_hash(&self, _height: u64) -> anyhow::Result<Vec<u8>> {
todo!()
}
}