feat: Multinode P2P sync and POA ( Proof of Authority ) #1333

piux2 · 2023-11-04T03:57:31Z

Contributors' checklist...

Added new tests, or not needed, or not feasible
Provided an example (e.g. screenshot) to aid review or the PR is self-explanatory
Updated the official documentation or not needed
No breaking changes were made, or a BREAKING CHANGE: xxx message was included in the description
Added references to related issues and PRs
Provided any useful hints for running manual tests
Added new benchmarks to generated graphs, if any. More info here.

This is the solution for #1332

albttx · 2023-11-06T00:29:21Z

Hello, i wanted to do the same work on gnoland cli to easier interactions for setuping new chains / nodes on #1102

But please see discussion here #1105 and #1201 about the choices that have been made :)

moul

I need additional time to review.

Early concerns (to be confirmed during the real review):

The issue related to overriding os.Std*.
Exploring the possibility of moving most of the coordination logic to cmd/genesis to keep gnoland lightweight.
Considering an alternative coordination strategy that relies on full contract-based coordination. In this approach, the genesis can configure the initial set instead of maintaining a dual system where it is used for only a few (possibly only one) blocks.

moul · 2023-11-06T16:56:11Z

@zivkovicmilos can you check how much it overlaps with your work on genesis? if it can be made more compatible or if you think the location is not ideal?

@albttx can you give it a try and give us some feedback?

moul · 2023-12-07T08:02:48Z

gno.land/cmd/gnoland/start.go

 	if err != nil {
 		return fmt.Errorf("error in creating node: %w", err)
 	}

-	fmt.Fprintln(io.Err, "Node created.")
+	fmt.Println("Node created.")

 	if c.skipStart {


If we introduce a new "init" subcommand, we need to remove the "--skip-start" option from the "start" subcommand. This option was used to execute an initialization phase in a command before running on an initialized node.

For additional information, please refer to this PR: #1050

moul · 2023-12-07T08:04:43Z

I'm starting a full review now. To begin, I believe this PR is necessary for the launch as it helps manage the multinode p2p system. However, it is not directly related to PoA as it does not manage the validator set directly. Perhaps the title should be updated to reflect this.

moul · 2023-12-07T08:09:18Z

gno.land/cmd/gnoland/start_test.go

-			io.SetOut(commands.WriteNopCloser(mockOut))
-			io.SetErr(commands.WriteNopCloser(mockErr))
-			cmd := newRootCmd(io)
+		in, err := NewMockStdin(name)


It appears that it is used more as an Stdout mock than an Stdin one. The name is likely incorrect. Nevertheless, I want to remove the mockio.go file from this PR. I need to find a link where I explained the reason, or I'll add a new comment here to provide the reasoning.

OK, I remember now.

The goal is to transition to a logger-only approach for tm2.

Could you please simplify this pull request by removing all the added mock features and keeping it as it was before? I will create an issue to ensure that we replace commands.IO with a logger.

Here is the issue: #1419

moul · 2023-12-07T08:12:05Z

gno.land/cmd/gnoland/README.md


    $> gnoland start

 Afterward, you can interact with [`gnokey`](../gnokey) or launch a [`gnoweb`](../gnoweb) interface.
+
+
+## Option 2: Run a node and sync with a Proof of Authority (POA) network


Suggested change

## Option 2: Run a node and sync with a Proof of Authority (POA) network

## Option 2: Run a full node and sync with a network

This is a valid instruction for running a full node on a network, whether it is PoA or not.

moul · 2023-12-07T08:13:28Z

gno.land/cmd/gnoland/README.md

+
+- Retrive node id and give it trusted peers.
+
+      $> gnoland node


Suggested change

$> gnoland node

$> gnoland init

$> gnoland node

moul · 2023-12-07T08:17:33Z

gno.land/cmd/gnoland/README.md

+
+- Download genesis.json from a trusted source and save it to the root-dir/config directory.
+
+- Add your validator to the genesis file.


If we want to retain the POA title in this PR, which I find confusing, I suggest revamping the example. Instead of just adding yourself as a validator, the example should emphasize coordinating with others and running the genesis validator add multiple times.

The current explanation is not clear. A possible improvement is to instruct validators to add their node ID to the shared genesis file (e.g., on a GitHub repo) and ensure they start with persistent node IDs for connectivity with the updated genesis file.

However, it's important to include a prominent warning that this approach is not actually PoA or Po something. It's simply a genesis with peers and no proof of anything except "being in the genesis."

This is the gentx system for validators to be able to add themself inside the genesis.json
cf: #1102

moul · 2023-12-07T08:18:10Z

gno.land/cmd/gnoland/README.md

+
+## Option 2: Run a node and sync with a Proof of Authority (POA) network
+
+ - gnoland init


I believe this step is unnecessary in option 2.

Yes, it's needed to generate the $DAEMON_HOME (GNO_HOME) directory, with the config and data/

moul · 2023-12-07T08:21:38Z

gno.land/cmd/gnoland/README.md

+## Reset `gnoland` node back to genesis state. It's suitable for running test node
+
+    $> gnoland unsafe-reset-all
+
+It removes the database and reset validator state back to genesis state but leaves the genesis.json and config.toml files unchanged.
+
+The `unsafe-reset-all` command is labeled "unsafe" because:
+
+1. It irreversibly deletes all node data, risking data loss.
+2. It may lead to double signing or chain forks in production.
+3. It resets the `priv_validator_state.json`, and can cause network disruption if uncoordinated.
+
+## Reset `gnoland` node history back to genesis state.
+
+ It removes the datastore and keeps the validator state unchanged. It reduces the risk of double signing and chain fork when we sync history state from the genesis. The validator will not sign a block until the node has synced, passing the state where the validator stopped signing.
+
+    $> gnoland reset-state


@albttx, what are your thoughts on these commands? Have you run them in production or on the testnet in the cosmos ecosystem? I want to understand which case these two commands handle, and I think maybe one command would suffice.

@moul I never used it before because it's new and i wasn't aware of it, but i might now use it often!

In tendermint, the more data it stored for a validator, the slower it is...
So we often restart from fresh install (prunned snapshot or state-sync) to have only latest block and be more efficient.
gnoland reset-state will be use in that case

moul · 2023-12-07T08:23:30Z

gno.land/cmd/gnoland/init.go

+		commands.Metadata{
+			Name:       "init",
+			ShortUsage: "init",
+			ShortHelp:  "initialize gnoland node",


Currently, I believe it doesn't initialize a Gnoland node (the start command does that), but it does initialize a validator keypair. Perhaps the help message should provide more accurate information, or we could transfer all the initialization logic from start to this location to make the help message more relevant.

moul · 2023-12-07T08:24:33Z

gno.land/cmd/gnoland/init.go

+	privValStateFile := config.PrivValidatorStateFile()
+	var pv *privval.FilePV
+	if tmos.FileExists(privValKeyFile) {
+		logger.Info("Found private validator", "keyFile", privValKeyFile,


What about panicking with the message "initialization already done" or something similar?

@albttx, we need your feedback on the initialization "UX" please.

moul · 2023-12-07T08:26:05Z

gno.land/cmd/gnoland/main.go

+	"os"
+)
+
+func main() {


I believe it would be clearer to review the "main" and "newRootCmd" functions in the same file. I suggest switching back to "root.go" or renaming it to "main.go." However, it should remain as a single file, as it was previously.

moul · 2023-12-07T08:45:03Z

gno.land/cmd/gnoland/mockio.go

@@ -0,0 +1,116 @@
+package main


moul · 2023-12-07T08:48:06Z

gno.land/cmd/gnoland/show_validator.go

+func newValidatorCmd(bc baseCfg) *commands.Command {
+	cmd := commands.NewCommand(
+		commands.Metadata{
+			Name:       "validator",


What if we merge the validator and node into a single command? This command will always display the node ID and optionally display validator information. We can also include additional information, such as the pruning strategy or any other relevant details that you would like to quickly check or share.

Totally agreed, especially considering how a single node will almost always be tied to a single key pair (1 node = 1 validator)

moul · 2023-12-07T08:49:09Z

gno.land/cmd/gnoland/node_id.go

+func newNodeIDCmd(bc baseCfg) *commands.Command {
+	cmd := commands.NewCommand(
+		commands.Metadata{
+			Name:       "node",


As mentioned in the "validator" subcommand, I propose combining "node" and "validator" into one command, possibly called "info" or "node".

Edit: Upon further consideration, I believe it would be beneficial to implement a unique "info" command that provides the node ID, pubkey, and address.

Additionally, I propose using external helpers, potentially located in the contribs/ folder, to obtain more advanced information such as block height, connected peers, process status, and uptime. These helpers can primarily use the RPC info, read the config files or the raw datastore.

moul · 2023-12-07T08:51:36Z

gno.land/cmd/gnoland/root.go

 	)

 	return cmd
 }
+
+// we relies on the flag option to pass in the root directory before we can identify where


The comment is probably better placed above line 40.

moul · 2023-12-07T08:52:19Z

gno.land/cmd/gnoland/root.go

+		newStartCmd(bc),
+		newInitCmd(bc),
+		newResetAllCmd(bc),
+		newResetStateCmd(bc),
+		newNodeIDCmd(bc),
+		newValidatorCmd(bc),


I still need to understand the usages of the reset(s) commands, but I believe we should target these commands, and nothing more:

init

start

info

I agree with @moul, this reset functionality is definitely something that's nice to have (and would've saved me headaches with the local portal loop in the beginning), but I don't think they belong as part of the official chain binary. Resetting chain state shouldn't be straightforward, because it's not an action that should ever happen; even if there is a need for it, initializing a new chain (working directory) instance is always a better way to go

moul · 2023-12-07T08:53:31Z

gno.land/cmd/gnoland/node_id.go

+)
+
+// Display a node's persistent peer ID to the standard output.
+func newNodeIDCmd(bc baseCfg) *commands.Command {


By the way, the filename and function name should be consistent with "NodeID" and the subcommand should also be "node".

moul · 2023-12-07T09:47:10Z

gno.land/cmd/gnoland/root.go

 	cmd := commands.NewCommand(
 		commands.Metadata{
 			ShortUsage: "<subcommand> [flags] [<arg>...]",
-			ShortHelp:  "Starts the gnoland blockchain node",
-			Options: []ff.Option{
-				ff.WithConfigFileFlag("config"),


I'm not sure what I prefer, but I think we should choose one of two extremes. Currently, we're in an intermediary step.

One extreme option is to clearly differentiate between config files and flags. For example, we can remove items like "persistentPeers" that are available both in the config file and as a start flag.

An alternative option is to establish a standard practice of using a dual approach (config file + flag). This could be implemented globally or exclusively in the init function.

To be honest, I'm not certain about a specific plan. I can only say that the current status is a hybrid mix of implicit and explicit approaches. It is inconsistent, with sometimes one way and sometimes two ways, depending on the commands.

Related with #731.

albttx · 2023-12-07T13:10:28Z

gno.land/cmd/gnoland/README.md

+
+- Add your validator to the genesis file.
+
+      $> genesis validator add \


We could be more generic, with a gentx system, it could be use to add genesis validator, to setup account money, setup realms and packages ...

albttx · 2023-12-07T13:12:44Z

gno.land/cmd/gnoland/README.md

+
+- Start the node
+
+      $> gnoland start --prune "nothing"


Flag should be name --pruning as in the config file

albttx · 2023-12-07T13:15:37Z

gno.land/cmd/gnoland/README.md

+
+## Reset `gnoland` node back to genesis state. It's suitable for running test node
+
+    $> gnoland unsafe-reset-all


Please add the flag --keep-addr-book

Question: In cosmos-sdk, this command is under gaiad tendermint unsafe-reset-all
Should we do the same ?

Usage: gaiad tendermint [command] Available Commands: reset-state Remove all the data and WAL show-address Shows this node's tendermint validator consensus address show-node-id Show this node's ID show-validator Show this node's tendermint validator info unsafe-reset-all (unsafe) Remove all the data and WAL, reset this node's validator to genesis state version Print tendermint libraries' version

or maybe gnoland tm2 ?

zivkovicmilos

Thank you for making the first step towards multinode networks 🙏

There are things in this PR that I don't necessarily agree with, and I'd like for us to open discussions on each, because they are critical for UX and general chain orchestration:

Pruning strategies shouldn't be set in this PR; they need more discussion
Currently, you cannot start the chain without both theconfig.toml, and genesis.json; The relationship between these two needs to be completely revised. This PR focuses on adding helper methods for the config, but doesn't touch upon the entanglement with genesis.json. The thing that I've found lacking was the ability to even define values for the config.toml, which directly affect new nodes (ex. mempool settings, peer settings...but also the genesis.json location, which is hardcoded)
This PR doesn't have anything to do with Proof of Authority per-se, it adds the ability to define the initial validator set (but not to change it once the chain is running, in a permissioned way, which PoA is all about). From discussions we've been having with @moul and the team, validator set states / changes should be done on-chain, while initial validator set info can / needs to be set in the genesis. This PR doesn't work in any mechanism to change the set, only to alter the initial node config for the p2p network

I think the entire chain initialization flow we currently have is very clunky, and we can do better. New chain orchestration needs to be straightforward and simple, so it can be packaged into more complex workflows by devs like @albttx.

The genesis.json is the single most important piece of information for any chain; people should be able to start new nodes and connect to a chain using only the genesis.json, their keys, and the node binary -- that's it. The protocol-level configuration for the node can live in a separate file (and path passed in through a flag), but it shouldn't be mandatory like it is for gno chains currently.

There were some other minor things about the PR, mostly nitpicks and concerning docs. There were flows that don't work (ex. genesis validator add takes in a bech32 representation of the public key, and the commands introduced in this PR print a different format).
I agree with everything @moul has put down as part of his detailed review.

What I propose we do:

change the scope of this PR to handle the config.toml, change title and description accordingly
open discussions for the points I've mentioned above

zivkovicmilos · 2023-12-07T12:54:18Z

gno.land/cmd/gnoland/start.go

+	// prune nothing is the archive node setting
+	// prune syncable is the default node setting. It keeps the lastest 100 transactions and everything 1000th tx
+	prune := store.NewPruningOptionsFromString(c.pruneStrategy)
+	pruningOpt := sdk.SetPruningOptions(prune)
+	gnoBaseApp := gnoApp.(*sdk.BaseApp)
+	pruningOpt(gnoBaseApp)
+


I don't think pruning strategies should be a part of this PR, even if they utilize something that already exists in the SDK.

The reason being these are very large design decisions that have amplifications on end-users, to be included as an add-on. Pruning state is not the only change required to multi-node types, this split is more intricate:

what node types do we support?

how far back do they need to keep state?

what kind of state do they need to keep?

Not to mention that this directly affects how users now fetch data from RPC clusters, as now not all nodes can serve request (but need to delegate). Additionally, it also affects the way nodes execute live syncing (not every node has the entire state required to sync a peer). In my mind, these are things that require more planning and coordination, so I strongly suggest leaving out this pruning strategy change from the PR, and open an issue we can converge on

zivkovicmilos · 2023-12-07T12:57:37Z

gno.land/cmd/gnoland/show_validator.go

+func newValidatorCmd(bc baseCfg) *commands.Command {
+	cmd := commands.NewCommand(
+		commands.Metadata{
+			Name:       "validator",


Totally agreed, especially considering how a single node will almost always be tied to a single key pair (1 node = 1 validator)

zivkovicmilos · 2023-12-07T12:59:46Z

gno.land/cmd/gnoland/reset_test.go

+	"github.com/gnolang/gno/tm2/pkg/p2p"
+)
+
+func TestResetAll(t *testing.T) {


You can run tests in this file in parallel (t.Parallel)

zivkovicmilos · 2023-12-07T13:04:32Z

gno.land/cmd/gnoland/root.go

+		newStartCmd(bc),
+		newInitCmd(bc),
+		newResetAllCmd(bc),
+		newResetStateCmd(bc),
+		newNodeIDCmd(bc),
+		newValidatorCmd(bc),


I agree with @moul, this reset functionality is definitely something that's nice to have (and would've saved me headaches with the local portal loop in the beginning), but I don't think they belong as part of the official chain binary. Resetting chain state shouldn't be straightforward, because it's not an action that should ever happen; even if there is a need for it, initializing a new chain (working directory) instance is always a better way to go

zivkovicmilos · 2023-12-07T13:13:57Z

gno.land/cmd/gnoland/README.md

+      Address: "g14t47gv3v2z3pc23g3zr39mnc99w2cplp0jhqvv"
+      Pubkey: "E5IFULgXFdS49ILgvPmO3/8chuSWfbqw3zYXaNEP+60="
+


This output is not the correct format I get when running gnoland node, ex:

NodeID: g1et0nr2z32lhp6qyfqxdlenxmlw9h3auqy7fw7c

In cosmos-sdk, this is

gaiad tendermint show-node-id gaiad tendermint show-address

I believe it should be 2 separate commands because most of the time, people use them with my action to $(gnoland tm2 show-address)

zivkovicmilos · 2023-12-07T13:22:29Z

gno.land/cmd/gnoland/reset_test.go

+	require.Equal(t, int64(10), pv.LastSignState.Height)
+}
+
+func initFilesWithConfig(config *cfg.Config) error {


You can add *testing.T as a param, and call t.Helper inside the method, so it's clear it's a test helper method (not sure how the linter didn't catch this)

zivkovicmilos · 2023-12-07T13:25:42Z

gno.land/cmd/gnoland/root.go

+func (bc *baseCfg) RegisterFlags(fs *flag.FlagSet) {
+	fs.StringVar(
+		&bc.rootDir,
+		"root-dir",


Not sure why, but the flag value for root-dir is not saved, ex:

$ gnoland init --root-dir node1 .level 1 .msg Generated private validator keyFile testdir/config/priv_validator_key.json stateFile testdir/data/priv_validator_state.json .level 1 .msg Generated node key path testdir/config/node_key.json

zivkovicmilos · 2023-12-07T13:29:57Z

gno.land/cmd/gnoland/show_validator.go

+	bz := amino.MustMarshalJSON(pubKey)
+
+	fmt.Printf("Address: \"%v\"\n", pubKey.Address())
+	fmt.Printf(" Pubkey: %v\n", string(bz))


Nitpick: there is an extra space between " and Pubkey

zivkovicmilos · 2023-12-07T13:34:56Z

gno.land/cmd/gnoland/node_id.go

+
+func execNodeID(bc baseCfg) error {
+	config := bc.tmConfig
+	nodeKey, err := p2p.LoadNodeKey(config.NodeKeyFile())


This is a more general question for everyone:
Why is the validator key != node key?

What is the point of keeping 2 separate keys? Why not just use the validator's private key as the node key, since they're both based on the 25519 curve?

zivkovicmilos · 2023-12-07T13:40:57Z

gno.land/cmd/gnoland/README.md

+
+      $> genesis validator add \
+       --address g14t47gv3v2z3pc23g3zr39mnc99w2cplp0jhqvv \
+       --pub-key E5IFULgXFdS49ILgvPmO3/8chuSWfbqw3zYXaNEP+60= \


The value for this flag is not good here, getting an error:
invalid validator public key, string not all lowercase or all uppercase

It's because genesis validator add expects a bech32 representation of the public key

albttx · 2023-12-07T14:32:05Z

gno.land/cmd/gnoland/README.md

+      Address: "g14t47gv3v2z3pc23g3zr39mnc99w2cplp0jhqvv"
+      Pubkey: "E5IFULgXFdS49ILgvPmO3/8chuSWfbqw3zYXaNEP+60="
+


In cosmos-sdk, this is

gaiad tendermint show-node-id gaiad tendermint show-address

I believe it should be 2 separate commands because most of the time, people use them with my action to $(gnoland tm2 show-address)

piux2 · 2024-10-04T15:59:36Z

Outdated.

piux2 added 2 commits November 3, 2023 01:34

feat: multinode sync and POA

627c343

feat: multinode sync and POA

39a2a6c

piux2 requested a review from a team as a code owner November 4, 2023 03:57

github-actions bot added the 📦 ⛰️ gno.land Issues or PRs gno.land package related label Nov 4, 2023

github-actions bot assigned piux2 Nov 4, 2023

piux2 changed the title ~~Feat: Multinode P2P sync and POA ( Proof of Authority )~~ feat: Multinode P2P sync and POA ( Proof of Authority ) Nov 4, 2023

moul requested review from moul and zivkovicmilos November 6, 2023 16:52

moul requested changes Nov 6, 2023

View reviewed changes

moul requested a review from albttx November 6, 2023 16:55

moul mentioned this pull request Nov 7, 2023

feat: unsafe reset all #1196

Open

7 tasks

moul reviewed Dec 7, 2023

View reviewed changes

gno.land/cmd/gnoland/mockio.go

@@ -0,0 +1,116 @@

package main

Copy link

Member

moul Dec 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See #1419

moul reviewed Dec 7, 2023

View reviewed changes

albttx reviewed Dec 7, 2023

View reviewed changes

zivkovicmilos requested changes Dec 7, 2023

View reviewed changes

albttx requested changes Dec 7, 2023

View reviewed changes

moul mentioned this pull request Apr 26, 2024

feat: overhaul lazy init in gnoland start #1985

Merged

7 tasks

piux2 closed this Oct 4, 2024

	## Option 2: Run a node and sync with a Proof of Authority (POA) network
	## Option 2: Run a full node and sync with a network


		- Download genesis.json from a trusted source and save it to the root-dir/config directory.

		- Add your validator to the genesis file.


		## Option 2: Run a node and sync with a Proof of Authority (POA) network

		- gnoland init


		- Add your validator to the genesis file.

		$> genesis validator add \


		## Reset `gnoland` node back to genesis state. It's suitable for running test node

		$> gnoland unsafe-reset-all

		Address: "g14t47gv3v2z3pc23g3zr39mnc99w2cplp0jhqvv"
		Pubkey: "E5IFULgXFdS49ILgvPmO3/8chuSWfbqw3zYXaNEP+60="

feat: Multinode P2P sync and POA ( Proof of Authority ) #1333

feat: Multinode P2P sync and POA ( Proof of Authority ) #1333

Conversation

piux2 commented Nov 4, 2023

albttx commented Nov 6, 2023 • edited Loading

moul left a comment • edited Loading

Choose a reason for hiding this comment

moul commented Nov 6, 2023

Choose a reason for hiding this comment

moul commented Dec 7, 2023

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

moul Dec 7, 2023 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

moul Dec 7, 2023 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

zivkovicmilos left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

piux2 commented Oct 4, 2024 • edited Loading

albttx commented Nov 6, 2023 •

edited

Loading

moul left a comment •

edited

Loading

moul Dec 7, 2023 •

edited

Loading

moul Dec 7, 2023 •

edited

Loading

piux2 commented Oct 4, 2024 •

edited

Loading