[fix] #1716: Fix consensus failure with f=0 cases #2124

s8sato · 2022-04-20T06:14:14Z

Description of the Change

The topology reshuffles when the shift goes full circle
Either of the following:
- Give a true solution to the failure
- ~~Give a mitigation by exposing the number of the last view changes as the network sanity indicator~~

Minor changes

As a bonus, expose the number of the view changes in the current round as the network sanity indicator
Refactor kura_inspector with clap version bump

Issue

Closes #1716 (The consensus fails when f decrements to 0)

Benefits

Possible Drawbacks

core/src/sumeragi/network_topology.rs

s8sato · 2022-04-20T18:50:14Z

+    pub fn reshuffle_after(&self) -> u64 {
+        self.sorted_peers.len() as u64


I think this fits the view change strategy.

I also think it's better not to have it be a configuration parameter.

Here are some of my preliminary thoughts.

I think you are assuming that the peers are shifted by one. I think they are, but I'm not entirely certain. Can you confirm this?

Now, if we analyze the problem here, the end goal is to find the next valid topology, if one exists, in the minimum number of iterations in the average case (or maybe in the worst case?). Since we don't have hijiri yet and we assume all peers to be equally trustworthy, I think we could gain some benefit from making it a bit more stochastic, i.e. reshuffling even before we've exhausted all the shifts of the current relative ordering.

I also notice that we don't reset view_change_proofs to 0 after reshuffling. Is that OK? I think this means we'll be reshuffling on every view change once reshuffle_after_n_view_changes is reached.

> Can you confirm this?

Yeah, confirmed by a new unit test: network_topology::tests::topology_shifts_by_one_at_view_change
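For readers without the code at hand, here is a minimal sketch of the property such a test would check. It assumes the topology is just an ordered list of peer IDs and that a view change rotates it left by one position; the `Topology` struct, the `shift_by_one` name, and the rotation direction are illustrative assumptions, not the actual iroha implementation.

```rust
/// Hypothetical stand-in for the network topology: an ordered list of peer IDs.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Topology {
    sorted_peers: Vec<u32>,
}

impl Topology {
    /// Rotate the peer order by one position, as assumed to happen on a view change.
    /// The emptiness guard matters: `rotate_left(1)` panics on an empty slice.
    fn shift_by_one(&mut self) {
        if !self.sorted_peers.is_empty() {
            self.sorted_peers.rotate_left(1);
        }
    }
}
```

With four peers `[0, 1, 2, 3]`, one shift gives `[1, 2, 3, 0]`, and after four shifts the original order returns, which is the "full circle" that triggers the reshuffle in this PR.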

> find the next valid topology

Interesting problem.
I guess one of your points is that just shifting by one is very likely to produce a topology with a similar set of faulty peers.
So if every peer were equally likely to be faulty, I'd minimize the overlapping section by shifting by f.
In reality, however, things are more complicated because:

- the leader and the proxy tail are privileged
- there can also be artificial faults

> reshuffling all the time after reshuffle_after_n_view_changes is reached

You are right. This is a mis-implementation unless the parameter increases with every reshuffle.
https://github.com/hyperledger/iroha/blob/81f46bb1a14e16d6eb4cf33f691b673237ec371d/core/src/sumeragi/network_topology.rs#L208
I think we can instead check whether view_change_proofs.len() is congruent to 0 modulo sorted_peers.len(),
as long as we keep the current shift-by-one strategy.
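To make the difference concrete, here is a rough sketch of the two checks under discussion. The function names are hypothetical, and the counts are passed as plain numbers instead of being read from `view_change_proofs` and `sorted_peers`. The `n_view_changes > 0` guard is also an assumption: it reflects that the check is only meaningful after at least one view change.

```rust
/// Threshold variant described above as a mis-implementation: once the
/// threshold is reached, every later view change also triggers a reshuffle,
/// because the counter is never reset.
fn should_reshuffle_threshold(n_view_changes: usize, reshuffle_after: usize) -> bool {
    n_view_changes >= reshuffle_after
}

/// Modulo variant: reshuffle only when the shift has gone full circle,
/// i.e. the number of view changes is a positive multiple of the peer count.
fn should_reshuffle_modulo(n_view_changes: usize, n_peers: usize) -> bool {
    // Guard against an empty topology to avoid a division by zero.
    n_peers > 0 && n_view_changes > 0 && n_view_changes % n_peers == 0
}
```

With 4 peers, the threshold variant fires at 4, 5, 6, ... view changes, while the modulo variant fires only at 4, 8, 12, and so on. This holds only while each view change shifts the topology by exactly one.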

> Well, I think we can check if view_change_proofs.len() is congruent to 0 modulo sorted_peers.len()

Yeah, this would work in the current implementation. If you're going to fix this, I would appreciate a test as well; if not, please open an issue.

> In reality, however, things are more complicated because

It's hard to say what the best method of shifting/shuffling peers would be. Maybe we should open a discussion about this in a separate issue and have it shift a full circle here, as you did.

As for the shuffle/shift decision, I'll fix it and add a test in this PR.
As for the broader discussion about the view change method, I created #2133.


github-actions bot added the iroha2-dev label (The re-implementation of a BFT hyperledger in RUST) Apr 20, 2022

s8sato force-pushed the fix/1716 branch from 318fc70 to 950feae April 20, 2022 16:46

s8sato marked this pull request as ready for review April 20, 2022 18:25

s8sato requested review from appetrosyan, mversic and Arjentix as code owners April 20, 2022 18:25

s8sato commented Apr 20, 2022

s8sato changed the title ~~[fix] #1716: Rescue the consensus failure when f decrements to 0~~ [fix] #1716: Fix consensus failure with f=0 cases Apr 20, 2022

mversic previously approved these changes Apr 21, 2022

appetrosyan requested a review from SamHSmith April 21, 2022 08:40

appetrosyan assigned appetrosyan and mversic Apr 21, 2022

appetrosyan reviewed Apr 21, 2022

s8sato mentioned this pull request Apr 21, 2022

Revise the view change method #2133

Closed

s8sato added 5 commits April 22, 2022 16:47

Let n_topology_shifts_before_reshuffle be the topology length

cc02a09

Signed-off-by: s8sato <49983831+s8sato@users.noreply.github.com>

Refactor kura_inspector with clap version bump

8e1f3ac

Signed-off-by: s8sato <49983831+s8sato@users.noreply.github.com>

Make the leader commit at its own discretion and broadcast when f=0

b6a3560

Signed-off-by: s8sato <49983831+s8sato@users.noreply.github.com>

Expose the number of the view changes in the current round as the network sanity indicator

4ba9da7

Signed-off-by: s8sato <49983831+s8sato@users.noreply.github.com>

Apply review comments

1d40b46

Signed-off-by: s8sato <49983831+s8sato@users.noreply.github.com>

s8sato dismissed mversic’s stale review via 1d40b46 April 22, 2022 07:47

s8sato force-pushed the fix/1716 branch from 8fbe8a3 to 1d40b46 April 22, 2022 07:47

s8sato commented Apr 22, 2022

mversic requested review from appetrosyan and mversic April 22, 2022 07:55

mversic approved these changes Apr 22, 2022

appetrosyan approved these changes Apr 22, 2022

appetrosyan merged commit da64704 into hyperledger-iroha:iroha2-dev Apr 22, 2022

s8sato added the api-changes label (Changes in the API for client libraries) Apr 25, 2022

mversic pushed a commit to mversic/iroha that referenced this pull request May 2, 2022

[fix] hyperledger-iroha#1716: Fix consensus failure with f=0 cases (hyperledger-iroha#2124)

d3d2708

appetrosyan pushed a commit to appetrosyan/iroha that referenced this pull request May 4, 2022

[fix] hyperledger-iroha#1716: Fix consensus failure with f=0 cases (hyperledger-iroha#2124)

3eef19c

appetrosyan pushed a commit to appetrosyan/iroha that referenced this pull request May 12, 2022

[fix] hyperledger-iroha#1716: Fix consensus failure with f=0 cases (hyperledger-iroha#2124)

e0d7fe0

appetrosyan pushed a commit to appetrosyan/iroha that referenced this pull request May 12, 2022

[fix] hyperledger-iroha#1716: Fix consensus failure with f=0 cases (hyperledger-iroha#2124)

346c417

appetrosyan pushed a commit to appetrosyan/iroha that referenced this pull request May 12, 2022

[fix] hyperledger-iroha#1716: Fix consensus failure with f=0 cases (hyperledger-iroha#2124)

38e0920

appetrosyan pushed a commit to appetrosyan/iroha that referenced this pull request May 12, 2022

[fix] hyperledger-iroha#1716: Fix consensus failure with f=0 cases (hyperledger-iroha#2124)

ca30699

appetrosyan pushed a commit to appetrosyan/iroha that referenced this pull request May 12, 2022

[fix] hyperledger-iroha#1716: Fix consensus failure with f=0 cases (hyperledger-iroha#2124)

e14e8ff

appetrosyan pushed a commit to appetrosyan/iroha that referenced this pull request May 12, 2022

[fix] hyperledger-iroha#1716: Fix consensus failure with f=0 cases (hyperledger-iroha#2124)

aae24c3

mversic pushed a commit to mversic/iroha that referenced this pull request May 13, 2022

[fix] hyperledger-iroha#1716: Fix consensus failure with f=0 cases (hyperledger-iroha#2124)

e2c0975

mversic pushed a commit to mversic/iroha that referenced this pull request May 13, 2022

[fix] hyperledger-iroha#1716: Fix consensus failure with f=0 cases (hyperledger-iroha#2124)

43f67f3
