Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: fixing some nondeterministic tests #132

Merged
merged 1 commit into from
May 19, 2022
Merged

WIP: fixing some nondeterministic tests #132

merged 1 commit into from
May 19, 2022

Conversation

jaekwon
Copy link
Contributor

@jaekwon jaekwon commented Apr 22, 2022

From https://github.com/moul/gno/runs/6054431723?check_suite_focus=true

  • fix TestTransportMultiplexAcceptNonBlocking
  • fix TestSignerVoteKeepAlive

The timeout based tests are definitely a problem.
The privval socket system maybe requires a redo, or merging fixes from mainline tendermint at some point.
The p2p/transport system is also pretty confusing.

This PR can stay open while we find more issues.

@moul
Copy link
Member

moul commented Apr 22, 2022

I also have experience of flappy tests as soon as we're doing protocol using p2p; it's a real nightmare

what we did:

  • extend go to support a new kind of tests -> the flappy ones, that should work more than 0 times when run multiple times; it's not perfect but it's still better to not test at all. The goal is to have a way to specify from inside a test that the test is flappy so it's not run with the standard go test flow, but is run with a dedicated makefile rule that just runs the flappy tests. If you're interested, I can bootstrap the CI+makefile stuff to manage this.
  • a huge work on channels, contexts to replace most of the time.Sleep

@jaekwon
Copy link
Contributor Author

jaekwon commented Apr 27, 2022

@moul
They look to be located in ./pkgs/bft/{consensus,blockchain}; one of the is probably « TestReactorRecordsVotesAndBlockParts »
But it’s very inconsistent, I just made 10 run without issue

@moul
Copy link
Member

moul commented May 17, 2022

$ time go test -v ./pkgs/bft/blockchain
[…]
TestBadBlockStopsPeer
[…]
panic: Failed to process committed block (5:5CE69BF4527330E3B6055F2AA9936FA1C03CEADEF8F057E90A93BB9AD4269906): Wrong Block.Header.LastBlockID.  Expected 3B14C912CF5C850CA146D747D263159459E320907A2C6B70ECF3B8E863538271:1:F3F7E09BF331, got 32A5F608D8548CDBBF4522F7AF25E6FC82FD9963CB00F2F1777EB99F2920111B:1:A0A9F448A38A

goroutine 44473 [running]:
github.com/gnolang/gno/pkgs/bft/blockchain.(*BlockchainReactor).poolRoutine(0xc00bed0f00)
        /home/moul/go/src/moul.io/gno4/pkgs/bft/blockchain/reactor.go:343 +0x1425
created by github.com/gnolang/gno/pkgs/bft/blockchain.(*BlockchainReactor).OnStart
        /home/moul/go/src/moul.io/gno4/pkgs/bft/blockchain/reactor.go:117 +0x87
FAIL    github.com/gnolang/gno/pkgs/bft/blockchain      201.628s
FAIL

@moul
Copy link
Member

moul commented May 17, 2022

$ time go test -v ./pkgs/bft/consensus/ -count 2 -run TestHandshakeReplayNone -test.timeout=60m
[…]
.level 1 .msg Executed block height 6 validTxs 0 invalidTxs 0                                                                                                                                                                                         
.level 2 .msg CONSENSUS FAILURE!!! [err leveldb: closed stack goroutine 200 [running]:                                                                                                                                                                
runtime/debug.Stack()                                                                                                                                                                                                                                 
        /home/moul/.gvm/gos/go1.18/src/runtime/debug/stack.go:24 +0x65                                                                                                                                                                                
github.com/gnolang/gno/pkgs/bft/consensus.(*ConsensusState).receiveRoutine.func2()                                                                                                                                                                    
        /home/moul/go/src/moul.io/gno4/pkgs/bft/consensus/state.go:609 +0x4c                                                                                                                                                                          
panic({0x94ce60, 0xc00020b4d0})                                                                                                                                                                                                                       
        /home/moul/.gvm/gos/go1.18/src/runtime/panic.go:838 +0x207                                                                                                                                                                                    
github.com/gnolang/gno/pkgs/db.(*GoLevelDB).Set(0x9a1d60?, {0xded478?, 0x199?, 0x7f04e24c4a68?}, {0xc002fdb0b0?, 0xc000780400?, 0x199?})                                                                                                              
        /home/moul/go/src/moul.io/gno4/pkgs/db/go_level_db.go:67 +0x7f                                                                                                                                                                                
github.com/gnolang/gno/pkgs/bft/abci/example/kvstore.saveState({{0xabb6b0, 0xc00310a538}, 0x0, 0x6, {0xc002fe3918, 0x8, 0x8}})
        /home/moul/go/src/moul.io/gno4/pkgs/bft/abci/example/kvstore/kvstore.go:45 +0xe7
github.com/gnolang/gno/pkgs/bft/abci/example/kvstore.(*KVStoreApplication).Commit(0xc0007c0080)
        /home/moul/go/src/moul.io/gno4/pkgs/bft/abci/example/kvstore/kvstore.go:106 +0x12a
github.com/gnolang/gno/pkgs/bft/abci/example/kvstore.(*PersistentKVStoreApplication).Commit(0xc002fee120?)
        /home/moul/go/src/moul.io/gno4/pkgs/bft/abci/example/kvstore/persistent_kvstore.go:85 +0x5b
github.com/gnolang/gno/pkgs/bft/abci/client.(*localClient).CommitSync(0xe66440?)
        /home/moul/go/src/moul.io/gno4/pkgs/bft/abci/client/local_client.go:188 +0xd4
github.com/gnolang/gno/pkgs/bft/proxy.(*appConnConsensus).CommitSync(0x203000?)
        /home/moul/go/src/moul.io/gno4/pkgs/bft/proxy/app_conn.go:81 +0x82
github.com/gnolang/gno/pkgs/bft/state.(*BlockExecutor).Commit(_, {{0x9f45b7, 0xb}, {0x9f45b7, 0xb}, {0x9f1f20, 0x6}, {0xc000815a70, 0xf}, 0x6, ...}, ...)
        /home/moul/go/src/moul.io/gno4/pkgs/bft/state/execution.go:176 +0x269
github.com/gnolang/gno/pkgs/bft/state.(*BlockExecutor).ApplyBlock(_, {{0x9f45b7, 0xb}, {0x9f45b7, 0xb}, {0x9f1f20, 0x6}, {0xc000815a70, 0xf}, 0x6, ...}, ...)
        /home/moul/go/src/moul.io/gno4/pkgs/bft/state/execution.go:133 +0x52e
github.com/gnolang/gno/pkgs/bft/consensus.(*ConsensusState).finalizeCommit(0xc0000dd500, 0x6)
        /home/moul/go/src/moul.io/gno4/pkgs/bft/consensus/state.go:1353 +0x97e
github.com/gnolang/gno/pkgs/bft/consensus.(*ConsensusState).tryFinalizeCommit(0xc0000dd500, 0x6)
        /home/moul/go/src/moul.io/gno4/pkgs/bft/consensus/state.go:1281 +0x305
github.com/gnolang/gno/pkgs/bft/consensus.(*ConsensusState).enterCommit.func1()
        /home/moul/go/src/moul.io/gno4/pkgs/bft/consensus/state.go:1227 +0x8e
github.com/gnolang/gno/pkgs/bft/consensus.(*ConsensusState).enterCommit(0xc0000dd500, 0x6, 0x0)
        /home/moul/go/src/moul.io/gno4/pkgs/bft/consensus/state.go:1258 +0xab0
github.com/gnolang/gno/pkgs/bft/consensus.(*ConsensusState).addVote(0xc0000dd500, 0xc002600f00, {0x0, 0x0})
        /home/moul/go/src/moul.io/gno4/pkgs/bft/consensus/state.go:1646 +0x8cf
github.com/gnolang/gno/pkgs/bft/consensus.(*ConsensusState).tryAddVote(0xc0000dd500, 0x475886?, {0x0?, 0xc00260dbf0?})
        /home/moul/go/src/moul.io/gno4/pkgs/bft/consensus/state.go:1489 +0x27
github.com/gnolang/gno/pkgs/bft/consensus.(*ConsensusState).handleMsg(0xc0000dd500, {{0xab47c0?, 0xc0000af3a8?}, {0x0?, 0xeda159371?}})
        /home/moul/go/src/moul.io/gno4/pkgs/bft/consensus/state.go:695 +0x388
github.com/gnolang/gno/pkgs/bft/consensus.(*ConsensusState).receiveRoutine(0xc0000dd500, 0x0)
        /home/moul/go/src/moul.io/gno4/pkgs/bft/consensus/state.go:654 +0x465
created by github.com/gnolang/gno/pkgs/bft/consensus.(*ConsensusState).OnStart
        /home/moul/go/src/moul.io/gno4/pkgs/bft/consensus/state.go:346 +0x4cf
 [wal_generator wal_generator]]
.level 1 .msg Starting baseWAL impl baseWAL
.level 1 .msg Starting Group impl Group
.level 1 .msg Searching for height height 1 min 0 max 0
.level 1 .msg Found height 1 index 0
.level 1 .msg Executed block height 1 validTxs 0 invalidTxs 0 
.level 1 .msg Committed state height 1 txs 0 appHash 0000000000000000
.level 1 .msg Executed block height 2 validTxs 0 invalidTxs 0 
.level 1 .msg Committed state height 2 txs 0 appHash 0000000000000000
.level 1 .msg Executed block height 3 validTxs 0 invalidTxs 0 
.level 1 .msg Committed state height 3 txs 0 appHash 0000000000000000
.level 1 .msg Executed block height 4 validTxs 0 invalidTxs 0 
.level 1 .msg Committed state height 4 txs 0 appHash 0000000000000000
.level 1 .msg Executed block height 5 validTxs 0 invalidTxs 0 
.level 1 .msg Committed state height 5 txs 0 appHash 0000000000000000
.level 1 .msg Executed block height 6 validTxs 0 invalidTxs 0 
.level 1 .msg Committed state height 6 txs 0 appHash 0000000000000000
    replay_test.go:679: Error on abci handshake: error on replay: App block height (12) is higher than core (6)
.level 1 .msg Stopping baseWAL impl baseWAL
.level 1 .msg Stopping Group impl Group                
--- FAIL: TestHandshakeReplayNone (0.24s)
FAIL                                                                                                                       
FAIL    github.com/gnolang/gno/pkgs/bft/consensus       1.377s       
FAIL                                                                                                                        

Seems to be stable when called with -count 1, but not when >=2

Full logs here -> https://gist.github.com/moul/d15e000f8be82f8fc45325c3a8df1a19

@moul
Copy link
Member

moul commented May 18, 2022

I've merged #201 and opened #202

What do you think about merging this PR, and use the #202 issue to keep track of the ongoing flappy tests?

@moul
Copy link
Member

moul commented May 19, 2022

@jaekwon, I'll merge your PR now, because master is red again because of "TestTransportMultiplexAcceptNonBlocking"

let's continue on #202

@moul moul merged commit 1e68cd1 into master May 19, 2022
@moul moul deleted the deflap branch May 19, 2022 11:20
@moul moul added this to the 🏗1️⃣ test1.gno.land milestone Oct 20, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

2 participants