Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Small improvements to various flaky tests #1772

Merged
merged 18 commits into from
Nov 21, 2019
Merged

Conversation

asaj
Copy link
Contributor

@asaj asaj commented Nov 19, 2019

Description

  • Fixes flake in sync end-to-end test, in which we don't always provide enough time for the validators to mine the next block
  • Fixes flake in epoch rewards unit test, in which we don't always mine a block with the intended timestamp offset from the EpochRewards contract deployment block
  • Fixes flake in epoch rewards unit tests due to fixidity imprecision
  • Improves flake in governance end-to-end tests, description below:

When you see the should propose in a round robin fashion test failing, this is typically because one of the nodes is not able to successfully propose, and there's a round change. This is why the expected array (the array of validator signers) winds up being different than the first N block miners of the epoch (where N is the length of the array of validator signers). It seems this typically happens to 0x2ffe970257D93eae9d6B134f528b93b262C31030, which happens to be validator4, i.e. the last validator. Because we bring up validators in series, validator4 has not started its consensus engine by the time other nodes start sending it consensus messages. It fails to handle those messages [2] and disconnects, and does not reconnect by the time it needs to propose a new block. When it does, it doesn't have many (or possibly any) peers to send the preprepare to. To improve this flake, we bring up the nodes in parallel. As a further improvement, we wait a couple epochs before we begin testing blocks, so that if nodes did disconnect from each other, they have time to reconnect.

 145 DEBUG[11-19|20:15:22.450|eth/handler.go:353]                       Ethereum message handling failed         id=2d8a8afe30d9a560 conn=staticdial err="stop     ped engine"
 146 DEBUG[11-19|20:15:22.450|eth/handler.go:215]                       Removing Ethereum peer                   peer=2d8a8afe30d9a560
 147 TRACE[11-19|20:15:22.450|eth/downloader/downloader.go:330]         Unregistering sync peer                  peer=2d8a8afe30d9a560
 148 TRACE[11-19|20:15:22.450|p2p/peer.go:382]                          Protocol istanbul/64 failed              id=2d8a8afe30d9a560 conn=staticdial err="stop     ped engine"

@asaj asaj assigned asaj and unassigned nategraf Nov 19, 2019
@codecov
Copy link

codecov bot commented Nov 20, 2019

Codecov Report

Merging #1772 into master will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff           @@
##           master   #1772    +/-   ##
=======================================
  Coverage    74.2%   74.2%            
=======================================
  Files         278     278            
  Lines        7653    7653            
  Branches      956     672   -284     
=======================================
  Hits         5679    5679            
  Misses       1857    1857            
  Partials      117     117
Flag Coverage Δ
#mobile 74.2% <ø> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4570fde...82bac87. Read the comment docs.

@asaj asaj changed the title Reduce precision in epoch rewards test Small fixes to various flaky tests Nov 20, 2019
@asaj asaj assigned nambrot and unassigned asaj Nov 20, 2019
@asaj asaj assigned asaj and unassigned nambrot Nov 20, 2019
@timmoreton timmoreton added the Priority: P0 Blocker label Nov 20, 2019
@nambrot nambrot added betanet-blocker and removed Priority: P0 Blocker labels Nov 20, 2019
@asaj asaj changed the title Small fixes to various flaky tests Small improvements to various flaky tests Nov 21, 2019
@asaj asaj added the automerge Have PR merge automatically when checks pass label Nov 21, 2019
@celo-ci-bot-user celo-ci-bot-user merged commit 0ea0bfc into master Nov 21, 2019
@celo-ci-bot-user celo-ci-bot-user deleted the asaj/protocol-flake branch November 21, 2019 05:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
automerge Have PR merge automatically when checks pass
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants