Beat itest [3/3]: fix all itest flakes #9260

Open
wants to merge 32 commits into
base: yy-beat-itest-flakes


yyforyongyu
Member

@yyforyongyu yyforyongyu commented Nov 12, 2024

Fix all the itest flakes to make sure the blockbeat works as expected. The key results:

  • All itest flakes are now documented and fixed.

  • A large decrease in CI run time: for example, the btcd itest previously took 45m and now takes around 18m.

Check #9306 for more context.

In this final PR, we focus on breaking down the large tests into smaller ones, skipping some flaky tests on Windows, and fixing minor flakes.

TODOs:

  • create issues to document the flakes.

Contributor

coderabbitai bot commented Nov 12, 2024

Important

Review skipped

Auto reviews are limited to specific labels.

🏷️ Labels to auto review (1)
  • llm-review

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.



Pull reviewers stats

Stats of the last 30 days for lnd:

| User | Total reviews | Time to review | Total comments |
|---|---|---|---|
| guggero 🥇 | 21 | 1d 2h 36m | 26 |
| yyforyongyu 🥈 | 12 | 2d 5h 52m | 38 |
| ellemouton 🥉 | 6 | 16h 51m | 19 |
| bhandras | 5 | 3h 28m | 2 |
| ProofOfKeags | 4 | 5d 4h 23m | 14 |
| dstadulis | 4 | 2h 13m | 7 |
| ziggie1984 | 4 | 12h 3m | 6 |
| ffranr | 3 | 2d 2h 46m | 1 |
| Roasbeef | 2 | 3d 12h 37m | 6 |
| bitromortac | 2 | 11h 1m | 2 |
| saubyk | 2 | 9d 9h 7m | 7 |
| ViktorTigerstrom | 1 | 3d 23h 47m | 4 |
| jharveyb | 1 | 2d 7h 32m | 1 |

@guggero
Collaborator

guggero commented Nov 12, 2024

Screenshot From 2024-11-12 12-26-56

OMG, what is this sorcery? I don't think I've ever seen this on an lnd PR...

You are my hero, @yyforyongyu!

@yyforyongyu
Member Author

> OMG, what is this sorcery? I don't think I've ever seen this on an lnd PR...

It's a bit of cheating, as the unit tests are skipped so I can quickly get the CI results 🤓 Plus, I know there are two more bugs that can cause the itest to fail: one is SQL-related and the other is graph-related. But yeah, at least the flakes are fixed and we should see this as the norm in 2025!

@yyforyongyu yyforyongyu force-pushed the yy-beat-itest-optimize branch 2 times, most recently from be8726e to 5a07ffa Compare November 20, 2024 11:32
@yyforyongyu yyforyongyu force-pushed the yy-beat-itest-optimize branch 2 times, most recently from c2aeb68 to 67f9404 Compare November 21, 2024 08:21
@yyforyongyu yyforyongyu force-pushed the yy-beat-itest-optimize branch from 67f9404 to ee869a2 Compare November 21, 2024 13:27
@yyforyongyu yyforyongyu force-pushed the yy-beat-itest-optimize branch from ee869a2 to 4bdf873 Compare November 21, 2024 16:59
@yyforyongyu yyforyongyu force-pushed the yy-beat-itest-optimize branch from 4bdf873 to db80ec6 Compare November 21, 2024 17:00
@yyforyongyu yyforyongyu force-pushed the yy-beat-itest-optimize branch 2 times, most recently from e24d6b9 to 7fba4ab Compare November 25, 2024 06:29
@yyforyongyu yyforyongyu force-pushed the yy-beat-itest-optimize branch 2 times, most recently from 446296c to 9f93fdd Compare November 25, 2024 09:35
@yyforyongyu yyforyongyu force-pushed the yy-beat-itest-optimize branch 2 times, most recently from 7e1c2d4 to b70b0b0 Compare November 25, 2024 14:06
Also removed the duplicate test cases.
Also fixes a wrong usage of `ht.Subtest`.
To make the CI indicative, we now start tracking the flaky tests
found when running on Windows. As a starting point, rather than ignoring
the Windows CI entirely, we now identify the cases where lnd can
be buggy when running on Windows.

We should fix the tests in the future, otherwise the Windows build
should be deleted.
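
As a rough illustration of tracking excluded tests per platform, the sketch below filters a test list against an exclusion list only when running on Windows. The `testCase` type, the `filterTests` helper, and the test names are all hypothetical and are not taken from lnd.

```go
package main

import (
	"fmt"
	"runtime"
)

// testCase is a hypothetical stand-in for an integration test entry.
type testCase struct {
	Name string
}

// excludedOnWindows is an illustrative exclusion list; the names are made up.
var excludedOnWindows = []string{"flaky test a", "flaky test b"}

// filterTests returns the cases whose names do not appear in excluded,
// preserving the original order.
func filterTests(all []testCase, excluded []string) []testCase {
	skip := make(map[string]struct{}, len(excluded))
	for _, name := range excluded {
		skip[name] = struct{}{}
	}

	kept := make([]testCase, 0, len(all))
	for _, tc := range all {
		if _, ok := skip[tc.Name]; ok {
			continue
		}
		kept = append(kept, tc)
	}

	return kept
}

func main() {
	all := []testCase{
		{Name: "basic funding"},
		{Name: "flaky test a"},
		{Name: "close channel"},
	}

	// Only apply the exclusion list when running on Windows.
	if runtime.GOOS == "windows" {
		all = filterTests(all, excludedOnWindows)
	}

	for _, tc := range all {
		fmt.Println(tc.Name)
	}
}
```

Keeping the exclusions in one named list means the skipped tests stay visible and can be removed from the list one by one as they are fixed.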
To increase the speed from 40m per run to roughly 20m per run.
Most of the time we only need to fund the node with a given number of
UTXOs without caring about the amounts, so we add a more efficient
funding method that mines a single block at the end.
The previous splitting logic simply put the entire remainder in the last
tranche, which could make the last tranche run significantly more test
cases. We now change it so the remainder is evened out across tranches.
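
One way to even out the remainder is to give each of the first `numCases % numTranches` tranches one extra case. The sketch below shows that arithmetic; `trancheBounds` is a hypothetical helper written for illustration and is not the function used in this PR.

```go
package main

import "fmt"

// trancheBounds returns the half-open index range [start, end) of the test
// cases assigned to tranche idx when numCases are split across numTranches.
// The first numCases%numTranches tranches each take one extra case, so no
// two tranches differ in size by more than one.
func trancheBounds(numCases, numTranches, idx int) (int, int) {
	base := numCases / numTranches
	rem := numCases % numTranches

	// Each earlier tranche that received an extra case shifts our start.
	extraBefore := idx
	if extraBefore > rem {
		extraBefore = rem
	}
	start := idx*base + extraBefore

	size := base
	if idx < rem {
		size++
	}

	return start, start + size
}

func main() {
	// 10 cases over 4 tranches gives sizes 3, 3, 2, 2. Putting the whole
	// remainder in the last tranche would instead give 2, 2, 2, 4.
	for i := 0; i < 4; i++ {
		start, end := trancheBounds(10, 4, i)
		fmt.Printf("tranche %d: [%d, %d)\n", i, start, end)
	}
}
```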
On Windows the tests run much slower, so we create customized timeouts
for them.
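
A minimal sketch of per-platform timeouts follows, assuming a runtime check on the OS name. The function name `defaultTimeoutFor`, the 30-second base, and the 3x multiplier are illustrative assumptions, and the PR may well select timeouts differently (for example, at build time).

```go
package main

import (
	"fmt"
	"runtime"
	"time"
)

// defaultTimeoutFor returns a wait timeout for the given GOOS value. The
// base value and the Windows multiplier are made-up numbers for illustration.
func defaultTimeoutFor(goos string) time.Duration {
	const base = 30 * time.Second

	if goos == "windows" {
		// Tests run slower on Windows, so allow more time.
		return 3 * base
	}

	return base
}

func main() {
	fmt.Println(defaultTimeoutFor(runtime.GOOS))
}
```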
This commit removes the panic used in checking the shutdown log.
Instead, the error is returned and asserted in `shutdownAllNodes` so
it's easier to check which node failed in which test. We also catch all
the errors returned from the `StopDaemon` call to properly assess the
shutdown behavior.
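
The general pattern of returning shutdown errors instead of panicking can be sketched as below. The `node` type, its `stop` method, and `shutdownAll` are hypothetical stand-ins rather than the harness code from this PR, and `errors.Join` assumes Go 1.20 or later.

```go
package main

import (
	"errors"
	"fmt"
)

// errShutdownCheck is an illustrative error for a failed shutdown log check.
var errShutdownCheck = errors.New("shutdown log check failed")

// node is a hypothetical stand-in for a test harness node.
type node struct {
	name    string
	stopErr error
}

// stop simulates shutting the node down and returns any error encountered
// rather than panicking.
func (n *node) stop() error {
	return n.stopErr
}

// shutdownAll stops every node and returns a joined error that names each
// node that failed, or nil if all nodes stopped cleanly.
func shutdownAll(nodes []*node) error {
	var errs []error
	for _, n := range nodes {
		if err := n.stop(); err != nil {
			errs = append(errs, fmt.Errorf("node %s: %w", n.name, err))
		}
	}

	return errors.Join(errs...)
}

func main() {
	nodes := []*node{
		{name: "alice"},
		{name: "bob", stopErr: errShutdownCheck},
	}

	// The caller can now assert on the error and see which node failed.
	if err := shutdownAll(nodes); err != nil {
		fmt.Println("shutdown failed:", err)
	}
}
```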
Keep the SQL, etcd, bitcoin rpcpolling builds and non-ubuntu builds at 8
since they are less stable.
We sometimes see the `timeout waiting for UTXOs` error from bitcoind-related
itests because the chain backend is not synced to the miner. We now assert
it's synced before continuing.
The response from `ClosedChannels` may not be up-to-date, so we wrap it
inside a wait closure.
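
For readers unfamiliar with the idea, a wait closure retries a check until it passes or a timeout elapses. The sketch below is a generic, self-contained version; `waitNoError` and `errNotReady` are hypothetical names and this is not the helper used by the lnd test harness.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errNotReady is an illustrative error for a response that is still stale.
var errNotReady = errors.New("closed channel not found yet")

// waitNoError calls check repeatedly, sleeping interval between attempts,
// until check returns nil or timeout has elapsed. On timeout it returns the
// last error from check, wrapped with the timeout that was exceeded.
func waitNoError(check func() error, timeout,
	interval time.Duration) error {

	deadline := time.Now().Add(timeout)
	for {
		err := check()
		if err == nil {
			return nil
		}

		if time.Now().After(deadline) {
			return fmt.Errorf("timed out after %v: %w", timeout, err)
		}

		time.Sleep(interval)
	}
}

func main() {
	// Simulate a response that only becomes up to date on the third query.
	calls := 0
	err := waitNoError(func() error {
		calls++
		if calls < 3 {
			return errNotReady
		}

		return nil
	}, time.Second, 10*time.Millisecond)

	fmt.Println("err:", err, "calls:", calls)
}
```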
@yyforyongyu yyforyongyu force-pushed the yy-beat-itest-optimize branch from df5d749 to cc89b32 Compare December 12, 2024 12:38
Collaborator

@bhandras bhandras left a comment

LGTM, very nice work @yyforyongyu 🥇

@@ -357,12 +353,6 @@ func testEstimateRouteFee(ht *lntest.HarnessTest) {
break
}
}

mts.ht.CloseChannelAssertPending(mts.bob, channelPointBobPaula, false)
Collaborator

I'd argue that not starting from the same clean state can alter test outcomes in the extreme case, but as a plus it may end up finding more flakes. We still probably get more readable logs if we clean things up properly.

// be excluded from the test suite atm.
//
// TODO(yy): fix these tests and remove them from this list.
var excludedTestsWindows = []string{
Collaborator

At least we have a nice and tidy TODO list now :)

// Otherwise log a warning if it's mining more than 40 blocks.
desc := "!============================================!\n"

desc += fmt.Sprintf("Too many blocks (%v) mined in one test! Tips:\n",
Collaborator

As an alternative to tranches we could parallelize per itest, but it does require a lot of runners and can be costly. The end result is super fast test runs (see loop server itests).

5 participants