Beat itest [3/3]: fix all itest flakes #9260

yyforyongyu · 2024-11-12T07:13:12Z

Fix all the itest flakes to make sure the blockbeat works as expected. The key results:

- All itest flakes are now documented and fixed.
- A large decrease in CI run time: for the btcd itest, for example, a run previously took 45m and now takes around 18m.

Check #9306 for more context.

In this final PR, we focus on breaking down the large tests into smaller ones, skipping some flaky tests on Windows, and making minor flake fixes.

TODOs:

- Create issues to document the flakes.

coderabbitai · 2024-11-12T07:13:18Z

Important

Review skipped

Auto reviews are limited to specific labels.

🏷️ Labels to auto review (1)

llm-review


github-actions · 2024-11-12T07:17:16Z

Pull reviewers stats

Stats of the last 30 days for lnd:

User	Total reviews	Time to review	Total comments
guggero 🥇	21 ▀▀▀	1d 2h 36m	26 ▀▀
yyforyongyu 🥈	12 ▀▀	2d 5h 52m ▀	38 ▀▀▀
ellemouton 🥉	6 ▀	16h 51m	19 ▀
bhandras	5 ▀	3h 28m	2
ProofOfKeags	4 ▀	5d 4h 23m ▀▀	14 ▀
dstadulis	4 ▀	2h 13m	7 ▀
ziggie1984	4 ▀	12h 3m	6
ffranr	3	2d 2h 46m ▀	1
Roasbeef	2	3d 12h 37m ▀	6
bitromortac	2	11h 1m	2
saubyk	2	9d 9h 7m ▀▀▀	7 ▀
ViktorTigerstrom	1	3d 23h 47m ▀	4
jharveyb	1	2d 7h 32m ▀	1

guggero · 2024-11-12T11:28:28Z

OMG, what is this sorcery? I don't think I've ever seen this on an lnd PR...

You are my hero, @yyforyongyu!

yyforyongyu · 2024-11-12T14:23:42Z

> OMG, what is this sorcery? I don't think I've ever seen this on an lnd PR...

It's a bit of cheating as the unit tests are skipped for me to quickly get the CI results🤓 Plus I know there are two more bugs that can cause itest to fail - one is sql-related and the other is graph, but yeah, at least the flakes are fixed and we should see this as a norm in 2025!

To make the CI indicative, we now start tracking the flaky tests found when running on Windows. As a starting point, rather than ignoring the Windows CI entirely, we acknowledge that there are cases where lnd can be buggy when running on Windows. We should fix the tests in the future; otherwise the Windows build should be deleted.

This commit removes the panic used in checking the shutdown log. Instead, the error is returned and asserted in `shutdownAllNodes` so it's easier to check which node failed in which test. We also catch all the errors returned from the `StopDaemon` call to properly assess the shutdown behavior.
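As a rough illustration of returning shutdown errors rather than panicking, here is a minimal self-contained sketch. The `node` type, its `stop` method, and this version of `shutdownAllNodes` are hypothetical stand-ins written for the example; they are not the actual lntest code.

```go
package main

import (
	"errors"
	"fmt"
)

// node is a hypothetical stand-in for a harness node.
type node struct {
	name          string
	cleanShutdown bool // true if the expected shutdown log line was found
}

// stop returns an error instead of panicking when the shutdown log is
// missing, so the caller decides how to fail the test.
func (n *node) stop() error {
	if !n.cleanShutdown {
		return fmt.Errorf("node %s: shutdown log not found", n.name)
	}
	return nil
}

// shutdownAllNodes stops every node and reports all failures, each tagged
// with the test name, so the combined error can be asserted in one place.
func shutdownAllNodes(testName string, nodes []*node) error {
	var errs []error
	for _, n := range nodes {
		if err := n.stop(); err != nil {
			errs = append(errs, fmt.Errorf("test %s: %w", testName, err))
		}
	}

	// errors.Join returns nil when there is nothing to report.
	return errors.Join(errs...)
}

func main() {
	nodes := []*node{
		{name: "alice", cleanShutdown: true},
		{name: "bob", cleanShutdown: false},
	}
	fmt.Println(shutdownAllNodes("example_test", nodes))
}
```

Collecting the errors instead of stopping at the first one means a single run shows every node that failed to shut down cleanly.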

bhandras

LGTM, very nice work @yyforyongyu 🥇

bhandras · 2024-12-12T17:31:52Z

itest/lnd_estimate_route_fee_test.go

@@ -357,12 +353,6 @@ func testEstimateRouteFee(ht *lntest.HarnessTest) {
 			break
 		}
 	}
-
-	mts.ht.CloseChannelAssertPending(mts.bob, channelPointBobPaula, false)


I'd argue that not starting from the same clean state can alter test outcomes in the extreme case, but as a plus it may end up finding more flakes. We still probably get more readable logs if we clean things up properly.

bhandras · 2024-12-12T17:32:56Z

itest/list_exclude_test.go

+// be excluded from the test suite atm.
+//
+// TODO(yy): fix these tests and remove them from this list.
+var excludedTestsWindows = []string{


At least we have a nice and tidy TODO list now :)
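For readers following along, an exclusion list like this one could be applied roughly as follows. This is only a sketch: the test names, the `excludedOnWindows` variable, and the `filterTests` helper are made up for illustration, and the PR may wire the list in differently.

```go
package main

import (
	"fmt"
	"runtime"
)

// excludedOnWindows holds hypothetical test names used only for this sketch.
var excludedOnWindows = []string{"flaky case a", "flaky case b"}

// filterTests returns the names in all that do not appear in excluded,
// preserving the original order.
func filterTests(all, excluded []string) []string {
	skip := make(map[string]struct{}, len(excluded))
	for _, name := range excluded {
		skip[name] = struct{}{}
	}

	kept := make([]string, 0, len(all))
	for _, name := range all {
		if _, ok := skip[name]; !ok {
			kept = append(kept, name)
		}
	}
	return kept
}

func main() {
	all := []string{"stable case", "flaky case a", "flaky case b"}

	// Only drop the excluded cases when running on Windows.
	if runtime.GOOS == "windows" {
		all = filterTests(all, excludedOnWindows)
	}
	fmt.Println(all)
}
```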

bhandras · 2024-12-12T17:37:24Z

lntest/harness.go

+	// Otherwise log a warning if it's mining more than 40 blocks.
+	desc := "!============================================!\n"
+
+	desc += fmt.Sprintf("Too many blocks (%v) mined in one test! Tips:\n",


As an alternative to tranches we could parallelize per itest, but it does require a lot of runners and can be costly. The end result is super fast test runs (see loop server itests).

This was referenced Nov 18, 2024

Beat [4/4]: implement Consumer in chainWatcher #9277

Open

Beat [3/4]: prepare resolvers to handle the blockbeat #9276

Merged

Beat [2/4]: implement blockbeat #8894

Merged

yyforyongyu mentioned this pull request Nov 25, 2024

htlcswitch+routing: handle nil pointer dereference properly #9303

Open

yyforyongyu added 28 commits December 12, 2024 20:15

itest: break all multihop test cases

6f26186

itest: break down scid alias channel update tests

428829a

itest: break down open channel fee policy

c8e6d74

itest: break down payment failed tests

41ae04c

itest: break down channel backup restore tests

b69e214

itest: break down wallet import account tests

c82610c

itest: break down basic funding flow tests

6bef51d

itest: break down single hop send to route

f42b108

Also removed the duplicate test cases.

itest: break down taproot tests

80895be

itest: break down channel fundmax tests

c0ffd29

itest: breakdown testSendDirectPayment

93765f2

Also fixes a wrong usage of `ht.Subtest`.

itest: further reduce block mined in tests

b3f99c3

lntest: increase node start timeout and payment benchmark timeout

fc7f282

lntest: make sure policies are populated in AssertChannelInGraph

5aec1ff

workflows: use btcd for macOS

9aea852

This reduces the run time from 40m per run to roughly 20m per run.

itest+lntest: add new method FundNumCoins

9a819b8

Most of the time we only need to fund the node with a given number of UTXOs, regardless of the amount, so we add a more efficient funding method that mines a single block at the end.
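The saving can be pictured with a toy model. The sketch below does not use the real harness: `fakeChain` and both funding helpers are hypothetical, and it assumes the older approach confirms each funding output with its own block.

```go
package main

import "fmt"

// fakeChain is a hypothetical stand-in for the miner used by the harness.
type fakeChain struct {
	mempool   int // funding outputs waiting for confirmation
	confirmed int // funding outputs included in a block
	blocks    int // number of blocks mined so far
}

// sendCoin broadcasts one funding output.
func (c *fakeChain) sendCoin() { c.mempool++ }

// mineBlock confirms everything currently in the mempool.
func (c *fakeChain) mineBlock() {
	c.confirmed += c.mempool
	c.mempool = 0
	c.blocks++
}

// fundOneByOne confirms every output with its own block (assumed old flow).
func fundOneByOne(c *fakeChain, n int) {
	for i := 0; i < n; i++ {
		c.sendCoin()
		c.mineBlock()
	}
}

// fundNumCoins sends all outputs first and mines a single block at the end.
func fundNumCoins(c *fakeChain, n int) {
	for i := 0; i < n; i++ {
		c.sendCoin()
	}
	c.mineBlock()
}

func main() {
	slow, fast := &fakeChain{}, &fakeChain{}
	fundOneByOne(slow, 5)
	fundNumCoins(fast, 5)
	fmt.Printf("one by one: %d UTXOs, %d blocks\n", slow.confirmed, slow.blocks)
	fmt.Printf("batched:    %d UTXOs, %d blocks\n", fast.confirmed, fast.blocks)
}
```

Both paths end with the same number of confirmed UTXOs; only the number of blocks mined differs, which is where the time goes in an itest.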

lntest: limit the num of blocks mined in each test

ef54c92

docs: update release notes

72b0985

itest: add a prefix before appending a subtest case

b0a1f90

itest: even out num of tests per tranche

0ac77d5

The previous splitting logic simply put all of the remainder in the last tranche, which could make the last tranche run significantly more test cases. We now change it so the remainder is evened out across tranches.
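One way to even out the remainder is sketched below. The function name `trancheBounds` and the choice to give the extra cases to the first tranches are assumptions made for this example, not necessarily what the commit does.

```go
package main

import "fmt"

// trancheBounds returns the half-open index range [start, end) of the test
// cases assigned to the given zero-based tranche. The first
// numCases%numTranches tranches each take one extra case, so no tranche
// runs more than one case above the others.
func trancheBounds(numCases, numTranches, tranche int) (int, int) {
	base := numCases / numTranches
	remainder := numCases % numTranches

	// Number of earlier tranches that already received an extra case.
	extraBefore := tranche
	if extraBefore > remainder {
		extraBefore = remainder
	}

	start := tranche*base + extraBefore
	size := base
	if tranche < remainder {
		size++
	}
	return start, start + size
}

func main() {
	// With 10 cases and 4 tranches, putting the whole remainder in the
	// last tranche would give sizes 2, 2, 2, 4.
	for t := 0; t < 4; t++ {
		start, end := trancheBounds(10, 4, t)
		fmt.Printf("tranche %d: cases [%d, %d), size %d\n", t, start, end, end-start)
	}
}
```

For 10 cases over 4 tranches this yields sizes 3, 3, 2, 2, so the slowest tranche runs one case more than the fastest rather than two.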

lntest: increase port timeout

90e84c8

lntest: add timeouts for windows

e39ba4d

For Windows the tests run much slower so we create customized timeouts for them.
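A minimal sketch of per-platform timeouts follows. The durations and the `timeoutFor` helper are invented for illustration; the real values live in lntest, and the selection there may happen at build time rather than through a runtime check like this one.

```go
package main

import (
	"fmt"
	"runtime"
	"time"
)

// Hypothetical values chosen only for this example.
const (
	defaultTimeout = 30 * time.Second
	windowsTimeout = 90 * time.Second
)

// timeoutFor returns the timeout to use for the given operating system
// name, as reported by runtime.GOOS.
func timeoutFor(goos string) time.Duration {
	if goos == "windows" {
		return windowsTimeout
	}
	return defaultTimeout
}

func main() {
	fmt.Printf("timeout on %s: %v\n", runtime.GOOS, timeoutFor(runtime.GOOS))
}
```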

workflows: increase num of tranches to 16

2088e36

Keep the SQL, etcd, bitcoin rpcpolling builds and non-ubuntu builds at 8 since they are less stable.

lntest: make sure chain backend is synced to miner

e31c412

We sometimes see a `timeout waiting for UTXOs` error from bitcoind-related itests because the chain backend is not synced to the miner. We now assert it's synced before continuing.
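The idea of asserting sync before moving on can be sketched as a simple height comparison with a deadline. `heightSource` and `waitSynced` are hypothetical names for this example, and comparing best heights is an assumption about what "synced" means here.

```go
package main

import (
	"fmt"
	"time"
)

// heightSource is a hypothetical stand-in for anything that can report its
// best block height, such as the chain backend or the miner.
type heightSource func() int32

// waitSynced polls until the backend reports the same best height as the
// miner, or returns an error once the timeout has passed.
func waitSynced(backend, miner heightSource, timeout,
	interval time.Duration) error {

	deadline := time.Now().Add(timeout)
	for {
		b, m := backend(), miner()
		if b == m {
			return nil
		}
		if time.Now().After(deadline) {
			return fmt.Errorf("backend at height %d, miner at %d", b, m)
		}
		time.Sleep(interval)
	}
}

func main() {
	// Simulate a backend that catches up by one block per poll.
	height := int32(100)
	backend := func() int32 {
		if height < 103 {
			height++
		}
		return height
	}
	miner := func() int32 { return 103 }

	err := waitSynced(backend, miner, time.Second, 10*time.Millisecond)
	fmt.Println("synced, err =", err)
}
```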

itest: document and fix wallet UTXO flake

e1407ff

itest: fix flake in testCoopCloseWithExternalDeliveryImpl

cc89b32

The response from `ClosedChannels` may not be up-to-date, so we wrap it inside a wait closure.
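The wait-closure pattern looks roughly like the sketch below. `waitNoError` and `checkClosedCount` are self-contained stand-ins written for this example (lnd's own wait helpers are not imported here), and the fake listing function only imitates a response that lags behind.

```go
package main

import (
	"fmt"
	"time"
)

// waitNoError retries check until it returns nil or the timeout passes, in
// which case the last error is returned wrapped.
func waitNoError(check func() error, timeout,
	interval time.Duration) error {

	deadline := time.Now().Add(timeout)
	for {
		err := check()
		if err == nil {
			return nil
		}
		if time.Now().After(deadline) {
			return fmt.Errorf("timed out: %w", err)
		}
		time.Sleep(interval)
	}
}

// checkClosedCount builds a check that passes once list reports exactly
// want closed channels. list is a hypothetical stand-in for the RPC call.
func checkClosedCount(list func() []string, want int) func() error {
	return func() error {
		if got := len(list()); got != want {
			return fmt.Errorf("want %d closed channels, got %d", want, got)
		}
		return nil
	}
}

func main() {
	// The fake response is stale for the first two calls.
	calls := 0
	list := func() []string {
		calls++
		if calls < 3 {
			return nil
		}
		return []string{"chanpoint:0"}
	}

	err := waitNoError(
		checkClosedCount(list, 1), time.Second, 10*time.Millisecond,
	)
	fmt.Println("err:", err, "calls:", calls)
}
```

A single read would fail on the stale response; retrying inside the closure lets the assertion pass once the data catches up.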

bhandras approved these changes Dec 12, 2024
