"Upload missing inputs" performance regression in Bazel 5.0 and 5.3 #16054
Is this fixed by f440f8e?
@brentleyjones it's not -- that regression took these tests from ~2s to ~45s, but it's still several times higher than in 4.2. (I ran a few of the same tests on release-5.3.0 branch at 9d57003, with about the same results.)
Made a small improvement via #16118, but the regression still can't be fixed, since we need a lock for every input to deduplicate. I was using your test repo with For the baseline ( For the
Essentially, we pay around I made a prototype to replace What's your build wall time difference between
The execution time increase I would expect is due to the recursive visitor pattern when building the Merkle tree. Now, that pattern is already part of the old
Description of the bug:
cc @coeuvre -- this bug is very similar to #15872, but the regression actually occurred in the 5.0.0 release. I did a git bisect between 41feb61 and 2ac6581, and it appears db15e47 is the source of this slower behavior.
In the actual repo affected by this, we have ~600k inputs to some actions, which are taking nearly 7 seconds on this step with a recent release-5.3.0 commit. (We thought the fix to #15872 may have helped here, but it is consistent with 5.0's performance, and still slower than 4.2.)
This overhead was much smaller (<1s) with Bazel 4.2.2, so this regression, while not of the same magnitude as that of #15872, is still significant for us.
What's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.
Using the same test repo, https://github.com/clint-stripe/action_many_inputs/, but with 300,000 inputs to the action (edit the number in WORKSPACE to change this):
I have a script that runs the same set of commands, which reproduces this very clearly: (In the action_many_inputs repo, with bazel checked out and built in a sibling directory)
Run this once, to ensure that the inputs have all been uploaded -- we want to measure the time when no action inputs change.
Then, run this a few times (it's ok if the test doesn't actually run; I usually hit ctrl-c after ~10 seconds just to ensure the "upload missing inputs" step is complete), and check the timing in the profile.
If it's helpful, you can look at just the event we care about:
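The original command is not preserved here, so the following is only a rough sketch of one way to pull that event out of a profile. It assumes the profile is Bazel's JSON trace output (Chrome trace event format, optionally gzipped) and that the event name contains the phrase "upload missing inputs" as in this issue's title. The function names and the filename in the usage note are hypothetical.

```python
import gzip
import json


def load_profile(path):
    """Load a JSON trace profile, transparently handling .gz files."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt", encoding="utf-8") as f:
        return json.load(f)


def matching_events(profile, needle="upload missing inputs"):
    """Return (name, seconds) for events with a duration whose name contains needle.

    Assumption: the event name matches the phrase used in this issue.
    """
    # The trace format allows either {"traceEvents": [...]} or a bare list.
    events = profile.get("traceEvents", []) if isinstance(profile, dict) else profile
    found = []
    for ev in events:
        name = ev.get("name", "")
        if needle in name.lower() and "dur" in ev:
            # "dur" is expressed in microseconds in the trace event format.
            found.append((name, ev["dur"] / 1_000_000))
    return found
```

For example, `matching_events(load_profile("profile.gz"))` would list each matching event with its duration in seconds, assuming `profile.gz` is the file passed to `--profile` for that invocation.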
Which operating system are you running Bazel on?
linux
What is the output of bazel info release?
No response
If bazel info release returns "development version" or "(@non-git)", tell us how you built Bazel.
What's the output of git remote get-url origin; git rev-parse master; git rev-parse HEAD?
No response
Have you found anything relevant by searching the web?
No response
Any other information, logs, or outputs that you want to share?
I still have the full profiles from most of these bazel invocations, happy to share if there's anything else there. (Unfortunately these changes predate the more granular profiling that distinguishes between 'collect digests' and 'find missing digests'.)