-
Notifications
You must be signed in to change notification settings - Fork 17.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
cmd/dist: huge increase in runtime for 'dist test' on slow builders after CL 432535 #57734
Comments
That is https://go.dev/cl/452679, which I plan to merge when the tree reopens for Go 1.21.
I think that is essentially https://go.dev/cl/452775? (But without a parameter — skipping the staleness check always.) |
Prior to Go 1.20, all packages in |
Great, thank you! Those changes should both make a big improvement on filesystem-limited builders. |
Change https://go.dev/cl/452775 mentions this issue: |
Change https://go.dev/cl/452679 mentions this issue: |
The user is likely to run other commands that need these libraries immediately after they are built. For #57734. Updates #56889. Change-Id: I2a1a234e6031d85f017ee692ea1ace8c6e0e7355 Reviewed-on: https://go-review.googlesource.com/c/go/+/452679 Run-TryBot: Bryan Mills <bcmills@google.com> Auto-Submit: Bryan Mills <bcmills@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Russ Cox <rsc@golang.org>
Since packages in "std" no longer have install targets, checking them for staleness is somewhat meaningless: if they are not cached they will be rebuilt anyway, and without installed archives against which we can compare them the staleness check will not detect builder skew. It would still be meaningful to check "cmd" for staleness, but (especially on sharded VM-based builders) that is a fairly expensive operation relative to its benefit. If we are really interested in detecting builder skew and/or build reproducibility, we could instead add a "misc" test (similar to "misc/reboot", or perhaps even a part of that test) that verifies that bootstrapped binaries are reproducible. For #57734. Updates #47257. Updates #56896. Change-Id: I8683ee81aefe8fb59cce9484453df9729bdc587c Reviewed-on: https://go-review.googlesource.com/c/go/+/452775 Run-TryBot: Bryan Mills <bcmills@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Auto-Submit: Bryan Mills <bcmills@google.com> Reviewed-by: Austin Clements <austin@google.com>
I believe this should be fixed as of CL 452775, with perhaps an additional boost from CL 464736. Looking at the dashboard, it also seems to me that the @millerresearch, I'm going to close this out but feel free to reopen if we need to consider other mitigations. |
Thanks, test time on the biggest plan9-arm builder has dropped from about 150 minutes to about 80 minutes. Much better! |
Since packages in "std" no longer have install targets, checking them for staleness is somewhat meaningless: if they are not cached they will be rebuilt anyway, and without installed archives against which we can compare them the staleness check will not detect builder skew. It would still be meaningful to check "cmd" for staleness, but (especially on sharded VM-based builders) that is a fairly expensive operation relative to its benefit. If we are really interested in detecting builder skew and/or build reproducibility, we could instead add a "misc" test (similar to "misc/reboot", or perhaps even a part of that test) that verifies that bootstrapped binaries are reproducible. For golang#57734. Updates golang#47257. Updates golang#56896. Change-Id: I8683ee81aefe8fb59cce9484453df9729bdc587c Reviewed-on: https://go-review.googlesource.com/c/go/+/452775 Run-TryBot: Bryan Mills <bcmills@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Auto-Submit: Bryan Mills <bcmills@google.com> Reviewed-by: Austin Clements <austin@google.com>
CL 432535 caused a 75% increase in total runtime for go tests on plan9-arm builders. I believe the effect will have been similar on other builders with slow filesystems, for example openbsd-arm. Looking at local logs for a raspberry pi in the plan9-arm cluster around the time the CL was merged, I see this [the last column is total runtime in minutes]:
The increased time comes from this new call in
cmd/dist/test.go
:(Parenthetically, could someone explain what "packages without install targets" refers to?)
Note that this extra build is performed at the start of every
dist test
invocation in a test run. Only the first one after themake.bash
/make.rc
might actually build something; subsequent builds will just amount to a staleness test. On a fast linux machine with aggressive caching of directories and inodes, the staleness test may seem like a "no-op", but at last count it involves 9999open
and 6326stat
calls on 4580 files and directories. On a plan9-arm builder this adds nearly a minute of overhead to eachdist test
:Because the tests in a run are partitioned (badly, and in my opinion unnecessarily - see #49343), each test run of 275 tests currently involves 131 invocations of
dist test
- so, potentially more than two hours spent on redundantdist test
overheadSolutions? I would suggest one or more of:
dist test
dist test
to indicate that a build has just been done so no staleness test is needed (maybe - radical idea - just respect--no-rebuild
?)The text was updated successfully, but these errors were encountered: