Tracking peak total storage use #129808
Comments
While this issue is about improving this for rustc itself, is there any issue about reducing the size of "normal" Rust programs built with cargo? That is also a problem. For example:
Have you disabled incr comp already? CI generally benefits less from incr comp, but it does take a lot of space. (To be precise, restoring the incr comp cache from the network can take longer than the amount of time incr comp saves.)
I did that, yes, and disabled debug info too. But I target 32- and 64-bit, across x86, ARM, RISC-V, and PPC, and all three major operating systems (obviously not all combinations are possible). That quickly adds up due to combinatorial explosion: the total cache size I would currently need for that project is around 18 GB. As such, it would be nice to be able to reduce the size of the things rustc produces. In my experience C++ doesn't expand (relative to LOC) quite so voluminously. Of course, they are different languages (and Rust has more advanced abstractions), but perhaps there are things that could be done to the intermediate file formats.
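For concreteness, the settings discussed above look roughly like this in a Cargo manifest; the profile choice and values below are an illustrative sketch, not my exact configuration:

```toml
# Illustrative CI-oriented profile tweaks for shrinking build caches;
# release builds already omit debug info by default.
[profile.dev]
incremental = false   # CI rarely reuses the incremental cache
debug = false         # omit debug info from dev intermediates
```

The incremental part can also be controlled from the CI environment alone via `CARGO_INCREMENTAL=0`, without touching the manifest.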
This is not actually about reducing the size per se.
(I waffled back and forth on whether libs cares, and then I remembered all the libs PRs about compiled binary size, so...) Anyways, nominating this for, uh, Basically Everyone, I guess. I asked some more specific questions, but also feel free to answer the question in a broad way, basically on two parts:
@rustbot label: +I-compiler-nominated +I-libs-nominated +I-release-nominated
Would T-infra (the only team not nominated 😅) have more historical context? Maybe I didn't search deep enough, but I couldn't find a Zulip thread with them discussing this issue.
T-libs is not really concerned with the size of intermediate build products; that's mostly a T-compiler/T-cargo issue. We do track code size for the final binary, and we keep an eye out for issues related to the size of rlibs shipped in rustup or the size of docs, but that's it.
Probably, but there's no I-infra-nominated, so in the absence of confirmation that they are attending to it, it seemed easiest to volley the question at a relevant subteam. :^) (GitHub notifications seemed more disruptive, since they would continue to pester people in a stochastic way even after someone took care of responding.)
This is release-nominated, but I don't know if release looks at nominations (given there's been no response, I suspect not). @rust-lang/release, in case you don't.
Related August 29th CI event
On August 29th, around 3:59 Pacific Daylight Time, our CI started to fail due to not having enough storage available. It merged a few PRs, but then about 10 hours later it merged the final PR that it would merge that day: 0d63418
It continued to fail for hours. The spotty CI passes were probably due to GitHub initiating a rollout to their fleet that took 12 hours to reach full global saturation; with that rollout, GitHub reduced the storage actually offered to runners to levels that closely reflect their service agreement. See actions/runner-images#10511 for more on that.
Eventually, I landed #129797 which seemed to get CI going again.
Do we take up too much space?
Our storage usage has grown over time, arguably to concerning levels. Yes, a lot of it compresses well for transfer, but I'm talking about peak storage occupancy here, and tarballs are not a format conducive to accessing individual files, so in practice the relevant data occupies hosts in its full, uncompressed glory nonetheless. We also generate quite a lot of build intermediates. Big ones. Some of this is unavoidable, but we should consider investigating ways to reduce the storage occupancy of the toolchain and its build intermediates.
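As a rough illustration of the tarball point: reaching any single member of a tar archive means streaming past every member before it. A minimal sketch, assuming the `tar` crate and a hypothetical archive path, that totals the uncompressed payload without extracting anything:

```rust
// Sketch: total the uncompressed payload of a tarball without extracting
// it. Every entry must be streamed past to reach a later member, which
// is why per-file access into tarballs is impractical.
use std::fs::File;
use tar::Archive; // tar = "0.4"

fn main() -> std::io::Result<()> {
    let file = File::open("dist/rust-docs.tar")?; // hypothetical path
    let mut archive = Archive::new(file);
    let mut total: u64 = 0;
    for entry in archive.entries()? {
        // Even reading only the headers forces a pass over the stream.
        total += entry?.header().size()?;
    }
    println!("uncompressed payload: {total} bytes");
    Ok(())
}
```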
More to the point, we are having trouble keeping our storage usage under the amount available to CI, even if other events aggravated this particular incident. Obviously, clearing CI storage space works as a dirty hack to get things running again, but changes that benefit the entire ecosystem are more desirable. Note, however, that a solution which reduces storage but significantly increases the number of filesystem accesses, especially during compiler or tool builds, is likely to make CI problems worse due to this fun little issue:
I'm opening this issue as a question, effectively: we track how much time the compiler costs, but what about space? Where are we tracking things like total doc size (possibly split between libstd doc size and so on)? Are we aware of how much space is used by incremental compilation or other intermediates, and how that changes between versions? How about things like how many crater subjobs run out of space in each beta run? Where would someone find this information?
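As a strawman for what such tracking could start from, here is a minimal sketch that records the on-disk footprint of a few artifact directories per run; the directory names are placeholders, not our actual build layout:

```rust
// Sketch: sum the on-disk size of each artifact directory so a CI job
// could log (and later graph) per-component storage across versions.
use std::fs;
use std::io;
use std::path::Path;

fn dir_size(path: &Path) -> io::Result<u64> {
    let mut total = 0;
    for entry in fs::read_dir(path)? {
        let entry = entry?;
        let meta = entry.metadata()?; // does not follow symlinks
        if meta.is_dir() {
            total += dir_size(&entry.path())?;
        } else {
            total += meta.len();
        }
    }
    Ok(total)
}

fn main() -> io::Result<()> {
    // Placeholder directory names; substitute the real build tree.
    for dir in ["build", "target", "doc"] {
        let path = Path::new(dir);
        if path.is_dir() {
            println!("{:>14} bytes  {dir}", dir_size(path)?);
        }
    }
    Ok(())
}
```

Logging something like this per CI run would at least give storage the same trend lines we already have for compile time.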