-
Notifications
You must be signed in to change notification settings - Fork 12.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Exponential growth in compile time, memory usage for obligation evaluation #74456
Comments
@rustbot ping icebreakers-cleanup-crew Help creating a smaller MCVE would be appreciated! |
Hey Cleanup Crew ICE-breakers! This bug has been identified as a good cc @AminArria @camelid @chrissimpkins @contrun @DutchGhost @elshize @ethanboxx @h-michael @HallerPatrick @hdhoang @hellow554 @imtsuki @kanru @KarlK90 @LeSeulArtichaut @MAdrianMattocks @matheus-consoli @mental32 @nmccarty @Noah-Kennedy @pard68 @PeytonT @pierreN @Redblueflame @RobbieClarken @RobertoSnap @robjtede @SarthakSingh31 @senden9 @shekohex @sinato @spastorino @turboladen @woshilapin @yerke |
I haven't looked in the protocol crate yet, but the numbers I've seen from println-based debugging look really big
One of the (much smaller!) predicates causing this looked like this
|
(copy-paste from Zulip) at this point I think what's happening is that normalization generates a bunch of extra obligations. i'm getting output for example "with 885017 obligations" from here: https://doc.rust-lang.org/nightly/nightly-rustc/src/rustc_trait_selection/traits/project.rs.html#270 presumably, after that happens deduplication occurs at some point. i've been messing around with RUSTC_LOG and have noticed that the actual process of generating those obligations is pretty quick but after each execution of normalize_with_depth_to the compiler takes longer and longer to process the next iteration, presumably due to time spent deduplicating (as I earlier ascertained that doing so is a substantial portion of the CPU time spent, in the other thread in #t-compiler/help. it does seem like the amount of time spent deduplicating is nonlinear over the number of obligations generated, which doesn't make sense to me given that it's just inserting them all into a hashset, but I'm not sure. this output from projection is consistent with Denis's findings above i.e. that most obligations are redundant what needs to happen I suppose is some sort of deduplication of the accumulated obligations during normalization during that process as opposed to after the fact. I don't know what the best path is to do that: I don't know whether storing them all in an FxHashSet in the first place is feasible (maybe some duplicates are actually necessary? i seem to remember seeing some comments about ordering guarantees, how does that impact potential deduplication strategies? etc.) and I'm not sure what the compiler's general design philosophy requires for this to be implemented elegantly and soundly so I guess at this point I'm in need of
|
After further review and as noted on Zulip this appears to just be #20304. |
The source provided here, also used for reproduction of the previously filed #73593, takes an unexpectedly long time to compile and consumes substantial memory, growing exponentially as seemingly trivial complexity is added to the source. For example, uncommenting this line results in rustc crashing due to OOM killer on my system.
I apologize for the lack of MCVE but my attempts to investigate this have been stymied by my lack of understanding of the relevant portions of the compiler. This zulip thread contains the discussion of investigation so far, but the concise overview is that through use of
massif
andperf
I've narrowed the performance issues (both compile time and compile memory) down to theobligations
Vec used here in rustc and, more specifically, the subsequent retain call for deduplication. Over half the CPU time for a full build of this crate is spent computing the Hash ofObligationCauseCode
in thoseinsert
calls.Identifying this hot path as the problem is informative but insufficient to allow me to minimize this. The most granular causative information I have is from the output of
-Zself-profile
, which suggests that the time spent is in evaluate_obligation (as is clear from the above information as well) for queries such asThe unifying theme among the queries that consume the vast majority of the total CPU time is that they're all root-level evaluations of predication on the
Unravel
trait, and there doesn't appear to be a way to achieve further granularity with-Zself-profile
such that I could establish what resultant predicates are being produced that in turn would result in the voluminous contents of thatobligations
Vec, whatever they might be. I'm not sure if the issue here is a failure/missed opportunity in deduplication, an accumulation of a huge number of predicates before deduplication such that a single iteration consumes all this time (perf
records using sampling so I don't know whether many small calls or fewer more time-consuming calls are being made), or something else entirely.For profiling I'm using a locally built stage 1 rustc on
x86_64-unknown-linux-gnu
from commit8aa18cbdc5d4bc33bd61e2d9a4b643d87f5d21de
though this issue occurs on current stable etc. as well.The text was updated successfully, but these errors were encountered: