Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix stack overflows when compiling high-recursion_limit programs #93056

Closed

Conversation

LegionMammal978
Copy link
Contributor

@LegionMammal978 LegionMammal978 commented Jan 19, 2022

This should close #75577; I've added ensure_sufficient_stack to all of the recursive paths used by the original dumb program, as well as more I've found while testing trivial variations. (Some of the variations are listed in my comment there.) However, there are a few problems left with these fixes.

  1. While I tried to avoid modifying indentation of large blocks, this was not possible for all functions. In its current state, upstream changes are causing frequent merge conflicts with the modified functions. There might be no solution to this. It would be nice to have something like an #[ensure_sufficient_stack] attribute on functions that wraps their entire body, but rustc currently only seems to be using proc macros for deriving traits.

  2. The final commit is extremely unpolished; I mainly wanted to have it work at all. The llvm::Verifier class (defined in src/llvm-project/llvm/lib/IR/Verifier.cpp) finds all references to metadata nodes (mostly debuginfo) in the LLVM module, then traverses them in a recursive function. Fixing stack overflows here is necessary for all of the variations involving deeply nested types. Since the function is written in C++, I can't simply insert ensure_sufficient_stack; the only option is to allocate the entire additional stack space ahead of time. However, to do that, the compiler must compute the depth of the metadata node graph (which can contain loops), and then pass that down to the backend in the codegen thread for the module. There are several ways to achieve the first:

    • Each time a possibly recursive metadata node is created, calculate its depth from its operands, and store that in a HashMap somewhere. This is what my commit does; I added a call to every invocation of a DIBuilder method.
    • Instead of passing around raw llvm::DIWhatever references, always pass (llvm::DIWhatever, usize) pairs with the depth of the node and its operands.
    • Use LLVM's APIs to traverse the entire llvm::Module after the fact to compute the depth. This would effectively involve replicating all of Verifier.cpp.
    • Determine exactly what kinds of metadata the compiler generates, including which parts are recursive. Use known information about the crate's types and other items to derive the depth of the metadata generated from it.
    • Use some multiple of the recursion_limit. This is the simplest possible solution, but it creates lots of memory overhead when the limit is not reached and would likely cause issues in currently existing crates.

    After that, there's the issue of collecting the data and passing it to the backend. In my commit, I add a RefCell<FxHashMap<&'ll llvm::Metadata, usize>> to the CrateDebugContext (as well as a method to add to it), and a static Mutex<FxHashMap<usize, usize>> to associate llvm::Module pointers to their maximum depths. I'm not familiar with the compiler, and I simply have no idea how to idiomatically pass anything but the llvm::Module itself to the backend. Overall, I could really use some advice on this part of the fix.

  3. There are undoubtedly many recursive paths which my limited testing has not revealed. Eliminating all stack overflows would require inspecting the entire call graph of the compiler, including its backends. I recall seeing a couple tools offering call graph generation for regular crates, but I doubt they'd work well with rustc's custom build system.

@rustbot rustbot added the T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. label Jan 19, 2022
@rust-highfive
Copy link
Collaborator

r? @wesleywiser

(rust-highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jan 19, 2022
@petrochenkov petrochenkov self-assigned this Jan 19, 2022
@LegionMammal978 LegionMammal978 changed the title Fix stack overflows in high-recursion_limit programs Fix stack overflows when compiling high-recursion_limit programs Jan 22, 2022
@petrochenkov petrochenkov removed their assignment Jan 22, 2022
@bors
Copy link
Contributor

bors commented Jan 28, 2022

☔ The latest upstream changes (presumably #93006) made this pull request unmergeable. Please resolve the merge conflicts.

@LegionMammal978 LegionMammal978 force-pushed the fix-stack-overflows branch 2 times, most recently from f12763d to 1e81a60 Compare January 30, 2022 15:34
@bors
Copy link
Contributor

bors commented Feb 2, 2022

☔ The latest upstream changes (presumably #93154) made this pull request unmergeable. Please resolve the merge conflicts.

@bors
Copy link
Contributor

bors commented Feb 8, 2022

☔ The latest upstream changes (presumably #92007) made this pull request unmergeable. Please resolve the merge conflicts.

@bors
Copy link
Contributor

bors commented Feb 10, 2022

☔ The latest upstream changes (presumably #93836) made this pull request unmergeable. Please resolve the merge conflicts.

@bors
Copy link
Contributor

bors commented Feb 11, 2022

☔ The latest upstream changes (presumably #93893) made this pull request unmergeable. Please resolve the merge conflicts.

@rust-log-analyzer

This comment has been minimized.

@bors
Copy link
Contributor

bors commented Feb 15, 2022

☔ The latest upstream changes (presumably #93148) made this pull request unmergeable. Please resolve the merge conflicts.

@bors
Copy link
Contributor

bors commented Feb 18, 2022

☔ The latest upstream changes (presumably #94103) made this pull request unmergeable. Please resolve the merge conflicts.

@rust-log-analyzer

This comment has been minimized.

@bors
Copy link
Contributor

bors commented Feb 19, 2022

☔ The latest upstream changes (presumably #94134) made this pull request unmergeable. Please resolve the merge conflicts.

@bors
Copy link
Contributor

bors commented Feb 20, 2022

☔ The latest upstream changes (presumably #94174) made this pull request unmergeable. Please resolve the merge conflicts.

@michaelwoerister
Copy link
Member

@LegionMammal978, this is the PR you reached to me about, right? I'll try to have look at it next week (especially wrt the debuginfo problem).

@LegionMammal978
Copy link
Contributor Author

Yes, this is the PR I was referring to. I've since gotten the hack to work with your changes; annoyingly, a few of the LLVMRustDIBuilder* functions create multiple nodes at once. Currently, it's set to panic if a node is registered with an unregistered operand, so that it always stays in sync with debuginfo changes, but it's not exactly ideal.

@bors
Copy link
Contributor

bors commented Mar 30, 2022

☔ The latest upstream changes (presumably #94081) made this pull request unmergeable. Please resolve the merge conflicts.

@michaelwoerister
Copy link
Member

@LegionMammal978, I've finally gotten around to taking a closer look at this. Regarding the debuginfo part, I have to say I'm skeptical about going to such lengths on our side, just to make sure that LLVM doesn't run into a stack overflow for what are most likely unrealistic programs :)

I suggest splitting the non-controversial parts of this PR out into a separate PR and get that landed.

About debuginfo, I'm not so sure what the best approach is.

Use some multiple of the recursion_limit. This is the simplest possible solution, but it creates lots of memory overhead when the limit is not reached and would likely cause issues in currently existing crates.

Do we know if this is a problem for the default recursion limit? I think it would be fine if explicitly upping the recursion limit implied increased memory consumption.

Instead of passing around raw llvm::DIWhatever references, always pass (llvm::DIWhatever, usize) pairs with the depth of the node and its operands.

I'll need to think about that approach a bit. Maybe it can be implemented cheaply and mostly transparently as part of the TypeMap. But I don't think we should introduce a maintenance burden just for handling pathological cases.

@LegionMammal978
Copy link
Contributor Author

I have to say I'm skeptical about going to such lengths on our side, just to make sure that LLVM doesn't run into a stack overflow for what are most likely unrealistic programs :)

Really, the optimal solution here would be for LLVM to provide a config option to let the user pass in their own stack-growing callback. But I have no clue how to begin submitting a patch to upstream LLVM, especially one of such magnitude, and I've heard horror stories of patches for the big compilers languishing for years with no response. For that reason, I'm not too optimistic about a solution on the LLVM side. It would be a shame if we couldn't get this in on either side, though, since the main motivation is that programs setting the recursion_limit should actually be able to use the full limit.

I suggest splitting the non-controversial parts of this PR out into a separate PR and get that landed.

The annoying part is, the controversial debuginfo fix is necessary for most of the type-based test cases to work. I'm currently using 8 different test cases, all of which create some deeply nested program structure using a declarative macro:

  1. Test<Test<Test<...>>>
  2. Test<Test<Test<...>>> (using a different macro design)
  3. ((), ((), ((), (...))))
  4. const TEST = deref(&deref(&deref(&...))), where deref(x) = *x
  5. <<<<...>::Type as Trait> as Trait>::Type as Trait>::Type
  6. const TEST = ((((((...))))))
  7. fn test() { fn test() { fn test() { ... } } }
  8. const TEST = id(id(id(...))), where id(x) = x

Test cases 1, 2, and 3 require this fix to work in debug builds. Notably, this fix is also necessary to compile the "dumb program" posted in the original issue. I guess it would be possible to extract out the other fixes, but first I'd want to find a way to reduce the number of re-indented functions.

Do we know if this is a problem for the default recursion limit? I think it would be fine if explicitly upping the recursion limit implied increased memory consumption.

The problem comes from programs that are manually specifying the recursion limit. Currently, in my test cases, I specify #![recursion_limit = "1000000000"] for simplicity's sake, and the compiler takes up no additional memory. However, if memory usage did scale with the recursion limit, then this would suddenly add dozens of gigabytes to the memory usage. Since it's currently possible for stable programs to set the recursion limit to 1000000000 with no issue, I believe it would be a breaking change to start using O(program + limit) memory instead of O(program) memory.

@bors
Copy link
Contributor

bors commented Apr 9, 2022

☔ The latest upstream changes (presumably #95524) made this pull request unmergeable. Please resolve the merge conflicts.

@bors
Copy link
Contributor

bors commented Apr 23, 2022

☔ The latest upstream changes (presumably #96316) made this pull request unmergeable. Please resolve the merge conflicts.

@rust-log-analyzer

This comment has been minimized.

@bors
Copy link
Contributor

bors commented Apr 28, 2022

☔ The latest upstream changes (presumably #91557) made this pull request unmergeable. Please resolve the merge conflicts.

The ensure_sufficient_stack function executes its callback in a
dynamically allocated stack when the current stack is low on space.
Using it in recursive algorithms such as visitors prevents the compiler
from overflowing its stack when compiling modules with a high
recursion_limit.
In rustc_ast, an Expr contains an ExprKind, which can contain a P<Expr>.
To prevent the stack from overflowing when dropping an Expr, one of the
types must use ManuallyDrop for its fields. Since both Expr and ExprKind
are frequently deconstructed into their fields using match expressions,
I chose to attach it to P::ptr.
This code is a proof-of-concept more than anything else. It is not ready
to be merged in its current state. Any version of this fix must somehow
measure the depth of the metadata nodes (or a rough upper bound), and it
must somehow send that depth to the LLVM codegen entry point. There is
likely a tradeoff to be made between performance and readability here.
@bors
Copy link
Contributor

bors commented Apr 28, 2022

☔ The latest upstream changes (presumably #96495) made this pull request unmergeable. Please resolve the merge conflicts.

@apiraino
Copy link
Contributor

Since there was a review and following activity, I'd flag this as s-waiting-on-author until a new review can be requested

@rustbot -S-waiting-on-review +S-waiting-on-author

@apiraino apiraino added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Apr 28, 2022
@JohnCSimon
Copy link
Member

@LegionMammal978
ping from triage - can you post your status on this PR? Thanks

@LegionMammal978
Copy link
Contributor Author

@JohnCSimon My apologies, I didn't receive the ping. I've been pretty busy IRL for the past few months, so I haven't had much time to fix this up. Right now, I'll probably want to wait for #97447 to land, given how much of the type-folding logic it touches. Also, I've been meaning in general to find a way to enumerate the recursive visitors in the compiler, since I'm certain I haven't caught all of them. Regarding the debuginfo depth, I have some ideas for how to independently compute it from the type graph without too much churn, but I'm still not exactly sure what form it will take. I might be able to cook something up in 2 or 3 weeks.

@JohnCSimon
Copy link
Member

@LegionMammal978
#97447 has been merged, I see a lot of merge conflicts.

FYI: when a PR is ready for review, send a message containing
@rustbot ready to switch to S-waiting-on-review so the PR is in the reviewer's backlog.

@JohnCSimon JohnCSimon added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Jul 24, 2022
@JohnCSimon
Copy link
Member

@LegionMammal978
Ping from triage: I'm closing this due to inactivity, Please reopen when you are ready to continue with this.
Note: if you do please open the PR BEFORE you push to it, else you won't be able to reopen - this is a quirk of github.
Thanks for your contribution.

@rustbot label: +S-inactive

@JohnCSimon JohnCSimon closed this Sep 11, 2022
@rustbot rustbot added the S-inactive Status: Inactive and waiting on the author. This is often applied to closed PRs. label Sep 11, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-inactive Status: Inactive and waiting on the author. This is often applied to closed PRs. S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Thread 'rustc' has overflowed its stack on dumb program
10 participants