Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

regression: crate compilation regression on reporting errors with Rust 1.68-nightly (62bf38fa6 2025-01-10) #136516

Open
erickt opened this issue Feb 3, 2025 · 4 comments
Labels
C-bug Category: This is a bug. E-needs-bisection Call for participation: This issue needs bisection: https://github.com/rust-lang/cargo-bisect-rustc E-needs-mcve Call for participation: This issue has a repro, but needs a Minimal Complete and Verifiable Example I-compiletime Issue: Problems and improvements with respect to compile times. I-prioritize Issue: Indicates that prioritization has been requested for this issue. regression-untriaged Untriaged performance or correctness regression. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-types Relevant to the types team, which will review and decide on the PR/issue.

Comments

@erickt
Copy link
Contributor

erickt commented Feb 3, 2025

Hello folks,

On Fuchsia we've noticed a pretty significant complication time regression in Fuchsia when reporting an error in code when we accidentally out an associated type from an impl. This is very analogous to an issue we had in 2023 with #116996 where the compiler appeared to get stuck in a livelock or some exponential-time worst case scenario if we forgot to use a type that was referenced in the code. As far as we can tell that was fixed #132625.

We ran into this in two different places. First was with a missing associated type in a trait impl in http://fxrev.dev/1180033/10..11 in the testutil.rs, where the bug occurred in "Patchset 10" and fixed in "Patchset 11".

The second case was in http://fxrev.dev/1179835 (our internal bug is https://fxbug.dev/385922001 for any Googlers), where we forgot to import frunner into the namespace. This would fix it:

use {
     fidl_fuchsia_io as fio, fidl_fuchsia_memory_attribution as fattribution,
-    fuchsia_async as fasync,
+    fuchsia_async as fasync, fidl_fuchsia_component_runner as frunner
 }; 

If it helps, here's the old thread we had on zulip t-types https://rust-lang.zulipchat.com/#narrow/channel/144729-t-types/topic/Fuchsia.20Build.20Time.20Regression/near/495568058

Please let me know here or on zulip if there's anything you'd like us to test out, since it's a bit complicated to reproduce issues since it can take a long time to compile Fuchsia.

Meta

rustc --version --verbose:

Rust 1.68-nightly (62bf38fa6 2025-01-10)
Backtrace

The engineer who encountered this let rust run overnight, and after about 10 hours Rust finally erred out with this large backtrace: https://gist.github.com/erickt/76bd9586079552915f623fa7a378fc7d.

cc @lqd

@erickt erickt added the C-bug Category: This is a bug. label Feb 3, 2025
@rustbot rustbot added the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Feb 3, 2025
@erickt
Copy link
Contributor Author

erickt commented Feb 3, 2025

One other observation we saw is that there was another recent ICE that had the same assertion failed: value <= 0xFFFF_FF00 error message #132429, but from what we can tell our version of rustc should contain the #132466 fix.

@saethlin saethlin added the T-types Relevant to the types team, which will review and decide on the PR/issue. label Feb 4, 2025
@jieyouxu jieyouxu added T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. E-needs-bisection Call for participation: This issue needs bisection: https://github.com/rust-lang/cargo-bisect-rustc E-needs-mcve Call for participation: This issue has a repro, but needs a Minimal Complete and Verifiable Example I-compiletime Issue: Problems and improvements with respect to compile times. regression-untriaged Untriaged performance or correctness regression. and removed needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. labels Feb 4, 2025
@rustbot rustbot added the I-prioritize Issue: Indicates that prioritization has been requested for this issue. label Feb 4, 2025
@lqd
Copy link
Member

lqd commented Feb 4, 2025

The second case was in https://fuchsia-review.git.corp.google.com/c/fuchsia/+/1179835 (our internal bug is https://fxbug.dev/385922001 for any Googlers), where we forgot to import frunner into the namespace

It seems the first link is also for googlers only.

Please let me know here or on zulip if there's anything you'd like us to test out, since it's a bit complicated to reproduce issues since it can take a long time to compile Fuchsia.

As discussed on zulip, some of these could be helpful:

@erickt
Copy link
Contributor Author

erickt commented Feb 6, 2025

It seems the first link is also for googlers only.

Oops sorry! I updated the links with the publicly accessible ones.

  • whether the issue reproduces with the old solver only, -Znext-solver=no
  • whether the issue reproduces with the new solver in coherence, -Znext-solver=coherence (that is the current default)
  • if it works for this crate, as it's still in-progress work: whether the issue reproduces with the new solver, -Znext-solver=globally

I'll test this out and get back to you (probably tomorrow or Friday).

A repro outside of fuchsia, and an MCVE, would allow for further investigation.

I'll see if I can detangle things to make it easier to reproduce outside of our tree. That might take some time though.

In the meantime, if we are able to extract out the code, do you know if there's a knob we could tweak that could reduce the amount of interned indices to trigger the assertion failure? If so, since this code takes about 3-5 minutes to compile, we could find a limit that triggers the assertion around 10 minutes, and then we could use something like rust-reduce or https://github.com/langston-barrett/treereduce to try to get an MCVE.

@lqd
Copy link
Member

lqd commented Feb 6, 2025

I'll test this out and get back to you (probably tomorrow or Friday).

Cool, thank you.

In the meantime, if we are able to extract out the code, do you know if there's a knob we could tweak that could reduce the amount of interned indices to trigger the assertion failure? If so, since this code takes about 3-5 minutes to compile

Technically it's here, but I would probably against starting with this approach: we use this niche "pretty liberally" in serializing and LEB128 encoding, so I'm not sure if other things may break before reaching your issue. If you know which newtype is overlfowing then you can also #[max] that I guess. Though the ICE looks to be in incremental compilation (i.e. usually a distant byproduct of a trait solving slowness) and one would usually disable that when minimizing.

But also because of the answer below.

we could find a limit that triggers the assertion around 10 minutes

Hum, an easy way to trigger a 10-min timeout would be timeout 10m rustc file.rs, so I'd probably start with that rather than the newtype id exhaustion 😅.

(Casual reminder to a googler that rust-reduce is AGPL)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-bug Category: This is a bug. E-needs-bisection Call for participation: This issue needs bisection: https://github.com/rust-lang/cargo-bisect-rustc E-needs-mcve Call for participation: This issue has a repro, but needs a Minimal Complete and Verifiable Example I-compiletime Issue: Problems and improvements with respect to compile times. I-prioritize Issue: Indicates that prioritization has been requested for this issue. regression-untriaged Untriaged performance or correctness regression. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-types Relevant to the types team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

5 participants