Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cargo Check T-lang Policy #3477

Merged
merged 9 commits into from
Sep 20, 2023
Merged

Cargo Check T-lang Policy #3477

merged 9 commits into from
Sep 20, 2023

Conversation

Lokathor
Copy link
Contributor

@Lokathor Lokathor commented Aug 22, 2023

@Lokathor
Copy link
Contributor Author

@rfcbot label +I-lang-nominated

@Lokathor
Copy link
Contributor Author

@scottmcm the bot didn't seem to do the thing

@compiler-errors
Copy link
Member

@rustbot label +I-lang-nominated

@rustbot rustbot added the I-lang-nominated Indicates that an issue has been nominated for prioritizing at the next lang team meeting. label Aug 22, 2023
@ehuss ehuss added the T-lang Relevant to the language team, which will review and decide on the RFC. label Aug 22, 2023
@clarfonthey
Copy link
Contributor

clarfonthey commented Aug 23, 2023

Slightly confused by the wording here, but taking away from this RFC, there are effectively three "tiers" to checking for compile errors in Rust:

  1. cargo check catches a subset of the errors of cargo build and therefore may pass even for failing builds.
  2. cargo build is the standard.
  3. cargo clippy catches a superset of the errors of cargo build and therefore may fail even for passing builds.

I'm adding clippy here as an example because before reading this, I had the misconception that it was actually this:

  1. cargo build is the standard.
  2. cargo check will always fail when cargo build fails, but may not return all the errors cargo build does.
  3. cargo clippy will always fail when cargo check does, but may fail more often.

Effectively, I thought that cargo check was a "short circuit" for cargo build that simply stopped sooner upon errors, and completed sooner when things succeeded. There could be subtle compiler bugs that cause cargo check to pass when cargo build fails, but I always assumed that was just a bug to be fixed.

Based upon this RFC, cargo check is merely a best-effort layer on top of cargo build, and it seems… less useful, to codify that. I've always been under the assumption that, while cargo check guarantees little about its output, it would still ensure that it fails at minimum whenever cargo build fails, so that you can use it in cases like CI where you want to ensure that a build would pass but don't actually need the result of building.

Under this system, a CI system which lets you merge changes based upon the output of cargo check would allow invalid changes onto the mainline branch which would later fail cargo build. Which makes it unclear who exactly would benefit from cargo check in that case.

@BurntSushi
Copy link
Member

@clarfonthey Does the possibility of post-monomorphization errors impact your mental model? (Sorry for the broad and pithy question, but it's something that sprang to mind for me after reading your comment.)

@ehuss
Copy link
Contributor

ehuss commented Aug 23, 2023

I'm not entirely sure which teams this RFC would fall under (though I labeled T-lang for now). This doesn't exactly directly relate to Rust-the-language, but more-so the tooling that exists. For example, would T-lang care which errors rust-analyzer emits? Why would that decision be different from what cargo check does?

@ehuss
Copy link
Contributor

ehuss commented Aug 23, 2023

cargo clippy catches a superset of the errors of cargo build and therefore may fail even for passing builds.

A small clarification, cargo clippy is a superset of cargo check, not build. It is the same as check with some extra lints added.

@Lokathor
Copy link
Contributor Author

@ehuss this RFC was written at the request of members of T-lang. The main thing here is the stability promises we want to uphold. The RFC makes the case: If you do not pass cargo build then future versions of rust can change if you pass a cargo check or not.

This helps unblock things like the inline const development where there's been concern about exactly how much can be caught by a check vs a build currently or in the future.

@ehuss
Copy link
Contributor

ehuss commented Aug 23, 2023

I think it might help the discussion to have some background information, definitions, and examples to clarify what this RFC means. This doesn't necessarily need to be codified in the RFC, but if people don't have the context of the particulars of how things work, it could be difficult to understand.

I think this RFC is intentionally trying avoid defining what cargo check means, because that could change at any time in the future.

How it works today

cargo check is implemented in rustc so that it performs most of the compilation steps of a regular build, but stops before the code-generation phase. (This is a large simplification.) The actual flags that are used is --emit=metadata instead of --emit=link.

What kinds of things aren't checked today?

Some of the kinds of errors that are only generated with a full build are:

  • MIR-based lints (arithmetic-overflow, unconditional-panic)
  • MIR-based optimization errors
  • Some constant evaluation errors
  • Post-monomorphization errors
  • Code generation errors
  • Symbol errors
  • Any errors generated by LLVM
  • Any errors generated by the linker

I'm sure @oli-obk or others could explain it better than I can.

There are a variety of issues where you can see examples of what these mean:

rust-lang/rust#49292
rust-lang/rust#52484
rust-lang/rust#56711
rust-lang/rust#61330
rust-lang/rust#63362
rust-lang/rust#67771
rust-lang/rust#70923
rust-lang/rust#99682
rust-lang/rust#111240
rust-lang/rust#112301

You can also search the compiler test suite for build-fail annotations, most of those tests will not fail on check (https://github.com/search?q=repo%3Arust-lang%2Frust%20build-fail&type=code).

@clarfonthey
Copy link
Contributor

@clarfonthey Does the possibility of post-monomorphization errors impact your mental model? (Sorry for the broad and pithy question, but it's something that sprang to mind for me after reading your comment.)

I mean, I have argued in favour of those in the past, but bringing them up in this case makes it reasonable to argue against them. But, I dunno, I still think that there are a large number of steps in building (linking by itself is a massive step, not to mention all the optimisations on the compiled code) that I would expect cargo check to just stop once it gets past the point where any errors might occur.

Arguably, monomorphisation is part of building. But if it can error, I would consider it necessary for cargo check. Whereas, for example, converting the code into runnable instructions or linking, would not be part of that process.

Honestly, link-time errors are a better example here, since those effectively are the last possible errors you can encounter, and you can't easily detect them without building everything. It makes me question me approach more, but I personally am willing to accept them being excluded on account of most Rust code not needing to worry about them.

@djc
Copy link
Contributor

djc commented Aug 23, 2023

I'm surprised by this stance. Similar to @clarfonthey, my mental model has been that cargo check executes all the steps necessary before getting the code generation backend to actually build code -- in particular, because the code generation (and linking) are particularly slow.

So, to change this around: how much of a performance penalty would there be in the typical case if cargo check started doing more (from @ehuss' list, perhaps stopping after post-monomorphization errors but before codegen errors)? IME cargo check is usually fast enough so making it, say, 5% slower with the benefit of catching a bunch more errors -- particular errors that a "naive" user's mental model would expect it to catch -- might be worth it?

Or another way to put it is to suggest that this RFC needs a "how do we teach this?" section.

@oli-obk
Copy link
Contributor

oli-obk commented Aug 23, 2023

Effectively, I thought that cargo check was a "short circuit" for cargo build that simply stopped sooner upon errors, and completed sooner when things succeeded

That seems like something we could easily offer under a flag in general. A fail-fast/stop-on-first-error scheme

@weihanglo
Copy link
Member

That seems like something we could easily offer under a flag in general. A fail-fast/stop-on-first-error scheme

I am confused. Doesn't Cargo already behave like that — stop on the first error from rustc?

@oli-obk
Copy link
Contributor

oli-obk commented Aug 23, 2023

Doesn't Cargo already behave like that — stop on the first error from rustc?

Yes, but rustc continues its current compilation for as long as possible, reporting as many things as possible.

@joshtriplett
Copy link
Member

joshtriplett commented Aug 23, 2023

The rationale leading to this RFC was a discussion on rust-lang/rust#112879 , and the question of whether it'd be permitted to not do all of that work on cargo check to avoid paying the penalty there. The question was then, would it be acceptable if cargo check passed but cargo build failed? Which comes down to a question of whether cargo check is "catch as much as possible while being fast" or "catch everything, and then go as fast as we can under the constraint of still catching everything".

That distinction is important, and we should establish which of those two scenarios we want, rather than leaving it ambiguous or deciding on a case-by-case basis.

@RalfJung
Copy link
Member

RalfJung commented Aug 23, 2023

So, to change this around: how much of a performance penalty would there be in the typical case if cargo check started doing more (from @ehuss' list, perhaps stopping after post-monomorphization errors but before codegen errors)?

I don't think we have numbers, but this requires full monomorphization, whereas currently check doesn't do any monomorphization. We already see bad regressions in rust-lang/rust#112879 from just doing a bit more monomorphization during build; I expect the impact of doing full monomorphization during check to be a lot bigger than that. Like, I'd be surprised if it was less than 50%.

Co-authored-by: Ralf Jung <post@ralfj.de>
@oli-obk
Copy link
Contributor

oli-obk commented Aug 23, 2023

From a compiler perspective (tho I am biased here as someone working on this logic) it would allow us more freedom in refactorings and keep the compiler implementation simpler if we allowed check builds to pass when a full build would fail.

@RustyYato
Copy link

I think any error before LLVM should be caught by cargo check. But maybe not by default, instead there could be a flag that opts into this slow but complete behavior. This way, CI can have the confidence that all errors are caught, and normal usage of cargo check remains very fast. Especially because even linking in a full build can take quite a long time in large projects, so cargo build would still be much slower than cargo check.

text/0000-cargo-check-lang-policy.md Outdated Show resolved Hide resolved
text/0000-cargo-check-lang-policy.md Outdated Show resolved Hide resolved
@RalfJung
Copy link
Member

But maybe not by default, instead there could be a flag that opts into this slow but complete behavior.

Sure, the compiler and cargo teams have the option of introducing such a flag. The RFC talks about cargo check as it exists today, and its default behavior.

Lokathor and others added 3 commits August 24, 2023 00:12
Co-authored-by: Ralf Jung <post@ralfj.de>
Co-authored-by: Ralf Jung <post@ralfj.de>
Co-authored-by: Ralf Jung <post@ralfj.de>
@Lokathor
Copy link
Contributor Author

Also, I would argue that CI should absolutely be running cargo test, not just a check

@epage
Copy link
Contributor

epage commented Aug 28, 2023

Also, I would argue that CI should absolutely be running cargo test, not just a check

For me, I used a mix of cargo test and cargo check based on the probability that something will exhibit runtime vs build time failures so I can streamline my builds.

One example is I run cargo test on stable and cargo check on MSRV. However, that wouldn't be affected by this because I'm still running some kind of cargo build (via cargo test) on the same subset. However, I could see myself running cargo hack check to verify all feature combinations while only doing cargo test on a fixed set of features. This is a case where it could potentially break.

@dlight
Copy link

dlight commented Aug 29, 2023

Some errors can't be emitted without actually doing a full compilation. For example inline asm blocks are checked by the codegen backend and linker errors are emitted by the linker.

Ok, then what about cargo check --check-level=1 where the check-level is an option that signals how thorough should the test be (the higher the number, the slower it is and it catches more errors), with the understanding that link-time checks will require cargo build anyway.

Then, an option in Cargo.toml's [profile] section that sets the default check level when it's not specified in the command line, like

[profile.dev]
check-level=1

@tmandry
Copy link
Member

tmandry commented Aug 29, 2023

This was discussed in today's lang triage meeting (notes). The consensus was that we should do it, after the updates in 309c26f. In particular, we can't guarantee that no optimization-dependent errors will ever exist, but we want the default to be that they shouldn't exist and are considered a bug (and to document the exceptions).

@rfcbot fcp merge

@rfcbot
Copy link
Collaborator

rfcbot commented Aug 29, 2023

Team member @tmandry has proposed to merge this. The next step is review by the rest of the tagged team members:

No concerns currently listed.

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

cc @rust-lang/lang-advisors: FCP proposed for lang, please feel free to register concerns.
See this document for info about what commands tagged team members can give me.

@rfcbot rfcbot added proposed-final-comment-period Currently awaiting signoff of all team members in order to enter the final comment period. disposition-merge This RFC is in PFCP or FCP with a disposition to merge it. labels Aug 29, 2023
Monomorphization is expensive: instead of having to check each function only once, each function now has to be checked once for all choices of generic parameters that the crate needs.
Given this performance cost and the fact that errors during monomorphization are fairly rare, `cargo check` favors speed over completeness.

Examples where the optimization level can affect if a program passes `cargo check` and/or `cargo build` are considered bugs unless there is a documented policy exception, approved by T-lang. One example of such an exception is [RFC #3016](https://rust-lang.github.io/rfcs/3016-const-ub.html), which indicated that undefined behavior in const functions cannot always be detected statically (and in particular, optimizations may cause the UB to be undetectable).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, this is currently not optimization-dependent. CTFE always runs on unoptimized MIR, exactly to avoid that situation.

But I understand the point of this paragraph is that it would be allowed to be optimization-dependent.

Co-authored-by: scottmcm <scottmcm@users.noreply.github.com>
@scottmcm
Copy link
Member

I do think that it would be nice to have more non-normative references and examples (like #3477 (comment) brought up), but I agree with the actual policy.

@rfcbot reviewed

Copy link

@CAD97 CAD97 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TL;DR:

  • Good outcomes of this policy:
    • A program passing cargo check but failing cargo build is an invalid program and not covered under stability promises.
    • Optimization dependent errors in cargo build are considered bugs barring specific policy.
  • Bad outcomes of this policy:
    • cargo check being "fast" is more important than being "complete," without any limitation on when sacrificing the latter for the former is considered acceptable.
    • Continued conflation of codegen errors and const panics as equally hard/bad post-monomorphization errors.

I want to note that I've found it less than useful to use the term "post-monomorphization errors" here; it's not a useful term to anyone who isn't the compiler. It's basically just tautologically the set of errors which are found by cargo build but not cargo check, as the difference between the two is that check stops before performing monomorphization. The set of what errors are post-mono isn't even a fully consistent set with respect to surface language features.

I find a much more useful categorization is "codegen" and "generic instantiation" errors. The "codegen" category is those things that are intrinsically tied to the build process, so LLVM, inline asm, and linker errors, but also MIR lowering errors and (vendor) intrinsic usage errors. "Generic instantiation" errors are those that would be diagnosed pre-mono if the exact same code were written in a different function.

The latter category may even be solely populated by const evaluation panics. I still hold that such are a special case compared to every other post-mono error.

I have no qualms with cargo check missing errors caught by cargo build in general, nor even with errors reported by cargo build being optimization dependent. Certain classes of errors ("codegen") are clearly build/optimization dependent. It's the specific case of const panics which I take issue with.

An example and more complaining about this specific case
impl Trait for () {}
trait Trait {
    const ASSOC: () = panic!();
}

fn f_inner<T: Trait>() {
    T::ASSOC // const { panic!() }
}

pub fn f() {
    // cargo check: okay
    // cargo clippy: error
    // cargo build: error
    <()>::ASSOC;
}

// cargo check: error
const _: () = <()>::ASSOC;

The reason this bothers me so much I think has to do with legacy — in the beginning, it used to be that a panic in a (promoted) constant evaluation was deferred until runtime, being the guaranteed panic lint. With const panic!, though, this changed; even when the usage of a const item is at runtime, the error is promised to occur at build time... if it happens at all.

If we made check catch less and made the above into a deny-by-default guaranteed panic lint instead of a hard error, I'd care less. It's the inconsistency which I take issue with; const panic being a hard error says that const evaluation is a comptime thing, but deferring evaluation to mono time instead of obligation checking is based on comptime const eval being just an optimization over runtime eval. The two can't both be true at the same time.


While the policy is general and doesn't need to have a final say on what classes of errors check does or doesn't find, I do think the policy should have some examples of what classes of errors check is still expected to find, and what classes it's expected to ignore. I know it's not the actual position of any team, but as written, the stated policy as written of check being a fast approximation of build being more important than catching errors means I could PR massive wins by bypassing borrowck and name resolution independent trait obligation evaluation in check, and accepting that would be allowed by this policy.

Much less drastic, I'll make it so that all unused const items never get evaluated by check. Unused associated const items are something currently skipped by build, so there's clear precedent for skipping this kind of const evaluation. That should offer a consistent small win on real world crates. And likely make those crates unsound under check, since const _: () = assert!(...); is a fairly standard way to do a compile-time const assertion.

I'm representing a deliberately hostile interpretation of the policy because without clarification, an expected outcome should be an erosion of the trust in cargo check. So far it's been that sometimes check won't catch some errors that build will, but if you aren't doing anything unsafe (e.g. linker errors) and/or weird and advanced (e.g. associated const eval errors), check will always be sufficient. This policy is explicitly saying that this statement is far too strong.

At absurdity, consider if borrowck were extremely expensive, and took on average 20% of the runtime of check. Would removing borrowck from check be permissable? Is there a vibe check available for when a completeness/speed tradeoff is considered desirable other than just legacy of what's pre- or post-mono in the current compiler architecture? How do we make justifiable calls when our largest test corpuses are primarily of correct code, and this is about diagnosing incorrect code?

At a minimum, I'd hope that adopting this as policy would come with a blog post, and for that blog post to illustrate an example of an error in each category of "caught by check, will always be caught by check;" "only caught by build, likely never caught by check;" and "might be caught by check, if doing so can be made fast."

# Unresolved questions
[unresolved-questions]: #unresolved-questions

* Is there any situation when we would *want* to allow optimization level to affect if a program passes or fails a build? This seems unlikely.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's one very notable case for linker errors: the "no panic trick" relies on making an unwind call an undefined linker symbol and optimization removing any references to the symbol, thus allowing the code to compile iff the compiler proves that no unwinding can occur.

Additionally, essentially the entirety of the average 5% regressions caused by rust-lang/rust#112879 come from not permitting dead code optimizations to skip middle-end MIR monomorphization of statically known to be unreachable code.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Additionally, essentially the entirety of the average 5% regressions caused by rust-lang/rust#112879 come from not permitting dead code optimizations to skip middle-end MIR monomorphization of statically known to be unreachable code.

That's not clear at all, given that that PR reports regressions in check as well.

@Lokathor
Copy link
Contributor Author

Lokathor commented Sep 4, 2023

I would say that while T-lang is in charge of defining "Rust the Language", it's still separately up to T-compiler to manage "Rust the Compiler Implementation". I don't think that T-compiler would carelessly approve any big PRs which make the user's experience worse, even if the bounds of the language technically allow it.

@RalfJung
Copy link
Member

RalfJung commented Sep 4, 2023

I want to note that I've found it less than useful to use the term "post-monomorphization errors" here; it's not a useful term to anyone who isn't the compiler. It's basically just tautologically the set of errors which are found by cargo build but not cargo check, as the difference between the two is that check stops before performing monomorphization. The set of what errors are post-mono isn't even a fully consistent set with respect to surface language features.

That's not correct. We currently have some "errors skipped by check" that are not post-monomorphization errors: rust-lang/rust#49292.

There's a very clear definition of post-mono errors and it has nothing directly to do with check vs build. It's simply about whether the error can be detected on generic MIR or not. So I still find the term quite useful. That's not to say we can't have sub-categories of post-mono errors, but the overarching category has its use as well.

I find a much more useful categorization is "codegen" and "generic instantiation" errors. The "codegen" category is those things that are intrinsically tied to the build process, so LLVM, inline asm, and linker errors, but also MIR lowering errors and (vendor) intrinsic usage errors. "Generic instantiation" errors are those that would be diagnosed pre-mono if the exact same code were written in a different function.

The latter category may even be solely populated by const evaluation panics. I still hold that such are a special case compared to every other post-mono error.

It's definitely not just const-eval panics, it's at least const-eval failure in general (which can be due to panic, UB, or doing something unsupported at compile-time).

@rfcbot rfcbot added final-comment-period Will be merged/postponed/closed in ~10 calendar days unless new substational objections are raised. and removed proposed-final-comment-period Currently awaiting signoff of all team members in order to enter the final comment period. labels Sep 5, 2023
@rfcbot
Copy link
Collaborator

rfcbot commented Sep 5, 2023

🔔 This is now entering its final comment period, as per the review above. 🔔

@nikomatsakis
Copy link
Contributor

@rfcbot fcp reviewed

I am happy with this being the lang policy. In short, the "official" semantics of Rust come with cargo build, and lang team still controls what is or is not an error on that; but the compiler can offer convenience modes (like check) that do a subset of those checks. It's up to the compiler team then to keep that subset useful.

@Lokathor Lokathor mentioned this pull request Sep 7, 2023
@rfcbot rfcbot added finished-final-comment-period The final comment period is finished for this RFC. and removed final-comment-period Will be merged/postponed/closed in ~10 calendar days unless new substational objections are raised. labels Sep 15, 2023
@rfcbot
Copy link
Collaborator

rfcbot commented Sep 15, 2023

The final comment period, with a disposition to merge, as per the review above, is now complete.

As the automated representative of the governance process, I would like to thank the author for their work and everyone else who contributed.

This will be merged soon.

@CAD97
Copy link

CAD97 commented Sep 17, 2023

Since I've been loudly against check ignoring instantiated constant evaluation errors stripped before monomorphization, I want to clarify that I'm actually in favor of how Niko phrased it: the language semantics are implemented by build, and it's the domain of T-compiler (not T-lang) to determine how shortcut check is/isn't. I don't have an actual say here, but I felt this worth clarifying as a voice to the counter party.

I still disagree with an unqualified check missing Rust-native build errors (i.e. caused by language semantics in the frontend, not by the process of backend lowering/codegen), but I agree with the actual textual content of the policy.

@tmandry tmandry merged commit 889898e into rust-lang:master Sep 20, 2023
@tmandry
Copy link
Member

tmandry commented Sep 20, 2023

The @rust-lang/lang team has decided to accept this RFC.

There is no tracking issue for this RFC because it is a policy decision, and there is no associated implementation to stabilize. To report issues with the compiler, please open an issue in the rust-lang/rust repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
disposition-merge This RFC is in PFCP or FCP with a disposition to merge it. finished-final-comment-period The final comment period is finished for this RFC. I-lang-nominated Indicates that an issue has been nominated for prioritizing at the next lang team meeting. T-lang Relevant to the language team, which will review and decide on the RFC. to-announce
Projects
None yet
Development

Successfully merging this pull request may close these issues.