Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix --remap-path-prefix not correctly remapping rust-src component paths and unify handling of path mapping with virtualized paths #83813

Merged
merged 12 commits into from
May 12, 2021

Conversation

cbeuw
Copy link
Contributor

@cbeuw cbeuw commented Apr 3, 2021

This PR fixes #73167 ("Binaries end up containing path to the rust-src component despite --remap-path-prefix") by preventing real local filesystem paths from reaching compilation output if the path is supposed to be remapped.

RealFileName::Named introduced in #72767 is now renamed as LocalPath, because this variant wraps a (most likely) valid local filesystem path.

RealFileName::Devirtualized is renamed as Remapped to be used for remapped path from a real path via --remap-path-prefix argument, as well as real path inferred from a virtualized (during compiler bootstrapping) /rustc/... path. The local_path field is now an Option<PathBuf>, as it will be set to None before serialisation, so it never reaches any build output. Attempting to serialise a non-None local_path will cause an assertion faliure.

When a path is remapped, a RealFileName::Remapped variant is created. The original path is preserved in local_path field and the remapped path is saved in virtual_name field. Previously, the local_path is directly modified which goes against its purpose of "suitable for reading from the file system on the local host".

rustc_span::SourceFile's fields unmapped_path (introduced by #44940) and name_was_remapped (introduced by #41508 when --remap-path-prefix feature originally added) are removed, as these two pieces of information can be inferred from the name field: if it's anything other than a FileName::Real(_), or if it is a FileName::Real(RealFileName::LocalPath(_)), then clearly name_was_remapped would've been false and unmapped_path would've been None. If it is a FileName::Real(RealFileName::Remapped{local_path, virtual_name}), then name_was_remapped would've been true and unmapped_path would've been Some(local_path).

cc @eddyb who implemented /rustc/... path devirtualisation

@rust-highfive
Copy link
Collaborator

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @varkor (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

Please see the contribution instructions for more information.

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Apr 3, 2021
@eddyb
Copy link
Member

eddyb commented Apr 3, 2021

cc @pnkfelix @michaelwoerister

Copy link
Member

@nagisa nagisa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could the history be simplified, please?

e.g. 1 commit for adding a test case, rather than 2, combine the Devirtualized changes into one, etc. As the history is right now, its not very straightforward to follow and makes reviewing difficult as the same area of code is changed multiple times.

Furthermore, am I correct that the last 2 commits are not required to fix the issue itself? Could those be split out into a separate PR?

@cbeuw
Copy link
Contributor Author

cbeuw commented Apr 3, 2021

@nagisa yeah I probably should've squashed some of them before the PR. I'll reorganise every change so far and force push. So further changes can still be kept as individual commits.

It's true that the bug can be solved without the last two commit and I did contemplate putting up two PRs. But the second PR would have to incorporate all the changes in the first one, so it seems I need to wait until the first PR merges into master first? Another reason was that the state before the last two commits has bits I'm unsure about (for instance, unmaped_path is supposed to be None for imported files, but we are remapping imported file paths, should it be set or left as None - it's better to just remove the field entirely), so I decided to put all changes in one PR to be reviewed together

@rust-log-analyzer

This comment has been minimized.

@michaelwoerister
Copy link
Member

Nice! Thanks for the PR, @cbeuw!

@nagisa, @varkor, let me know if you'd like me to take a closer look at the fix.

@nagisa
Copy link
Member

nagisa commented Apr 6, 2021

Please do! My primary concern is basically that this implements repeated remapping, whereas in the linked issue @eddyb said:

I don't think --remap-path-prefix should remap paths that were already remapped

I think implementation wise this is fine. Perhaps what we need is a fcp merge?

@cbeuw
Copy link
Contributor Author

cbeuw commented Apr 7, 2021

@nagisa Regarding repeated remapping, I think there are two senarios:

  1. Remapping a /rustc/$hash/... path which has been remapped during compiler bootstraping. Although it doesn't seem very useful, I don't see why we shouldn't allow this. It might be used (in a very contrived manner) to manually remap it back to a real path on the system for a sysroot crate without installing rust-src.
  2. Remapping paths embedded in the compilation output of another crate, which may have already been remapped. This is what src/test/codegen/remap_path_prefix/double-remap-imported.rs is testing. This shouldn't happen if rustc is invoked by cargo, as it passes the same RUSTFLAGS to all invocation to rustc - that is unless the user specifies multiple --remap-path-prefixes with the to path of one matching from path of another, e.g. --remap-path-prefix=a=b --remap-path-prefix=b=c. It shouldn't break anything and I don't think we should place artificial restrictions on this.

@varkor
Copy link
Member

varkor commented Apr 7, 2021

@nagisa, @varkor, let me know if you'd like me to take a closer look at the fix.

I'm very busy at the moment, so if you'd be happy to complete the review, that would be great :)

r? @michaelwoerister

@michaelwoerister
Copy link
Member

OK, I'll take over the review 👍

@michaelwoerister
Copy link
Member

@eddyb Do you think that the functionality in #70642 could also be implemented by making Rustup's rustc wrapper binary inject an appropriate --remap-path-prefix option?

@eddyb
Copy link
Member

eddyb commented Apr 7, 2021

@eddyb Do you think that the functionality in #70642 could also be implemented by making Rustup's rustc wrapper binary inject an appropriate --remap-path-prefix option?

No, I think that's way too invasive. I'm not sure if this was clear to everyone but part of the reason for making it sysroot-relative is that distro packages can also supply it, as long as they follow the same path hierarchy that rust-src has (AFAIK make install / x.py install should be enough?).

Maybe the hash and/or version should be part of the src paths in rust-src itself? Then the whole /rustc/<hash> hack could be replaced with just something like $sysroot - ofc this would break a bunch of tools, but it would also make it more clear what is going on.

Anyway I'm not super sure why the whole virtualized path thing is even relevant.
Its purpose is to allow diagnostics to point to an actual file to improve UX (we could even hide the path and only use it to show the code, really, but then the user wouldn't have access to wider context). The "devirtualized local path" should never end up in file!()/#[track_caller]/debuginfo, and if it does that's a bug IMO.

@michaelwoerister
Copy link
Member

I'm still in the process of finding out what the status quo is and how this PR modifies it. My findings so far are:

Currently it does not look to me like we need remapping of imported paths (i.e. "double remapping") to fix #73167. The compiler has the stable name of upstream file names available. I would opt for trying to fix the existing bugs without introducing new functionality (which would probably need a major change proposal).

That being said, from the description of the PR it sounds like it contains some nice cleanup refactorings. I'll take a closer look soon.

@michaelwoerister
Copy link
Member

No, I think that's way too invasive. I'm not sure if this was clear to everyone but part of the reason for making it sysroot-relative is that distro packages can also supply it ...

Yes, that makes sense to me.

@cbeuw
Copy link
Contributor Author

cbeuw commented Apr 7, 2021

Currently there isn't a impl ToString for FileName, hence it falls back to impl Display for FileName which returns the local_name for (de)virtualised variant:

// FIXME: might be nice to display both components of Devirtualized.
// But for now (to backport fix for issue #70924), best to not
// perturb diagnostics so its obvious test suite still works.
Real(RealFileName::Devirtualized { ref local_path, virtual_name: _ }) => {
write!(fmt, "{}", local_path.display())
}

There's one place where FileName::to_string() is used for debuginfo:

pub fn file_metadata(cx: &CodegenCx<'ll, '_>, source_file: &SourceFile) -> &'ll DIFile {
debug!("file_metadata: file_name: {}", source_file.name);
let hash = Some(&source_file.src_hash);
let file_name = Some(source_file.name.to_string());
let directory = if source_file.is_real_file() && !source_file.is_imported() {
Some(cx.sess().working_dir.0.to_string_lossy().to_string())
} else {
// If the path comes from an upstream crate we assume it has been made
// independent of the compiler's working directory one way or another.
None
};
file_metadata_raw(cx, file_name, directory, hash)
}

And used for panic location:

let const_loc = tcx.const_caller_location((
Symbol::intern(&caller.file.name.to_string()),
caller.line as u32,
caller.col_display as u32 + 1,
));

It's probably appropirate to manualy do impl ToString for FileName and leave impl Display for FileName exclusively responsible for diagnostics.

@michaelwoerister
Copy link
Member

Great find about the panic locations, @cbeuw! These are exactly the things we need to look for. I wonder if we can remove the Display implementation for FileName and require users to explicitly choose if they want the local or the stable version. It's really easy to get this wrong at the moment.

@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

rust-highfive added a commit to rust-lang-nursery/rust-toolstate that referenced this pull request May 12, 2021
Tested on commit rust-lang/rust@e1ff91f.
Direct link to PR: <rust-lang/rust#83813>

💔 miri on windows: test-fail → build-fail (cc @oli-obk @eddyb @RalfJung).
💔 miri on linux: test-fail → build-fail (cc @oli-obk @eddyb @RalfJung).
💔 rls on windows: test-pass → build-fail (cc @Xanewok).
💔 rls on linux: test-pass → build-fail (cc @Xanewok).
💔 rustfmt on windows: test-pass → build-fail (cc @calebcartwright @topecongiro).
💔 rustfmt on linux: test-pass → build-fail (cc @calebcartwright @topecongiro).
@bors bors mentioned this pull request May 12, 2021
@cbeuw cbeuw deleted the remap-std branch May 12, 2021 14:52
bors added a commit to rust-lang/miri that referenced this pull request May 12, 2021
Sync with rustc_span changes

rust-lang/rust#83813 made some changes to SourceMap and RealFileName. Now to get a string from a `rustc_span::FileName`, we need to specify if we would like the local filesystem path or remapped path via `--remap-path-prefix`. There seems to be only one place in miri that requires change.
@michaelwoerister
Copy link
Member

Thank you for all the hard work you have been putting into this, @cbeuw!

bors added a commit to rust-lang/rls that referenced this pull request May 15, 2021
Sync with rustc_span changes

rust-lang/rust#83813 made some changes to SourceMap and RealFileName. Now to get a string from a `rustc_span::FileName` or `RealFileName` (`working_dir` in `rustc_span::Session` is now a `RealFileName` because it may be remapped), we need to specify if we would like the local filesystem path or remapped path via `--remap-path-prefix`.

There are two files affected, the context very similar in both. I'm not entirely sure if we want the local path or remapped path here, so I just picked local path as a placeholder for opening this PR.

Closes - after updating rls module in rustc repo - rust-lang/rust#85225
bors added a commit to rust-lang/rls that referenced this pull request May 15, 2021
Sync with rustc_span changes

rust-lang/rust#83813 made some changes to SourceMap and RealFileName. Now to get a string from a `rustc_span::FileName` or `RealFileName` (`working_dir` in `rustc_span::Session` is now a `RealFileName` because it may be remapped), we need to specify if we would like the local filesystem path or remapped path via `--remap-path-prefix`.

There are two files affected, the context very similar in both. I'm not entirely sure if we want the local path or remapped path here, so I just picked local path as a placeholder for opening this PR.

Closes - after updating rls module in rustc repo - rust-lang/rust#85225
flip1995 pushed a commit to flip1995/rust that referenced this pull request May 20, 2021
Fix `--remap-path-prefix` not correctly remapping `rust-src` component paths and unify handling of path mapping with virtualized paths

This PR fixes rust-lang#73167 ("Binaries end up containing path to the rust-src component despite `--remap-path-prefix`") by preventing real local filesystem paths from reaching compilation output if the path is supposed to be remapped.

`RealFileName::Named` introduced in rust-lang#72767 is now renamed as `LocalPath`, because this variant wraps a (most likely) valid local filesystem path.

`RealFileName::Devirtualized` is renamed as `Remapped` to be used for remapped path from a real path via `--remap-path-prefix` argument, as well as real path inferred from a virtualized (during compiler bootstrapping) `/rustc/...` path. The `local_path` field is now an `Option<PathBuf>`, as it will be set to `None` before serialisation, so it never reaches any build output. Attempting to serialise a non-`None` `local_path` will cause an assertion faliure.

When a path is remapped, a `RealFileName::Remapped` variant is created. The original path is preserved in `local_path` field and the remapped path is saved in `virtual_name` field. Previously, the `local_path` is directly modified which goes against its purpose of "suitable for reading from the file system on the local host".

`rustc_span::SourceFile`'s fields `unmapped_path` (introduced by rust-lang#44940) and `name_was_remapped` (introduced by rust-lang#41508 when `--remap-path-prefix` feature originally added) are removed, as these two pieces of information can be inferred from the `name` field: if it's anything other than a `FileName::Real(_)`, or if it is a `FileName::Real(RealFileName::LocalPath(_))`, then clearly `name_was_remapped` would've been false and `unmapped_path` would've been `None`. If it is a `FileName::Real(RealFileName::Remapped{local_path, virtual_name})`, then `name_was_remapped` would've been true and `unmapped_path` would've been `Some(local_path)`.

cc `@eddyb` who implemented `/rustc/...` path devirtualisation
bjorn3 pushed a commit to bjorn3/rust that referenced this pull request May 27, 2021
Fix `--remap-path-prefix` not correctly remapping `rust-src` component paths and unify handling of path mapping with virtualized paths

This PR fixes rust-lang#73167 ("Binaries end up containing path to the rust-src component despite `--remap-path-prefix`") by preventing real local filesystem paths from reaching compilation output if the path is supposed to be remapped.

`RealFileName::Named` introduced in rust-lang#72767 is now renamed as `LocalPath`, because this variant wraps a (most likely) valid local filesystem path.

`RealFileName::Devirtualized` is renamed as `Remapped` to be used for remapped path from a real path via `--remap-path-prefix` argument, as well as real path inferred from a virtualized (during compiler bootstrapping) `/rustc/...` path. The `local_path` field is now an `Option<PathBuf>`, as it will be set to `None` before serialisation, so it never reaches any build output. Attempting to serialise a non-`None` `local_path` will cause an assertion faliure.

When a path is remapped, a `RealFileName::Remapped` variant is created. The original path is preserved in `local_path` field and the remapped path is saved in `virtual_name` field. Previously, the `local_path` is directly modified which goes against its purpose of "suitable for reading from the file system on the local host".

`rustc_span::SourceFile`'s fields `unmapped_path` (introduced by rust-lang#44940) and `name_was_remapped` (introduced by rust-lang#41508 when `--remap-path-prefix` feature originally added) are removed, as these two pieces of information can be inferred from the `name` field: if it's anything other than a `FileName::Real(_)`, or if it is a `FileName::Real(RealFileName::LocalPath(_))`, then clearly `name_was_remapped` would've been false and `unmapped_path` would've been `None`. If it is a `FileName::Real(RealFileName::Remapped{local_path, virtual_name})`, then `name_was_remapped` would've been true and `unmapped_path` would've been `Some(local_path)`.

cc `@eddyb` who implemented `/rustc/...` path devirtualisation
GuillaumeGomez added a commit to GuillaumeGomez/rust that referenced this pull request Sep 1, 2021
…cs, r=estebank

Path remapping: Make behavior of diagnostics output dependent on presence of --remap-path-prefix.

This PR fixes a regression (rust-lang#87745) with `--remap-path-prefix` where the flag stopped causing diagnostic messages to be remapped as well. The regression was introduced in rust-lang#83813 where we erroneously assumed that remapping of diagnostic messages was not desired anymore (because rust-lang#70642 partially undid that functionality with nobody objecting).

The issue is fixed by making `--remap-path-prefix` remap diagnostic messages again, including for paths that have been remapped in upstream crates (e.g. the standard library). This means that "sysroot-localization" (implemented in rust-lang#70642) is also disabled if `rustc` is invoked with `--remap-path-prefix`. The assumption is that once someone starts explicitly remapping paths they also don't want paths to their local Rust installation in their build output.

In the future we might want to give more fine-grained control over this behavior via compiler flags (see rust-lang/rfcs#3127 for a related RFC). For now this PR is intended as a regression fix.

This PR is an alternative to rust-lang#88191, which makes diagnostic messages be remapped unconditionally. That approach, however, would effectively revert rust-lang#70642.

Fixes rust-lang#87745.

cc `@cbeuw`
r? `@ghost`
GuillaumeGomez added a commit to GuillaumeGomez/rust that referenced this pull request Sep 2, 2021
…cs, r=estebank

Path remapping: Make behavior of diagnostics output dependent on presence of --remap-path-prefix.

This PR fixes a regression (rust-lang#87745) with `--remap-path-prefix` where the flag stopped causing diagnostic messages to be remapped as well. The regression was introduced in rust-lang#83813 where we erroneously assumed that remapping of diagnostic messages was not desired anymore (because rust-lang#70642 partially undid that functionality with nobody objecting).

The issue is fixed by making `--remap-path-prefix` remap diagnostic messages again, including for paths that have been remapped in upstream crates (e.g. the standard library). This means that "sysroot-localization" (implemented in rust-lang#70642) is also disabled if `rustc` is invoked with `--remap-path-prefix`. The assumption is that once someone starts explicitly remapping paths they also don't want paths to their local Rust installation in their build output.

In the future we might want to give more fine-grained control over this behavior via compiler flags (see rust-lang/rfcs#3127 for a related RFC). For now this PR is intended as a regression fix.

This PR is an alternative to rust-lang#88191, which makes diagnostic messages be remapped unconditionally. That approach, however, would effectively revert rust-lang#70642.

Fixes rust-lang#87745.

cc ``@cbeuw``
r? ``@ghost``
m-ou-se added a commit to m-ou-se/rust that referenced this pull request Sep 2, 2021
…cs, r=estebank

Path remapping: Make behavior of diagnostics output dependent on presence of --remap-path-prefix.

This PR fixes a regression (rust-lang#87745) with `--remap-path-prefix` where the flag stopped causing diagnostic messages to be remapped as well. The regression was introduced in rust-lang#83813 where we erroneously assumed that remapping of diagnostic messages was not desired anymore (because rust-lang#70642 partially undid that functionality with nobody objecting).

The issue is fixed by making `--remap-path-prefix` remap diagnostic messages again, including for paths that have been remapped in upstream crates (e.g. the standard library). This means that "sysroot-localization" (implemented in rust-lang#70642) is also disabled if `rustc` is invoked with `--remap-path-prefix`. The assumption is that once someone starts explicitly remapping paths they also don't want paths to their local Rust installation in their build output.

In the future we might want to give more fine-grained control over this behavior via compiler flags (see rust-lang/rfcs#3127 for a related RFC). For now this PR is intended as a regression fix.

This PR is an alternative to rust-lang#88191, which makes diagnostic messages be remapped unconditionally. That approach, however, would effectively revert rust-lang#70642.

Fixes rust-lang#87745.

cc `@cbeuw`
r? `@ghost`
bors added a commit to rust-lang-ci/rust that referenced this pull request Sep 3, 2021
…, r=estebank

Path remapping: Make behavior of diagnostics output dependent on presence of --remap-path-prefix.

This PR fixes a regression (rust-lang#87745) with `--remap-path-prefix` where the flag stopped causing diagnostic messages to be remapped as well. The regression was introduced in rust-lang#83813 where we erroneously assumed that remapping of diagnostic messages was not desired anymore (because rust-lang#70642 partially undid that functionality with nobody objecting).

The issue is fixed by making `--remap-path-prefix` remap diagnostic messages again, including for paths that have been remapped in upstream crates (e.g. the standard library). This means that "sysroot-localization" (implemented in rust-lang#70642) is also disabled if `rustc` is invoked with `--remap-path-prefix`. The assumption is that once someone starts explicitly remapping paths they also don't want paths to their local Rust installation in their build output.

In the future we might want to give more fine-grained control over this behavior via compiler flags (see rust-lang/rfcs#3127 for a related RFC). For now this PR is intended as a regression fix.

This PR is an alternative to rust-lang#88191, which makes diagnostic messages be remapped unconditionally. That approach, however, would effectively revert rust-lang#70642.

Fixes rust-lang#87745.

cc `@cbeuw`
r? `@ghost`
bjorn3 pushed a commit to rust-lang/rustc_codegen_cranelift that referenced this pull request Sep 7, 2021
Path remapping: Make behavior of diagnostics output dependent on presence of --remap-path-prefix.

This PR fixes a regression (#87745) with `--remap-path-prefix` where the flag stopped causing diagnostic messages to be remapped as well. The regression was introduced in rust-lang/rust#83813 where we erroneously assumed that remapping of diagnostic messages was not desired anymore (because #70642 partially undid that functionality with nobody objecting).

The issue is fixed by making `--remap-path-prefix` remap diagnostic messages again, including for paths that have been remapped in upstream crates (e.g. the standard library). This means that "sysroot-localization" (implemented in #70642) is also disabled if `rustc` is invoked with `--remap-path-prefix`. The assumption is that once someone starts explicitly remapping paths they also don't want paths to their local Rust installation in their build output.

In the future we might want to give more fine-grained control over this behavior via compiler flags (see rust-lang/rfcs#3127 for a related RFC). For now this PR is intended as a regression fix.

This PR is an alternative to rust-lang/rust#88191, which makes diagnostic messages be remapped unconditionally. That approach, however, would effectively revert #70642.

Fixes rust-lang/rust#87745.

cc `@cbeuw`
r? `@ghost`
flip1995 pushed a commit to flip1995/rust-clippy that referenced this pull request Sep 8, 2021
Path remapping: Make behavior of diagnostics output dependent on presence of --remap-path-prefix.

This PR fixes a regression (#87745) with `--remap-path-prefix` where the flag stopped causing diagnostic messages to be remapped as well. The regression was introduced in rust-lang/rust#83813 where we erroneously assumed that remapping of diagnostic messages was not desired anymore (because #70642 partially undid that functionality with nobody objecting).

The issue is fixed by making `--remap-path-prefix` remap diagnostic messages again, including for paths that have been remapped in upstream crates (e.g. the standard library). This means that "sysroot-localization" (implemented in #70642) is also disabled if `rustc` is invoked with `--remap-path-prefix`. The assumption is that once someone starts explicitly remapping paths they also don't want paths to their local Rust installation in their build output.

In the future we might want to give more fine-grained control over this behavior via compiler flags (see rust-lang/rfcs#3127 for a related RFC). For now this PR is intended as a regression fix.

This PR is an alternative to rust-lang/rust#88191, which makes diagnostic messages be remapped unconditionally. That approach, however, would effectively revert #70642.

Fixes rust-lang/rust#87745.

cc `@cbeuw`
r? `@ghost`
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-reproducibility Area: Reproducible / deterministic builds merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-rustdoc Relevant to the rustdoc team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Binaries end up containing path to the rust-src component despite --remap-path-prefix