Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[breaking change] Validate crate name in --extern [MCP 650] #116001

Merged
merged 1 commit into from
Sep 22, 2023

Conversation

fmease
Copy link
Member

@fmease fmease commented Sep 20, 2023

Reject non-ASCII-identifier crate names passed to the CLI option --extern (rustc, rustdoc).
Implements MCP 650 (except that we only allow ASCII identifiers not arbitrary Rust identifiers).
Fixes #113035.

As mentioned on Zulip, doing a crater run probably doesn't make sense since it wouldn't yield anything. Most users don't interact with rustc directly but only ever through Cargo which always passes a valid crate name to --extern when it invokes rustc and rustdoc. In any case, the user wouldn't be able to use such a crate name in the source code anyway.

Note that I'm not using rustc_session::output::validate_crate_name (used for --crate-name and #![crate_name]) since the latter doesn't reject non-ASCII crate names and ones that start with a digit.

As an aside, I've also thought about getting rid of validate_crate_name entirely in a separate PR (with another MCP) in favor of is_ascii_ident to reject more weird --crate-names, #![crate_name]s and file names but I think that would lead to a lot of actual breakage, namely because of file names starting with a digit. In tests/ui 9 tests would be impacted for example.

CC @estebank
r? @est31

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Sep 20, 2023
@rust-log-analyzer

This comment has been minimized.

@@ -2451,6 +2451,19 @@ pub fn parse_externs(
Some((opts, name)) => (Some(opts), name.to_string()),
};

if !rustc_lexer::is_ident(&name) {
let mut error = handler.early_struct_error(format!(
"crate name `{name}` passed to `--extern` is not a valid identifier"
Copy link
Member Author

@fmease fmease Sep 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I briefly considered Debug-printing name instead (via {:?} or via .escape_debug()) to escape line breaks but it looked confusing in the common case (backticks plus double quotes or \n in backticks which the user didn't write).

@@ -1710,6 +1710,15 @@ impl EarlyErrorHandler {
self.handler.struct_fatal(msg).emit()
}

#[allow(rustc::untranslatable_diagnostic)]
#[allow(rustc::diagnostic_outside_of_impl)]
pub(crate) fn early_struct_error(
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's unfortunate that I had to introduce this method. I can get rid of it again at the cost of not showing the .help("replace dashes with underscores …").

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's fine.

@fmease fmease force-pushed the validate-crate-name-extern-cli-opt branch from 04de89d to c3e8204 Compare September 20, 2023 13:26
@rustbot
Copy link
Collaborator

rustbot commented Sep 20, 2023

These commits modify the Cargo.lock file. Unintentional changes to Cargo.lock can be introduced when switching branches and rebasing PRs.

If this was unintentional then you should revert the changes before this PR is merged.
Otherwise, you can ignore this comment.

@ehuss
Copy link
Contributor

ehuss commented Sep 20, 2023

One thing that might want to be considered is that generally rustc requires crate names to be ASCII. That is, you cannot reference an extern crate with unicode characters. Should this also be checking name.as_str().is_ascii()?

@fmease
Copy link
Member Author

fmease commented Sep 20, 2023

One thing that might want to be considered is that generally rustc requires crate names to be ASCII. That is, you cannot reference an extern crate with unicode characters. Should this also be checking name.as_str().is_ascii()?

Hmm, good point! That's very interesting since rustc actually allows you to compile crates with a non-ASCII crate name & link them via --extern and it only fails once you actually try to reference the crate name in the code (cannot load a crate with a non-ascii name).

I'm gonna check if they're ASCII then 👍

@rust-log-analyzer

This comment has been minimized.

@fmease fmease added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Sep 20, 2023
@fmease fmease marked this pull request as draft September 20, 2023 13:59
@fmease fmease added the A-CLI Area: Command-line interface (CLI) to the compiler label Sep 20, 2023
@fmease fmease force-pushed the validate-crate-name-extern-cli-opt branch from c3e8204 to 192c353 Compare September 20, 2023 16:09
@@ -15,7 +15,7 @@ INCR=$(TMPDIR)/incr
all:
cp first_crate.rs second_crate.rs $(TMPDIR)
$(RUSTC) $(TMPDIR)/first_crate.rs -C incremental=$(INCR) --target $(TARGET) --crate-type lib
$(RUSTC) $(TMPDIR)/second_crate.rs -C incremental=$(INCR) --target $(TARGET) --extern first-crate=$(TMPDIR) --crate-type lib
$(RUSTC) $(TMPDIR)/second_crate.rs -C incremental=$(INCR) --target $(TARGET) --extern first_crate=$(TMPDIR)/libfirst_crate.rlib --crate-type lib
Copy link
Member Author

@fmease fmease Sep 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still baffled that this test used to work at all, can somebody enlighten me? If I manually follow the steps as taken by this Makefile, I correctly get the error can't find crate for first_crate with a nightly rustc (since the name is first-crate on master). Is there any preprocessing going on here?

Apart from the incorrect crate name, --extern can't point to a directory, which I've also fixed. What is going on?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is weird indeed. I've checked the logs of recently successful runs and tests/run-make/incr-foreign-head-span does succeed on master. Very weird.

@fmease fmease added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Sep 20, 2023
@fmease fmease marked this pull request as ready for review September 20, 2023 16:18
@fmease fmease force-pushed the validate-crate-name-extern-cli-opt branch from 192c353 to 58a610f Compare September 20, 2023 16:23
@rust-log-analyzer

This comment has been minimized.

@fmease fmease force-pushed the validate-crate-name-extern-cli-opt branch from 58a610f to 8d81d5a Compare September 20, 2023 16:51
Copy link
Member Author

@fmease fmease Sep 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test has become obsolete since it would've triggered on the --extern first, instead of on the use item. It's superseded by extern-flag/invalid-crate-name-non-ascii.rs.

Comment on lines +2454 to +2465
if !crate::utils::is_ascii_ident(&name) {
let mut error = handler.early_struct_error(format!(
"crate name `{name}` passed to `--extern` is not a valid ASCII identifier"
));
let adjusted_name = name.replace("-", "_");
if crate::utils::is_ascii_ident(&adjusted_name) {
error.help(format!(
"consider replacing the dashes with underscores: `{adjusted_name}`"
));
}
error.emit();
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably not worth it given how rare this error will be in practice, but on other non-ASCII errors we go beyond replacing - with _ and try to provide more context on which chars were problematic and try to provide suggestions to similar ASCII chars when possible. This is particularly useful for homonyms, where someone is trying to sneak (on purpose or accidentally) a, for example, cyrillic char.

Copy link
Member Author

@fmease fmease Sep 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The lints uncommon_codepoints, mixed_script_confusables and confusable_idents are amazing for sure! I'd lean towards not doing that here since it'd mean moving the code from rustc_lint here and adapting it to an environment that doesn't have a Session yet, yuck! Not sure how easy that'd be, maybe it's not that bad. early_errors can't even be translated right now.

Esp. since --extern is not actually used directly as we've established (apart from rustc devs). Alternative Rust package managers should be able to figure out what to pass to --extern relatively quickly.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fmease fair points 👍 . It can also be later work, too.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My stance is always "don't block on improving a solution short of perfect as long as it improves over the status quo", and this PR comfortably glides over that bar.

@estebank estebank added the relnotes Marks issues that should be documented in the release notes of the next release. label Sep 20, 2023
@est31
Copy link
Member

est31 commented Sep 21, 2023

@bors r+

@bors
Copy link
Contributor

bors commented Sep 21, 2023

📌 Commit 8d81d5a has been approved by est31

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Sep 21, 2023
@bors
Copy link
Contributor

bors commented Sep 22, 2023

⌛ Testing commit 8d81d5a with merge aadb571...

@bors
Copy link
Contributor

bors commented Sep 22, 2023

☀️ Test successful - checks-actions
Approved by: est31
Pushing aadb571 to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Sep 22, 2023
@bors bors merged commit aadb571 into rust-lang:master Sep 22, 2023
11 checks passed
@rustbot rustbot added this to the 1.74.0 milestone Sep 22, 2023
@fmease fmease deleted the validate-crate-name-extern-cli-opt branch September 22, 2023 21:46
@rust-timer
Copy link
Collaborator

Finished benchmarking commit (aadb571): comparison URL.

Overall result: no relevant changes - no action needed

@rustbot label: -perf-regression

Instruction count

This benchmark run did not return any relevant results for this metric.

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
3.0% [2.4%, 3.6%] 4
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) - - 0

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
-2.8% [-3.6%, -2.2%] 5
All ❌✅ (primary) - - 0

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 635.567s -> 635.843s (0.04%)
Artifact size: 317.55 MiB -> 317.47 MiB (-0.03%)

netbsd-srcmastr pushed a commit to NetBSD/pkgsrc that referenced this pull request Jan 8, 2024
Pkgsrc changes:

 * Remove NetBSD-8 support (embedded LLVm requires newer C++
   than what is in -8; it's conceivable that this could still
   build with an external LLVM)
 * undo powerpc 9.0 file naming tweak, since we no longer support -8.
 * Remove patch to LLVM for powerpc now included by upstream.
 * Minor adjustments, checksum changes etc.


Upstream changes:

Version 1.74.1 (2023-12-07)
===========================

- [Resolved spurious STATUS_ACCESS_VIOLATIONs in LLVM]
  (rust-lang/rust#118464)
- [Clarify guarantees for std::mem::discriminant]
  (rust-lang/rust#118006)
- [Fix some subtyping-related regressions]
  (rust-lang/rust#116415)

Version 1.74.0 (2023-11-16)
==========================

Language
--------

- [Codify that `std::mem::Discriminant<T>` does not depend on any
  lifetimes in T]
  (rust-lang/rust#104299)
- [Replace `private_in_public` lint with `private_interfaces` and
  `private_bounds` per RFC 2145]
  (rust-lang/rust#113126)
  Read more in
  [RFC 2145](https://rust-lang.github.io/rfcs/2145-type-privacy.html).
- [Allow explicit `#[repr(Rust)]`]
  (rust-lang/rust#114201)
- [closure field capturing: don't depend on alignment of packed fields]
  (rust-lang/rust#115315)
- [Enable MIR-based drop-tracking for `async` blocks]
  (rust-lang/rust#107421)

Compiler
--------

- [stabilize combining +bundle and +whole-archive link modifiers]
  (rust-lang/rust#113301)
- [Stabilize `PATH` option for `--print KIND=PATH`]
  (rust-lang/rust#114183)
- [Enable ASAN/LSAN/TSAN for `*-apple-ios-macabi`]
  (rust-lang/rust#115644)
- [Promote loongarch64-unknown-none* to Tier 2]
  (rust-lang/rust#115368)
- [Add `i686-pc-windows-gnullvm` as a tier 3 target]
  (rust-lang/rust#115687)

Libraries
---------

- [Implement `From<OwnedFd/Handle>` for ChildStdin/out/err]
  (rust-lang/rust#98704)
- [Implement `From<{&,&mut} [T; N]>` for `Vec<T>` where `T: Clone`]
  (rust-lang/rust#111278)
- [impl Step for IP addresses]
  (rust-lang/rust#113748)
- [Implement `From<[T; N]>` for `Rc<[T]>` and `Arc<[T]>`]
  (rust-lang/rust#114041)
- [`impl TryFrom<char> for u16`]
  (rust-lang/rust#114065)
- [Stabilize `io_error_other` feature]
  (rust-lang/rust#115453)
- [Stabilize the `Saturating` type]
  (rust-lang/rust#115477)
- [Stabilize const_transmute_copy]
  (rust-lang/rust#115520)

Stabilized APIs
---------------

- [`core::num::Saturating`]
  (https://doc.rust-lang.org/stable/std/num/struct.Saturating.html)
- [`impl From<io::Stdout> for std::process::Stdio`]
  (https://doc.rust-lang.org/stable/std/process/struct.Stdio.html#impl-From%3CStdout%3E-for-Stdio)
- [`impl From<io::Stderr> for std::process::Stdio`]
  (https://doc.rust-lang.org/stable/std/process/struct.Stdio.html#impl-From%3CStderr%3E-for-Stdio)
- [`impl From<OwnedHandle> for std::process::Child{Stdin, Stdout, Stderr}`]
  (https://doc.rust-lang.org/stable/std/process/struct.Stdio.html#impl-From%3CStderr%3E-for-Stdio)
- [`impl From<OwnedFd> for std::process::Child{Stdin, Stdout, Stderr}`]
  (https://doc.rust-lang.org/stable/std/process/struct.Stdio.html#impl-From%3CStderr%3E-for-Stdio)
- [`std::ffi::OsString::from_encoded_bytes_unchecked`]
  (https://doc.rust-lang.org/stable/std/ffi/struct.OsString.html#method.from_encoded_bytes_unchecked)
- [`std::ffi::OsString::into_encoded_bytes`]
  (https://doc.rust-lang.org/stable/std/ffi/struct.OsString.html#method.into_encoded_bytes)
- [`std::ffi::OsStr::from_encoded_bytes_unchecked`]
  (https://doc.rust-lang.org/stable/std/ffi/struct.OsStr.html#method.from_encoded_bytes_unchecked)
- [`std::ffi::OsStr::as_encoded_bytes`]
  (https://doc.rust-lang.org/stable/std/ffi/struct.OsStr.html#method.as_encoded_bytes)
- [`std::io::Error::other`]
  (https://doc.rust-lang.org/stable/std/io/struct.Error.html#method.other)
- [`impl TryFrom<char> for u16`]
  (https://doc.rust-lang.org/stable/std/primitive.u16.html#impl-TryFrom%3Cchar%3E-for-u16)
- [`impl<T: Clone, const N: usize> From<&[T; N]> for Vec<T>`]
  (https://doc.rust-lang.org/stable/std/vec/struct.Vec.html#impl-From%3C%26%5BT;+N%5D%3E-for-Vec%3CT,+Global%3E)
- [`impl<T: Clone, const N: usize> From<&mut [T; N]> for Vec<T>`]
  (https://doc.rust-lang.org/stable/std/vec/struct.Vec.html#impl-From%3C%26mut+%5BT;+N%5D%3E-for-Vec%3CT,+Global%3E)
- [`impl<T, const N: usize> From<[T; N]> for Arc<[T]>`]
  (https://doc.rust-lang.org/stable/std/sync/struct.Arc.html#impl-From%3C%5BT;+N%5D%3E-for-Arc%3C%5BT%5D,+Global%3E)
- [`impl<T, const N: usize> From<[T; N]> for Rc<[T]>`]
  (https://doc.rust-lang.org/stable/std/rc/struct.Rc.html#impl-From%3C%5BT;+N%5D%3E-for-Rc%3C%5BT%5D,+Global%3E)

These APIs are now stable in const contexts:

- [`core::mem::transmute_copy`]
  (https://doc.rust-lang.org/beta/std/mem/fn.transmute_copy.html)
- [`str::is_ascii`]
  (https://doc.rust-lang.org/beta/std/primitive.str.html#method.is_ascii)
- [`[u8]::is_ascii`]
  (https://doc.rust-lang.org/beta/std/primitive.slice.html#method.is_ascii)

Cargo
-----

- [fix: Set MSRV for internal packages]
  (rust-lang/cargo#12381)
- [config: merge lists in precedence order]
  (rust-lang/cargo#12515)
- [fix(update): Clarify meaning of --aggressive as --recursive]
  (rust-lang/cargo#12544)
- [fix(update): Make `-p` more convenient by being positional]
  (rust-lang/cargo#12545)
- [feat(help): Add styling to help output ]
  (rust-lang/cargo#12578)
- [feat(pkgid): Allow incomplete versions when unambigious]
  (rust-lang/cargo#12614)
- [feat: stabilize credential-process and registry-auth]
  (rust-lang/cargo#12649)
- [feat(cli): Add '-n' to dry-run]
  (rust-lang/cargo#12660)
- [Add support for `target.'cfg(..)'.linker`]
  (rust-lang/cargo#12535)
- [Stabilize `--keep-going`]
  (rust-lang/cargo#12568)
- [feat: Stabilize lints]
  (rust-lang/cargo#12648)

Rustdoc
-------

- [Add warning block support in rustdoc]
  (rust-lang/rust#106561)
- [Accept additional user-defined syntax classes in fenced code blocks]
  (rust-lang/rust#110800)
- [rustdoc-search: add support for type parameters]
  (rust-lang/rust#112725)
- [rustdoc: show inner enum and struct in type definition for concrete type]
  (rust-lang/rust#114855)

Compatibility Notes
-------------------

- [Raise minimum supported Apple OS versions]
  (rust-lang/rust#104385)
- [make Cell::swap panic if the Cells partially overlap]
  (rust-lang/rust#114795)
- [Reject invalid crate names in `--extern`]
  (rust-lang/rust#116001)
- [Don't resolve generic impls that may be shadowed by dyn built-in impls]
  (rust-lang/rust#114941)

Internal Changes
----------------

These changes do not affect any public interfaces of Rust, but they represent
significant improvements to the performance or internals of rustc and related
tools.

None this cycle.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-CLI Area: Command-line interface (CLI) to the compiler merged-by-bors This PR was explicitly merged by bors. relnotes Marks issues that should be documented in the release notes of the next release. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Crate name found in --extern gets suggested verbatim on unresolved import
8 participants