Improve unused_unsafe lint

Main motivation: Fixes some issues with the current behavior. This PR is more-or-less completely re-implementing the unused_unsafe lint; it’s also only done in the MIR-version of the lint, the set of tests for the `-Zthir-unsafeck` version no longer succeeds (and is thus disabled, see `lint-unused-unsafe.rs`). On current nightly, ```rs unsafe fn unsf() {} fn inner_ignored() { unsafe { #[allow(unused_unsafe)] unsafe { unsf() } } } ``` doesn’t create any warnings. This situation is not unrealistic to come by, the inner `unsafe` block could e.g. come from a macro. Actually, this PR even includes removal of one unused `unsafe` in the standard library that was missed in a similar situation. (The inner `unsafe` coming from an external macro hides the warning, too.) The reason behind this problem is how the check currently works: * While generating MIR, it already skips nested unsafe blocks (i.e. unsafe nested in other unsafe) so that the inner one is always the one considered unused * To differentiate the cases of no unsafe operations inside the `unsafe` vs. a surrounding `unsafe` block, there’s some ad-hoc magic walking up the HIR to look for surrounding used `unsafe` blocks. There’s a lot of problems with this approach besides the one presented above. E.g. the MIR-building uses checks for `unsafe_op_in_unsafe_fn` lint to decide early whether or not `unsafe` blocks in an `unsafe fn` are redundant and ought to be removed. ```rs unsafe fn granular_disallow_op_in_unsafe_fn() { unsafe { #[deny(unsafe_op_in_unsafe_fn)] { unsf(); } } } ``` ``` error: call to unsafe function is unsafe and requires unsafe block (error E0133) --> src/main.rs:13:13 | 13 | unsf(); | ^^^^^^ call to unsafe function | note: the lint level is defined here --> src/main.rs:11:16 | 11 | #[deny(unsafe_op_in_unsafe_fn)] | ^^^^^^^^^^^^^^^^^^^^^^ = note: consult the function's documentation for information on how to avoid undefined behavior warning: unnecessary `unsafe` block --> src/main.rs:10:5 | 9 | unsafe fn granular_disallow_op_in_unsafe_fn() { | --------------------------------------------- because it's nested under this `unsafe` fn 10 | unsafe { | ^^^^^^ unnecessary `unsafe` block | = note: `#[warn(unused_unsafe)]` on by default ``` Here, the intermediate `unsafe` was ignored, even though it contains a unsafe operation that is not allowed to happen in an `unsafe fn` without an additional `unsafe` block. Also closures were problematic and the workaround/algorithms used on current nightly didn’t work properly. (I skipped trying to fully understand what it was supposed to do, because this PR uses a completely different approach.) ```rs fn nested() { unsafe { unsafe { unsf() } } } ``` ``` warning: unnecessary `unsafe` block --> src/main.rs:10:9 | 9 | unsafe { | ------ because it's nested under this `unsafe` block 10 | unsafe { unsf() } | ^^^^^^ unnecessary `unsafe` block | = note: `#[warn(unused_unsafe)]` on by default ``` vs ```rs fn nested() { let _ = || unsafe { let _ = || unsafe { unsf() }; }; } ``` ``` warning: unnecessary `unsafe` block --> src/main.rs:9:16 | 9 | let _ = || unsafe { | ^^^^^^ unnecessary `unsafe` block | = note: `#[warn(unused_unsafe)]` on by default warning: unnecessary `unsafe` block --> src/main.rs:10:20 | 10 | let _ = || unsafe { unsf() }; | ^^^^^^ unnecessary `unsafe` block ``` *note that this warning kind-of suggests that **both** unsafe blocks are redundant* -------------------------------------------------------------------------------- I also dislike the fact that it always suggests keeping the outermost `unsafe`. E.g. for ```rs fn granularity() { unsafe { unsafe { unsf() } unsafe { unsf() } unsafe { unsf() } } } ``` I prefer if `rustc` suggests removing the more-course outer-level `unsafe` instead of the fine-grained inner `unsafe` blocks, which it currently does on nightly: ``` warning: unnecessary `unsafe` block --> src/main.rs:10:9 | 9 | unsafe { | ------ because it's nested under this `unsafe` block 10 | unsafe { unsf() } | ^^^^^^ unnecessary `unsafe` block | = note: `#[warn(unused_unsafe)]` on by default warning: unnecessary `unsafe` block --> src/main.rs:11:9 | 9 | unsafe { | ------ because it's nested under this `unsafe` block 10 | unsafe { unsf() } 11 | unsafe { unsf() } | ^^^^^^ unnecessary `unsafe` block warning: unnecessary `unsafe` block --> src/main.rs:12:9 | 9 | unsafe { | ------ because it's nested under this `unsafe` block ... 12 | unsafe { unsf() } | ^^^^^^ unnecessary `unsafe` block ``` -------------------------------------------------------------------------------- Needless to say, this PR addresses all these points. For context, as far as my understanding goes, the main advantage of skipping inner unsafe blocks was that a test case like ```rs fn top_level_used() { unsafe { unsf(); unsafe { unsf() } unsafe { unsf() } unsafe { unsf() } } } ``` should generate some warning because there’s redundant nested `unsafe`, however every single `unsafe` block _does_ contain some statement that uses it. Of course this PR doesn’t aim change the warnings on this kind of code example, because the current behavior, warning on all the inner `unsafe` blocks, makes sense in this case. As mentioned, during MIR building all the unsafe blocks *are* kept now, and usage is attributed to them. The way to still generate a warning like ``` warning: unnecessary `unsafe` block --> src/main.rs:11:9 | 9 | unsafe { | ------ because it's nested under this `unsafe` block 10 | unsf(); 11 | unsafe { unsf() } | ^^^^^^ unnecessary `unsafe` block | = note: `#[warn(unused_unsafe)]` on by default warning: unnecessary `unsafe` block --> src/main.rs:12:9 | 9 | unsafe { | ------ because it's nested under this `unsafe` block ... 12 | unsafe { unsf() } | ^^^^^^ unnecessary `unsafe` block warning: unnecessary `unsafe` block --> src/main.rs:13:9 | 9 | unsafe { | ------ because it's nested under this `unsafe` block ... 13 | unsafe { unsf() } | ^^^^^^ unnecessary `unsafe` block ``` in this case is by emitting a `unused_unsafe` warning for all of the `unsafe` blocks that are _within a **used** unsafe block_. The previous code had a little HIR traversal already anyways to collect a set of all the unsafe blocks (in order to afterwards determine which ones are unused afterwards). This PR uses such a traversal to do additional things including logic like _always_ warn for an `unsafe` block that’s inside of another **used** unsafe block. The traversal is expanded to include nested closures in the same go, this simplifies a lot of things. The whole logic around `unsafe_op_in_unsafe_fn` is a little complicated, there’s some test cases of corner-cases in this PR. (The implementation involves differentiating between whether a used unsafe block was used exclusively by operations where `allow(unsafe_op_in_unsafe_fn)` was active.) The main goal was to make sure that code should compile successfully if all the `unused_unsafe`-warnings are addressed _simultaneously_ (by removing the respective `unsafe` blocks) no matter how complicated the patterns of `unsafe_op_in_unsafe_fn` being disallowed and allowed throughout the function are. -------------------------------------------------------------------------------- One noteworthy design decision I took here: An `unsafe` block with `allow(unused_unsafe)` **is considered used** for the purposes of linting about redundant contained unsafe blocks. So while ```rs fn granularity() { unsafe { //~ ERROR: unnecessary `unsafe` block unsafe { unsf() } unsafe { unsf() } unsafe { unsf() } } } ``` warns for the outer `unsafe` block, ```rs fn top_level_ignored() { #[allow(unused_unsafe)] unsafe { #[deny(unused_unsafe)] { unsafe { unsf() } //~ ERROR: unnecessary `unsafe` block unsafe { unsf() } //~ ERROR: unnecessary `unsafe` block unsafe { unsf() } //~ ERROR: unnecessary `unsafe` block } } } ``` warns on the inner ones.
rust-lang · Feb 20, 2022 · 8f8689f · 8f8689f
1 parent c1aa854
commit 8f8689f
Show file tree

Hide file tree

Showing 11 changed files with 3,209 additions and 244 deletions.
diff --git a/compiler/rustc_codegen_llvm/src/abi.rs b/compiler/rustc_codegen_llvm/src/abi.rs
@@ -599,13 +599,11 @@ impl<'ll, 'tcx> FnAbiLlvmExt<'ll, 'tcx> for FnAbi<'tcx, Ty<'tcx>> {
         if self.conv == Conv::CCmseNonSecureCall {
             // This will probably get ignored on all targets but those supporting the TrustZone-M
             // extension (thumbv8m targets).
-            unsafe {
-                llvm::AddCallSiteAttrString(
-                    callsite,
-                    llvm::AttributePlace::Function,
-                    cstr::cstr!("cmse_nonsecure_call"),
-                );
-            }
+            llvm::AddCallSiteAttrString(
+                callsite,
+                llvm::AttributePlace::Function,
+                cstr::cstr!("cmse_nonsecure_call"),
+            );
         }
     }
 }

diff --git a/compiler/rustc_middle/src/hir/nested_filter.rs b/compiler/rustc_middle/src/hir/nested_filter.rs
@@ -3,6 +3,10 @@ use rustc_hir::intravisit::nested_filter::NestedFilter;
 /// Do not visit nested item-like things, but visit nested things
 /// that are inside of an item-like.
 ///
+/// Notably, possible occurrences of bodies in non-item-like things
+/// include: closures/generators, inline `const {}` blocks, and
+/// constant arguments of types, e.g. in `let _: [(); /* HERE */];`.
+///
 /// **This is the most common choice.** A very common pattern is
 /// to use `visit_all_item_likes()` as an outer loop,
 /// and to have the visitor that visits the contents of each item

diff --git a/compiler/rustc_middle/src/lint.rs b/compiler/rustc_middle/src/lint.rs
@@ -202,6 +202,77 @@ impl<'a> LintDiagnosticBuilder<'a> {
     }
 }
 
+pub fn explain_lint_level_source<'s>(
+    sess: &'s Session,
+    lint: &'static Lint,
+    level: Level,
+    src: LintLevelSource,
+    err: &mut DiagnosticBuilder<'s>,
+) {
+    let name = lint.name_lower();
+    match src {
+        LintLevelSource::Default => {
+            sess.diag_note_once(
+                err,
+                DiagnosticMessageId::from(lint),
+                &format!("`#[{}({})]` on by default", level.as_str(), name),
+            );
+        }
+        LintLevelSource::CommandLine(lint_flag_val, orig_level) => {
+            let flag = match orig_level {
+                Level::Warn => "-W",
+                Level::Deny => "-D",
+                Level::Forbid => "-F",
+                Level::Allow => "-A",
+                Level::ForceWarn => "--force-warn",
+            };
+            let hyphen_case_lint_name = name.replace('_', "-");
+            if lint_flag_val.as_str() == name {
+                sess.diag_note_once(
+                    err,
+                    DiagnosticMessageId::from(lint),
+                    &format!(
+                        "requested on the command line with `{} {}`",
+                        flag, hyphen_case_lint_name
+                    ),
+                );
+            } else {
+                let hyphen_case_flag_val = lint_flag_val.as_str().replace('_', "-");
+                sess.diag_note_once(
+                    err,
+                    DiagnosticMessageId::from(lint),
+                    &format!(
+                        "`{} {}` implied by `{} {}`",
+                        flag, hyphen_case_lint_name, flag, hyphen_case_flag_val
+                    ),
+                );
+            }
+        }
+        LintLevelSource::Node(lint_attr_name, src, reason) => {
+            if let Some(rationale) = reason {
+                err.note(rationale.as_str());
+            }
+            sess.diag_span_note_once(
+                err,
+                DiagnosticMessageId::from(lint),
+                src,
+                "the lint level is defined here",
+            );
+            if lint_attr_name.as_str() != name {
+                let level_str = level.as_str();
+                sess.diag_note_once(
+                    err,
+                    DiagnosticMessageId::from(lint),
+                    &format!(
+                        "`#[{}({})]` implied by `#[{}({})]`",
+                        level_str, name, level_str, lint_attr_name
+                    ),
+                );
+            }
+        }
+    }
+}
+
 pub fn struct_lint_level<'s, 'd>(
     sess: &'s Session,
     lint: &'static Lint,
@@ -277,69 +348,9 @@ pub fn struct_lint_level<'s, 'd>(
             }
         }
 
-        let name = lint.name_lower();
-        match src {
-            LintLevelSource::Default => {
-                sess.diag_note_once(
-                    &mut err,
-                    DiagnosticMessageId::from(lint),
-                    &format!("`#[{}({})]` on by default", level.as_str(), name),
-                );
-            }
-            LintLevelSource::CommandLine(lint_flag_val, orig_level) => {
-                let flag = match orig_level {
-                    Level::Warn => "-W",
-                    Level::Deny => "-D",
-                    Level::Forbid => "-F",
-                    Level::Allow => "-A",
-                    Level::ForceWarn => "--force-warn",
-                };
-                let hyphen_case_lint_name = name.replace('_', "-");
-                if lint_flag_val.as_str() == name {
-                    sess.diag_note_once(
-                        &mut err,
-                        DiagnosticMessageId::from(lint),
-                        &format!(
-                            "requested on the command line with `{} {}`",
-                            flag, hyphen_case_lint_name
-                        ),
-                    );
-                } else {
-                    let hyphen_case_flag_val = lint_flag_val.as_str().replace('_', "-");
-                    sess.diag_note_once(
-                        &mut err,
-                        DiagnosticMessageId::from(lint),
-                        &format!(
-                            "`{} {}` implied by `{} {}`",
-                            flag, hyphen_case_lint_name, flag, hyphen_case_flag_val
-                        ),
-                    );
-                }
-            }
-            LintLevelSource::Node(lint_attr_name, src, reason) => {
-                if let Some(rationale) = reason {
-                    err.note(rationale.as_str());
-                }
-                sess.diag_span_note_once(
-                    &mut err,
-                    DiagnosticMessageId::from(lint),
-                    src,
-                    "the lint level is defined here",
-                );
-                if lint_attr_name.as_str() != name {
-                    let level_str = level.as_str();
-                    sess.diag_note_once(
-                        &mut err,
-                        DiagnosticMessageId::from(lint),
-                        &format!(
-                            "`#[{}({})]` implied by `#[{}({})]`",
-                            level_str, name, level_str, lint_attr_name
-                        ),
-                    );
-                }
-            }
-        }
+        explain_lint_level_source(sess, lint, level, src, &mut err);
 
+        let name = lint.name_lower();
         let is_force_warn = matches!(level, Level::ForceWarn);
         err.code(DiagnosticId::Lint { name, has_future_breakage, is_force_warn });
 

diff --git a/compiler/rustc_middle/src/mir/query.rs b/compiler/rustc_middle/src/mir/query.rs
@@ -2,7 +2,7 @@
 
 use crate::mir::{Body, Promoted};
 use crate::ty::{self, Ty, TyCtxt};
-use rustc_data_structures::sync::Lrc;
+use rustc_data_structures::stable_map::FxHashMap;
 use rustc_data_structures::vec_map::VecMap;
 use rustc_errors::ErrorReported;
 use rustc_hir as hir;
@@ -114,13 +114,44 @@ pub struct UnsafetyViolation {
     pub details: UnsafetyViolationDetails,
 }
 
-#[derive(Clone, TyEncodable, TyDecodable, HashStable, Debug)]
+#[derive(Copy, Clone, PartialEq, TyEncodable, TyDecodable, HashStable, Debug)]
+pub enum UnusedUnsafe {
+    /// `unsafe` block contains no unsafe operations
+    /// > ``unnecessary `unsafe` block``
+    Unused,
+    /// `unsafe` block nested under another (used) `unsafe` block
+    /// > ``… because it's nested under this `unsafe` block``
+    InUnsafeBlock(hir::HirId),
+    /// `unsafe` block nested under `unsafe fn`
+    /// > ``… because it's nested under this `unsafe fn` ``
+    ///
+    /// the second HirId here indicates the first usage of the `unsafe` block,
+    /// which allows retrival of the LintLevelSource for why that operation would
+    /// have been permitted without the block
+    InUnsafeFn(hir::HirId, hir::HirId),
+}
+
+#[derive(Copy, Clone, PartialEq, TyEncodable, TyDecodable, HashStable, Debug)]
+pub enum UsedUnsafeBlockData {
+    SomeDisallowedInUnsafeFn,
+    // the HirId here indicates the first usage of the `unsafe` block
+    // (i.e. the one that's first encountered in the MIR traversal of the unsafety check)
+    AllAllowedInUnsafeFn(hir::HirId),
+}
+
+#[derive(TyEncodable, TyDecodable, HashStable, Debug)]
 pub struct UnsafetyCheckResult {
     /// Violations that are propagated *upwards* from this function.
-    pub violations: Lrc<[UnsafetyViolation]>,
-    /// `unsafe` blocks in this function, along with whether they are used. This is
-    /// used for the "unused_unsafe" lint.
-    pub unsafe_blocks: Lrc<[(hir::HirId, bool)]>,
+    pub violations: Vec<UnsafetyViolation>,
+
+    /// Used `unsafe` blocks in this function. This is used for the "unused_unsafe" lint.
+    ///
+    /// The keys are the used `unsafe` blocks, the UnusedUnsafeKind indicates whether
+    /// or not any of the usages happen at a place that doesn't allow `unsafe_op_in_unsafe_fn`.
+    pub used_unsafe_blocks: FxHashMap<hir::HirId, UsedUnsafeBlockData>,
+
+    /// This is `Some` iff the item is not a closure.
+    pub unused_unsafes: Option<Vec<(hir::HirId, UnusedUnsafe)>>,
 }
 
 rustc_index::newtype_index! {

diff --git a/compiler/rustc_mir_build/src/build/block.rs b/compiler/rustc_mir_build/src/build/block.rs
@@ -3,8 +3,6 @@ use crate::build::ForGuard::OutsideGuard;
 use crate::build::{BlockAnd, BlockAndExtension, BlockFrame, Builder};
 use rustc_middle::mir::*;
 use rustc_middle::thir::*;
-use rustc_session::lint::builtin::UNSAFE_OP_IN_UNSAFE_FN;
-use rustc_session::lint::Level;
 use rustc_span::Span;
 
 impl<'a, 'tcx> Builder<'a, 'tcx> {
@@ -209,28 +207,18 @@ impl<'a, 'tcx> Builder<'a, 'tcx> {
         block.unit()
     }
 
-    /// If we are changing the safety mode, create a new source scope
+    /// If we are entering an unsafe block, create a new source scope
     fn update_source_scope_for_safety_mode(&mut self, span: Span, safety_mode: BlockSafety) {
         debug!("update_source_scope_for({:?}, {:?})", span, safety_mode);
         let new_unsafety = match safety_mode {
-            BlockSafety::Safe => None,
-            BlockSafety::BuiltinUnsafe => Some(Safety::BuiltinUnsafe),
+            BlockSafety::Safe => return,
+            BlockSafety::BuiltinUnsafe => Safety::BuiltinUnsafe,
             BlockSafety::ExplicitUnsafe(hir_id) => {
-                match self.in_scope_unsafe {
-                    Safety::Safe => {}
-                    // no longer treat `unsafe fn`s as `unsafe` contexts (see RFC #2585)
-                    Safety::FnUnsafe
-                        if self.tcx.lint_level_at_node(UNSAFE_OP_IN_UNSAFE_FN, hir_id).0
-                            != Level::Allow => {}
-                    _ => return,
-                }
                 self.in_scope_unsafe = Safety::ExplicitUnsafe(hir_id);
-                Some(Safety::ExplicitUnsafe(hir_id))
+                Safety::ExplicitUnsafe(hir_id)
             }
         };
 
-        if let Some(unsafety) = new_unsafety {
-            self.source_scope = self.new_source_scope(span, LintLevel::Inherited, Some(unsafety));
-        }
+        self.source_scope = self.new_source_scope(span, LintLevel::Inherited, Some(new_unsafety));
     }
 }