Skip to content

Commit

Permalink
Rollup merge of #126922 - asquared31415:asm_binary_label, r=estebank
Browse files Browse the repository at this point in the history
add lint for inline asm labels that look like binary

fixes #94426

Due to a bug/feature in LLVM, labels composed of only the digits `0` and `1` can sometimes be confused with binary literals, even if a binary literal would not be valid in that position.

This PR adds detection for such labels and also as a drive-by change, adds a note to cases such as `asm!(include_str!("file"))` that the label that it found came from an expansion of a macro, it wasn't found in the source code.

I expect this PR to upset some people that were using labels `0:` or `1:` without issue because they never hit the case where LLVM got it wrong, but adding a heuristic to the lint to prevent this is not feasible - it would involve writing a whole assembly parser for every target that we have assembly support for.

[zulip discussion](https://rust-lang.zulipchat.com/#narrow/stream/238009-t-compiler.2Fmeetings/topic/.5Bweekly.5D.202024-06-20/near/445870628)

r? ``@estebank``
  • Loading branch information
workingjubilee authored Jul 12, 2024
2 parents 5d56572 + 87856c4 commit fc0136e
Show file tree
Hide file tree
Showing 8 changed files with 397 additions and 91 deletions.
17 changes: 13 additions & 4 deletions compiler/rustc_lint/messages.ftl
Original file line number Diff line number Diff line change
Expand Up @@ -52,10 +52,6 @@ lint_builtin_allow_internal_unsafe =
lint_builtin_anonymous_params = anonymous parameters are deprecated and will be removed in the next edition
.suggestion = try naming the parameter or explicitly ignoring it
lint_builtin_asm_labels = avoid using named labels in inline assembly
.help = only local labels of the form `<number>:` should be used in inline asm
.note = see the asm section of Rust By Example <https://doc.rust-lang.org/nightly/rust-by-example/unsafe/asm.html#labels> for more information
lint_builtin_clashing_extern_diff_name = `{$this}` redeclares `{$orig}` with a different signature
.previous_decl_label = `{$orig}` previously declared here
.mismatch_label = this signature doesn't match the previous declaration
Expand Down Expand Up @@ -403,6 +399,19 @@ lint_incomplete_include =
lint_inner_macro_attribute_unstable = inner macro attributes are unstable
lint_invalid_asm_label_binary = avoid using labels containing only the digits `0` and `1` in inline assembly
.label = use a different label that doesn't start with `0` or `1`
.note = an LLVM bug makes these labels ambiguous with a binary literal number
.note = see <https://bugs.llvm.org/show_bug.cgi?id=36144> for more information
lint_invalid_asm_label_format_arg = avoid using named labels in inline assembly
.help = only local labels of the form `<number>:` should be used in inline asm
.note1 = format arguments may expand to a non-numeric value
.note2 = see the asm section of Rust By Example <https://doc.rust-lang.org/nightly/rust-by-example/unsafe/asm.html#labels> for more information
lint_invalid_asm_label_named = avoid using named labels in inline assembly
.help = only local labels of the form `<number>:` should be used in inline asm
.note = see the asm section of Rust By Example <https://doc.rust-lang.org/nightly/rust-by-example/unsafe/asm.html#labels> for more information
lint_invalid_asm_label_no_span = the label may be declared in the expansion of a macro
lint_invalid_crate_type_value = invalid `crate_type` value
.suggestion = did you mean
Expand Down
131 changes: 102 additions & 29 deletions compiler/rustc_lint/src/builtin.rs
Original file line number Diff line number Diff line change
Expand Up @@ -30,13 +30,13 @@ use crate::{
BuiltinExplicitOutlivesSuggestion, BuiltinFeatureIssueNote, BuiltinIncompleteFeatures,
BuiltinIncompleteFeaturesHelp, BuiltinInternalFeatures, BuiltinKeywordIdents,
BuiltinMissingCopyImpl, BuiltinMissingDebugImpl, BuiltinMissingDoc,
BuiltinMutablesTransmutes, BuiltinNamedAsmLabel, BuiltinNoMangleGeneric,
BuiltinNonShorthandFieldPatterns, BuiltinSpecialModuleNameUsed, BuiltinTrivialBounds,
BuiltinTypeAliasGenericBounds, BuiltinTypeAliasGenericBoundsSuggestion,
BuiltinTypeAliasWhereClause, BuiltinUngatedAsyncFnTrackCaller, BuiltinUnpermittedTypeInit,
BuiltinMutablesTransmutes, BuiltinNoMangleGeneric, BuiltinNonShorthandFieldPatterns,
BuiltinSpecialModuleNameUsed, BuiltinTrivialBounds, BuiltinTypeAliasGenericBounds,
BuiltinTypeAliasGenericBoundsSuggestion, BuiltinTypeAliasWhereClause,
BuiltinUngatedAsyncFnTrackCaller, BuiltinUnpermittedTypeInit,
BuiltinUnpermittedTypeInitSub, BuiltinUnreachablePub, BuiltinUnsafe,
BuiltinUnstableFeatures, BuiltinUnusedDocComment, BuiltinUnusedDocCommentSub,
BuiltinWhileTrue, SuggestChangingAssocTypes,
BuiltinWhileTrue, InvalidAsmLabel, SuggestChangingAssocTypes,
},
EarlyContext, EarlyLintPass, LateContext, LateLintPass, Level, LintContext,
};
Expand All @@ -45,7 +45,7 @@ use rustc_ast::tokenstream::{TokenStream, TokenTree};
use rustc_ast::visit::{FnCtxt, FnKind};
use rustc_ast::{self as ast, *};
use rustc_ast_pretty::pprust::{self, expr_to_string};
use rustc_errors::{Applicability, LintDiagnostic, MultiSpan};
use rustc_errors::{Applicability, LintDiagnostic};
use rustc_feature::{deprecated_attributes, AttributeGate, BuiltinAttribute, GateIssue, Stability};
use rustc_hir as hir;
use rustc_hir::def::{DefKind, Res};
Expand All @@ -69,7 +69,6 @@ use rustc_target::abi::Abi;
use rustc_trait_selection::infer::{InferCtxtExt, TyCtxtInferExt};
use rustc_trait_selection::traits::query::evaluate_obligation::InferCtxtExt as _;
use rustc_trait_selection::traits::{self, misc::type_allowed_to_implement_copy};
use tracing::debug;

use crate::nonstandard_style::{method_context, MethodLateContext};

Expand Down Expand Up @@ -2728,10 +2727,52 @@ declare_lint! {
"named labels in inline assembly",
}

declare_lint_pass!(NamedAsmLabels => [NAMED_ASM_LABELS]);
declare_lint! {
/// The `binary_asm_labels` lint detects the use of numeric labels containing only binary
/// digits in the inline `asm!` macro.
///
/// ### Example
///
/// ```rust,compile_fail
/// # #![feature(asm_experimental_arch)]
/// use std::arch::asm;
///
/// fn main() {
/// unsafe {
/// asm!("0: jmp 0b");
/// }
/// }
/// ```
///
/// {{produces}}
///
/// ### Explanation
///
/// A [LLVM bug] causes this code to fail to compile because it interprets the `0b` as a binary
/// literal instead of a reference to the previous local label `0`. Note that even though the
/// bug is marked as fixed, it only fixes a specific usage of intel syntax within standalone
/// files, not inline assembly. To work around this bug, don't use labels that could be
/// confused with a binary literal.
///
/// See the explanation in [Rust By Example] for more details.
///
/// [LLVM bug]: https://bugs.llvm.org/show_bug.cgi?id=36144
/// [Rust By Example]: https://doc.rust-lang.org/nightly/rust-by-example/unsafe/asm.html#labels
pub BINARY_ASM_LABELS,
Deny,
"labels in inline assembly containing only 0 or 1 digits",
}

impl<'tcx> LateLintPass<'tcx> for NamedAsmLabels {
#[allow(rustc::diagnostic_outside_of_impl)]
declare_lint_pass!(AsmLabels => [NAMED_ASM_LABELS, BINARY_ASM_LABELS]);

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum AsmLabelKind {
Named,
FormatArg,
Binary,
}

impl<'tcx> LateLintPass<'tcx> for AsmLabels {
fn check_expr(&mut self, cx: &LateContext<'tcx>, expr: &'tcx hir::Expr<'tcx>) {
if let hir::Expr {
kind: hir::ExprKind::InlineAsm(hir::InlineAsm { template_strs, options, .. }),
Expand Down Expand Up @@ -2759,7 +2800,8 @@ impl<'tcx> LateLintPass<'tcx> for NamedAsmLabels {
None
};

let mut found_labels = Vec::new();
// diagnostics are emitted per-template, so this is created here as opposed to the outer loop
let mut spans = Vec::new();

// A semicolon might not actually be specified as a separator for all targets, but
// it seems like LLVM accepts it always.
Expand All @@ -2782,16 +2824,21 @@ impl<'tcx> LateLintPass<'tcx> for NamedAsmLabels {

// Whether a { bracket has been seen and its } hasn't been found yet.
let mut in_bracket = false;
let mut label_kind = AsmLabelKind::Named;

// A label starts with an ASCII alphabetic character or . or _
// A label can also start with a format arg, if it's not a raw asm block.
if !raw && start == '{' {
in_bracket = true;
label_kind = AsmLabelKind::FormatArg;
} else if matches!(start, '0' | '1') {
// Binary labels have only the characters `0` or `1`.
label_kind = AsmLabelKind::Binary;
} else if !(start.is_ascii_alphabetic() || matches!(start, '.' | '_')) {
// Named labels start with ASCII letters, `.` or `_`.
// anything else is not a label
break 'label_loop;
}

// Labels continue with ASCII alphanumeric characters, _, or $
for c in chars {
// Inside a template format arg, any character is permitted for the
// puproses of label detection because we assume that it can be
Expand All @@ -2812,34 +2859,60 @@ impl<'tcx> LateLintPass<'tcx> for NamedAsmLabels {
} else if !raw && c == '{' {
// Start of a format arg.
in_bracket = true;
label_kind = AsmLabelKind::FormatArg;
} else {
if !(c.is_ascii_alphanumeric() || matches!(c, '_' | '$')) {
let can_continue = match label_kind {
// Format arg labels are considered to be named labels for the purposes
// of continuing outside of their {} pair.
AsmLabelKind::Named | AsmLabelKind::FormatArg => {
c.is_ascii_alphanumeric() || matches!(c, '_' | '$')
}
AsmLabelKind::Binary => matches!(c, '0' | '1'),
};

if !can_continue {
// The potential label had an invalid character inside it, it
// cannot be a label.
break 'label_loop;
}
}
}

// If all characters passed the label checks, this is likely a label.
found_labels.push(possible_label);
// If all characters passed the label checks, this is a label.
spans.push((find_label_span(possible_label), label_kind));
start_idx = idx + 1;
}
}

debug!("NamedAsmLabels::check_expr(): found_labels: {:#?}", &found_labels);

if found_labels.len() > 0 {
let spans = found_labels
.into_iter()
.filter_map(|label| find_label_span(label))
.collect::<Vec<Span>>();
// If there were labels but we couldn't find a span, combine the warnings and
// use the template span.
let target_spans: MultiSpan =
if spans.len() > 0 { spans.into() } else { (*template_span).into() };

cx.emit_span_lint(NAMED_ASM_LABELS, target_spans, BuiltinNamedAsmLabel);
for (span, label_kind) in spans {
let missing_precise_span = span.is_none();
let span = span.unwrap_or(*template_span);
match label_kind {
AsmLabelKind::Named => {
cx.emit_span_lint(
NAMED_ASM_LABELS,
span,
InvalidAsmLabel::Named { missing_precise_span },
);
}
AsmLabelKind::FormatArg => {
cx.emit_span_lint(
NAMED_ASM_LABELS,
span,
InvalidAsmLabel::FormatArg { missing_precise_span },
);
}
AsmLabelKind::Binary => {
// the binary asm issue only occurs when using intel syntax
if !options.contains(InlineAsmOptions::ATT_SYNTAX) {
cx.emit_span_lint(
BINARY_ASM_LABELS,
span,
InvalidAsmLabel::Binary { missing_precise_span, span },
)
}
}
};
}
}
}
Expand Down
2 changes: 1 addition & 1 deletion compiler/rustc_lint/src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -225,7 +225,7 @@ late_lint_methods!(
NoopMethodCall: NoopMethodCall,
EnumIntrinsicsNonEnums: EnumIntrinsicsNonEnums,
InvalidAtomicOrdering: InvalidAtomicOrdering,
NamedAsmLabels: NamedAsmLabels,
AsmLabels: AsmLabels,
OpaqueHiddenInferredBound: OpaqueHiddenInferredBound,
MultipleSupertraitUpcastable: MultipleSupertraitUpcastable,
MapUnitFn: MapUnitFn,
Expand Down
30 changes: 26 additions & 4 deletions compiler/rustc_lint/src/lints.rs
Original file line number Diff line number Diff line change
Expand Up @@ -2047,10 +2047,32 @@ pub struct UnitBindingsDiag {
}

#[derive(LintDiagnostic)]
#[diag(lint_builtin_asm_labels)]
#[help]
#[note]
pub struct BuiltinNamedAsmLabel;
pub enum InvalidAsmLabel {
#[diag(lint_invalid_asm_label_named)]
#[help]
#[note]
Named {
#[note(lint_invalid_asm_label_no_span)]
missing_precise_span: bool,
},
#[diag(lint_invalid_asm_label_format_arg)]
#[help]
#[note(lint_note1)]
#[note(lint_note2)]
FormatArg {
#[note(lint_invalid_asm_label_no_span)]
missing_precise_span: bool,
},
#[diag(lint_invalid_asm_label_binary)]
#[note]
Binary {
#[note(lint_invalid_asm_label_no_span)]
missing_precise_span: bool,
// hack to get a label on the whole span, must match the emitted span
#[label]
span: Span,
},
}

#[derive(Subdiagnostic)]
pub enum UnexpectedCfgCargoHelp {
Expand Down
17 changes: 17 additions & 0 deletions tests/ui/asm/binary_asm_labels.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1,17 @@
//@ needs-asm-support
//@ only-x86_64

// tests that labels containing only the digits 0 and 1 are rejected
// uses of such labels can sometimes be interpreted as a binary literal

use std::arch::{asm, global_asm};

fn main() {
unsafe {
asm!("0: jmp 0b"); //~ ERROR avoid using labels containing only the digits
asm!("1: jmp 1b"); //~ ERROR avoid using labels containing only the digits
asm!("10: jmp 10b"); //~ ERROR avoid using labels containing only the digits
asm!("01: jmp 01b"); //~ ERROR avoid using labels containing only the digits
asm!("1001101: jmp 1001101b"); //~ ERROR avoid using labels containing only the digits
}
}
43 changes: 43 additions & 0 deletions tests/ui/asm/binary_asm_labels.stderr
Original file line number Diff line number Diff line change
@@ -0,0 +1,43 @@
error: avoid using labels containing only the digits `0` and `1` in inline assembly
--> $DIR/binary_asm_labels.rs:11:15
|
LL | asm!("0: jmp 0b");
| ^ use a different label that doesn't start with `0` or `1`
|
= note: an LLVM bug makes these labels ambiguous with a binary literal number
= note: `#[deny(binary_asm_labels)]` on by default

error: avoid using labels containing only the digits `0` and `1` in inline assembly
--> $DIR/binary_asm_labels.rs:12:15
|
LL | asm!("1: jmp 1b");
| ^ use a different label that doesn't start with `0` or `1`
|
= note: an LLVM bug makes these labels ambiguous with a binary literal number

error: avoid using labels containing only the digits `0` and `1` in inline assembly
--> $DIR/binary_asm_labels.rs:13:15
|
LL | asm!("10: jmp 10b");
| ^^ use a different label that doesn't start with `0` or `1`
|
= note: an LLVM bug makes these labels ambiguous with a binary literal number

error: avoid using labels containing only the digits `0` and `1` in inline assembly
--> $DIR/binary_asm_labels.rs:14:15
|
LL | asm!("01: jmp 01b");
| ^^ use a different label that doesn't start with `0` or `1`
|
= note: an LLVM bug makes these labels ambiguous with a binary literal number

error: avoid using labels containing only the digits `0` and `1` in inline assembly
--> $DIR/binary_asm_labels.rs:15:15
|
LL | asm!("1001101: jmp 1001101b");
| ^^^^^^^ use a different label that doesn't start with `0` or `1`
|
= note: an LLVM bug makes these labels ambiguous with a binary literal number

error: aborting due to 5 previous errors

Loading

0 comments on commit fc0136e

Please sign in to comment.