From 63441ea1f80f69db7cff78d13b031505af0b1916 Mon Sep 17 00:00:00 2001
From: Lokathor
Date: Wed, 6 Sep 2023 19:13:26 -0600
Subject: [PATCH 01/29] add rfc text

---
 text/0000-unsafe-extern-blocks.md | 101 ++++++++++++++++++++++++++++++
 1 file changed, 101 insertions(+)
 create mode 100644 text/0000-unsafe-extern-blocks.md

diff --git a/text/0000-unsafe-extern-blocks.md b/text/0000-unsafe-extern-blocks.md
new file mode 100644
index 00000000000..ec079392362
--- /dev/null
+++ b/text/0000-unsafe-extern-blocks.md
@@ -0,0 +1,101 @@
+
+- Feature Name: `unsafe_extern`
+- Start Date: 2023-05-23
+- RFC PR: [rust-lang/rfcs#0000](https://github.com/rust-lang/rfcs/pull/0000)
+- Rust Issue: [rust-lang/rust#0000](https://github.com/rust-lang/rust/issues/0000)
+
+# Summary
+[summary]: #summary
+
+In Edition 2024 it is `unsafe` to declare an `extern` function or static, but external functions and statics *can* be safe to use after the initial declaration.
+
+# Motivation
+[motivation]: #motivation
+
+Simply declaring extern items, even without ever using them, can cause Undefined Behavior.
+When performing cross-language compilation, attributes on one function declaration can flow to the foreign declaration elsewhere within LLVM and cause a miscompilation.
+In Rust we consider all sources of Undefined Behavior to be `unsafe`, and so we must make declaring extern blocks be `unsafe`.
+The up-side to this change is that in the new style it will be possible to declare an extern fn that's safe to call after the initial unsafe declaration.
+
+# Guide-level explanation
+[guide-level-explanation]: #guide-level-explanation
+
+Rust can utilize functions and statics from foreign code that are provided during linking, though it is `unsafe` to do so.
+
+An `extern` block can be placed anywhere a function declaration could appear (generally at the top level of a module), and must always be prefixed with the keyword `unsafe`.
+
+Within the block you can declare the exernal functions and statics that you want to make visible within the current scope.
+Each function declaration gives only the function's signature, similar to how methods for traits are declared.
+If calling a foreign function is `unsafe` then you must declare the function as `unsafe fn`, otherwise you can declare it as a normal `fn`.
+Each static declaration gives the name and type, but no initial value.
+
+* If the `unsafe_code` lint is denied or forbidden at a particular scope it will cause the `unsafe extern` block to be a compilation error within that scope.
+* Declaring an incorrect external item signature can cause Undefined Behavior during compilation, even if Rust never accesses the item.
+
+```rust
+unsafe extern {
+    // sqrt (from libm) can be called with any `f64`
+    pub fn sqrt(x: f64) -> f64;
+
+    // strlen (from libc) requires a valid pointer,
+    // so we mark it as being an unsafe fn
+    pub unsafe fn strlen(p: *const c_char) -> usize;
+
+    pub static IMPORTANT_BYTES: [u8; 256];
+
+    pub static LINES: UnsafeCell;
+}
+```
+
+Note: other rules for extern blocks, such as optionally including an ABI, are unchanged from previous editions, so those parts of the guide would remain.
+
+# Reference-level explanation
+[reference-level-explanation]: #reference-level-explanation
+
+This adjusts the grammar of the language to *require* the `unsafe` keyword before an `extern` block declaration (currently it's optional and syntatically allowed but semantically rejected).
+
+Replace the *Functions* and *Statics* sections with the following:
+
+### Functions
+Functions within external blocks are declared in the same way as other Rust functions, with the exception that they must not have a body and are instead terminated by a semicolon. Patterns are not allowed in parameters, only IDENTIFIER or _ may be used. The function qualifiers `const`, `async`, and `extern` are not allowed. If the function is unsafe to call, then the function must use the `unsafe` qualifier.
+
+If the function signature declared in Rust is incompatible with the function signature as declared in the foreign code it is Undefined Behavior.
+
+Functions within external blocks may be called by Rust code, just like functions defined in Rust. The Rust compiler will automatically use the correct foreign ABI when making the call.
+
+When coerced to a function pointer, a function declared in an extern block has type
+```rust
+extern "abi" for<'l1, ..., 'lm> fn(A1, ..., An) -> R
+```
+where `'l1`, ... `'lm` are its lifetime parameters, `A1`, ..., `An` are the declared types of its parameters and `R` is the declared return type.
+
+### Statics
+Statics within external blocks are declared in the same way as statics outside of external blocks, except that they do not have an expression initializing their value. It is unsafe to declare a static item in an extern block, whether or not it's mutable, because there is nothing guaranteeing that the bit pattern at the static's memory is valid for the type it is declared with.
+
+Extern statics can be either immutable or mutable just like statics outside of external blocks. An immutable static must be initialized before any Rust code is executed. It is not enough for the static to be initialized before Rust code reads from it. A mutable extern static is unsafe to access, the same as a Rust mutable static.
+
+# Drawbacks
+[drawbacks]: #drawbacks
+
+* It is very unfortunate to have to essentially reverse the status quo.
+  * Hopefully, allowing people to safely call some foreign functions will make up for the churn caused by this change.
+
+# Rationale and alternatives
+[rationale-and-alternatives]: #rationale-and-alternatives
+
+Incorrect extern declarations can cause UB in current Rust, but we have no way to automatically check that all declarations are correct, nor is such a thing likely to be developed. Making the declarations `unsafe` so that programmers are aware of the dangers and can give extern blocks the attention they deserve is the minimum step.
+
+# Prior art
+[prior-art]: #prior-art
+
+None we are aware of.
+
+# Unresolved questions
+[unresolved-questions]: #unresolved-questions
+
+* Extern declarations are actually *always* unsafe and able to cause UB regardless of edition. This RFC doesn't have a specific answer on how to improve pre-2024 code.
+
+# Future possibilities
+[future-possibilities]: #future-possibilities
+
+None are apparent at this time.

From 726478f7d757a5613fe96cb8c5f64355a571bd53 Mon Sep 17 00:00:00 2001
From: Lokathor
Date: Thu, 7 Sep 2023 00:39:30 -0600
Subject: [PATCH 02/29] Update text/0000-unsafe-extern-blocks.md

Co-authored-by: Jacob Lifshay
---
 text/0000-unsafe-extern-blocks.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/text/0000-unsafe-extern-blocks.md b/text/0000-unsafe-extern-blocks.md
index ec079392362..e7715c6970b 100644
--- a/text/0000-unsafe-extern-blocks.md
+++ b/text/0000-unsafe-extern-blocks.md
@@ -43,7 +43,7 @@ unsafe extern {
 
     pub static IMPORTANT_BYTES: [u8; 256];
 
-    pub static LINES: UnsafeCell;
+    pub static LINES: SyncUnsafeCell;
 }
 ```
 

From 47949ec64165afb66bcdbc0f290129bdefa9b6b1 Mon Sep 17 00:00:00 2001
From: Lokathor
Date: Mon, 25 Mar 2024 12:10:57 -0600
Subject: [PATCH 03/29] per https://github.com/rust-lang/rfcs/pull/3484#issuecomment-1758275493

---
 text/0000-unsafe-extern-blocks.md | 49 +++++++++++++++++++++----------
 1 file changed, 33 insertions(+), 16 deletions(-)

diff --git a/text/0000-unsafe-extern-blocks.md b/text/0000-unsafe-extern-blocks.md
index e7715c6970b..56b2d7e4abd 100644
--- a/text/0000-unsafe-extern-blocks.md
+++ b/text/0000-unsafe-extern-blocks.md
@@ -22,44 +22,61 @@ The up-side to this change is that in the new style it will be possible to decla
 
 Rust can utilize functions and statics from foreign code that are provided during linking, though it is `unsafe` to do so.
 
-An `extern` block can be placed anywhere a function declaration could appear (generally at the top level of a module), and must always be prefixed with the keyword `unsafe`.
+An `extern` block can be placed anywhere a function declaration could appear (generally at the top level of a module).
+* You can always write `unsafe extern { ... }`.
+* If the `unsafe_code` lint is denied or forbidden at a particular scope it will cause the `unsafe extern` block to be a compilation error within that scope.
+* On editions >= 2024, you must write all `extern` blocks as `unsafe extern`.
+* On editions < 2024, it is allowed to write an `extern` block *without* the `unsafe` keyword, but this generates a compatibility warning that you should use the `unsafe` keyword.
 
-Within the block you can declare the exernal functions and statics that you want to make visible within the current scope.
-Each function declaration gives only the function's signature, similar to how methods for traits are declared.
-If calling a foreign function is `unsafe` then you must declare the function as `unsafe fn`, otherwise you can declare it as a normal `fn`.
-Each static declaration gives the name and type, but no initial value.
+Within an `extern` block is zero or more declarations of external functions and/or external static values.
+An extern function is declared with a `;` instead of a function body (similar to a method of a trait).
+An extern static value is also declared with a `;` instead of an expression (similar to an associated const of a trait).
+In both cases, the actual function body or value is provided by whatever external source (which is probably not even written in Rust).
 
-* If the `unsafe_code` lint is denied or forbidden at a particular scope it will cause the `unsafe extern` block to be a compilation error within that scope.
-* Declaring an incorrect external item signature can cause Undefined Behavior during compilation, even if Rust never accesses the item.
+When an `extern` block is used (with or without `unsafe` in front of it), all declarations within that `extern` block should have the `unsafe` or `safe` keywords as part of their signature.
+If one of the two keywords is not explicitly provided, the declaration is assumed to be `unsafe`.
+The `safe` keyword is a contextual keyword, only used within `extern` blocks.
 
 ```rust
 unsafe extern {
     // sqrt (from libm) can be called with any `f64`
-    pub fn sqrt(x: f64) -> f64;
+    pub safe fn sqrt(x: f64) -> f64;
 
     // strlen (from libc) requires a valid pointer,
     // so we mark it as being an unsafe fn
     pub unsafe fn strlen(p: *const c_char) -> usize;
+
+    // this function doesn't say safe or unsafe, so it defaults to unsafe
+    pub fn free(p: *mut core::ffi::c_void);
 
-    pub static IMPORTANT_BYTES: [u8; 256];
+    pub safe static IMPORTANT_BYTES: [u8; 256];
 
-    pub static LINES: SyncUnsafeCell;
+    pub safe static LINES: SyncUnsafeCell;
 }
 ```
 
-Note: other rules for extern blocks, such as optionally including an ABI, are unchanged from previous editions, so those parts of the guide would remain.
+`extern` blocks are `unsafe` because if the declaration doesn't match the actual external function, or the actual external data, then it causes compile time Undefined Behavior (UB).
+
+Once they are unsafely declared, a `safe` item can be used outside the `extern` block as if it were any other safe function or static value declared within rust.
+The unsafe obligation of ensuring that the correct items are being linked to is performed by the crate making the declaration, not the crate using of that declaration.
+
+Items declared as `unsafe` *must* still have a correctly matching signature at compile time, but they *also* some sort of additional obligation for correct usage at runtime.
+They can only be used within an `unsafe` block.
 
 # Reference-level explanation
 [reference-level-explanation]: #reference-level-explanation
 
-This adjusts the grammar of the language to *require* the `unsafe` keyword before an `extern` block declaration (currently it's optional and syntatically allowed but semantically rejected).
+The grammar of the langauge is updated so that:
+
+* Editions >= 2024 *must* prefix all `extern` blocks with `unsafe`.
+* Editions < 2024 *should* prefix `extern` blocks with `unsafe`, this is a warn-by-default compatibility lint when `unsafe` is missing.
 
 Replace the *Functions* and *Statics* sections with the following:
 
 ### Functions
-Functions within external blocks are declared in the same way as other Rust functions, with the exception that they must not have a body and are instead terminated by a semicolon. Patterns are not allowed in parameters, only IDENTIFIER or _ may be used. The function qualifiers `const`, `async`, and `extern` are not allowed. If the function is unsafe to call, then the function must use the `unsafe` qualifier.
+Functions within external blocks are declared in the same way as other Rust functions, with the exception that they must not have a body and are instead terminated by a semicolon. Patterns are not allowed in parameters, only IDENTIFIER or _ may be used. The function qualifiers `const`, `async`, and `extern` are not allowed. If the function is unsafe to call, then the function should use the `unsafe` qualifier. If the function is safe to call, then the function should use the `safe` qualifier (a contextual keyword). Functions that are not qualified as `unsafe` or `safe` are assumed to be `unsafe`.
 
-If the function signature declared in Rust is incompatible with the function signature as declared in the foreign code it is Undefined Behavior.
+If the function signature declared in Rust is incompatible with the function signature as declared in the foreign code it is Undefined Behavior to compile and link the code.
 
 Functions within external blocks may be called by Rust code, just like functions defined in Rust. The Rust compiler will automatically use the correct foreign ABI when making the call.
 
@@ -70,9 +87,9 @@ extern "abi" for<'l1, ..., 'lm> fn(A1, ..., An) -> R
 where `'l1`, ... `'lm` are its lifetime parameters, `A1`, ..., `An` are the declared types of its parameters and `R` is the declared return type.
 
 ### Statics
-Statics within external blocks are declared in the same way as statics outside of external blocks, except that they do not have an expression initializing their value. It is unsafe to declare a static item in an extern block, whether or not it's mutable, because there is nothing guaranteeing that the bit pattern at the static's memory is valid for the type it is declared with.
+Statics within external blocks are declared in the same way as statics outside of external blocks, except that they do not have an expression initializing their value. If the static is unsafe to access, then the static should use the `unsafe` qualifier. If the static is safe to access (and immutable), then the static should use the `safe` qualifier (a contextual keyword). Statics that are not qualified as `unsafe` or `safe` are assumed to be `unsafe`.
 
-Extern statics can be either immutable or mutable just like statics outside of external blocks. An immutable static must be initialized before any Rust code is executed. It is not enough for the static to be initialized before Rust code reads from it. A mutable extern static is unsafe to access, the same as a Rust mutable static.
+Extern statics can be either immutable or mutable just like statics outside of external blocks. An immutable static must be initialized before any Rust code is executed. It is not enough for the static to be initialized before Rust code reads from it. A mutable extern static is always `unsafe` to access, the same as a Rust mutable static.
 
 # Drawbacks
 [drawbacks]: #drawbacks

From 063768476cd8c040e49e318602560ecedbd26b3c Mon Sep 17 00:00:00 2001
From: Lokathor
Date: Mon, 25 Mar 2024 12:52:31 -0600
Subject: [PATCH 04/29] typo: missing "have"

---
 text/0000-unsafe-extern-blocks.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/text/0000-unsafe-extern-blocks.md b/text/0000-unsafe-extern-blocks.md
index 56b2d7e4abd..f907e590e93 100644
--- a/text/0000-unsafe-extern-blocks.md
+++ b/text/0000-unsafe-extern-blocks.md
@@ -60,7 +60,7 @@ unsafe extern {
 Once they are unsafely declared, a `safe` item can be used outside the `extern` block as if it were any other safe function or static value declared within rust.
 The unsafe obligation of ensuring that the correct items are being linked to is performed by the crate making the declaration, not the crate using of that declaration.
 
-Items declared as `unsafe` *must* still have a correctly matching signature at compile time, but they *also* some sort of additional obligation for correct usage at runtime.
+Items declared as `unsafe` *must* still have a correctly matching signature at compile time, but they *also* have some sort of additional obligation for correct usage at runtime.
 They can only be used within an `unsafe` block.
 
 # Reference-level explanation

From 3fa1a617c218c6724e6609d2867bbd5710a2954b Mon Sep 17 00:00:00 2001
From: Lokathor
Date: Mon, 25 Mar 2024 13:02:42 -0600
Subject: [PATCH 05/29] corrections from Zulip feedback

---
 text/0000-unsafe-extern-blocks.md | 17 ++++++++++-------
 1 file changed, 10 insertions(+), 7 deletions(-)

diff --git a/text/0000-unsafe-extern-blocks.md b/text/0000-unsafe-extern-blocks.md
index f907e590e93..90712dd5e84 100644
--- a/text/0000-unsafe-extern-blocks.md
+++ b/text/0000-unsafe-extern-blocks.md
@@ -23,19 +23,22 @@ The up-side to this change is that in the new style it will be possible to decla
 Rust can utilize functions and statics from foreign code that are provided during linking, though it is `unsafe` to do so.
 
 An `extern` block can be placed anywhere a function declaration could appear (generally at the top level of a module).
-* You can always write `unsafe extern { ... }`.
-* If the `unsafe_code` lint is denied or forbidden at a particular scope it will cause the `unsafe extern` block to be a compilation error within that scope.
-* On editions >= 2024, you must write all `extern` blocks as `unsafe extern`.
-* On editions < 2024, it is allowed to write an `extern` block *without* the `unsafe` keyword, but this generates a compatibility warning that you should use the `unsafe` keyword.
+
+* On editions >= 2024, you *must* write all `extern` blocks as `unsafe extern`.
+* On editions < 2024, you *may* write `unsafe extern`, or you can write an `extern` block without the `unsafe` keyword. Writing an `extern` block without the `unsafe` keyword is provided for compatibility only, and will generate a warning.
+* `unsafe extern` interacts with the `unsafe_code` lint, and a `deny` or `forbid` with that lint will deny or forbid the unsafe external block.
 
 Within an `extern` block is zero or more declarations of external functions and/or external static values.
 An extern function is declared with a `;` instead of a function body (similar to a method of a trait).
 An extern static value is also declared with a `;` instead of an expression (similar to an associated const of a trait).
 In both cases, the actual function body or value is provided by whatever external source (which is probably not even written in Rust).
 
-When an `extern` block is used (with or without `unsafe` in front of it), all declarations within that `extern` block should have the `unsafe` or `safe` keywords as part of their signature.
-If one of the two keywords is not explicitly provided, the declaration is assumed to be `unsafe`.
-The `safe` keyword is a contextual keyword, only used within `extern` blocks.
+When an `unsafe extern` block is used, all declarations within that `extern` block *should* have the `unsafe` or `safe` keywords as part of their signature.
+If one of the two keywords is not explicitly provided, the declaration is assumed to be `unsafe`, and also a warning is generated.
+The `safe` keyword is a contextual keyword, it is currently only used within `extern` blocks.
+
+If an `extern` block is used in an older edition without the `unsafe` keyword, declarations *cannot* specify `safe` or `unsafe`.
+Code must update to `unsafe extern` style blocks if it wants to make `safe` declarations.
 
 ```rust
 unsafe extern {

From 7754241fafc69d9baf4926dbce4548d87647320d Mon Sep 17 00:00:00 2001
From: Lokathor
Date: Mon, 25 Mar 2024 13:04:08 -0600
Subject: [PATCH 06/29] typo: remove "of"

---
 text/0000-unsafe-extern-blocks.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/text/0000-unsafe-extern-blocks.md b/text/0000-unsafe-extern-blocks.md
index 90712dd5e84..e6b7cfcae9c 100644
--- a/text/0000-unsafe-extern-blocks.md
+++ b/text/0000-unsafe-extern-blocks.md
@@ -61,7 +61,7 @@ unsafe extern {
 `extern` blocks are `unsafe` because if the declaration doesn't match the actual external function, or the actual external data, then it causes compile time Undefined Behavior (UB).
 
 Once they are unsafely declared, a `safe` item can be used outside the `extern` block as if it were any other safe function or static value declared within rust.
-The unsafe obligation of ensuring that the correct items are being linked to is performed by the crate making the declaration, not the crate using of that declaration.
+The unsafe obligation of ensuring that the correct items are being linked to is performed by the crate making the declaration, not the crate using that declaration.
 
 Items declared as `unsafe` *must* still have a correctly matching signature at compile time, but they *also* have some sort of additional obligation for correct usage at runtime.
 They can only be used within an `unsafe` block.

From 2144ac3f1536e479dfcc7127528b395780adddd7 Mon Sep 17 00:00:00 2001
From: Lokathor
Date: Sun, 31 Mar 2024 17:23:56 -0600
Subject: [PATCH 07/29] Update text/0000-unsafe-extern-blocks.md

Co-authored-by: Waffle Maybe
---
 text/0000-unsafe-extern-blocks.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/text/0000-unsafe-extern-blocks.md b/text/0000-unsafe-extern-blocks.md
index e6b7cfcae9c..0718d043c30 100644
--- a/text/0000-unsafe-extern-blocks.md
+++ b/text/0000-unsafe-extern-blocks.md
@@ -33,7 +33,7 @@ An extern function is declared with a `;` instead of a function body (similar to
 An extern static value is also declared with a `;` instead of an expression (similar to an associated const of a trait).
 In both cases, the actual function body or value is provided by whatever external source (which is probably not even written in Rust).
 
-When an `unsafe extern` block is used, all declarations within that `extern` block *should* have the `unsafe` or `safe` keywords as part of their signature.
+When an `unsafe extern` block is used, all declarations within that `extern` block *must* have the `unsafe` or `safe` keywords as part of their signature.
 If one of the two keywords is not explicitly provided, the declaration is assumed to be `unsafe`, and also a warning is generated.
 The `safe` keyword is a contextual keyword, it is currently only used within `extern` blocks.
 

From 676383fce91a37831adbe47a382bc60acfcca37c Mon Sep 17 00:00:00 2001
From: Lokathor
Date: Sun, 31 Mar 2024 18:10:03 -0600
Subject: [PATCH 08/29] Update text/0000-unsafe-extern-blocks.md

Co-authored-by: Jacob Lifshay
---
 text/0000-unsafe-extern-blocks.md | 1 -
 1 file changed, 1 deletion(-)

diff --git a/text/0000-unsafe-extern-blocks.md b/text/0000-unsafe-extern-blocks.md
index 0718d043c30..44f6626a206 100644
--- a/text/0000-unsafe-extern-blocks.md
+++ b/text/0000-unsafe-extern-blocks.md
@@ -34,7 +34,6 @@ An extern static value is also declared with a `;` instead of an expression (sim
 In both cases, the actual function body or value is provided by whatever external source (which is probably not even written in Rust).
 
 When an `unsafe extern` block is used, all declarations within that `extern` block *must* have the `unsafe` or `safe` keywords as part of their signature.
-If one of the two keywords is not explicitly provided, the declaration is assumed to be `unsafe`, and also a warning is generated.
 The `safe` keyword is a contextual keyword, it is currently only used within `extern` blocks.
 
 If an `extern` block is used in an older edition without the `unsafe` keyword, declarations *cannot* specify `safe` or `unsafe`.
 Code must update to `unsafe extern` style blocks if it wants to make `safe` declarations.

From d5bb7db100e32d60bd6e452710885b3d1e54811b Mon Sep 17 00:00:00 2001
From: Lokathor
Date: Tue, 2 Apr 2024 17:57:14 -0600
Subject: [PATCH 09/29] Update text/0000-unsafe-extern-blocks.md

Co-authored-by: Waffle Maybe
---
 text/0000-unsafe-extern-blocks.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/text/0000-unsafe-extern-blocks.md b/text/0000-unsafe-extern-blocks.md
index 44f6626a206..230a89acb2a 100644
--- a/text/0000-unsafe-extern-blocks.md
+++ b/text/0000-unsafe-extern-blocks.md
@@ -91,7 +91,7 @@ where `'l1`, ... `'lm` are its lifetime parameters, `A1`, ..., `An` are the decl
 ### Statics
 Statics within external blocks are declared in the same way as statics outside of external blocks, except that they do not have an expression initializing their value. If the static is unsafe to access, then the static should use the `unsafe` qualifier. If the static is safe to access (and immutable), then the static should use the `safe` qualifier (a contextual keyword). Statics that are not qualified as `unsafe` or `safe` are assumed to be `unsafe`.
 
-Extern statics can be either immutable or mutable just like statics outside of external blocks. An immutable static must be initialized before any Rust code is executed. It is not enough for the static to be initialized before Rust code reads from it. A mutable extern static is always `unsafe` to access, the same as a Rust mutable static.
+Extern statics can be either immutable or mutable just like statics outside of external blocks. An immutable static must be initialized before any Rust code is executed. It is not enough for the static to be initialized before Rust code reads from it. A mutable extern static is always `unsafe` to access, the same as a Rust mutable static, and as such can not be marked with a `safe` qualifier.
 
 # Drawbacks
 [drawbacks]: #drawbacks

From 6dba902b3c63b61b11e5d77a13caadfb5c187014 Mon Sep 17 00:00:00 2001
From: Travis Cross
Date: Mon, 6 May 2024 05:54:36 +0000
Subject: [PATCH 10/29] Cleanup whitespace

---
 text/0000-unsafe-extern-blocks.md | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/text/0000-unsafe-extern-blocks.md b/text/0000-unsafe-extern-blocks.md
index 230a89acb2a..559cd89b99d 100644
--- a/text/0000-unsafe-extern-blocks.md
+++ b/text/0000-unsafe-extern-blocks.md
@@ -1,4 +1,3 @@
-
 - Feature Name: `unsafe_extern`
 - Start Date: 2023-05-23
 - RFC PR: [rust-lang/rfcs#0000](https://github.com/rust-lang/rfcs/pull/0000)
 - Rust Issue: [rust-lang/rust#0000](https://github.com/rust-lang/rust/issues/0000)
@@ -43,16 +42,16 @@ Code must update to `unsafe extern` style blocks if it wants to make `safe` decl
 unsafe extern {
     // sqrt (from libm) can be called with any `f64`
     pub safe fn sqrt(x: f64) -> f64;
-    
+
     // strlen (from libc) requires a valid pointer,
     // so we mark it as being an unsafe fn
     pub unsafe fn strlen(p: *const c_char) -> usize;
 
     // this function doesn't say safe or unsafe, so it defaults to unsafe
     pub fn free(p: *mut core::ffi::c_void);
-    
+
     pub safe static IMPORTANT_BYTES: [u8; 256];
-    
+
     pub safe static LINES: SyncUnsafeCell;
 }
 ```

From 23f0acfe2657677114db50f4d858cb4191dd6c75 Mon Sep 17 00:00:00 2001
From: Travis Cross
Date: Mon, 6 May 2024 05:55:56 +0000
Subject: [PATCH 11/29] Improve wording of the drawback

During the FCP, people noticed that the wording of the drawback probably fit better with an earlier draft of this RFC. Let's incorporate that feedback before merging.

(Thanks to Waffle for raising this point.)
---
 text/0000-unsafe-extern-blocks.md | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/text/0000-unsafe-extern-blocks.md b/text/0000-unsafe-extern-blocks.md
index 559cd89b99d..a261c1f1e7d 100644
--- a/text/0000-unsafe-extern-blocks.md
+++ b/text/0000-unsafe-extern-blocks.md
@@ -95,8 +95,7 @@ Extern statics can be either immutable or mutable just like statics outside of e
 # Drawbacks
 [drawbacks]: #drawbacks
 
-* It is very unfortunate to have to essentially reverse the status quo.
-  * Hopefully, allowing people to safely call some foreign functions will make up for the churn caused by this change.
+This change will induce some churn. Hopefully, allowing people to safely call some foreign functions will make up for that.
 
 # Rationale and alternatives
 [rationale-and-alternatives]: #rationale-and-alternatives

From 1cef0263cca8dbebd3aa215e3ae564663a70dcb0 Mon Sep 17 00:00:00 2001
From: Travis Cross
Date: Mon, 6 May 2024 06:01:45 +0000
Subject: [PATCH 12/29] Improve wording of where `safe` is allowed

Let's improve the wording related to where the `safe` keyword is allowed.

(Thanks to Waffle for raising this point.)
---
 text/0000-unsafe-extern-blocks.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/text/0000-unsafe-extern-blocks.md b/text/0000-unsafe-extern-blocks.md
index a261c1f1e7d..1e09b82fe3f 100644
--- a/text/0000-unsafe-extern-blocks.md
+++ b/text/0000-unsafe-extern-blocks.md
@@ -33,7 +33,7 @@ An extern static value is also declared with a `;` instead of an expression (sim
 In both cases, the actual function body or value is provided by whatever external source (which is probably not even written in Rust).
 
 When an `unsafe extern` block is used, all declarations within that `extern` block *must* have the `unsafe` or `safe` keywords as part of their signature.
-The `safe` keyword is a contextual keyword, it is currently only used within `extern` blocks.
+The `safe` keyword is a contextual keyword; it is currently allowed only within `extern` blocks.
 
 If an `extern` block is used in an older edition without the `unsafe` keyword, declarations *cannot* specify `safe` or `unsafe`.
 Code must update to `unsafe extern` style blocks if it wants to make `safe` declarations.

From 842bd55b045ee12726e7000ac49345f23e3cde51 Mon Sep 17 00:00:00 2001
From: Travis Cross
Date: Mon, 6 May 2024 06:13:36 +0000
Subject: [PATCH 13/29] Fix typo

(Thanks to spastorino for noticing this.)
---
 text/0000-unsafe-extern-blocks.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/text/0000-unsafe-extern-blocks.md b/text/0000-unsafe-extern-blocks.md
index 1e09b82fe3f..56d50dc5bfb 100644
--- a/text/0000-unsafe-extern-blocks.md
+++ b/text/0000-unsafe-extern-blocks.md
@@ -67,7 +67,7 @@ They can only be used within an `unsafe` block.
 # Reference-level explanation
 [reference-level-explanation]: #reference-level-explanation
 
-The grammar of the langauge is updated so that:
+The grammar of the language is updated so that:
 
 * Editions >= 2024 *must* prefix all `extern` blocks with `unsafe`.
 * Editions < 2024 *should* prefix `extern` blocks with `unsafe`, this is a warn-by-default compatibility lint when `unsafe` is missing.

From 176d73f7443e33de567b66c0c0cfd89dedd07adf Mon Sep 17 00:00:00 2001
From: Travis Cross
Date: Mon, 6 May 2024 06:35:48 +0000
Subject: [PATCH 14/29] Clarify extent of UB

An incorrect declaration in an `extern` block may cause undefined behavior in the resulting program. Let's clarify that in the text.

(Thanks to Waffle for raising this point.)
---
 text/0000-unsafe-extern-blocks.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/text/0000-unsafe-extern-blocks.md b/text/0000-unsafe-extern-blocks.md
index 56d50dc5bfb..98662c1ac34 100644
--- a/text/0000-unsafe-extern-blocks.md
+++ b/text/0000-unsafe-extern-blocks.md
@@ -56,7 +56,7 @@ unsafe extern {
 }
 ```
 
-`extern` blocks are `unsafe` because if the declaration doesn't match the actual external function, or the actual external data, then it causes compile time Undefined Behavior (UB).
+`extern` blocks are `unsafe` because if the declaration doesn't match the actual external function, or the actual external data, then the behavior of the resulting program may be undefined.
 
 Once they are unsafely declared, a `safe` item can be used outside the `extern` block as if it were any other safe function or static value declared within rust.
 The unsafe obligation of ensuring that the correct items are being linked to is performed by the crate making the declaration, not the crate using that declaration.
@@ -77,7 +77,7 @@ Replace the *Functions* and *Statics* sections with the following:
 ### Functions
 Functions within external blocks are declared in the same way as other Rust functions, with the exception that they must not have a body and are instead terminated by a semicolon. Patterns are not allowed in parameters, only IDENTIFIER or _ may be used. The function qualifiers `const`, `async`, and `extern` are not allowed. If the function is unsafe to call, then the function should use the `unsafe` qualifier. If the function is safe to call, then the function should use the `safe` qualifier (a contextual keyword). Functions that are not qualified as `unsafe` or `safe` are assumed to be `unsafe`.
 
-If the function signature declared in Rust is incompatible with the function signature as declared in the foreign code it is Undefined Behavior to compile and link the code.
+If the function signature declared in Rust is incompatible with the function signature as declared in the foreign code, the behavior of the resulting program may be undefined. Functions within external blocks may be called by Rust code, just like functions defined in Rust. The Rust compiler will automatically use the correct foreign ABI when making the call. From 60631ce72121c44b34d3bd498f34f8e78dc8d376 Mon Sep 17 00:00:00 2001 From: Travis Cross Date: Mon, 6 May 2024 06:47:00 +0000 Subject: [PATCH 15/29] Clarify what we're replacing in the Reference This RFC suggests replacements to text within the Rust Reference. Let's clarify where those sections are. (Thanks to Waffle for raising this point.) --- text/0000-unsafe-extern-blocks.md | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/text/0000-unsafe-extern-blocks.md b/text/0000-unsafe-extern-blocks.md index 98662c1ac34..dd2f0f0fbf2 100644 --- a/text/0000-unsafe-extern-blocks.md +++ b/text/0000-unsafe-extern-blocks.md @@ -72,7 +72,11 @@ The grammar of the language is updated so that: * Editions >= 2024 *must* prefix all `extern` blocks with `unsafe`. * Editions < 2024 *should* prefix `extern` blocks with `unsafe`, this is a warn-by-default compatibility lint when `unsafe` is missing. -Replace the *Functions* and *Statics* sections with the following: +This RFC replaces the *["functions"][]* and *["statics"][]* sections in the [external blocks][] chapter of the Rust Reference with the following: + +["functions"]: https://doc.rust-lang.org/nightly/reference/items/external-blocks.html#functions +["statics"]: https://doc.rust-lang.org/nightly/reference/items/external-blocks.html#statics +[external blocks]: https://doc.rust-lang.org/nightly/reference/items/external-blocks.html ### Functions Functions within external blocks are declared in the same way as other Rust functions, with the exception that they must not have a body and are instead terminated by a semicolon. 
Patterns are not allowed in parameters, only IDENTIFIER or _ may be used. The function qualifiers `const`, `async`, and `extern` are not allowed. If the function is unsafe to call, then the function should use the `unsafe` qualifier. If the function is safe to call, then the function should use the `safe` qualifier (a contextual keyword). Functions that are not qualified as `unsafe` or `safe` are assumed to be `unsafe`. From fc53654996b2d000e3ecabda5144ac180c83b713 Mon Sep 17 00:00:00 2001 From: Travis Cross Date: Mon, 6 May 2024 06:56:47 +0000 Subject: [PATCH 16/29] Add reference to Rust issue 46188 There's a long history of discussion on how incorrect declarations in `extern` blocks might cause UB in programs compiled using LLVM. Let's link to one of the issues in that history. (Thanks to madsmtm for raising this question, and to RalfJ for providing this citation.) --- text/0000-unsafe-extern-blocks.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/text/0000-unsafe-extern-blocks.md b/text/0000-unsafe-extern-blocks.md index dd2f0f0fbf2..9d47f74a724 100644 --- a/text/0000-unsafe-extern-blocks.md +++ b/text/0000-unsafe-extern-blocks.md @@ -11,11 +11,13 @@ In Edition 2024 it is `unsafe` to declare an `extern` function or static, but ex # Motivation [motivation]: #motivation -Simply declaring extern items, even without ever using them, can cause Undefined Behavior. +Simply declaring extern items, even without ever using them, can cause Undefined Behavior (see, e.g., issue [#46188][]). When performing cross-language compilation, attributes on one function declaration can flow to the foreign declaration elsewhere within LLVM and cause a miscompilation. In Rust we consider all sources of Undefined Behavior to be `unsafe`, and so we must make declaring extern blocks be `unsafe`. The up-side to this change is that in the new style it will be possible to declare an extern fn that's safe to call after the initial unsafe declaration. 
+[#46188]: https://github.com/rust-lang/rust/issues/46188 + # Guide-level explanation [guide-level-explanation]: #guide-level-explanation From 5cc4cc3137ce55932ae4cadafd3dfc2c007b142e Mon Sep 17 00:00:00 2001 From: Travis Cross Date: Mon, 6 May 2024 07:04:57 +0000 Subject: [PATCH 17/29] Clarify that we will "eventually" lint This RFC means to specify that we will *eventually* issue a lint in all editions when `extern` is not prefixed with `unsafe`. Let's specify this more clearly. (Thanks to Waffle and joshtriplett for raising this point.) --- text/0000-unsafe-extern-blocks.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/0000-unsafe-extern-blocks.md b/text/0000-unsafe-extern-blocks.md index 9d47f74a724..dd4b7a78f9b 100644 --- a/text/0000-unsafe-extern-blocks.md +++ b/text/0000-unsafe-extern-blocks.md @@ -26,7 +26,7 @@ Rust can utilize functions and statics from foreign code that are provided durin An `extern` block can be placed anywhere a function declaration could appear (generally at the top level of a module). * On editions >= 2024, you *must* write all `extern` blocks as `unsafe extern`. -* On editions < 2024, you *may* write `unsafe extern`, or you can write an `extern` block without the `unsafe` keyword. Writing an `extern` block without the `unsafe` keyword is provided for compatibility only, and will generate a warning. +* On editions < 2024, you *may* write `unsafe extern`, or you can write an `extern` block without the `unsafe` keyword. Writing an `extern` block without the `unsafe` keyword is provided for compatibility only, and will eventually generate a warning. * `unsafe extern` interacts with the `unsafe_code` lint, and a `deny` or `forbid` with that lint will deny or forbid the unsafe external block. Within an `extern` block is zero or more declarations of external functions and/or external static values. @@ -72,7 +72,7 @@ They can only be used within an `unsafe` block. 
The grammar of the language is updated so that: * Editions >= 2024 *must* prefix all `extern` blocks with `unsafe`. -* Editions < 2024 *should* prefix `extern` blocks with `unsafe`, this is a warn-by-default compatibility lint when `unsafe` is missing. +* Editions < 2024 *should* prefix `extern` blocks with `unsafe`, this will eventually be a warn-by-default compatibility lint when `unsafe` is missing. This RFC replaces the *["functions"][]* and *["statics"][]* sections in the [external blocks][] chapter of the Rust Reference with the following: From 4684d53b8ac4a86eb0f929ba1c6bf761c9a99a16 Mon Sep 17 00:00:00 2001 From: Travis Cross Date: Mon, 6 May 2024 20:19:01 +0000 Subject: [PATCH 18/29] Unwrap lines This document had a mix of line wrapping styles. Let's consistently unwrap the lines. --- text/0000-unsafe-extern-blocks.md | 53 ++++++++++++++----------------- 1 file changed, 23 insertions(+), 30 deletions(-) diff --git a/text/0000-unsafe-extern-blocks.md b/text/0000-unsafe-extern-blocks.md index dd4b7a78f9b..3291cc25eac 100644 --- a/text/0000-unsafe-extern-blocks.md +++ b/text/0000-unsafe-extern-blocks.md @@ -11,10 +11,7 @@ In Edition 2024 it is `unsafe` to declare an `extern` function or static, but ex # Motivation [motivation]: #motivation -Simply declaring extern items, even without ever using them, can cause Undefined Behavior (see, e.g., issue [#46188][]). -When performing cross-language compilation, attributes on one function declaration can flow to the foreign declaration elsewhere within LLVM and cause a miscompilation. -In Rust we consider all sources of Undefined Behavior to be `unsafe`, and so we must make declaring extern blocks be `unsafe`. -The up-side to this change is that in the new style it will be possible to declare an extern fn that's safe to call after the initial unsafe declaration. +Simply declaring extern items, even without ever using them, can cause Undefined Behavior (see, e.g., issue [#46188][]). 
When performing cross-language compilation, attributes on one function declaration can flow to the foreign declaration elsewhere within LLVM and cause a miscompilation. In Rust we consider all sources of Undefined Behavior to be `unsafe`, and so we must make declaring extern blocks be `unsafe`. The up-side to this change is that in the new style it will be possible to declare an extern fn that's safe to call after the initial unsafe declaration. [#46188]: https://github.com/rust-lang/rust/issues/46188 @@ -25,20 +22,15 @@ Rust can utilize functions and statics from foreign code that are provided durin An `extern` block can be placed anywhere a function declaration could appear (generally at the top level of a module). -* On editions >= 2024, you *must* write all `extern` blocks as `unsafe extern`. -* On editions < 2024, you *may* write `unsafe extern`, or you can write an `extern` block without the `unsafe` keyword. Writing an `extern` block without the `unsafe` keyword is provided for compatibility only, and will eventually generate a warning. -* `unsafe extern` interacts with the `unsafe_code` lint, and a `deny` or `forbid` with that lint will deny or forbid the unsafe external block. +- On editions >= 2024, you *must* write all `extern` blocks as `unsafe extern`. +- On editions < 2024, you *may* write `unsafe extern`, or you can write an `extern` block without the `unsafe` keyword. Writing an `extern` block without the `unsafe` keyword is provided for compatibility only, and will eventually generate a warning. +- `unsafe extern` interacts with the `unsafe_code` lint, and a `deny` or `forbid` with that lint will deny or forbid the unsafe external block. -Within an `extern` block is zero or more declarations of external functions and/or external static values. -An extern function is declared with a `;` instead of a function body (similar to a method of a trait). 
-An extern static value is also declared with a `;` instead of an expression (similar to an associated const of a trait). -In both cases, the actual function body or value is provided by whatever external source (which is probably not even written in Rust). +Within an `extern` block is zero or more declarations of external functions and/or external static values. An extern function is declared with a `;` instead of a function body (similar to a method of a trait). An extern static value is also declared with a `;` instead of an expression (similar to an associated const of a trait). In both cases, the actual function body or value is provided by whatever external source (which is probably not even written in Rust). -When an `unsafe extern` block is used, all declarations within that `extern` block *must* have the `unsafe` or `safe` keywords as part of their signature. -The `safe` keyword is a contextual keyword; it is currently allowed only within `extern` blocks. +When an `unsafe extern` block is used, all declarations within that `extern` block *must* have the `unsafe` or `safe` keywords as part of their signature. The `safe` keyword is a contextual keyword; it is currently allowed only within `extern` blocks. -If an `extern` block is used in an older edition without the `unsafe` keyword, declarations *cannot* specify `safe` or `unsafe`. -Code must update to `unsafe extern` style blocks if it wants to make `safe` declarations. +If an `extern` block is used in an older edition without the `unsafe` keyword, declarations *cannot* specify `safe` or `unsafe`. Code must update to `unsafe extern` style blocks if it wants to make `safe` declarations. ```rust unsafe extern { @@ -60,19 +52,17 @@ unsafe extern { `extern` blocks are `unsafe` because if the declaration doesn't match the actual external function, or the actual external data, then the behavior of the resulting program may be undefined. 
-Once they are unsafely declared, a `safe` item can be used outside the `extern` block as if it were any other safe function or static value declared within rust. -The unsafe obligation of ensuring that the correct items are being linked to is performed by the crate making the declaration, not the crate using that declaration. +Once they are unsafely declared, a `safe` item can be used outside the `extern` block as if it were any other safe function or static value declared within rust. The unsafe obligation of ensuring that the correct items are being linked to is performed by the crate making the declaration, not the crate using that declaration. -Items declared as `unsafe` *must* still have a correctly matching signature at compile time, but they *also* have some sort of additional obligation for correct usage at runtime. -They can only be used within an `unsafe` block. +Items declared as `unsafe` *must* still have a correctly matching signature at compile time, but they *also* have some sort of additional obligation for correct usage at runtime. They can only be used within an `unsafe` block. # Reference-level explanation [reference-level-explanation]: #reference-level-explanation The grammar of the language is updated so that: -* Editions >= 2024 *must* prefix all `extern` blocks with `unsafe`. -* Editions < 2024 *should* prefix `extern` blocks with `unsafe`, this will eventually be a warn-by-default compatibility lint when `unsafe` is missing. +- Editions >= 2024 *must* prefix all `extern` blocks with `unsafe`. +- Editions < 2024 *should* prefix `extern` blocks with `unsafe`, this will eventually be a warn-by-default compatibility lint when `unsafe` is missing. 
This RFC replaces the *["functions"][]* and *["statics"][]* sections in the [external blocks][] chapter of the Rust Reference with the following: @@ -81,32 +71,35 @@ This RFC replaces the *["functions"][]* and *["statics"][]* sections in the [ext [external blocks]: https://doc.rust-lang.org/nightly/reference/items/external-blocks.html ### Functions -Functions within external blocks are declared in the same way as other Rust functions, with the exception that they must not have a body and are instead terminated by a semicolon. Patterns are not allowed in parameters, only IDENTIFIER or _ may be used. The function qualifiers `const`, `async`, and `extern` are not allowed. If the function is unsafe to call, then the function should use the `unsafe` qualifier. If the function is safe to call, then the function should use the `safe` qualifier (a contextual keyword). Functions that are not qualified as `unsafe` or `safe` are assumed to be `unsafe`. + +Functions within external blocks are declared in the same way as other Rust functions, with the exception that they must not have a body and are instead terminated by a semicolon. Patterns are not allowed in parameters, only IDENTIFIER or _ may be used. The function qualifiers `const`, `async`, and `extern` are not allowed. If the function is unsafe to call, then the function should use the `unsafe` qualifier. If the function is safe to call, then the function should use the `safe` qualifier (a contextual keyword). Functions that are not qualified as `unsafe` or `safe` are assumed to be `unsafe`. If the function signature declared in Rust is incompatible with the function signature as declared in the foreign code, the behavior of the resulting program may be undefined. -Functions within external blocks may be called by Rust code, just like functions defined in Rust. The Rust compiler will automatically use the correct foreign ABI when making the call. 
+Functions within external blocks may be called by Rust code, just like functions defined in Rust. The Rust compiler will automatically use the correct foreign ABI when making the call. + +When coerced to a function pointer, a function declared in an extern block has type: -When coerced to a function pointer, a function declared in an extern block has type ```rust extern "abi" for<'l1, ..., 'lm> fn(A1, ..., An) -> R ``` -where `'l1`, ... `'lm` are its lifetime parameters, `A1`, ..., `An` are the declared types of its parameters and `R` is the declared return type. +where `'l1`, ..., `'lm` are its lifetime parameters, `A1`, ..., `An` are the declared types of its parameters and `R` is the declared return type. ### Statics -Statics within external blocks are declared in the same way as statics outside of external blocks, except that they do not have an expression initializing their value. If the static is unsafe to access, then the static should use the `unsafe` qualifier. If the static is safe to access (and immutable), then the static should use the `safe` qualifier (a contextual keyword). Statics that are not qualified as `unsafe` or `safe` are assumed to be `unsafe`. -Extern statics can be either immutable or mutable just like statics outside of external blocks. An immutable static must be initialized before any Rust code is executed. It is not enough for the static to be initialized before Rust code reads from it. A mutable extern static is always `unsafe` to access, the same as a Rust mutable static, and as such can not be marked with a `safe` qualifier. +Statics within external blocks are declared in the same way as statics outside of external blocks, except that they do not have an expression initializing their value. If the static is unsafe to access, then the static should use the `unsafe` qualifier. If the static is safe to access (and immutable), then the static should use the `safe` qualifier (a contextual keyword). 
Statics that are not qualified as `unsafe` or `safe` are assumed to be `unsafe`. + +Extern statics can be either immutable or mutable just like statics outside of external blocks. An immutable static must be initialized before any Rust code is executed. It is not enough for the static to be initialized before Rust code reads from it. A mutable extern static is always `unsafe` to access, the same as a Rust mutable static, and as such can not be marked with a `safe` qualifier. # Drawbacks [drawbacks]: #drawbacks -This change will induce some churn. Hopefully, allowing people to safely call some foreign functions will make up for that. +This change will induce some churn. Hopefully, allowing people to safely call some foreign functions will make up for that. # Rationale and alternatives [rationale-and-alternatives]: #rationale-and-alternatives -Incorrect extern declarations can cause UB in current Rust, but we have no way to automatically check that all declarations are correct, nor is such a thing likely to be developed. Making the declarations `unsafe` so that programmers are aware of the dangers and can give extern blocks the attention they deserve is the minimum step. +Incorrect extern declarations can cause UB in current Rust, but we have no way to automatically check that all declarations are correct, nor is such a thing likely to be developed. Making the declarations `unsafe` so that programmers are aware of the dangers and can give extern blocks the attention they deserve is the minimum step. # Prior art [prior-art]: #prior-art @@ -116,7 +109,7 @@ None we are aware of. # Unresolved questions [unresolved-questions]: #unresolved-questions -* Extern declarations are actually *always* unsafe and able to cause UB regardless of edition. This RFC doesn't have a specific answer on how to improve pre-2024 code. +* Extern declarations are actually *always* unsafe and able to cause UB regardless of edition. 
This RFC doesn't have a specific answer on how to improve pre-2024 code. # Future possibilities [future-possibilities]: #future-possibilities From b423b2bfff110ba9872c6fd85fa0ccae471cc2f7 Mon Sep 17 00:00:00 2001 From: Travis Cross Date: Mon, 6 May 2024 20:28:27 +0000 Subject: [PATCH 19/29] Lowercase "undefined behavior" The term "undefined behavior" is not a proper noun, so let's make this lowercase. --- text/0000-unsafe-extern-blocks.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-unsafe-extern-blocks.md b/text/0000-unsafe-extern-blocks.md index 3291cc25eac..2cebc9fd55c 100644 --- a/text/0000-unsafe-extern-blocks.md +++ b/text/0000-unsafe-extern-blocks.md @@ -11,7 +11,7 @@ In Edition 2024 it is `unsafe` to declare an `extern` function or static, but ex # Motivation [motivation]: #motivation -Simply declaring extern items, even without ever using them, can cause Undefined Behavior (see, e.g., issue [#46188][]). When performing cross-language compilation, attributes on one function declaration can flow to the foreign declaration elsewhere within LLVM and cause a miscompilation. In Rust we consider all sources of Undefined Behavior to be `unsafe`, and so we must make declaring extern blocks be `unsafe`. The up-side to this change is that in the new style it will be possible to declare an extern fn that's safe to call after the initial unsafe declaration. +Simply declaring extern items, even without ever using them, can cause undefined behavior (see, e.g., issue [#46188][]). When performing cross-language compilation, attributes on one function declaration can flow to the foreign declaration elsewhere within LLVM and cause a miscompilation. In Rust we consider all sources of undefined behavior to be `unsafe`, and so we must make declaring extern blocks be `unsafe`. The up-side to this change is that in the new style it will be possible to declare an extern fn that's safe to call after the initial unsafe declaration. 
[#46188]: https://github.com/rust-lang/rust/issues/46188 From 09a088c1130ffe56350df8cda16fa45d0d260439 Mon Sep 17 00:00:00 2001 From: Travis Cross Date: Mon, 6 May 2024 23:02:08 +0000 Subject: [PATCH 20/29] Address feedback and questions As this RFC was reviewed in the GitHub thread, many alternatives were proposed and questions raised. As those of us who were a bit too close to it were cursed with knowledge, the rationale for rejecting these alternatives were not fully articulated and not all of these questions were clearly answered. Let's better document these alternatives, the rationale for rejecting each, and the answers to various known questions. (Thanks to GoldsteinE, madsmtm, kennytm, samih, and tmccombs for raising these alternatives and questions.) --- text/0000-unsafe-extern-blocks.md | 197 +++++++++++++++++++++++++++++- 1 file changed, 194 insertions(+), 3 deletions(-) diff --git a/text/0000-unsafe-extern-blocks.md b/text/0000-unsafe-extern-blocks.md index 2cebc9fd55c..b136e26b831 100644 --- a/text/0000-unsafe-extern-blocks.md +++ b/text/0000-unsafe-extern-blocks.md @@ -96,11 +96,198 @@ Extern statics can be either immutable or mutable just like statics outside of e This change will induce some churn. Hopefully, allowing people to safely call some foreign functions will make up for that. -# Rationale and alternatives -[rationale-and-alternatives]: #rationale-and-alternatives +# Rationale +[rationale]: #rationale Incorrect extern declarations can cause UB in current Rust, but we have no way to automatically check that all declarations are correct, nor is such a thing likely to be developed. Making the declarations `unsafe` so that programmers are aware of the dangers and can give extern blocks the attention they deserve is the minimum step. 
+# Alternatives +[alternatives]: #alternatives + +## Don't prefix `extern` with `unsafe` + +One could ask, why not allow each item within an `extern` block to be prefixed with either `safe` or `unsafe`, but not prefix `extern` with `unsafe`? E.g.: + +```rust +extern { + pub safe fn sqrt(x: f64) -> f64; + pub unsafe fn strlen(p: *const c_char) -> usize; +} +``` + +Here's the problem with this. The programmer is asserting that these signatures are correct, but this assertion cannot be checked by the compiler. The human must simply get these correct, and if that person doesn't, then calling either of these functions, even the one marked `safe`, may result in undefined behavior. In fact, as explained in the motivation, due to current LLVM behavior, simply *writing* this `extern` block with incorrect signatures can lead to undefined behavior in the resulting program, even if these functions are never called by Rust code. + +In Rust, we use `unsafe { .. }` (and, as of [RFC 3325][], `unsafe(..)`) to indicate that what is enclosed must be proven correct by the programmer to avoid undefined behavior. This RFC extends this pattern to `extern` blocks. + +[RFC 3325]: https://github.com/rust-lang/rfcs/pull/3325 + +## Don't prefix `extern` with `unsafe` and support `unsafe` items only + +One could ask, why not support only `unsafe` items within `extern` blocks and then not require those blocks to be marked `unsafe`? E.g.: + +```rust +extern { + pub unsafe fn sqrt(x: f64) -> f64; + pub unsafe fn strlen(p: *const c_char) -> usize; +} +``` + +One could argue that, since an `unsafe { .. }` block must be used to call either of these functions, this is OK. + +There are three problems with this. + +One, as mentioned above, simply *writing* this `extern` block, if the signatures are incorrect, can cause undefined behavior in the resulting program, even if these functions are never called by Rust code.
Saying `unsafe` in the signature of each item only indicates that the caller must uphold certain unchecked invariants; it does not correctly capture the semantics of this kind of unsafety. + +Two, we have to think about *who* is responsible for discharging the obligation of ensuring that these signatures are correct. Is it the responsibility of a *caller* to these functions to ensure the signatures are correct? That would seem unreasonable. So even though the caller has to write `unsafe { .. }` to call these functions, and even setting aside the issue that calling the functions from Rust is not required to produce undefined behavior, this suggests that the `extern` *itself* should be somehow marked with or wrapped in `unsafe`. + +Three, not allowing items to be marked as `safe` would remove one of the key tangible *benefits* that the changes in this RFC provide to users. This would reduce the motivation to make this change at all. + +## Prefix only `extern` with `safe` or `unsafe` + +One could ask, why not prefix *only* `extern` with `safe` or `unsafe`? E.g.: + +```rust +safe extern { + pub fn sqrt(x: f64) -> f64; +} +unsafe extern { + pub fn strlen(p: *const c_char) -> usize; +} +``` + +The problem with this, as explained in the last two sections, is that the person who writes the `extern` block must discharge an unchecked obligation of proving that the signatures are correct. This must be proven by the programmer even for the `sqrt` function. One purpose of this RFC is to flag this obligation with `unsafe`. This variation would fail to do that. + +## Wrap `extern` in `unsafe { .. }` + +Semantically, what we're trying to express is probably most precisely represented by syntax such as: + +```rust +unsafe { extern { + pub safe fn sqrt(x: f64) -> f64; + pub unsafe fn strlen(p: *const c_char) -> usize; +}} +``` + +However, we currently don't support `unsafe { .. }` blocks at the item level, and the extra set of braces and indentation would seem unfortunate here.
One way to think of `unsafe extern { .. }` is exactly as above, but with the braces elided. + +## Don't add the `safe` contextual keyword, flip the default + +One could ask, why include the `safe` contextual keyword at all? Why not just *assume* that within an `unsafe extern` block that items not marked as `unsafe` are in fact safe to call (as is true elsewhere in Rust)? E.g.: + +```rust +unsafe extern { + pub fn sqrt(x: f64) -> f64; // Safe to call. + pub unsafe fn strlen(p: *const c_char) -> usize; +} +``` + +This was in fact the original proposal. The reason we did not end up adopting this was to reduce the churn that users would experience and to make the transition more incremental. + +Consider that all `extern` blocks today look like this: + +```rust +extern { + pub fn sqrt(x: f64) -> f64; // Unsafe to call. + pub fn strlen(p: *const c_char) -> usize; // Unsafe to call. + // Many more items follow... +} +``` + +We want users to be able to adopt the new syntax by changing just one line, e.g.: + +```rust +unsafe extern { // <--- We added `unsafe` here. + pub fn sqrt(x: f64) -> f64; // Unsafe to call. + pub fn strlen(p: *const c_char) -> usize; // Unsafe to call. + // Many more items follow... +} +``` + +If we had made it so that writing `unsafe extern` flipped the default and made each item safe to call, then users would, upon making this change, have to simultaneously examine each item to determine whether it should be safe to call (or, at least, would have to conservatively add `unsafe` to each item). We wanted to avoid this. + +Still, the user gets immediate *benefit* out of this change, because the user can now *incrementally* mark items as safe to call, e.g.: + +```rust +unsafe extern { + pub safe fn sqrt(x: f64) -> f64; // <--- We added `safe` here. + pub fn strlen(p: *const c_char) -> usize; // Unsafe to call. + // Many more items follow... 
+} +``` + +We may or may not, in a later edition, decide to switch the default and thereby make the `safe` contextual keyword redundant. Either way, adding the `safe` keyword makes the migration more straightforward while delivering value to users and better indicating where users must make a correctness assertion to the compiler. + +## Don't add the `safe` contextual keyword, keep the default + +One could ask, why not allow but not require items within an `unsafe extern` block to be prefixed with `unsafe`, but not support prefixing items with `safe`, and treat items not prefixed as `unsafe`? E.g.: + +```rust +unsafe extern { + pub fn sqrt(x: f64) -> f64; // Unsafe to call. + pub unsafe fn strlen(p: *const c_char) -> usize; +} +``` + +Doing this would eliminate one of the key tangible benefits of this RFC, which is allowing users to express that an item declared within an `unsafe extern` block is in fact sound to use directly in safe code. + +While we could, in a later edition, perhaps flip the default to make items safe to call, we could only do that if enough code has already been migrated. But in the interim, we'd be asking users to accept the churn of migrating to this syntax without receiving any of the benefits. That seems a bit like a cyclic dependency, so we've chosen not to do that. + +## Require all items to be marked as either `safe` or `unsafe` + +One could ask, why not require all items within an `unsafe extern` block to be marked as either `safe` or `unsafe` rather than making this optional? Or alternatively, one could ask, why not *only* allow items to be marked as `unsafe` and *require* that all items be marked `unsafe`? + +As described in the last section, doing this would lead to a worse migration story for users, and so we chose not to do this. + +## Wait until we switch to `safe { .. }` blocks + +One could ask, why not wait to do this at all until we switch the language to use `safe { .. }` rather than `unsafe { .. 
}` blocks and then align this RFC with that? + +The problem with this is that there is no current plan to make such a switch. Waiting to improve the language on a possibility that may or may not happen -- and in any case, will not happen soon -- is usually not a good plan. + +## Use `trusted` as the contextual keyword + +One could ask, why not use `trusted` rather than `safe` as the contextual keyword? E.g.: + +```rust +unsafe extern { + pub trusted fn sqrt(x: f64) -> f64; // Safe to call. + pub unsafe fn strlen(p: *const c_char) -> usize; +} +``` + +The Rust language already has an accepted semantic for "safe" and "unsafe". If we were to introduce a separate "trusted" concept, that would need to be part of a larger plan. Such a plan does not yet exist in any concrete form, and it's not clear at this point whether any plan along these lines will succeed in gaining consensus. Waiting to deliver value here on that possibility seems like a bad plan. + +If we later decide, e.g., to replace all uses of `unsafe { .. }` with `trusted { .. }`, large amounts of code would need to be changed in that migration. Changing from `safe fn` to `trusted fn` as part of that, as this RFC would require, doesn't seem like it would make that migration markedly more painful. + +## Fire the `unsafe_code` lint for `extern` blocks also + +This RFC specifies that the `unsafe_code` lint will fire for `unsafe extern` but not for `extern` blocks. One could ask, why not fire this for `extern` blocks also? + +The problem with doing this is that it may be very noisy. We're careful when expanding the meaning of existing lints to not create too much noise, and doing this, at least immediately, may run afoul of this. + +# Questions and answers +[q-and-a]: #q-and-a + +## Why do we want to mark `extern` blocks as `unsafe`? + +In *safe* Rust, we want the compiler to *prove* that all code is *sound* and therefore cannot exhibit undefined behavior.
However, for some things, the compiler cannot complete this proof without help from the programmer. When the programmer must make assertions that cannot be checked by the compiler to preserve soundness, we call this *unsafe* Rust. We use the `unsafe` keyword to designate places where the programmer has this proof obligation. + +In the past, `extern` blocks have been an exception to this. Programmers are required to prove that these blocks are correct, and the compiler has no way of checking this, but we had not yet thought to write `unsafe` here. This RFC closes that gap. + +## Is adding this feature going to break people's existing code on existing editions? + +No. Rust has a stability guarantee that is outlined in [RFC 1122][]. Adding this feature does not break any existing code on existing editions when updating to newer versions of the Rust compiler. + +[RFC 1122]: https://github.com/rust-lang/rfcs/pull/1122 + +## Will `extern` blocks not marked `unsafe extern` fire the `unsafe_code` lint? + +No. This RFC specifies that `unsafe extern` blocks will fire this lint. There are no such blocks in the ecosystem today, so people who have `#![forbid(unsafe_code)]` will only newly encounter this lint when switching a block from `extern` to `unsafe extern`. + +## Does this RFC require all items in an `unsafe extern` block to be marked `safe` or `unsafe`? + +No. This RFC allows for items within an `unsafe extern` block to not be marked with either of `safe` or `unsafe`. Items that are not marked in either way are assumed to be `unsafe`. + # Prior art [prior-art]: #prior-art @@ -114,4 +301,8 @@ None we are aware of. # Future possibilities [future-possibilities]: #future-possibilities -None are apparent at this time. +## Interaction with extern types + +If we were to later accept [RFC 3396][] ("Extern types v2"), that would introduce `type` items into `extern` blocks, and the interaction between those items, this RFC, and the `unsafe_code` lint would need to be addressed. 
+ +[RFC 3396]: https://github.com/rust-lang/rfcs/pull/3396 From ca7713cba6bdc036935d55c957a9e6cae4b0c07f Mon Sep 17 00:00:00 2001 From: Travis Cross Date: Tue, 7 May 2024 06:50:21 +0000 Subject: [PATCH 21/29] Add alternative of fixing LLVM (if it is a fix) One possibility we should mention is that of changing the behavior of LLVM and then not adding `unsafe extern`, so let's mention that. (Thanks to RalfJ for raising this point.) --- text/0000-unsafe-extern-blocks.md | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/text/0000-unsafe-extern-blocks.md b/text/0000-unsafe-extern-blocks.md index b136e26b831..93b3863315c 100644 --- a/text/0000-unsafe-extern-blocks.md +++ b/text/0000-unsafe-extern-blocks.md @@ -142,6 +142,18 @@ Two, we have to think about *who* is responsible for discharging the obligation Three, not allowing items to be marked as `safe` would remove one of the key tangible *benefits* that the changes in this RFC provide to users. This would reduce the motivation to make this change at all. +## Fix LLVM and don't prefix `extern` with `unsafe` + +One could ask, why not fix LLVM such that incorrect signatures in an `extern` block would not result in undefined behavior in the resulting program unless those items were used in Rust code, and then not add `unsafe extern`? + +There are three problems with this. + +One, it's not entirely clear that it's feasible to fix LLVM in this way. Moreover, it's still a bit unclear to us whether or not this behavior is allowed by the C standard. If it is allowed, then LLVM does not, arguably, need to be fixed at all. + +Two, even if the C standard does not permit what LLVM is doing and it proves feasible to fix LLVM, we still, as described above, believe that it's unreasonable to expect that *callers* to a function declared in an `extern` block should have to prove that the signature is correct. 
We want the obligation of proving this to sit with the person writing the `extern` block, not the person calling a function declared within. + +Three, if we were to say that the proof obligation of ensuring the signature of an item declared within an `extern` block rests with the person *using* that item, then we could never declare some items within an `extern` to be OK to use directly from safe code. This is something we want to allow, and the only way to do this is if the proof obligation rests with the person writing the `extern` block. Marking these blocks with `unsafe` more clearly signals who holds this proof obligation. + ## Prefix only `extern` with `safe` or `unsafe` One could ask, who not prefix *only* `extern` with `safe` or `unsafe`? E.g.: From efc671cf0862ad547a85a5d92762e819c9c88b72 Mon Sep 17 00:00:00 2001 From: Travis Cross Date: Tue, 7 May 2024 15:38:57 +0000 Subject: [PATCH 22/29] Clarify about fixing LLVM despite C Even if the C standard allows for what LLVM is doing, we could still conceivably fix LLVM. In the text, let's draw this out a bit more finely. (Thanks to RalfJ for raising this point.) --- text/0000-unsafe-extern-blocks.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/0000-unsafe-extern-blocks.md b/text/0000-unsafe-extern-blocks.md index 93b3863315c..1a7f0900833 100644 --- a/text/0000-unsafe-extern-blocks.md +++ b/text/0000-unsafe-extern-blocks.md @@ -148,9 +148,9 @@ One could ask, why not fix LLVM such that incorrect signatures in an `extern` bl There are three problems with this. -One, it's not entirely clear that it's feasible to fix LLVM in this way. Moreover, it's still a bit unclear to us whether or not this behavior is allowed by the C standard. If it is allowed, then LLVM does not, arguably, need to be fixed at all. +One, it's not entirely clear that it's feasible to fix LLVM in this way. Moreover, it's still a bit unclear to us whether or not this behavior is allowed by the C standard. 
If it is allowed, that may make it more challenging to build a consensus in favor of changing it in LLVM. -Two, even if the C standard does not permit what LLVM is doing and it proves feasible to fix LLVM, we still, as described above, believe that it's unreasonable to expect that *callers* to a function declared in an `extern` block should have to prove that the signature is correct. We want the obligation of proving this to sit with the person writing the `extern` block, not the person calling a function declared within. +Two, even if the C standard does not permit what LLVM is doing (or we were otherwise able to build a consensus for change) and it proves feasible to fix LLVM, we still, as described above, believe that it's unreasonable to expect that *callers* to a function declared in an `extern` block should have to prove that the signature is correct. We want the obligation of proving this to sit with the person writing the `extern` block, not the person calling a function declared within. Three, if we were to say that the proof obligation of ensuring the signature of an item declared within an `extern` block rests with the person *using* that item, then we could never declare some items within an `extern` to be OK to use directly from safe code. This is something we want to allow, and the only way to do this is if the proof obligation rests with the person writing the `extern` block. Marking these blocks with `unsafe` more clearly signals who holds this proof obligation. From 2c106c3c7e043fa6772415e08188d98f050466d2 Mon Sep 17 00:00:00 2001 From: Travis Cross Date: Tue, 7 May 2024 16:12:35 +0000 Subject: [PATCH 23/29] Clarify about `unsafe_code` and edition migration When people migrate to the new edition, if they have turned up the severity of the `unsafe_code` lint and they have `extern` blocks that need to be marked `unsafe`, they will see this lint as intended. Let's make a note of that. (Thanks to kennytm for raising this point.) 
--- text/0000-unsafe-extern-blocks.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/text/0000-unsafe-extern-blocks.md b/text/0000-unsafe-extern-blocks.md index 1a7f0900833..9f36adf648a 100644 --- a/text/0000-unsafe-extern-blocks.md +++ b/text/0000-unsafe-extern-blocks.md @@ -275,7 +275,9 @@ If we later decide, e.g., to replace all uses of `unsafe { .. }` with `trusted { This RFC specifies that the `unsafe_code` lint will fire for `unsafe extern` but not for `extern` blocks. One could ask, why not fire this for `extern` blocks also? -The problem with doing this is that it may be very noisy. We're careful when expanding the meaning of existing lints to not create too much noise, and doing this, at least immediately, may run afoul of this. +The problem with doing this is that it may be very noisy on existing editions. We're careful when expanding the meaning of existing lints to not create too much noise for existing code on existing editions, and doing this, at least immediately, may run afoul of this. + +Of course, when migrating code to the *new* edition, people will be changing from `extern` to `unsafe extern`, and so if these people have both specifically turned up the severity of the `unsafe_code` lint (which, by default, is set to `allow`) and have `extern` blocks that now must be marked as `unsafe`, they will see this lint. That is the intention of this change, as we're making clear that the person writing an `unsafe extern` block is responsible for proving that it is correct to ensure soundness, which makes this code *unsafe*. 
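
The migration case described above could look roughly like the following. This is a minimal sketch for illustration, not part of the RFC text. It assumes a toolchain that supports `unsafe extern`, and it assumes that `sqrt` is available from the platform math library at link time. The module name `math`, the explicit `"C"` ABI string, and the `main` function are hypothetical choices made for this example. The module turns `unsafe_code` up to `deny`, and the person writing the block acknowledges the proof obligation with a scoped `allow`.

```rust
// Hypothetical module that has turned up the severity of `unsafe_code`.
#[deny(unsafe_code)]
mod math {
    // The author of this block is asserting that the signature below is
    // correct. The scoped `allow` records that this unsafety is intended.
    #[allow(unsafe_code)]
    unsafe extern "C" {
        // Marked `safe`: any `f64` is an acceptable argument, so callers
        // do not need an `unsafe` block.
        pub safe fn sqrt(x: f64) -> f64;
    }
}

fn main() {
    // No `unsafe` block is needed at the call site.
    let root = math::sqrt(9.0);
    assert_eq!(root, 3.0);
    println!("sqrt(9.0) = {root}");
}
```

The point of the sketch is where the annotation sits: the `allow` and the `unsafe` keyword are both on the block that declares the signature, and the caller writes ordinary safe code.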
# Questions and answers [q-and-a]: #q-and-a From c1192da628933421b7d5957e2f510d46d1d0694f Mon Sep 17 00:00:00 2001 From: Travis Cross Date: Tue, 7 May 2024 17:03:04 +0000 Subject: [PATCH 24/29] Fix typo --- text/0000-unsafe-extern-blocks.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-unsafe-extern-blocks.md b/text/0000-unsafe-extern-blocks.md index 9f36adf648a..e2589da5ebd 100644 --- a/text/0000-unsafe-extern-blocks.md +++ b/text/0000-unsafe-extern-blocks.md @@ -156,7 +156,7 @@ Three, if we were to say that the proof obligation of ensuring the signature of ## Prefix only `extern` with `safe` or `unsafe` -One could ask, who not prefix *only* `extern` with `safe` or `unsafe`? E.g.: +One could ask, why not prefix *only* `extern` with `safe` or `unsafe`? E.g.: ```rust safe extern { From 9f36a9256a340decce6bc1b31dae9dc2bdc34128 Mon Sep 17 00:00:00 2001 From: Travis Cross Date: Tue, 7 May 2024 17:06:49 +0000 Subject: [PATCH 25/29] Remove issue 46188 as a motivation This RFC had used as one motivation that undefined behavior can currently result simply from having an incorrect `extern` block even if the items within are not used from Rust code. This motivation was not cited in the lang team's 2023-10-11 consensus for how and why to proceed with this RFC, and it's possible that we may be able to resolve this issue in other ways, so let's remove that motivation from this RFC. (Thanks to RalfJ for raising this point and suggesting this.) 
--- text/0000-unsafe-extern-blocks.md | 48 ++++++++++++------------------- 1 file changed, 18 insertions(+), 30 deletions(-) diff --git a/text/0000-unsafe-extern-blocks.md b/text/0000-unsafe-extern-blocks.md index e2589da5ebd..67e380ddc72 100644 --- a/text/0000-unsafe-extern-blocks.md +++ b/text/0000-unsafe-extern-blocks.md @@ -11,9 +11,11 @@ In Edition 2024 it is `unsafe` to declare an `extern` function or static, but ex # Motivation [motivation]: #motivation -Simply declaring extern items, even without ever using them, can cause undefined behavior (see, e.g., issue [#46188][]). When performing cross-language compilation, attributes on one function declaration can flow to the foreign declaration elsewhere within LLVM and cause a miscompilation. In Rust we consider all sources of undefined behavior to be `unsafe`, and so we must make declaring extern blocks be `unsafe`. The up-side to this change is that in the new style it will be possible to declare an extern fn that's safe to call after the initial unsafe declaration. +When we declare the signature of items within `extern` blocks, we are asserting to the compiler that these declarations are correct. The compiler cannot itself verify these assertions. If the signatures we declare are in fact not correct, then using these items may result in undefined behavior. It's *unreasonable* to expect the *caller* (in the case of function items) to have to prove that the signature is valid. Instead, it's the responsibility of the person writing the `extern` block to ensure the correctness of all signatures within. -[#46188]: https://github.com/rust-lang/rust/issues/46188 +Since this proof obligation must be discharged at the site of the `extern` block, and since this proof cannot be checked by the compiler, this implies that `extern` blocks are *unsafe*. Correspondingly, we want to mark these blocks with the `unsafe` keyword and fire the `unsafe_code` lint for them. 
+ +By making clear where this proof obligation sits, we can now allow for items that can be soundly used directly from *safe* code to be declared within `unsafe extern` blocks. # Guide-level explanation [guide-level-explanation]: #guide-level-explanation @@ -96,11 +98,6 @@ Extern statics can be either immutable or mutable just like statics outside of e This change will induce some churn. Hopefully, allowing people to safely call some foreign functions will make up for that. -# Rationale -[rationale]: #rationale - -Incorrect extern declarations can cause UB in current Rust, but we have no way to automatically check that all declarations are correct, nor is such a thing likely to be developed. Making the declarations `unsafe` so that programmers are aware of the dangers and can give extern blocks the attention they deserve is the minimum step. - # Alternatives [alternatives]: #alternatives @@ -115,7 +112,7 @@ extern { } ``` -Here's the problem with this. The programmer is asserting that these signatures are correct, but this assertion cannot be checked by the compiler. The human must simply get these correct, and if that person doesn't, then calling either of these functions, even the one marked `safe`, may result in undefined behavior. In fact, as explained in the motivation, due to current LLVM behavior, simply *writing* this `extern` block with incorrect signatures can lead to undefined behavior in the resulting program, even if these functions are never called by Rust code. +Here's the problem with this. The programmer is asserting that these signatures are correct, but this assertion cannot be checked by the compiler. The human must simply get these correct, and if that person doesn't, then calling either of these functions, even the one marked `safe`, may result in undefined behavior. In Rust, we use `unsafe { .. }` (and, as of [RFC 3325][], `unsafe(..)`) to indicate that what is enclosed must be proven correct by the programmer to avoid undefined behavior. 
This RFC extends this pattern to `extern` blocks. @@ -134,25 +131,11 @@ extern { One could argue that, since an `unsafe { .. }` block must be used to call either of these functions, that this is OK. -There are three problems with this. - -One, as mentioned above, simply *writing* this `extern` block, if the signatures are incorrect, can cause undefined behavior in the resulting program, even if these functions are never called by Rust code. Saying `unsafe` in the signature of each item only indicates that the caller must uphold certain unchecked invariants; it does not correctly capture the semantics of this kind of unsafety. - -Two, we have to think about *who* is responsible for discharging the obligation of ensuring that these signatures are correct. Is it the responsibility of a *caller* to these functions to ensure the signatures are correct? That would seem unreasonable. So even though the caller has to write `unsafe { .. }` to call these functions, and even setting aside the issue that calling the functions from Rust is not required to produce undefined behavior, this suggests that the `extern` *itself* should be somehow marked with or wrapped in `unsafe`. - -Three, not allowing items to be marked as `safe` would remove one of the key tangible *benefits* that the changes in this RFC provide to users. This would reduce the motivation to make this change at all. - -## Fix LLVM and don't prefix `extern` with `unsafe` - -One could ask, why not fix LLVM such that incorrect signatures in an `extern` block would not result in undefined behavior in the resulting program unless those items were used in Rust code, and then not add `unsafe extern`? - -There are three problems with this. +There are two problems with this. -One, it's not entirely clear that it's feasible to fix LLVM in this way. Moreover, it's still a bit unclear to us whether or not this behavior is allowed by the C standard. 
If it is allowed, that may make it more challenging to build a consensus in favor of changing it in LLVM. +One, we have to think about *who* is responsible for discharging the obligation of ensuring that these signatures are correct. Is it the responsibility of a *caller* to these functions to ensure the signatures are correct? That would seem unreasonable. So even though the caller has to write `unsafe { .. }` to call these functions, this suggests that the `extern` *itself* should be somehow marked with or wrapped in `unsafe`. -Two, even if the C standard does not permit what LLVM is doing (or we were otherwise able to build a consensus for change) and it proves feasible to fix LLVM, we still, as described above, believe that it's unreasonable to expect that *callers* to a function declared in an `extern` block should have to prove that the signature is correct. We want the obligation of proving this to sit with the person writing the `extern` block, not the person calling a function declared within. - -Three, if we were to say that the proof obligation of ensuring the signature of an item declared within an `extern` block rests with the person *using* that item, then we could never declare some items within an `extern` to be OK to use directly from safe code. This is something we want to allow, and the only way to do this is if the proof obligation rests with the person writing the `extern` block. Marking these blocks with `unsafe` more clearly signals who holds this proof obligation. +Two, not allowing items to be marked as `safe` would remove one of the key tangible *benefits* that the changes in this RFC provide to users. This would reduce the motivation to make this change at all. ## Prefix only `extern` with `safe` or `unsafe` @@ -302,16 +285,21 @@ No. This RFC specifies that `unsafe extern` blocks will fire this lint. There No. This RFC allows for items within an `unsafe extern` block to not be marked with either of `safe` or `unsafe`. 
Items that are not marked in either way are assumed to be `unsafe`. +## What's the #46188 situation? + +Currently, an `extern` block with incorrect signatures can result in a program exhibiting undefined behavior even if none of the items within that block are used by Rust code. See, e.g., [#46188][]. + +Originally, the possibility of this undefined behavior was one of the motivations for this RFC. However, it's possible that we may be able to resolve this in other ways, so we have redrafted this RFC to exclude this as a motivation. + +The key motivation for this RFC is to make clear that the person writing an `extern` block is responsible for proving the correctness of the signatures within and that the compiler cannot check this proof. + +[#46188]: https://github.com/rust-lang/rust/issues/46188 + # Prior art [prior-art]: #prior-art None we are aware of. -# Unresolved questions -[unresolved-questions]: #unresolved-questions - -* Extern declarations are actually *always* unsafe and able to cause UB regardless of edition. This RFC doesn't have a specific answer on how to improve pre-2024 code. - # Future possibilities [future-possibilities]: #future-possibilities From c19839632db10061c3f54a00863f7d18adcf1f94 Mon Sep 17 00:00:00 2001 From: Travis Cross Date: Tue, 7 May 2024 17:09:42 +0000 Subject: [PATCH 26/29] Remove unused "prior art" section --- text/0000-unsafe-extern-blocks.md | 5 ----- 1 file changed, 5 deletions(-) diff --git a/text/0000-unsafe-extern-blocks.md b/text/0000-unsafe-extern-blocks.md index 67e380ddc72..c32dde44f2a 100644 --- a/text/0000-unsafe-extern-blocks.md +++ b/text/0000-unsafe-extern-blocks.md @@ -295,11 +295,6 @@ The key motivation for this RFC is to make clear that the person writing an `ext [#46188]: https://github.com/rust-lang/rust/issues/46188 -# Prior art -[prior-art]: #prior-art - -None we are aware of. 
- # Future possibilities [future-possibilities]: #future-possibilities From 39795a03fa8237619a02ae5f8ff3376118e597b8 Mon Sep 17 00:00:00 2001 From: Travis Cross Date: Tue, 7 May 2024 17:12:31 +0000 Subject: [PATCH 27/29] Fix optionality of `safe`/`unsafe` in guide section The 2023-10-11 consensus from lang on how to move forward included that the `safe` and `unsafe` keywords would be optional within an `unsafe extern` block. The reference-level section correctly followed this consensus, but the guide-level section did not, and suggested that annotating items with these keywords was required. Let's fix that. --- text/0000-unsafe-extern-blocks.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-unsafe-extern-blocks.md b/text/0000-unsafe-extern-blocks.md index c32dde44f2a..6dfccf44fba 100644 --- a/text/0000-unsafe-extern-blocks.md +++ b/text/0000-unsafe-extern-blocks.md @@ -30,7 +30,7 @@ An `extern` block can be placed anywhere a function declaration could appear (ge Within an `extern` block is zero or more declarations of external functions and/or external static values. An extern function is declared with a `;` instead of a function body (similar to a method of a trait). An extern static value is also declared with a `;` instead of an expression (similar to an associated const of a trait). In both cases, the actual function body or value is provided by whatever external source (which is probably not even written in Rust). -When an `unsafe extern` block is used, all declarations within that `extern` block *must* have the `unsafe` or `safe` keywords as part of their signature. The `safe` keyword is a contextual keyword; it is currently allowed only within `extern` blocks. +Declarations within an `unsafe extern` block *may* annotate their signatures with either `safe` or `unsafe`. If a signature within the block is not annotated, it is assumed to be `unsafe`. 
The `safe` keyword is contextual and is currently allowed only within `extern` blocks. If an `extern` block is used in an older edition without the `unsafe` keyword, declarations *cannot* specify `safe` or `unsafe`. Code must update to `unsafe extern` style blocks if it wants to make `safe` declarations. From 3b6ae2bbf94068f8a4cb3b5145f3653443b6535d Mon Sep 17 00:00:00 2001 From: Travis Cross Date: Sun, 19 May 2024 22:36:40 +0000 Subject: [PATCH 28/29] Prepare RFC 3484 to be merged The FCP for RFC 3484 has completed with a disposition to merge. Let's prepare to merge it. --- ...0-unsafe-extern-blocks.md => 3484-unsafe-extern-blocks.md} | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) rename text/{0000-unsafe-extern-blocks.md => 3484-unsafe-extern-blocks.md} (99%) diff --git a/text/0000-unsafe-extern-blocks.md b/text/3484-unsafe-extern-blocks.md similarity index 99% rename from text/0000-unsafe-extern-blocks.md rename to text/3484-unsafe-extern-blocks.md index 6dfccf44fba..8b6689e6baa 100644 --- a/text/0000-unsafe-extern-blocks.md +++ b/text/3484-unsafe-extern-blocks.md @@ -1,7 +1,7 @@ - Feature Name: `unsafe_extern` - Start Date: 2023-05-23 -- RFC PR: [rust-lang/rfcs#0000](https://github.com/rust-lang/rfcs/pull/0000) -- Rust Issue: [rust-lang/rust#0000](https://github.com/rust-lang/rust/issues/0000) +- RFC PR: [rust-lang/rfcs#3484](https://github.com/rust-lang/rfcs/pull/3484) +- Tracking Issue: [rust-lang/rust#123743](https://github.com/rust-lang/rust/issues/123743) # Summary [summary]: #summary From 45590fe36d754f5062b77a5d4be5b0041ef2e93e Mon Sep 17 00:00:00 2001 From: Travis Cross Date: Sun, 19 May 2024 23:34:41 +0000 Subject: [PATCH 29/29] Do some copyediting We had earlier made clarifying edits to many sections, but not to all of them. Let's clarify some text in these remaining sections. 
--- text/3484-unsafe-extern-blocks.md | 28 +++++++++++++--------------- 1 file changed, 13 insertions(+), 15 deletions(-) diff --git a/text/3484-unsafe-extern-blocks.md b/text/3484-unsafe-extern-blocks.md index 8b6689e6baa..5b7af0fe3d1 100644 --- a/text/3484-unsafe-extern-blocks.md +++ b/text/3484-unsafe-extern-blocks.md @@ -6,7 +6,7 @@ # Summary [summary]: #summary -In Edition 2024 it is `unsafe` to declare an `extern` function or static, but external functions and statics *can* be safe to use after the initial declaration. +It is unsafe to declare an `extern` block. Starting in Rust 2024, all `extern` blocks must be marked as `unsafe`. In all editions, items within `unsafe extern` blocks may be marked as safe to use. # Motivation [motivation]: #motivation @@ -20,23 +20,23 @@ By making clear where this proof obligation sits, we can now allow for items tha # Guide-level explanation [guide-level-explanation]: #guide-level-explanation -Rust can utilize functions and statics from foreign code that are provided during linking, though it is `unsafe` to do so. +Rust code can use functions and statics from foreign code. The type signatures of these foreign items must be provided by the programmer in `extern` blocks. These blocks must contain correct signatures to avoid undefined behavior. The Rust compiler cannot check the correctness of the signatures in these blocks, so writing these blocks is *unsafe*. -An `extern` block can be placed anywhere a function declaration could appear (generally at the top level of a module). +An `extern` block may be placed anywhere a function declaration may appear. - On editions >= 2024, you *must* write all `extern` blocks as `unsafe extern`. -- On editions < 2024, you *may* write `unsafe extern`, or you can write an `extern` block without the `unsafe` keyword. Writing an `extern` block without the `unsafe` keyword is provided for compatibility only, and will eventually generate a warning. 
-- `unsafe extern` interacts with the `unsafe_code` lint, and a `deny` or `forbid` with that lint will deny or forbid the unsafe external block. +- On editions < 2024, you *may* write `unsafe extern`, or you may write an `extern` block without the `unsafe` keyword. Writing an `extern` block without the `unsafe` keyword is provided for compatibility only, and will eventually generate a warning. +- Use of `unsafe extern`, in all editions, fires the `unsafe_code` lint. -Within an `extern` block is zero or more declarations of external functions and/or external static values. An extern function is declared with a `;` instead of a function body (similar to a method of a trait). An extern static value is also declared with a `;` instead of an expression (similar to an associated const of a trait). In both cases, the actual function body or value is provided by whatever external source (which is probably not even written in Rust). +Within an `extern` block are zero or more declarations of external functions and/or external statics. An extern function is declared with a `;` (semicolon) instead of a function body (similar to a method of a trait). An extern static value is also declared with a `;` (semicolon) instead of an expression (similar to an associated const of a trait). In both cases, the actual function body or value is provided by some external source. Declarations within an `unsafe extern` block *may* annotate their signatures with either `safe` or `unsafe`. If a signature within the block is not annotated, it is assumed to be `unsafe`. The `safe` keyword is contextual and is currently allowed only within `extern` blocks. -If an `extern` block is used in an older edition without the `unsafe` keyword, declarations *cannot* specify `safe` or `unsafe`. Code must update to `unsafe extern` style blocks if it wants to make `safe` declarations. 
+If an `extern` block is used in an older edition without the `unsafe` keyword, item declarations *may not* specify `safe` or `unsafe`. Code must update to `unsafe extern` to make `safe` item declarations. ```rust unsafe extern { - // sqrt (from libm) can be called with any `f64` + // sqrt (from libm) may be called with any `f64` pub safe fn sqrt(x: f64) -> f64; // strlen (from libc) requires a valid pointer, @@ -52,11 +52,9 @@ unsafe extern { } ``` -`extern` blocks are `unsafe` because if the declaration doesn't match the actual external function, or the actual external data, then the behavior of the resulting program may be undefined. +Once unsafely declared, a `safe` item within an `unsafe extern` block may be used directly from safe Rust code. The unsafe obligation of ensuring that the signature is correct is discharged by the block that declares the signature for the item. -Once they are unsafely declared, a `safe` item can be used outside the `extern` block as if it were any other safe function or static value declared within rust. The unsafe obligation of ensuring that the correct items are being linked to is performed by the crate making the declaration, not the crate using that declaration. - -Items declared as `unsafe` *must* still have a correctly matching signature at compile time, but they *also* have some sort of additional obligation for correct usage at runtime. They can only be used within an `unsafe` block. +When an item is declared as `unsafe`, as is usual in Rust, that means that the caller (or, in general, the user) may need to uphold certain unchecked obligations so as to prevent undefined behavior, and consequently that the item may only be used within an `unsafe` block. However, the `extern` block (not the caller or other user) is still responsible for ensuring that the signature of that item is correct. 
# Reference-level explanation [reference-level-explanation]: #reference-level-explanation @@ -64,7 +62,7 @@ Items declared as `unsafe` *must* still have a correctly matching signature at c The grammar of the language is updated so that: - Editions >= 2024 *must* prefix all `extern` blocks with `unsafe`. -- Editions < 2024 *should* prefix `extern` blocks with `unsafe`, this will eventually be a warn-by-default compatibility lint when `unsafe` is missing. +- Editions < 2024 *should* prefix `extern` blocks with `unsafe`; this will eventually be a warn-by-default compatibility lint when `unsafe` is missing. This RFC replaces the *["functions"][]* and *["statics"][]* sections in the [external blocks][] chapter of the Rust Reference with the following: @@ -74,7 +72,7 @@ This RFC replaces the *["functions"][]* and *["statics"][]* sections in the [ext ### Functions -Functions within external blocks are declared in the same way as other Rust functions, with the exception that they must not have a body and are instead terminated by a semicolon. Patterns are not allowed in parameters, only IDENTIFIER or _ may be used. The function qualifiers `const`, `async`, and `extern` are not allowed. If the function is unsafe to call, then the function should use the `unsafe` qualifier. If the function is safe to call, then the function should use the `safe` qualifier (a contextual keyword). Functions that are not qualified as `unsafe` or `safe` are assumed to be `unsafe`. +Functions within external blocks are declared in the same way as other Rust functions, with the exception that they must not have a body and are instead terminated by a semicolon. Patterns are not allowed in parameters, only `IDENTIFIER` or `_` (underscore) may be used. The function qualifiers `const`, `async`, and `extern` are not allowed. If the function is unsafe to call, then the function should use the `unsafe` qualifier. 
If the function is safe to call, then the function should use the `safe` qualifier (a contextual keyword). Functions that are not qualified as `unsafe` or `safe` are assumed to be `unsafe`. If the function signature declared in Rust is incompatible with the function signature as declared in the foreign code, the behavior of the resulting program may be undefined. @@ -91,7 +89,7 @@ where `'l1`, ..., `'lm` are its lifetime parameters, `A1`, ..., `An` are the dec Statics within external blocks are declared in the same way as statics outside of external blocks, except that they do not have an expression initializing their value. If the static is unsafe to access, then the static should use the `unsafe` qualifier. If the static is safe to access (and immutable), then the static should use the `safe` qualifier (a contextual keyword). Statics that are not qualified as `unsafe` or `safe` are assumed to be `unsafe`. -Extern statics can be either immutable or mutable just like statics outside of external blocks. An immutable static must be initialized before any Rust code is executed. It is not enough for the static to be initialized before Rust code reads from it. A mutable extern static is always `unsafe` to access, the same as a Rust mutable static, and as such can not be marked with a `safe` qualifier. +Extern statics may be either immutable or mutable just like statics outside of external blocks. An immutable static must be initialized before any Rust code is executed. It is not enough for the static to be initialized before Rust code reads from it. A mutable extern static is always `unsafe` to access, the same as a Rust mutable static, and as such may not be marked with a `safe` qualifier. # Drawbacks [drawbacks]: #drawbacks