Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tracking Issue for str_from_raw_parts #119206

Open
1 of 4 tasks
Sky9x opened this issue Dec 22, 2023 · 5 comments
Open
1 of 4 tasks

Tracking Issue for str_from_raw_parts #119206

Sky9x opened this issue Dec 22, 2023 · 5 comments
Labels
C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.

Comments

@Sky9x
Copy link
Contributor

Sky9x commented Dec 22, 2023

Feature gate: #![feature(str_from_raw_parts)]

This is a tracking issue for ACP#167, which adds core::str::from_raw_parts[_mut].

Public API

// core::str

pub const unsafe fn from_raw_parts<'a>(ptr: *const u8, len: usize) -> &'a str

pub const unsafe fn from_raw_parts_mut<'a>(ptr: *mut u8, len: usize) -> &'a mut str

Steps / History

Unresolved Questions

  • None yet.

Footnotes

  1. https://std-dev-guide.rust-lang.org/feature-lifecycle/stabilization.html

@Sky9x Sky9x added C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. labels Dec 22, 2023
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Jan 26, 2024
Initial implementation of `str::from_raw_parts[_mut]`

ACP (accepted): rust-lang/libs-team#167
Tracking issue: rust-lang#119206

Thanks to `@Kixiron` for previous work on this (rust-lang#107207)

`@rustbot` label +T-libs-api -T-libs
r? `@thomcc`

Closes rust-lang#107207.
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Jan 26, 2024
Initial implementation of `str::from_raw_parts[_mut]`

ACP (accepted): rust-lang/libs-team#167
Tracking issue: rust-lang#119206

Thanks to ``@Kixiron`` for previous work on this (rust-lang#107207)

``@rustbot`` label +T-libs-api -T-libs
r? ``@thomcc``

Closes rust-lang#107207.
rust-timer added a commit to rust-lang-ci/rust that referenced this issue Jan 26, 2024
Rollup merge of rust-lang#119466 - Sky9x:str_from_raw_parts, r=dtolnay

Initial implementation of `str::from_raw_parts[_mut]`

ACP (accepted): rust-lang/libs-team#167
Tracking issue: rust-lang#119206

Thanks to ``@Kixiron`` for previous work on this (rust-lang#107207)

``@rustbot`` label +T-libs-api -T-libs
r? ``@thomcc``

Closes rust-lang#107207.
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Apr 18, 2024
fix: make `str::from_raw_parts_mut` `mut`

`str::from_raw_parts_mut` wasn't actually `mut`
rust-lang#119206
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Apr 18, 2024
fix: make `str::from_raw_parts_mut` `mut`

`str::from_raw_parts_mut` wasn't actually `mut`
rust-lang#119206
rust-timer added a commit to rust-lang-ci/rust that referenced this issue Apr 18, 2024
Rollup merge of rust-lang#124100 - dev-ardi:master, r=workingjubilee

fix: make `str::from_raw_parts_mut` `mut`

`str::from_raw_parts_mut` wasn't actually `mut`
rust-lang#119206
@robertbastian
Copy link
Contributor

I'm not a fan of this. This saves a negligible amount of code but bunches up a whole lot of safety considerations. slice::from_raw_parts has language-level memory safety invariants, and str::from_utf8_unchecked has a single library-level safety invariant (that the slice contains valid UTF-8). In code, these should be justified separately. The example in rust-lang/libs-team#167 was

unsafe {
    let bytes = slice::from_raw_parts(ptr, len);
    str::from_utf8_unchecked(bytes)
}

but it should have really been

// SAFETY: <explain why ptr and len make a valid slice>
let bytes = unsafe { slice::from_raw_parts(ptr, len) };
// SAFETY: <explain why this slice is UTF-8>
unsafe { str::from_utf8_unchecked(bytes) }

I don't see the value of combining these to

// SAFETY: <explain why ptr and len make a valid slice and why this slice is UTF-8>
unsafe { str::from_raw_parts(ptr, len) }

The proposed documentation also says

The pointed-to bytes must be valid UTF-8. If this might not be the case, use str::from_utf8(slice::from_raw_parts(ptr, len)), which will return an Err if the data isn’t valid UTF-8.

which kind of admits that there are two safety considerations that should be handled by two function calls. In the end, docs could just recommend str::from_utf8_unchecked(slice::from_raw_parts(ptr, len)).

@Sky9x
Copy link
Contributor Author

Sky9x commented Jun 19, 2024

This API intends to be the borrowed equivalent the already-stable String::from_raw_parts, and fill in the two missing methods in the table below, bringing the string API more in line with the slice API.

String Slice
& str::from_raw_parts slice::from_raw_parts
&mut str::from_raw_parts_mut slice::from_raw_parts_mut
Owned String::from_raw_parts Vec::from_raw_parts

The documentation could be improved to properly express the safety concerns.

Also note that the ACP was accepted because of the existence of String::from_raw_parts, which by the same argument could be written as String::from_utf8_unchecked(Vec::from_raw_parts(ptr, len, cap)).

If you really don't want these functions to appear in your codebase, consider the clippy::disallowed_methods lint:

# clippy.toml
disallowed-methods = [
    "core::str::from_raw_parts",
    "core::str::from_raw_parts_mut",
    "alloc::string::String::from_raw_parts",
    # ...
]

@robertbastian
Copy link
Contributor

It looks like String::from_raw_parts's safety documentation was at some point copy-pasted from Vec::from_raw_parts's, but it has since diverged and is currently much less detailed. It for example misses the invariant "The allocated size in bytes must be no larger than isize::MAX". This is exactly the issue I'm trying to point out, the documentation basically has to say

Safety

  • Everything in Vec::<u8>::from_raw_parts's safety requirements
  • Everything in String::from_utf8_unchecked's safety requirements

at which point why not just make those two calls?

String::from_raw_parts is one method with this issue, does this justify adding two more?

@scottmcm
Copy link
Member

scottmcm commented Oct 1, 2024

I'll copy my comment from the ACP here, as something potentially still worth discussing at stabilization time:

I'm not convinced that one more function call is "a fairly boilerplate filled task". That's especially true for unsafe code where the hard part is demonstrating that the safety conditions have been met.

Because of that, I think two calls is actually better when the two safety conditions are very different, like they are here.

(This is the same sentiment as #119206 (comment) above.)

@PureWhiteWu
Copy link

Hi, is there any progress on this feature? Seems there's no outstanding questions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

4 participants