Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merge sockaddr_storage_to_addr and SockAddr::from_raw_sockaddr #1504

Closed

Conversation

coolreader18
Copy link
Contributor

@coolreader18 coolreader18 commented Aug 25, 2021

Based on #1496

@coolreader18 coolreader18 marked this pull request as draft August 25, 2021 03:16
@coolreader18 coolreader18 force-pushed the from_raw_sockaddr-2 branch 4 times, most recently from 9026d5f to 1cab7bf Compare August 29, 2021 22:29
@coolreader18 coolreader18 marked this pull request as ready for review September 19, 2021 16:39
@coolreader18
Copy link
Contributor Author

Alright, this is ready now that #1496 is merged.

@asomers
Copy link
Member

asomers commented Sep 19, 2021

Could you please explain again what the motivation is? And it it necessary to replace a safe function with an unsafe one?

@asomers
Copy link
Member

asomers commented Sep 19, 2021

@coolreader18 we need to publish a new release soon. But this PR really ought to be included. Let's try to get it finished.

@coolreader18
Copy link
Contributor Author

coolreader18 commented Sep 19, 2021

So as-is, I'm pretty sure the code that was using sockaddr_storage_to_addr was unsound - most of the time, the sockaddr_storage isn't fully initialized; only the first N bytes are in order to store whatever address, and the rest are left alone (and so if it was uninit() they're uninit). By passing a raw pointer, from_raw_sockaddr only looks at the appropriate bytes according to the family, and so uninitialized bytes are never behind a &-reference (except for when it's unixaddr maybe but that's a different thing).

@coolreader18
Copy link
Contributor Author

Also, it really is inherently unsafe - we cast the bytes directly to structs, and though I think most of them don't have padding or any fancy layout restrictions, if they ever did that should be protected against from having arbitrary bytes in sockaddr_storage.

Also, some addresses you just have a *const sockaddr and no sockaddr_storage, so that should be able to work too. Casting *const sockaddr as &sockaddr_storage is obviously wildly unsafe.

@asomers
Copy link
Member

asomers commented Sep 19, 2021

So as-is, I'm pretty sure the code that was using sockaddr_storage_to_addr was unsound - most of the time, the sockaddr_storage isn't fully initialized; only the first N bytes are in order to store whatever address, and the rest are left alone (and so if it was uninit() they're uninit). By passing a raw pointer, from_raw_sockaddr only looks at the appropriate bytes according to the family, and so uninitialized bytes are never behind a &-reference (except for when it's unixaddr maybe but that's a different thing).

That's not a safety problem, because the public API assumes a fully initialized struct. If the real struct is only part-initialized, then the unsafety is in the caller's code, not sockaddr_storage_to_addr

@asomers
Copy link
Member

asomers commented Sep 19, 2021

Also, it really is inherently unsafe - we cast the bytes directly to structs, and though I think most of them don't have padding or any fancy layout restrictions, if they ever did that should be protected against from having arbitrary bytes in sockaddr_storage.

You're right. It's OK for a safe function to assume that its input structs are correctly initialized. However, since sockaddr_storage_to_addr takes a libc::sockaddr_storage, which has public fields, we technically shouldn't assume that.

Also, some addresses you just have a *const sockaddr and no sockaddr_storage, so that should be able to work too. Casting *const sockaddr as &sockaddr_storage is obviously wildly unsafe.

Yes, but it's equally unsafe whether it happens in sockaddr_storage_to_addr or in the calling code. So I think the current API is better. It means unsafety sometimes, not always.

@coolreader18
Copy link
Contributor Author

If the real struct is only part-initialized, then the unsafety is in the caller's code, not sockaddr_storage_to_addr

No but on master, there are partially uninitialized sockaddr_storages being passed to sockaddr_storage_to_addr:

    unsafe {
        let mut addr = mem::MaybeUninit::uninit();
        let mut len = mem::size_of::<sockaddr_storage>() as socklen_t;

        let ret = libc::getpeername(
            fd,
            addr.as_mut_ptr() as *mut libc::sockaddr,
            &mut len
        );

        Errno::result(ret)?;

        sockaddr_storage_to_addr(&addr.assume_init(), len as usize)
    }

getpeername only has to initialize the first e.g. 4 bytes of the sockaddr_storage for an ipv4 addr. assume_init here is UB for almost every case, unless there's some AF whose address type occupies all 128 bytes of sockaddr_storage.

Yes, but it's equally unsafe whether it happens in sockaddr_storage_to_addr or in the calling code. So I think the current API is better. It means unsafety sometimes, not always.

I actually mean unsound, not unsafe - it's almost always UB to cast &*(*const sockaddr as *const sockaddr_storage). A *const sockaddr could have anything as the backing storage, it could be a 16-byte *const sockaddr_in casted to a *const sockaddr, because socket APIs only read what they know they're allowed to read based on sa_family. Casting that to a &sockaddr_storage would make Rust think that there's 128 valid bytes of sockaddr_storage behind the reference, but there's only 16 of sockaddr_in. The new SockAddr::from_raw_addr never actually casts the passed addr to sockaddr_storage, it just reads ->sa_family and casts to the appropriate struct from there.

@asomers
Copy link
Member

asomers commented Sep 20, 2021

Ughh, you're making my brain hurt. I'll try to revisit this tomorrow.

Copy link
Member

@asomers asomers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There seem to be several unrelated changes in here, some of which are just internal refactoring stuff. Please split them up, and write a complete commit message for each.

@@ -1375,7 +1374,7 @@ unsafe fn read_mhdr<'a, 'b>(
mhdr: msghdr,
r: isize,
msg_controllen: usize,
address: sockaddr_storage,
address: *const sockaddr,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why change the argument from a sockaddr_storage to a sockaddr?

Copy link
Contributor Author

@coolreader18 coolreader18 Sep 23, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same sort of thing that the previous way assumed that it was all initialized, and also sockaddr_storage is 128 bytes while *const sockaddr is 8 bytes. It's might get inlined anyway but ¯\_(ツ)_/¯

Some(_) | None => None,
/// `addr` must be a valid, non-null pointer, and `len` should describe the
/// number of bytes within `*addr` that are initialized and represent data.
pub unsafe fn from_raw_sockaddr(addr: *const libc::sockaddr, len: usize) -> Result<SockAddr> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why add the len argument? struct sockaddr isn't very big, unlike sockaddr_storage. Are there ever any cases when the caller actually gets a partial sockaddr? The case in from_libc_ifaddrs isn't a good example, because it determines the len strictly based on the contents of the sockaddr.

Copy link
Contributor Author

@coolreader18 coolreader18 Sep 23, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

*const sockaddr isn't actually used as "I am passing a direct reference to a struct sockaddr", it's used (paired with a len) as a general type to say "there is a sockaddr stored here of length len". In the bind(2) linux man page:

int bind(int sockfd, const struct sockaddr *addr,
         socklen_t addrlen);

...
The actual structure passed for the addr argument will depend on the address family. The sockaddr structure is defined as something like:

struct sockaddr {
    sa_family_t sa_family;
    char        sa_data[14];
}

The only purpose of this structure is to cast the structure pointer passed in addr in order to avoid compiler warnings. See EXAMPLES below.

For example, bind() accepts a const struct sockaddr *addr, socklen_t addrlen, but that doesn't mean that the biggest address you can pass to it is 14 bytes long - you can definitely do struct sockaddr_un u = ...; bind(fd, &u, sizeof(u)). We do the same thing here with accepting a *const sockaddr, usize pair, just with turning it into a Rust type instead of naming the socket.

/// allocated and valid. It must be at least as large as all the useful parts
/// of the structure. Note that in the case of a `sockaddr_un`, `len` need not
/// include the terminating null.
pub fn sockaddr_storage_to_addr(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I get the motivation for removing this now. It's not that it's inherently unsafe, it's that it's difficult to use safely, because they signature requires a fully initialized sockaddr_storage but the caller often won't have that. Alternatives would be to use a *const sockaddr_storage or a mem::MaybeUninit<&sockaddr_storage>, both of which would have to be coupled with a len. The latter is probably better, because it at least checks the argument's lifetime at compile-time.
Two things you definitely should do are:

  • Explain this in the commit message
  • #[deprecate] the old function rather than remove it.

@coolreader18
Copy link
Contributor Author

coolreader18 commented Sep 23, 2021

One last thing that I noticed is unsound - as_ffi_pair() -> (&libc::sockaddr, libc::socklen_t) should be -> (*const libc::sockaddr, ...). This is totally unsound, since some of the specialized socket types that it casts from are smaller than sockaddr itself, e.g. sockaddr_nl is 12 bytes while sockaddr is 16 bytes, so casting &sockaddr_nl as &sockaddr is UB. (I think it's questionable if some of the other casts are UB, but this kind is for sure).

So, should I change it here? Ideally it wouldn't actually break anything, since if it's for ffi people would be casting/coercing it to a raw pointer anyway, and if they are poking at the sockaddr's fields, it's probably unsound and should break (imo).

@coolreader18
Copy link
Contributor Author

Committed it, let me know if you're opposed to it.

asomers added a commit to asomers/nix that referenced this pull request Mar 20, 2022
The SockAddr enum is quite large, and the user must allocate space for
the whole thing even though he usually knows what type he needs.
Furthermore, thanks to the sa_family field, the sockaddr types are
basically an enum even in C.

So replace the ungainly enum with a SockaddrLike trait implemented by
all sockaddr types and  a SockaddrStorage union that has safe accessors.

Also, deprecate InetAddr, which only existed to support SockAddr.

Supplants nix-rust#1504
Fixes nix-rust#1544
asomers added a commit to asomers/nix that referenced this pull request Mar 20, 2022
The SockAddr enum is quite large, and the user must allocate space for
the whole thing even though he usually knows what type he needs.
Furthermore, thanks to the sa_family field, the sockaddr types are
basically an enum even in C.

So replace the ungainly enum with a SockaddrLike trait implemented by
all sockaddr types and  a SockaddrStorage union that has safe accessors.

Also, deprecate InetAddr, which only existed to support SockAddr.

Supplants nix-rust#1504
Fixes nix-rust#1544
asomers added a commit to asomers/nix that referenced this pull request Mar 20, 2022
The SockAddr enum is quite large, and the user must allocate space for
the whole thing even though he usually knows what type he needs.
Furthermore, thanks to the sa_family field, the sockaddr types are
basically an enum even in C.

So replace the ungainly enum with a SockaddrLike trait implemented by
all sockaddr types and  a SockaddrStorage union that has safe accessors.

Also, deprecate InetAddr, which only existed to support SockAddr.

Supplants nix-rust#1504
Fixes nix-rust#1544
asomers added a commit to asomers/nix that referenced this pull request Mar 22, 2022
The SockAddr enum is quite large, and the user must allocate space for
the whole thing even though he usually knows what type he needs.
Furthermore, thanks to the sa_family field, the sockaddr types are
basically an enum even in C.

So replace the ungainly enum with a SockaddrLike trait implemented by
all sockaddr types and  a SockaddrStorage union that has safe accessors.

Also, deprecate InetAddr, which only existed to support SockAddr.

Supplants nix-rust#1504
Fixes nix-rust#1544
bors bot added a commit that referenced this pull request Mar 22, 2022
1684: Replace the Sockaddr enum with a union r=rtzoeller a=asomers

The SockAddr enum is quite large, and the user must allocate space for
the whole thing even though he usually knows what type he needs.
Furthermore, thanks to the sa_family field, the sockaddr types are
basically an enum even in C.

So replace the ungainly enum with a SockaddrLike trait implemented by
all sockaddr types and  a SockaddrStorage union that has safe accessors.

Also, deprecate InetAddr, which only existed to support SockAddr.

Supplants #1504
Fixes #1544

Co-authored-by: Alan Somers <asomers@gmail.com>
@asomers
Copy link
Member

asomers commented Mar 24, 2022

Closing in favor of #1684

@asomers asomers closed this Mar 24, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants