Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add trait to reduce duplicated code between socket address types, implement netlink addresses #1004

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

kevinmehall
Copy link
Contributor

@kevinmehall kevinmehall commented Jan 29, 2024

#918

This adds a trait SockAddr, and implements it for SocketAddrV4, SocketAddrV6, SocketAddrUnix, SocketAddrXdp, SocketAddr, SocketAddrAny, and the newly-added SocketAddrNetlink. The syscalls that take a sockaddr are reimplemented in terms of this trait, removing hundreds of lines of duplicated code. It also simplifies the public interface, allowing rustix::net::{bind, connect, sendto, sendmsg_addr} to take an (addr: &impl SockAddr), rather than needing separate functions bind_v4, bind_v6, bind_unix, bind_xdp, bind_any etc.

The SocketAddrAny enum is replaced by a struct wrapping SocketAddrStorage. This allows SocketAddrAny be fully generic and allows syscalls returning addresses to be used with any address type not known to rustix. Address types use .into() / .try_into() to convert to / from SocketAddrAny.

@sunfishcode
Copy link
Member

As a quick update here, I am looking forward to reviewing this; I've just been busy.

Copy link

@niklaswimmer niklaswimmer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this approach, it allows me to very easily define a socket address not yet present in this crate, like so

#[derive(Debug, Clone, Copy)]
#[allow(dead_code)] // fields are never read directly because they get memcp'd into another struct
#[repr(C)]
struct SocketAddressHci {
    hci_family: rustix::net::AddressFamily,
    hci_dev: u16,
    hci_channel: u16,
}

unsafe impl SocketAddress for SocketAddressHci {
    /// SAFETY: `SocketAddressHci` has the same layout as the type in the kernel and can therefore safely be passed in a syscall
    type CSockAddr = Self;

    fn encode(&self) -> Self::CSockAddr {
        self.clone()
    }
}

I left some comment about the code structure, most of it more nit than anything else. I am interested to here your opinion on using something like a Bow for the return value of encode tho. It would allow impl's to avoid the clone without having to reimplement write_sockaddr and with_sockaddr again :)

src/backend/linux_raw/net/mod.rs Outdated Show resolved Hide resolved
src/backend/linux_raw/net/addr.rs Outdated Show resolved Hide resolved
src/backend/linux_raw/net/addr.rs Outdated Show resolved Hide resolved
src/backend/linux_raw/net/addr.rs Outdated Show resolved Hide resolved
src/backend/libc/net/msghdr.rs Outdated Show resolved Hide resolved
src/net/socket_address.rs Outdated Show resolved Hide resolved
src/net/socket_address.rs Outdated Show resolved Hide resolved
@kevinmehall kevinmehall mentioned this pull request May 24, 2024
3 tasks
@kevinmehall kevinmehall force-pushed the socket-addr-trait branch 2 times, most recently from 44051ea to 9b432db Compare June 1, 2024 23:23
@kevinmehall
Copy link
Contributor Author

Given that sockaddr was included on the list of libc types to remove in #1063, I slimmed down the SockAddr trait further to only with_sockaddr and made the raw pointer type an opaque struct to avoid all libc types in the API.

I also rebased to clean up the messy history for easier review. Full history archived here.

@al8n
Copy link
Contributor

al8n commented Jan 13, 2025

Hi, any updates on this PR? Looking forward rustix supporting netlink.

@raftario
Copy link

This would be very nice to have in 1.0
#753

@sunfishcode
Copy link
Member

Thanks for the pings here, and your patience! I'll make an effort to review this before 1.0.

Copy link
Member

@sunfishcode sunfishcode left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks really nice!

I was initially worried about the safety of the f called by the with_sockaddr function, however I now see that the safety requirements are covered by the pub unsafe trait, so it looks good.

And I was initially worried about generic functions in src/backend, to keep code duplication down, however they're in fact replacing manually duplicated code, so that doesn't seem to be a problem.

I agree that it'd be good to continue to evolve this as you discussed. #1171 adds a RawSocketAddr type which is very close to what you describe for return values, and sounds like a good general direction.

I have have a few relatively minor comments, plus this needs a rebase. If you'd like help with the rebase please let me know.


/// A trait abstracting over the types that can be passed as a `sockaddr`.
///
/// Safety: implementers of this trait must ensure that `with_sockaddr` calls
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a minor style nit, this should be a # Safety section header.

#[repr(C)]
pub struct SockAddrRaw {
_data: [u8; 0],
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What would you think of renaming SockAddrRaw to SocketAddrOpaque?

There are two changes there:

  • Sock -> Socket, because if we have some things prefixed Socket and some prefixed Sock I expect I'll frequently forget which things use which prefixes.
  • Raw -> Opaque`, which I think better captures that this struct type doesn't describe the data layout.

src/net/mod.rs Outdated
@@ -10,6 +10,7 @@
mod send_recv;
mod socket;
mod socket_addr_any;
mod socket_address;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we make this a pub mod, and perhaps name it addr, to keep SockAddr and SockAddrRaw from cluttering the rest of the net API? It seems like most users won't ever need to use those directly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it confusing that rustix::net::addr then only contains those two, and not any of the other address types (SocketAddrUnix, SocketAddrV4, SocketAddrV6, SocketAddr, SocketAddrAny, SocketAddrStorage) that are one level up? Seems like at least SocketAddrStorage and SocketAddrOpaque should be next to each other since they represent similar C types.

Maybe move them all into rustix::net::addr?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea to keep SocketAddrStorage next to SocketAddrOpaque.

It seems nice to keep SocketAddrV4, SocketAddrV6, SocketAddrUnix, in rustix::net where all the rest of the API that uses them are though, because that's the "regular" API.

What if we put SocketAddrOpaque, SocketAddrArg, and SocketAddrStorage into a pub mod named rustix::net::abstract?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good. abstract might be confused with the abstract namespace for unix sockets, but I don't have a better name to propose.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, abstract is a Rust reserved keyword.

/// `f` with a pointer that is readable for the passed length, and points
/// to data that is a valid socket address for the system calls that accept
/// `sockaddr` as a const pointer.
pub unsafe trait SockAddr {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And, what would you think of renaming this to SocketAddrArg? Sock -> Socket for the same reason as above, and Arg since it is for describing arguments; return values will need to use some other mechanism as you noted in the top comment.

@kevinmehall kevinmehall force-pushed the socket-addr-trait branch 3 times, most recently from f5ecd5e to 7214eea Compare January 31, 2025 06:48
@kevinmehall
Copy link
Contributor Author

Rebased and made the requested changes. I named the new module rustix::net::addr since abstract is a reserved keyword, and I moved SocketAddrStorage into it. It could use a better name if we can come up with one.

I'll take a look at updating SocketAddrAny with #1171 RawSocketAddr.

Since it looks like you're merging breaking changes to main for 1.0, should I go ahead and remove the address type specific functions?

  • bind_any, bind_unix, bind_v4, bind_v6, bind_xdp in favor of bind,
  • connect_any, connect_unix, connect_v4, connect_v6 in favor of connect (leaving address-less connect_unspec)
  • sendmsg_v4, sendmsg_v6, sendmsg_unix, sendmsg_xdp, sendmsg_any in favor of sendmsg_addr (leaving address-less sendmsg)
  • sendto_any, sendto_v4, sendto_v6, sendto_unix, sendto_xdp in favor of sendto

@sunfishcode
Copy link
Member

Rebased and made the requested changes. I named the new module rustix::net::addr since abstract is a reserved keyword, and I moved SocketAddrStorage into it. It could use a better name if we can come up with one.

I haven't thought of a better name yet, but I'll keep thinking 😄 .

I'll take a look at updating SocketAddrAny with #1171 RawSocketAddr.

👍

Since it looks like you're merging breaking changes to main for 1.0, should I go ahead and remove the address type specific functions?

* `bind_any`, `bind_unix`, `bind_v4`, `bind_v6`, `bind_xdp` in favor of `bind`,

* `connect_any`, `connect_unix`, `connect_v4`, `connect_v6` in favor of `connect` (leaving address-less `connect_unspec`)

* `sendmsg_v4`, `sendmsg_v6`, `sendmsg_unix`, `sendmsg_xdp`, `sendmsg_any` in favor of `sendmsg_addr` (leaving address-less `sendmsg`)

* `sendto_any`, `sendto_v4`, `sendto_v6`, `sendto_unix`, `sendto_xdp` in favor of `sendto`

Yes, that'd be great!

To support extensibility over address types, use sockaddr_storage and
Into / TryInto conversions.
Removes:
* `bind_any`, `bind_unix`, `bind_v4`, `bind_v6`, `bind_xdp` in favor of `bind`,
* `connect_any`, `connect_unix`, `connect_v4`, `connect_v6` in favor of `connect` (leaving address-less `connect_unspec`)
* `sendmsg_v4`, `sendmsg_v6`, `sendmsg_unix`, `sendmsg_xdp`, `sendmsg_any` in favor of `sendmsg_addr` (leaving address-less `sendmsg`)
* `sendto_any`, `sendto_v4`, `sendto_v6`, `sendto_unix`, `sendto_xdp` in favor of `sendto`
@kevinmehall
Copy link
Contributor Author

Removed the address-type-specific variants of bind, connect, sendmsg, and sendto, and changed SocketAddrAny from an enum to a struct wrapping SocketAddrStorage.

Copy link
Member

@sunfishcode sunfishcode left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, overall this looks great!

///
/// `storage` must point to valid memory for decoding a socket
/// address. `len` bytes must be initialized.
pub unsafe fn read(src: *const SocketAddrOpaque, len: usize) -> Self {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is pre-existing, but: it appears every user of this has a socklen_t rather than a usize, what would you think about making this argument take a socklen_t, to save all users from having to .try_into().unwrap()?

///
/// `storage` must point to valid memory for decoding a socket
/// address. `len` bytes must be initialized.
pub unsafe fn read_option(src: *const SocketAddrOpaque, len: usize) -> Option<Self> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And same here with making len be a socklen_t.

SocketAddrAny::V6(v6) => Ok(v6),
_ => unreachable!(),
}
let any = unsafe { SocketAddrAny::read(value.as_ptr().cast(), optlen as usize) };
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this code and code like it, instead of doing SocketAddrAny::read, which does a copy_nonoverlapping, would it work to change value above from MaybeUninit<SocketAddrStorage> to MaybeUninit<SocketAddrAny>, and have the OS write the data directly into the SocketAddrAny?

/// `storage` must point to valid memory for decoding a socket
/// address. `len` bytes must be initialized.
pub unsafe fn read(src: *const SocketAddrOpaque, len: usize) -> Self {
assert!(len >= core::mem::size_of::<backend::c::sa_family_t>());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is also pre-existing, but: on BSD-family platforms, the sa_family field is preceded by a length field, so the invariant ought to be len >= offset_of!(sockaddr, sa_family) + size_of::<sa_family_tt>()

@@ -0,0 +1,83 @@
//! Socket address types.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we're not able to think of a better name for this module, it'd be good to have a module-level comment describing what's in this module.

}
if addr.len() < size_of::<c::sockaddr_in>() {
return Err(Errno::INVAL);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SocketAddAny::read panics if the len is out of range; would it make sense to do that here too? The length should ever be wrong unless there's a bug somewhere.

And similar for the other read_sockaddr_* functions.

return Err(Errno::INVAL);
}
if len == offsetof_sun_path {
SocketAddrUnix::new(&[][..])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm looking to figure out all the ways that TryFrom<SocketAddrAny> can fail, and SocketAddrUnix::new here can fail with NAMETOOLONG. However the only time it fails is if the path is longer than sun_path.len(), which we've already checked here, so I think that can't happen. Would it make sense to make this do Ok(SocketAddrUnix::new(&[][...]).unwrap()), or add a new infallible SocketAddrUnix constructor?

And similar for the SocketAddrUnix::new_abstract_name and SocketAddrUnix::new calls below?

///
/// Socket addresses can be converted to `SocketAddrAny` via the [`From`] and
/// [`Into`] traits. `SocketAddrAny` can be converted back to a specific socket
/// address type with [`TryFrom`] and [`TryInto`].
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you document here that the TryFrom functions fail with Errno::AFNOSUPPORT if the address family is not the requested one?

An unfortunate thing about using Errno is that it doesn't describe which errors could actually occur. However, it might be worth it here, because a convenient thing about using Errno is that these functions will usually be called from within functions that return Result<..., Errno>, so users can just do ? on the result and it'll do something reasonable.

Looking at the code, it looks like these TryFrom<SocketAddrAny> functions can fail with:

  • AFNOSUPPORT if the address family isn't the requested one
  • INVAL if the length is wrong, though perhaps those should be panics?
  • NAMETOOLONG if it's Unix-domain and the name doesn't fit in sun_path, but that shouldn't be possible

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants