Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cannot create AF_VSOCK Sockets due to overly strict SocketPal #58327

Open
fbrosseau opened this issue Aug 29, 2021 · 31 comments
Open

Cannot create AF_VSOCK Sockets due to overly strict SocketPal #58327

fbrosseau opened this issue Aug 29, 2021 · 31 comments
Labels
area-System.Net.Sockets enhancement Product code improvement that does NOT require public API changes/additions
Milestone

Comments

@fbrosseau
Copy link

fbrosseau commented Aug 29, 2021

Description

Hello,

Unix SocketPal prevents managed code from creating AF_VSOCK (and any other new and/or non-standard AF's). This is unfortunate, because at the managed level, everything is there to extend for new address families: You write your EndPoint-derived implementation properly, and then it gets serialized to a SocketAddress, and that gets passed down to socket for the platform, and everything works. That is the case for Windows, where dotnet does not attempt to do anything with the parameters - if WinSock is happy with it, it works.

In Linux, SocketPal needs to do some translations, likely because some of the basic values do not match between their original Windows constants and the standard Linux ones. In its current form, the code there bails out entirely if anything is unknown:

int32_t SystemNative_Socket(int32_t addressFamily, int32_t socketType, int32_t protocolType, intptr_t* createdSocket)
{
if (createdSocket == NULL)
{
return Error_EFAULT;
}
sa_family_t platformAddressFamily;
int platformSocketType, platformProtocolType;
if (!TryConvertAddressFamilyPalToPlatform(addressFamily, &platformAddressFamily))
{
*createdSocket = -1;
return Error_EAFNOSUPPORT;
}
if (!TryConvertSocketTypePalToPlatform(socketType, &platformSocketType))
{
*createdSocket = -1;
return Error_EPROTOTYPE;
}
if (!TryConvertProtocolTypePalToPlatform(addressFamily, protocolType, &platformProtocolType))
{
*createdSocket = -1;
return Error_EPROTONOSUPPORT;
}

My exact use case is for AF_VSOCK. In this specific case, I guess the simplest fix would be to add AF_VSOCK to TryConvertAddressFamilyPalToPlatform. However, should Linux SocketPal allow for arbitrary unknown AFs, like Windows does? That's a larger change, and that can come with complications if the values keep diverging over time between Linux and Windows.

Regression

No, pal_networking.c blame looks old over there.

Workaround

Pinvoke socket(), connect() / bind() / ... (basically anything that deals with EndPoint classes), and then new Socket(new SafeSocketHandle((IntPtr)sock, true)); to leverage the actual Socket/IO benefits provided by .net.

This workaround has multiple drawbacks - the current managed code in Socket gets all confused and assumes that the socket has Connected=false, due to the code only handling IPv4/IPv6/Unix in the SafeHandle constructor. Having Connected=false prevents it from using NetworkStream, for example. Then, if a Stream interface is required, the workaround is to instead use a FileStream over the sockfd. This is really not ideal : )

@antonfirsov

@dotnet-issue-labeler dotnet-issue-labeler bot added area-System.Net.Sockets untriaged New issue has not been triaged by the area owner labels Aug 29, 2021
@ghost
Copy link

ghost commented Aug 29, 2021

Tagging subscribers to this area: @dotnet/ncl
See info in area-owners.md if you want to be subscribed.

Issue Details

Description

Hello,

Unix SocketPal prevents managed code from creating AF_VSOCK (and any other new and/or non-standard AF's). This is unfortunate, because at the managed level, everything is there to extend for new address families: You write your EndPoint-derived implementation properly, and then it gets serialized to a SocketAddress, and that gets passed down to socket for the platform, and everything works. That is the case for Windows, where dotnet does not attempt to do anything with the parameters - if WinSock is happy with it, it works.

In Linux, SocketPal needs to do some translations, likely because some of the basic values do not match between their original Windows constants and the standard Linux ones. In its current form, the code there bails out entirely if anything is unknown:

int32_t SystemNative_Socket(int32_t addressFamily, int32_t socketType, int32_t protocolType, intptr_t* createdSocket)
{
if (createdSocket == NULL)
{
return Error_EFAULT;
}
sa_family_t platformAddressFamily;
int platformSocketType, platformProtocolType;
if (!TryConvertAddressFamilyPalToPlatform(addressFamily, &platformAddressFamily))
{
*createdSocket = -1;
return Error_EAFNOSUPPORT;
}
if (!TryConvertSocketTypePalToPlatform(socketType, &platformSocketType))
{
*createdSocket = -1;
return Error_EPROTOTYPE;
}
if (!TryConvertProtocolTypePalToPlatform(addressFamily, protocolType, &platformProtocolType))
{
*createdSocket = -1;
return Error_EPROTONOSUPPORT;
}

My exact use case is for AF_VSOCK. In this specific case, I guess the simplest fix would be to add AF_VSOCK to TryConvertAddressFamilyPalToPlatform. However, should Linux SocketPal allow for arbitrary unknown AFs, like Windows does? That's a larger change, and that can come with complications if the values keep diverging over time between Linux and Windows.

Regression

No, pal_networking.c blame looks old over there.

Workaround

Pinvoke socket(), connect() / bind() / ... (basically anything that deals with EndPoint classes), and then new Socket(new SafeSocketHandle((IntPtr)sock, true)); to leverage the actual Socket/IO benefits provided by .net.

Author: fbrosseau
Assignees: -
Labels:

area-System.Net.Sockets, untriaged

Milestone: -

@antonfirsov
Copy link
Member

I guess the simplest fix would be to add AF_VSOCK to TryConvertAddressFamilyPalToPlatform

For that we also need a managed AddressFamily entry for AF_VSOCK. If VSOCK is an important use case for you, I recommend to file an API proposal adding that specific address family. Assuming we accept the proposal, to me it looks like an up-for-grabs candidate.

the current managed code in Socket gets all confused and assumes that the socket has Connected=false, due to the code only handling IPv4/IPv6/Unix in the SafeHandle constructor.

If understand it correctly, this is due to a condition check before we call getpeername on the socket handle:

if (_rightEndPoint != null)
{
try
{
// Local and remote end points may be different sizes for protocols like Unix Domain Sockets.
bufferLength = buffer.Length;
switch (SocketPal.GetPeerName(handle, buffer, ref bufferLength))
{

I'm wondering what would happen if we just removed this criteria to unblock scenarios dealing with connected sockets of arbitrary address families? _remoteEndPoint would remain unassigned in case of arbitrary address families, but send/receive operations should still work. @geoffkizer @tmds thoughts?

@davidfowl
Copy link
Member

cc @shirhatti

@fbrosseau
Copy link
Author

fbrosseau commented Aug 30, 2021

For that we also need a managed AddressFamily entry for AF_VSOCK. If VSOCK is an important use case for you, I recommend to file an API proposal adding that specific address family. Assuming we accept the proposal, to me it looks like an up-for-grabs candidate.

From a framework perspective I guess this makes the most sense, but I was hoping this issue could be handled 100% internally at the PAL level so that it is easier to pass. I can submit an API proposal in parallel. At that point, would it only be the AddressFamily entry, or would it be AddressFamily + EndPoint impl? And AF_VSOCK only, or AF_HYPERV as well? Note that AF_HYPERV is not a problem, unlike VSOCK, because on Windows it is possible to write a fully functional EndPoint implementation and AF_HYPERV works completely in managed. But providing the implementation would add convenience.

What I mean by handling it internally at the PAL is that could it be changed so that unknown values flow through to the native calls instead of early-returning an error? Right now it strictly enforces well-known values to guard against invalid parameters and sizes, but the native call would already return an adequate error if it does not like the parameters.

If understand it correctly, this is due to a condition check before we call getpeername on the socket handle:

Yes. My understanding of that code is that relaxing those conditions would work.

@antonfirsov
Copy link
Member

or would it be AddressFamily + EndPoint impl

In #28636 (dotnet/corefx#37315) we shipped a few new address families without EndPoint implementations so it's probably not necessary, meaning that the minimal API would just focus on unblocking users instead of delivering a full solution. /cc @wfurt

or AF_HYPERV as well

To me it sounds reasonable to file a shared proposal for these two.

@wfurt
Copy link
Member

wfurt commented Aug 30, 2021

I did some work on Netlink and It was working for me - at least the small part I was using. Supporting completely new protocols blindly is not trivial. As you noticed there are fragments in the socket code making certain assumptions. Ability to create Socket from handle was added to solve also some other use cases. Creating Socket from RAW AddressFamily was discussed but rejected as it may have still have some other lingering issue. For one, it may still have issue not knowing if socket is or is not Connected. Existing code is guessing and perhaps that may be improved. Alternatively we may pass that information to the constructor.

Can you share your code @fbrosseau primarily the endpoint, creation and example oof what is not working?

@fbrosseau
Copy link
Author

fbrosseau commented Aug 30, 2021

I did some work on Netlink and It was working for me

In Linux? Windows PAL allows for arbitrary AFs, but not Linux.

Can you share your code @fbrosseau primarily the endpoint, creation and example oof what is not working?

new Socket((AddressFamily)40, SocketType.Stream, 0);

-->
Unhandled exception. System.Net.Sockets.SocketException (97): Address family not supported by protocol
   at System.Net.Sockets.Socket..ctor(AddressFamily addressFamily, SocketType socketType, ProtocolType protocolType)
   at AfHyperV.Program.Main(String[] args)
new SocketAddress((AddressFamily)40, 32);

-->

Unhandled exception. System.PlatformNotSupportedException: Operation is not supported on this platform.
   at System.Net.SocketAddressPal.ThrowOnFailure(Error err)
   at System.Net.SocketAddressPal.SetAddressFamily(Byte[] buffer, AddressFamily family)
   at System.Net.SocketAddress..ctor(AddressFamily family, Int32 size)
   at AfHyperV.Program.Main(String[] args) 

Anything that touches the AddressFamily throws. I can share a full example of what would be a correct VSockEndPoint implementation, but basically every step is a blocker :)

Note that these errors are generated by the Unix PAL itself, not the OS. The OS would accept these parameters happily. See the part of pal_networking.c that I linked in OP

@wfurt
Copy link
Member

wfurt commented Aug 30, 2021

Yes, on Linux. Windows does not have Netlink. And this is not what I meant. I know casting number to AddressFamily will not work. It is about creating handle you self, creating Socket from it and using your EndPoint for Bind().

@fbrosseau
Copy link
Author

fbrosseau commented Aug 30, 2021

Ah sorry! Yes, well managed Socket.Bind/.Connect is a no-go due to the SocketAddress constructor throwing, and I do not think there is any possible way to implement EndPoint.Serialize without it. So that step too must be done through PInvoke. But then due to the defensive code in the Socket constructor that takes a handle, _rightEndPoint is never considered if the AF is not IPv4/IPv6/Unix.

@wfurt
Copy link
Member

wfurt commented Aug 30, 2021

Take a look at #26416. It has example of managed Bind with custom AddressFamily. I think the sequence should be:

  1. create handle with P/Invoke.
  2. Create Socket from it.
  3. Use Socket.Bind if needed.

This should all work AFAIK.
Looking back, I don't think I ever tried Connect since neither CAN nor Netlink is stream oriented. If Bind works but Connect does not it I think that would be something we can look into.
I can probably also find my old Netlink prototype. It did not do Connect but it was using SendTo as far as I remember.

@fbrosseau
Copy link
Author

fbrosseau commented Aug 30, 2021

LLEndpoint uses new SocketAddress(AddressFamily.Packet, 20); which has first-class support in TryConvertAddressFamilyPalToPlatform and hence it works, but anything outside of the hardcoded values will return false from there and throw once back in managed.

@wfurt
Copy link
Member

wfurt commented Aug 30, 2021

Than look at

public unsafe void Ctor_SafeHandle_UnknownSocket_Success()

Original addition of Netlink was reverted (e.g. there is neither PAL nor enum) but that test can still dump routing table.
I think the trick that makes it work is here:

// We need to create base from something known.
public unsafe NlSocketAddress(int pid) : base(AddressFamily.Packet)

Netlink is one example were even if we added AddressFamily use of it was still problematic.
That was my story of adding support for unknown/arbitrary protocols.

@fbrosseau
Copy link
Author

Ah yes interesting, I didn't realize SocketAddress could be fuzzed into changing the AF byte, this goes against both the docs and the code comments, so I did not expect this to be a supported scenario.

// Access to unmanaged serialized data. This doesn't
// allow access to the first 2 bytes of unmanaged memory
// that are supposed to contain the address family which
// is readonly.
public byte this[int offset]

Then yes if that's supported that works, that removes the need for quite a bit of PInvoke, except the initial socket() call.

@wfurt
Copy link
Member

wfurt commented Aug 30, 2021

It would be probably cleaner if you can use AddressFamily.Unknown as base but that will probably fail as that does not really represent any OS protocol. One challenge with supporting unknown protocols is lack of uses case and difficulty with writing tests. If you can get something working we can perhaps take that it as feedback and make improvements for future @fbrosseau. Trouble with Netlink was fact that it also needed specific ProtocolType and there are many and user can possibly also register and create one dynamically. Perhaps we can look at it again in 7.0 and make process of adding new protocols easier.

@fbrosseau
Copy link
Author

fbrosseau commented Aug 30, 2021

AFAIK VSock is mainstream enough now and it also supports "loopback" that does not require having any actual VM in the system, so this is probably a good testable candidate AF.

@wfurt
Copy link
Member

wfurt commented Aug 30, 2021

Yah, you can argue that about Netlink as well. Part of the trouble is that .NET in not particularly well setup to handle platform specific features e.g. works only on Linux.

@antonfirsov
Copy link
Member

antonfirsov commented Aug 31, 2021

Part of the trouble is that .NET in not particularly well setup to handle platform specific features e.g. works only on Linux.

That's true, but we already have similar address families (eg. AF_PACKET), so I don't see an issue adding more of them. We should just point out in the docs, that the given enum value is platform-specific.

I think this is a good candidate to be a community-driven feature, my recommendation is still to

Independently, we should also evaluate if we can improve the support of using arbitrary AddressFamilies / ProtocolTypes through P/Invoke and custom endpoints. I think we can provide limited support for advanced users without adding new API-s or implementing high impact changes like the eliminating strict PAL "mappability" criterias entirely, by doing just 2 things:

  1. Relax the criteria only in SocketAddress(AddressFamily) constructor to actually enable the use of EndPoints with custom address families.
  2. Remove the _rightEndPoint != null check in Socket(SafeSocketHandle) as mentioned in my Cannot create AF_VSOCK Sockets due to overly strict SocketPal #58327 (comment) so connected sockets created in unmanaged code act like they are actually connected.

@fbrosseau
Copy link
Author

fbrosseau commented Aug 31, 2021

Sure I could look into that.

Points 1 & 2

Could the TryConvertAddressFamilyPalToPlatform condition be relaxed in SystemNative_Socket too? The code already does not do anything with the result other than passing it to socket(), so if SystemNative_Socket did not return with error on TryConvertAddressFamilyPalToPlatform=false, even that would no longer require PInvoke. AFAICT with those changes + this one, zero PInvoke would then be required for anything to have a functioning custom AF.

@tmds
Copy link
Member

tmds commented Aug 31, 2021

Could the TryConvertAddressFamilyPalToPlatform condition be relaxed in SystemNative_Socket too?

You want to use AddressFamily to pass in raw values. For values that match defined address families, the PAL has no way of knowing whether it should convert them or treat them as raw.

We need some API addition that indicates whether the values are raw ones. These could be some overload that accepts ints, similar to SetRawSocketOption vs SetSocketOption.

Remove the _rightEndPoint != null

_rightEndPoint is the way Socket tracks the type of EndPoint it should use. For unknown address families, a user may want to specify this. Or maybe we want to add a RawEndPoint that exposes the raw address data.

API could be something in this direction:

const int AF_VSOCK = 40;
const int SOCK_STREAM = 1;
using var socket = new Socket(AF_VSOCK, SOCK_STREAM, 0, new MyVSockEndPoint());

@fbrosseau
Copy link
Author

fbrosseau commented Aug 31, 2021

You want to use AddressFamily to pass in raw values. For values that match defined address families, the PAL has no way of knowing whether it should convert them or treat them as raw.

I am not sure I can see the concrete problem - the already existing special-cases (Packet, CAN) already have synthetic values on purpose to not clash with anything standard. So that part is fine. Then, for the normal stuff (IPv4 etc), the values are universal. For the AF values that do differ between platforms (VSock and HyperV are examples), the OS would already tell if the values are good or bad.

API could be something in this direction:

First argument would be redundant with the endpoint argument since the AF value is already part of the EndPoint contract. I am also not sure if having an endpoint at socket()-time would make sense for all families. Also would that be the remote or local endpoint? Their final values are often unknown until after bind-time, or even lazier at connect-time with port-sharing options, for example.

@antonfirsov
Copy link
Member

antonfirsov commented Aug 31, 2021

@tmds API addition is a very deep rabbit hole, the scope seems to be much bigger here than with SetRawSocketOption. It's not enough to just expose Socket(int, int, int, ???) stuff, for consistency we also need to consider a bunch of other additions like EndPoint.RawAddressFamily or Socket.RawAddressFamily. I'm skeptical if we can reach consensus and get the funding for such changes in 7.0.0. This is why I'm suggesting to evaluate minor changes which may be just enough to unblock users.

_rightEndPoint is the way Socket tracks the type of EndPoint it should use.

_rightEndPoint seems to be unused with connected Send/Receive. In SendTo the remoteEp parameter is mandatory, in ReceiveFrom, we may expect the user to pass remoteEP when used with arbitrary address families.

@tmds
Copy link
Member

tmds commented Aug 31, 2021

Then, for the normal stuff (IPv4 etc), the values are universal.

There is no standard that specifies these values are the same cross-platform.

API addition is a very deep rabbit hole

The problem that needs to be solved is distinguish raw values from defined enum values.

for consistency we also need to consider a bunch of other additions like EndPoint.RawAddressFamily or Socket.RawAddressFamily

These enums have an Unknown value which could be used instead of adding new APIs.

_rightEndPoint seems to be unused with connected Send/Receive. In SendTo the remoteEp parameter is mandatory, in ReceiveFrom, we may expect the user to pass remoteEP when used with arbitrary address families.

Yes, only specific operations require _rightEndPoint. The options are: a. not support them, b. allow the user to specify his own EndPoint type, or c. introduce a RawEndPoint type.

@antonfirsov
Copy link
Member

These enums have an Unknown value which could be used instead of adding new APIs.

Sounds reasonable to me, but wondering if anyone would push back on that.

b. allow the user to specify his own EndPoint type

I think this is something we should be capable to support with the current API-s. (Connect(), SendTo(), ReceiveFrom() can be used to specify a custom EndPoint type.) The only API hole is seems to be around connected socket creation from a handle. We can add Socket(SafeSocketHandle handle, EndPoint remoteEp) to deal with this.

@tmds
Copy link
Member

tmds commented Sep 2, 2021

The only API hole is seems to be around connected socket creation from a handle. We can add Socket(SafeSocketHandle handle, EndPoint remoteEp) to deal with this.

If we don't add a constructor to Socket/SafeSocketHandle that accepts raw values, the user will need to P/Invoke.

const int AF_VSOCK = 40;
const int SOCK_STREAM = 1;
const int SOCK_CLOEXEC = 0x80000;

[DllImport("libc", SetLastError = true)]
public static extern int socket(int domain, int type, int protocol);

int fd = socket(AF_VSOCK, SOCK_STREAM | SOCK_CLOEXEC, 0); // TODO: handle fd == -1
using var socket = new Socket(new SafeSocketHandle((IntPtr)fd, true), new MyVSockEndPoint());

@antonfirsov
Copy link
Member

If we don't add a constructor to Socket/SafeSocketHandle that accepts raw values, the user will need to P/Invoke.

Which might be acceptable for advanced scenarios. Again, I'm not super optimistic about our ability finalizing API proposals around the socket area, mostly because it has lower priority in comparison to other hot topics. (See our backlog of stale suggestions). I may give it a try though :)

for consistency we also need to consider a bunch of other additions like EndPoint.RawAddressFamily or Socket.RawAddressFamily

These enums have an Unknown value which could be used instead of adding new APIs.

This would mean that the raw address family and socket/protocol type is settable, but not queryable. This is quite atypical for BCL API-s, and likely not what we want there, meaning that we need to expose properties which is where the rabbit hole begins ...

Regarding the code sample in #58327 (comment): I meant the Socket(SafeSocketHandle handle, EndPoint remoteEp) constructor to create Socket instances which are already connected using unmanaged connect or accept. Socket(handle) should be sufficient to create, bind and connect the socket via managed API-s, assigning _rightEndPoint when necessary. The only thing we need to do here is to find a way to relax the AddressFamily validation in the SocketAddress constructor.

@karelz karelz added this to the 7.0.0 milestone Sep 2, 2021
@karelz karelz removed the untriaged New issue has not been triaged by the area owner label Sep 2, 2021
@karelz
Copy link
Member

karelz commented Sep 2, 2021

Triage: Similar to #58378 (comment) -- there may be dragons, let's find out more first.

@karelz karelz added the enhancement Product code improvement that does NOT require public API changes/additions label Sep 2, 2021
@antonfirsov
Copy link
Member

Triage: not critical for 7.0, pushing to future for now.

@antonfirsov antonfirsov modified the milestones: 7.0.0, Future Jun 2, 2022
@crozone
Copy link

crozone commented May 8, 2023

Ah yes interesting, I didn't realize SocketAddress could be fuzzed into changing the AF byte, this goes against both the docs and the code comments, so I did not expect this to be a supported scenario.

// Access to unmanaged serialized data. This doesn't
// allow access to the first 2 bytes of unmanaged memory
// that are supposed to contain the address family which
// is readonly.
public byte this[int offset]

Then yes if that's supported that works, that removes the need for quite a bit of PInvoke, except the initial socket() call.

I noticed this myself when implementing a SocketCanEndPoint : EndPoint to allow binding to a SocketCAN socket on Linux. I didn't end up needing to set the first two bytes since AddressFamily.ControllerAreaNetwork is actually correctly translated into CAN_AF by TryConvertAddressFamilyPalToPlatform, but it is very weird that the comment explicitly says those bytes are read-only, and yet they're clearly not. Is it a bug?

@wfurt
Copy link
Member

wfurt commented May 14, 2023

This may be officially relaxed as part of #30797. I don't if it would be useful but it would provide ability to send/receive without need for SocketCanEndPoint.

@crozone
Copy link

crozone commented Jun 13, 2023

This may be officially relaxed as part of #30797. I don't if it would be useful but it would provide ability to send/receive without need for SocketCanEndPoint.

In the SocketCAN usecase, the SocketCanEndPoint is only used to bind the socket to the CAN interface with .Bind(), but not after that. This means that the performance impact isn't as serious as needing to pass it into every SendTo() or ReceiveFrom(), but there's still an annoying extra copy through the EndPoint indirection.

Is there any scope in that ticket for also making Bind() accept a sockaddr in the same way? Maybe even a BindRaw() that bypasses the SocketFamily Pal translation stuff, since the first two bytes are the address family.

SocketCAN code for reference: https://github.com/crozone/SocketCanNet/blob/main/SocketCanNet/CanSocketEndPoint.cs

https://github.com/crozone/SocketCanNet/blob/f33ff5ca0d4a72d463d5373884134002b678fdf1/SocketCanNet/SocketExtensions.cs#L98-L107

@wfurt
Copy link
Member

wfurt commented Jun 13, 2023

there is no plan for Bind(SocketAddress) at the moment. That is simple function and p/invoke would be trivial IMHO. (unlike ReceiveFromAsync where there is lot of internal complexity and platform differences)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-System.Net.Sockets enhancement Product code improvement that does NOT require public API changes/additions
Projects
None yet
Development

No branches or pull requests

7 participants