Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

wasi-sockets: Introduce UDP streams #7243

Merged
merged 6 commits into from
Oct 25, 2023

Conversation

badeend
Copy link
Contributor

@badeend badeend commented Oct 15, 2023

Introduce new inbound-datagram-stream and outbound-datagram-stream types and moved receive and send methods to those respectively. These streams are returned by bind and can be individually subscribed to. This resolves a design issue where a UDP server would end up in a spin loop because receive returned EWOULDBLOCK but poll_* always returned immediately because the socket was ready for sending. In this new setup, users can poll each direction separately. Fixes WebAssembly/wasi-sockets#64

Additionally

@badeend badeend requested a review from a team as a code owner October 15, 2023 08:46
@badeend badeend requested review from alexcrichton and removed request for a team October 15, 2023 08:46
@github-actions github-actions bot added the wasi Issues pertaining to WASI label Oct 15, 2023
@alexcrichton
Copy link
Member

Thanks for this! I've got a few comments on some API design decisions as well as their implications on the implementation:

  • Primarily I think it would be best to avoid the usage of Mutex<T> in the implementation. I feel like it's best to reflect the concurrency of the underlying OS which is that mutexes aren't required and concurrent calls to various bits and pieces are ok. I realize that mutexes aren't held during blocking I/O operations as a part of this PR, but that effectively means that the lock is overhead and I'd prefer to do away with it. I tested this out locally and the main issue that I ran into was that the inbound/outbound streams need access to the remote_address() of the state of the socket. This is problematic because, theoretically, a start_connect can race with a send or receive call, meaning that a Mutex is basically required to arbitrate that race (or something atomic like that). Could this perhaps be altered at the API level where an inbound/outbound stream is returned from finish-connect in addition to finish-bind? That way the streams from bind are never connected and the streams from connect are always connected. That may have weird implications if all the streams are used, though, so I'm not 100% sure.

  • Bikeshed-wise the name inbound-* and outbound-* are inconsistent with wasi:io/streams usage of input-* and output-*. Is it perhaps worth it to use the same terminology?

  • Bikeshed-wise the oubound-datagram-stream type is similar to output-stream but has a different mechanism for receiving data. The output-stream type for example has a check-write method which returns how much can be written and only then can write be called. Additionally write traps if too much is given to it. The main rationale for this design decision is that it avoids the need for callees to receive a possibly unbounded amount of data forcibly within their module (since calling send will copy over all the data into the second module if there is one). This may best be solved with a check-write lookalike for outbound-datagram-stream as well perhaps?

@badeend
Copy link
Contributor Author

badeend commented Oct 16, 2023

I think it would be best to avoid the usage of Mutex in the implementation.

I agree, yet I couldn't figure out how to make it work.
My initial version had bind and connect both return streams. (Also I allowed connect to be called multiple times, but I didn't want to pollute this PR too much).
In that setup, calling connect has to invalidate the previously returned streams somehow. At which point I ended up at Mutex again. Allowing the previous streams to remain somehow usable might be easier to implement, but is arguably a worse API design.


Bikeshed-wise the name inbound-* and outbound-* are inconsistent with wasi:io/streams usage of input-* and output-*. Is it perhaps worth it to use the same terminology?

I (tried to) use the wasi-http naming scheme, as they feel more natural to the domain at hand (networking) as opposed to input&output.
Sidenote, I just noticed that wasi-http actually uses slightly different terminology (incoming&outgoing) 😶

For the stream types, I'm fine with either naming scheme. They're temporary anyway.
For the datagram types themselves I do prefer to keep calling them incoming&outgoing-datagram rather than input&output-datagram.


The output-stream type for example has a check-write method which returns how much can be written and only then can write be called

Good point. I'll look into it.

The main rationale for this design decision is that it avoids the need for callees to receive a possibly unbounded amount of data forcibly within their module

Even with a check-write method on outbound-datagram-stream that can still happen, right. I assume check-write will return the amount of datagrams that can be sent. But each datagram can still contain an unbounded number of bytes in the data field.

@alexcrichton
Copy link
Member

One way to get invalidation working without Mutex is to have a generational counter stored in each incoming/outgoing stream which is compared to the "current generation", an atomic counter, stored within the Arc. That way invalidation would be "increment the generation" and that may work out?

For naming ok sounds like we're already a bit inconsistent, so I think it's ok to keep the names here 👍

For check-write you're correct that you can always still pass in more data, but implementations can trap when this happens vs being expected to work correctly. I also think that for datagrams the number returned from check-write should probably be "size of all datagrams summed up" since you're right that the size of an individual datagram can change. Unless we can place a hard cap on the size of the datagram though in which case implementations could trap if any individual datagram is too large.

@badeend
Copy link
Contributor Author

badeend commented Oct 16, 2023

Potentially answering my own question:

calling connect has to invalidate the previously returned streams somehow

Rather than performing validation in the *-stream code, maybe we can require the consumer to have dropped all prior child streams before calling connect (again) and trap otherwise. That way it should be impossible to have multiple competing streams active at the same time.

@alexcrichton
Copy link
Member

Ah I think we raced there a bit, but I like your idea more of requiring the streams are dropped than my idea of a generation counter

Introduce new `inbound-datagram-stream` and `outbound-datagram-stream` types and moved `receive` and `send` methods to those respectively. These streams are returned by `bind` can be individually subscribed to. This resolves a design issue where a UDP server  would end up in a spin loop because `receive` returned EWOULDBLOCK but poll_* always returned immediately because the socket was ready for sending. In this new setup, users can poll each direction separately. Fixes WebAssembly/wasi-sockets#64

Additionally:
- Enable send-like behaviour by making `outbound-datagram::remote-address` optional. Fixes WebAssembly/wasi-sockets#57
- Dropped the `network` parameter from the `connect` call, because `bind` is now _required_ to perform IO.
Remove the Mutex again. Instead allow `stream` to be called multiple times, but trap if the previous streams are still active.
@badeend
Copy link
Contributor Author

badeend commented Oct 23, 2023

This is ready for review again.

I have:

  • Updated the API to disallow multiple pairs of active streams.
  • Updated the API to mostly match wasi-io's input-stream & output-stream.
  • Removed the mutex
  • Renamed the resources to align with wasi-http.

Copy link
Member

@alexcrichton alexcrichton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great to me, thanks again for this!

@alexcrichton alexcrichton added this pull request to the merge queue Oct 24, 2023
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Oct 24, 2023
@badeend
Copy link
Contributor Author

badeend commented Oct 24, 2023

Alright, looks like the MacOS issue is fixed now

@alexcrichton alexcrichton added this pull request to the merge queue Oct 25, 2023
Merged via the queue into bytecodealliance:main with commit a6a9bdf Oct 25, 2023
40 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
wasi Issues pertaining to WASI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

UDP subscribe can't be split between read/write
2 participants