Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

UDP subscribe can't be split between read/write #64

Closed
alexcrichton opened this issue Oct 9, 2023 · 2 comments · Fixed by bytecodealliance/wasmtime#7243 or #70
Closed

UDP subscribe can't be split between read/write #64

alexcrichton opened this issue Oct 9, 2023 · 2 comments · Fixed by bytecodealliance/wasmtime#7243 or #70

Comments

@alexcrichton
Copy link
Contributor

I'm debugging a flaky test right now in Wasmtime's CI, and one interesting behavior I'm seeing is that there's a blocking_receive helper for UDP sockets which tries to wait in a loop, and I was going to add an upper limit to the loop to help time it out. This loop, however, never blocks. The socket is always ready for I/O, despite the receive call not actually returning anything. Still trying to figure out the latter, but for the former I believe one issue is that the subscribe for UDP sockets is either readable or writable readiness right now so if a socket is acting as a server then there's no way to block it for read readiness since it's always writable if you're not writing.

alexcrichton added a commit to alexcrichton/wasmtime that referenced this issue Oct 9, 2023
This commit is an attempt to fix some networking tests in synchronous
mode in our test suite. Currently networking tests don't actually run in
synchronous mode on CI which is why no failures have been surfaced yet,
but the refactoring in bytecodealliance#7182 is going to start doing this.

Currently the `udp_sample_application.rs` test blocks infinitely in
synchronous mode for me locally, most of the time. This appears to be an
interaction between how Tokio handles readiness and how we're
entering the event loop. We're effectively entering the Tokio event loop
with a future that's always ready which ends up starving Tokio of
otherwise performing its background work such as updating flags for
readiness of reading/writing.

The fix here is to add a yield at the start of an `in_tokio` block which
is used in synchronous mode. This is a kludge fix but the intention is
to enable Tokio to have a chance to update readiness flags and process
events from epoll/kqueue/etc.

An additional fix to this issue is WebAssembly/wasi-sockets#64 where the
test is waiting on `READABLE` or `WRITABLE`, but in this specific case
it should only wait on `READABLE`. If it waited on just this then that
would also fix this issue. Nevertheless having a `yield_now` is expected
to have little-to-no overhead and otherwise fix this edge case of an
always-ready future.
alexcrichton added a commit to alexcrichton/wasmtime that referenced this issue Oct 9, 2023
This commit is an attempt to fix some networking tests in synchronous
mode in our test suite. Currently networking tests don't actually run in
synchronous mode on CI which is why no failures have been surfaced yet,
but the refactoring in bytecodealliance#7182 is going to start doing this.

Currently the `udp_sample_application.rs` test blocks infinitely in
synchronous mode for me locally, most of the time. This appears to be an
interaction between how Tokio handles readiness and how we're
entering the event loop. We're effectively entering the Tokio event loop
with a future that's always ready which ends up starving Tokio of
otherwise performing its background work such as updating flags for
readiness of reading/writing.

The fix here is to add a yield at the start of an `in_tokio` block which
is used in synchronous mode. This is a kludge fix but the intention is
to enable Tokio to have a chance to update readiness flags and process
events from epoll/kqueue/etc.

An additional fix to this issue is WebAssembly/wasi-sockets#64 where the
test is waiting on `READABLE` or `WRITABLE`, but in this specific case
it should only wait on `READABLE`. If it waited on just this then that
would also fix this issue. Nevertheless having a `yield_now` is expected
to have little-to-no overhead and otherwise fix this edge case of an
always-ready future.
@badeend
Copy link
Collaborator

badeend commented Oct 9, 2023

Yup. Discovered the same thing yesterday. I was already working on an updated design. If it is causing issues for the CI, maybe we should temporarily comment the test out

@alexcrichton
Copy link
Contributor Author

Nah no worries about CI, this ended up uncovering a separate bug (linked above) which needs to be orthogonally fixed. Wanted to file this to keep it on record though!

alexcrichton added a commit to alexcrichton/wasmtime that referenced this issue Oct 9, 2023
This commit is an attempt to fix some networking tests in synchronous
mode in our test suite. Currently networking tests don't actually run in
synchronous mode on CI which is why no failures have been surfaced yet,
but the refactoring in bytecodealliance#7182 is going to start doing this.

Currently the `udp_sample_application.rs` test blocks infinitely in
synchronous mode for me locally, most of the time. This appears to be an
interaction between how Tokio handles readiness and how we're
entering the event loop. We're effectively entering the Tokio event loop
with a future that's always ready which ends up starving Tokio of
otherwise performing its background work such as updating flags for
readiness of reading/writing.

The fix here is to add a yield at the start of an `in_tokio` block which
is used in synchronous mode. This is a kludge fix but the intention is
to enable Tokio to have a chance to update readiness flags and process
events from epoll/kqueue/etc.

An additional fix to this issue is WebAssembly/wasi-sockets#64 where the
test is waiting on `READABLE` or `WRITABLE`, but in this specific case
it should only wait on `READABLE`. If it waited on just this then that
would also fix this issue. Nevertheless having a `yield_now` is expected
to have little-to-no overhead and otherwise fix this edge case of an
always-ready future.
github-merge-queue bot pushed a commit to bytecodealliance/wasmtime that referenced this issue Oct 9, 2023
* Refactor the test-programs test suite

This commit is a large refactoring that reorganizes `test-programs` and
how we tests wasms in Wasmtime. Often writing tests requires complicated
interactions with the guest which can't be done via hand-written `*.wat`
syntax and requires a compiler to get engaged. For this purpose Wasmtime
currently has the `crates/test-programs/*` test suite which builds files
from source and then runs the tests. This has been somewhat cumbersome
in the past though and it's not been easy to extend this over time, so
this commit attempts to address this.

The scheme implemented in this PR looks like:

* All wasm test programs live in `crates/test-programs/src/bin/*.rs`.
  All of them, no exceptions.

* Wasm tests have shared support located at
  `crates/test-programs/src/lib.rs` and its submodules, such as bindings
  generation for WASI.

* Wasm tests are built by a new `crates/test-programs/artifacts` crate.
  This crate compiles modules and additionally creates components for
  all test programs. The crate itself only records the path to these
  outputs and a small amount of testing support, but otherwise doesn't
  interact with `wasmtime`-the-crate itself.

* All tests in `crates/test-programs/tests/*.rs` have moved. For example
  wasi-http tests now live at `crates/wasi-http/tests/*.rs`. Legacy
  tests of wasi-common now live at `crates/wasi-common/tests/*.rs`.
  Modern tests for preview2 live at `crates/wasi/tests/*.rs`.

* Wasm tests are bucketed based on their filename prefix. For example
  `preview1_*` is tested in wasi-common and wasmtime-wasi. The
  `preview2_*` prefix is only tested with wasmtime-wasi, however.

* A new `cli_*` prefix is used to execute tests as part of
  `tests/all/main.rs`. This is a new submodule in
  `tests/all/cli_tests.rs` which executes these components on the
  command line. Many old "command" tests were migrated here.

* Helper macros are generated to assert that a test suite is run in its
  entirety. This way if a `preview1_*` test is added it's asserted to
  get added to both wasi-common and wasmtime-wasi in the various modes
  they run tests.

Overall this moved a number of tests around and refactored some edges of
the tests, but this should not lose any tests (except one that wasn't
actually testing anything). Additionally the hope is that it's much
easier to add tests in the future. The process is to add a new file in
`crates/test-programs/src/bin/*.rs` named appropriately. For example a
preview2 executable is `preview2_*` and a CLI tests is `cli_*`. When
building the test suite an error is generated in the appropriate module
then of "please write a test here", and then a test is written in the
same manner as the other tests in the module.

* Remove no-longer-needed fetches

prtest:full

* I'm worried wasi is running low on semicolons

* Add the WASI target in all CI actions

* Add unknown-unknown target on all CI builders too

* Fix building test artifacts under miri

Need to avoid wrappers for these cross-compiled targets

* Break circular dependency for packaging

Don't use the workspace dep for `wasmtime-wasi` since it injects a
version, instead use a `path = '..'` dependency to fool Cargo into
dropping the dependency during the package phase.

* Fix some merge conflicts with tests

* Fix rebase for new tests

* Remove stray comment

* Fix some flaky tests

* Fix network tests in synchronous mode

This commit is an attempt to fix some networking tests in synchronous
mode in our test suite. Currently networking tests don't actually run in
synchronous mode on CI which is why no failures have been surfaced yet,
but the refactoring in #7182 is going to start doing this.

Currently the `udp_sample_application.rs` test blocks infinitely in
synchronous mode for me locally, most of the time. This appears to be an
interaction between how Tokio handles readiness and how we're
entering the event loop. We're effectively entering the Tokio event loop
with a future that's always ready which ends up starving Tokio of
otherwise performing its background work such as updating flags for
readiness of reading/writing.

The fix here is to add a yield at the start of an `in_tokio` block which
is used in synchronous mode. This is a kludge fix but the intention is
to enable Tokio to have a chance to update readiness flags and process
events from epoll/kqueue/etc.

An additional fix to this issue is WebAssembly/wasi-sockets#64 where the
test is waiting on `READABLE` or `WRITABLE`, but in this specific case
it should only wait on `READABLE`. If it waited on just this then that
would also fix this issue. Nevertheless having a `yield_now` is expected
to have little-to-no overhead and otherwise fix this edge case of an
always-ready future.

* Fix passing empty arguments on the CLI

* Add another blocking accept

* Update crates/test-programs/src/bin/api_proxy.rs

Co-authored-by: Trevor Elliott <awesomelyawesome@gmail.com>

---------

Co-authored-by: Trevor Elliott <awesomelyawesome@gmail.com>
badeend added a commit to badeend/wasmtime that referenced this issue Oct 15, 2023
Introduce new `inbound-datagram-stream` and `outbound-datagram-stream` types and moved `receive` and `send` methods to those respectively. These streams are returned by `bind` can be individually subscribed to. This resolves a design issue where a UDP server  would end up in a spin loop because `receive` returned EWOULDBLOCK but poll_* always returned immediately because the socket was ready for sending. In this new setup, users can poll each direction separately. Fixes WebAssembly/wasi-sockets#64

Additionally:
- Enable send-like behaviour by making `outbound-datagram::remote-address` optional. Fixes WebAssembly/wasi-sockets#57
- Dropped the `network` parameter from the `connect` call, because `bind` is now _required_ to perform IO.
badeend added a commit to badeend/wasmtime that referenced this issue Oct 23, 2023
Introduce new `inbound-datagram-stream` and `outbound-datagram-stream` types and moved `receive` and `send` methods to those respectively. These streams are returned by `bind` can be individually subscribed to. This resolves a design issue where a UDP server  would end up in a spin loop because `receive` returned EWOULDBLOCK but poll_* always returned immediately because the socket was ready for sending. In this new setup, users can poll each direction separately. Fixes WebAssembly/wasi-sockets#64

Additionally:
- Enable send-like behaviour by making `outbound-datagram::remote-address` optional. Fixes WebAssembly/wasi-sockets#57
- Dropped the `network` parameter from the `connect` call, because `bind` is now _required_ to perform IO.
github-merge-queue bot pushed a commit to bytecodealliance/wasmtime that referenced this issue Oct 24, 2023
* Introduce UDP streams

Introduce new `inbound-datagram-stream` and `outbound-datagram-stream` types and moved `receive` and `send` methods to those respectively. These streams are returned by `bind` can be individually subscribed to. This resolves a design issue where a UDP server  would end up in a spin loop because `receive` returned EWOULDBLOCK but poll_* always returned immediately because the socket was ready for sending. In this new setup, users can poll each direction separately. Fixes WebAssembly/wasi-sockets#64

Additionally:
- Enable send-like behaviour by making `outbound-datagram::remote-address` optional. Fixes WebAssembly/wasi-sockets#57
- Dropped the `network` parameter from the `connect` call, because `bind` is now _required_ to perform IO.

* Align names with wasi-http

* Revert previous changes to `bind`. Replace `connect` with `stream`

Remove the Mutex again. Instead allow `stream` to be called multiple times, but trap if the previous streams are still active.

* The code block was treated as Rust code.

* Align more closely to wasi-io's input&output-stream
github-merge-queue bot pushed a commit to bytecodealliance/wasmtime that referenced this issue Oct 25, 2023
* Introduce UDP streams

Introduce new `inbound-datagram-stream` and `outbound-datagram-stream` types and moved `receive` and `send` methods to those respectively. These streams are returned by `bind` can be individually subscribed to. This resolves a design issue where a UDP server  would end up in a spin loop because `receive` returned EWOULDBLOCK but poll_* always returned immediately because the socket was ready for sending. In this new setup, users can poll each direction separately. Fixes WebAssembly/wasi-sockets#64

Additionally:
- Enable send-like behaviour by making `outbound-datagram::remote-address` optional. Fixes WebAssembly/wasi-sockets#57
- Dropped the `network` parameter from the `connect` call, because `bind` is now _required_ to perform IO.

* Align names with wasi-http

* Revert previous changes to `bind`. Replace `connect` with `stream`

Remove the Mutex again. Instead allow `stream` to be called multiple times, but trap if the previous streams are still active.

* The code block was treated as Rust code.

* Align more closely to wasi-io's input&output-stream

* Use `send` instead of `sendto` on connected sockets.

prtest:full
badeend added a commit to badeend/wasi-sockets that referenced this issue Nov 8, 2023
# TCP
- Allow `accept()` to return transient errors.
  The original provision was added to align with preview3 streams that may only fail once. However, after discussing with Dan Gohman, we came to the conclusion that a `stream` of `result<>` could do the trick fine too.
    Fixes: WebAssembly#22
- Fold `ephemeral-ports-exhausted` into `address-in-use`. There is no cross-platform way to know the distinction between them.
- Remove `concurrency-conflict` clutter,
  and just document it to be always possible.
- Simplify "not supported", "invalid argument" and "invalid state" error cases.
  There is a myriad of reasons why an argument might be invalid or an operation might be not supported. But there is few cross platform consistency in which of those error cases result in which error codes. Many wasi-sockets codes were unnecessarily detailed and had no standardized equivalent in POSIX, so wasi-libc will probably just map them all back into a single EOPNOTSUPP or EINVAL or ...
- Remove create-tcp/udp-socket not supported errors.
  These stem from back when the entire wasi-sockets proposal was one big single thing. In this day and age, when an implementation doesn't want to support TCP and/or UDP, it can simply _not_ implement that interface, rather than returning an error at runtime.
- Document that `connect` may return ECONNABORTED
- Document the set of socket options that are inherited through `accept`
- Clarify `connect` failure state:
```md
	POSIX mentions:
	> If connect() fails, the state of the socket is unspecified. Conforming applications should
	> close the file descriptor and create a new socket before attempting to reconnect.

	WASI prescribes the following behavior:
	- If `connect` fails because an input/state validation error, the socket should remain usable.
	- If a connection was actually attempted but failed, the socket should become unusable for further network communication.
```
- Clarify `local-address` behavior on unbound socket:
```md
POSIX mentions:
> If the socket has not been bound to a local name, the value
> stored in the object pointed to by `address` is unspecified.

WASI is stricter and requires `local-address` to return `not-bound` when the socket hasn't been bound yet.
```
- Remove TCP_NODELAY for the time being. The semantics of TCP_NODELAY (and TCP_CORK for that matter) and its effects on the output-stream needs to investigated and specified. I don't expect there to be anything insurmountable. Its just that I haven't had the time to do so yet and I can't promise to have it done before the stabilization Preview2. So, in order to get wasi-sockets ready for Preview2, it was discussed to temporarily remove `no-delay` and reevaluate its inclusion before Preview3.

# UDP

- Introduce new `incoming-datagram-stream` and `outgoing-datagram-stream` types and moved `receive` and `send` methods to those respectively. These streams are returned by `stream` and can be individually subscribed to. This resolves a design issue where a UDP server would end up in a spin loop because `receive` returned EWOULDBLOCK but poll_* always returned immediately because the socket was ready for sending. In this new setup, users can poll each direction separately. Fixes WebAssembly#64
- Dropped the `network` parameter from the `connect` call, because `bind` is now _required_ to perform IO.
- Enable send-like behaviour by making `outgoing-datagram::remote-address` optional. Fixes WebAssembly#57

# IP name lookup
- Remove the non-essential parameters for now. Post-preview2 these can be reevaluated again.
- Lift the restriction against parsing IP addresses. Before, implementations still needed to parse IP addresses to decide whether or not to return an error.
badeend added a commit to badeend/wasi-sockets that referenced this issue Nov 8, 2023
- Allow `accept()` to return transient errors.
  The original provision was added to align with preview3 streams that may only fail once. However, after discussing with Dan Gohman, we came to the conclusion that a `stream` of `result<>` could do the trick fine too.
    Fixes: WebAssembly#22
- Fold `ephemeral-ports-exhausted` into `address-in-use`. There is no cross-platform way to know the distinction between them.
- Remove `concurrency-conflict` clutter,
  and just document it to be always possible.
- Simplify "not supported", "invalid argument" and "invalid state" error cases.
  There is a myriad of reasons why an argument might be invalid or an operation might be not supported. But there is few cross platform consistency in which of those error cases result in which error codes. Many wasi-sockets codes were unnecessarily detailed and had no standardized equivalent in POSIX, so wasi-libc will probably just map them all back into a single EOPNOTSUPP or EINVAL or ...
- Remove create-tcp/udp-socket not supported errors.
  These stem from back when the entire wasi-sockets proposal was one big single thing. In this day and age, when an implementation doesn't want to support TCP and/or UDP, it can simply _not_ implement that interface, rather than returning an error at runtime.
- Document that `connect` may return ECONNABORTED
- Document the set of socket options that are inherited through `accept`
- Clarify `connect` failure state:
```md
	POSIX mentions:
	> If connect() fails, the state of the socket is unspecified. Conforming applications should
	> close the file descriptor and create a new socket before attempting to reconnect.

	WASI prescribes the following behavior:
	- If `connect` fails because an input/state validation error, the socket should remain usable.
	- If a connection was actually attempted but failed, the socket should become unusable for further network communication.
```
- Clarify `local-address` behavior on unbound socket:
```md
POSIX mentions:
> If the socket has not been bound to a local name, the value
> stored in the object pointed to by `address` is unspecified.

WASI is stricter and requires `local-address` to return `not-bound` when the socket hasn't been bound yet.
```
- Remove TCP_NODELAY for the time being. The semantics of TCP_NODELAY (and TCP_CORK for that matter) and its effects on the output-stream needs to investigated and specified. I don't expect there to be anything insurmountable. Its just that I haven't had the time to do so yet and I can't promise to have it done before the stabilization Preview2. So, in order to get wasi-sockets ready for Preview2, it was discussed to temporarily remove `no-delay` and reevaluate its inclusion before Preview3.

- Introduce new `incoming-datagram-stream` and `outgoing-datagram-stream` types and moved `receive` and `send` methods to those respectively. These streams are returned by `stream` and can be individually subscribed to. This resolves a design issue where a UDP server would end up in a spin loop because `receive` returned EWOULDBLOCK but poll_* always returned immediately because the socket was ready for sending. In this new setup, users can poll each direction separately. Fixes WebAssembly#64
- Dropped the `network` parameter from the `connect` call, because `bind` is now _required_ to perform IO.
- Enable send-like behaviour by making `outgoing-datagram::remote-address` optional. Fixes WebAssembly#57

- Remove the non-essential parameters for now. Post-preview2 these can be reevaluated again.
- Lift the restriction against parsing IP addresses. Before, implementations still needed to parse IP addresses to decide whether or not to return an error.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants