-
Notifications
You must be signed in to change notification settings - Fork 266
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Re-introduces internal fsapi.File with non-blocking methods #1613
Conversation
// - No-op files, such as those which read from /dev/null, should return | ||
// immediately true, as data will never become available. | ||
// - See /RATIONALE.md for detailed notes including impact of blocking. | ||
Poll(flag Pflag, timeoutMillis int32) (ready bool, errno experimentalsys.Errno) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the thing that tripped the whole debate was me noticing a PR that basically obviated this parameter based on concerns about blocking. Rather than get through that debate, which seems still murky, this moves the whole feature set internal. This allows users to still supply filesystems even though there remain plenty of blocking methods, such as Read.
the main impact of this change is now you cannot really wrap all functions used by wasi anymore. However, it is still better than not exposing the ones that are stable. I wasn't aware the File.Poll interface wasn't ready until #1606 which sadly happened after I promised publicly we would release our file api. 🤷 I tried.. |
Folks aren't currently on the same page on whether or not users should be able to implement the timeout parameter on `File.Poll` or whether it should always be externally controlled via a sleep loop which uses poll (immediate). Until we are sure internally, we shouldn't expose this for implementation, as it would be confusing for the same reason. This migrates the non-blocking functionality back into the internal package. One future way out could be to allow the user to decide if the backend is allowed to use the timeout parameter or not. Just as they control the FS, they could control a poller, which would then be able to span across both normal files and sockets. One could then be able to choose in the case of a single file and a specific known wasm, simply use the native timeout. In other words, since there are valid alternate answers, even though Go doesn't support custom pollers we could, if we decide that native polling is basically not allowed. Alternatively, we could force it to never be allowed by removing the timeout parameter (making it a poll immediate). Anyway, I will leave that decision up to @ncruces and @evacchi who will take responsibility of whichever decision is made. Signed-off-by: Adrian Cole <adrian@tetrate.io>
f82e74c
to
e7cf1de
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my opinion, non-blocking I/O on file systems belongs to the realm of advanced optimizations, which most programs will never successfully leverage, so starting with exposing a blocking-only API seems like it will still unlock most use cases that need to support customizing the file system layer.
I think it's a much safer move to test the experimental sys.File
as a blocking API only, and introduce non-blocking on file system operations only when we have a clear use case for it. Take Go as an example, even if the language is designed for concurrency, the runtime still performs blocking I/O on files and simply spawns more threads on demand because the async I/O facilities are usually too limited to be useful as a general purpose solution (in other words, they tend to create more problems than they solve).
Some more details about my experience on the topic:
-
Read-heavy file I/O tend to benefit from the kernel page cache, in which case the reads barely ever block because the data usually resides in memory; the optimization path for those programs tend to be more layers of caches in user space, using mmap, or specializations like sendfile/splice, rarely is async I/O the right answer.
-
Applications doing write-heavy operations often use write-optimized data structures such as write-ahead logs, in which cases the opportunities for non-blocking I/O are very limited; writes hit the page cache, and the only operations that would be truly blocking are synchronization points like fsync/fdatasync, in which case blocking the current thread is usually acceptable. Once again, the complexity of syncing with async I/O rarely justifies the investment.
I hope the perspective is useful, I want to emphasize that I'm in favor of this change, I think that a simpler API will help address most of the reasons why we wanted to expose the file system API in the first place, support for non-blocking file I/O will likely deserve more research to get right 👍
I hear you @achille-roussel, though we have spent a long time on this and I think that there are impacts by not doing it.. basically it will never be done. The main cons are..
TL;DR; I think it will make people have to use external projects that completely re-implement the whole syscall layer, if they want to affect non-blocking implementation. Worst case is no one will move this past the finish line later, so it will permanently reside in this state. |
As most of the downside are limited to pre-opens which are special cased anyway, I think we're good here. It can be re-introduced later and meanwhile not block wazero 1.4 |
…abs#1613) Signed-off-by: Adrian Cole <adrian@tetrate.io> Signed-off-by: Jeroen Bobbeldijk <jeroen@klippa.com>
Folks aren't currently on the same page on whether or not users should be able to implement the timeout parameter on
File.Poll
or whether it should always be externally controlled via a sleep loop which uses poll (immediate). Until we are sure internally, we shouldn't expose this for implementation, as it would be confusing for the same reason.This migrates the non-blocking functionality back into the internal package.
One future way out could be to allow the user to decide if the backend is allowed to use the timeout parameter or not. Just as they control the FS, they could control a poller, which would then be able to span across both normal files and sockets. One could then be able to choose in the case of a single file and a specific known wasm, simply use the native timeout. In other words, since there are valid alternate answers, even though Go doesn't support custom pollers we could, if we decide that native polling is basically not allowed. Alternatively, we could force it to never be allowed by removing the timeout parameter (making it a poll immediate). Anyway, I will leave that decision up to @ncruces and @evacchi who will take responsibility of whichever decision is made.