Review of IO blocking behaviour #24526
Comments
I'm not an IO guru so have nothing concrete to add, but just have to say wow, really nice analysis. |
What do you want |
Hi @stevengj, I'm not sure what the right interface is, but I am sure that it needs to be well specified and unambiguous so that compatible implementations can be dropped in and "just work". I understand that a very common use case for simple files is to want to read to the end of what is in the file now. However, the less common use case of reading from a file that is occasionally appended to by a 3rd party needs to be handled too. Consider also that the
Maybe the answer is to have a user selectable blocking/non-blocking mode ( Maybe it would be better to have clearly distinct types for simple-file-like-things and stream-like-things. Or maybe it's best to enforce blocking everywhere and encourage the use of tasks and/or patterns like Another alternative would be to have a |
What does this mean?
|
Hi @bicycle1885,
julia> io = IOBuffer()
IOBuffer(data=UInt8[...], readable=true, writable=true, seekable=true, append=false, size=0, maxsize=Inf, ptr=1, mark=-1)
julia> eof(io)
true
julia> write(io, "Hello")
5
julia> eof(io)
true
julia> write(io, " World!")
7
julia> eof(io)
true
julia> read(io)
0-element Array{UInt8,1}
julia> String(take!(io))
"Hello World!" |
|
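A minimal sketch of what the session above appears to show, under the assumption that a plain `IOBuffer()` keeps a single position for both reading and writing: after a `write` the position sits at the end of the data, so `eof` is `true` until the position is moved back. `PipeBuffer()` is included for contrast, since it appends on write and reads from the start. The helper names `iobuffer_demo` and `pipebuffer_demo` are hypothetical and exist only for this illustration.

```julia
# Hypothetical helper: write `msg` to a fresh IOBuffer and report
# eof() before and after rewinding, plus what a read then returns.
function iobuffer_demo(msg::AbstractString)
    io = IOBuffer()
    write(io, msg)
    eof_after_write = eof(io)    # position is at the end of the written data
    seekstart(io)                # rewind to the first byte
    eof_after_rewind = eof(io)
    text = read(io, String)
    return (eof_after_write, eof_after_rewind, text)
end

# Hypothetical helper: the same steps with a PipeBuffer, without rewinding.
function pipebuffer_demo(msg::AbstractString)
    io = PipeBuffer()
    write(io, msg)
    eof_after_write = eof(io)    # unread data is available
    text = read(io, String)
    return (eof_after_write, text, eof(io))
end
```

Neither helper says anything about blocking; both types are in-memory buffers, so the sketch only covers the position bookkeeping.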
@bicycle1885 I accept that the behaviour of The point of posting this issue is to highlight behaviour inconsistencies between the various IO types. I realised that I'd kind of figured out how to use the IO types I was familiar with, but that this was largely a matter of trial and error and copying usage patterns from other code. When it came to trying to debug the HTTP.jl package, and coming up against questions of "how should this function behave", it wasn't a simple matter of "read and follow the spec". The task seemed to come down to figuring out which of the many IO variants in Base was most like what we wanted and emulating it as best we could. Maybe the solution is just to have clearer documentation, or as I said above "Maybe it would be better to have clearly distinct types for simple-file-like-things and stream-like-things.". Maybe there needs to be separate documentation for |
This is a great analysis and we do want to make sure all of this is sane and consistent, but it's a bit too much to bite off for the impending 1.0 release, so based on discussion on triage, we're going to say that I/O blocking behavior is not yet stable and may change in the 1.x series. We will have to be careful to make sure that any changes we make don't break key libraries (e.g. HTTP). |
Luckily, HTTP.jl currently uses its own streaming buffer type ( |
Maybe we need something like @oxinabox's https://github.com/oxinabox/InterfaceTesting.jl for IO. |
A bit ago I wrote up some notes trying to wrap my brain around the streaming interface in the context of trying to make my SampledSignals.jl stream semantics better match Base. Definitely not as nicely-organized as @samoconnor's work here, but might be useful. Also relevant is JuliaIO/FileIO.jl#78, where I'd like to support opening up media files in a streaming way, where you end up with a stream of images or audio samples, rather than a stream of bytes. [edit: premature submission and wrong issue link] |
I suspect if we add |
The main difference between the way I designed the SampledSignals streams and Base is that I don't have an
const N = 4096 # amount to read each time
while true
    buf = read(stream, N)
    # do stuff with buf
    length(buf) == N || break
end
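The loop above can be wrapped into a runnable sketch. It assumes that `read(stream, n)` returns fewer than `n` bytes only at the end of the stream, which is the behaviour this issue reports as not holding for every IO type. The helper name `chunk_lengths` is hypothetical.

```julia
# Hypothetical helper: read `stream` in chunks of `n` bytes, stopping at the
# first short read, and return the length of every non-empty chunk.
function chunk_lengths(stream::IO, n::Integer)
    lengths = Int[]
    while true
        buf = read(stream, n)
        isempty(buf) || push!(lengths, length(buf))
        length(buf) == n || break   # a short read is taken as end of stream
    end
    return lengths
end
```

When the data length is an exact multiple of `n`, the loop needs one extra read that returns zero bytes before it can stop.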
|
Another thing: |
After the refinements proposed above, we would end up with this:
|
@StefanKarpinski where do you see this fitting into the This kind of bug comes up quite frequently: JuliaLang/MbedTLS.jl#186 where it kind of works but seems a bit glitchy or slow sometimes. The root cause is often that someone wrote some generic code that assumed a read would be non-blocking, but by the time the call makes its way through a few generic Julia should be a great language for building easily composable pieces of IO machinery; and for building data consumers and producers that work regardless of where data came from (or how it was buffered, encrypted, proxied, cached, etc). But, unless the |
I don't know, I'm not really the person to ask. @JeffBezanson, @vtjnash, @Keno, opinions on this? |
Awesome work - I think this is a really nice and consistent architecture, and I've also been bitten by the current state of things. I'm doubtful that this would be a 1.x sort of thing though, right? Seems like some of these changes would be breaky enough that they'd need to wait until 2.0. Questions
Comments
Another asymmetry in the allocating/non-allocating versions is when you want to get a
I agree it's weird for |
Yes, corrected above.
I think we should keep the
To me having a I'm not sure that I understand what use case the auto-sizing is supposed to help with.
buf = Vector{UInt8}(undef, 4096)
while !eof(io)
    readbytes!(io, buf)
    render(my_display, buf)
end
I start out reading small chunks of data and achieving low output latency.
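A hedged variant of the fixed-buffer loop above: `readbytes!` returns the number of bytes it placed in the buffer, so a consumer can forward only that prefix instead of the whole buffer. The helper name `copy_in_chunks` is hypothetical, and writing to `out` stands in for the `render` call.

```julia
# Hypothetical helper: copy `io` to `out` through a fixed-size buffer,
# forwarding only the bytes that each readbytes! call actually filled.
function copy_in_chunks(io::IO, out::IO; chunk::Integer = 4096)
    buf = Vector{UInt8}(undef, chunk)
    total = 0
    while !eof(io)
        n = readbytes!(io, buf)       # number of bytes placed in buf
        write(out, view(buf, 1:n))    # ignore stale bytes past n
        total += n
    end
    return total
end
```

Whether each `readbytes!` call waits for a full buffer or returns early depends on the IO type, which is the inconsistency under discussion.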
The current
That would be my preference.
That sounds pretty sane. I guess the
Yes. I agree. |
Question, as of 2020-02-11 what is the recommended, general way to read from a stream multiple times into a preallocated array (possibly without extra allocations)? (e.g., if read provided the start/end range)
buf = Vector{UInt8}(undef, 30)
read!(buf, start=1, end=10)
read!(buf, start=11, end=20)
read!(buf, start=21, end=30)
Would slicing with
ssize_t read(int fd, void *buf, size_t count);
size_t fread(void *ptr, size_t size, size_t nmemb, FILE *stream); |
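One possible approach, offered as a sketch and not as an official recommendation: `read!` accepts an `AbstractArray`, so a `view` of the preallocated buffer can serve as the destination for each slice without copying the data. The helper name `fill_in_slices!` is hypothetical.

```julia
# Hypothetical helper: fill `buf` from `io` in consecutive slices of `step`
# elements by passing views to read!, so the data is written in place.
function fill_in_slices!(buf::Vector{UInt8}, io::IO, step::Integer)
    for start in 1:step:length(buf)
        stop = min(start + step - 1, length(buf))
        read!(io, view(buf, start:stop))   # throws EOFError if io runs out
    end
    return buf
end
```

For example, `fill_in_slices!(Vector{UInt8}(undef, 30), io, 10)` performs three reads of 10 bytes each. Note that `read!` requires the full slice to be available and throws at end of file, unlike `readbytes!`, which reports a short count.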
Fixes one of the issues mentioned in #24526
This name was suggested in #24526 and * Has a good analogy to close(), so people are likely to be able to guess what it means. * Is more specific to IO (conversely, it's easy to imagine shutdown() being wanted for any number of things unrelated to closing the write side of an IO stream).
I've recently been helping @quinnj to debug his new HTTP.jl package, in particular a new `FIFOBuffer` type that is intended to behave like the IO buffers in Base (#87, #86, #76, #75, #74). The process of trying to be consistent with Base has highlighted a number of IO behaviour inconsistencies. I've hacked up a script that runs through a sequence of IO operations for various types and produces a Markdown table of the results (see below).

The two main issues are with the blocking behaviour of `read()` and `eof()`.

The spec for `eof()` says: "If the stream is not yet exhausted, this function will block to wait for more data if necessary, and then return false."
- `eof()` works per spec for `BufferStream` and `TCPSocket`.
- For `Filesystem.File`, `IOStream`, and `PipeBuffer`, `eof()` does not block to wait for more data. Instead it returns `true` without blocking if there is not currently data available to be read.
- For `IOBuffer` it seems that `eof()` just always returns `true`.

The spec for `read(::IO, ::Int)` says: "[By default] this function will block repeatedly trying to read all requested bytes, until an error or end-of-file occurs."
- For `BufferStream` and `TCPSocket`, `read()` behaves as specified.
- `read()` seems to just return however many bytes are available at the time and does not block.
- For `IOBuffer`, `read()` always returns an empty array.

Other issues:
- For `BufferStream` and `IOStream`, `isreadable()` returns `true` after `close()` is called.
- For `BufferStream`, `iswritable()` returns `true` after `close()`.
- For `BufferStream` and `IOStream`, `read()` after `close()` returns empty data, whereas the other types throw an error.
- For `BufferStream` and `TCPSocket`, `read(io, String)` blocks until the stream is closed (this seems consistent with the blocking behaviour of `read(io, nb)`); however, the other types return immediately with a string containing however many bytes are available at the time.
- `mark`/`reset` don't work for `BufferStream` (see "mark/reset broken for BufferStream", #24465).
- There is no equivalent of `shutdown(fd, SHUT_WR)` or `uv_shutdown()`. If a TCP server waits for a request to be sent before responding, a Julia client would have to call `close()` to signal end of request, but would then be unable to read the response.
- `shutdown` is related to the inconsistencies with `isreadable()` and `iswritable()`. It seems like maybe there should be a `closeread()` and `closewrite()` that respectively cause `isreadable()` and `iswritable()` to return `false`.
- `close` on `BufferStream` causes `eof()` = true and `isopen()` = false, but `iswritable()` and `isreadable()` are both still true, and in fact reads and writes continue to work with the closed stream. It seems that `BufferStream` would benefit from a separate `closewrite()` for signalling `eof()` to the reader.

Note that `TCPSocket` behaves almost the same as `BufferStream` but without the `isopen()` and `isreadable()` issues: