Check for close_notify message in Base.isopen(ctx::SSLContext) #145
Review request: @vtjnash, @quinnj, @malmaud, @KristofferC
The logic seems sound to me. I find it a little difficult to imagine any downsides to a change like this: will any of these new calls unexpectedly throw? Is it bad to call `Base.start_reading` if `ctx.bio` happens to be in a bad state or something?
I think it all seems good though.
Codecov Report
@@ Coverage Diff @@
## master #145 +/- ##
==========================================
+ Coverage 68.82% 69.57% +0.75%
==========================================
Files 10 10
Lines 433 447 +14
==========================================
+ Hits 298 311 +13
- Misses 135 136 +1
It's fascinating to realize there's actually a spec for this race condition in the HTTP/1.1 keep-alive documentation: https://www.w3.org/Protocols/rfc2616/rfc2616-sec8.html. Accordingly, apparently nobody handles this particularly well.
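For context, the linked RFC section suggests that a client reopen the connection and retransmit when the server closes a persistent connection mid-request, as long as the request is idempotent. A minimal sketch of that idea follows. `retry_on_eof` is a hypothetical helper written for illustration, not something from MbedTLS.jl or HTTP.jl, and the "request" here is a stand-in closure rather than real network I/O.

```julia
# Hypothetical helper (not part of MbedTLS.jl or HTTP.jl): run `f`, retrying
# when the connection turns out to have been closed by the peer.
function retry_on_eof(f; retries::Integer = 1)
    attempt = 0
    while true
        try
            return f()
        catch err
            # Only retry on EOFError, and only while attempts remain;
            # anything else is rethrown unchanged.
            (err isa EOFError && attempt < retries) || rethrow()
            attempt += 1
        end
    end
end

# Stand-in for a request: fails once (stale idle connection), then succeeds.
calls = Ref(0)
result = retry_on_eof() do
    calls[] += 1
    calls[] == 1 && throw(EOFError())
    "200 OK"
end
```

A retry like this is only safe for idempotent requests, which is why detecting the closed connection before sending is still worth doing.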
src/ssl.jl
Outdated
# has sent a close_notify message on an otherwise idle connection.
# https://tools.ietf.org/html/rfc5246#section-7.2.1
Base.start_reading(ctx.bio)
yield()
`isopen` isn't typically a yield-point. Should use `Base.process_events(false)` instead.
Maybe we should add something like: (EDIT: add to users, like HTTP.jl)
kill_on_data(s::IO) = eof(s) || s.in_use || close(s)
s.in_use = true
send(s, "TYP PATH HTTP/1.1\nHEADERS\n\nBODY")
reply = recv(s)
s.in_use = false
@async kill_on_data(s)
return reply
- @schedule task to keep socket in active state
- and monitor for MBEDTLS_ERR_SSL_PEER_CLOSE_NOTIFY
Whenever MBEDTLS_ERR_SSL_PEER_CLOSE_NOTIFY is received, call close(::SSLContext).
Move zero-byte read into function: decrypt_available_bytes
@vtjnash, your I've moved the I've changed the handling of I've also moved the zero-byte read into a function
…calling isopen(::SSLContext) does that for us
There is now a pleasing symmetry between the status query functions
# Ensure that libuv is reading data from the socket in case the peer
# has sent a close_notify message on an otherwise idle connection.
# https://tools.ietf.org/html/rfc5246#section-7.2.1
Base.start_reading(ctx.bio)
`Base.start_reading` should always be very cheap:
elseif stream.status == StatusPaused
stream.status = StatusActive
return Int32(0)
elseif stream.status == StatusActive
return Int32(0)
https://github.com/JuliaLang/julia/blob/master/base/stream.jl#L651-L655
Yes, this seems good. I like that yours is handling it at the TLS level, rather than entirely deferring to the higher level. For non-SSL streams, should we add a similar watchdog to HTTP.jl? I think
I hadn't given it much thought, most HTTP APIs mandate TLS these days, so that's the most common use-case. Is it the case that I've just tried the HTTP.jl patch below for plain HTTP/TCP connections. It keeps the socket in active state and does an explicit
With the patch, we see the FIN and respond with a FIN and the connection is cleanly closed:
--- a/src/ConnectionPool.jl
+++ b/src/ConnectionPool.jl
@@ -488,6 +488,15 @@ struct ConnectTimeout <: Exception
port
end
+function tcp_monitor(tcp)
+ while isopen(tcp)
+ Base.start_reading(tcp)
+ wait(tcp.readnotify)
+ yield()
+ end
+ close(tcp)
+end
+
function getconnection(::Type{TCPSocket},
host::AbstractString,
port::AbstractString;
@@ -502,6 +511,7 @@ function getconnection(::Type{TCPSocket},
if connect_timeout == 0
tcp = Sockets.connect(Sockets.getaddrinfo(host), p)
keepalive && keepalive!(tcp)
+ @schedule tcp_monitor(tcp)
return tcp
end
Part of the problem here is that the Base API does not distinguish between isopenread / isopenwrite or closeread / closewrite. I think if the Base API for bi-directional streams was more explicit about this it would be easier to reason about cases like this.
The following is a revised version of the HTTP.jl patch that works for both TLS and TCP.
--- a/src/ConnectionPool.jl
+++ b/src/ConnectionPool.jl
@@ -453,6 +453,7 @@ function getconnection(::Type{Transaction{T}},
busy = findall(T, host, port, pipeline_limit)
if length(busy) < connection_limit
io = getconnection(T, host, port; kw...)
+ @schedule tcp_monitor(io)
c = Connection(host, port,
pipeline_limit, idle_timeout,
io)
@@ -488,6 +489,16 @@ struct ConnectTimeout <: Exception
port
end
+function tcp_monitor(io)
+ tcp = tcpsocket(io)
+ while isopen(io)
+ Base.start_reading(tcp)
+ wait(tcp.readnotify)
+ yield()
+ end
+ close(tcp)
+end
+
This HTTP.jl PR JuliaWeb/HTTP.jl#235 implements the patch above.
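For readers who want to try the idle-monitor idea outside of HTTP.jl, here is a minimal self-contained sketch for a plain TCP socket on Julia 1.x. It is an illustration under assumptions, not the code from either PR: `monitor_idle` is a hypothetical name, it uses the public `eof` call instead of `Base.start_reading` and `readnotify`, and it uses `@async` where the patches above use the older `@schedule`.

```julia
using Sockets

# Hypothetical helper: while `sock` sits idle, wait for any activity.
# `eof` blocks until data arrives or the peer closes; in both cases an
# idle connection is no longer trustworthy, so close it.
function monitor_idle(sock::TCPSocket)
    return @async begin
        try
            eof(sock)
        catch
            # Errors such as a connection reset also mean the socket is dead.
        end
        close(sock)
    end
end

# Local demonstration: the "server" side closes an otherwise idle connection.
port, server = listenany(Sockets.localhost, 8000)
accepted = @async accept(server)
client = connect(Sockets.localhost, port)
peer = fetch(accepted)

monitor = monitor_idle(client)
close(peer)        # peer drops the idle connection
wait(monitor)      # the monitor notices and closes the client side
client_open = isopen(client)
close(server)
```

After the monitor task finishes, `isopen(client)` reports the connection as closed, so a pool can discard it instead of writing a request into a dead socket.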
When a connection is returned to the (read) pool, add a monitor to it for receiving unexpected data (or EOF), and kill / close the Connection object if any activity occurs before the next write (when it should have simply been waiting idle in the pool) per JuliaLang/MbedTLS.jl#145 (comment) closes #214 closes #199 closes #220 closes JuliaWeb/GitHub.jl#106
Definitely, no. That would introduce all sorts of other subtle bugs and races into the IO system. I've made a counter-proposal PR with my concept of what I think this should look like.
Not a major issue, since we'll just call all of them. Unfortunately, all versions of this also trigger a very serious bug with IdDict, and so this PR is currently unusable on all versions of Julia.
@vtjnash, I assume this is the IdDict issue: JuliaLang/julia#26839?
I'll address the HTTP.jl PR in its own comments section. With regard to the the So I'm in favour of merging this PR as is (but I don't have commit access).
I plan on putting some load on a local application w/ this change in to see how it fares; I'll post back here if I see any issues.
thx @quinnj
* ConnectionPool: monitor idle connections
When a connection is returned to the (read) pool, add a monitor to it for receiving unexpected data (or EOF), and kill / close the Connection object if any activity occurs before the next write (when it should have simply been waiting idle in the pool) per JuliaLang/MbedTLS.jl#145 (comment) closes #214 closes #199 closes #220 closes JuliaWeb/GitHub.jl#106
* - Encapsulate read|writebusy/sequence/count logic in new isbusy function.
- Move close() on eof() || !isbusy() to new monitor_idle_connection function.
- Make monitor_idle_connection() a noop for ::Connection{SSLContext}
* require Julia 0.6.3 #236 (comment)
This change enables `Base.isopen(ctx::SSLContext)` to detect that the peer has closed an idle connection.
Before this change an idle TLS connection could receive a `close_notify` message from the TLS peer, but we would never notice it because the `LibuvStream` was no longer in `StatusActive` so the message never found its way into the read buffer.
This appears to be the root cause of various EOFError() issues related to re-use of idle connections:
JuliaWeb/HTTP.jl#214
JuliaWeb/HTTP.jl#199
JuliaWeb/HTTP.jl#220
JuliaWeb/GitHub.jl#106
This change modifies `Base.isopen(ctx::SSLContext)` to:
- `Base.start_reading(ctx.bio); yield()` to ensure that the `LibuvStream` is active,
- `mbedtls_ssl_read` to ensure that the `close_notify` message is processed by the TLS library if it has been received,
- `MBEDTLS_ERR_SSL_PEER_CLOSE_NOTIFY`
Note that the `idle_timeout=` change to GitHub.jl is still desirable because there is a race-condition when a request is sent on an idle connection at the same moment that the server decides to send `close_notify` and drop the connection.