timeout macro -- what happens to the task that timed out? memory leak? #114

samoconnor · 2017-11-10T23:09:23Z

The timeout macro runs an expr in an @async task, but does not wait for it to complete if the timeout expires.
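To make the concern concrete, here is a minimal sketch of the pattern, not the package's actual macro. `run_with_timeout` is a hypothetical helper built on `timedwait`, and the timings are arbitrary. It shows that the caller stops waiting while the task itself keeps running:

```julia
# Hypothetical helper for illustration; not the package's actual @timeout code.
# Runs f in an @async task and gives up waiting after `seconds`.
function run_with_timeout(f, seconds)
    task = @async f()
    # timedwait polls the callback until it returns true or the timeout expires.
    status = timedwait(() -> istaskdone(task), seconds; pollint=0.05)
    return status, task
end

counter = Ref(0)

# The work needs about 1 second, but we only wait 0.3 seconds for it.
status, task = run_with_timeout(0.3) do
    for _ in 1:10
        sleep(0.1)
        counter[] += 1
    end
end

println(status)      # the caller has already stopped waiting here
wait(task)           # the "timed out" task is still alive and runs to completion
println(counter[])   # every increment happened despite the timeout
```

Nothing in this sketch stops the task or releases what it holds; it simply finishes on its own schedule, which is the situation the questions below are about.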

Is there a risk of memory/resource leakage here? i.e. will abandoned tasks and whatever objects they have references to accumulate over time?

Is there a risk of data corruption? E.g. when used with the @retry macro, can we end up with multiple tasks all running the same expr and manipulating the same data structure with odd results?
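A rough illustration of how that could happen, assuming a retry loop that starts a fresh task for each attempt and never stops the old ones. `retry_with_timeout` and `slow_append!` are made-up names for this sketch, not the real @retry or @timeout macros:

```julia
# Hypothetical retry-with-timeout loop; the names here are illustrative only.
function retry_with_timeout(f, attempts, seconds)
    tasks = Task[]
    for _ in 1:attempts
        t = @async f()
        push!(tasks, t)
        # Stop retrying only if this attempt finished within the timeout.
        timedwait(() -> istaskdone(t), seconds) == :ok && break
    end
    return tasks
end

events = Int[]

# Takes far longer than the timeout, then mutates shared state.
slow_append!() = (sleep(1.0); push!(events, length(events) + 1))

tasks = retry_with_timeout(slow_append!, 3, 0.1)

foreach(wait, tasks)      # let the abandoned attempts finish
println(length(events))   # one entry per attempt, not one entry in total
```

Every attempt times out, yet all three tasks eventually run `slow_append!` against the same vector. With a side effect less harmless than `push!`, that is where the odd results would come from.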

As far as I know, Julia has no API for aborting/terminating a task, so there may not be a good way to solve this problem.

If it is true that we can't currently implement a generically safe @timeout macro in Julia, maybe it is best to file bug reports (or make PRs) against whatever lower-level API we would like to see have a timeout option. It should be easier to handle correct cleanup of abandoned tasks for a specific API than to find a general solution that works for any expr in the @timeout macro.

In the case of TCP connect, it looks like adding a timeout at the uv_tcp_connect level is non-trivial: joyent/libuv#1415.

Maybe we could add a timeout option to the Julia connect() -> TCPSocket function. This would have to ensure that abandoned connect attempts are cleaned up when they eventually ETIMEDOUT at the kernel level. It might also have to implement some kind of throttling to prevent an unreasonable number of concurrent abandoned connect tasks from piling up.
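As a sketch of what API-specific cleanup might look like, assuming the `connect` function from the Sockets standard library (it lived in Base at the time of writing) and cooperative `@async` scheduling on a single thread. `connect_with_timeout` is a hypothetical wrapper, not an existing function, and it does not attempt the throttling mentioned above:

```julia
using Sockets

# Hypothetical wrapper; not an existing Sockets or HTTP.jl function.
function connect_with_timeout(host, port, seconds)
    abandoned = Ref(false)
    task = @async begin
        sock = connect(host, port)   # may still fail with ETIMEDOUT much later
        abandoned[] && close(sock)   # nobody is waiting any more, so release it
        sock
    end
    if timedwait(() -> istaskdone(task), seconds) == :timed_out
        # No yield point between the timeout and this assignment, so with
        # @async the task cannot slip past the check above in between.
        abandoned[] = true
        error("connect to $host:$port timed out after $seconds s")
    end
    return fetch(task)               # rethrows if connect itself failed
end
```

The abandoned task still lives until the kernel gives up on the connection, so this only guarantees that a late-arriving socket is closed, not that the attempt is cancelled.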

My preference for now would be to leave the connect() -> TCPSocket timeout behaviour as is and not try to impose a shorter timeout on top of it. The TCP connection protocol already has a well-defined sequence of attempts, retries and exponential backoff. It doesn't seem like a good thing to try to subvert this, i.e. the network's ability to recover from congestion depends on all clients following the same exponential backoff policy.

Note, on my Mac, ETIMEDOUT seems to take about 75 seconds:

julia> @time try connect("203.10.110.101", 80) catch e; @show e ; end
e = connect: connection timed out (ETIMEDOUT)
 75.203830 seconds (54 allocations: 2.156 KiB)
connect: connection timed out (ETIMEDOUT)


samoconnor · 2017-11-20T23:31:35Z

Bump. This seems like a real concern for people trying to build long-running production systems (currently a necessity because of Julia's long start-up / JIT delays). Any thoughts, @quinnj?

quinnj · 2017-11-26T04:18:20Z

Sorry for the delay here. Yeah, I think it's sensible to just have Inf as the default timeout; I also added a note that setting a shorter timeout may lead to underlying resources not actually being freed immediately.

samoconnor mentioned this issue Nov 20, 2017

Default timeouts are incorrectly documented (and maybe dangerous per #114) #123

Closed

samoconnor added a commit to JuliaCloud/AWSCore.jl that referenced this issue Nov 21, 2017

Set HTTP timeouts to Inf to work around JuliaWeb/HTTP.jl#114

3741d8c

quinnj closed this as completed in d6804f0 Nov 26, 2017

samoconnor mentioned this issue Mar 2, 2018

localhost_is_ec2() == false on ECS running in AWS Batch JuliaCloud/AWSCore.jl#24

Closed
