timeout macro -- what happens to the task that timed out? memory leak? #114

Closed
samoconnor opened this issue Nov 10, 2017 · 2 comments

@samoconnor
Contributor

The timeout macro runs an expr in an @async task, but does not wait for it to complete if the timeout expires.

Is there a risk of memory/resource leakage here? i.e. will abandoned tasks and whatever objects they have references to accumulate over time?
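To make the concern concrete, here is a minimal sketch of the pattern described above. It is not the package's actual macro: `run_with_timeout` is a hypothetical helper built only on Base's `@async` and `timedwait`, and the sleep durations are arbitrary. It shows that after the timeout expires the task is still alive, still holds its references, and later performs its side effect.

```julia
# Hypothetical helper, NOT the package's @timeout implementation:
# run f in a task and stop waiting for it after `seconds`.
function run_with_timeout(f, seconds)
    task = @async f()
    # timedwait polls the predicate until it is true or the timeout expires.
    status = timedwait(() -> istaskdone(task), seconds; pollint=0.01)
    return status, task
end

counter = Ref(0)

status, task = run_with_timeout(0.1) do
    sleep(0.5)        # stands in for a slow network call
    counter[] += 1    # side effect that happens after the caller gave up
end

@show status            # the caller sees a timeout
@show istaskdone(task)  # the abandoned task has not finished yet

wait(task)              # only for the demo: let the abandoned task finish
@show counter[]         # the side effect still happened
```

Nothing in this sketch stops the abandoned task. It, and anything its closure references, stays reachable until the underlying operation returns on its own.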

Is there a risk of data corruption? E.g., when used with the @retry macro, can we end up with multiple tasks all running the same expr and manipulating the same data structure with odd results?
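A rough illustration of the retry scenario, under stated assumptions: `retry_with_timeout_demo` and its keyword arguments are hypothetical, and the loop is a hand-written stand-in for combining a retry with a timeout, not the actual macros. Each attempt is abandoned after a short timeout, but all of them eventually run to completion and write to the same vector.

```julia
# Hypothetical demo: retry an operation, abandoning each attempt after `timeout`.
function retry_with_timeout_demo(; attempts=3, timeout=0.05, work=0.3)
    writes = Int[]   # shared state touched by every attempt
    tasks = Task[]
    for attempt in 1:attempts
        t = @async begin
            sleep(work)             # slower than the timeout
            push!(writes, attempt)  # late write from an abandoned attempt
        end
        push!(tasks, t)
        # Stop retrying only if this attempt finished in time.
        timedwait(() -> istaskdone(t), timeout; pollint=0.01) == :ok && break
    end
    foreach(wait, tasks)  # only for the demo: let abandoned attempts finish
    return writes
end

writes = retry_with_timeout_demo()
@show length(writes)  # one write per attempt, although every attempt "timed out"
```

In this sketch the caller would treat all three attempts as failures, yet the shared vector ends up with three entries.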

As far as I know, Julia has no API for aborting/terminating a task, so there may not be a good way to solve this problem.

If it is true that we can't currently implement a generically safe @timeout macro in Julia, maybe it is best to file bug reports (or make PRs) against whatever lower-level API we would like to see have a timeout option. It should be easier to handle correct cleanup of abandoned tasks for a specific API than to find a general solution that works for any expr in the @timeout macro.

In the case of TCP connect, it looks like adding a timeout at the uv_tcp_connect level is nontrivial: joyent/libuv#1415.

Maybe we could add a timeout option to the Julia connect() -> TCPSocket function. This would have to ensure that abandoned connect attempts are cleaned up when they eventually ETIMEDOUT at the kernel level. It might also have to implement some kind of throttling to prevent an unreasonable number of concurrent abandoned connect tasks from piling up.
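One possible shape for such an option, as a sketch only: `connect_with_timeout` is a hypothetical wrapper, not an existing function in Julia or in this package, and it assumes a Julia version where the `Sockets` standard library provides `connect`. When the timeout fires, it leaves a small "reaper" task behind that closes the socket if the connect eventually succeeds and swallows the error if it eventually fails. It does not implement the throttling mentioned above.

```julia
using Sockets

# Hypothetical wrapper, not an existing API: give up waiting after `seconds`,
# but make sure a late-arriving socket is closed rather than leaked.
function connect_with_timeout(host, port, seconds)
    task = @async connect(host, port)
    if timedwait(() -> istaskdone(task), seconds; pollint=0.01) == :ok
        return fetch(task)  # throws if the connect attempt itself failed
    end
    # Reaper: waits for the abandoned attempt to resolve at the kernel level.
    @async begin
        try
            close(fetch(task))  # late success: close the unused socket
        catch
            # late failure (e.g. ETIMEDOUT): nothing left to clean up
        end
    end
    error("connect to $host:$port timed out after $seconds seconds")
end
```

Note that the reaper task and the pending connect attempt still exist until the kernel gives up, so this bounds what is leaked per attempt without bounding the number of concurrent abandoned attempts.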

My preference for now would be to leave the connect() -> TCPSocket timeout behaviour as is and not try to impose a shorter timeout on top of it. The TCP connection protocol already has a well-defined sequence of attempts, retries and exponential backoff. It doesn't seem like a good idea to subvert this: the network's ability to recover from congestion depends on all clients following the same exponential backoff policy.

Note: on my Mac, ETIMEDOUT seems to take about 75 seconds:

julia> @time try connect("203.10.110.101", 80) catch e; @show e ; end
e = connect: connection timed out (ETIMEDOUT)
 75.203830 seconds (54 allocations: 2.156 KiB)
connect: connection timed out (ETIMEDOUT)
@samoconnor
Contributor Author

Bump. This seems like a real concern for people trying to build long-running production systems (currently a necessity because of Julia's long start-up / JIT delays). Any thoughts @quinnj ?

@quinnj
Member

quinnj commented Nov 26, 2017

Sorry for the delay here. Yeah, I think it's sensible to just have Inf as the default; I also added a note that if they're set shorter, it may lead to underlying resources not actually being freed immediately.
