-
Notifications
You must be signed in to change notification settings - Fork 86
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
setTimeLimit(): varying results in multicore with failing job #665
Comments
FYI, this was started initially in rstudio/promises#86 in a related effort; the double-reporting issue there I think has been resolved in dev, but I think the variable failing is |
Thanks for this. Some quick initial comments. I think I understand what's going on. The time-out limit is not reset on the worker after the future expression has been finished. This means it'll also apply to the code wrapping up the results and sending it back to the parent R process. Depending on exactly when this timeout kicks in, you'll get different types of errors. Your different examples show this and makes sense to me. For example, the "unexpected result" error message is from an internal sanity check that makes sure that the proper data structure has been received by the parent R process, but in your example it received nothing, indicating that the connection was interrupted. That error is actually of class I think that if you add thisfut <- future({
on.exit(setTimeLimit(cpu = Inf, elapsed = Inf, transient = FALSE)) ## TESTING ONLY!
setTimeLimit(elapsed = 2, transient = TRUE)
Sys.sleep(3)
1L
}) then this problem should go away? Is that the case? If so, I'll look into possibly adding a The reason why I say BTW, |
Thank you for the prompt and detailed response!
I see that this ( I understand that you are concerned with imposing this as default behavior. For now I'm going to have to use it in production (so that failed futures don't leak workers, a problem I've been dealing with). Do you think the use of |
I'm only hesitant because I haven't had a deep focus of thinking about what it would entail to allow
Since it's evaluated within a res <- local({
on.exit(...)
42L
}) You cannot just do: res <- {
on.exit(...)
42L
} So, for
Yes, using an explicit thisfut <- future(local{
on.exit(setTimeLimit(cpu = Inf, elapsed = Inf, transient = FALSE))
setTimeLimit(elapsed = 2, transient = TRUE)
Sys.sleep(3)
1L
})) would be the most future-proof solution (pun intended), and it explicitly communicates the scope and intent. That said, I will consider adding something like |
Actually, I'm having a difficult time meeting my expectation of print({
setTimeLimit(cpu = Inf, elapsed = Inf, transient = FALSE)
e <- environment()
reg.finalizer(e,
function(ign) {cat('fin1\n');setTimeLimit(cpu = Inf, elapsed = Inf, transient = FALSE);},
onexit = TRUE)
on.exit({cat('fin2\n');setTimeLimit(cpu = Inf, elapsed = Inf, transient = FALSE);})
setTimeLimit(cpu = 1, elapsed = 1, transient = TRUE)
system("sleep 3")
# Sys.sleep(3)
message("hey")
setTimeLimit(cpu = Inf, elapsed = Inf, transient = FALSE)
1L
}) Also done with Some of the time I get My expectation is that this should always: fail, show the error message, not display |
I only skimmed your comment here, but my first instinct is that also Also, it doesn't look like that finalizer makes a difference in this example, since it'll only kick in if you delete |
Okay, I think it's finally sinking in: neither For I should have listened more closely to your first cautions: Thank you for the dialogue! |
In a multicore plan, if the expression fails (e.g., due to timeout here) then the return from
value(.)
might be either the error itself (expected) or one of two other failures ("Unexpected result" or "major/minor"). I do not see a pattern in how it varies. Further, when "Unexpected", the worker is not freed.Other than the preamble, the expression is unchanged, and I tend to reset workers between (but I tried without resetting, does not appear to influence the results).
It takes several tries to get all three returns. This is in a fresh R instance, nothing else loaded or done between calls.
The first ("Unexpected error") does not automatically free the worker, I must explicitly
resetWorkers(plan())
to get it back.I know there are cautions against using
R.utils::withTimeout
when coordinating with other processes, which I will infer to be similar-enough tosetTimeLimit
but not applicable since the timeout is completely within the other process.sessionInfo()
The text was updated successfully, but these errors were encountered: