Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow @testitem to set its own timeout #129

Merged
merged 7 commits into from
Dec 8, 2023

Conversation

nickrobinson251
Copy link
Collaborator

@nickrobinson251 nickrobinson251 commented Dec 7, 2023

  • close Allow individual @testitems to set timeout limit #13
  • an @testitem can set its own timeout (in seconds) via the timeout keyword, like @testitem "foo" timeout=600
  • this value takes precedence over the testitem_timeout passed to runtests(...) (i.e. this PR makes testitem_timeout be a "default" timeout value that can be overridden per testitem.)
    • The alternative would be to take the max of the two values, which is what we do for retries (so that inconsistency is perhaps a little unfortunate).
    • But giving the @testitem's own timeout=... value precedence allows a @testitem to set a lower limit than the default, which is useful for tests that should go fast but are at risk of blowup.

@@ -428,7 +431,7 @@ end

any_non_pass(ts::DefaultTestSet) = ts.anynonpass

function record_timeout!(testitem, run_number::Int, timeout_limit::Real)
function record_timeout!(testitem, run_number::Int, timeout_limit::Float64)
timeout_s = round(Int, timeout_limit)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why the conversion to a float then to an Int? can we just stay in Int?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yah, this is a hassle. it's just because we were lax before and so were allowing users to pass any Real, so in this PR i converted them to Float64 asap, so we at least know the type we have... but i think going the other way and rounding them to Int is the better idea (we still need to accept Floats to avoid a breaking change, as you say)

Comment on lines 493 to 495
# TODO: should the testitem's own timeout take precedence or should we just take the `max`?
# timeout = max(something(testitem.timeout, 0.0), default_timeout)
timeout = something(testitem.timeout, default_timeout)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

user should definitely take precedence. It can be useful to set a lower bound here so that for tests that should go fast but are at risk of blowup, you can set a lower timeout than the default so those timeouts can catch regressions more often


If a `@testitem` should be aborted after a certain period of time, e.g. the test is known
to occassionally hang, then you can set a timeout (in seconds) by passing the `timeout` keyword.
Note that `timeout` currently only works when tests are run with multiple workers.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is that? (out of curiosity)

Copy link
Collaborator Author

@nickrobinson251 nickrobinson251 Dec 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just not implemented yet #87

We effectively largely have two separate code paths, depending on whether or not there are workers.
If there are workers, we spawn a task to run the test and a task to run a timer, and they race.
But we haven't yet implemented the equivalent for the single-process path, and that path currently doesn't spawn any tasks, it just evals code sequentially (but it could spawn tasks, maybe even using something like try_with_timeout)

I think just because initial development was for the CI use-case, for which we expect workers to be used, and the single-process case has got less love (for similar reasons, the "eager" logs case also hasn't recieved as much love #74)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think to robustly cancel a task on the same process, we'd need something like JuliaLang/julia#6283

Comment on lines 133 to 134
If a `@testitem` sets its own `timeout` keyword, then that takes precedence.
Note timeouts are currently only applied when `nworkers > 0`.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd vote to limit this to int64 positive seconds to avoid conversions and general need to handle various number types. I know that's technically breaking, so take it with a bit of a grain of salt, but if we're storing up breaking changes to a release someday, something to consider

Copy link
Collaborator

@mcmcgrath13 mcmcgrath13 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thank you!

@nickrobinson251 nickrobinson251 merged commit 94f74bf into main Dec 8, 2023
9 checks passed
@nickrobinson251 nickrobinson251 deleted the npr-testitem-specific-timeouts branch December 8, 2023 15:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Allow individual @testitems to set timeout limit
3 participants