-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make the dummy data timeout configurable #2258
Conversation
We're seeing cases where the generator is making progress, but slow enough that the default timeout doesn't generate enough patients. Being able to bump up the timeout would unblock users in this particular case, although it's obviously not a great solution. As part of this we use the default values from the config class to define the keyword argument defaults, and we remove the default values from the docstring (as they're already shown in the docs in the function signature). These changes reduce the possibility for these values to go out of sync.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As well as two nitpicky review comments inline, two more:
- I wonder if it would be helpful somewhere to have a test that we actually respect the configured timeout. I think the code is correct and we do pass it through correctly, but this seems like the sort of thing that would be easy to get wrong.
- It might be worth logging how to change the timeout in the dummy data generators. We already do this for the population size.
None of this is merge-blocking though, these are just my slightly nitpicky thoughts in the course of reading through it. Please feel free to do any or none of them.
|
||
_legacy_<br> | ||
Use legacy dummy data. | ||
|
||
_timeout_<br> | ||
Maximum time in seconds to spend generating dummy data. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This may be too pedantic to be worth noting but this isn't technically right. The timeout is the time after which the generator will stop trying to generate more dummy data. In the event you hit the timeout it will typically take longer than the timeout.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that is true. But I can't think of a way of writing this which makes it more correct without simultaneously making it harder to understand, and to some extend defeating the point of documenting it in the first place.
Unless we just say:
Maximum time in seconds (approximately) to spend generating dummy data.
@@ -102,7 +102,8 @@ def test_configured_population_size(legacy): | |||
intervals=years(1).starting_on("2020-01-01"), | |||
) | |||
|
|||
measures.configure_dummy_data(population_size=10, legacy=legacy) | |||
measures.configure_dummy_data(population_size=99, legacy=legacy, timeout=123) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The answer to this doesn't really matter, but for my curiousity, why did the population size change here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just noticed that it was setting it to the default value (or what is now the default, maybe it wasn't when the test was written) and therefore this would pass even if the argument was completely ignored.
Yes, I agree with both of these, thanks. I think I might merge now though so as to unblock the user (because I need to head out in a bit) and then follow these up later. Both are worth doing though. |
We were deliberately being slow in adding configuration options to avoid locking ourselves in to any regrettable APIs. However this option now has a demonstrable use case, and also doesn't really lock us in to much of anything.
Docs now look like:
Related threads:
https://bennettoxford.slack.com/archives/C01D7H9LYKB/p1732791626217789
https://bennettoxford.slack.com/archives/C07MZ5WCJV6/p1732800801777969
https://bennettoxford.slack.com/archives/C07MZ5WCJV6/p1732801544406129