-
-
Notifications
You must be signed in to change notification settings - Fork 30.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GH-86275: Implementation of hypothesis stubs for property-based tests, with zoneinfo tests #22863
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great - I'm very excited to get this running 😁
This PR is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 5 days |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2 PRs in 1. Do you consider the hypothesis_helper part to be finished and ready to merge?
I intend to add more on the issue later.
When you're done making the requested changes, leave the comment: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2 PRs in 1. Do you consider the hypothesis_helper part to be finished and ready to merge?
The content is probably ready to go, but it's still mostly a draft, so it doesn't have niceties like documentation and NEWS and whatnot. I figured I'd get something that works and allow that to be used as a base for implementing something that works.
I agree this is_sort of_ 2 PRs in one, but we don't have anything that uses hypothesis in the standard library, so I think we need something here in order to exercise the code paths. We could switch it out for a simpler set of hypothesis tests (though presumably a simpler set would exercise a smaller subset of the stubs).
If we break it into multiple PRs, we should do it at the end, immediately before merge.
argstr = ", ".join(self.__stub_args) | ||
kwargstr = ", ".join( | ||
f"{kw}={val}" for kw, val in self.__stub_kwargs.items() | ||
) | ||
|
||
in_parens = argstr | ||
if kwargstr: | ||
in_parens += ", " + kwargstr | ||
|
||
return f"{self.__qualname__}({in_parens})" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the various .map()
, .filter()
, .__or__()
methods don't alter the repr at all, I'd probably prefer that this be less specific - better to represent integers().map(str)
as <stub object at 0x...>
than integers()
!
I also don't think it's worth the code to make the repr track all the methods, unless adding a Stub.__trailing_repr
argument would work?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, this should be working now. Here's a little test file:
from test.support.hypothesis_helper import hypothesis
def complement(x):
return not x
def identity(x):
return x
st = hypothesis.strategies
test_strategies = [
st.integers(),
st.floats().filter(complement),
st.lists(st.booleans).flatmap(identity),
st.characters() | st.dates().map(lambda x: x),
st.booleans() | st.uuids(),
]
for test_strategy in test_strategies:
print(repr(test_strategy))
Results:
hypothesis.strategies.integers()
hypothesis.strategies.floats().filter(complement)
hypothesis.strategies.lists().flatmap(identity)
one_of(hypothesis.strategies.characters(), hypothesis.strategies.dates().map(<lambda>))
one_of(hypothesis.strategies.booleans(), hypothesis.strategies.uuids())
@terryjreedy How does this look? |
CI is green, please do the honours! |
Thanks @hugovk @Zac-HD @erlend-aasland @terryjreedy and everyone else who helped with this! Looking forward to making more use of this in the future! |
Looks like this is causing some buildbot failures on |
* main: pythongh-91896: Fixup some docs issues following ByteString deprecation (python#104422) pythonGH-104371: check return value of calling `mv.release` (python#104417) pythongh-104415: Fix refleak tests for `typing.ByteString` deprecation (python#104416) pythonGH-86275: Implementation of hypothesis stubs for property-based tests, with zoneinfo tests (python#22863) pythonGH-103082: Filter LINE events in VM, to simplify tool implementation. (pythonGH-104387) pythongh-93649: Split gc- and allocation tests from _testcapimodule.c (pythonGH-104403) pythongh-104389: Add 'unused' keyword to Argument Clinic C converters (python#104390) pythongh-101819: Prepare _io._IOBase for module state (python#104386) pythongh-104413: Fix refleak when super attribute throws AttributeError (python#104414) Fix refleak in `super_descr_get` (python#104408) pythongh-87526: Remove dead initialization from _zoneinfo parse_abbr() (python#24700) pythongh-91896: Improve visibility of `ByteString` deprecation warnings (python#104294) pythongh-104371: Fix calls to `__release_buffer__` while an exception is active (python#104378) pythongh-104377: fix cell in comprehension that is free in outer scope (python#104394) pythongh-104392: Remove _paramspec_tvars from typing (python#104393) pythongh-104396: uuid.py to skip platform check for emscripten and wasi (pythongh-104397) pythongh-99108: Refresh HACL* from upstream (python#104401) pythongh-104301: Allow leading whitespace in disambiguated pdb statements (python#104342)
Anecdotal feedback: On the current CI run on #103764, the Hypothesis tests are the slowest part of the CI run. Seems like it's running the full test suite, maybe it should run only test_zoneinfo or whatever tests actually use Hypothesis? |
The slowest part of the CI run by a long way: they took 46 minutes to complete on #104421, whereas no other job took longer than 26 minutes. I agree with @JelleZijlstra -- changing the workflow file so that this job only runs on PRs that affect zoneinfo seems like a good idea? |
I think the issue here is that the new |
The problem with this is that there's no easy way for |
I feel like adding new hypothesis tests will be rare enough that doing it manually isn't too bad. |
I agree that that doesn't sound ideal, but honestly it doesn't sound as bad (to me) as having to wait 45 minutes for CI to complete on every PR. |
I sure hope not!
I don't really understand why this is so slow TBH. It's running the same test suite as the I think we should solve the root problems here. |
Agree that we'll hopefully add similar tests to more files! Still, pull requests that add hypothesis tests to a new file won't be terribly common, I expect.
I don't think it's just test_zoneinfo; if I'm reading the logs in https://github.com/python/cpython/actions/runs/4960065291/jobs/8875040006?pr=103764 right, zoneinfo itself completed in 10 s or so. Actually I think the issue may be that the tests are all run serially, while the normal CI run uses
|
FYI I just got a |
(To update, for anybody reading this thread in the future: the extremely long time taken by the test was fixed by #104427. It now only takes 13 minutes in CI 🎉) |
FWIW you can detect this by checking for a
Happy to, though I think that the |
No, that's a separate failure on a later commit in that PR. The test failure on the previous commit in that PR is the relevant one that I was talking about. |
@Zac-HD here's a direct link to the run where |
Oh, I see, sorry. Expect a PR to set |
One interesting thing -- and this definitely isn't due to I don't know if that helps @ericsnowcurrently debug the root cause of the crashes at all! |
FYI, gh-105109 should resolve the failures. |
This PR adds the property tests for
zoneinfo
currently maintained at https://github.com/Zac-HD/stdlib-property-tests, along with a compatibility interface forhypothesis
that gracefully degrades to simple parameterized tests.I haven't stubbed out the entire interface, but I have stubbed out a bit more than I am using as part of the
zoneinfo
tests right now (mostly the subset that I saw in use in the otherstdlib-property-tests
). We can scale down the stubbed interface if we want, and then have people add in the parts they need as it comes up. I don't think it's a good idea to proactively stub out the entirehypothesis
interface, since those things would need maintenance over time.If we like this approach, I will also add in some tooling to run the tests with hypothesis, for use in a buildbot and/or optional PR job.
CC: @Zac-HD, @cfbolz
https://bugs.python.org/issue42109