Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GH-86275: Implementation of hypothesis stubs for property-based tests, with zoneinfo tests #22863

Merged
merged 12 commits into from
May 12, 2023

Conversation

pganssle
Copy link
Member

@pganssle pganssle commented Oct 21, 2020

This PR adds the property tests for zoneinfo currently maintained at https://github.com/Zac-HD/stdlib-property-tests, along with a compatibility interface for hypothesis that gracefully degrades to simple parameterized tests.

I haven't stubbed out the entire interface, but I have stubbed out a bit more than I am using as part of the zoneinfo tests right now (mostly the subset that I saw in use in the other stdlib-property-tests). We can scale down the stubbed interface if we want, and then have people add in the parts they need as it comes up. I don't think it's a good idea to proactively stub out the entire hypothesis interface, since those things would need maintenance over time.

If we like this approach, I will also add in some tooling to run the tests with hypothesis, for use in a buildbot and/or optional PR job.

CC: @Zac-HD, @cfbolz

https://bugs.python.org/issue42109

Copy link
Contributor

@Zac-HD Zac-HD left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great - I'm very excited to get this running 😁

Lib/test/support/_hypothesis_stubs/__init__.py Outdated Show resolved Hide resolved
@github-actions
Copy link

This PR is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 5 days

@github-actions github-actions bot added the stale Stale PR or inactive for long period of time. label Dec 16, 2020
@pganssle pganssle removed the stale Stale PR or inactive for long period of time. label Dec 16, 2020
Copy link
Member

@terryjreedy terryjreedy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 PRs in 1. Do you consider the hypothesis_helper part to be finished and ready to merge?

I intend to add more on the issue later.

Lib/test/support/_hypothesis_stubs/_helpers.py Outdated Show resolved Hide resolved
Lib/test/test_zoneinfo/test_zoneinfo_property.py Outdated Show resolved Hide resolved
@bedevere-bot
Copy link

When you're done making the requested changes, leave the comment: I have made the requested changes; please review again.

Copy link
Member Author

@pganssle pganssle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 PRs in 1. Do you consider the hypothesis_helper part to be finished and ready to merge?

The content is probably ready to go, but it's still mostly a draft, so it doesn't have niceties like documentation and NEWS and whatnot. I figured I'd get something that works and allow that to be used as a base for implementing something that works.

I agree this is_sort of_ 2 PRs in one, but we don't have anything that uses hypothesis in the standard library, so I think we need something here in order to exercise the code paths. We could switch it out for a simpler set of hypothesis tests (though presumably a simpler set would exercise a smaller subset of the stubs).

If we break it into multiple PRs, we should do it at the end, immediately before merge.

Lib/test/support/_hypothesis_stubs/_helpers.py Outdated Show resolved Hide resolved
Lib/test/test_zoneinfo/test_zoneinfo_property.py Outdated Show resolved Hide resolved
Lib/test/support/_hypothesis_stubs/__init__.py Outdated Show resolved Hide resolved
Comment on lines 9 to 18
argstr = ", ".join(self.__stub_args)
kwargstr = ", ".join(
f"{kw}={val}" for kw, val in self.__stub_kwargs.items()
)

in_parens = argstr
if kwargstr:
in_parens += ", " + kwargstr

return f"{self.__qualname__}({in_parens})"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the various .map(), .filter(), .__or__() methods don't alter the repr at all, I'd probably prefer that this be less specific - better to represent integers().map(str) as <stub object at 0x...> than integers()!

I also don't think it's worth the code to make the repr track all the methods, unless adding a Stub.__trailing_repr argument would work?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, this should be working now. Here's a little test file:

from test.support.hypothesis_helper import hypothesis

def complement(x):
    return not x

def identity(x):
    return x

st = hypothesis.strategies

test_strategies = [
    st.integers(),
    st.floats().filter(complement),
    st.lists(st.booleans).flatmap(identity),
    st.characters() | st.dates().map(lambda x: x),
    st.booleans() | st.uuids(),
]

for test_strategy in test_strategies:
    print(repr(test_strategy))

Results:

hypothesis.strategies.integers()
hypothesis.strategies.floats().filter(complement)
hypothesis.strategies.lists().flatmap(identity)
one_of(hypothesis.strategies.characters(), hypothesis.strategies.dates().map(<lambda>))
one_of(hypothesis.strategies.booleans(), hypothesis.strategies.uuids())

@cpython-cla-bot
Copy link

cpython-cla-bot bot commented Apr 25, 2023

All commit authors signed the Contributor License Agreement.
CLA signed

@pganssle pganssle changed the title bpo-42109: Implementation of hypothesis stubs for property-based tests, with zoneinfo tests GH-86275: Implementation of hypothesis stubs for property-based tests, with zoneinfo tests Apr 25, 2023
@pganssle
Copy link
Member Author

@terryjreedy How does this look?

@hugovk
Copy link
Member

hugovk commented May 11, 2023

CI is green, please do the honours!

@pganssle pganssle merged commit d50c37d into python:main May 12, 2023
@pganssle
Copy link
Member Author

Thanks @hugovk @Zac-HD @erlend-aasland @terryjreedy and everyone else who helped with this!

Looking forward to making more use of this in the future!

@AlexWaygood
Copy link
Member

AlexWaygood commented May 12, 2023

Looks like this is causing some buildbot failures on main: https://buildbot.python.org/all/#/builders/464/builds/4434/steps/7/logs/stdio

carljm added a commit to carljm/cpython that referenced this pull request May 12, 2023
* main:
  pythongh-91896: Fixup some docs issues following ByteString deprecation (python#104422)
  pythonGH-104371: check return value of calling `mv.release` (python#104417)
  pythongh-104415: Fix refleak tests for `typing.ByteString` deprecation (python#104416)
  pythonGH-86275: Implementation of hypothesis stubs for property-based tests, with zoneinfo tests (python#22863)
  pythonGH-103082: Filter LINE events in VM, to simplify tool implementation. (pythonGH-104387)
  pythongh-93649: Split gc- and allocation tests from _testcapimodule.c (pythonGH-104403)
  pythongh-104389: Add 'unused' keyword to Argument Clinic C converters (python#104390)
  pythongh-101819: Prepare _io._IOBase for module state (python#104386)
  pythongh-104413: Fix refleak when super attribute throws AttributeError (python#104414)
  Fix refleak in `super_descr_get` (python#104408)
  pythongh-87526: Remove dead initialization from _zoneinfo parse_abbr() (python#24700)
  pythongh-91896: Improve visibility of `ByteString` deprecation warnings (python#104294)
  pythongh-104371: Fix calls to `__release_buffer__` while an exception is active (python#104378)
  pythongh-104377: fix cell in comprehension that is free in outer scope (python#104394)
  pythongh-104392: Remove _paramspec_tvars from typing (python#104393)
  pythongh-104396: uuid.py to skip platform check for emscripten and wasi (pythongh-104397)
  pythongh-99108: Refresh HACL* from upstream (python#104401)
  pythongh-104301: Allow leading whitespace in disambiguated pdb statements (python#104342)
@JelleZijlstra
Copy link
Member

Anecdotal feedback: On the current CI run on #103764, the Hypothesis tests are the slowest part of the CI run.

Seems like it's running the full test suite, maybe it should run only test_zoneinfo or whatever tests actually use Hypothesis?

@AlexWaygood
Copy link
Member

Anecdotal feedback: On the current CI run on #103764, the Hypothesis tests are the slowest part of the CI run.

The slowest part of the CI run by a long way: they took 46 minutes to complete on #104421, whereas no other job took longer than 26 minutes.

I agree with @JelleZijlstra -- changing the workflow file so that this job only runs on PRs that affect zoneinfo seems like a good idea?

@JelleZijlstra
Copy link
Member

Looks like this is causing some buildbot failures on main: https://buildbot.python.org/all/#/builders/464/builds/4434/steps/7/logs/stdio

I think the issue here is that the new _hypothesis_stubs directory doesn't get picked up by make install. I see how to fix that, will send a PR.

@pganssle
Copy link
Member Author

Anecdotal feedback: On the current CI run on #103764, the Hypothesis tests are the slowest part of the CI run.

Seems like it's running the full test suite, maybe it should run only test_zoneinfo or whatever tests actually use Hypothesis?

The problem with this is that there's no easy way for unittest to automatically identify which tests are using hypothesis, and when new tests using hypothesis are added, they'd have to manually be added to this job, which everyone will forget to do. The compromise I'm currently using is to filter out the slowest non-hypothesis tests until we have some ability to do more granular filters.

@JelleZijlstra
Copy link
Member

Anecdotal feedback: On the current CI run on #103764, the Hypothesis tests are the slowest part of the CI run.
Seems like it's running the full test suite, maybe it should run only test_zoneinfo or whatever tests actually use Hypothesis?

The problem with this is that there's no easy way for unittest to automatically identify which tests are using hypothesis, and when new tests using hypothesis are added, they'd have to manually be added to this job, which everyone will forget to do. The compromise I'm currently using is to filter out the slowest non-hypothesis tests until we have some ability to do more granular filters.

I feel like adding new hypothesis tests will be rare enough that doing it manually isn't too bad.

@AlexWaygood
Copy link
Member

The problem with this is that there's no easy way for unittest to automatically identify which tests are using hypothesis, and when new tests using hypothesis are added, they'd have to manually be added to this job, which everyone will forget to do.

I agree that that doesn't sound ideal, but honestly it doesn't sound as bad (to me) as having to wait 45 minutes for CI to complete on every PR.

@pganssle
Copy link
Member Author

pganssle commented May 12, 2023

I feel like adding new hypothesis tests will be rare enough that doing it manually isn't too bad.

I sure hope not! hypothesis is extremely useful, particularly for the kind of things we do here. The only reason I haven't added them to almost every change I've added to datetime is because I couldn't. We've got a whole bunch more to be imported from here as well.

I agree that that doesn't sound ideal, but honestly it doesn't sound as bad (to me) as having to wait 45 minutes for CI to complete on every PR.

I don't really understand why this is so slow TBH. It's running the same test suite as the Ubuntu job, plus the parts of test_zoneinfo that run property tests. Is test_zoneinfo actually going slow, or does the test suite just run really slowly in a virtual environment for some reason?

I think we should solve the root problems here. unittest or python -m test should grow the ability to have "run only these tests" functionality rather than just "don't run these tests", and we should figure out why it's so slow to run these tests in CI.

@JelleZijlstra
Copy link
Member

I feel like adding new hypothesis tests will be rare enough that doing it manually isn't too bad.

I sure hope not! hypothesis is extremely useful, particularly for the kind of things we do here. The only reason I haven't added them to almost every change I've added to datetime is because I couldn't. We've got a whole bunch more to be imported from here as well.

Agree that we'll hopefully add similar tests to more files! Still, pull requests that add hypothesis tests to a new file won't be terribly common, I expect.

I agree that that doesn't sound ideal, but honestly it doesn't sound as bad (to me) as having to wait 45 minutes for CI to complete on every PR.

I don't really understand why this is so slow TBH. It's running the same test suite as the Ubuntu job, plus the parts of test_zoneinfo that run property tests. Is test_zoneinfo actually going slow, or does the test suite just run really slowly in a virtual environment for some reason?

I don't think it's just test_zoneinfo; if I'm reading the logs in https://github.com/python/cpython/actions/runs/4960065291/jobs/8875040006?pr=103764 right, zoneinfo itself completed in 10 s or so. Actually I think the issue may be that the tests are all run serially, while the normal CI run uses -j4. On the normal runs, all the tests ran in 13 min on my PR, and the hypothesis run took 41 min.

I think we should solve the root problems here. unittest or python -m test should grow the ability to have "run only these tests" functionality rather than just "don't run these tests", and we should figure out why it's so slow to run these tests in CI.

@AlexWaygood
Copy link
Member

FYI I just got a test_zoneinfo failure from the new job in #104421 (which doesn't touch anything to do with zoneinfo)

@pganssle
Copy link
Member Author

FYI I just got a test_zoneinfo failure from the new job in #104421 (which doesn't touch anything to do with zoneinfo)

Yeah I think we should tweak the hypothesis settings to make it a bit less flaky.

@Zac-HD Any interest in helping with that?

@AlexWaygood
Copy link
Member

(To update, for anybody reading this thread in the future: the extremely long time taken by the test was fixed by #104427. It now only takes 13 minutes in CI 🎉)

@Zac-HD
Copy link
Contributor

Zac-HD commented May 12, 2023

I think we should solve the root problems here. unittest or python -m test should grow the ability to have "run only these tests" functionality rather than just "don't run these tests"

FWIW you can detect this by checking for a .hypothesis attribute on the test function or method; I guess in unittest you'd create a custom TestLoader subclass which knows how to check this attribute?

I think we should tweak the hypothesis settings to make it a bit less flaky. @Zac-HD Any interest in helping with that?

Happy to, though I think that the -j 4 and stubs-install fixes have resolved this? This failure Alex mentions is a segfault in a threading test, which I don't think is fixable by configuring Hypothesis.

@AlexWaygood
Copy link
Member

AlexWaygood commented May 12, 2023

This failure Alex mentions is a segfault in a threading test, which I don't think is fixable by configuring Hypothesis.

No, that's a separate failure on a later commit in that PR. The test failure on the previous commit in that PR is the relevant one that I was talking about.

@AlexWaygood
Copy link
Member

@Zac-HD here's a direct link to the run where test_zoneinfo failed on #104421: https://github.com/python/cpython/actions/runs/4960409839/jobs/8875856040

@Zac-HD
Copy link
Contributor

Zac-HD commented May 12, 2023

Oh, I see, sorry. Expect a PR to set deadline=None globally tomorrow 🙂

@AlexWaygood
Copy link
Member

AlexWaygood commented May 13, 2023

One interesting thing -- and this definitely isn't due to hypothesis, but might be due to the specific way that the test suite is being invoked in CI -- is that test_threading and test__xxsubinterpreters seem to be crashing constantly in the new "Hypothesis Tests on Ubuntu" job in CI (see https://github.com/python/cpython/actions/runs/4966613914/jobs/8888147069?pr=104421 for another example). These test failures had been observed before (see #104341), so they definitely aren't caused by the addition of the new CI job. But on other jobs in CI, the crashes had been sporadic/intermittent; on this new job, they seem to be crashing all the time.

I don't know if that helps @ericsnowcurrently debug the root cause of the crashes at all!

@ericsnowcurrently
Copy link
Member

FYI, gh-105109 should resolve the failures.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.