-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add a property-based test using Hypothesis & Hypothesmith #1566
Conversation
This is a great tool, thank you! I'm not sure a normal unit test is a good place for this sort of test though. For one, it's currently failing, so we have to first fix the bugs you linked. As I understand it, it's also partly random, which is great for finding new edge cases, but means that people submitting PRs or running tests locally may see tests failing randomly because we happened to find a previously undetected bug. |
Glad you like it! There are definitely some questions about how you want to use it, but I thought that opening a PR would give us a more concrete starting point for discussion than an issue 😄
Yeah... the obvious options are (1) don't merge it yet, (2)
Using the 'Sleeper' test failures are pretty rare though - upstream in Hypothesis we might have two or three per year over a few thousand tests. Once property-based tests are passing they don't tend to discover more failures, but you can easily run them in a loop overnight if that's a concern and better tooling for this is planned. |
Thanks for the PR! I recently was introduced to Hypothesis and Hypothesmith while contributing to flake8-bugbear and I really like it! I do have a bit of concerns though, here's my two cents FWIW. As a regular contributor, I second JelleZijlstra's slight apprehension about having Hypothesmith as a normal unit test. Since the flow we somewhat document on creating a new PR on the Black repository includes running all unit tests, quite a few PR contributors would hit the Hypothesmith test failing (from looking how many build jobs failed on this PR). Tests are usually designed to either make sure something works as expected or a bug wasn't reintroduced, i.e. making sure stuff still works. While Hypothesmith pushes the code further and further so it can find unseen bugs. Regular unit tests and Hypothesmith tests are similar, but have different enough goals that they have different appropriate reactions to when they fail. This would add more complexity for a new contributor or a drive-by minor contributor since they might not be familiar enough with project and/or Hypothesmith to know/understand what to do next. Experienced contributor can deal with this new test once they quickly get up to speed with it, but means contributing to Black has a bit more friction. Although, good documentation on this special test in Black's CONTRIBUTING.md could go a long way on making this easier for newer contributor. Another problem for new / newer contributor is that even if they submit a simple typo fix PR fully through the web interface, there is a non-small chance that the checks will fail due to luck. And if they aren't familiar with Hypothesmith or CI in general, they will probably end up confused. A red X isn't nice to see, and definitely not for newer contributors. Not the greatest experience for someone joining Black, the PSF org, or even open source (and on this front, Black can definitely do better). TL;DR is that this addition complicates the path of contributing to Black, mostly for newer and less experienced contributors. And no, I still want to see Hypothesis and Hypothesmith bring better testing to Black, but as you have pointed out, this is merely the start of a discussion :) |
Thanks for your comment @ichard26! I think you're suggesting that we
which makes sense to me. |
Thanks @Zac-HD! +1 on separate test here. Maybe only on Ubuntu too and I’ll be happy to merge! |
@cooperlees - how will you want to run this? A new |
I feel a new fuzz_tests.yml GitHub Action workflow only on ubuntu unless you see value testing in different OS's was what I envisioned
Maybe even run it with coverage to see how much of the code the fuxx tests hit for some signal too?
https://github.com/psf/black/blob/master/.github/workflows/test.yml#L28 |
15307de
to
0dc7694
Compare
This looks great, will merge if I can rebase after @ambv’s large magic trailing comma rewrite! |
Thanks! ✨ 🍰 ✨ |
At PyCon 2019, I gave a few talks about property-based testing, and also spoke to @ambv about generating test inputs for Black. There's a long history of this working really well for compilers (e.g. CSmith), and for formatting tools like prettier/prettier#371.
So starting at the sprints I wrote Hypothesmith to generate Python source code, reported a bunch of bugs, and eventually after some more upgrades thought that it might be useful to test
black
in its own test suite as well as my downstream code. So here we are!This test will fail due to #970, #1012, #1557, and possibly other bugs. While Hypothesis will present (mostly) minimal failing inputs and can report multiple bugs based on exception type and location, you can expect to see mostly trivial or who-would-ever-write-that failures for a while. However, fixing those edge cases will allow Hypothesmith to explore elsewhere and find more subtle bugs.