
Add covering inputs as @example(...)s in the code #13

Closed
Zac-HD opened this issue Oct 31, 2022 · 4 comments · Fixed by #25
Zac-HD commented Oct 31, 2022

Here's a neat workflow, combining the benefits of PBT and fuzzing with deterministic tests:

  1. Use the fuzzer to find a reasonably diverse set of covering examples (already works)
  2. Automatically edit them into the code as explicit @example(...) cases (this issue!)
  3. Run your standard CI with only full-explicit deterministic examples (already works; see also python/cpython#22863, "GH-86275: Implementation of hypothesis stubs for property-based tests, with zoneinfo tests")
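The deterministic-CI step above is already supported by Hypothesis: a test carrying explicit @example(...) cases can be restricted to exactly those cases via the phases setting. A minimal sketch (the test function and input values here are invented for illustration):

```python
from hypothesis import Phase, example, given, settings, strategies as st

@settings(phases=[Phase.explicit])  # run only the @example(...) cases, no generation
@given(st.integers())
@example(0)
@example(-1)
@example(2**31 - 1)
def test_abs_is_nonnegative(x):
    assert abs(x) >= 0

test_abs_is_nonnegative()  # executes exactly the three explicit examples
```

With the `phases` restriction the run is fully deterministic, which is what makes it suitable for regression-focused CI.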

So what will it take to have automatically-maintained explicit examples? Some quick notes:

  • This only works for test cases which can be written using the @example() decorator, which rules out stateful tests and those using st.data(). We'll also have trouble with reprs that can't be eval'd back to an equivalent object - we might get some distance by representing objects from st.builds() as the result of that call (also useful for HypothesisWorks/hypothesis#3411, "Explaining failing examples - by showing which arguments (don't) matter"), but this seems like a fundamental limitation.
  • We need to know where the test is, and how to insert the decorator. Introspection works, albeit with some pretty painful edge cases we'll need to bail out on, and I think LibCST should make the latter pretty easy - we can construct a string call, attempt to parse it, and then insert it into the decorator list.
  • My preferred UX for this is "HypoFuzz dumps a <hash>.patch file and the user does git apply ...". We can dump the file on disk, and also make it downloadable from the dashboard for remote use. The patch shouldn't be too ugly, e.g. one line per arg, but users are expected to run their choice of autoformatter.
  • I mentioned "automatically-maintained": it'd be nice to remove previously-covering examples when the set updates; or crucial if we haven't shrunk to a minimal covering example (and currently we don't!). This probably means using magic comments to distinguish human-added examples from machine-maintained covering examples. Note that fuzzer-discovered minimal failing examples might be automatically added to the former set!

This seems fiddly, but not actually that hard - we already report covering examples on the dashboard, after all. No timeline on when I'll get to this, but I'd be very happy to provide advice and code review to anyone interested in contributing 🙂

mentalisttraceur commented Nov 15, 2022

So if I understand correctly, the big-picture benefit is to work better with graceful degradation of hypothesis tests to just plain unit tests with hard-coded cases?

By automatically providing those hard-coded cases in a form accessible to stub implementations of the hypothesis APIs which want to stay so simple that they can't even read the example database?

(Besides working around any hassle with making the example database available everywhere, which to me seems better solved by providing a separate solution with great UX for hosting/distributing/syncing the example database.)
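For concreteness, such a stub can be very small indeed. A hedged sketch, assuming hypothetical given/example stand-ins that ignore strategies entirely and replay only the hard-coded cases (decorators apply bottom-up, so each @example attaches its arguments before @given wraps the function):

```python
def example(*args, **kwargs):
    """Stub @example: record the hard-coded case on the test function."""
    def accept(f):
        f._examples = getattr(f, "_examples", []) + [(args, kwargs)]
        return f
    return accept

def given(*_strategies, **_kwargs):
    """Stub @given: ignore strategies, replay only recorded examples."""
    def accept(f):
        def wrapper():
            for args, kwargs in reversed(getattr(f, "_examples", [])):
                f(*args, **kwargs)
        return wrapper
    return accept

@given(None)  # a real strategy would go here; the stub discards it
@example(0)
@example(-1)
def test_abs_nonnegative(x):
    assert abs(x) >= 0

test_abs_nonnegative()  # runs only the two explicit cases
```

A stub this simple never touches the example database, which is exactly why auto-inserted @example(...) cases are what make it viable.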

Zac-HD commented Nov 15, 2022

Smoothly stepping down to a parametrized unit-test, yes. For some users (e.g. CPython) this is so that they can use a stub implementation; others might be happy to use Hypothesis' phases setting which already supports this!

At scale, say 100+ people and 1M+ lines of code, it can be really important to have fast and deterministic CI because you're testing for regressions rather than bugs via that workflow above, and can run a separate bug-hunting program 'alongside' your CI system. Otherwise I agree that sharing the example database is almost always going to be a better and easier solution, via e.g. Hypothesis' RedisExampleDatabase or just writing your own trivial wrapper around whatever datastore you like.
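For context on "trivial wrapper": a Hypothesis example database boils down to three byte-oriented methods. A dict-backed sketch - in practice you'd subclass hypothesis.database.ExampleDatabase and point the backend at Redis, S3, or whatever you like; the class and parameter names here are invented:

```python
class DatastoreExampleDatabase:
    """Toy example-database wrapper; `backend` stands in for any datastore."""

    def __init__(self, backend=None):
        self.backend = backend if backend is not None else {}

    def save(self, key: bytes, value: bytes) -> None:
        # Multiple values may be saved under one key.
        self.backend.setdefault(key, set()).add(value)

    def fetch(self, key: bytes):
        # Yield all values previously saved under this key.
        yield from self.backend.get(key, ())

    def delete(self, key: bytes, value: bytes) -> None:
        # Deleting an absent value is a silent no-op.
        self.backend.get(key, set()).discard(value)
```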

Zac-HD commented Apr 26, 2023

Turns out that HypothesisWorks/hypothesis#3631 is a rather nice feature for Hypothesis itself, so we'll ship almost all the code upstream and HypoFuzz can just call into the hooks I left 😁

Zac-HD commented May 31, 2023

OK, we've shipped all the upstreamable internals in Hypothesis - including removal of @example()s with a specific tag - and I have an MVP branch up at https://github.com/Zac-HD/hypofuzz/compare/write-patches. Further notes:

  • Do we want explain mode for covering examples? On one hand, the `# or any other generated value` comments are actually pretty nice; on the other, it can be a bit of a performance hog at the moment. The branch doesn't include these for now.
  • We should probably offer both a failing-examples-only and a failing-and-covering-examples patch.
  • Dashboard interface: I'm imagining some fairly small links between the main-page chart and the table of details, to download each patch. It might also be nice to have a view-patch-in-browser option; I'm not sure that's worth it, but if we do it, it should have syntax highlighting.
  • Do we want per-test patches accessible from their pages? We'd have to compute these on demand, which, come to think of it, we should do for all the patches to save CPU time.
  • We want to serve the "long-running CI job which uploads a patch file" niche, and I'd prefer an atexit handler to saving every X minutes. To make uploading patch files easier, we can copy to a canonical .hypothesis/patches/latest-{covering,failing}.patch location. To support naive use of timeout, we'll also need to register a handler for SIGTERM. Finally, of course, we'll need to document the full GitHub Actions configuration we recommend.
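The atexit-plus-SIGTERM idea from the last bullet can be sketched in a few lines of stdlib code (write_patches and PATCH_DIR are hypothetical placeholders for the real patch-writing logic):

```python
import atexit
import signal
import sys

PATCH_DIR = ".hypothesis/patches"  # hypothetical canonical location

def write_patches():
    # In HypoFuzz this would write latest-covering.patch and
    # latest-failing.patch under PATCH_DIR; a no-op stands in here.
    pass

# Runs on normal interpreter exit, including exit via SystemExit.
atexit.register(write_patches)

def handle_sigterm(signum, frame):
    # `timeout` sends SIGTERM; converting it to SystemExit means the
    # atexit handlers still run, so the final patch gets written.
    # 128 + signum follows the usual shell exit-code convention.
    sys.exit(128 + signum)

signal.signal(signal.SIGTERM, handle_sigterm)
```

The key detail is that a raw SIGTERM kills the process without running atexit handlers, so the handler must convert the signal into an ordinary exit.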
