[red-knot] Property tests #14178

sharkdp · 2024-11-07T22:11:25Z

Summary

This PR adds a new property_tests module with quickcheck-based tests that verify certain properties of types. The following properties are currently checked:

is_equivalent_to:
- is reflexive: T is equivalent to itself
is_subtype_of:
- is reflexive: T is a subtype of T
- is antisymmetric: if S <: T and T <: S, then S is equivalent to T
- is transitive: S <: T & T <: U => S <: U
is_disjoint_from:
- is irreflexive: T is not disjoint from T
- is symmetric: S disjoint from T => T disjoint from S
is_assignable_to:
- is reflexive
negate:
- is an involution: T.negate().negate() is equivalent to T

There are also some tests that validate higher-level properties like:

S <: T implies that S is not disjoint from T
S <: T implies that S is assignable to T
A singleton type must also be single-valued

These tests found a few bugs so far:

Some additional notes:

Quickcheck-based property tests are non-deterministic and finding counter-examples might take an arbitrary long time. This makes them bad candidates for running in CI (for every PR). We can think of running them in a cron-job way from time to time, similar to fuzzing. But for now, it's only possible to run them locally (see instructions in source code).
Some tests currently find false positive "counterexamples" because our understanding of equivalence of types is not yet complete. We do not understand that int | str is the same as str | int, for example. These tests are in a separate property_tests::flaky module.
Properties can not be formulated in every way possible, due to the fact that is_disjoint_from and is_subtype_of can produce false negative answers.
The current shrinking implementation is very naive, which leads to counterexamples that are very long (str & Any & ~tuple[Any] & ~tuple[Unknown] & ~Literal[""] & ~Literal["a"] | str & int & ~tuple[Any] & ~tuple[Unknown]), requiring the developer to simplify manually. It has not been a major issue so far, but there is a comment in the code how this can be improved.

The tests are currently implemented using a macro. This is a single commit on top which can easily be reverted, if we prefer the plain code instead. With the macro:

// `S <: T` implies that `S` can be assigned to `T`.
type_property_test!(
    subtype_of_implies_assignable_to, db,
    forall types s, t. s.is_subtype_of(db, t) => s.is_assignable_to(db, t)
);

without the macro:

/// `S <: T` implies that `S` can be assigned to `T`.
#[quickcheck]
fn subtype_of_implies_assignable_to(s: Ty, t: Ty) -> bool {
    let db = get_cached_db();

    let s = s.into_type(&db);
    let t = t.into_type(&db);

    !s.is_subtype_of(&*db, t) || s.is_assignable_to(&*db, t)
}

Test Plan

while cargo test --release -p red_knot_python_semantic --features property_tests types::property_tests; do :; done

crates/red_knot_python_semantic/src/types.rs

## Summary Minor fix to `Type::is_subtype_of` to make sure that Boolean literals are subtypes of `int`, to match runtime semantics. Found this while doing some property-testing experiments [1]. [1] #14178 ## Test Plan New unit test.

crates/red_knot_python_semantic/src/types.rs

) ## Summary Another bug found using [property testing](#14178). ## Test Plan New unit test

carljm · 2024-11-08T16:55:40Z

This is awesome! Thank you for doing this, I've been wanting to explore this. I think property testing is very well suited to testing type relation invariants, and I do think we should move forward with actually landing this.

crates/red_knot_python_semantic/Cargo.toml

carljm

This is great, thank you!

crates/red_knot_python_semantic/src/types/property_tests.rs

carljm · 2024-12-02T19:21:06Z

crates/red_knot_python_semantic/src/types/property_tests.rs

+    db,
+    t,
+    t.negate(db).negate(db).is_equivalent_to(db, t)
+);


Another useful property we could test here when we fix the issue about is_subtype_of and non-fully-static types is that a non-fully-static type never participates in subtyping. (This assumes that we add Type::is_fully_static so we can check this predicate.)

Added as a task in #14524

MichaReiser · 2024-12-02T20:09:14Z

I think you can just ignore them and they can then be run by name

sharkdp · 2024-12-02T20:20:48Z

I think you can just ignore them and they can then be run by name

That's an option, but I wanted to mark some of them as #[ignore]d for another reason as well (because they yield many false positives right now). I guess I can also just comment them out for now.. but I don't like commented out committed code :-/

MichaReiser · 2024-12-02T21:20:42Z

I'd prefer ignoring them with different reasons. Commented out code defeats the reason why we're merging the tests in the first place

## Summary Simplify tuples containing `Never` to `Never`: ```py from typing import Never def never() -> Never: ... reveal_type((1, never(), "foo")) # revealed: Never ``` I should note that mypy and pyright do *not* perform this simplification. I don't know why. There is [only one place](https://github.com/astral-sh/ruff/blob/5137fcc9c81610f687b6cb45413ef83c2c5eea73/crates/red_knot_python_semantic/src/types/infer.rs#L1477-L1484) where we use `TupleType::new` directly (instead of `Type::tuple`, which changes behavior here). This appears when creating `TypeVar` constraints, and it looks to me like it should stay this way, because we're using `TupleType` to store a list of constraints there, instead of an actual type. We also store `tuple[constraint1, constraint2, …]` as the type for the `constraint1, constraint2, …` tuple expression. This would mean that we infer a type of `tuple[str, Never]` for the following type variable constraints, without simplifying it to `Never`. This seems like a weird edge case that's maybe not worth looking further into?! ```py from typing import Never # vvvvvvvvvv def f[T: (str, Never)](x: T): pass ``` ## Test Plan - Added a new unit test. Did not add additional Markdown tests as that seems superfluous. - Tested the example above using red knot, mypy, pyright. - Verified that this allows us to remove `contains_never` from the property tests (#14178 (comment))

Cargo.lock

MichaReiser · 2024-12-03T08:16:25Z

crates/red_knot_python_semantic/src/types/property_tests.rs

+    db.lock().unwrap()
+}
+
+macro_rules! type_property_test {


I find the macro fairly hard to read. What I understand is that their primary benefit is to reduce the repetition of creating the cached db and dereferencing it.

How much faster is reusing the db over creating a new database?

Have you considered adding a snapshot method to the database instead of locking it? Databases are thread safe and can be shared (for as long as you don't update the db with a &mut Db reference

pub fn snapshot(&self) -> Self { Self { storage: self.storage.clone(), files: self.files.snapshot(), system: self.system.clone(), vendored: self.vendored.clone(), events: Arc::clone(&self.events), } }

A test then becomes

#[test] #[quickcheck_macros::quickcheck] fn equivalent_to_is_reflexive2(t: crate::types::tests::Ty) -> bool { let db = get_cached_db(); let t = t.into_type(&db); t.is_equivalent_to(&db, t) }

Which I don't find too bad

How much faster is reusing the db over creating a new database?

Three orders of magnitude. It's essential. I found some bugs only after I added this optimization.

Have you considered adding a snapshot method to the database instead of locking it

I have not. Would the benefit be to get rid of the let db = &*db;, or do you think that would be faster?

I find the macro fairly hard to read

Because macros are hard to read, or because this particular macro is hard to read? 😄

I don't disagree, but it's also not like we're hiding lots of complexity in there. It's really just a boilerplate-reduction macro. I personally care much more about the readability of the individual tests. The macro is ugly, yes. But I won't have to read the macro code again when writing new tests.

A test then becomes [...] Which I don't find too bad

Well, you picked the easiest example there ;-). With just one type and no logical implication. Did you see the macro-vs-non-macro comparison in the PR description? Or this example, which has three types:

#[quickcheck] fn subtype_of_is_transitive(s: Ty, t: Ty, u: Ty) -> bool { let db = get_cached_db(); let db = &*db; let s = s.into_type(db); let t = t.into_type(db); let u = u.into_type(db); !(s.is_subtype_of(db, t) && t.is_subtype_of(db, u)) || s.is_subtype_of(db, u) }

type_property_test!( subtype_of_is_transitive, db, (s, t, u), s.is_subtype_of(db, t) && t.is_subtype_of(db, u) => s.is_subtype_of(db, u) );

their primary benefit is to reduce the repetition of creating the cached db and dereferencing it

It also removes the necessity for lots of let t = t.into_type(db); lines, and allows us to write logical implications using premise => conclusion instead of !(premise) || (conclusion).

Let me know what you think. I'm going to be okay if we remove it again 😄.

I have not. Would the benefit be to get rid of the let db = &*db;, or do you think that would be faster?

It would allow you to remove the mutex guard and the deref (you still need to use &db when calling any function)

Because macros are hard to read, or because this particular macro is hard to read? 😄

It's somewhat both. I had to expand the macro to understand what the different parts map to. Like what's up with the (s, t, u)? which part is even the assertion? Some extra documentation might help.

Did you see the macro-vs-non-macro comparison in the PR description? Or this example, which has three types:

I admit, I did not! I see how it simplifies the code a bit, but it doesn't significantly reduce the number of lines.

Could we implement into_type for tuples so that you can keep writing (s, t, u) = (s, t, u).into_types(db)?

It also removes the necessity for lots of let t = t.into_type(db); lines, and allows us to write logical implications using premise => conclusion instead of !(premise) || (conclusion).

I do like that. Maybe we can limit the macro part to just that?

I'm not opposed to keeping the macros. I just had to expand them, and I still spent about 5 minutes trying to figure out what was going on. But I'm also not sure if it's worth it if you spend more time on it.

Like what's up with the (s, t, u)? which part is even the assertion?

These are valid concerns. I now pushed another commit where I change the syntax to the following:

type_property_test!( subtype_of_implies_assignable_to, db, forall types s, t. s.is_subtype_of(db, t) => s.is_assignable_to(db, t) );

The intention is that it reads like a mathematical property ∀ S, T. ….

I also simplified the macro definition (only two clauses instead of four), clarified the variable names and added some documentation.

I kind of like how this looks, and would merge it for now. I'm happy to revisit it though if I get more feedback on it (from you or others).

It would allow you to remove the mutex guard and the deref (you still need to use &db when calling any function)

I'll note it down as a task for me (low priority), thanks.

Thanks. I like this much more!

* main: [`ruff`] Extend unnecessary-regular-expression to non-literal strings (`RUF055`) (#14679) Minor followups to RUF052 (#14755) [red-knot] Property tests (#14178) [red-knot] `is_subtype_of` fix for `KnownInstance` types (#14750) Improve docs for flake8-use-pathlib rules (#14741) [`ruff`] Implemented `used-dummy-variable` (`RUF052`) (#14611) [red-knot] Simplify tuples containing `Never` (#14744) Possible fix for flaky file watching test (#14543) [`flake8-import-conventions`] Improve syntax check for aliases supplied in configuration for `unconventional-import-alias (ICN001)` (#14745) [red-knot] Deeper understanding of `LiteralString` (#14649) red-knot: support narrowing for bool(E) (#14668) [`refurb`] Handle non-finite decimals in `verbose-decimal-constructor (FURB157)` (#14596) [red-knot] Re-enable linter corpus tests (#14736)

sharkdp mentioned this pull request Nov 7, 2024

[red-knot] Minor: fix Literal[True] <: int #14177

Merged

AlexWaygood added the red-knot Multi-file analysis & type inference label Nov 7, 2024

sharkdp force-pushed the david/property-tests branch 2 times, most recently from c1f05ab to 634b4f6 Compare November 7, 2024 22:17

sharkdp commented Nov 7, 2024

View reviewed changes

crates/red_knot_python_semantic/src/types.rs Outdated Show resolved Hide resolved

This comment was marked as off-topic.

Sign in to view

sharkdp commented Nov 7, 2024

View reviewed changes

crates/red_knot_python_semantic/src/types.rs Outdated Show resolved Hide resolved

This was referenced Nov 8, 2024

[red-knot] Fix intersection simplification for ~Any/~Unknown #14195

Merged

[red-knot] Fix is_assignable_to for unions #14196

Merged

sharkdp added a commit that referenced this pull request Nov 8, 2024

[red-knot] Fix intersection simplification for ~Any/~Unknown (#14195

670f958

) ## Summary Another bug found using [property testing](#14178). ## Test Plan New unit test

sharkdp force-pushed the david/property-tests branch from c1086b8 to 87c67df Compare November 9, 2024 19:21

sharkdp force-pushed the david/property-tests branch 3 times, most recently from 40de0e3 to 207eb44 Compare December 2, 2024 14:45

sharkdp commented Dec 2, 2024

View reviewed changes

crates/red_knot_python_semantic/Cargo.toml Outdated Show resolved Hide resolved

sharkdp force-pushed the david/property-tests branch 2 times, most recently from 7ccc8c6 to 6c44442 Compare December 2, 2024 15:42

sharkdp marked this pull request as ready for review December 2, 2024 15:43

sharkdp requested review from carljm and AlexWaygood as code owners December 2, 2024 15:43

This comment was marked as resolved.

Sign in to view

carljm approved these changes Dec 2, 2024

View reviewed changes

sharkdp mentioned this pull request Dec 2, 2024

[red-knot] Simplify tuples containing Never #14744

Merged

[red-knot] Add property tests for Type

b58fdc6

Use a type_property_test macro

838f68b

sharkdp force-pushed the david/property-tests branch from 6c44442 to 92ee882 Compare December 3, 2024 07:55

Move tests to submodules

89c4c07

sharkdp force-pushed the david/property-tests branch from 92ee882 to 89c4c07 Compare December 3, 2024 08:07

MichaReiser reviewed Dec 3, 2024

View reviewed changes

No default features

e1c091a

MichaReiser approved these changes Dec 3, 2024

View reviewed changes

Macro: documentation, simplification, clearer syntax

2034e2d

sharkdp merged commit 7430900 into main Dec 3, 2024
21 checks passed

sharkdp deleted the david/property-tests branch December 3, 2024 12:54

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[red-knot] Property tests #14178

[red-knot] Property tests #14178

sharkdp commented Nov 7, 2024 •

edited

Loading

This comment was marked as off-topic.

carljm commented Nov 8, 2024

This comment was marked as resolved.

carljm left a comment

carljm Dec 2, 2024

sharkdp Dec 3, 2024

MichaReiser commented Dec 2, 2024

sharkdp commented Dec 2, 2024

MichaReiser commented Dec 2, 2024

MichaReiser Dec 3, 2024

sharkdp Dec 3, 2024

MichaReiser Dec 3, 2024

sharkdp Dec 3, 2024 •

edited

Loading

MichaReiser Dec 3, 2024

[red-knot] Property tests #14178

[red-knot] Property tests #14178

Conversation

sharkdp commented Nov 7, 2024 • edited Loading

Summary

Test Plan

This comment was marked as off-topic.

carljm commented Nov 8, 2024

This comment was marked as resolved.

carljm left a comment

Choose a reason for hiding this comment

carljm Dec 2, 2024

Choose a reason for hiding this comment

sharkdp Dec 3, 2024

Choose a reason for hiding this comment

MichaReiser commented Dec 2, 2024

sharkdp commented Dec 2, 2024

MichaReiser commented Dec 2, 2024

MichaReiser Dec 3, 2024

Choose a reason for hiding this comment

sharkdp Dec 3, 2024

Choose a reason for hiding this comment

MichaReiser Dec 3, 2024

Choose a reason for hiding this comment

sharkdp Dec 3, 2024 • edited Loading

Choose a reason for hiding this comment

MichaReiser Dec 3, 2024

Choose a reason for hiding this comment

sharkdp commented Nov 7, 2024 •

edited

Loading

sharkdp Dec 3, 2024 •

edited

Loading