
Add tests for all kinds of assignment statement #10786

Merged
merged 7 commits into dhruv/parser from dhruv/assign-stmt on Apr 9, 2024

Conversation

dhruvmanila
Member

Summary

This PR adds test cases for assignment, augmented assignment, and annotated assignment statements.
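For reference, the three statement kinds covered by the new tests look like this (a hypothetical illustration written in the PR's own inline-test comment convention; the test name is made up, not taken from the diff):

```rust
// test_ok assign_stmt_kinds (hypothetical name, for illustration only)
// x = y = 1      # assignment, possibly with chained targets
// x += 1         # augmented assignment
// x: int = 1     # annotated assignment
```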

@dhruvmanila added the parser (Related to the parser) and testing (Related to testing Ruff itself) labels on Apr 5, 2024
Comment on lines -200 to +204
```diff
-        if self.eat(TokenKind::Equal) {
+        if self.at(TokenKind::Equal) {
             Stmt::Assign(self.parse_assign_statement(parsed_expr, start))
-        } else if self.eat(TokenKind::Colon) {
+        } else if self.at(TokenKind::Colon) {
```
dhruvmanila (Member Author)

I think it makes more sense for the parsing function to start at the token it expects. It's also consistent with how the other parsing functions work.
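A minimal, self-contained sketch of that contract (toy types and names, not ruff's actual `Parser`): the caller only peeks with `at`, and the statement-specific function consumes the token it is named after.

```rust
#[derive(Clone, Copy, PartialEq, Debug)]
enum TokenKind { Equal, Colon, Name }

struct Parser { tokens: Vec<TokenKind>, pos: usize }

impl Parser {
    /// Peek: is the current token `kind`? Never advances.
    fn at(&self, kind: TokenKind) -> bool {
        self.tokens.get(self.pos) == Some(&kind)
    }

    /// Consume: advance past the current token, asserting its kind.
    fn bump(&mut self, kind: TokenKind) {
        assert!(self.at(kind), "expected {kind:?}");
        self.pos += 1;
    }

    /// Starts at the `=` it expects and consumes it itself, so the caller
    /// never has to half-parse the statement before dispatching.
    fn parse_assign_statement(&mut self) {
        self.bump(TokenKind::Equal);
        // ... parse the right-hand side here ...
    }
}

fn main() {
    let mut p = Parser { tokens: vec![TokenKind::Equal, TokenKind::Name], pos: 0 };
    if p.at(TokenKind::Equal) {
        p.parse_assign_statement();
    }
    assert_eq!(p.pos, 1);
}
```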

Comment on lines 816 to 872
```diff
-        let mut value = self.parse_expression_list();
-
-        if self.at(TokenKind::Equal) {
-            self.parse_list(RecoveryContextKind::AssignmentTargets, |parser| {
-                parser.bump(TokenKind::Equal);
-
-                let mut parsed_expr = parser.parse_expression_list();
-
-                std::mem::swap(&mut value, &mut parsed_expr);
+        // test_err assign_stmt_missing_rhs
+        // x =
+        // 1 + 1
+        // x = y =
+        // 2 + 2
+        // x = = y
+        // 3 + 3
+        self.parse_list(RecoveryContextKind::AssignmentTargets, |parser| {
+            parser.bump(TokenKind::Equal);
+
+            // test_err assign_stmt_keyword_target
+            // a = pass = c
+            // a + b
+            // a = b = pass = c
+            // a + b
+            targets.push(parser.parse_star_expression_list().expr);
+        });
+
+        // test_err assign_stmt_invalid_starred_expr
+        // x = *a and b
+        // x = *yield x
+        // x = *lambda x: x
+
+        // The last target is the value of the assignment statement.
+        // SAFETY: There is at least one element in the vector which is the parameter `target`.
+        let value = targets.pop().unwrap();
```
dhruvmanila (Member Author)

Consuming the `=` token in the parsing function lets us remove the memory-swap logic. We now collect all the expressions as if they were assignment targets and, at the end, pop the last one off to become the value. Then we validate the targets.
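A self-contained toy of that strategy (assumed function name, not the actual parser code): every `=`-separated expression is collected as a potential target, and the last one is popped off as the value.

```rust
// Toy model of `a = b = c`: the `=`-separated expressions, already parsed.
fn split_targets_and_value(mut exprs: Vec<&str>) -> (Vec<&str>, &str) {
    // Mirrors the parser's SAFETY comment: the caller guarantees at least
    // one element, so `pop` cannot fail.
    let value = exprs.pop().unwrap();
    // Everything that remains is an assignment target to be validated.
    (exprs, value)
}

fn main() {
    let (targets, value) = split_targets_and_value(vec!["a", "b", "c"]);
    assert_eq!(targets, ["a", "b"]);
    assert_eq!(value, "c");
}
```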

Comment on lines +2115 to +2219
```rust
/// Validate that the given expression is a valid assignment target.
///
/// If the expression is a list or tuple, then validate each element in the list.
/// If it's a starred expression, then validate the value of the starred expression.
///
/// Report an error for each invalid assignment expression found.
pub(super) fn validate_assignment_target(&mut self, expr: &Expr) {
    match expr {
        Expr::Starred(ast::ExprStarred { value, .. }) => self.validate_assignment_target(value),
        Expr::List(ast::ExprList { elts, .. }) | Expr::Tuple(ast::ExprTuple { elts, .. }) => {
            for expr in elts {
                self.validate_assignment_target(expr);
            }
        }
        Expr::Name(_) | Expr::Attribute(_) | Expr::Subscript(_) => {}
        _ => self.add_error(ParseErrorType::InvalidAssignmentTarget, expr.range()),
    }
}
```
dhruvmanila (Member Author) commented Apr 5, 2024

This now matches the behavior of CPython.

Earlier, for lists, tuples, and starred expressions, we'd highlight the entire expression as an invalid assignment target, which is incorrect. For example, the highlighted range below is wrong:

```
38 | ... = 42
39 | *foo() = 42
40 | [x, foo(), y] = [42, 42, 42]
   | ^^^^^^^^^^^^^ Syntax Error: invalid assignment target
41 | [[a, b], [[42]], d] = [[1, 2], [[3]], 4]
42 | (x, foo(), y) = (42, 42, 42)
```

A list is allowed as an assignment target; it's one or more of the elements inside the list that make it invalid. The same goes for tuples and starred expressions. For example:

```
  |
1 | "abc": str = "def"
2 | [123, 456]: (int, int) = (1, 2)
  |  ^^^ Syntax Error: invalid assignment target
  |
```

```
  |
1 | "abc": str = "def"
2 | [123, 456]: (int, int) = (1, 2)
  |       ^^^ Syntax Error: invalid assignment target
```

Moving the logic into the `Parser` struct also avoids allocating a vector of `ParseError`s; instead, we can add each error directly.
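To make the new highlighting behavior concrete, here is a self-contained toy version of the recursion (a simplified `Expr` with hypothetical names, not ruff's AST): the walk descends into container nodes and reports only the offending leaf.

```rust
#[derive(Debug)]
enum Expr {
    Name(&'static str),
    Call(&'static str),
    List(Vec<Expr>),
    Starred(Box<Expr>),
}

fn validate(expr: &Expr, errors: &mut Vec<String>) {
    match expr {
        // Descend instead of flagging the container itself.
        Expr::Starred(value) => validate(value, errors),
        Expr::List(elts) => {
            for e in elts {
                validate(e, errors);
            }
        }
        Expr::Name(_) => {}
        // A call such as `foo()` is not a valid assignment target.
        other => errors.push(format!("invalid assignment target: {other:?}")),
    }
}

fn main() {
    // `[x, foo(), y] = ...`: only `foo()` is reported, not the whole list.
    let target = Expr::List(vec![Expr::Name("x"), Expr::Call("foo"), Expr::Name("y")]);
    let mut errors = Vec::new();
    validate(&target, &mut errors);
    assert_eq!(errors, ["invalid assignment target: Call(\"foo\")"]);
}
```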


codspeed-hq bot commented Apr 5, 2024

CodSpeed Performance Report

Merging #10786 will degrade performance by 8.21%

Comparing dhruv/assign-stmt (5b99e0e) with dhruv/parser (22d70b0)

Summary

⚡ 1 improvement
❌ 2 regressions
✅ 27 untouched benchmarks

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Benchmarks breakdown

| Benchmark | dhruv/parser | dhruv/assign-stmt | Change |
| --- | --- | --- | --- |
| parser[large/dataset.py] | 26.5 ms | 28.8 ms | -7.77% |
| parser[pydantic/types.py] | 10.8 ms | 11.8 ms | -8.21% |
| parser[unicode/pypinyin.py] | 1.7 ms | 1.6 ms | +5.38% |

Comment on lines 905 to 950
```rust
let value = if self.eat(TokenKind::Equal) {
    if self.at_expr() {
        Some(Box::new(self.parse_expression_list().expr))
    } else {
        // test_err ann_assign_stmt_missing_rhs
        // x: int =
        self.add_error(
            ParseErrorType::OtherError("Expected an expression".to_string()),
            self.current_token_range(),
        );
        None
    }
} else {
    None
};
```
dhruvmanila (Member Author)

I'm adopting this pattern wherever I can, to avoid adding an invalid expression node to the AST where `None` is feasible.
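A minimal sketch of the shape this produces, assuming a simplified annotated-assignment node (not ruff's exact AST definition):

```rust
// Simplified stand-ins for the real AST types.
struct Expr;

struct AnnAssignStmt {
    target: Expr,
    annotation: Expr,
    // `None` when the `=` is present but no expression follows (`x: int =`),
    // or when there is no `=` at all (`x: int`). No synthetic "invalid"
    // expression node ever enters the AST.
    value: Option<Box<Expr>>,
}
```

Downstream consumers then match on the `Option` rather than having to recognize and skip a placeholder node.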

@dhruvmanila marked this pull request as ready for review April 5, 2024 04:38
@dhruvmanila requested a review from MichaReiser as a code owner April 5, 2024 04:38
MichaReiser (Member)

Hmm, can you double check if the codspeed regression is real?

MichaReiser (Member) left a comment

The inline tests make reviewing so much easier. Thanks for implementing them.


github-actions bot commented Apr 5, 2024

ruff-ecosystem results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

✅ ecosystem check detected no linter changes.

Formatter (stable)

✅ ecosystem check detected no format changes.

Formatter (preview)

✅ ecosystem check detected no format changes.

@dhruvmanila force-pushed the dhruv/type-params branch 3 times, most recently from 1799d92 to be7fd69, on April 6, 2024 17:13
Base automatically changed from dhruv/type-params to dhruv/parser April 9, 2024 08:52
dhruvmanila (Member Author)

Regarding the regression: I made one change that removed the regression locally, but it's still showing up in CodSpeed. The change is to take the list-parsing path only if there is more than one assignment target. This is in commit a9b362e.
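A rough, self-contained sketch of the idea behind that change (a hypothetical shape, not the actual a9b362e diff): the multi-target bookkeeping is only paid for when a second `=` proves there is more than one target.

```rust
// Toy result type: a dedicated variant keeps the common single-target
// case (`x = 1`) away from the general multi-target machinery.
enum Assign<'a> {
    Single { target: &'a str, value: &'a str },
    Chained { targets: Vec<&'a str>, value: &'a str },
}

// `parts` are the `=`-separated expressions: `x = y = 1` -> ["x", "y", "1"].
fn lower<'a>(mut parts: Vec<&'a str>) -> Assign<'a> {
    let value = parts.pop().expect("at least one expression");
    if parts.len() == 1 {
        // Fast path: exactly one target, no list-parsing loop needed.
        Assign::Single { target: parts[0], value }
    } else {
        // Slow path: chained targets go through the collect-and-pop loop.
        Assign::Chained { targets: parts, value }
    }
}

fn main() {
    assert!(matches!(lower(vec!["x", "1"]), Assign::Single { .. }));
    assert!(matches!(lower(vec!["x", "y", "1"]), Assign::Chained { .. }));
}
```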

Here are the local benchmarks:

1. Here, the baseline is `dhruv/parser` and the current build is `dhruv/assign-stmt` without the a9b362e commit. It can be seen that the performance has indeed regressed:

```
cargo bench -p ruff_benchmark --bench parser --profile=profiling -- --baseline=parser
    Finished profiling [optimized + debuginfo] target(s) in 21.50s
     Running benches/parser.rs (target/profiling/deps/parser-d8622fac03063ec1)
parser/numpy/globals.py time:   [9.0928 µs 9.0953 µs 9.0979 µs]
                        thrpt:  [324.32 MiB/s 324.42 MiB/s 324.51 MiB/s]
                 change:
                        time:   [+4.2718% +4.3901% +4.5088%] (p = 0.00 < 0.05)
                        thrpt:  [-4.3143% -4.2055% -4.0968%]
                        Performance has regressed.
Found 9 outliers among 100 measurements (9.00%)
  4 (4.00%) high mild
  5 (5.00%) high severe
parser/unicode/pypinyin.py
                        time:   [34.544 µs 34.558 µs 34.572 µs]
                        thrpt:  [121.54 MiB/s 121.59 MiB/s 121.64 MiB/s]
                 change:
                        time:   [+1.7319% +2.1667% +2.4890%] (p = 0.00 < 0.05)
                        thrpt:  [-2.4286% -2.1207% -1.7025%]
                        Performance has regressed.
Found 5 outliers among 100 measurements (5.00%)
  1 (1.00%) low severe
  1 (1.00%) high mild
  3 (3.00%) high severe
parser/pydantic/types.py
                        time:   [257.20 µs 257.24 µs 257.28 µs]
                        thrpt:  [99.125 MiB/s 99.142 MiB/s 99.158 MiB/s]
                 change:
                        time:   [+2.1911% +2.3220% +2.4558%] (p = 0.00 < 0.05)
                        thrpt:  [-2.3969% -2.2693% -2.1441%]
                        Performance has regressed.
Found 10 outliers among 100 measurements (10.00%)
  2 (2.00%) high mild
  8 (8.00%) high severe
parser/numpy/ctypeslib.py
                        time:   [110.57 µs 110.59 µs 110.62 µs]
                        thrpt:  [150.52 MiB/s 150.56 MiB/s 150.60 MiB/s]
                 change:
                        time:   [+1.0194% +1.1439% +1.2748%] (p = 0.00 < 0.05)
                        thrpt:  [-1.2588% -1.1310% -1.0091%]
                        Performance has regressed.
Found 11 outliers among 100 measurements (11.00%)
  4 (4.00%) high mild
  7 (7.00%) high severe
parser/large/dataset.py time:   [656.06 µs 656.14 µs 656.22 µs]
                        thrpt:  [61.996 MiB/s 62.004 MiB/s 62.011 MiB/s]
                 change:
                        time:   [+2.1560% +2.2464% +2.3389%] (p = 0.00 < 0.05)
                        thrpt:  [-2.2854% -2.1971% -2.1105%]
                        Performance has regressed.
Found 8 outliers among 100 measurements (8.00%)
  3 (3.00%) high mild
  5 (5.00%) high severe
```

2. Here, the baseline is `dhruv/assign-stmt` without a9b362e and the current build is `dhruv/assign-stmt` with the a9b362e commit. It can be seen that the performance has indeed improved:

```
cargo bench -p ruff_benchmark --bench parser --profile=profiling -- --baseline=orig-assign-stmt
    Finished profiling [optimized + debuginfo] target(s) in 21.43s
     Running benches/parser.rs (target/profiling/deps/parser-d8622fac03063ec1)
parser/numpy/globals.py time:   [8.6279 µs 8.6297 µs 8.6316 µs]
                        thrpt:  [341.84 MiB/s 341.92 MiB/s 341.99 MiB/s]
                 change:
                        time:   [-4.7410% -4.6397% -4.5339%] (p = 0.00 < 0.05)
                        thrpt:  [+4.7492% +4.8654% +4.9769%]
                        Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
  2 (2.00%) low mild
  4 (4.00%) high severe
parser/unicode/pypinyin.py
                        time:   [34.073 µs 34.080 µs 34.087 µs]
                        thrpt:  [123.27 MiB/s 123.30 MiB/s 123.32 MiB/s]
                 change:
                        time:   [-0.7874% -0.6685% -0.5517%] (p = 0.00 < 0.05)
                        thrpt:  [+0.5547% +0.6730% +0.7936%]
                        Change within noise threshold.
Found 7 outliers among 100 measurements (7.00%)
  1 (1.00%) low mild
  2 (2.00%) high mild
  4 (4.00%) high severe
parser/pydantic/types.py
                        time:   [248.44 µs 248.48 µs 248.54 µs]
                        thrpt:  [102.61 MiB/s 102.64 MiB/s 102.65 MiB/s]
                 change:
                        time:   [-3.1474% -3.0314% -2.9278%] (p = 0.00 < 0.05)
                        thrpt:  [+3.0161% +3.1262% +3.2497%]
                        Performance has improved.
Found 11 outliers among 100 measurements (11.00%)
  4 (4.00%) high mild
  7 (7.00%) high severe
parser/numpy/ctypeslib.py
                        time:   [109.71 µs 109.75 µs 109.83 µs]
                        thrpt:  [151.61 MiB/s 151.72 MiB/s 151.78 MiB/s]
                 change:
                        time:   [-0.7424% -0.6053% -0.4651%] (p = 0.00 < 0.05)
                        thrpt:  [+0.4672% +0.6090% +0.7479%]
                        Change within noise threshold.
Found 14 outliers among 100 measurements (14.00%)
  7 (7.00%) high mild
  7 (7.00%) high severe
parser/large/dataset.py time:   [632.69 µs 632.90 µs 633.14 µs]
                        thrpt:  [64.256 MiB/s 64.280 MiB/s 64.301 MiB/s]
                 change:
                        time:   [-3.1404% -3.0469% -2.9653%] (p = 0.00 < 0.05)
                        thrpt:  [+3.0559% +3.1426% +3.2422%]
                        Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
  3 (3.00%) high mild
  2 (2.00%) high severe
```

3. And for clarity, here the baseline is `dhruv/parser` and the current build is `dhruv/assign-stmt` with the a9b362e commit. There are no visible regressions:

```
cargo bench -p ruff_benchmark --bench parser --profile=profiling -- --baseline=parser
    Finished profiling [optimized + debuginfo] target(s) in 21.37s
     Running benches/parser.rs (target/profiling/deps/parser-d8622fac03063ec1)
parser/numpy/globals.py time:   [8.6349 µs 8.6366 µs 8.6384 µs]
                        thrpt:  [341.58 MiB/s 341.65 MiB/s 341.72 MiB/s]
                 change:
                        time:   [-0.9459% -0.8310% -0.7193%] (p = 0.00 < 0.05)
                        thrpt:  [+0.7245% +0.8379% +0.9550%]
                        Change within noise threshold.
Found 8 outliers among 100 measurements (8.00%)
  2 (2.00%) high mild
  6 (6.00%) high severe
parser/unicode/pypinyin.py
                        time:   [33.984 µs 33.989 µs 33.995 µs]
                        thrpt:  [123.60 MiB/s 123.62 MiB/s 123.64 MiB/s]
                 change:
                        time:   [+0.1451% +0.5866% +0.9046%] (p = 0.00 < 0.05)
                        thrpt:  [-0.8965% -0.5832% -0.1449%]
                        Change within noise threshold.
Found 8 outliers among 100 measurements (8.00%)
  5 (5.00%) high mild
  3 (3.00%) high severe
parser/pydantic/types.py
                        time:   [251.89 µs 251.94 µs 251.99 µs]
                        thrpt:  [101.21 MiB/s 101.23 MiB/s 101.25 MiB/s]
                 change:
                        time:   [+0.0362% +0.1804% +0.3243%] (p = 0.01 < 0.05)
                        thrpt:  [-0.3232% -0.1801% -0.0362%]
                        Change within noise threshold.
Found 7 outliers among 100 measurements (7.00%)
  2 (2.00%) high mild
  5 (5.00%) high severe
parser/numpy/ctypeslib.py
                        time:   [108.39 µs 108.63 µs 108.99 µs]
                        thrpt:  [152.78 MiB/s 153.28 MiB/s 153.63 MiB/s]
                 change:
                        time:   [-1.1797% -0.9915% -0.7578%] (p = 0.00 < 0.05)
                        thrpt:  [+0.7636% +1.0014% +1.1938%]
                        Change within noise threshold.
Found 3 outliers among 100 measurements (3.00%)
  2 (2.00%) high mild
  1 (1.00%) high severe
parser/large/dataset.py time:   [634.40 µs 634.60 µs 634.82 µs]
                        thrpt:  [64.086 MiB/s 64.108 MiB/s 64.128 MiB/s]
                 change:
                        time:   [-1.1884% -1.0887% -0.9926%] (p = 0.00 < 0.05)
                        thrpt:  [+1.0025% +1.1007% +1.2027%]
                        Change within noise threshold.
Found 3 outliers among 100 measurements (3.00%)
  1 (1.00%) high mild
  2 (2.00%) high severe

```

I ran hyperfine benchmarks on the CPython repository as well, which gave me the same results.

@dhruvmanila merged commit b7b5117 into dhruv/parser on Apr 9, 2024
16 of 17 checks passed
@dhruvmanila deleted the dhruv/assign-stmt branch April 9, 2024 13:45
dhruvmanila added commits that referenced this pull request on Apr 11, Apr 15, Apr 16 (twice), Apr 17, and Apr 18, 2024, each with the same summary: "This PR adds test cases for assignment, augmented assignment, and annotated assignment statements."
Labels
parser (Related to the parser) · testing (Related to testing Ruff itself)

2 participants