perf: greatly improve hash efficiency in computing attributes #158

QuadnucYard · 2024-11-25T11:07:02Z

This PR resolves a critical issue that the formatter runs surprisingly slow on some large documents (e.g., the touying package), which was noticed earlier by @Myriad-Dreamin but not publicly disclosed.

After conducting an investigation with timing, I found that it is the HashMap in node attribution computation instead of Doc creating and pretty printing that hurts. In discussions with @Enter-tainer and @Myriad-Dreamin, we discovered that the previously used SyntaxNode hash value is computed recursively, which should be avoided in our preprocessing. As an alternative, we can use the span as the unique identifier for nodes. The root node needs to be created from Source. Otherwise, it would have no valid span attached.

With the changes in this PR, the time of formatting tablex.typ (the largest one in our test assets) dramatically drops from ~500ms to ~20ms. Additionally, unit tests can be completed much faster, although the improvement is not that significant.

Benchmarks will be added later.

Enter-tainer

Overall LGTM, the refactoring for AttrStore looks nice 👍

Enter-tainer · 2024-11-25T11:33:57Z

It would be better if we can add benchmark related code too.

QuadnucYard · 2024-11-25T11:50:31Z

Here are the benchmark results, run from my laptop:

With FxHashMap:

attrs-tablex            time:   [750.69 µs 756.21 µs 761.75 µs]
Found 3 outliers among 100 measurements (3.00%)
  3 (3.00%) high severe

attrs-cetz-manual       time:   [131.79 µs 133.00 µs 134.38 µs]
Found 7 outliers among 100 measurements (7.00%)
  6 (6.00%) high mild
  1 (1.00%) high severe

attrs-undergraduate-math
                        time:   [207.06 µs 210.71 µs 215.02 µs]
Found 8 outliers among 100 measurements (8.00%)
  5 (5.00%) high mild
  3 (3.00%) high severe

attrs-packages/codly    time:   [414.37 µs 419.51 µs 424.97 µs]

attrs-packages/fletcher-diagram
                        time:   [83.599 µs 86.387 µs 89.831 µs]
Found 11 outliers among 100 measurements (11.00%)
  8 (8.00%) high mild
  3 (3.00%) high severe

attrs-packages/fletcher-draw
                        time:   [272.75 µs 278.16 µs 284.72 µs]
Found 6 outliers among 100 measurements (6.00%)
  6 (6.00%) high mild

pretty-tablex           time:   [15.852 ms 16.369 ms 17.054 ms]
Found 12 outliers among 100 measurements (12.00%)
  9 (9.00%) high mild
  3 (3.00%) high severe

pretty-cetz-manual      time:   [2.6282 ms 2.7033 ms 2.7865 ms]
Found 11 outliers among 100 measurements (11.00%)
  5 (5.00%) high mild
  6 (6.00%) high severe

pretty-undergraduate-math
                        time:   [2.6843 ms 2.8005 ms 2.9342 ms]
Found 10 outliers among 100 measurements (10.00%)
  5 (5.00%) high mild
  5 (5.00%) high severe

pretty-packages/codly   time:   [9.3115 ms 9.7219 ms 10.270 ms]
Found 13 outliers among 100 measurements (13.00%)
  5 (5.00%) high mild
  8 (8.00%) high severe

pretty-packages/fletcher-diagram
                        time:   [2.6072 ms 2.6400 ms 2.6739 ms]

pretty-packages/fletcher-draw
                        time:   [6.2645 ms 6.3296 ms 6.3946 ms]

Compared with std HashMap:

attrs-tablex            time:   [1.4011 ms 1.4216 ms 1.4440 ms]
                        change: [+82.195% +86.714% +90.790%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 3 outliers among 100 measurements (3.00%)
  3 (3.00%) high mild

attrs-cetz-manual       time:   [305.22 µs 316.43 µs 329.72 µs]
                        change: [+118.77% +124.72% +131.43%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 5 outliers among 100 measurements (5.00%)
  2 (2.00%) high mild
  3 (3.00%) high severe

attrs-undergraduate-math
                        time:   [427.62 µs 439.27 µs 452.34 µs]
                        change: [+61.585% +77.187% +90.993%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 10 outliers among 100 measurements (10.00%)
  1 (1.00%) low mild
  5 (5.00%) high mild
  4 (4.00%) high severe

attrs-packages/codly    time:   [744.81 µs 764.46 µs 792.11 µs]
                        change: [+77.806% +81.824% +86.477%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 5 outliers among 100 measurements (5.00%)
  2 (2.00%) high mild
  3 (3.00%) high severe

attrs-packages/fletcher-diagram
                        time:   [193.31 µs 195.71 µs 198.22 µs]
                        change: [+116.63% +125.35% +133.35%] (p = 0.00 < 0.05)
                        Performance has regressed.

attrs-packages/fletcher-draw
                        time:   [469.45 µs 474.21 µs 479.21 µs]
                        change: [+67.398% +70.803% +74.332%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

pretty-tablex           time:   [18.346 ms 19.065 ms 19.879 ms]
                        change: [+10.237% +16.475% +23.116%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 13 outliers among 100 measurements (13.00%)
  3 (3.00%) high mild
  10 (10.00%) high severe

pretty-cetz-manual      time:   [2.7415 ms 2.8517 ms 2.9843 ms]
                        change: [+0.3902% +5.4898% +11.897%] (p = 0.05 < 0.05)
                        Change within noise threshold.
Found 17 outliers among 100 measurements (17.00%)
  6 (6.00%) high mild
  11 (11.00%) high severe

pretty-undergraduate-math
                        time:   [2.8137 ms 2.8480 ms 2.8821 ms]
                        change: [-3.0979% +1.6950% +6.1962%] (p = 0.48 > 0.05)
                        No change in performance detected.

pretty-packages/codly   time:   [10.239 ms 10.345 ms 10.450 ms]
                        change: [+0.6745% +6.4063% +11.239%] (p = 0.01 < 0.05)
                        Change within noise threshold.
Found 5 outliers among 100 measurements (5.00%)
  3 (3.00%) low mild
  2 (2.00%) high mild

pretty-packages/fletcher-diagram
                        time:   [2.8858 ms 2.9260 ms 2.9715 ms]
                        change: [+8.7981% +10.833% +13.203%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high severe

pretty-packages/fletcher-draw
                        time:   [6.7288 ms 6.8184 ms 6.9105 ms]
                        change: [+5.8656% +7.7232% +9.6154%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

We can see FxHashMap brings great efficiency improvement to attribute computation. In the whole process, it seems that the improvement is slight but still worthy.

QuadnucYard · 2024-11-25T12:14:25Z

More benches with touying.

attrs-packages/touying/core
                        time:   [512.64 µs 518.20 µs 523.82 µs]
attrs-packages/touying/utils
                        time:   [262.51 µs 267.10 µs 272.57 µs]
pretty-packages/touying/core
                        time:   [12.837 ms 13.005 ms 13.207 ms]
pretty-packages/touying/utils
                        time:   [5.4145 ms 5.4656 ms 5.5193 ms]

The two files are not so powerful as tablex 😃

QuadnucYard · 2024-11-25T12:27:07Z

Some bench results before this change:

pretty-undergraduate-math
                        time:   [39.546 ms 39.921 ms 40.308 ms]

pretty-cetz-manual      time:   [16.950 ms 17.194 ms 17.442 ms]

pretty-tablex           time:   [516.00 ms 521.13 ms 526.83 ms]

pretty-codly            time:   [186.91 ms 188.12 ms 189.31 ms]

Enter-tainer

Thanks!

QuadnucYard · 2024-11-25T15:12:39Z

Running benchmark:

cargo bench  # run all benchmarks
cargo bench -- pretty  # run benchmarks of pretty-print only

perf: hash Span instead of SyntaxNode for attrs

2fb6fc2

Enter-tainer approved these changes Nov 25, 2024

View reviewed changes

QuadnucYard added 2 commits November 25, 2024 19:53

test: add benchmarks

1f30927

test: add touying test assets

c3225d5

QuadnucYard force-pushed the perf-hashmap branch from dbed0cc to c3225d5 Compare November 25, 2024 12:14

QuadnucYard marked this pull request as ready for review November 25, 2024 12:15

Enter-tainer approved these changes Nov 25, 2024

View reviewed changes

Enter-tainer merged commit 5231058 into Enter-tainer:master Nov 25, 2024
11 checks passed

QuadnucYard deleted the perf-hashmap branch November 25, 2024 13:10

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

perf: greatly improve hash efficiency in computing attributes #158

perf: greatly improve hash efficiency in computing attributes #158

QuadnucYard commented Nov 25, 2024

Enter-tainer left a comment

Enter-tainer commented Nov 25, 2024

QuadnucYard commented Nov 25, 2024

QuadnucYard commented Nov 25, 2024

QuadnucYard commented Nov 25, 2024

Enter-tainer left a comment

QuadnucYard commented Nov 25, 2024

perf: greatly improve hash efficiency in computing attributes #158

perf: greatly improve hash efficiency in computing attributes #158

Conversation

QuadnucYard commented Nov 25, 2024

Enter-tainer left a comment

Choose a reason for hiding this comment

Enter-tainer commented Nov 25, 2024

QuadnucYard commented Nov 25, 2024

QuadnucYard commented Nov 25, 2024

QuadnucYard commented Nov 25, 2024

Enter-tainer left a comment

Choose a reason for hiding this comment

QuadnucYard commented Nov 25, 2024