Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf: greatly improve hash efficiency in computing attributes #158

Merged
merged 3 commits into from
Nov 25, 2024

Conversation

QuadnucYard
Copy link
Contributor

This PR resolves a critical issue that the formatter runs surprisingly slow on some large documents (e.g., the touying package), which was noticed earlier by @Myriad-Dreamin but not publicly disclosed.

After conducting an investigation with timing, I found that it is the HashMap in node attribution computation instead of Doc creating and pretty printing that hurts. In discussions with @Enter-tainer and @Myriad-Dreamin, we discovered that the previously used SyntaxNode hash value is computed recursively, which should be avoided in our preprocessing. As an alternative, we can use the span as the unique identifier for nodes. The root node needs to be created from Source. Otherwise, it would have no valid span attached.

With the changes in this PR, the time of formatting tablex.typ (the largest one in our test assets) dramatically drops from ~500ms to ~20ms. Additionally, unit tests can be completed much faster, although the improvement is not that significant.

Benchmarks will be added later.

Copy link
Owner

@Enter-tainer Enter-tainer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM, the refactoring for AttrStore looks nice 👍

@Enter-tainer
Copy link
Owner

It would be better if we can add benchmark related code too.

@QuadnucYard
Copy link
Contributor Author

Here are the benchmark results, run from my laptop:

With FxHashMap:

attrs-tablex            time:   [750.69 µs 756.21 µs 761.75 µs]
Found 3 outliers among 100 measurements (3.00%)
  3 (3.00%) high severe

attrs-cetz-manual       time:   [131.79 µs 133.00 µs 134.38 µs]
Found 7 outliers among 100 measurements (7.00%)
  6 (6.00%) high mild
  1 (1.00%) high severe

attrs-undergraduate-math
                        time:   [207.06 µs 210.71 µs 215.02 µs]
Found 8 outliers among 100 measurements (8.00%)
  5 (5.00%) high mild
  3 (3.00%) high severe

attrs-packages/codly    time:   [414.37 µs 419.51 µs 424.97 µs]

attrs-packages/fletcher-diagram
                        time:   [83.599 µs 86.387 µs 89.831 µs]
Found 11 outliers among 100 measurements (11.00%)
  8 (8.00%) high mild
  3 (3.00%) high severe

attrs-packages/fletcher-draw
                        time:   [272.75 µs 278.16 µs 284.72 µs]
Found 6 outliers among 100 measurements (6.00%)
  6 (6.00%) high mild

pretty-tablex           time:   [15.852 ms 16.369 ms 17.054 ms]
Found 12 outliers among 100 measurements (12.00%)
  9 (9.00%) high mild
  3 (3.00%) high severe

pretty-cetz-manual      time:   [2.6282 ms 2.7033 ms 2.7865 ms]
Found 11 outliers among 100 measurements (11.00%)
  5 (5.00%) high mild
  6 (6.00%) high severe

pretty-undergraduate-math
                        time:   [2.6843 ms 2.8005 ms 2.9342 ms]
Found 10 outliers among 100 measurements (10.00%)
  5 (5.00%) high mild
  5 (5.00%) high severe

pretty-packages/codly   time:   [9.3115 ms 9.7219 ms 10.270 ms]
Found 13 outliers among 100 measurements (13.00%)
  5 (5.00%) high mild
  8 (8.00%) high severe

pretty-packages/fletcher-diagram
                        time:   [2.6072 ms 2.6400 ms 2.6739 ms]

pretty-packages/fletcher-draw
                        time:   [6.2645 ms 6.3296 ms 6.3946 ms]

Compared with std HashMap:

attrs-tablex            time:   [1.4011 ms 1.4216 ms 1.4440 ms]
                        change: [+82.195% +86.714% +90.790%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 3 outliers among 100 measurements (3.00%)
  3 (3.00%) high mild

attrs-cetz-manual       time:   [305.22 µs 316.43 µs 329.72 µs]
                        change: [+118.77% +124.72% +131.43%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 5 outliers among 100 measurements (5.00%)
  2 (2.00%) high mild
  3 (3.00%) high severe

attrs-undergraduate-math
                        time:   [427.62 µs 439.27 µs 452.34 µs]
                        change: [+61.585% +77.187% +90.993%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 10 outliers among 100 measurements (10.00%)
  1 (1.00%) low mild
  5 (5.00%) high mild
  4 (4.00%) high severe

attrs-packages/codly    time:   [744.81 µs 764.46 µs 792.11 µs]
                        change: [+77.806% +81.824% +86.477%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 5 outliers among 100 measurements (5.00%)
  2 (2.00%) high mild
  3 (3.00%) high severe

attrs-packages/fletcher-diagram
                        time:   [193.31 µs 195.71 µs 198.22 µs]
                        change: [+116.63% +125.35% +133.35%] (p = 0.00 < 0.05)
                        Performance has regressed.

attrs-packages/fletcher-draw
                        time:   [469.45 µs 474.21 µs 479.21 µs]
                        change: [+67.398% +70.803% +74.332%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

pretty-tablex           time:   [18.346 ms 19.065 ms 19.879 ms]
                        change: [+10.237% +16.475% +23.116%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 13 outliers among 100 measurements (13.00%)
  3 (3.00%) high mild
  10 (10.00%) high severe

pretty-cetz-manual      time:   [2.7415 ms 2.8517 ms 2.9843 ms]
                        change: [+0.3902% +5.4898% +11.897%] (p = 0.05 < 0.05)
                        Change within noise threshold.
Found 17 outliers among 100 measurements (17.00%)
  6 (6.00%) high mild
  11 (11.00%) high severe

pretty-undergraduate-math
                        time:   [2.8137 ms 2.8480 ms 2.8821 ms]
                        change: [-3.0979% +1.6950% +6.1962%] (p = 0.48 > 0.05)
                        No change in performance detected.

pretty-packages/codly   time:   [10.239 ms 10.345 ms 10.450 ms]
                        change: [+0.6745% +6.4063% +11.239%] (p = 0.01 < 0.05)
                        Change within noise threshold.
Found 5 outliers among 100 measurements (5.00%)
  3 (3.00%) low mild
  2 (2.00%) high mild

pretty-packages/fletcher-diagram
                        time:   [2.8858 ms 2.9260 ms 2.9715 ms]
                        change: [+8.7981% +10.833% +13.203%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high severe

pretty-packages/fletcher-draw
                        time:   [6.7288 ms 6.8184 ms 6.9105 ms]
                        change: [+5.8656% +7.7232% +9.6154%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

We can see FxHashMap brings great efficiency improvement to attribute computation. In the whole process, it seems that the improvement is slight but still worthy.

@QuadnucYard
Copy link
Contributor Author

More benches with touying.

attrs-packages/touying/core
                        time:   [512.64 µs 518.20 µs 523.82 µs]
attrs-packages/touying/utils
                        time:   [262.51 µs 267.10 µs 272.57 µs]
pretty-packages/touying/core
                        time:   [12.837 ms 13.005 ms 13.207 ms]
pretty-packages/touying/utils
                        time:   [5.4145 ms 5.4656 ms 5.5193 ms]

The two files are not so powerful as tablex 😃

@QuadnucYard QuadnucYard marked this pull request as ready for review November 25, 2024 12:15
@QuadnucYard
Copy link
Contributor Author

Some bench results before this change:

pretty-undergraduate-math
                        time:   [39.546 ms 39.921 ms 40.308 ms]

pretty-cetz-manual      time:   [16.950 ms 17.194 ms 17.442 ms]

pretty-tablex           time:   [516.00 ms 521.13 ms 526.83 ms]

pretty-codly            time:   [186.91 ms 188.12 ms 189.31 ms]

Copy link
Owner

@Enter-tainer Enter-tainer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@Enter-tainer Enter-tainer merged commit 5231058 into Enter-tainer:master Nov 25, 2024
11 checks passed
@QuadnucYard QuadnucYard deleted the perf-hashmap branch November 25, 2024 13:10
@QuadnucYard
Copy link
Contributor Author

Running benchmark:

cargo bench  # run all benchmarks
cargo bench -- pretty  # run benchmarks of pretty-print only

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants