Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

General performance improvements #8

Open
wants to merge 2 commits into
base: master
Choose a base branch
from
Open

General performance improvements #8

wants to merge 2 commits into from

Conversation

ghost
Copy link

@ghost ghost commented Jan 15, 2017

The basic idea is: ns == countNl x == lineIndex (measure ch) and n - ns == lineIndex (measure f) - p - 1

Before:

benchmarking splitAtLine/long 1200
time                 82.20 μs   (81.81 μs .. 82.59 μs)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 82.33 μs   (82.04 μs .. 82.67 μs)
std dev              1.052 μs   (879.3 ns .. 1.291 μs)

After:

benchmarking splitAtLine/long 1200
time                 76.08 μs   (75.58 μs .. 76.51 μs)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 75.41 μs   (75.04 μs .. 75.85 μs)
std dev              1.362 μs   (1.171 μs .. 1.654 μs)
variance introduced by outliers: 13% (moderately inflated)

@ghost
Copy link
Author

ghost commented Jan 15, 2017

Benchmarks for folding

Before:

benchmarking foldCount/long 1200
time                 500.3 μs   (498.6 μs .. 502.4 μs)
                     1.000 R²   (0.999 R² .. 1.000 R²)
mean                 503.8 μs   (500.6 μs .. 508.0 μs)
std dev              12.75 μs   (9.952 μs .. 18.13 μs)
variance introduced by outliers: 17% (moderately inflated)

benchmarking foldId/long 1200
time                 4.336 s    (4.281 s .. 4.386 s)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 4.360 s    (4.348 s .. 4.367 s)
std dev              11.29 ms   (0.0 s .. 12.86 ms)
variance introduced by outliers: 19% (moderately inflated)

benchmarking foldReverse/long 1200
time                 5.072 s    (4.600 s .. 5.292 s)
                     0.999 R²   (0.998 R² .. 1.000 R²)
mean                 5.189 s    (5.084 s .. 5.240 s)
std dev              91.04 ms   (0.0 s .. 93.10 ms)
variance introduced by outliers: 19% (moderately inflated)

After:

benchmarking foldCount/long 1200
time                 468.4 μs   (463.8 μs .. 475.6 μs)
                     0.999 R²   (0.998 R² .. 1.000 R²)
mean                 467.6 μs   (465.8 μs .. 470.0 μs)
std dev              7.297 μs   (5.419 μs .. 11.26 μs)

benchmarking foldId/long 1200
time                 3.914 s    (3.768 s .. 4.080 s)
                     1.000 R²   (0.999 R² .. 1.000 R²)
mean                 3.886 s    (3.853 s .. 3.900 s)
std dev              29.55 ms   (0.0 s .. 32.84 ms)
variance introduced by outliers: 19% (moderately inflated)

benchmarking foldReverse/long 1200
time                 4.889 s    (4.847 s .. 4.927 s)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 4.900 s    (4.889 s .. 4.906 s)
std dev              9.898 ms   (0.0 s .. 11.07 ms)
variance introduced by outliers: 19% (moderately inflated)

@ghost
Copy link
Author

ghost commented Jan 15, 2017

The motivation for lines should be obvious with this:

benchmarking lines/long 1200
time                 164.9 μs   (164.5 μs .. 165.7 μs)
                     1.000 R²   (0.999 R² .. 1.000 R²)
mean                 167.6 μs   (166.4 μs .. 169.6 μs)
std dev              5.137 μs   (3.668 μs .. 7.423 μs)
variance introduced by outliers: 27% (moderately inflated)

benchmarking lines'/long 1200
time                 12.29 ms   (12.10 ms .. 12.58 ms)
                     0.998 R²   (0.995 R² .. 0.999 R²)
mean                 12.03 ms   (11.89 ms .. 12.17 ms)
std dev              345.9 μs   (262.0 μs .. 458.8 μs)

@ghost ghost changed the title Avoid unnecessary linecount in splitAtLine WIP: General performance improvements Jan 15, 2017
src/Yi/Rope.hs Outdated
go x = case viewl x of
EmptyL -> False
Chunk _ t :< ts -> TX.any p t || go ts
any p = Prelude.any (==True) . fmap (TX.any p) . toTextList
Copy link
Member

@ethercrow ethercrow Jan 15, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any (== True) is or

src/Yi/Rope.hs Outdated
go x = case viewl x of
EmptyL -> True
Chunk _ t :< ts -> TX.all p t && go ts
all p = Prelude.all (==True) . fmap (TX.all p) . toTextList
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

all (== True) is and

@ghost
Copy link
Author

ghost commented Jan 15, 2017

I think that's about everything. I removed the folding patch, because folding over small tress takes longer, and most users will probably be using small text files anyway. There should be an order of magnitude improvement in lines and lines', and everything else should remain about the same.

@ghost ghost changed the title WIP: General performance improvements General performance improvements Jan 15, 2017
@ethercrow
Copy link
Member

I've tried to run the benchmarks on my machine.

lines' before

benchmarking lines'/long 1200
time                 13.12 ms   (12.85 ms .. 13.40 ms)
                     0.997 R²   (0.994 R² .. 0.999 R²)
mean                 13.15 ms   (12.96 ms .. 13.32 ms)
std dev              467.1 μs   (385.0 μs .. 580.4 μs)
variance introduced by outliers: 11% (moderately inflated)

benchmarking lines'/wide 1200
time                 203.4 μs   (200.0 μs .. 206.6 μs)
                     0.998 R²   (0.997 R² .. 0.999 R²)
mean                 203.3 μs   (200.6 μs .. 206.0 μs)
std dev              9.382 μs   (7.785 μs .. 11.53 μs)
variance introduced by outliers: 45% (moderately inflated)

benchmarking lines'/short 1200
time                 3.303 μs   (3.261 μs .. 3.348 μs)
                     0.999 R²   (0.998 R² .. 0.999 R²)
mean                 3.279 μs   (3.239 μs .. 3.320 μs)
std dev              131.0 ns   (109.2 ns .. 163.8 ns)
variance introduced by outliers: 52% (severely inflated)

benchmarking lines'/tiny 1200
time                 112.0 ns   (109.5 ns .. 114.3 ns)
                     0.998 R²   (0.997 R² .. 0.999 R²)
mean                 110.3 ns   (108.8 ns .. 111.8 ns)
std dev              5.329 ns   (4.519 ns .. 6.495 ns)
variance introduced by outliers: 69% (severely inflated)

lines' after

benchmarking lines'/long 1200
time                 539.5 μs   (529.6 μs .. 549.5 μs)
                     0.997 R²   (0.996 R² .. 0.998 R²)
mean                 536.6 μs   (527.6 μs .. 545.8 μs)
std dev              29.39 μs   (24.15 μs .. 34.74 μs)
variance introduced by outliers: 48% (moderately inflated)

benchmarking lines'/wide 1200
time                 413.2 μs   (399.4 μs .. 424.1 μs)
                     0.995 R²   (0.993 R² .. 0.997 R²)
mean                 404.6 μs   (397.6 μs .. 411.4 μs)
std dev              23.16 μs   (19.38 μs .. 28.29 μs)
variance introduced by outliers: 52% (severely inflated)

benchmarking lines'/short 1200
time                 1.489 μs   (1.464 μs .. 1.512 μs)
                     0.998 R²   (0.997 R² .. 0.999 R²)
mean                 1.511 μs   (1.487 μs .. 1.535 μs)
std dev              83.93 ns   (69.78 ns .. 102.7 ns)
variance introduced by outliers: 70% (severely inflated)

benchmarking lines'/tiny 1200
time                 86.99 ns   (85.23 ns .. 88.76 ns)
                     0.997 R²   (0.996 R² .. 0.998 R²)
mean                 86.19 ns   (84.42 ns .. 87.87 ns)
std dev              5.633 ns   (4.807 ns .. 6.571 ns)
variance introduced by outliers: 81% (severely inflated)

@ethercrow
Copy link
Member

splitAtLine before

benchmarking splitAtLine/long 1200
time                 75.64 μs   (74.29 μs .. 77.32 μs)
                     0.997 R²   (0.997 R² .. 0.998 R²)
mean                 76.61 μs   (75.65 μs .. 77.63 μs)
std dev              3.257 μs   (2.811 μs .. 3.974 μs)
variance introduced by outliers: 45% (moderately inflated)

benchmarking splitAtLine/wide 1200
time                 31.19 μs   (30.81 μs .. 31.64 μs)
                     0.998 R²   (0.997 R² .. 0.999 R²)
mean                 30.91 μs   (30.60 μs .. 31.32 μs)
std dev              1.193 μs   (949.2 ns .. 1.701 μs)
variance introduced by outliers: 44% (moderately inflated)

benchmarking splitAtLine/short 1200
time                 74.45 μs   (73.43 μs .. 75.37 μs)
                     0.998 R²   (0.997 R² .. 0.999 R²)
mean                 73.92 μs   (72.89 μs .. 74.85 μs)
std dev              3.387 μs   (2.839 μs .. 4.136 μs)
variance introduced by outliers: 49% (moderately inflated)

benchmarking splitAtLine/tiny 1200
time                 71.72 μs   (70.58 μs .. 73.06 μs)
                     0.997 R²   (0.996 R² .. 0.999 R²)
mean                 72.46 μs   (71.52 μs .. 73.47 μs)
std dev              3.273 μs   (2.760 μs .. 4.074 μs)
variance introduced by outliers: 48% (moderately inflated)

splitAtLine after

benchmarking splitAtLine/long 1200
time                 87.69 μs   (86.19 μs .. 89.15 μs)
                     0.998 R²   (0.997 R² .. 0.999 R²)
mean                 89.88 μs   (88.57 μs .. 91.28 μs)
std dev              4.584 μs   (3.764 μs .. 5.755 μs)
variance introduced by outliers: 54% (severely inflated)

benchmarking splitAtLine/wide 1200
time                 42.09 μs   (41.54 μs .. 42.74 μs)
                     0.998 R²   (0.997 R² .. 0.999 R²)
mean                 42.96 μs   (42.19 μs .. 43.90 μs)
std dev              2.863 μs   (2.232 μs .. 3.787 μs)
variance introduced by outliers: 69% (severely inflated)

benchmarking splitAtLine/short 1200
time                 84.99 μs   (83.96 μs .. 86.25 μs)
                     0.998 R²   (0.998 R² .. 0.999 R²)
mean                 86.01 μs   (84.96 μs .. 87.22 μs)
std dev              3.781 μs   (2.874 μs .. 5.199 μs)
variance introduced by outliers: 46% (moderately inflated)

benchmarking splitAtLine/tiny 1200
time                 86.77 μs   (84.82 μs .. 88.66 μs)
                     0.997 R²   (0.995 R² .. 0.999 R²)
mean                 85.24 μs   (83.94 μs .. 86.58 μs)
std dev              4.499 μs   (3.670 μs .. 5.889 μs)
variance introduced by outliers: 55% (severely inflated)

@ethercrow
Copy link
Member

The speedup for lines'/long is impressive but do you know why lines'/wide has regressed?

@ghost
Copy link
Author

ghost commented Jan 17, 2017

do you know why lines'/wide has regressed?

I'm guessing that it has to do with creating the rope at the end (which results in calls to TX.chunksOf). The cost of this operation is O(n). For wide text, n is fairly large, and hence we see it being slower. For everything else, n ~ 1 => O(1) in an amortized sense.

I'm not sure what caused the slowdown in splitAtLine. I can verify the regression on my machine as well, and I don't think it has to do with the change to the function itself. Maybe GHC is missing a potential optimization because of the increase in code size?

I'm hoping I didn't make any mistakes. In the short term, there's a noticeable reduction in lag for the rasa text editor (which is what made me realize this in the first place). In the long run, it should allow us to use lines more liberally in Yi, since running lines on the full buffer isn't an expensive operation anymore (sans wide text).

@ethercrow
Copy link
Member

Could you extract the lines' patch in a separate PR so we can release it for Rasa and then tackle the rest?

@ChrisPenner
Copy link
Member

Thanks for putting time into this! I'm also looking into ways to cache some of the information we use lines' for in Rasa so as to further improve performance; but Rasa's referencing scheme is based around Rows and Columns so anything which improves indexing/splitting on lines is helpful.

In particular it would be useful to either index into specific lines in constant time or to be able to determine the number of characters within a given range/within a given row. I'm considering caching this on a per buffer level in Rasa. Since we recently limited the text-interface to range operations it should be plausible to keep an updated cache mapping line-number to line-length. That's besides this issue though; just figured I'd keep you informed.

@ethercrow
Copy link
Member

This has bitrotten a bit, please update.

@ghost
Copy link
Author

ghost commented Sep 14, 2017

This has bitrotten a bit, please update.

Rebased.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants