Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf(blame): general improvements #878

Merged
merged 4 commits into from
Sep 22, 2023
Merged

perf(blame): general improvements #878

merged 4 commits into from
Sep 22, 2023

Conversation

lewis6991
Copy link
Owner

@lewis6991 lewis6991 commented Sep 12, 2023

fix(blame): do not run concurrent processes

Limits the amount of git-blame processes in the background to 1

perf(blame): run blame for entire file instead of per line

Previously current_line_blame would run a git-blame process per line
(via the -L flag) in an attempt to be more efficient. However after
some investigation it seems that running git-blame for the entire file
rarely exceeds 2x the time it takes to run for a single line, even for
large files.

This change alters current_line_blame to run git-blame for the entire
file after each buffer edit and caches that in memory. This makes the
first git-blame after an edit ~2x slower, but makes any cursor movements
after that instant.

A follow-up to this would be for current_line_blame to track buffer
updates to avoid the cache needing to be invalidated on every edit.

Fixes #877

@WhyNotHugo
Copy link
Contributor

This keeps a single CPU at 100% while scrolling around (which is an improvement over keeping all CPUs at 100%). However, after completing a motion, it can take several seconds for a blame to show up (I assume that this is because the previous execution needs to exit before another instance of git blame is executed, and then we still need to wait for its output).

@lewis6991
Copy link
Owner Author

lewis6991 commented Sep 20, 2023

Yes that's exactly right.

I can't do anything about 100% CPU utilisation as we can't make git magically more efficient.

Note though that the results are cached.

@WhyNotHugo
Copy link
Contributor

Fetching a single line: 9.43s user 2.45s system 99% cpu 11.888 total per line.

Fetching the entire file: 15.91s user 3.67s system 99% cpu 19.598 total.

I think that fetching the entire file and keeping that in memory pays of as soon as the cursor switches line at least once.

That said, there are other considerations to take into account. Mostly, that changes would need to be tracked to understand when the in-memory blame data is stale.

I definitely think that it's an option worth exploring, but I also understand that it might be a huge scope growth for the issue at hand (and for this plugin in general).

@WhyNotHugo
Copy link
Contributor

This is definitely an improvement, thanks. My computer fans no longer jump to the maximum when I scroll through a file in a large repository.

@lewis6991
Copy link
Owner Author

lewis6991 commented Sep 20, 2023

Thanks for looking into this. If it's only 2x slowdown then just blaming the whole file will probably be worth it like you said.

Currently we invalidate the cache on any buffer change so we shouldn't need to track anything... for now, but tracking the changes would be a nice future improvement.

I'll run a couple of benchmarks my end and look into changing the PR to blaming the whole file.

@lewis6991 lewis6991 force-pushed the fix/blameperf branch 4 times, most recently from 2845a84 to 05cb146 Compare September 22, 2023 09:14
@lewis6991 lewis6991 changed the title fix(blame): do not run concurrent processes perf(blame): general improvements Sep 22, 2023
@lewis6991 lewis6991 force-pushed the fix/blameperf branch 3 times, most recently from e6e6138 to 30c57f0 Compare September 22, 2023 11:07
Previously current_line_blame would run a git-blame process per line
(via the `-L` flag) in an attempt to be more efficient. However after
some investigation it seems that running git-blame for the entire file
rarely exceeds 2x the time it takes to run for a single line, even for
large files.

This change alters current_line_blame to run git-blame for the entire
file after each buffer edit and caches that in memory. This makes the
first git-blame after an edit ~2x slower, but makes any cursor movements
after that instant.

A follow-up to this would be for current_line_blame to track buffer
updates to avoid the cache needing to be invalidated on every edit.
@lewis6991 lewis6991 merged commit b018a2b into main Sep 22, 2023
4 checks passed
@lewis6991 lewis6991 deleted the fix/blameperf branch September 22, 2023 11:24
@WhyNotHugo
Copy link
Contributor

Thanks again for this! The different in large repositories in incredible: my fans are a lot less noisy and the blame information updates immediately when scrolling around!

@lewis6991
Copy link
Owner Author

No problem happy to help!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

current_line_blame is too CPU intensive
2 participants