Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace merge-tree in more places #5416

Merged
merged 14 commits into from
Nov 5, 2024
Merged

Replace merge-tree in more places #5416

merged 14 commits into from
Nov 5, 2024

Conversation

Byron
Copy link
Collaborator

@Byron Byron commented Nov 3, 2024

The gitoxide merge-tree implementation is faster and better in quality, so it should be used where applicable. This PR attempts to find these places and make the switch.

The goal is also to provide performance metrics, even though it's not expected we will see great numbers here as merge-tree is a smaller portion of the total work to be done.

Follow-up on #5411 .

Tasks

Use merge-tree in…

  • Apply and Unapply branch
  • unapply ownership
  • back to integration
  • Oplog Listing (3x faster, 3,5x for merge-trees)
  • checkout branch trees
  • upstream integration statuses
  • use latest gix for nicer API
  • be sure to switch gix to the one from main

Notes for the Reviewer

  • All uses that deal with resolve_index() have been postponed for now, as it's probably easiest to implement this directly in the gitoxide merge tree function, for achieving the best possible quality. In theory, it might already work doing a post-process on the conflict entries, but that would be harder to get right.
  • There are a couple of other uses in gitbutler-cherry-pick and gitbutler-edit-mode that rely on the index quite a bit, and even checkout an index. But it seems that it doesn't rely on the index to show merge-conflicts.

Follow-Up

Apply and Unapply

Before

When applying, merge-trees takes a whopping 72ms. It's clearly not the bottleneck, with a lot of time being spent in checkout.

Screenshot 2024-11-03 at 14 22 36

When unapplying, it now clocks in at 139ms, so once again isn't going to change anything when replaced. Checking out files takes a long time, comparatively.

Screenshot 2024-11-03 at 14 24 46

After

Like expected, there isn't much of a relevant change, with both merges during unapply and apply clocking in to ~90ms. Yes, it seems a bit faster, but is still far from flexing any muscle.

Screenshot 2024-11-03 at 16 05 34

Unapply Ownership

Before

A tiny portion of 66ms of the 4s runtime is related to merging trees. Since it's the same repository (GitLab) the after should look very similar.

Screenshot 2024-11-03 at 17 38 25

After

It's basically the same at 64ms.

Screenshot 2024-11-03 at 17 38 59''

Set Base Branch / Back to Integration

Before

112ms

Screenshot 2024-11-03 at 19 30 17

After

93ms

Screenshot 2024-11-03 at 19 34 21

Oplog Listing

Before

Listing takes 22s, and one full diff takes 7s, which is duplicated for an additional stats. Merging of trees takes 1.93s.

Screenshot 2024-11-03 at 19 52 45

After replacing merge-trees

Merging trees now take 585ms, 3.3x of what was before.

Screenshot 2024-11-03 at 20 28 31

After using gitoxide for the diffing

The whole operation is down to 7.3s, a 3x improvement! This is also due to not duplicating work.

Screenshot 2024-11-04 at 08 14 31

It's noticeable how the TOML deserialisation takes 2s, which is a lot in comparison to the total runtime. But like I mentioned on Discord, I think there is ways to speed this up by not precomputing everything, instead of parallelising.

Checkout Branch Trees

Before

Merging in total takes about 300ms with two trivial branches. Like noticed before, writing the index to a tree is very expensive.

Screenshot 2024-11-04 at 11 04 06

After

Now the same task takes 140ms for merging and writing the tree. Notably, writing the tree now takes 6ms , 29x faster than having to write a whole index.

Screenshot 2024-11-04 at 12 13 51

Upstream Integration Statuses

Before

As often, most time is spent elsewhere (and duplicated in stack_series()), but that aside we are looking at 1.09s for merging trees in that function.

Screenshot 2024-11-04 at 16 11 10

After

Now it clocks in at 640ms, at 1.7x.

Screenshot 2024-11-04 at 19 21 50

This will help measuring the performance changes during apply and unapply operations.
Copy link

vercel bot commented Nov 3, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
gitbutler-web ✅ Ready (Inspect) Visit Preview 💬 Add feedback Nov 5, 2024 7:06pm

Unfortunately it doesn't seem to be called from the UI anymore, just in tests,
so there isn't any benchmark results.

let mut opts = DiffOptions::new();
opts.include_untracked(true);
opts.ignore_submodules(true);
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

gitoxide has no option to ignore submodules, but it would conflict on them if both change to different values. It's planned to have similar resolution facilities like Git has, but it's not implemented yet.

})
.for_each_to_obtain_tree(&wd_tree, |change| -> Result<_> {
match change {
Change::Addition { location, .. } => {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not 100% sure that this is what was intended in the git2 implementation, or why that is.

In the GItLab repo, I see 464 files changed in a discard hunk snapshot, but that was 4.6k if any location is used. Maybe this is why this 'add-only' filter is applied here. The UI gets quite slow displaying this.

| Change::Modification { .. }
| Change::Rewrite { .. } => {}
}
if let Some(counts) = change
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the expensive operation, of 3s, half goes into calculating line numbers. If performance ever becomes a bigger problem, these diffs should be avoided and be done on demand entirely.

@@ -15,6 +15,21 @@ impl TestingRepository {
pub fn open() -> Self {
let tempdir = tempdir().unwrap();
let repository = git2::Repository::init_opts(tempdir.path(), &init_opts()).unwrap();
// TODO(ST): remove this once `gix::Repository::index_or_load_from_tree_or_empty()`
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

gitoxide will soon get an upgrade so this won't be an issue in future. For now it's easiest to cater to it and avoid an unborn HEAD.

@Byron Byron marked this pull request as ready for review November 4, 2024 19:22
@Byron Byron mentioned this pull request Nov 5, 2024
8 tasks
Copy link
Member

@krlvi krlvi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me! Let's merge it in a couple of hours, when we know that the newly released 0.13.9 is okay :)

@Byron Byron marked this pull request as draft November 5, 2024 18:35
@Byron Byron marked this pull request as ready for review November 5, 2024 19:06
@krlvi krlvi merged commit 7fe5d0c into master Nov 5, 2024
19 checks passed
@krlvi krlvi deleted the tree-merge branch November 5, 2024 21:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
rust Pull requests that update Rust code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants