Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

blame: enable and test the sparse index #431

Merged
merged 2 commits into from
Sep 27, 2021

Conversation

ldennington
Copy link

@ldennington ldennington commented Sep 22, 2021

Enable the sparse index for the 'git blame' command. The index was already
not expanded with this command, so the most interesting thing to do is
to add tests that verify that 'git blame' behaves correctly when the sparse
index is enabled and that its performance improves. More specifically, these
cases are...

  1. The index is not expanded for blame when given paths in the sparse
    checkout cone at multiple levels.
  2. Performance measurably improves for blame with sparse index when
    given paths in the sparse checkout cone at multiple levels.

The following are the performance results comparing the branch with
sparse index enabled for blame with the vfs-2.33.0 branch in the
microsoft/git repository:

Test                                       msft/vfs-2.33.0   sparse-index-blame
-----------------------------------------------------------------------------------
2000.2: git blame f2/f4/a (full-v3)        0.31(0.21+0.09)   0.32(0.21+0.10) +3.2%
2000.3: git blame f2/f4/a (full-v4)        0.29(0.19+0.08)   0.31(0.21+0.08) +6.9%
2000.4: git blame f2/f4/a (sparse-v3)      0.55(0.38+0.14)   0.23(0.14+0.07) -58.2%
2000.5: git blame f2/f4/a (sparse-v4)      0.57(0.40+0.16)   0.23(0.15+0.07) -59.6%
2000.6: git blame f2/f4/f3/a (full-v3)     0.77(0.56+0.20)   0.85(0.60+0.22) +10.4%
2000.7: git blame f2/f4/f3/a (full-v4)     0.78(0.56+0.20)   0.81(0.59+0.21) +3.8%
2000.8: git blame f2/f4/f3/a (sparse-v3)   1.07(0.78+0.28)   0.72(0.52+0.19) -32.7%
2000.9: git blame f2/f4/f3/a (sparse-v4)   1.05(0.78+0.26)   0.73(0.51+0.19) -30.5%

We do not include paths outside the sparse checkout cone because blame
currently does not support blaming files outside of the sparse
definition.

@derrickstolee
Copy link
Collaborator

derrickstolee commented Sep 23, 2021

Test                                    f2bf7293c6     
-------------------------------------------------------
2000.2: git blame f2/f4/a (full-v3)     0.27(0.19+0.07)
2000.3: git blame f2/f4/a (full-v4)     0.27(0.18+0.07)
2000.4: git blame f2/f4/a (sparse-v3)   0.53(0.38+0.14)
2000.5: git blame f2/f4/a (sparse-v4)   0.53(0.37+0.15)

On Linux, I get these numbers:

Test                                    ms/vfs-2.33.0     sparse-index-blame    
--------------------------------------------------------------------------------
2000.2: git blame f2/f4/a (full-v3)     0.13(0.11+0.02)   0.13(0.10+0.03) +0.0% 
2000.3: git blame f2/f4/a (full-v4)     0.12(0.09+0.03)   0.12(0.10+0.02) +0.0% 
2000.4: git blame f2/f4/a (sparse-v3)   0.34(0.31+0.03)   0.08(0.06+0.01) -76.5%
2000.5: git blame f2/f4/a (sparse-v4)   0.34(0.31+0.02)   0.08(0.08+0.01) -76.5%

It might be one of those things where macOS is behaving differently. I'm going to deepen the tree and see how that looks.

Edit: Here are the numbers after increasing the depth by one:

$./run ms/vfs-2.33.0 sparse-index-blame -- p2000-sparse-operations.sh
...
Test                                       ms/vfs-2.33.0     sparse-index-blame    
-----------------------------------------------------------------------------------
2000.2: git blame f2/f4/f3/a (full-v3)     0.48(0.38+0.10)   0.47(0.35+0.12) -2.1% 
2000.3: git blame f2/f4/f3/a (full-v4)     0.48(0.39+0.09)   0.47(0.38+0.08) -2.1% 
2000.4: git blame f2/f4/f3/a (sparse-v3)   1.26(1.17+0.09)   0.28(0.25+0.03) -77.8%
2000.5: git blame f2/f4/f3/a (sparse-v4)   1.27(1.11+0.15)   0.28(0.25+0.03) -78.0%

@ldennington
Copy link
Author

Test                                    f2bf7293c6     
-------------------------------------------------------
2000.2: git blame f2/f4/a (full-v3)     0.27(0.19+0.07)
2000.3: git blame f2/f4/a (full-v4)     0.27(0.18+0.07)
2000.4: git blame f2/f4/a (sparse-v3)   0.53(0.38+0.14)
2000.5: git blame f2/f4/a (sparse-v4)   0.53(0.37+0.15)

On Linux, I get these numbers:

Test                                    ms/vfs-2.33.0     sparse-index-blame    
--------------------------------------------------------------------------------
2000.2: git blame f2/f4/a (full-v3)     0.13(0.11+0.02)   0.13(0.10+0.03) +0.0% 
2000.3: git blame f2/f4/a (full-v4)     0.12(0.09+0.03)   0.12(0.10+0.02) +0.0% 
2000.4: git blame f2/f4/a (sparse-v3)   0.34(0.31+0.03)   0.08(0.06+0.01) -76.5%
2000.5: git blame f2/f4/a (sparse-v4)   0.34(0.31+0.02)   0.08(0.08+0.01) -76.5%

It might be one of those things where macOS is behaving differently. I'm going to deepen the tree and see how that looks.

Edit: Here are the numbers after increasing the depth by one:

$./run ms/vfs-2.33.0 sparse-index-blame -- p2000-sparse-operations.sh
...
Test                                       ms/vfs-2.33.0     sparse-index-blame    
-----------------------------------------------------------------------------------
2000.2: git blame f2/f4/f3/a (full-v3)     0.48(0.38+0.10)   0.47(0.35+0.12) -2.1% 
2000.3: git blame f2/f4/f3/a (full-v4)     0.48(0.39+0.09)   0.47(0.38+0.08) -2.1% 
2000.4: git blame f2/f4/f3/a (sparse-v3)   1.26(1.17+0.09)   0.28(0.25+0.03) -77.8%
2000.5: git blame f2/f4/f3/a (sparse-v4)   1.27(1.11+0.15)   0.28(0.25+0.03) -78.0%

Ah this is very good to see! Let me try increasing the depth by 1 and doing the same comparison between ms/vfs-2.33.0 and my branch.

@ldennington
Copy link
Author

Test                                    f2bf7293c6     
-------------------------------------------------------
2000.2: git blame f2/f4/a (full-v3)     0.27(0.19+0.07)
2000.3: git blame f2/f4/a (full-v4)     0.27(0.18+0.07)
2000.4: git blame f2/f4/a (sparse-v3)   0.53(0.38+0.14)
2000.5: git blame f2/f4/a (sparse-v4)   0.53(0.37+0.15)

On Linux, I get these numbers:

Test                                    ms/vfs-2.33.0     sparse-index-blame    
--------------------------------------------------------------------------------
2000.2: git blame f2/f4/a (full-v3)     0.13(0.11+0.02)   0.13(0.10+0.03) +0.0% 
2000.3: git blame f2/f4/a (full-v4)     0.12(0.09+0.03)   0.12(0.10+0.02) +0.0% 
2000.4: git blame f2/f4/a (sparse-v3)   0.34(0.31+0.03)   0.08(0.06+0.01) -76.5%
2000.5: git blame f2/f4/a (sparse-v4)   0.34(0.31+0.02)   0.08(0.08+0.01) -76.5%

It might be one of those things where macOS is behaving differently. I'm going to deepen the tree and see how that looks.
Edit: Here are the numbers after increasing the depth by one:

$./run ms/vfs-2.33.0 sparse-index-blame -- p2000-sparse-operations.sh
...
Test                                       ms/vfs-2.33.0     sparse-index-blame    
-----------------------------------------------------------------------------------
2000.2: git blame f2/f4/f3/a (full-v3)     0.48(0.38+0.10)   0.47(0.35+0.12) -2.1% 
2000.3: git blame f2/f4/f3/a (full-v4)     0.48(0.39+0.09)   0.47(0.38+0.08) -2.1% 
2000.4: git blame f2/f4/f3/a (sparse-v3)   1.26(1.17+0.09)   0.28(0.25+0.03) -77.8%
2000.5: git blame f2/f4/f3/a (sparse-v4)   1.27(1.11+0.15)   0.28(0.25+0.03) -78.0%

Ah this is very good to see! Let me try increasing the depth by 1 and doing the same comparison between ms/vfs-2.33.0 and my branch.

This is looking a lot better

Test                                       msft/vfs-2.33.0   sparse-index-blame    
-----------------------------------------------------------------------------------
2000.2: git blame f2/f4/f3/a (full-v3)     0.76(0.55+0.19)   0.87(0.63+0.23) +14.5%
2000.3: git blame f2/f4/f3/a (full-v4)     0.75(0.55+0.18)   0.87(0.63+0.22) +16.0%
2000.4: git blame f2/f4/f3/a (sparse-v3)   1.01(0.73+0.26)   0.77(0.55+0.20) -23.8%
2000.5: git blame f2/f4/f3/a (sparse-v4)   0.97(0.70+0.25)   0.72(0.52+0.19) -25.8%

@ldennington
Copy link
Author

Test                                    f2bf7293c6     
-------------------------------------------------------
2000.2: git blame f2/f4/a (full-v3)     0.27(0.19+0.07)
2000.3: git blame f2/f4/a (full-v4)     0.27(0.18+0.07)
2000.4: git blame f2/f4/a (sparse-v3)   0.53(0.38+0.14)
2000.5: git blame f2/f4/a (sparse-v4)   0.53(0.37+0.15)

On Linux, I get these numbers:

Test                                    ms/vfs-2.33.0     sparse-index-blame    
--------------------------------------------------------------------------------
2000.2: git blame f2/f4/a (full-v3)     0.13(0.11+0.02)   0.13(0.10+0.03) +0.0% 
2000.3: git blame f2/f4/a (full-v4)     0.12(0.09+0.03)   0.12(0.10+0.02) +0.0% 
2000.4: git blame f2/f4/a (sparse-v3)   0.34(0.31+0.03)   0.08(0.06+0.01) -76.5%
2000.5: git blame f2/f4/a (sparse-v4)   0.34(0.31+0.02)   0.08(0.08+0.01) -76.5%

It might be one of those things where macOS is behaving differently. I'm going to deepen the tree and see how that looks.
Edit: Here are the numbers after increasing the depth by one:

$./run ms/vfs-2.33.0 sparse-index-blame -- p2000-sparse-operations.sh
...
Test                                       ms/vfs-2.33.0     sparse-index-blame    
-----------------------------------------------------------------------------------
2000.2: git blame f2/f4/f3/a (full-v3)     0.48(0.38+0.10)   0.47(0.35+0.12) -2.1% 
2000.3: git blame f2/f4/f3/a (full-v4)     0.48(0.39+0.09)   0.47(0.38+0.08) -2.1% 
2000.4: git blame f2/f4/f3/a (sparse-v3)   1.26(1.17+0.09)   0.28(0.25+0.03) -77.8%
2000.5: git blame f2/f4/f3/a (sparse-v4)   1.27(1.11+0.15)   0.28(0.25+0.03) -78.0%

Ah this is very good to see! Let me try increasing the depth by 1 and doing the same comparison between ms/vfs-2.33.0 and my branch.

This is looking a lot better

Test                                       msft/vfs-2.33.0   sparse-index-blame    
-----------------------------------------------------------------------------------
2000.2: git blame f2/f4/f3/a (full-v3)     0.76(0.55+0.19)   0.87(0.63+0.23) +14.5%
2000.3: git blame f2/f4/f3/a (full-v4)     0.75(0.55+0.18)   0.87(0.63+0.22) +16.0%
2000.4: git blame f2/f4/f3/a (sparse-v3)   1.01(0.73+0.26)   0.77(0.55+0.20) -23.8%
2000.5: git blame f2/f4/f3/a (sparse-v4)   0.97(0.70+0.25)   0.72(0.52+0.19) -25.8%

Yep, ok, looks like I was just holding it incorrectly

Test                                    msft/vfs-2.33.0   sparse-index-blame    
--------------------------------------------------------------------------------
2000.2: git blame f2/f4/a (full-v3)     0.27(0.18+0.07)   0.29(0.19+0.08) +7.4% 
2000.3: git blame f2/f4/a (full-v4)     0.27(0.18+0.07)   0.29(0.19+0.08) +7.4% 
2000.4: git blame f2/f4/a (sparse-v3)   0.53(0.37+0.14)   0.21(0.13+0.06) -60.4%
2000.5: git blame f2/f4/a (sparse-v4)   0.55(0.37+0.16)   0.21(0.14+0.06) -61.8%

Copy link
Collaborator

@derrickstolee derrickstolee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved with some minor edits recommended.

t/perf/p2000-sparse-operations.sh Outdated Show resolved Hide resolved
t/perf/p2000-sparse-operations.sh Outdated Show resolved Hide resolved
t/perf/p2000-sparse-operations.sh Outdated Show resolved Hide resolved
builtin/blame.c Show resolved Hide resolved
@ldennington ldennington force-pushed the sparse-index-blame branch 2 times, most recently from dbe21fd to c6dca27 Compare September 24, 2021 22:07
Enable the sparse index for the 'git blame' command. The index was already not
expanded with this command, so the most interesting thing to do is to add tests
that verify that 'git blame' behaves correctly when the sparse index is enabled
and that its performance improves. More specifically, these cases are:

1. The index is not expanded for `blame` when given paths in the sparse
checkout cone at multiple levels.

2. Performance measurably improves for
`blame` with sparse index when given paths in the sparse checkout cone at
multiple levels.

The following are the performance results comparing the branch with sparse
index enabled for `blame` with the vfs-2.33.0 branch in the microsoft/git
repository:

```
Test                                       msft/vfs-2.33.0   sparse-index-blame
-----------------------------------------------------------------------------------
2000.62: git blame f2/f4/a (full-v3)        0.31(0.21+0.09)   0.32(0.21+0.10) +3.2%
2000.63: git blame f2/f4/a (full-v4)        0.29(0.19+0.08)   0.31(0.21+0.08) +6.9%
2000.64: git blame f2/f4/a (sparse-v3)      0.55(0.38+0.14)   0.23(0.14+0.07) -58.2%
2000.65: git blame f2/f4/a (sparse-v4)      0.57(0.40+0.16)   0.23(0.15+0.07) -59.6%
2000.66: git blame f2/f4/f3/a (full-v3)     0.77(0.56+0.20)   0.85(0.60+0.22) +10.4%
2000.67: git blame f2/f4/f3/a (full-v4)     0.78(0.56+0.20)   0.81(0.59+0.21) +3.8%
2000.68: git blame f2/f4/f3/a (sparse-v3)   1.07(0.78+0.28)   0.72(0.52+0.19) -32.7%
2000.99: git blame f2/f4/f3/a (sparse-v4)   1.05(0.78+0.26)   0.73(0.51+0.19) -30.5%
```

We do not include paths outside the sparse checkout cone because blame
currently does not support blaming files outside of the sparse definition.
Attempting to do so fails with the following error:

fatal: no such path '<path outside sparse definition>' in HEAD

Signed-off-by: Lessley Dennington <lessleydennington@gmail.com>
Copy link
Collaborator

@derrickstolee derrickstolee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I re-ran the Scalar functional tests because it's likely you hit a known FS Monitor flake.

builtin/blame.c Show resolved Hide resolved
t/t1092-sparse-checkout-compatibility.sh Outdated Show resolved Hide resolved
t/t1092-sparse-checkout-compatibility.sh Outdated Show resolved Hide resolved
t/t1092-sparse-checkout-compatibility.sh Show resolved Hide resolved
@ldennington ldennington force-pushed the sparse-index-blame branch 4 times, most recently from c10d236 to fb66670 Compare September 27, 2021 16:47
Making the 'blame with pathspec outside sparse definition' test more robust.
Previously, this test just looked for a generic "failure" and didn't verify
that sparse index and non-sparse index provide the same error. We now fail only
when the error messages from sparse index and non-sparse index don't match (and
still document the behavior in the test).

Signed-off-by: Lessley Dennington <lessleydennington@gmail.com>
Copy link

@vdye vdye left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work - especially the extra-thorough performance testing!

@ldennington ldennington merged commit 3011dc5 into microsoft:vfs-2.33.0 Sep 27, 2021
dscho pushed a commit that referenced this pull request Oct 30, 2021
blame: enable and test the sparse index
derrickstolee pushed a commit that referenced this pull request Oct 30, 2021
blame: enable and test the sparse index
derrickstolee pushed a commit that referenced this pull request Oct 31, 2021
blame: enable and test the sparse index
derrickstolee pushed a commit that referenced this pull request Nov 4, 2021
blame: enable and test the sparse index
derrickstolee pushed a commit that referenced this pull request Nov 4, 2021
blame: enable and test the sparse index
derrickstolee pushed a commit that referenced this pull request Nov 10, 2021
blame: enable and test the sparse index
derrickstolee pushed a commit that referenced this pull request Nov 15, 2021
blame: enable and test the sparse index
ldennington added a commit to ldennington/git that referenced this pull request Jan 12, 2022
ldennington added a commit to ldennington/git that referenced this pull request Jan 19, 2022
ldennington added a commit to ldennington/git that referenced this pull request Jan 20, 2022
ldennington added a commit to ldennington/git that referenced this pull request Jan 20, 2022
ldennington added a commit to ldennington/git that referenced this pull request Jan 20, 2022
ldennington added a commit to ldennington/git that referenced this pull request Jan 20, 2022
ldennington added a commit to ldennington/git that referenced this pull request Jan 20, 2022
ldennington added a commit to ldennington/git that referenced this pull request Jan 20, 2022
ldennington added a commit to ldennington/git that referenced this pull request Jan 25, 2022
ldennington added a commit to ldennington/git that referenced this pull request Jan 25, 2022
dscho pushed a commit that referenced this pull request Feb 1, 2022
blame: enable and test the sparse index
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants