Improve `next_cache_entry` cache traversal performance #429
Conversation
I think this is a very promising direction, especially in the short term.
This is great!
I was worried that the changes here could be short-term and would want to be removed in the future, but it seems like adding this "hint memory" to `next_cache_entry()` is a good thing to do anyway. It avoids quadratic growth in the algorithm!
Just a couple nits.
```diff
 	int hint = -1;
 	while (1) {
 		int cmp;
 		struct cache_entry *ce;

 		if (o->diff_index_cached)
-			ce = next_cache_entry(o);
+			ce = next_cache_entry(o, &hint);
```
Now that I look closely at your implementation, this old code had quadratic complexity! That helps me understand why having ~70,000 index entries still took 5 seconds to iterate.
I suppose I was expecting `hint` to be initialized with some information other than `-1`, but this seems to be a good idea regardless of any issues with the sparse index.
Intended to test handling of cache entries outside `traverse_trees` (accomplished here by having the pathspec match no cache entries). Signed-off-by: Victoria Dye <vdye@github.com>
To find the first non-unpacked cache entry, `next_cache_entry` iterates through the index, starting at `cache_bottom`. The performance of this in full indexes is helped by `cache_bottom` advancing with each invocation of `mark_ce_used` (called by `unpack_index_entry`). However, the presence of sparse directories can prevent `cache_bottom` from advancing in a sparse index case, effectively forcing `next_cache_entry` to search from the beginning of the index each time it is called.

The need to preserve `cache_bottom` for the sparse index is documented in 17a1bb5, so to get the benefit it provides in "shortcutting" already-searched entries, a separate `hint` position is used. The hint position is set inside of `next_cache_entry` to `last_searched_position + 1`, allowing full _and_ sparse index iterations to skip already-searched entries.

The performance is significantly improved for the sparse index case, based on the `p2000` results for a `git reset` with a non-matching pathspec (heavily using `next_cache_entry`):

```
Test          ms/vfs-2.33.0     HEAD
------------------------------------------------------
(full-v3)     0.79(0.38+0.30)   0.91(0.43+0.34) +15.2%
(full-v4)     0.80(0.38+0.29)   0.85(0.40+0.35)  +6.2%
(sparse-v3)   0.76(0.43+0.69)   0.44(0.08+0.67) -42.1%
(sparse-v4)   0.71(0.40+0.65)   0.41(0.09+0.65) -42.3%
```

Signed-off-by: Victoria Dye <vdye@github.com>
```c
{
	const struct index_state *index = o->src_index;
	int pos = o->cache_bottom;

	if (*hint > pos)
		pos = *hint;

	while (pos < index->cache_nr) {
```
Sorry for the retroactive review comments - if it would be better to discuss on a call, let me know!
I see `index->cache_nr` frequently but am having some trouble understanding it/finding docs on what it means (I think you mentioned "cache number," but I'm still not sure why that's significant).
`cache_nr` is the number of entries in the `cache` array.
In the Git codebase, you'll see a lot of `X_nr` and `X_alloc`, representing that there are `X_nr` elements in an array currently allocated to have `X_alloc` entries.
Thanks!
```diff
@@ -665,17 +665,24 @@ static void mark_ce_used_same_name(struct cache_entry *ce,
 	}
 }

-static struct cache_entry *next_cache_entry(struct unpack_trees_options *o)
+static struct cache_entry *next_cache_entry(struct unpack_trees_options *o, int *hint)
 {
 	const struct index_state *index = o->src_index;
 	int pos = o->cache_bottom;
```
I know `cache_bottom` is critical to this change, but I'm a little unsure what it means?
As we scan the index, we are also drilling down the trees, so the cache-tree extension (if available) is giving us some idea of the range of index entries corresponding to a given tree. This can be computed dynamically, too. The `cache_bottom` value is storing the expected lower end of this range. As we recurse into subtrees, the `cache_bottom` is stored in a stack variable somewhere and then the `o->cache_bottom` value is rewritten as the recursion ends.
In general: the `cache_bottom` thing is necessary to make these index scans work properly, but it is complicated because the index is a flat list of files. It's just one more thing that is added on top to try and make sense of that difference with the trees in the object database.
Improve `next_cache_entry` cache traversal performance
Changes

- Changes `next_cache_entry` to accept a "cached" starting position for index search
- Because `next_cache_entry` searches the `src_index`, which shouldn't be modified in-place over the course of `unpack_trees`, the first search position for the next `next_cache_entry` call can be `last_position + 1`
- This might also help with `git blame` performance, if it's using `unpack_trees` (CC: @ldennington).

Performance
This change ultimately only had an effect on sparse index checkouts, as demonstrated by the performance test results.
The reason appears to be that, in full indexes, `mark_ce_used` will advance `cache_bottom` such that, in `next_cache_entry`, already-checked entries are skipped. In sparse indexes, however, the presence of sparse directory entries will block the `cache_bottom` from advancing, so already-checked cache entries are not skipped in `next_cache_entry`. Since the cache tree still needs `cache_bottom` to not advance (per 17a1bb5), the transient `hint` position lets `next_cache_entry` shortcut already-checked entries even for a sparse index.