-
Notifications
You must be signed in to change notification settings - Fork 133
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Sparse index: delete ignored files outside sparse cone #1009
Sparse index: delete ignored files outside sparse cone #1009
Commits on Aug 24, 2021
-
t7519: rewrite sparse index test
The sparse index is tested with the FS Monitor hook and extension since f8fe49e (fsmonitor: integrate with sparse index, 2021-07-14). This test was very fragile because it shared an index across sparse and non-sparse behavior. Since that expansion and contraction could cause the index to lose its FS Monitor bitmap and token, behavior is fragile to changes in 'git sparse-checkout set'. Rewrite the test to use two clones of the original repo: full and sparse. This allows us to also keep the test files (actual, expect, trace2.txt) out of the repos we are testing with 'git status'. Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Configuration menu - View commit details
-
Copy full SHA for c407b2c - Browse repository at this point
Copy the full SHA c407b2cView commit details -
sparse-index: silently return when not using cone-mode patterns
While the sparse-index is only enabled when core.sparseCheckoutCone is also enabled, it is possible for the user to modify the sparse-checkout file manually in a way that does not match cone-mode patterns. In this case, we should refuse to convert an index into a sparse index, since the sparse_checkout_patterns will not be initialized with recursive and parent path hashsets. Also silently return if there are no cache entries, which is a simple case: there are no paths to make sparse! Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Configuration menu - View commit details
-
Copy full SHA for 8660877 - Browse repository at this point
Copy the full SHA 8660877View commit details
Commits on Sep 7, 2021
-
unpack-trees: fix nested sparse-dir search
The iterated search in find_cache_entry() was recently modified to include a loop that searches backwards for a sparse directory entry that matches the given traverse_info and name_entry. However, the string comparison failed to actually concatenate those two strings, so this failed to find a sparse directory when it was not a top-level directory. This caused some errors in rare cases where a 'git checkout' spanned a diff that modified files within the sparse directory entry, but we could not correctly find the entry. Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Helped-by: René Scharfe <l.s.r@web.de> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Configuration menu - View commit details
-
Copy full SHA for edb00d3 - Browse repository at this point
Copy the full SHA edb00d3View commit details -
sparse-index: silently return when cache tree fails
If cache_tree_update() returns a non-zero value, then it could not create the cache tree. This is likely due to a path having a merge conflict. Since we are already returning early, let's return silently to avoid making it seem like we failed to write the index at all. If we remove our dependence on the cache tree within convert_to_sparse(), then we could still recover from this scenario and have a sparse index. Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Configuration menu - View commit details
-
Copy full SHA for c8620de - Browse repository at this point
Copy the full SHA c8620deView commit details -
sparse-index: use WRITE_TREE_MISSING_OK
When updating the cache tree in convert_to_sparse(), the WRITE_TREE_MISSING_OK flag indicates that trees might be computed that do not already exist within the object database. This happens in cases such as 'git add' creating new trees that it wants to store in anticipation of a following 'git commit'. If this flag is not specified, then it might trigger a promisor fetch or a failure due to the object not existing locally. Use WRITE_TREE_MISSING_OK during convert_to_sparse() to avoid these possible reasons for the cache_tree_update() to fail. Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Configuration menu - View commit details
-
Copy full SHA for 3717161 - Browse repository at this point
Copy the full SHA 3717161View commit details -
sparse-checkout: create helper methods
As we integrate the sparse index into more builtins, we occasionally need to check the sparse-checkout patterns to see if a path is within the sparse-checkout cone. Create some helper methods that help initialize the patterns and check for pattern matching to make this easier. The existing callers of commands like get_sparse_checkout_patterns() use a custom 'struct pattern_list' that is not necessarily the one in the 'struct index_state', so there are not many previous uses that could adopt these helpers. There are just two in builtin/add.c and sparse-index.c that can use path_in_sparse_checkout(). We add a path_in_cone_mode_sparse_checkout() as well that will only return false if the path is outside of the sparse-checkout definition _and_ the sparse-checkout patterns are in cone mode. Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Configuration menu - View commit details
-
Copy full SHA for 98b4cae - Browse repository at this point
Copy the full SHA 98b4caeView commit details -
attr: be careful about sparse directories
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Configuration menu - View commit details
-
Copy full SHA for 6ec3cb2 - Browse repository at this point
Copy the full SHA 6ec3cb2View commit details -
sparse-index: add SPARSE_INDEX_MEMORY_ONLY flag
The convert_to_sparse() method checks for the GIT_TEST_SPARSE_INDEX environment variable or the "index.sparse" config setting before converting the index to a sparse one. This is for ease of use since all current consumers are preparing to compress the index before writing it to disk. If these settings are not enabled, then convert_to_sparse() silently returns without doing anything. We will add a consumer in the next change that wants to use the sparse index as an in-memory data structure, regardless of whether the on-disk format should be sparse. To that end, create the SPARSE_INDEX_MEMORY_ONLY flag that will skip these config checks when enabled. All current consumers are modified to pass '0' in the new 'flags' parameter. Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Configuration menu - View commit details
-
Copy full SHA for d57f48c - Browse repository at this point
Copy the full SHA d57f48cView commit details -
sparse-checkout: clear tracked sparse dirs
When changing the scope of a sparse-checkout using cone mode, we might have some tracked directories go out of scope. The current logic removes the tracked files from within those directories, but leaves the ignored files within those directories. This is a bit unexpected to users who have given input to Git saying they don't need those directories anymore. This is something that is new to the cone mode pattern type: the user has explicitly said "I want these directories and _not_ those directories." The typical sparse-checkout patterns more generally apply to "I want files with with these patterns" so it is natural to leave ignored files as they are. This focus on directories in cone mode provides us an opportunity to change the behavior. Leaving these ignored files in the sparse directories makes it impossible to gain performance benefits in the sparse index. When we track into these directories, we need to know if the files are ignored or not, which might depend on the _tracked_ .gitignore file(s) within the sparse directory. This depends on the indexed version of the file, so the sparse directory must be expanded. We must take special care to look for untracked, non-ignored files in these directories before deleting them. We do not want to delete any meaningful work that the users were doing in those directories and perhaps forgot to add and commit before switching sparse-checkout definitions. Since those untracked files might be code files that generated ignored build output, also do not delete any ignored files from these directories in that case. The users can recover their state by resetting their sparse-checkout definition to include that directory and continue. Alternatively, they can see the warning that is presented and delete the directory themselves to regain the performance they expect. By deleting the sparse directories when changing scope (or running 'git sparse-checkout reapply') we regain these performance benefits as if the repository was in a clean state. Since these ignored files are frequently build output or helper files from IDEs, the users should not need the files now that the tracked files are removed. If the tracked files reappear, then they will have newer timestamps than the build artifacts, so the artifacts will need to be regenerated anyway. Use the sparse-index as a data structure in order to find the sparse directories that can be safely deleted. Re-expand the index to a full one if it was full before. Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Configuration menu - View commit details
-
Copy full SHA for 91b53f2 - Browse repository at this point
Copy the full SHA 91b53f2View commit details