tree -> index diff for status #1410
Merged
Byron force-pushed the status branch 2 times, most recently from bff7dde to 5c886aa (June 24, 2024 05:20)
Byron force-pushed the status branch 2 times, most recently from af64f8f to eef0fe0 (July 12, 2024 13:06)
Byron force-pushed the status branch 2 times, most recently from 964280d to acd054a (December 28, 2024 08:26)
Add a function to help resume the iterator without holding onto the data directly.
Byron force-pushed the status branch 2 times, most recently from 531179c to 3115ed5 (December 28, 2024 16:24)
A depth-first traversal yields the `.git/index` order. It's a breaking change as the `Visitor` trait gets another way to pop a tracked path, suitable for the stack used for depth-first traversal.
This allows a depth-first traversal with a delegate.
…)` now returns `&mut str`. This helps callers to avoid converting to UTF8 by hand.
This makes the result similar to `git ls-tree` in terms of ordering.
It uses depth-first traversal out of the box which allows it to save the sorting in the end. It's also a little bit faster.
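To illustrate why depth-first traversal saves the final sort, here is a minimal sketch on a toy tree. The `Entry` type and both functions are hypothetical stand-ins and not the `gix-traverse` API; the sample assumes that the entries of each level are already sorted the way git stores them.

```rust
use std::collections::VecDeque;

/// Hypothetical, simplified tree entry. Real git trees also carry modes and object ids.
enum Entry {
    Blob(&'static str),
    Tree(&'static str, Vec<Entry>),
}

/// Depth-first: descend into a subtree as soon as it is encountered.
fn depth_first(entries: &[Entry], prefix: &str, out: &mut Vec<String>) {
    for entry in entries {
        match entry {
            Entry::Blob(name) => out.push(format!("{}{}", prefix, name)),
            Entry::Tree(name, children) => {
                depth_first(children, &format!("{}{}/", prefix, name), out)
            }
        }
    }
}

/// Breadth-first: emit all blobs of one level before visiting any subtree.
fn breadth_first(root: &[Entry]) -> Vec<String> {
    let mut out = Vec::new();
    let mut queue: VecDeque<(String, &[Entry])> = VecDeque::new();
    queue.push_back((String::new(), root));
    while let Some((prefix, entries)) = queue.pop_front() {
        for entry in entries {
            match entry {
                Entry::Blob(name) => out.push(format!("{}{}", prefix, name)),
                Entry::Tree(name, children) => {
                    queue.push_back((format!("{}{}/", prefix, name), children.as_slice()))
                }
            }
        }
    }
    out
}

/// A small sample tree whose levels are already sorted.
fn sample_tree() -> Vec<Entry> {
    vec![
        Entry::Blob("a.txt"),
        Entry::Tree("dir", vec![Entry::Blob("b.txt"), Entry::Blob("c.txt")]),
        Entry::Blob("z.txt"),
    ]
}
```

On this sample, the depth-first walk produces paths that are already sorted by full path, while the breadth-first walk emits `z.txt` before `dir/b.txt` and would need a sort afterwards.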
Byron force-pushed the status branch 5 times, most recently from 28a8e93 to a867322 (December 30, 2024 14:49)
It comes with pathspec support to allow for easier integration into the `status()` machinery.
Byron force-pushed the status branch 2 times, most recently from 46bbae2 to 38e3c50 (December 30, 2024 20:21)
…tree and an index. It also respects `status.rename` and `status.renameLimit` to configure rename tracking.
This completes the `is_dirty()` implementation.
That way one can officially use "section.name" strings or `&Section::NAME`.
Byron force-pushed the status branch 2 times, most recently from 5b8140f to 21baf6f (January 3, 2025 19:05)
…tatus. Note that it is still possible to disable the head-index status. Types moved around, effectively removing the `iter::` module for most of the more general types, i.e. those that are quite generically useful in a status.
This completes the status implementation.
This was referenced Jan 4, 2025
Merged
Based on #1368

diff-correctness → `gix-status` → gix reset

Improve `gix status` to the point where it's suitable for use in `reset` functionality.

Leads to a proper worktree reset implementation, eventually leading to a high-level reset similar to how git supports it.
Architecture

The reason this PR deals quite a bit with `gix status` is that for a safe implementation of `reset()` we need to be sure that the files we would want to touch don't carry modifications and aren't untracked files. In order to know what would need to be done, we have to diff the `current-index with target-index`. The set of files to touch can then be used to look up information provided by `git-status`, like worktree modifications, index modifications, and untracked files, to know if we can proceed or not. Here is also where the reset-modes would affect the outcome, i.e. what to change and how.

This is a very modular approach which facilitates testing and understanding of what otherwise would be a very complex algorithm. Having a set of changes as output also allows us to one day parallelize applying these changes.
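A rough sketch of that two-step check, assuming toy data structures: `Index`, `Status`, `paths_to_touch()` and `check_reset()` are hypothetical names for illustration and not part of gitoxide. Reset modes are left out, so every status entry on a touched path counts as a blocker here.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical stand-in for an index: path -> object id.
type Index = BTreeMap<&'static str, &'static str>;

/// Hypothetical, reduced status classification for a single path.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Status {
    WorktreeModified,
    IndexModified,
    Untracked,
}

/// Step 1: diff current-index with target-index.
/// A path needs touching if it was added, removed, or changed its id.
fn paths_to_touch(current: &Index, target: &Index) -> BTreeSet<&'static str> {
    current
        .keys()
        .chain(target.keys())
        .copied()
        .filter(|path| current.get(path) != target.get(path))
        .collect()
}

/// Step 2: look up each touched path in the status output.
/// Returns the blocking paths if any touched path carries local changes.
fn check_reset(
    touch: &BTreeSet<&'static str>,
    status: &BTreeMap<&'static str, Status>,
) -> Result<(), Vec<(&'static str, Status)>> {
    let blockers: Vec<_> = touch
        .iter()
        .filter_map(|path| status.get(path).map(|s| (*path, *s)))
        .collect();
    if blockers.is_empty() {
        Ok(())
    } else {
        Err(blockers)
    }
}
```

Because the first step only produces a set of paths, a modification on a file the reset would not touch does not block it. A mode-aware version would decide per `Status` variant whether to fail, skip, or overwrite.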
This leaves us in a situation where the current `checkout()` implementation wants to become a fastpath for situations where the reset involves an empty tree as source (i.e. create everything and overwrite local changes).

Extra Tasks

Out-of-band tasks that just should finally be done, with potential for great impact.

- `hasconfig` as part of `resolve_includes()` without actual lookahead.

Tasks
- `gix`
- `gix tree entries`
- depth-first tree iterator as basis for efficient just-in-time diffing
- `gix-index::from_tree()` if that's faster - try with benchmark, or big-repo tests.
- `gix::status` should respect `status.renames` and `status.renameLimit` - also update docs of `StatusPlatform::index_worktree_rewrites()`
- `is_dirty()` to use full status
- `Submodule::status()` to do full status.
- `gix` should - no, as it's taken care of with the status on index/worktree
- `Conflict::try_from_entries()` be used to condense merge-conflicts in tree-index diffs?

Preliminary Performance
On WebKit
On the Linux kernel it's only 1.34x though, and without parallelism on WebKit it's 10% slower.
Status Enables
- `cargo package` and its use of complete status information.
- `gitoxide` backend Byron/built#1
- `vergen` #298
- `built` can get fully-functional is-dirty flag for `describe()`

Inbetween
Next PR: Reset

- `reset()` that checks if it's allowed to perform a worktree modification, or if an entry should be skipped. That way we can postpone safety checks like --hard

Postponed
What follows is important for resets, but won't be needed for `cargo` worktree resets.

- `gix index entries` to optionally expand sparse entries
- `gix status` with implemented `porcelain-v2` display mode

Research
- `git ls-files -s` and `git ls-tree -r` use the same ordering, so comparisons are trivial and similar to `tree/tree` diffs.
- `run_diff_index` is the entrypoint for tree/index diffs. It's in diff-core though, so hard to follow. Oh, and that goes into `unpack_trees` which is even worse.
- `wt_status_collect_updated_cb` is the primary callback to collect the actual tree/index change.

Abandoned DF-Iter
Research

- `gix status` can deal a little better with submodules. Even though in this case a lot of submodule-related information is needed for a complete reset, probably only doable by a higher-level caller which orchestrates it.
- `merge` and `keep`? How to control `refresh`? Maybe partial (only the files we touch), and full, to also update the files we don't touch as part of status? Maybe it's part of status if that is run before.
- `git reset` and `git checkout` in terms of `HEAD` modifications. With the former changing `HEAD`'s referent, and the latter changing `HEAD` itself.
- `checkout()` method as technically that's a `reset --hard` with optional overwrite check. Could it be rolled into one, with pathspec support added?
- `reset()` performs just as well, which is unlikely as there is more overhead. But maybe it's not worth maintaining two versions because of it. But if so, one should probably rename it.
- `git status`: what about rename tracking? It's available for tree-diffs and quite complex on its own. Probably only needs HEAD-vs-index rename tracking. No, it can also have worktree rename tracking, even though it's hard to imagine how this can be fast unless it's tightly integrated with untracked-files handling. This screams for a generalization of the tracking code though, as the testing and implementation are complex but should be generalisable.

Re-learn
- `pathspecs` normalize themselves to turn from any kind of specification into repo-root relative patterns...
- and that root will always be used to open files like `../.gitignore`, which is useful for display to the user)

By default, each thread consumes 8MB of memory for the stack which can quickly add up as machines have more cores and, especially during status, more threads are started than there are cores. This overcommitting is by design, but at least we should make sure that memory doesn't grow unnecessarily. Especially iterators know the code they execute, hence these versions should have a way to tune the stack size to reduce the peak memory footprint.
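For reference, a minimal sketch of spawning a worker with an explicit stack size using the standard library's `std::thread::Builder::stack_size()`. The helper `run_with_stack_size()` and the thread name are hypothetical; how the status iterators actually expose this tuning is not shown here.

```rust
use std::thread;

/// Run `work` on a dedicated thread whose stack is limited to `stack_size_bytes`.
/// The platform may round the requested size up to its own minimum.
fn run_with_stack_size<T, F>(stack_size_bytes: usize, work: F) -> std::io::Result<T>
where
    F: FnOnce() -> T + Send + 'static,
    T: Send + 'static,
{
    let handle = thread::Builder::new()
        .name("status-worker".into()) // hypothetical name, for illustration only
        .stack_size(stack_size_bytes)
        .spawn(work)?;
    // Propagate a panic of the worker instead of silently dropping it.
    Ok(handle.join().expect("worker thread panicked"))
}
```

A caller that knows its worker code has shallow call stacks could pass something like `256 * 1024`. A value that is too small leads to a stack overflow, so the size should be chosen with the executed code in mind.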