rename-tracking for gix-status #1306

Merged
merged 26 commits into from
Mar 14, 2024

Byron
Member

@Byron Byron commented Feb 25, 2024

Based on #1301


diff-correctness → gix-status → gix reset


Improve gix status to the point where it's suitable for use in reset functionality.
Leads to a proper worktree reset implementation, eventually leading to a high-level reset similar to how git supports it.

Architecture

The reason this PR deals quite a bit with gix status is that a safe implementation of reset() needs to be sure that the files it wants to touch don't carry modifications and aren't untracked files. To know what needs to be done, we have to diff the current index with the target index. The set of files to touch can then be used to look up information provided by git-status, like worktree modifications, index modifications, and untracked files, to know if we can proceed or not. This is also where the reset modes would affect the outcome, i.e. what to change and how.
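The flow described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the gix API: `Change`, `PathStatus`, `Mode`, `diff_indices` and `plan_reset` are hypothetical names, an index is modelled as a map from path to a `u64` standing in for an object id, and the two modes are a simplification of real reset modes.

```rust
use std::collections::BTreeMap;

/// What the index-to-index diff says has to happen to a path (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Change {
    Add,
    Modify,
    Delete,
}

/// What a status run reports about a path (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PathStatus {
    WorktreeModified,
    IndexModified,
    Untracked,
}

/// Simplified stand-in for reset modes (assumption, not Git's full set).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Mode {
    /// Refuse to touch any path that status reports on.
    RefuseOnLocalChanges,
    /// Touch every path in the diff, regardless of status.
    Overwrite,
}

/// Diff the current index with the target index to learn which paths to touch.
pub fn diff_indices(
    current: &BTreeMap<String, u64>,
    target: &BTreeMap<String, u64>,
) -> BTreeMap<String, Change> {
    let mut changes = BTreeMap::new();
    for (path, id) in target {
        match current.get(path) {
            None => {
                changes.insert(path.clone(), Change::Add);
            }
            Some(current_id) if current_id != id => {
                changes.insert(path.clone(), Change::Modify);
            }
            Some(_) => {}
        }
    }
    for path in current.keys() {
        if !target.contains_key(path) {
            changes.insert(path.clone(), Change::Delete);
        }
    }
    changes
}

/// Look up each path to touch in the status information and decide whether to proceed.
/// On refusal, the error lists the paths that block the reset.
pub fn plan_reset(
    changes: &BTreeMap<String, Change>,
    status: &BTreeMap<String, PathStatus>,
    mode: Mode,
) -> Result<Vec<(String, Change)>, Vec<String>> {
    let blocked: Vec<String> = match mode {
        Mode::Overwrite => Vec::new(),
        Mode::RefuseOnLocalChanges => changes
            .keys()
            .filter(|path| status.contains_key(*path))
            .cloned()
            .collect(),
    };
    if blocked.is_empty() {
        Ok(changes.iter().map(|(path, change)| (path.clone(), *change)).collect())
    } else {
        Err(blocked)
    }
}
```

Because `plan_reset` only returns a list of changes and applies nothing, the list can be inspected in tests or handed to whatever applies the changes later.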

This is a very modular approach which facilitates testing and understanding of what otherwise would be a very complex algorithm. Having a set of changes as output also makes it possible to parallelize applying these changes one day.

This leaves us in a situation where the current checkout() implementation wants to become a fastpath for situations where the reset involves an empty tree as source (i.e. create everything and overwrite local changes).

On the way to reset(), it's a valid choice to warm up to the matter by improving the current gix status implementation and assuring the correctness of what's there, which currently doesn't seem to be a given when compared with git status. Further, implementing gix status similarly to git status should be made possible.

Tasks

  • mix index-check with dirwalk
  • rename-tracking in gix-status crate
  • Submodule::status() that uses the new iterator and respects submodule::config::Ignore
    • assure there are tests for each of the ignore-variants
  • integrate gix-status into gix as is, without tree->index diffs, as iterator and with submodule modifications
  • have a basic 'is-dirty' implementation for submodules, in a way that can be extended later
  • add is_dirty() functionality, also for describe(), making use of auto-interrupt on iter drop
  • gix commit describe with is-dirty support
  • Add dirty-suffix support to gix submodules, more expensive, but more detailed.
  • add gix is-clean - fail if it's dirty
  • use new status iterator in gix status
    • interruption doesn't seem to work - fix it. It's due to sorting - must be able to pass own interrupt atomic to iterator
    • respect status.showUntrackedFiles
    • add rewrites support
    • add flags for showing ignored files as well
    • revalidate performance - problem is multi-threaded index modification check, tends to not scale well
    • more elaborate printing, similar to git status --porcelain=2, particularly for submodules (seems less important now, let's mention it with a 'not implemented' error and format flag)
  • finalize onefetch integration, probably make a new release
  • inform helix about it (for use from main)
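
The is_dirty() and interrupt tasks above could look roughly like the following. This is a hedged sketch, not the gix implementation: `StatusIter`, `is_dirty`, and the channel-backed worker thread are hypothetical, and the list of changed paths is passed in directly instead of being computed from a repository. It only shows the mechanism of an iterator that receives a caller-provided interrupt flag and raises it when dropped, so that asking for the first item is enough to answer "is it dirty?".

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::mpsc::{sync_channel, Receiver};
use std::sync::Arc;
use std::thread::{self, JoinHandle};

/// Hypothetical status iterator: a worker thread produces items,
/// the iterator hands them out one at a time.
pub struct StatusIter {
    rx: Option<Receiver<String>>,
    should_interrupt: Arc<AtomicBool>,
    worker: Option<JoinHandle<()>>,
}

impl StatusIter {
    /// `changed_paths` stands in for the work a real status run would do.
    /// The caller passes its own interrupt flag.
    pub fn new(changed_paths: Vec<String>, should_interrupt: Arc<AtomicBool>) -> Self {
        // Capacity 0: the worker hands over one item at a time and waits for the consumer.
        let (tx, rx) = sync_channel::<String>(0);
        let flag = Arc::clone(&should_interrupt);
        let worker = thread::spawn(move || {
            for path in changed_paths {
                // Stop when interrupted or when the receiving side is gone.
                if flag.load(Ordering::Relaxed) || tx.send(path).is_err() {
                    break;
                }
            }
        });
        StatusIter {
            rx: Some(rx),
            should_interrupt,
            worker: Some(worker),
        }
    }
}

impl Iterator for StatusIter {
    type Item = String;

    fn next(&mut self) -> Option<String> {
        self.rx.as_ref()?.recv().ok()
    }
}

impl Drop for StatusIter {
    fn drop(&mut self) {
        // Auto-interrupt: raise the flag, close the channel, wait for the worker to exit.
        self.should_interrupt.store(true, Ordering::Relaxed);
        drop(self.rx.take());
        if let Some(worker) = self.worker.take() {
            let _ = worker.join();
        }
    }
}

/// Dirty as soon as a single change is seen; dropping the iterator stops the remaining work.
pub fn is_dirty(changed_paths: Vec<String>, should_interrupt: Arc<AtomicBool>) -> bool {
    StatusIter::new(changed_paths, should_interrupt).next().is_some()
}
```

Note that in this sketch the flag stays raised after the iterator is dropped, so a caller who shares one flag across several operations would have to reset it.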

Next PR

  • diff index with index to learn what we would want to do in the worktree, or alternatively,
    diff tree with index (with reverse-diff functionality to simulate diff of index with tree), for better performance as it
    would avoid having to allocate a whole index even though we are only interested in a diff.
    • Must include rename tracking.
  • how to make diff results available from status with all transformations applied, to allow user to obtain diffs of any kind?

Status Enables

Next PR: Reset

  • reset() that checks if a worktree modification is allowed, or if an entry should be skipped. That way we can postpone safety checks like --hard

Postponed

What follows is important for resets, but won't be needed for cargo worktree resets.

  • a way to expand sparse dirs (but figure out if this is truly always necessary) - probably not, unless sparse dirs can be empty, but even then no expansion is needed
    • wire it up in gix index entries to optionally expand sparse entries
  • gix status with actual submodule support - needs status in gix (crate) effectively
  • gix status with actual conflict support

Research

  • Ignored files are considered expendable and can be overwritten on reset
  • How to integrate submodules - probably easy to answer once gix status can deal a little better with submodules. Even though in this case a lot of submodule-related information is needed for a complete reset, probably only doable by a higher-level caller which orchestrates it.
  • How to deal with various modes like merge and keep? How to control refresh? Maybe partial (only the files we touch), and full, to also update the files we don't touch as part of status? Maybe it's part of status if that is run before.
  • Worthwhile to make explicit the difference between git reset and git checkout in terms of HEAD modifications. With the former changing HEAD's referent, and the latter changing HEAD itself.
  • figure out how this relates to the current checkout() method as technically that's a reset --hard with optional overwrite check. Could it be rolled into one, with pathspec support added?
    • just keep them separate until it's clear that reset() performs just as well, which is unlikely as there is more overhead. But maybe it's not worth maintaining two versions over it. But if so, one should probably rename it.
  • for git status: what about rename tracking? It's available for tree-diffs and quite complex on its own. Probably only needs HEAD-vs-index rename tracking. No, also can have worktree rename tracking, even though it's hard to imagine how this can be fast unless it's tightly integrated with untracked-files handling. This screams for a generalization of the tracking code though as the testing and implementation is complex, but should be generalisable.
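
The reset-versus-checkout point from the list above can be made concrete with a toy model. This is only an illustration with hypothetical types (`Head`, `Repo`); it ignores the index and worktree entirely and models just what happens to HEAD and the branch it points to.

```rust
use std::collections::BTreeMap;

/// HEAD either names a branch or points directly at a commit (toy model).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Head {
    Symbolic(String),
    Detached(String),
}

/// Toy repository: HEAD plus a map from branch name to commit id.
pub struct Repo {
    pub head: Head,
    pub refs: BTreeMap<String, String>,
}

impl Repo {
    /// Like `git reset <commit>`: HEAD keeps naming the same branch,
    /// but that branch (HEAD's referent) now points at `commit`.
    pub fn reset(&mut self, commit: &str) {
        let branch = match &self.head {
            Head::Symbolic(name) => Some(name.clone()),
            Head::Detached(_) => None,
        };
        match branch {
            Some(name) => {
                self.refs.insert(name, commit.to_string());
            }
            // With a detached HEAD there is no branch to move, so HEAD itself is updated.
            None => self.head = Head::Detached(commit.to_string()),
        }
    }

    /// Like `git checkout <branch>`: HEAD itself changes, no branch is moved.
    pub fn checkout(&mut self, branch: &str) {
        self.head = Head::Symbolic(branch.to_string());
    }
}
```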

Re-learn

  • pathspecs normalize themselves to turn from any kind of specification into repo-root relative patterns.
  • attribute/ignore file sources are naturally relative to the root of the repo, which remains relative (i.e. can be .. and that root will always be used to open files like ../.gitignore, which is useful for display to the user)

@Byron Byron force-pushed the status branch 7 times, most recently from ac926a1 to 4e0feff Compare February 29, 2024 10:30
@Byron Byron force-pushed the status branch 3 times, most recently from db23c78 to 24edf36 Compare March 3, 2024 21:07
Byron added 3 commits March 4, 2024 17:34
That way, the constructor becomes more versatile as the user can choose
to pass attribute stacks that have more functionality, and thus can be
used in more places.
Previously, the source was entirely missing, now it's also made available.
Further, all the cloning of these resources is now left to the user,
which should save time.
@Byron Byron force-pushed the status branch 9 times, most recently from d5ec06b to f01cf70 Compare March 7, 2024 21:09
@Byron Byron mentioned this pull request Mar 8, 2024
8 tasks
@Byron Byron force-pushed the status branch 5 times, most recently from 2e5c255 to 34db78d Compare March 9, 2024 20:16
Byron added 11 commits March 10, 2024 10:21
We also move the `IndexPersistedOrInMemory` type to the `worktree` module
as it's more widely useful.
Previously, it would not allow to enter the repository, making
a walk impossible.
That way it's possible to obtain submodule status information,
with enough information to implement `git status`-like commands.
The simplest way to learn if the repository is dirty or not.
…x using `commit::describe::Resolution::format_with_dirty_suffix()`
That way a suffix will be added depending on the dirty-state of the repository.
This is a submodule-centric and greatly simplified way of obtaining
describe information with dirty-suffix.

Note that `status` information is also possible, but it seems
hard to display nicely, which this command isn't great at
in the first place.
It's a good way to compare the time it takes to run a full status
compared to a quick is-dirty check.
Byron added 2 commits March 10, 2024 11:55
Otherwise users might have too much delay until an interrupt is possible,
wasting a lot of time.
Submodule changes are now picked up as long as the submodule is
in the index.
Further, it's possible to enable rename-tracking between
index and worktree separately.
Byron added 6 commits March 12, 2024 16:25
This enables rename-tracking between worktree and index, something
that Git also doesn't do or doesn't do by default.
It is, however, available in `git2`.
This increases the compatibility when using a patched-in version of
gix in other crates.
That way it's possible to more easily and straight-forwardly understand
the status of an entry, comparing index to worktree.
@Byron Byron merged commit 3e5c974 into main Mar 14, 2024
18 checks passed
@Byron Byron mentioned this pull request Mar 15, 2024
8 tasks