virtualization: add new builtin command to print hydration level #659
This is great @jeffhostetler!
Do we want to fold the functionality into `git diagnose`, though? Or into `git status` (potentially with a config option to turn off the behavior if it should turn out to be too costly)?
I think Putting it into |
builtin/virtualization.c
Outdated
c_skipped = count_skipped(the_repository);
c_total = (uint64_t)the_repository->index->cache_nr;
c_hydrated = c_total - c_skipped;
So this is really measuring "how many paths in the index have the skip-worktree bit" and using that to measure non-hydrated paths. This doesn't need to be specific to virtual paths.
You could implement this as a subcommand of `git sparse-checkout`, say `git sparse-checkout stats`. Bonus points if you are sparse index aware and report the density of skip-worktree directories before expanding the index and counting how many skip-worktree files would exist otherwise.
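As a rough illustration of the counting described in this comment, here is a minimal sketch. It does not use Git's real index structures: `struct demo_entry`, `DEMO_SKIP_WORKTREE`, and `demo_hydration_stats()` are hypothetical stand-ins, and the sketch assumes a fully expanded (non-sparse) index in which every path is its own entry.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the skip-worktree bit; not Git's actual flag value. */
#define DEMO_SKIP_WORKTREE (1u << 0)

/* Hypothetical stand-in for an index entry; not Git's struct cache_entry. */
struct demo_entry {
	const char *path;
	unsigned int flags;
};

struct demo_stats {
	uint64_t skipped;  /* entries carrying the skip-worktree bit */
	uint64_t hydrated; /* total - skipped */
	uint64_t total;    /* all entries in the (expanded) index */
	double hydration;  /* percentage; 0.0 for an empty index */
};

/* Count skip-worktree entries and derive the hydration percentage. */
static struct demo_stats demo_hydration_stats(const struct demo_entry *entries,
					      size_t nr)
{
	struct demo_stats st = { 0 };
	size_t i;

	for (i = 0; i < nr; i++)
		if (entries[i].flags & DEMO_SKIP_WORKTREE)
			st.skipped++;

	st.total = nr;
	assert(st.skipped <= st.total);
	st.hydrated = st.total - st.skipped;
	if (st.total) /* avoid dividing by zero on an empty index */
		st.hydration = 100.0 * (double)st.hydrated / (double)st.total;
	return st;
}
```

With five entries of which two carry the bit, this yields skipped 2, hydrated 3, total 5. A sparse-index-aware variant would additionally have to account for collapsed directory entries, which this sketch does not attempt.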
I was wondering about also having sparse-index stats. It would be cool to have pre- and post-expanded stats. I was mainly thinking about solving a problem for GVFS support staff, but there may be a wider audience for it.
Let me revisit where to put the code. Overnight I realized that a new command, while nice, would require me to allow-list it in the WDG telemetry. Adding it to Adding it to I worry about the extra overhead in Let me look at Do we want to float it in msft/git as an experiment and let WDG evaluate it ? This was their biggest ask in a recent meeting. Or rather, over-hydration was their largest support pain point. |
Maybe we could tack that onto the code that identifies |
D'oh. It turns out that wt-status.c already has code to compute and print this. It just that the value https://github.com/microsoft/git/blob/vfs-2.45.2/wt-status.c#L1761 And the following code hides it from GVFS users. https://github.com/microsoft/git/blob/vfs-2.45.2/wt-status.c#L1607 This might get a lot simpler.... |
When sparse-checkout is enabled, add the sparse-checkout percentage to the Trace2 data stream. This number was already computed and printed on the console in the "You are in a sparse checkout..." message. It would be helpful to log it too for performance monitoring. Signed-off-by: Jeff Hostetler <jeffhostetler@github.com>
6d9e562 to bbddf35
I redid this to just print the already-computed value in wt-status. This could go upstream by itself or we could keep it in MSFT/git. I'm wondering about adding a MSFT-specific commit on top to either remove the |
You are referring to this, right? Lines 1607 to 1608 in bbddf35
The text (that is output if we remove that early return) reads: "You are in a sparse checkout." I wonder whether we want to do something like this instead?
diff --git a/wt-status.c b/wt-status.c
index b98e4f699427..afef512a6fea 100644
--- a/wt-status.c
+++ b/wt-status.c
@@ -1604,10 +1604,14 @@ static void show_sparse_checkout_in_use(struct wt_status *s,
 {
 	if (s->state.sparse_checkout_percentage == SPARSE_CHECKOUT_DISABLED)
 		return;
-	if (core_virtualfilesystem)
-		return;
-
-	if (s->state.sparse_checkout_percentage == SPARSE_CHECKOUT_SPARSE_INDEX)
+	if (core_virtualfilesystem) {
+		if (s->state.sparse_checkout_percentage == SPARSE_CHECKOUT_SPARSE_INDEX)
+			status_printf_ln(s, color, _("You are in a fully-hydrated checkout."));
+		else
+			status_printf_ln(s, color,
+				_("You are in a partially-hydrated checkout with %d%% of tracked files present."),
+				s->state.sparse_checkout_percentage);
+	} else if (s->state.sparse_checkout_percentage == SPARSE_CHECKOUT_SPARSE_INDEX)
 		status_printf_ln(s, color, _("You are in a sparse checkout."));
 	else
 		status_printf_ln(s, color,
As we're talking about code executed as part of |
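To make the branching in the proposed diff easier to follow outside of `wt-status.c`, here is a standalone sketch of the same message selection. The sentinel values `DEMO_DISABLED` and `DEMO_SPARSE_INDEX`, the `virtual_fs` parameter, and `demo_format_status()` are assumptions made for illustration; the real code uses `SPARSE_CHECKOUT_DISABLED`, `SPARSE_CHECKOUT_SPARSE_INDEX`, `core_virtualfilesystem`, and `status_printf_ln()`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical sentinels standing in for SPARSE_CHECKOUT_DISABLED and
 * SPARSE_CHECKOUT_SPARSE_INDEX; the values here are assumptions. */
#define DEMO_DISABLED (-1)
#define DEMO_SPARSE_INDEX (-2)

/*
 * Write the status line into buf and return its length, or 0 when
 * sparse checkout is disabled and nothing should be printed.
 */
static int demo_format_status(char *buf, size_t len, int percentage,
			      int virtual_fs)
{
	assert(buf && len > 0);
	buf[0] = '\0';

	if (percentage == DEMO_DISABLED)
		return 0;

	if (virtual_fs) {
		if (percentage == DEMO_SPARSE_INDEX)
			return snprintf(buf, len,
				"You are in a fully-hydrated checkout.");
		return snprintf(buf, len,
			"You are in a partially-hydrated checkout with %d%% of tracked files present.",
			percentage);
	}

	if (percentage == DEMO_SPARSE_INDEX)
		return snprintf(buf, len, "You are in a sparse checkout.");
	return snprintf(buf, len,
		"You are in a sparse checkout with %d%% of tracked files present.",
		percentage);
}
```

Returning the formatted string instead of printing it keeps the sketch easy to check in isolation; it is not how the status code itself is structured.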
❤️
New traces are nice. Want to update the PR title before merging?
yeah, let me get caught up and i'll send a new version shortly. |
@dscho Is this message correct? Or is this one of the cases that won't happen? If we have a sparse-index, we are building upon a sparse-checkout, right? I'm wondering if this case should just say "You are in a sparse checkout with a sparse index with %d entries."
Add VFS checkout hydration percentage information to the default `git status` output. When VFS is enabled, users will now see a "You are in a partially-hydrated checkout with <percentage> of tracked files present." message. Upstream `git status` normally prints a "You are in a sparse checkout with <percentage> of tracked files present." message. This message was hidden in `microsoft/git` when `core_virtualfilesystem` is set, because GVFS users are always (and secretly) in a sparse checkout and it was thought that the message would annoy them. However, we now believe that it may be helpful for users to always see the percentage and know when they are over-hydrated, since over-hydration can occur by accident and may greatly impact their Git performance. Knowing this value may help with GVFS support. Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Jeff Hostetler <jeffhostetler@github.com>
Target `macos-11` is deprecated now and is in scheduled brownouts. Update to `macos-13`. Signed-off-by: Jeff Hostetler <jeffhostetler@github.com>
The sparse index isn't compatible with a virtual filesystem (you need all the blobs available in the index for immediate hydration) so this case shouldn't happen. It could be a |
Thanks for fixing the functional tests YAML while you're here. I'm happy with your messaging around the sparse index, even though it should never happen.
@derrickstolee The scalar functional tests seem to be stuck. It's been running for hours. Is this normal or can we just ignore it and go on? |
It's actually not stuck, but a side effect of GitHub's branch protection model being a bit... simplistic. We want to require the Scalar Functional Tests to pass for PRs and we want to see a clear visual signal when they fail, but we cannot do that directly, instead, we can only require individual GitHub workflow jobs to pass. See here: Notice how it says specifically |
Thanks!!! |
GVFS users can easily (and accidentally) over-hydrate their enlistments. This causes some commands to be very slow.

Create a command to print the current hydration level. This should help our support team investigate the state of their enlistment.

This command will print something like:

```
% git virtualization
Skipped: 2
Hydrated: 3
Total: 5
Hydration: 60.00%
```

and log those values to Trace2 in a `data_json` record of the form:

```
{"skipped":2,"hydrated":3,"total":5,"hydration":60.00}
```
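For readers who want to see how a record of that shape could be assembled, here is a minimal sketch that formats the four values with `snprintf`. The helper name `demo_format_json()` is hypothetical, and the sketch deliberately does not call any Trace2 or JSON-writer function from Git; it only reproduces the field names and two-decimal percentage from the example record above.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format {"skipped":..,"hydrated":..,"total":..,"hydration":..} into buf.
 * Returns the number of characters snprintf would write, or -1 if the
 * inputs are inconsistent (more skipped entries than entries in total).
 */
static int demo_format_json(char *buf, size_t len, uint64_t skipped,
			    uint64_t total)
{
	uint64_t hydrated;
	double pct;

	if (skipped > total)
		return -1;

	hydrated = total - skipped;
	/* Report 0.00 for an empty index instead of dividing by zero. */
	pct = total ? 100.0 * (double)hydrated / (double)total : 0.0;

	return snprintf(buf, len,
			"{\"skipped\":%" PRIu64 ",\"hydrated\":%" PRIu64
			",\"total\":%" PRIu64 ",\"hydration\":%.2f}",
			skipped, hydrated, total, pct);
}
```

Calling it with 2 skipped entries out of 5 reproduces the record shown in the commit message.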