Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle Scalar enlistments without src subdirectory #402

Merged
merged 8 commits into from
Aug 4, 2021
Merged

Handle Scalar enlistments without src subdirectory #402

merged 8 commits into from
Aug 4, 2021

Conversation

vdye
Copy link
Collaborator

@vdye vdye commented Jul 27, 2021

Updates to allow Scalar commands to work properly on enlistments that are not cloned into a src subdirectory (e.g., if scalar register is run on a git clone'd repository).

Added regression tests + ran manual tests with scalar diagnose to ensure diagnostics directory was created in the enlistment directory both with and without src subdirectory.

@vdye vdye requested a review from derrickstolee July 27, 2021 15:41
@vdye vdye self-assigned this Jul 27, 2021
Copy link
Collaborator

@derrickstolee derrickstolee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A good start. I mostly have some nits, but I do have some questions about the tests.

contrib/scalar/scalar.c Outdated Show resolved Hide resolved
contrib/scalar/t/t9099-scalar.sh Show resolved Hide resolved
contrib/scalar/scalar.c Outdated Show resolved Hide resolved
contrib/scalar/scalar.c Outdated Show resolved Hide resolved
contrib/scalar/scalar.c Outdated Show resolved Hide resolved
}

/* Given an absolute working directory, find the enlistment root */
static void strbuf_enlistmentroot(struct strbuf *buf)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could I ask for a _ between enlistment and root? That will shut up my VS Code's cSpell... ;-)

Copy link
Collaborator Author

@vdye vdye Jul 28, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is is out-of-date - the function was removed in favor of find_enlistment() (+other misc refactoring).

As a note, I had separated the original commit & fixup! commit that included changes based on review comments to make it easier to review the implementation of those changes. Since (I think) those have been addressed/re-reviewed, I've squashed everything together into a single commit again.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should use the form fixup! in the final commit, so that the faulty commit in our branch thicket can be fixed in the rebase to the next Git for Windows version.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That works for me - once all other review comments are addressed, I'll restructure the commit history into a series of fixups.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've restructured the commit history, applying fixups to (I think) the appropriate commits. I'll also be able to help with any merge conflict resolution later on (based on my local rebase, it's pretty messy due to the context differences).

Copy link
Member

@dscho dscho left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am somewhat worried that my (apparently incorrect) assumption that every Scalar enlistment contains a src subdirectory that is the actual Git worktree might be encoded throughout scalar.c and might need more places to be adjusted (and certainly the documentation needs to be adjusted, too).

And I am also somewhat worried that it might be too confusing to consider every Git worktree to be a Scalar enlistment unless the worktree's path ends in /src (in which case its parent directory is treated as the corresponding Scalar enlistment).

size_t len = path->len;

strbuf_addstr(path, "/.git");
if (is_git_directory(path->buf)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't quite follow... What are the exact rules as to what constitutes an enlistment directory? I thought that an enlistment directory must contain a src/ subdirectory that is a Git worktree. If that is not so, we will have to update documentation, and potentially touch up more code locations.

Copy link
Collaborator Author

@vdye vdye Jul 28, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to the issue linked to this pull request (and what I saw of the .NET Scalar implementation), the enlistment root could be the git working directory itself (mostly used when scalar register-ing an existing git clone. @derrickstolee would probably be the best one to clarify on that, though.

For what it's worth, I think I caught any instances where that assumption was made - since most operations are performed on the working directory anyway, there wasn't a lot to change.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we're still missing the edge case where <root>/src/.git is the repository, but the current directory is <root>/out/.

So I think we need the following strategy:

  • Loop
    • if src/.git is a Git directory, we are in the enlistment root
    • if .git is a Git directory,
      • if the current directory is named src, the enlistment root is the parent directory
      • otherwise the current directory is the enlistment root
    • otherwise, continue with the parent directory
    • if there is no parent directory (use offset_1st_component() to accommodate for DOS drives and network shares), error out

Of course, we could be a bit smart by determining the parent directory already before the .git check (because we need it both to test whether the current directory is called src as well as in the case where we continue with the parent directory).

Copy link
Collaborator Author

@vdye vdye Jul 29, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In addition to the changes made earlier to handle parallel directories (like <root>/out), this should now handle DOS/network paths with root-level scalar registrations (using offset_1st_component). That logic is in a separate fixup! commit if you'd like to review it on its own - I think it avoids off-by-one errors/infinite loops/other general path issues, but I would appreciate a second opinion.

static void find_enlistment(struct strbuf *path,
struct strbuf *enlistment_root)
{
strbuf_addstr(path, "/src");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we do it this way, we can no longer start in <root>/out (where <root>/src/.git is the Git repository), as we will never hit a parent directory that contains a .git subdirectory.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My thought was that, by blindly adding /src to the end of the input path, we'll be able to reliably find the git working directory anywhere from an enlistment root that (as recommended) contains a src subdirectory all the way down through the git clone itself. With a src subdirectory:

test-repo
└── src
    ├── .git
    └── docs

All of these input paths will find the working directory:

  • test-repo
  • test-repo/src
  • test-repo/src/docs
  • test-repo/src/docs/...

In a clone without the src working directory:

existing
├── .git
└── docs

find_enlistment(...) will work with these input paths:

  • existing
  • existing/docs
  • existing/docs/...

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think @vdye's assessment is correct.

scalar register should work on any Git repository, and then so should scalar unregister and scalar delete <dir>. The src/ is only guaranteed for repositories created by scalar clone.

VFS for Git has a .gvfs directory adjacent to the src directory which stores required metadata, so the concept of "enlistment" is a key concept there. Scalar inherited that terminology, but we expanded this in microsoft/scalar#272 and microsoft/scalar#274.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of these input paths will find the working directory:

  • test-repo
  • test-repo/src
  • test-repo/src/docs
  • test-repo/src/docs/...

But will it find the enlistment root in test-repo/out?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, after the most recent commit (comment).

Copy link
Member

@dscho dscho left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left a couple of suggestions. The way I read offset_1st_component(), I think we need to have the >= offset condition (instead of > offset, which would be an off-by-one).

But hey, I am the emperor of off-by-ones, please double check my math.

contrib/scalar/scalar.c Show resolved Hide resolved

if (enlistment_root) {
strbuf_addbuf(enlistment_root, path);
strbuf_parentdir(enlistment_root);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to use strbuf_add(enlistment_root, path, len); here instead of copying the full src path and then finding its parent directory?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that definitely simplifies the implementation. Updated!

contrib/scalar/scalar.c Outdated Show resolved Hide resolved

/* reset path to parent */
strbuf_setlen(path, len);
strbuf_parentdir(path);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here, we should probably have strbuf_parentdir() return -1 if no parent directory was found, and break out of the loop in that case.

contrib/scalar/scalar.c Outdated Show resolved Hide resolved
if (!len)
die(_("could not find enlistment root"));
strbuf_trim_trailing_dir_sep(&path);
if (!!find_enlistment(&path, enlistment_root))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do not need the !! here, as any non-zero return value of find_enlistment() will make this condition be evaluated as true. The !! is only needed if we need to turn all non-zero values into 1.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed the return value for find_enlistment(...) to be 1 if the enlistment is found, 0 if not (so the condition is now if (!find_enlistment(...)). Since the output is only used as a boolean (and not passed through as an exit code), it made more sense to me to have the return value behave the way I would expect a boolean to. I'm not sure of the "right" convention (0 for success vs 1 for success) to follow, though - is there a standard way of deciding that?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the rule for regular functions whose name does not imply a Boolean return value is that -1 indicates an error and 0 indicates success.

For example, is_directory() would imply a Boolean to me, whereas find_enlistment() would suggest to me that we expect to find an enlistment, and have to error out if none is found. Therefore, I would tend to use the standard Unix rule "0 == success" here.

contrib/scalar/scalar.c Show resolved Hide resolved
Copy link
Member

@dscho dscho left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks pretty good to me! I'm just not yet 100% convinced that we want to interpret the int return values as Booleans.

size_t offset = offset_1st_component(buf->buf);
char *path_sep = find_last_dir_sep(buf->buf + offset);
strbuf_setlen(buf, path_sep ? path_sep - buf->buf : offset);

return (buf->len < len);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a minor nit: Git's current coding style would lose the parentheses here.

}

/**
* Given an absolute path at or below the enlistment root, find the
* enlistment root and working directory
* enlistment root and working directory. Returns 1 if enlistment found,
* otherwise 0.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function name find_enlistment() sounds to me as if we expect to find an enlistment, i.e. it is not a "yes/no?" question. Therefore, I would let the function return 0 in the regular case, and -1 if no enlistment was found.

Indeed, I would potentially even go further and show an error message if no enlistment was found (this would require a copy of path to be stored, and released, though).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Compared to strbuf_parentdir, I'm more convinced that this should be represented with "success/error" than "true/false" (since the input path is mangled in the process). However, find_enlistment(...) ended up only having one usage in this file - would it be better to just re-integrate it with setup_enlistment_directory(...)? That function is already meant to throw an error if the enlistment isn't found (as opposed to find_enlistment(...), which I had intended not throw an error in case it was used only to get information about the enlistment).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would it be better to just re-integrate it with setup_enlistment_directory(...)?

That's a good idea!

contrib/scalar/scalar.c Show resolved Hide resolved
Copy link
Member

@dscho dscho left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wonderful!

Copy link
Collaborator

@derrickstolee derrickstolee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correctness looks great. I found some nits that could cause review friction upstream (the fewer style nits we have, the more reviewers need to look harder to provide feedback). I'm not worried about them in microsoft/git, but we'd want to fix them as we prepare our upstream submission.

contrib/scalar/scalar.c Outdated Show resolved Hide resolved
contrib/scalar/scalar.c Outdated Show resolved Hide resolved
contrib/scalar/scalar.c Outdated Show resolved Hide resolved
contrib/scalar/scalar.c Outdated Show resolved Hide resolved
Improve handling of non-`src` workdirs in `scalar unregister`
Add argument to `setup_enlistment_directory` for `scalar run`
Add argument to `setup_enlistment_directory` for `scalar reconfigure`
Use explicit `enlistment_root` result for `scalar diagnose`
Improve non-`src` enlistment workdir handling for `scalar delete`
Add extra argument to `setup_enlistment_directory` in `scalar cache-server`
@vdye vdye merged commit 4a95a52 into microsoft:vfs-2.32.0 Aug 4, 2021
@vdye vdye deleted the bugfix/scalar-register-directory branch August 4, 2021 19:00
derrickstolee pushed a commit that referenced this pull request Aug 4, 2021
Handle Scalar enlistments without `src` subdirectory
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants