Handle Scalar enlistments without `src` subdirectory #402

vdye · 2021-07-27T15:41:03Z

Updates to allow Scalar commands to work properly on enlistments that are not cloned into a src subdirectory (e.g., if scalar register is run on a git clone'd repository).

Added regression tests and ran manual tests with scalar diagnose to ensure the diagnostics directory is created in the enlistment directory both with and without a src subdirectory.

derrickstolee

A good start. I mostly have some nits, but I do have some questions about the tests.

dscho · 2021-07-28T20:28:12Z

contrib/scalar/scalar.c

+}
+
+/* Given an absolute working directory, find the enlistment root */
+static void strbuf_enlistmentroot(struct strbuf *buf)


Could I ask for a _ between enlistment and root? That will shut up my VS Code's cSpell... ;-)

I think this is out-of-date - the function was removed in favor of find_enlistment() (+other misc refactoring).

As a note, I had separated the original commit & fixup! commit that included changes based on review comments to make it easier to review the implementation of those changes. Since (I think) those have been addressed/re-reviewed, I've squashed everything together into a single commit again.

I think we should use the form fixup! in the final commit, so that the faulty commit in our branch thicket can be fixed in the rebase to the next Git for Windows version.

That works for me - once all other review comments are addressed, I'll restructure the commit history into a series of fixups.

I've restructured the commit history, applying fixups to (I think) the appropriate commits. I'll also be able to help with any merge conflict resolution later on (based on my local rebase, it's pretty messy due to the context differences).

dscho

I am somewhat worried that my (apparently incorrect) assumption that every Scalar enlistment contains a src subdirectory that is the actual Git worktree might be encoded throughout scalar.c and might need more places to be adjusted (and certainly the documentation needs to be adjusted, too).

And I am also somewhat worried that it might be too confusing to consider every Git worktree to be a Scalar enlistment unless the worktree's path ends in /src (in which case its parent directory is treated as the corresponding Scalar enlistment).

dscho · 2021-07-28T20:56:58Z

contrib/scalar/scalar.c

+		size_t len = path->len;
+
+		strbuf_addstr(path, "/.git");
+		if (is_git_directory(path->buf)) {


I don't quite follow... What are the exact rules as to what constitutes an enlistment directory? I thought that an enlistment directory must contain a src/ subdirectory that is a Git worktree. If that is not so, we will have to update documentation, and potentially touch up more code locations.

According to the issue linked to this pull request (and what I saw of the .NET Scalar implementation), the enlistment root could be the git working directory itself (mostly used when scalar register-ing an existing git clone). @derrickstolee would probably be the best one to clarify on that, though.

For what it's worth, I think I caught any instances where that assumption was made - since most operations are performed on the working directory anyway, there wasn't a lot to change.

I think we're still missing the edge case where <root>/src/.git is the repository, but the current directory is <root>/out/.

So I think we need the following strategy:

Loop:

- if src/.git is a Git directory, we are in the enlistment root
- if .git is a Git directory,
  - if the current directory is named src, the enlistment root is the parent directory
  - otherwise the current directory is the enlistment root
- otherwise, continue with the parent directory
  - if there is no parent directory (use offset_1st_component() to accommodate for DOS drives and network shares), error out

Of course, we could be a bit smart by determining the parent directory already before the .git check (because we need it both to test whether the current directory is called src as well as in the case where we continue with the parent directory).
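The strategy above can be sketched in plain C. This is only an illustration under stated assumptions, not the code that was merged: it handles absolute POSIX paths only (so it sidesteps the DOS drive and network share prefixes that offset_1st_component() exists for), and `find_enlistment_root()`, `in_list()` and `last_component()` are hypothetical helpers. Instead of calling Git's is_git_directory(), the sketch looks paths up in a caller-supplied list so that it runs without touching the filesystem.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for is_git_directory(): look the path up in a list. */
static int in_list(const char *path, const char *const *git_dirs, size_t nr)
{
	for (size_t i = 0; i < nr; i++)
		if (!strcmp(path, git_dirs[i]))
			return 1;
	return 0;
}

/* Name of the last path component, e.g. "/a/src" -> "src". */
static const char *last_component(const char *path)
{
	const char *slash = strrchr(path, '/');
	return slash ? slash + 1 : path;
}

/*
 * Walk upward from the absolute path `start` until an enlistment is found.
 * Writes the enlistment root to `root` and returns 0, or returns -1 if
 * the walk reaches the filesystem root without finding one.
 */
static int find_enlistment_root(const char *start,
				const char *const *git_dirs, size_t nr,
				char *root, size_t rootsz)
{
	char cur[1024], probe[1024 + 16];
	size_t len = strlen(start);

	assert(start[0] == '/');
	if (len >= sizeof(cur))
		return -1;
	memcpy(cur, start, len + 1);
	/* Drop trailing slashes; the filesystem root becomes "". */
	while (len && cur[len - 1] == '/')
		cur[--len] = '\0';

	for (;;) {
		/* <cur>/src/.git exists: <cur> is the enlistment root. */
		snprintf(probe, sizeof(probe), "%s/src/.git", cur);
		if (in_list(probe, git_dirs, nr))
			break;
		/* <cur>/.git exists: the root is <cur>, or its parent if <cur> is named src. */
		snprintf(probe, sizeof(probe), "%s/.git", cur);
		if (in_list(probe, git_dirs, nr)) {
			if (!strcmp(last_component(cur), "src"))
				*strrchr(cur, '/') = '\0';
			break;
		}
		/* No parent directory left: error out. */
		if (!*cur)
			return -1;
		*strrchr(cur, '/') = '\0';
	}

	if (!*cur)
		strcpy(cur, "/");
	if (strlen(cur) >= rootsz)
		return -1;
	strcpy(root, cur);
	return 0;
}
```

With a list containing only `/home/me/test-repo/src/.git`, starting in `/home/me/test-repo/out` resolves to `/home/me/test-repo`, which is the sibling-directory edge case raised above.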

In addition to the changes made earlier to handle parallel directories (like <root>/out), this should now handle DOS/network paths with root-level scalar registrations (using offset_1st_component). That logic is in a separate fixup! commit if you'd like to review it on its own - I think it avoids off-by-one errors/infinite loops/other general path issues, but I would appreciate a second opinion.

dscho · 2021-07-28T20:57:50Z

contrib/scalar/scalar.c

+static void find_enlistment(struct strbuf *path,
+				   struct strbuf *enlistment_root)
+{
+	strbuf_addstr(path, "/src");


If we do it this way, we can no longer start in <root>/out (where <root>/src/.git is the Git repository), as we will never hit a parent directory that contains a .git subdirectory.

My thought was that, by blindly adding /src to the end of the input path, we'll be able to reliably find the git working directory anywhere from an enlistment root that (as recommended) contains a src subdirectory all the way down through the git clone itself. With a src subdirectory:

test-repo
└── src
    ├── .git
    └── docs

All of these input paths will find the working directory:

test-repo

test-repo/src

test-repo/src/docs

test-repo/src/docs/...

In a clone without the src working directory:

existing
├── .git
└── docs

find_enlistment(...) will work with these input paths:

existing

existing/docs

existing/docs/...

I think @vdye's assessment is correct.

scalar register should work on any Git repository, and then so should scalar unregister and scalar delete <dir>. The src/ is only guaranteed for repositories created by scalar clone.

VFS for Git has a .gvfs directory adjacent to the src directory which stores required metadata, so "enlistment" is a key concept there. Scalar inherited that terminology, but we expanded this in microsoft/scalar#272 and microsoft/scalar#274.

> All of these input paths will find the working directory:
>
> test-repo
> test-repo/src
> test-repo/src/docs
> test-repo/src/docs/...

But will it find the enlistment root in test-repo/out?

Yes, after the most recent commit (comment).

dscho

I left a couple of suggestions. The way I read offset_1st_component(), I think we need to have the >= offset condition (instead of > offset, which would be an off-by-one).

But hey, I am the emperor of off-by-ones, please double check my math.

dscho · 2021-08-02T14:45:37Z

contrib/scalar/scalar.c

+
+			if (enlistment_root) {
+				strbuf_addbuf(enlistment_root, path);
+				strbuf_parentdir(enlistment_root);


Would it make sense to use strbuf_add(enlistment_root, path, len); here instead of copying the full src path and then finding its parent directory?

Yes, that definitely simplifies the implementation. Updated!
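To illustrate why copying a known-length prefix works, here is a minimal sketch in plain C rather than Git's strbuf API. `copy_prefix()` is a hypothetical helper; the assumption is that `len` was recorded before "/src" was appended to the path, so the first `len` bytes are already the enlistment root and no parent-directory lookup is needed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy the first `len` bytes of `path` into `out` and NUL-terminate it.
 * Returns 0 on success, -1 if `out` is too small.
 */
static int copy_prefix(char *out, size_t outsz, const char *path, size_t len)
{
	assert(len <= strlen(path));
	if (len >= outsz)
		return -1;
	memcpy(out, path, len);
	out[len] = '\0';
	return 0;
}
```

For example, if `len` is the length of "/home/me/test-repo" and the buffer now holds "/home/me/test-repo/src", copying `len` bytes yields "/home/me/test-repo" directly.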

dscho · 2021-08-02T14:53:38Z

contrib/scalar/scalar.c

+
+		/* reset path to parent */
+		strbuf_setlen(path, len);
+		strbuf_parentdir(path);


Here, we should probably have strbuf_parentdir() return -1 if no parent directory was found, and break out of the loop in that case.
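A rough sketch of that suggestion, written against plain C strings instead of struct strbuf. The names `parent_dir()` and `count_parents()` are hypothetical, and the root prefix is assumed to be a single leading '/' (the real code uses offset_1st_component() to also cover DOS drives and network shares). Paths are assumed to have no trailing separators.

```c
#include <stddef.h>
#include <string.h>

/*
 * Truncate `path` in place to its parent directory.
 * Returns 0 if the path was shortened, -1 if no parent directory was found.
 */
static int parent_dir(char *path)
{
	size_t offset = path[0] == '/' ? 1 : 0;	/* length of the root prefix */
	size_t len = strlen(path);
	char *sep = strrchr(path + offset, '/');
	size_t new_len = sep ? (size_t)(sep - path) : offset;

	if (new_len >= len)
		return -1;
	path[new_len] = '\0';
	return 0;
}

/* Walk to the top; the loop ends because parent_dir() reports when it is stuck. */
static int count_parents(char *path)
{
	int nr = 0;

	while (!parent_dir(path))
		nr++;
	return nr;
}
```

Without the return value, a caller looping on the parent directory would spin forever once the path stops shrinking at the root, which is the case the `-1` is meant to catch.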

dscho · 2021-08-02T14:55:58Z

contrib/scalar/scalar.c

-			if (!len)
-				die(_("could not find enlistment root"));
+	strbuf_trim_trailing_dir_sep(&path);
+	if (!!find_enlistment(&path, enlistment_root))


We do not need the !! here, as any non-zero return value of find_enlistment() will make this condition be evaluated as true. The !! is only needed if we need to turn all non-zero values into 1.
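A small, self-contained illustration of that point. `status_code()`, `describe()` and `is_failure()` are hypothetical functions made up for this sketch; nothing here comes from scalar.c.

```c
/* Hypothetical function that reports failure with arbitrary non-zero codes. */
static int status_code(int key)
{
	if (key < 0)
		return -1;
	return key > 100 ? 42 : 0;
}

/* In a condition, any non-zero value is already true, so no `!!` is needed. */
static const char *describe(int key)
{
	return status_code(key) ? "error" : "ok";
}

/* `!!` only matters when the value itself must be exactly 0 or 1. */
static int is_failure(int key)
{
	return !!status_code(key);
}
```

Here `status_code(200)` is 42, but `is_failure(200)` is 1; `describe()` behaves the same with or without the double negation.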

I changed the return value for find_enlistment(...) to be 1 if the enlistment is found, 0 if not (so the condition is now if (!find_enlistment(...))). Since the output is only used as a boolean (and not passed through as an exit code), it made more sense to me to have the return value behave the way I would expect a boolean to. I'm not sure of the "right" convention (0 for success vs 1 for success) to follow, though - is there a standard way of deciding that?

I think the rule for regular functions whose name does not imply a Boolean return value is that -1 indicates an error and 0 indicates success.

For example, is_directory() would imply a Boolean to me, whereas find_enlistment() would suggest to me that we expect to find an enlistment, and have to error out if none is found. Therefore, I would tend to use the standard Unix rule "0 == success" here.
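The two conventions side by side, as a minimal sketch. Both functions are hypothetical examples chosen for illustration: `is_absolute()` asks a yes/no question and returns a truth value, while `find_last_separator()` is expected to succeed and follows "0 == success, -1 == error", passing its result through an out-parameter.

```c
#include <stddef.h>
#include <string.h>

/* Predicate: the `is_` prefix promises a truth value (non-zero means yes). */
static int is_absolute(const char *path)
{
	return path[0] == '/';
}

/*
 * Action: the caller expects a separator to be found, so 0 means success
 * and -1 means error; the position is returned through `pos`.
 */
static int find_last_separator(const char *path, size_t *pos)
{
	const char *sep = strrchr(path, '/');

	if (!sep)
		return -1;
	*pos = (size_t)(sep - path);
	return 0;
}
```

Call sites then read naturally in both cases: `if (is_absolute(p))` for the question, and `if (find_last_separator(p, &pos) < 0)` for the error path.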

dscho

Looks pretty good to me! I'm just not yet 100% convinced that we want to interpret the int return values as Booleans.

dscho · 2021-08-03T12:06:59Z

contrib/scalar/scalar.c

 	size_t offset = offset_1st_component(buf->buf);
 	char *path_sep = find_last_dir_sep(buf->buf + offset);
 	strbuf_setlen(buf, path_sep ? path_sep - buf->buf : offset);
+
+	return (buf->len < len);


Just a minor nit: Git's current coding style would lose the parentheses here.

dscho · 2021-08-03T12:10:44Z

contrib/scalar/scalar.c

 }

 /**
 * Given an absolute path at or below the enlistment root, find the
- * enlistment root and working directory
+ * enlistment root and working directory. Returns 1 if enlistment found,
+ * otherwise 0.


The function name find_enlistment() sounds to me as if we expect to find an enlistment, i.e. it is not a "yes/no?" question. Therefore, I would let the function return 0 in the regular case, and -1 if no enlistment was found.

Indeed, I would potentially even go further and show an error message if no enlistment was found (this would require a copy of path to be stored, and released, though).

Compared to strbuf_parentdir, I'm more convinced that this should be represented with "success/error" than "true/false" (since the input path is mangled in the process). However, find_enlistment(...) ended up only having one usage in this file - would it be better to just re-integrate it with setup_enlistment_directory(...)? That function is already meant to throw an error if the enlistment isn't found (as opposed to find_enlistment(...), which I had intended not to throw an error in case it was used only to get information about the enlistment).

> would it be better to just re-integrate it with setup_enlistment_directory(...)?

That's a good idea!

dscho

Wonderful!

derrickstolee

Correctness looks great. I found some nits that could cause review friction upstream (the fewer style nits we leave, the more attention reviewers can give to substantive feedback). I'm not worried about them in microsoft/git, but we'd want to fix them as we prepare our upstream submission.

vdye requested a review from derrickstolee July 27, 2021 15:41

vdye self-assigned this Jul 27, 2021

derrickstolee reviewed Jul 27, 2021

derrickstolee mentioned this pull request Jul 28, 2021

Upstreaming the Scalar command gitgitgadget/git#1005

Closed

dscho reviewed Jul 28, 2021

dscho reviewed Jul 28, 2021

dscho reviewed Aug 2, 2021

dscho reviewed Aug 3, 2021

dscho approved these changes Aug 4, 2021

derrickstolee approved these changes Aug 4, 2021

vdye added 8 commits August 4, 2021 13:32

fixup! scalar register: set recommended config settings

72061b8

Fix to allow non-`src` workdirs

fixup! scalar register/unregister: start/stop maintenance on repository

2059701

Add argument for register

fixup! scalar unregister: handle deleted enlistment directory gracefully

419cbc5

Improve handling of non-`src` workdirs in `scalar unregister`

fixup! scalar: implement the run command

f05be53

Add argument to `setup_enlistment_directory` for `scalar run`

fixup! scalar: allow reconfiguring an existing enlistment

f6ca26f

Add argument to `setup_enlistment_directory` for `scalar reconfigure`

fixup! Implement scalar diagnose

b59e481

Use explicit `enlistment_root` result for `scalar diagnose`

fixup! scalar: implement the delete command

ea3e358

Improve non-`src` enlistment workdir handling for `scalar delete`

fixup! scalar: add the cache-server command

1976a6e

Add extra argument to `setup_enlistment_directory` in `scalar cache-server`

derrickstolee approved these changes Aug 4, 2021

vdye merged commit 4a95a52 into microsoft:vfs-2.32.0 Aug 4, 2021

vdye deleted the bugfix/scalar-register-directory branch August 4, 2021 19:00

derrickstolee pushed a commit that referenced this pull request Aug 4, 2021

Merge pull request #402 from vdye/bugfix/scalar-register-directory

e720ed3

Handle Scalar enlistments without `src` subdirectory

derrickstolee mentioned this pull request Aug 4, 2021

[DO NOT MERGE] Tentative vfs-2.33.0 branch #405

Merged
