Add internal-tools `check-mode`, replacing `check-mode.sh` #1711

EliahKagan · 2024-11-29T09:24:19Z

Overview

This adds an it check-mode subcommand and modifies the justfile recipe check-mode to use it, as well as to have the test and ci-test recipes run it.

The approach taken here is to base the check-mode subcommand on the approach in the script. While internal-tools can use gix crates, this doesn't actually use them: it uses an external git in the same way the shell script it is based on did (though I would be interested in modifying it to take use gix-index in the future; see below).

The new Rust tool replaces the shell script, which is removed. Because the ci-test recipe runs the check-mode recipe, and the CI test job runs the ci-test recipe, a separate CI check-mode job is no longer needed; it is removed as well.

Because it is run as part of the CI test job, and it dependencies mostly overlap with others used there, I think this probably solves the problem of dependency installation and caching anticipated in #1709 (comment), without requiring jobs to share caches.

Testing

I tested it in the same way I tested that shell script, except that I did not do manual testing with macOS, since I don't anticipate specific problems with macOS for this tool (an advantage of using Rust), and since I didn't have easy interactive access to a macOS system while working on it.

My testing did, as there, include testing it on multiple platforms on CI, even though that does not remain at the tip of the branch. (See commit history.)
I did also test with the pathological files I had tried before; I used git checkout and git revert --no-commit commands to retrieve them, rather than recommitting them (they're already in the repo history, after all), which is why they do not appear in the recent commit history. In particular:
```
git revert --no-commit ed757ea
cargo run --bin=it -- cm
git revert --abort
```
```
git revert --no-commit a9b3b48
cargo run --bin=it -- cm
git revert --abort
```

Broad design considerations

I view this Rust tool, foreshadowed in discussions in #1589, #1708, and #1709, as an improvement over the script. However, further improvement will be possible. The approach taken here takes an approach you mentioned in #1589 (comment):

If a tool were to use Git to learn about executable bits, I'd think it could even use git.exe if it doesn't have access (or doesn't want to use) gix-index.

Except:

It's a bit better than that, in that it is able to be written platform-agnostically and use git. (In the way git is being run here, it's not necessary to special-case git.exe for Windows.)
It's a bit worse than that, because we do have access to gix-index, and I also feel that it should be used. But I figured it may be a good idea, as well as interesting, to start by "translating" the script to Rust.

In addition, it might be useful to have it accept options, though I'm not sure what I would control besides verbosity: there is probably not a likely use case for scanning only part of the repository. In principle it would be good to be able to specify another repository, but unlike with the other tools, which operate on repositories to help produce test fixtures, the purpose of this tool is to operate on the gitoxide repository.

The Rust version already, in my opinion, has significant advantages over the shell script. So rather than rewriting it immediately to use gix-index instead of git subprocesses, I'm opening a PR to replace the script with this.

Specific comparison to the shell script

The advantages over the preceding shell script way are:

It's easier to tell from the code what is going on, and to reason about its correctness.
Errors are explicit, with the thiserror context method called to describe all errors, except for those that are anticipated only to be possible due to bugs in this tool itself, which use expect.
A bug in the shell script would potentially have caused it to fail with an error when reading from large blobs, because set -e combined with set -o pipefail can make a command like x | y fail the script, when y completes successfully before reading all of its input, causing x to attempt to write to a broken pipe.
It is probably faster, due to buffering and tailored choices about how to read from and interact with subprocesses. I regard this to be, by far, the least important benefit, though, since this code does not really need to be fast.

Regarding the piping bug, the implicated code in the shell script is:

gitoxide/etc/check-mode.sh

Lines 16 to 17 in ee33221

    
           # Extract the first two bytes (or less if shorter) and put in symbolic form. 
        
           symbolic_magic="$(git cat-file blob "$oid" | od -N2 -An -ta)"

If the blob is big enough, then git cat-file may have enough left to write after od -N2 terminates that the git cat-file is terminated with SIGPIPE. This is a situation where we don't expect git to write to standard output unless the data it writes are correct and usable, and where terminating normally with an exit status of 0 is one of two reasonable results.

Specific implementation questions

Rewriting it in Rust immediately caused me to be confronted with how and how much to read, whether to close pipes, whether and when to wait, and whether and when to allow a child process to possibly become a zombie and never be waited on by the it process. There are two aspects that I am hoping for input on, in case they are not as reasonable as I hope.

First, is takein and droping stdout from a Child process created with Stdout::piped(), the recommended way to break a pipe before waiting on the subprocess?

gitoxide/tests/it/src/commands/check_mode.rs

Lines 110 to 115 in 8729858

    
           let mut stdout = child.stdout.take().expect("should have captured stdout"); 
        
           let count = stdout.read(&mut buf).context("Error reading data from blob")?; 
        
           drop(stdout); // Let the pipe break rather than waiting for the rest of the blob. 
        
           // TODO: Maybe check status? On Unix, it should be 0 or SIGPIPE. Not sure about Windows. 
        
           _ = child.wait().context("Failure running `git` subprocess to read blob")?;

Second, is it okay that in situations where I am treating it as an error, I allow Child process objects to be dropped implicitly without ever being waited on? In non-error conditions--including conditions that we will later report as an error because they reveal the #!/+x mismatches we're look for, but that are not errors in operation--for both the outer-loop read of git ls-files -sz and the inner runs of git cat-file, I do call wait(). But when propagaing Errs out with ?, I do not.

I ask because, even though the internal-tools are not published to a crate registry, I know it is possible for their code to be used by other code, and not just via the subcommands they implement. For this reason, I try to avoid behavior that would be severely wrong in such usage, such as changing directory, but I have not taken care to wait on subprocesses when bailing out of the entire subcommand with an error.

My guess is that this is fine, but it's the sort of thing I would ordinarily make a note of, so I figured I should do so in case there are use cases I don't anticipate.

All parts are included, but this is not yet believed to be correct.

Paths of files/blobs with mismatches had been shown literally, even when containing unusual characters, and had been followed by two newlines instead of one. This fixes that, and also includes some small stylistic refactoring.

- Delete the old `etc/check-mode.sh` script. - Update `justfile` recipe `check-mode` to use internal-tools. - Modify and expand `check-mode` CI job to use internal-tools. - Temporarily make `check-mode` job a matrix job to test platforms. The latter of those changes should be undone once things looks like they are working. I have manually tested `it check-mode` on Arch Linux and Windows in the same ways `check-mode.sh` was tested (and unlike with a shell-script based approach, it is not anticipated to differ on macOS). So trying three platforms on CI is a secondary supporting strategy to catch possible problems.

Since Windows was failing due to the shell called by `just` treating `\` path separators as escape characters (the main effect of which was that they were removed in quote removal). This was not a problem for internal-tools, but it broke the `just` recipe at least in some ways of running it on Windows, including CI and at least one local system tested.

And not on macOS on Windows. (If it is to be kept as a separate job, then possibly caching should be enabled for it. But I suspect it can be moved into another existing job, most likely the `test` job, which already runs a number of distinct checks via `just`.)

This adds the `check-mode` recipe to the `test` recipe and, more significantly because of its effect on CI, to the `ci-test` recipe. This causes the CI `test` job to run the `check-mode` recipe (and thus `it check-mode` through it). While the separate `check-mode` CI job was useful for developing the tool, it imposes extra work, due to the need to build dependencies that are, to a substantial extent, shared with those already present (and cached) for the `test` job. Now that it is run via the `test` CI job, the separate `check-mode` CI job is removed.

Byron · 2024-11-29T15:19:12Z

Thanks a lot, I think that's already miles beyond in terms of maintainability than what we had with the shell-script that it replaces.

Because it is run as part of the CI test job, and it dependencies mostly overlap with others used there, I think this probably solves the problem of dependency installation and caching anticipated in #1709 (comment), without requiring jobs to share caches.

I think so too - that's great.

The Rust version already, in my opinion, has significant advantages over the shell script. So rather than rewriting it immediately to use gix-index instead of git subprocesses, I'm opening a PR to replace the script with this.

I am happy to support you with any increments you wish to take. For a moment, I was concerned about regex, but it turns out it's already used in the tree as well in order to support the regex-based search through commit messages that revspecs can do.

I'd definitely be very interested to witness the final transformation to when gix-index is used. If you choose to go that way, I recommend to use the highest possible abstraction via gix::open().

First, is takein and droping stdout from a Child process created with Stdout::piped(), the recommended way to break a pipe before waiting on the subprocess?

Unfortunately, std::fs::File and friends don't have an explicit close() method, which is a huge oversight in my book. There is even an issue for that here. Thus dropping or otherwise consuming them is the only way if the io-close crate isn't used.

Second, is it okay that in situations where I am treating it as an error, I allow Child process objects to be dropped implicitly without ever being waited on?

I think that's a very common hazard, as the child processes may just keep running in the background for all kinds of reasons. In a server-application this might be a problem, but for this program I'd prefer simple code over covering all eventualities.

EliahKagan · 2024-12-21T15:23:18Z

I think that's a very common hazard, as the child processes may just keep running in the background for all kinds of reasons. In a server-application this might be a problem, but for this program I'd prefer simple code of covering all eventualities.

My reading of this is that you prefer simple code over code covering all eventualities, i.e., that I need not and should not cover that eventuality in this particular case, if doing so would make the code significantly more complicated (which I think it would, in the absence of other changes).

However, if that is a misreading and you do want that covered, please let me know.

Byron · 2024-12-21T17:53:42Z

Thanks for clarifying, that's indeed what I tried to say. Thanks to MacOS, luckily I can blame overzealous auto-compete 😅. I have also corrected my comment so it's more clear if anybody gets to read it in future.

EliahKagan added 7 commits November 28, 2024 21:06

Partial sketch of check_mode "translation" to Rust

967b9df

More of a sketch of check_mode "translation" to Rust

76268f8

All parts are included, but this is not yet believed to be correct.

Fix path formatting; refactor slightly for clarity

b17935b

Paths of files/blobs with mismatches had been shown literally, even when containing unusual characters, and had been followed by two newlines instead of one. This fixes that, and also includes some small stylistic refactoring.

EliahKagan changed the title ~~Add a check-mode internal-tools subcommand, replacing check-mode.sh~~ Add internal-tools check-mode, replacing check-mode.sh Nov 29, 2024

EliahKagan and others added 3 commits November 29, 2024 04:38

Copyedit and clarify CheckMode doc comment

e57167f

Eliminate a variable that didn't make things clearer

5a803b3

minor refactors

7e8aedf

Byron merged commit c146b7a into GitoxideLabs:main Nov 29, 2024
20 checks passed

EliahKagan deleted the run-ci/mode-it branch November 29, 2024 15:39

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add internal-tools `check-mode`, replacing `check-mode.sh` #1711

Add internal-tools `check-mode`, replacing `check-mode.sh` #1711

EliahKagan commented Nov 29, 2024 •

edited

Loading

Byron commented Nov 29, 2024 •

edited

Loading

EliahKagan commented Dec 21, 2024

Byron commented Dec 21, 2024

	# Extract the first two bytes (or less if shorter) and put in symbolic form.
	symbolic_magic="$(git cat-file blob "$oid" \| od -N2 -An -ta)"

	let mut stdout = child.stdout.take().expect("should have captured stdout");
	let count = stdout.read(&mut buf).context("Error reading data from blob")?;
	drop(stdout); // Let the pipe break rather than waiting for the rest of the blob.

	// TODO: Maybe check status? On Unix, it should be 0 or SIGPIPE. Not sure about Windows.
	_ = child.wait().context("Failure running `git` subprocess to read blob")?;

Add internal-tools check-mode, replacing check-mode.sh #1711

Add internal-tools check-mode, replacing check-mode.sh #1711

Conversation

EliahKagan commented Nov 29, 2024 • edited Loading

Overview

Testing

Broad design considerations

Specific comparison to the shell script

Specific implementation questions

Byron commented Nov 29, 2024 • edited Loading

EliahKagan commented Dec 21, 2024

Byron commented Dec 21, 2024

Add internal-tools `check-mode`, replacing `check-mode.sh` #1711

Add internal-tools `check-mode`, replacing `check-mode.sh` #1711

EliahKagan commented Nov 29, 2024 •

edited

Loading

Byron commented Nov 29, 2024 •

edited

Loading