Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add internal-tools check-mode, replacing check-mode.sh #1711

Merged
merged 10 commits into from
Nov 29, 2024

Conversation

EliahKagan
Copy link
Member

@EliahKagan EliahKagan commented Nov 29, 2024

Overview

This adds an it check-mode subcommand and modifies the justfile recipe check-mode to use it, as well as to have the test and ci-test recipes run it.

The approach taken here is to base the check-mode subcommand on the approach in the script. While internal-tools can use gix crates, this doesn't actually use them: it uses an external git in the same way the shell script it is based on did (though I would be interested in modifying it to take use gix-index in the future; see below).

The new Rust tool replaces the shell script, which is removed. Because the ci-test recipe runs the check-mode recipe, and the CI test job runs the ci-test recipe, a separate CI check-mode job is no longer needed; it is removed as well.

Because it is run as part of the CI test job, and it dependencies mostly overlap with others used there, I think this probably solves the problem of dependency installation and caching anticipated in #1709 (comment), without requiring jobs to share caches.

Testing

I tested it in the same way I tested that shell script, except that I did not do manual testing with macOS, since I don't anticipate specific problems with macOS for this tool (an advantage of using Rust), and since I didn't have easy interactive access to a macOS system while working on it.

  • My testing did, as there, include testing it on multiple platforms on CI, even though that does not remain at the tip of the branch. (See commit history.)

  • I did also test with the pathological files I had tried before; I used git checkout and git revert --no-commit commands to retrieve them, rather than recommitting them (they're already in the repo history, after all), which is why they do not appear in the recent commit history. In particular:

    git revert --no-commit ed757ea
    cargo run --bin=it -- cm
    git revert --abort
    git revert --no-commit a9b3b48
    cargo run --bin=it -- cm
    git revert --abort

Broad design considerations

I view this Rust tool, foreshadowed in discussions in #1589, #1708, and #1709, as an improvement over the script. However, further improvement will be possible. The approach taken here takes an approach you mentioned in #1589 (comment):

If a tool were to use Git to learn about executable bits, I'd think it could even use git.exe if it doesn't have access (or doesn't want to use) gix-index.

Except:

  • It's a bit better than that, in that it is able to be written platform-agnostically and use git. (In the way git is being run here, it's not necessary to special-case git.exe for Windows.)
  • It's a bit worse than that, because we do have access to gix-index, and I also feel that it should be used. But I figured it may be a good idea, as well as interesting, to start by "translating" the script to Rust.

In addition, it might be useful to have it accept options, though I'm not sure what I would control besides verbosity: there is probably not a likely use case for scanning only part of the repository. In principle it would be good to be able to specify another repository, but unlike with the other tools, which operate on repositories to help produce test fixtures, the purpose of this tool is to operate on the gitoxide repository.

The Rust version already, in my opinion, has significant advantages over the shell script. So rather than rewriting it immediately to use gix-index instead of git subprocesses, I'm opening a PR to replace the script with this.

Specific comparison to the shell script

The advantages over the preceding shell script way are:

  • It's easier to tell from the code what is going on, and to reason about its correctness.
  • Errors are explicit, with the thiserror context method called to describe all errors, except for those that are anticipated only to be possible due to bugs in this tool itself, which use expect.
  • A bug in the shell script would potentially have caused it to fail with an error when reading from large blobs, because set -e combined with set -o pipefail can make a command like x | y fail the script, when y completes successfully before reading all of its input, causing x to attempt to write to a broken pipe.
  • It is probably faster, due to buffering and tailored choices about how to read from and interact with subprocesses. I regard this to be, by far, the least important benefit, though, since this code does not really need to be fast.

Regarding the piping bug, the implicated code in the shell script is:

# Extract the first two bytes (or less if shorter) and put in symbolic form.
symbolic_magic="$(git cat-file blob "$oid" | od -N2 -An -ta)"

If the blob is big enough, then git cat-file may have enough left to write after od -N2 terminates that the git cat-file is terminated with SIGPIPE. This is a situation where we don't expect git to write to standard output unless the data it writes are correct and usable, and where terminating normally with an exit status of 0 is one of two reasonable results.

Specific implementation questions

Rewriting it in Rust immediately caused me to be confronted with how and how much to read, whether to close pipes, whether and when to wait, and whether and when to allow a child process to possibly become a zombie and never be waited on by the it process. There are two aspects that I am hoping for input on, in case they are not as reasonable as I hope.

First, is takein and droping stdout from a Child process created with Stdout::piped(), the recommended way to break a pipe before waiting on the subprocess?

let mut stdout = child.stdout.take().expect("should have captured stdout");
let count = stdout.read(&mut buf).context("Error reading data from blob")?;
drop(stdout); // Let the pipe break rather than waiting for the rest of the blob.
// TODO: Maybe check status? On Unix, it should be 0 or SIGPIPE. Not sure about Windows.
_ = child.wait().context("Failure running `git` subprocess to read blob")?;

Second, is it okay that in situations where I am treating it as an error, I allow Child process objects to be dropped implicitly without ever being waited on? In non-error conditions--including conditions that we will later report as an error because they reveal the #!/+x mismatches we're look for, but that are not errors in operation--for both the outer-loop read of git ls-files -sz and the inner runs of git cat-file, I do call wait(). But when propagaing Errs out with ?, I do not.

I ask because, even though the internal-tools are not published to a crate registry, I know it is possible for their code to be used by other code, and not just via the subcommands they implement. For this reason, I try to avoid behavior that would be severely wrong in such usage, such as changing directory, but I have not taken care to wait on subprocesses when bailing out of the entire subcommand with an error.

My guess is that this is fine, but it's the sort of thing I would ordinarily make a note of, so I figured I should do so in case there are use cases I don't anticipate.

All parts are included, but this is not yet believed to be correct.
Paths of files/blobs with mismatches had been shown literally, even
when containing unusual characters, and had been followed by two
newlines instead of one. This fixes that, and also includes some
small stylistic refactoring.
- Delete the old `etc/check-mode.sh` script.
- Update `justfile` recipe `check-mode` to use internal-tools.
- Modify and expand `check-mode` CI job to use internal-tools.
- Temporarily make `check-mode` job a matrix job to test platforms.

The latter of those changes should be undone once things looks like
they are working. I have manually tested `it check-mode` on Arch
Linux and Windows in the same ways `check-mode.sh` was tested (and
unlike with a shell-script based approach, it is not anticipated to
differ on macOS). So trying three platforms on CI is a secondary
supporting strategy to catch possible problems.
Since Windows was failing due to the shell called by `just`
treating `\` path separators as escape characters (the main effect
of which was that they were removed in quote removal).

This was not a problem for internal-tools, but it broke the `just`
recipe at least in some ways of running it on Windows, including CI
and at least one local system tested.
And not on macOS on Windows.

(If it is to be kept as a separate job, then possibly caching
should be enabled for it. But I suspect it can be moved into
another existing job, most likely the `test` job, which already
runs a number of distinct checks via `just`.)
This adds the `check-mode` recipe to the `test` recipe and, more
significantly because of its effect on CI, to the `ci-test` recipe.

This causes the CI `test` job to run the `check-mode` recipe (and
thus `it check-mode` through it).

While the separate `check-mode` CI job was useful for developing
the tool, it imposes extra work, due to the need to build
dependencies that are, to a substantial extent, shared with those
already present (and cached) for the `test` job. Now that it is
run via the `test` CI job, the separate `check-mode` CI job is
removed.
@EliahKagan EliahKagan changed the title Add a check-mode internal-tools subcommand, replacing check-mode.sh Add internal-tools check-mode, replacing check-mode.sh Nov 29, 2024
@Byron
Copy link
Member

Byron commented Nov 29, 2024

Thanks a lot, I think that's already miles beyond in terms of maintainability than what we had with the shell-script that it replaces.

Because it is run as part of the CI test job, and it dependencies mostly overlap with others used there, I think this probably solves the problem of dependency installation and caching anticipated in #1709 (comment), without requiring jobs to share caches.

I think so too - that's great.

The Rust version already, in my opinion, has significant advantages over the shell script. So rather than rewriting it immediately to use gix-index instead of git subprocesses, I'm opening a PR to replace the script with this.

I am happy to support you with any increments you wish to take. For a moment, I was concerned about regex, but it turns out it's already used in the tree as well in order to support the regex-based search through commit messages that revspecs can do.

I'd definitely be very interested to witness the final transformation to when gix-index is used. If you choose to go that way, I recommend to use the highest possible abstraction via gix::open().

First, is takein and droping stdout from a Child process created with Stdout::piped(), the recommended way to break a pipe before waiting on the subprocess?

Unfortunately, std::fs::File and friends don't have an explicit close() method, which is a huge oversight in my book. There is even an issue for that here. Thus dropping or otherwise consuming them is the only way if the io-close crate isn't used.

Second, is it okay that in situations where I am treating it as an error, I allow Child process objects to be dropped implicitly without ever being waited on?

I think that's a very common hazard, as the child processes may just keep running in the background for all kinds of reasons. In a server-application this might be a problem, but for this program I'd prefer simple code over covering all eventualities.

@Byron Byron merged commit c146b7a into GitoxideLabs:main Nov 29, 2024
20 checks passed
@EliahKagan EliahKagan deleted the run-ci/mode-it branch November 29, 2024 15:39
@EliahKagan
Copy link
Member Author

I think that's a very common hazard, as the child processes may just keep running in the background for all kinds of reasons. In a server-application this might be a problem, but for this program I'd prefer simple code of covering all eventualities.

My reading of this is that you prefer simple code over code covering all eventualities, i.e., that I need not and should not cover that eventuality in this particular case, if doing so would make the code significantly more complicated (which I think it would, in the absence of other changes).

However, if that is a misreading and you do want that covered, please let me know.

@Byron
Copy link
Member

Byron commented Dec 21, 2024

Thanks for clarifying, that's indeed what I tried to say. Thanks to MacOS, luckily I can blame overzealous auto-compete 😅. I have also corrected my comment so it's more clear if anybody gets to read it in future.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants