-
-
Notifications
You must be signed in to change notification settings - Fork 311
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
run git config -l
in temp dir when looking up system config
#1523
Conversation
Commit a9a3545 made `gix::config::File::from_globals()` try to find the system config (typically `/etc/config`) by subprocessing to `git config -l --show-origin`. That command takes about 500 ms when run in certain directories in our VFS at Google. Since we don't need anything but the system config, let's run the command in the system's temporary directory instead. I also added `--system` for good measure. I've confirmed that just adding `--system` is not enough, however - `git config -l --system --show-origin` is still slow in our VFS.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot, a much appreciated change and: "Oh wow, this code runs at Google now!"
With that said, since --system
doesn't make a performance difference and fails on MacOS (see below), I will have to remove that before merging.
❯ which git
/usr/bin/git
gitbutler ( missing-project) [$] via
❯ git --version
git version 2.39.3 (Apple Git-146)
gitbutler ( missing-project) [$] via
❯ git config -l --system --show-origin
fatal: unable to read config file '/etc/gitconfig': No such file or directory
``` ❯ which git /usr/bin/git gitbutler ( missing-project) [$] via ❯ git --version git version 2.39.3 (Apple Git-146) gitbutler ( missing-project) [$] via ❯ git config -l --system --show-origin fatal: unable to read config file '/etc/gitconfig': No such file or directory ```
The nature of the macOS failure is interesting:
It has the usual value of I cannot create |
It works on my Mac Actually, |
Thanks for the quick review. I hope the lack of GitHub checks didn't lead to anything getting broken (doesn't look like it AFAICT). |
@Byron you can add a dummy job "test-pass": |
This tests that `EXE_INFO` never picks up a local scope configuration file when looking for the configuration associated with the installation -- not even if the system and global scopes provide no variables, such that the first row of of `git config -l` output would usually be a local scope entry. This is a regression test for a second bug that GitoxideLabs#1523 ended up fixing (in addition to the performance bug that motivatd that PR). By running `git config -l` in a directory that `git` is unlikely to treat as a local repository, that avoids reading local scope configuration even if there are no broader-scope config files for `git` to read from. This is important because local configuration variables may be unsuitable to treat as if they are the installation configuration, for reasons that are more important than distinguishing between installation (usually system scope) and global scope variables. Consider, for example, if `gix clone` is run from inside a repository. For operations on other repositories, including clones, that may be undertaken from within a first repository, the first repository's local configuration variables are not allowed to have any effect. This is important for correctness and, in some cases, for security. But if the local configuration file, usually `.git/config`, is wrongly recognized to be the configuration file associated with the `git` installation, then values from it will be used. Then, among various other possible incorrect outcomes: - Secrets from the repository `gix clone` is run from can be leaked to the remote of the repository being cloned, for example if authentication credentials are stored in `http.extraHeader`. - More rarely, the configuration in the repository `gix clone` is run from may expose information related to the repository being cloned. For example, a custom value of `core.sshCommand` with extra logging may be configured in the first repository, but where logging authentication or other information from cloning the second repository would unexpectedly expose it as well. While GitoxideLabs#1523 fixed this in nearly all practical situations, there are some situations where there could still be a problem, such that the way `git config -l` commands are run should be further hardened. (The test added here is to aid in testing such changes.) - If the system has an old version of `git` without a patch for GHSA-j342-m5hw-rr3v or other vulnerabilities where `git` would perform operations on a repository owned by another user, then `git config -l` may treat a shared temporary directory as a `git` repository, if another user on the system has created and populated a `.git` directory there. `env::temp_dir()` almost always gives a shared temporary directory on Unix-like systems, and in rare cases can give one on Windows. The other user on the system may, on some systems, even be an account representing a low-privileged service/daemon. - I think it is possible, though I do not know of cases, for a downstream distribution to patch such vulnerabilities in `git`, but do so in a way where `git config -l` commands refuse to display any configuration when run from a repository owned by another user (and not listed in `safe.directory`). If this were to happen, then we would fail to discover a path to the config file associated with the `git` installation. I expect that rarely to happen because patches are usually backported rather than being written in a different way, and `git` does not have this problem. - In the rare case that the user has made the temporary directory `env::temp_dir()` a `git` repository, this should still ideally not cause that local scope configuration to be used here. - The `TMPDIR` environment variable on a Unix-like system, or the `TMP` and/or `TEMP` environment variables on Windows, may be set to incorrect values (such as directories that do not exist or should not be used, or the empty string), or unset. Then `env::temp_dir()` may not find a suitable directory, and any of the above problems could potentially occur. These are all unlikely compared to the situation before, so even if this is hardened further, the biggest improvement was in GitoxideLabs#1523. The test introduced in this commit passes with the code as it currently stands, and fails when the `current_dir(env::temp_dir())` line GitoxideLabs#1523 added is commented out, demonstrating that it works as a regression test. However, the test is not quite done: - It does not detect the bug on macOS. This is to be expected, because on macOS there are usually "unknown"-scope values at the beginning of the output, and these should be (and are) detected as belonging to the installation. This is why it is not correct to pass `--system`. See 20ef4e9, 6b1c243, and: GitoxideLabs#1523 (comment) This is probably acceptable. - The test wrongly fails on macOS, because it thinks correct values like /Library/Developer/CommandLineTools/usr/share/git-core/gitconfig from the "unknown" scope on macOS may be from the local scope. As currently written, the test expects that when there is nothing from the system and global scopes, `EXE_INFO` returns `None`, rather than `Some(path)` where `path` is from the macOS "unknown" scope associated with the installation. This false positive will have to be fixed, so that the test suite can pass on macOS, and so the test is at least somewhat useful on macOS (while possibly being more precise on other OSes).
When invoking `git` to find the configuration file path associated with the `git` installation itself, this sets `GIT_DIR` to a path that cannot be a `.git` directory for any repository, to keep `git config -l` from including any local scope entries in the output of the `git config -l ...` command that is used to find the origin for the first Git configuration variable. Specifically, a path to the null device is used. This is `/dev/null` on Unix and `NUL` on Windows. This is not a directory, and when treated as a file it is always treated as empty: reading from it, if successful, reaches end-of-file immediately. This problem is unlikely since GitoxideLabs#1523, which caused this `git` invocation to use a `/tmp`-like location (varying by system and environment) as its current working directory. Although the goal of that change was just to improve performance, it pretty much fixed the bug where local-scope configuration could be treated as installation-level configuration when no configuration variables are available from higher scopes. This change further hardens against two edge cases: - If the `git` command is an old and unpatched vulnerable version in which `safe.directory` is not yet implemented, or in which GHSA-j342-m5hw-rr3v or other vulnerabilities where `git` would perform operations on untrusted local repositories owned by other users are unpatched, then a `.git` subdirectory of a shared `/tmp` or `/tmp`-like directory could be created by another account, and its local configuration would still have been used. (This is not a bug in gitoxide per se; having vulnerable software installed that other software may use is inherently insecure. But it is nice to offer a small amount of protection against this when readily feasible.) - If the `/tmp`-like location is a Git repository owned by the current user, then its local configuration would have been used. Any path guaranteed to point to a nonexistent entry or one that is guaranteed to be (or to be treated as) an empty file or directory should be sufficient here. Using the null device, even though it is not directory-like, seems like a reasonably intuitive way to do it. A note for Windows: There is more than one reasonable path to the null device. One is DOS-style relative path `NUL`, as used here. One of the others, which `NUL` in effect resolves to when opened, is the fully qualified Windows device namespace path `\\.\NUL`. I used the former here to ensure we avoid any situation where `git` would misinterpret a `\` in `\\.\NUL` in a POSIX-like fashion. This seems unlikely, and it could be looked into further if reasons surface to prefer `\\.\NUL`. One possible reason to prefer `\\.\NUL` is that which names are treated as reserved legacy DOS device names changes from version to version of Windows, with Windows 11 treating some of them as ordinary filenames. However, while this affects names such as `CON`, it does not affect `NUL`, at least written unsuffixed. I'm not sure if any Microsoft documentation has yet been updated to explain this in detail, but see: - dotnet/docs#41193 - python/cpython#95486 (comment) - python/cpython#95486 (comment) At least historically, it has been possible on Windows, though rare, for the null device to be absent. This was the case on Windows Fundamentals for Legacy PCs (WinFPE). Even if that somehow were ever to happen today, this usage should be okay, because attempting to open the device would still fail rather than open some other file (as may even be happening in Git for Windows already), the name `NUL` would still presumably be reserved (much as the names `COM?` where `?` is replaced with a Unicode superscript 1, 2, or 3 are reserved even though those devices don't really exist), and I think `git config -l` commands should still shrug off the error opening the file and give non-local-scope configuration, as it does when `GIT_DIR` is set to a nonexistent location.
When running `git config -l ...` to find the configuration file path associated with the `git` installation itself, the current working directory for the subprocess was set to the current directory prior to GitoxideLabs#1523, and to `/tmp` or a `/tmp`-like directory since GitoxideLabs#1523 (which improved performance and security). This builds on GitoxideLabs#1523, as well as on subsequent changes to run `git` in a way that its behavior depends less on its CWD, by making an even more robust choice of CWD for the subprocess, so that the CWD is less likely to be deeply nested or on network storage; more likely to exist; and, on Unix-like systems, less likely to contain a `.git` entry (though a `git` with security updates should refuse to take any configuration from such a repository unless it is owned by the user). Due to a combination of other measures that harden against malicious or unusual contents (especially setting `GIT_DIR`), the most significant benefit of this change is to fix the problem that a nonexistent temp dir would prevent the command from succeeding. The main way that could happen is if `TMPDIR` on Unix-like systems, or `TMP` or `TEMP` on Windows, is set to an incorrect value. Because these variables are sometimes reasonable to customize for specific purposes, it is plausible for them to be set to incorrect values by accident. Except on Windows, this always uses `/` as the CWD for the subprocess. On Windows, we use the Windows directory (usually `C:\Windows`) rather than the root of the system drive (usually `C:\`), because: - We are currently obtaining this information from environment variables, and it is possible for our own parent process to pass down an overly sanitized environment. Although this can be so sanitized we cannot find the Windows directory, this is less likely to occur than being unable to find the root of the system drive. This due to moderately broad awareness that the `SystemRoot` environment variable (which, somewhat confusingly, holds the path of the Windows directory, not the root of the system drive) should be preserved even when clearing most other variables. Some libraries will even automatically preserve `SystemRoot` when clearing others or restore it. For example: * https://go-review.googlesource.com/c/go/+/174318 As far as I know, such treatment of `SystemDrive` is less common. And also these two considerations, which are minor by comparison: - Under the current behavior of `env::temp_dir()`, which is now a fallback if we cannot determine the Windows directory, we already fall back to the Windows directory evenutally, if temp dir related environment variables are also unset. This is because `env::temp_dir()` usually calls `GetTempDir2` in the Windows API, which implements that fallback behavior (after first trying the user's user profile directory). Avoiding adding yet another place to fall back to that would not otherwise be attempted slightly decreases behavioral complexity, and there is no reason to think a directory like `C:\` would work when a directory like `C:\Windows` doesn't. - The root of the system drive on a Windows system usually permits limited user accounts to create new directories there, so a directory like `C:\` on Windows actually has most of the disadvantages of a location like `/tmp` on a Unix-like system. * GHSA-vw2c-22j4-2fh2 * GHSA-mgvv-9p9g-3jv4 This is actually a much less significant reason to prefer a directory like `C:\Windows` to a directory like `C:\` than it might seem. After all, if `C:\.git` exists and and `git` uses it when run from `C:\`, then `git` would usually also use it when run from `C:\Windows` (and from numerous other locations)! However, the reason there is still a small reason to prefer a location like `C:\Windows` to a location like `C:\` is that, if a system has a vulnerable `git` but a user or system administrator has sought to work around it by listing `C:\` in `GIT_CEILING_DIRECTORIES`, then that may keep `git` from traversing upward into `C:\`, but it would not keep `C:\` from being used if that is where we already are. An even more significant reason this motivation is a minor one is that the other measures we are taking, including setting `GIT_DIR`, should be sufficient to avoid at least the security dimension of the problem, which arises from actually using the configuration from a repo that is discovered. The reason we do not prefer the user's user profile directory is: - The user profile directory may be more deeply nested. - The user profile directory may sometimes be on slow network storage when the discovered Windows directory is not. - In some situations, the user profile directory does not actually exist, or does not exist yet. - Overly sanitized environments are more likely to lack the `USERPROFILE` vairable than the `SystemRoot` variable. - Users may occasionally choose to have their entire user profile directory be a Git repository. - It's no easier to avoid the problem of using `C:\.git` in a user profile directory than in `C:\Windows`: they're usually both under `C:\`, and are both not the same as `C:\`. (If the user profile directory is a repository, then that will avoid that problem, yet be its own problem, if not for other measures that prevent both.)
It doesn't have to have any configuration variables set in the command scope, though in practice it always should (with versions of `git` with which `gix-testtools` is compatible) because we are setting them with `GIT_CONFIG_{COUNT,KEY,VALUE}`, which, like `-c`, sets configuration variables in the command scope. But it must not have any configuration variables set in other scopes. Of course, actual fixture scripts, when run, most often create Git repositories, in which case there will in practice almost always be local scope configuration for the script. The main point here is to make sure we are omitting variables from the global and system scopes, as we have already been doing (and testing for), and also omitting them from the other high scopes, such as the "unknown" scope associated with an Apple Git installation that is higher than the system scope and unaffected by `GIT_CONFIG_SYSTEM` but affected by `GIT_CONFIG_NOSYSTEM`. Like the tests that preceded it, this test creates a new empty temporary directory to set as the CWD of the test `git` command configured by `configure_command`. As such, there should be no local scope configuration, because this directory should be a subdirectory of a system `/tmp`-like directory. A `/tmp`-like directory can technically be a Git repository, and can even contribute configuration if the repository is owned by the current user or something is keeping `safe.directory` protections from excluding it. When the goal is to *get* configuration from scopes higher than the local scope, it is worth taking steps to prevent this (which for `gix-path` is achieved by the combination of GitoxideLabs#1523 and GitoxideLabs#1567). But in the test suite, temporary directories should preferably be in locations that are only `git` repositories (owned by the curent user) in the unusual situation that this is desired, and supporting tooling such as `git` should be at recent enough versions to really support the usage that the test suite makes of it. Furthermore, while it is possible to clear the local scope in this new test -- such as by using, as the command's CWD, a subdirectory of a directory that is, in the command's environment, prepended to `GIT_CEILING_DIRECTORIES` -- this would tend to hide problems in the actual `gix-testtools` code that this is testing. If any such measure needs to be taken, it would be better done in code that uses `configure_command` (or in `configure_command` itself it it is widely needed and generally acceptable), rather than in tests of `configure_command`. For these reasons, the test is, at least for now, deliberately written in such a way that it will fail if the directory we get from `tempfile::TempDir::new()` is somehow a Git repository. This commit consolidates the two preceding test cases into this one. So now there is a single test that `configure-command` both: - Excludes global and system scope config, as it already did. - Also excludes other high-scoped config, which it doesn't yet do. Thus, this new test is expected to fail on most macOS systems (where `git` is Apple Git and the installation-associated "unknown" scope configuration file is nonempty), until that is fixed.
This pull request, in addition to improving performance, fixed the vulnerability CVE-2024-45305 (RUSTSEC-2024-0367, GHSA-v26r-4c9c-h3j6). (It may make sense to edit the PR description to mention this, though either way should be fine because this comment can itself be found by searching. Further related hardening, as well as a fix for a separate vulnerability, is included in #1567, but the fix for CVE-2024-45305 is here, and the changes there build on this.) |
Commit a9a3545 made
gix::config::File::from_globals()
try to find the system config (typically/etc/config
) by subprocessing togit config -l --show-origin
. That command takes about 500 ms when run in certain directories in our VFS at Google. Since we don't need anything but the system config, let's run the command in the system's temporary directory instead.I also added
--system
for good measure. I've confirmed that just adding--system
is not enough, however -git config -l --system --show-origin
is still slow in our VFS.