Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

run git config -l in temp dir when looking up system config #1523

Merged
merged 2 commits into from
Aug 15, 2024

Conversation

martinvonz
Copy link
Contributor

Commit a9a3545 made gix::config::File::from_globals() try to find the system config (typically /etc/config) by subprocessing to git config -l --show-origin. That command takes about 500 ms when run in certain directories in our VFS at Google. Since we don't need anything but the system config, let's run the command in the system's temporary directory instead.

I also added --system for good measure. I've confirmed that just adding --system is not enough, however - git config -l --system --show-origin is still slow in our VFS.

Commit a9a3545 made `gix::config::File::from_globals()` try to
find the system config (typically `/etc/config`) by subprocessing to
`git config -l --show-origin`. That command takes about 500 ms when
run in certain directories in our VFS at Google. Since we don't need
anything but the system config, let's run the command in the system's
temporary directory instead.

I also added `--system` for good measure. I've confirmed that just
adding `--system` is not enough, however - `git config -l --system
--show-origin` is still slow in our VFS.
Byron
Byron previously approved these changes Aug 15, 2024
Copy link
Member

@Byron Byron left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot, a much appreciated change and: "Oh wow, this code runs at Google now!"

With that said, since --system doesn't make a performance difference and fails on MacOS (see below), I will have to remove that before merging.


❯ which git
/usr/bin/git

gitbutler ( missing-project) [$] via 
❯ git --version
git version 2.39.3 (Apple Git-146)

gitbutler ( missing-project) [$] via 
❯ git config -l --system --show-origin
fatal: unable to read config file '/etc/gitconfig': No such file or directory

```
❯ which git
/usr/bin/git

gitbutler ( missing-project) [$] via 
❯ git --version
git version 2.39.3 (Apple Git-146)

gitbutler ( missing-project) [$] via 
❯ git config -l --system --show-origin
fatal: unable to read config file '/etc/gitconfig': No such file or directory
```
@EliahKagan
Copy link
Member

EliahKagan commented Aug 15, 2024

The nature of the macOS failure is interesting:

~> git config -l --show-origin --show-scope
unknown file:/Library/Developer/CommandLineTools/usr/share/git-core/gitconfig   credential.helper=osxkeychain
unknown file:/Library/Developer/CommandLineTools/usr/share/git-core/gitconfig   init.defaultbranch=main
global  file:/Users/ek/.gitconfig       user.name=Eliah Kagan
global  file:/Users/ek/.gitconfig       user.email=degeneracypressure@gmail.com
global  file:/Users/ek/.gitconfig       core.precomposeunicode=false

It has the usual value of /etc/gitconfig as the path of its system configuration file, which usually does not exist, and it has its installation configuration in a separate file (whose scope is shown as unknown).

I cannot create /etc/gitconfig on this system, however, so I don't know which has precedence when both exist.

@martinvonz
Copy link
Contributor Author

With that said, since --system doesn't make a performance difference and fails on MacOS (see below), I will have to remove that before merging.

It works on my Mac git config -l --system but that seems to be because we have a custom git at Google.

Actually, --system was not what I thought it was anyway, so I'm glad you're removing that! :) On my Linux machine, --system is /usr/share/git-core/config, while the first ("installation") path in git config -l --show-origin is /etc/gitconfig.

@Byron Byron merged commit 83c9de0 into GitoxideLabs:main Aug 15, 2024
19 checks passed
@Byron
Copy link
Member

Byron commented Aug 15, 2024

Sorry for the early merge - apparently the "Require status checks before merging" option does not like I thought require all checks to pass (what I want), but is a no-op unless a check is chosen. Now I would have to select all checks and keep them in sync with what's actually there, which seems inconvenient at best.

Screenshot 2024-08-15 at 17 27 38

And I so wanted to have auto-merge 😭.

@martinvonz
Copy link
Contributor Author

Thanks for the quick review. I hope the lack of GitHub checks didn't lead to anything getting broken (doesn't look like it AFAICT).

@NobodyXu
Copy link
Contributor

@Byron you can add a dummy job "test-pass":

https://github.com/cargo-bins/cargo-binstall/blob/c4abcb90a5fac18f5f8744cbf7bd11b00a37ca41/.github/workflows/ci.yml#L372

EliahKagan added a commit to EliahKagan/gitoxide that referenced this pull request Aug 30, 2024
This tests that `EXE_INFO` never picks up a local scope
configuration file when looking for the configuration associated
with the installation -- not even if the system and global scopes
provide no variables, such that the first row of of
`git config -l` output would usually be a local scope entry.

This is a regression test for a second bug that GitoxideLabs#1523 ended up
fixing (in addition to the performance bug that motivatd that PR).
By running `git config -l` in a directory that `git` is unlikely to
treat as a local repository, that avoids reading local scope
configuration even if there are no broader-scope config files for
`git` to read from.

This is important because local configuration variables may be
unsuitable to treat as if they are the installation configuration,
for reasons that are more important than distinguishing between
installation (usually system scope) and global scope variables.

Consider, for example, if `gix clone` is run from inside a
repository. For operations on other repositories, including clones,
that may be undertaken from within a first repository, the first
repository's local configuration variables are not allowed to have
any effect. This is important for correctness and, in some cases,
for security. But if the local configuration file, usually
`.git/config`, is wrongly recognized to be the configuration file
associated with the `git` installation, then values from it will
be used. Then, among various other possible incorrect outcomes:

- Secrets from the repository `gix clone` is run from can be
  leaked to the remote of the repository being cloned, for example
  if authentication credentials are stored in `http.extraHeader`.

- More rarely, the configuration in the repository `gix clone` is
  run from may expose information related to the repository being
  cloned. For example, a custom value of `core.sshCommand` with
  extra logging may be configured in the first repository, but
  where logging authentication or other information from cloning
  the second repository would unexpectedly expose it as well.

While GitoxideLabs#1523 fixed this in nearly all practical situations, there
are some situations where there could still be a problem, such that
the way `git config -l` commands are run should be further
hardened. (The test added here is to aid in testing such changes.)

- If the system has an old version of `git` without a patch for
  GHSA-j342-m5hw-rr3v
  or other vulnerabilities where `git` would perform operations on
  a repository owned by another user, then `git config -l` may
  treat a shared temporary directory as a `git` repository, if
  another user on the system has created and populated a `.git`
  directory there.

  `env::temp_dir()` almost always gives a shared temporary
  directory on Unix-like systems, and in rare cases can give one on
  Windows. The other user on the system may, on some systems, even
  be an account representing a low-privileged service/daemon.

- I think it is possible, though I do not know of cases, for a
  downstream distribution to patch such vulnerabilities in `git`,
  but do so in a way where `git config -l` commands refuse to
  display any configuration when run from a repository owned by
  another user (and not listed in `safe.directory`). If this were
  to happen, then we would fail to discover a path to the config
  file associated with the `git` installation.

  I expect that rarely to happen because patches are usually
  backported rather than being written in a different way, and
  `git` does not have this problem.

- In the rare case that the user has made the temporary directory
  `env::temp_dir()` a `git` repository, this should still ideally
  not cause that local scope configuration to be used here.

- The `TMPDIR` environment variable on a Unix-like system, or the
  `TMP` and/or `TEMP` environment variables on Windows, may be set
  to incorrect values (such as directories that do not exist or
  should not be used, or the empty string), or unset. Then
  `env::temp_dir()` may not find a suitable directory, and any of
  the above problems could potentially occur.

These are all unlikely compared to the situation before, so even if
this is hardened further, the biggest improvement was in GitoxideLabs#1523.

The test introduced in this commit passes with the code as it
currently stands, and fails when the `current_dir(env::temp_dir())`
line GitoxideLabs#1523 added is commented out, demonstrating that it works as
a regression test.

However, the test is not quite done:

- It does not detect the bug on macOS. This is to be expected,
  because on macOS there are usually "unknown"-scope values at the
  beginning of the output, and these should be (and are) detected
  as belonging to the installation. This is why it is not correct
  to pass `--system`. See 20ef4e9, 6b1c243, and:
  GitoxideLabs#1523 (comment)

  This is probably acceptable.

- The test wrongly fails on macOS, because it thinks correct values
  like /Library/Developer/CommandLineTools/usr/share/git-core/gitconfig
  from the "unknown" scope on macOS may be from the local scope. As
  currently written, the test expects that when there is nothing
  from the system and global scopes, `EXE_INFO` returns `None`,
  rather than `Some(path)` where `path` is from the macOS "unknown"
  scope associated with the installation.

  This false positive will have to be fixed, so that the test suite
  can pass on macOS, and so the test is at least somewhat useful
  on macOS (while possibly being more precise on other OSes).
EliahKagan added a commit to EliahKagan/gitoxide that referenced this pull request Aug 30, 2024
When invoking `git` to find the configuration file path associated
with the `git` installation itself, this sets `GIT_DIR` to a path
that cannot be a `.git` directory for any repository, to keep
`git config -l` from including any local scope entries in the
output of the `git config -l ...` command that is used to find the
origin for the first Git configuration variable.

Specifically, a path to the null device is used. This is
`/dev/null` on Unix and `NUL` on Windows. This is not a directory,
and when treated as a file it is always treated as empty: reading
from it, if successful, reaches end-of-file immediately.

This problem is unlikely since GitoxideLabs#1523, which caused this `git`
invocation to use a `/tmp`-like location (varying by system and
environment) as its current working directory. Although the goal of
that change was just to improve performance, it pretty much fixed
the bug where local-scope configuration could be treated as
installation-level configuration when no configuration variables
are available from higher scopes.

This change further hardens against two edge cases:

- If the `git` command is an old and unpatched vulnerable version
  in which `safe.directory` is not yet implemented, or in which
  GHSA-j342-m5hw-rr3v
  or other vulnerabilities where `git` would perform operations on
  untrusted local repositories owned by other users are unpatched,
  then a `.git` subdirectory of a shared `/tmp` or `/tmp`-like
  directory could be created by another account, and its local
  configuration would still have been used. (This is not a bug in
  gitoxide per se; having vulnerable software installed that other
  software may use is inherently insecure. But it is nice to offer
  a small amount of protection against this when readily feasible.)

- If the `/tmp`-like location is a Git repository owned by the
  current user, then its local configuration would have been used.

Any path guaranteed to point to a nonexistent entry or one that is
guaranteed to be (or to be treated as) an empty file or directory
should be sufficient here. Using the null device, even though it is
not directory-like, seems like a reasonably intuitive way to do it.

A note for Windows: There is more than one reasonable path to the
null device. One is DOS-style relative path `NUL`, as used here.
One of the others, which `NUL` in effect resolves to when opened,
is the fully qualified Windows device namespace path `\\.\NUL`. I
used the former here to ensure we avoid any situation where `git`
would misinterpret a `\` in `\\.\NUL` in a POSIX-like fashion. This
seems unlikely, and it could be looked into further if reasons
surface to prefer `\\.\NUL`.

One possible reason to prefer `\\.\NUL` is that which names are
treated as reserved legacy DOS device names changes from version to
version of Windows, with Windows 11 treating some of them as
ordinary filenames. However, while this affects names such as
`CON`, it does not affect `NUL`, at least written unsuffixed. I'm
not sure if any Microsoft documentation has yet been updated to
explain this in detail, but see:

- dotnet/docs#41193
- python/cpython#95486 (comment)
- python/cpython#95486 (comment)

At least historically, it has been possible on Windows, though
rare, for the null device to be absent. This was the case on
Windows Fundamentals for Legacy PCs (WinFPE). Even if that somehow
were ever to happen today, this usage should be okay, because
attempting to open the device would still fail rather than open
some other file (as may even be happening in Git for Windows
already), the name `NUL` would still presumably be reserved (much
as the names `COM?` where `?` is replaced with a Unicode
superscript 1, 2, or 3 are reserved even though those devices don't
really exist), and I think `git config -l` commands should still
shrug off the error opening the file and give non-local-scope
configuration, as it does when `GIT_DIR` is set to a nonexistent
location.
EliahKagan added a commit to EliahKagan/gitoxide that referenced this pull request Aug 30, 2024
When running `git config -l ...` to find the configuration file
path associated with the `git` installation itself, the current
working directory for the subprocess was set to the current
directory prior to GitoxideLabs#1523, and to `/tmp` or a `/tmp`-like directory
since GitoxideLabs#1523 (which improved performance and security).

This builds on GitoxideLabs#1523, as well as on subsequent changes to run `git`
in a way that its behavior depends less on its CWD, by making an
even more robust choice of CWD for the subprocess, so that the CWD
is less likely to be deeply nested or on network storage; more
likely to exist; and, on Unix-like systems, less likely to contain
a `.git` entry (though a `git` with security updates should refuse
to take any configuration from such a repository unless it is owned
by the user).

Due to a combination of other measures that harden against
malicious or unusual contents (especially setting `GIT_DIR`), the
most significant benefit of this change is to fix the problem that
a nonexistent temp dir would prevent the command from succeeding.

The main way that could happen is if `TMPDIR` on Unix-like systems,
or `TMP` or `TEMP` on Windows, is set to an incorrect value.
Because these variables are sometimes reasonable to customize for
specific purposes, it is plausible for them to be set to incorrect
values by accident.

Except on Windows, this always uses `/` as the CWD for the
subprocess.

On Windows, we use the Windows directory (usually `C:\Windows`)
rather than the root of the system drive (usually `C:\`), because:

- We are currently obtaining this information from environment
  variables, and it is possible for our own parent process to pass
  down an overly sanitized environment.

  Although this can be so sanitized we cannot find the Windows
  directory, this is less likely to occur than being unable to find
  the root of the system drive.

  This due to moderately broad awareness that the `SystemRoot`
  environment variable (which, somewhat confusingly, holds the path
  of the Windows directory, not the root of the system drive)
  should be preserved even when clearing most other variables.

  Some libraries will even automatically preserve `SystemRoot` when
  clearing others or restore it. For example:

  * https://go-review.googlesource.com/c/go/+/174318

  As far as I know, such treatment of `SystemDrive` is less common.

And also these two considerations, which are minor by comparison:

- Under the current behavior of `env::temp_dir()`, which is now a
  fallback if we cannot determine the Windows directory, we already
  fall back to the Windows directory evenutally, if temp dir
  related environment variables are also unset.

  This is because `env::temp_dir()` usually calls `GetTempDir2` in
  the Windows API, which implements that fallback behavior (after
  first trying the user's user profile directory).

  Avoiding adding yet another place to fall back to that would not
  otherwise be attempted slightly decreases behavioral complexity,
  and there is no reason to think a directory like `C:\` would work
  when a directory like `C:\Windows` doesn't.

- The root of the system drive on a Windows system usually permits
  limited user accounts to create new directories there, so a
  directory like `C:\` on Windows actually has most of the
  disadvantages of a location like `/tmp` on a Unix-like system.

  * GHSA-vw2c-22j4-2fh2
  * GHSA-mgvv-9p9g-3jv4

  This is actually a much less significant reason to prefer a
  directory like `C:\Windows` to a directory like `C:\` than it
  might seem. After all, if `C:\.git` exists and and `git` uses it
  when run from `C:\`, then `git` would usually also use it when
  run from `C:\Windows` (and from numerous other locations)!

  However, the reason there is still a small reason to prefer a
  location like `C:\Windows` to a location like `C:\` is that, if a
  system has a vulnerable `git` but a user or system administrator
  has sought to work around it by listing `C:\` in
  `GIT_CEILING_DIRECTORIES`, then that may keep `git` from
  traversing upward into `C:\`, but it would not keep `C:\` from
  being used if that is where we already are.

  An even more significant reason this motivation is a minor one is
  that the other measures we are taking, including setting
  `GIT_DIR`, should be sufficient to avoid at least the security
  dimension of the problem, which arises from actually using the
  configuration from a repo that is discovered.

The reason we do not prefer the user's user profile directory is:

- The user profile directory may be more deeply nested.

- The user profile directory may sometimes be on slow network
  storage when the discovered Windows directory is not.

- In some situations, the user profile directory does not actually
  exist, or does not exist yet.

- Overly sanitized environments are more likely to lack the
  `USERPROFILE` vairable than the `SystemRoot` variable.

- Users may occasionally choose to have their entire user profile
  directory be a Git repository.

- It's no easier to avoid the problem of using `C:\.git` in a user
  profile directory than in `C:\Windows`: they're usually both under
  `C:\`, and are both not the same as `C:\`. (If the user profile
  directory is a repository, then that will avoid that problem, yet
  be its own problem, if not for other measures that prevent both.)
EliahKagan added a commit to EliahKagan/gitoxide that referenced this pull request Sep 1, 2024
It doesn't have to have any configuration variables set in the
command scope, though in practice it always should (with versions
of `git` with which `gix-testtools` is compatible) because we are
setting them with `GIT_CONFIG_{COUNT,KEY,VALUE}`, which, like `-c`,
sets configuration variables in the command scope.

But it must not have any configuration variables set in other
scopes. Of course, actual fixture scripts, when run, most often
create Git repositories, in which case there will in practice
almost always be local scope configuration for the script.

The main point here is to make sure we are omitting variables from
the global and system scopes, as we have already been doing (and
testing for), and also omitting them from the other high scopes,
such as the "unknown" scope associated with an Apple Git
installation that is higher than the system scope and unaffected
by `GIT_CONFIG_SYSTEM` but affected by `GIT_CONFIG_NOSYSTEM`.

Like the tests that preceded it, this test creates a new empty
temporary directory to set as the CWD of the test `git` command
configured by `configure_command`. As such, there should be no
local scope configuration, because this directory should be a
subdirectory of a system `/tmp`-like directory.

A `/tmp`-like directory can technically be a Git repository, and
can even contribute configuration if the repository is owned by the
current user or something is keeping `safe.directory` protections
from excluding it.

When the goal is to *get* configuration from scopes higher than the
local scope, it is worth taking steps to prevent this (which for
`gix-path` is achieved by the combination of GitoxideLabs#1523 and GitoxideLabs#1567).

But in the test suite, temporary directories should preferably be
in locations that are only `git` repositories (owned by the curent
user) in the unusual situation that this is desired, and supporting
tooling such as `git` should be at recent enough versions to really
support the usage that the test suite makes of it.

Furthermore, while it is possible to clear the local scope in this
new test -- such as by using, as the command's CWD, a subdirectory
of a directory that is, in the command's environment, prepended to
`GIT_CEILING_DIRECTORIES` -- this would tend to hide problems in
the actual `gix-testtools` code that this is testing.

If any such measure needs to be taken, it would be better done in
code that uses `configure_command` (or in `configure_command`
itself it it is widely needed and generally acceptable), rather
than in tests of `configure_command`.

For these reasons, the test is, at least for now, deliberately
written in such a way that it will fail if the directory we get
from `tempfile::TempDir::new()` is somehow a Git repository.

This commit consolidates the two preceding test cases into this
one. So now there is a single test that `configure-command` both:

- Excludes global and system scope config, as it already did.
- Also excludes other high-scoped config, which it doesn't yet do.

Thus, this new test is expected to fail on most macOS systems
(where `git` is Apple Git and the installation-associated "unknown"
scope configuration file is nonempty), until that is fixed.
@EliahKagan
Copy link
Member

This pull request, in addition to improving performance, fixed the vulnerability CVE-2024-45305 (RUSTSEC-2024-0367, GHSA-v26r-4c9c-h3j6).

(It may make sense to edit the PR description to mention this, though either way should be fine because this comment can itself be found by searching. Further related hardening, as well as a fix for a separate vulnerability, is included in #1567, but the fix for CVE-2024-45305 is here, and the changes there build on this.)