Integrate Scalar (ported to C) into vfs-2.32.0 #363
When two `git maintenance` processes try to write the `.plist` file, we need to help them with serializing their efforts. The 150ms time-out value was determined from thin air. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
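The serialization described above can be sketched as a lock file that is retried until a deadline passes. This is a minimal, self-contained POSIX sketch under stated assumptions, not the code from the patch: the names `lock_with_timeout_sketch()` and `unlock_sketch()` and the 10 ms retry pause are hypothetical; only the idea of a time-out (150 ms in the commit message) comes from the text.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <time.h>
#include <unistd.h>

/* Monotonic clock reading in milliseconds. */
static long now_ms(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return ts.tv_sec * 1000L + ts.tv_nsec / 1000000L;
}

/*
 * Hypothetical helper: try to create `lock_path` exclusively, retrying
 * until `timeout_ms` has elapsed. Returns an open file descriptor on
 * success, or -1 if the lock is still held when the time-out expires
 * (or if open() fails for any other reason).
 */
int lock_with_timeout_sketch(const char *lock_path, long timeout_ms)
{
	long deadline = now_ms() + timeout_ms;
	struct timespec pause = { 0, 10 * 1000000L }; /* assumed 10 ms pause */

	for (;;) {
		int fd = open(lock_path, O_WRONLY | O_CREAT | O_EXCL, 0600);

		if (fd >= 0)
			return fd;
		if (errno != EEXIST || now_ms() >= deadline)
			return -1;
		nanosleep(&pause, NULL);
	}
}

/* Release the lock by closing and removing the lock file. */
void unlock_sketch(const char *lock_path, int fd)
{
	close(fd);
	unlink(lock_path);
}
```

A caller would take the lock with a 150 ms time-out, write the `.plist` file, and then call `unlock_sketch()`; a second process arriving in the meantime waits briefly instead of failing immediately.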
On macOS, we use launchctl to manage the background maintenance schedule. This uses a set of .plist files to describe the schedule, but these files are also registered with 'launchctl bootstrap'. If multiple 'git maintenance start' commands run concurrently, then they can collide replacing these schedule files and registering them with launchctl. To avoid extra launchctl commands, do a check for the .plist files on disk and check if they are registered using 'launchctl list <name>'. This command will return with exit code 0 if it exists, or exit code 113 if it does not. We can test this behavior using the GIT_TEST_MAINT_SCHEDULER environment variable. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
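The decision described above can be sketched as follows. This is an illustration only: the function names are hypothetical, and the rule "skip the extra launchctl commands only when the file is on disk and registered" is a simplified reading of the commit message. The exit codes 0 and 113 are the ones the message states for `launchctl list <name>`.

```c
#include <stdbool.h>

/* Exit codes of `launchctl list <name>` as stated in the commit message. */
enum {
	LAUNCHCTL_REGISTERED = 0,
	LAUNCHCTL_NOT_REGISTERED = 113
};

/*
 * Hypothetical helper: interpret the exit code of `launchctl list <name>`.
 * Returns 1 if the schedule is registered, 0 if it is not, and -1 for any
 * other (unexpected) exit code.
 */
int plist_registered_sketch(int exit_code)
{
	if (exit_code == LAUNCHCTL_REGISTERED)
		return 1;
	if (exit_code == LAUNCHCTL_NOT_REGISTERED)
		return 0;
	return -1;
}

/*
 * Assumed policy: the schedule only needs to be (re-)registered when the
 * .plist file is missing on disk or launchctl does not know about it.
 */
bool needs_bootstrap_sketch(bool plist_on_disk, int exit_code)
{
	return !(plist_on_disk && plist_registered_sketch(exit_code) == 1);
}
```

Treating unexpected exit codes as "not registered" errs on the side of running the registration again, which is the safer default in this sketch.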
I had to cancel a run-away job (GitHub Actions experienced some problems, which might be related). In any case, the idea is to merge this only after |
This looks great! I'm very excited to see how this goes upstream. The organization is impeccable. We might want to combine microsoft/scalar#505 with microsoft/scalar#510 to test this version, but we can also wait for the 2.32.0 release before doing that. Also, we will want to update the |
About this: maybe we can somehow make it possible to run the Functional Tests on both Scalar.NET and Scalar/C? |
I want to delete the product code from |
276512e to 318f20e
c2fd2c1 to cb7cf1e
Okay, so you want to commit to the way forward. Makes sense. If need be, we can always start a maintenance track. |
d577ac9 to b39873a
With this patch, we start the journey from the C# project at https://github.com/microsoft/scalar to move what is left to Git's own `contrib/` directory.

The idea of Scalar, and before that VFS for Git, has always been to prove that Git _can_ scale, and to upstream whatever strategies have been demonstrated to help.

For example, while the virtual filesystem provided by VFS for Git helped the team developing the Windows operating system to move onto Git, it is not really an upstreamable strategy: getting it to work, and the required server-side support, make this not quite feasible.

The Scalar project learned from that and tackled the problem with different tactics: instead of pretending to Git that the working directory is fully populated, it _specifically_ teaches Git about partial clone (which is based on VFS for Git's cache server), about sparse checkout (which VFS for Git tried to do transparently, in the file system layer), and regularly runs maintenance tasks to keep the repository in a healthy state.

With partial clone, sparse checkout and `git maintenance` having been upstreamed, there is little left that `scalar.exe` does that `git.exe` cannot do. One such thing is that `scalar clone <url>` will automatically set up a partial, sparse clone, and configure known-helpful settings from the start.

Let's bring this convenience directly into Git's tree.

The idea here is that you can (optionally) build Scalar via `make -C contrib/scalar/`. This will build the `scalar` executable and put it into the `contrib/scalar/` subdirectory.

The slightly awkward addition of the `contrib/scalar/*` bits to the top-level `Makefile` is actually required: we want to link to `libgit.a`, which means that we will need to use the very same `CFLAGS` and `LDFLAGS` as the rest of Git.

An early development version of this patch tried to replicate the respective conditionals in `contrib/scalar/Makefile` (just like `contrib/svn-fe/Makefile` tried to do). It turned out to be quite the whack-a-mole game: the SHA-1-related flags, the flags enabling/disabling `compat/poll/`, `compat/regex/`, `compat/win32mmap.c` etc. based on the current platform... To put it mildly: it was a major mess.

Instead, this patch makes minimal changes to the top-level `Makefile` so that the bits in `contrib/scalar/` can be compiled and linked, and adds a `contrib/scalar/Makefile` that uses the top-level `Makefile` in a most minimal way to do the actual compiling.

Note: With this commit, we only establish the infrastructure; no Scalar functionality is implemented yet. We will do that incrementally over the next few commits.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
... which does not do much, yet... Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Over the course of Scalar's development, it became obvious that there is a need for a command that can gather all kinds of useful information that can help identify the most typical problems with large worktrees/repositories. The `diagnose` command is the culmination of this hard-won knowledge: it gathers the installed hooks, the config, a couple of statistics describing the data shape, among other pieces of information, and then wraps everything up in a tidy, neat `.zip` archive. Note: in the .NET version we have the luxury of a comprehensive standard library that includes basic functionality such as writing a `.zip` file. In the C version, we lack such a commodity. Rather than introducing a dependency on, say, libzip, we slightly abuse Git's `archive` command: instead of writing the `.zip` file directly, we stage the file contents in a Git index of a temporary, bare repository, only to let `git archive` have at it, and finally remove the temporary repository. Also note: Due to the frequently spawned `git hash-object` processes, this command is quite slow on Windows. Should it turn out to be a big problem, the lack of a batch mode of the `hash-object` command could potentially be worked around by using `git fast-import` with a crafted `stdin`. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Let's start implementing the `register` command. With this commit, recommended settings are configured upon `scalar register`. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This implements Scalar's opinionated `clone` command: it tries to use a partial clone and sets up a sparse checkout by default. In contrast to `git clone`, `scalar clone` sets up the worktree in the `src/` subdirectory, to encourage a separation between the source files and the build output (which helps Git tremendously because it avoids untracked files that have to be specifically ignored when refreshing the index). Also, it registers the repository for regular, scheduled maintenance, and configures a slew of configuration settings based on the experience of the Microsoft Windows and the Microsoft Office development teams. Note: We intentionally use a slightly wasteful `set_config()` function (which does not reuse a single `strbuf`, for example, though performance _really_ does not matter here) because it is very, very convenient. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This commit establishes the infrastructure to build the manual page for the `scalar` command. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Arguably, the biggest learning from the Scalar project is that scheduled maintenance is crucial to keep large repositories in a good shape. With this commit, `scalar register` starts those scheduled maintenance tasks, and `scalar unregister` stops them. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This commit adds a simple regression test, modeled after Git's own test suite. A more comprehensive functional (or: integration) test suite can be found at https://github.com/microsoft/scalar. There is no intention to port that fuller test suite to `contrib/scalar/`; instead, it will still be used to verify the `scalar` functionality in Microsoft's Git fork. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This comes in handy during Scalar upgrades, or when config settings were messed up by mistake. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Teach the `scalar diagnose` command to gather file size information about pack files. Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
Let's populate the manual page of `scalar` a bit. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
With this commit, `git help scalar` will open the appropriate manual or HTML page (instead of looking for `gitscalar`). Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The list is simply those registered under the multi-valued scalar.repo config setting. Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
For example after a Scalar upgrade, it can come in really handy if there is an easy way to reconfigure all Scalar enlistments. This new option offers this functionality. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Teach the `scalar diagnose` command to gather loose object counts. Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
Using the built-in FSMonitor makes many common commands quite a bit faster. So let's teach the `scalar register` command to enable the built-in FSMonitor and kick-start the fsmonitor--daemon process (for convenience). For simplicity, we only support the built-in FSMonitor (and no external file system monitor such as e.g. Watchman). Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Continuing the documentation journey. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
We already have the `config` command that accesses the `gvfs/config` endpoint. To implement `scalar`, we also need to be able to access the `vsts/info` endpoint. Let's add a command to do precisely that. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
On Windows, both the forward slash and the backslash are directory separators. Which means that `a\b\c` really is inside `a/b`. Therefore, we need to special-case the directory separators in the helper function `cmp_icase()` that is used in the loop in `dir_inside_of()`. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
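The special-casing described above can be sketched like this. It is a simplified illustration and not the code from the patch: the `_sketch` names are hypothetical, the comparison is always case-insensitive here, and both separators are always accepted (a real build would do that only on Windows).

```c
#include <ctype.h>

/* On Windows, both '/' and '\\' separate directories. */
static int is_dir_sep_sketch(char c)
{
	return c == '/' || c == '\\';
}

/* Returns 0 when the two characters are considered equal. */
int cmp_icase_sketch(char a, char b)
{
	if (a == b)
		return 0;
	if (is_dir_sep_sketch(a) && is_dir_sep_sketch(b))
		return 0; /* '/' and '\\' are interchangeable */
	return toupper((unsigned char)a) - toupper((unsigned char)b);
}

/*
 * Returns 1 if `subdir` lies inside `dir` (or spells the same path,
 * modulo case and separator style), else 0.
 */
int is_inside_sketch(const char *subdir, const char *dir)
{
	const char *d = dir;

	while (*d && *subdir && !cmp_icase_sketch(*d, *subdir)) {
		d++;
		subdir++;
	}
	if (*d)
		return 0; /* `dir` was not matched completely */
	if (!*subdir || is_dir_sep_sketch(*subdir))
		return 1; /* exact match, or the match ended at a separator */
	/* `dir` itself ended with a separator, e.g. "a/b/" */
	return d > dir && is_dir_sep_sketch(d[-1]);
}
```

With this comparison, `is_inside_sketch("a\\b\\c", "a/b")` reports that the first path is inside the second, while `"a/bc"` is correctly rejected because the match does not end at a directory boundary.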
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This finalizes the port of the `QueryVstsInfo()` function: we already taught `gvfs-helper` to access the `vsts/info` endpoint on demand, we implemented proper JSON parsing, and now it is time to hook it all up. To that end, we also provide a default local cache root directory. It works the same way as the .NET version of Scalar: it uses `C:\scalarCache` on Windows, `~/.scalarCache/` on macOS and `~/.cache/scalar` on Linux. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
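The per-platform defaults listed above can be sketched as a small lookup. This is an illustration under assumptions: the enum, the function name and the explicit `home` parameter are hypothetical (the patch presumably detects the platform at build time), and the trailing slash of the macOS path is dropped for uniformity.

```c
#include <stdio.h>

/* Hypothetical platform selector; real code would use build-time checks. */
enum platform_sketch {
	PLATFORM_WINDOWS,
	PLATFORM_MACOS,
	PLATFORM_LINUX
};

/*
 * Writes the default local cache root for platform `p` into `buf`.
 * `home` stands in for the expansion of `~`. Returns 0 on success and
 * -1 if the platform is unknown or `buf` is too small.
 */
int default_cache_root_sketch(enum platform_sketch p, const char *home,
			      char *buf, size_t len)
{
	int n;

	switch (p) {
	case PLATFORM_WINDOWS:
		n = snprintf(buf, len, "C:\\scalarCache");
		break;
	case PLATFORM_MACOS:
		n = snprintf(buf, len, "%s/.scalarCache", home);
		break;
	case PLATFORM_LINUX:
		n = snprintf(buf, len, "%s/.cache/scalar", home);
		break;
	default:
		return -1;
	}
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```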
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Well, technically also the http:// protocol is allowed _when testing_... Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Azure Repos does not support partial clones at the moment, but it does support the GVFS protocol. To that end, the Microsoft fork of Git has a `gvfs-helper` command that is optionally used to perform essentially the same functionality as partial clone. Let's verify that `scalar clone` detects that situation and enables the GVFS helper. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This allows setting the GVFS-enabled cache server, or listing the one(s) associated with the remote repository. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Sadly, this is a bit trickier than merely flipping the `INCLUDE_SCALAR=YesPlease` switch: The Windows tests are run in a very different way. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
In Scalar's functional tests, we do not do anything with authentication. Therefore, we do want to avoid accessing the `vsts/info` endpoint because it requires authentication even on otherwise public repositories. Let's introduce the environment variable `SCALAR_TEST_SKIP_VSTS_INFO` which can be set to `true` to simply skip that step (and force the `url_*` style repository IDs instead of `id_*` whenever possible). Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
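A minimal sketch of such an environment check, assuming the variable is compared against the literal string `true` only; the function name is hypothetical, and the actual patch may well accept other boolean spellings.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: returns 1 if SCALAR_TEST_SKIP_VSTS_INFO is set to
 * exactly "true", and 0 otherwise (unset or any other value).
 */
int skip_vsts_info_sketch(void)
{
	const char *value = getenv("SCALAR_TEST_SKIP_VSTS_INFO");

	return value && !strcmp(value, "true");
}
```

A caller would consult this before contacting the `vsts/info` endpoint and fall back to the `url_*` style repository ID when it returns 1.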
…hen plist is registered
This adds the bare minimum to compile the `scalar` executable. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This implements the subcommands `register`, `unregister` and `list`. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This implements `clone`. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This implements `scalar run`. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This allows fixing settings after a Scalar upgrade, or after botching the enlistments configuration. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This implements the `diagnose` subcommand. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This is a convenient shortcut for `scalar unregister <enlistment> && rm -rf <enlistment>`. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This implements the `version` command for backwards-compatibility with the .NET version of Scalar. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
For convenience, this ports the `git -c <key>=<value> -C <dir> <command>` functionality to `scalar`, allowing config settings and working directories to be set for the duration of the Scalar invocation. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
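Handling such leading options can be sketched as a loop that consumes `-c` and `-C` before the subcommand. This is an assumption-laden illustration, not the patch's parser: the struct, the function name and the limit of eight `-c` entries are hypothetical, and only the last `-C` is remembered here.

```c
#include <stddef.h>
#include <string.h>

#define MAX_CONFIG_SKETCH 8 /* arbitrary limit for this sketch */

struct global_opts_sketch {
	const char *dir;                       /* last `-C <dir>`, or NULL */
	const char *config[MAX_CONFIG_SKETCH]; /* `-c <key>=<value>` entries */
	int config_nr;
};

/*
 * Consumes leading `-C <dir>` and `-c <key>=<value>` options.
 * Returns the index of the first remaining argument (the subcommand),
 * or -1 if an option lacks its value or too many `-c` options are given.
 */
int parse_global_opts_sketch(int argc, const char **argv,
			     struct global_opts_sketch *out)
{
	int i = 1; /* skip the program name */

	out->dir = NULL;
	out->config_nr = 0;

	while (i < argc) {
		if (!strcmp(argv[i], "-C")) {
			if (i + 1 >= argc)
				return -1;
			out->dir = argv[i + 1];
			i += 2;
		} else if (!strcmp(argv[i], "-c")) {
			if (i + 1 >= argc ||
			    out->config_nr >= MAX_CONFIG_SKETCH)
				return -1;
			out->config[out->config_nr++] = argv[i + 1];
			i += 2;
		} else {
			break; /* first non-option: the subcommand */
		}
	}
	return i;
}
```

After parsing, a caller would change into `out->dir` (if set), apply the collected settings for the duration of the process, and dispatch on `argv[i]`.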
Make `contrib/scalar/` work nicely with the built-in FSMonitor. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Document the whole thing. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Allow concurrent `scalar register` and `scalar unregister` calls to be more collaborative when trying to lock the global Git config at the very same time. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This topic branch offers to include `scalar` in a regular Git build, simply by setting `INCLUDE_SCALAR=YesPlease`. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
For ease of development, build and test `scalar`, too. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Prepare `scalar` to use the GVFS protocol instead of partial clone (required to support Azure Repos). Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
cb7cf1e to 2677735
Scalar is, in its own words, "an opinionated repository management tool". It builds on top of Git and aims to make it easy and effortless to work with large repositories.
Originally built using .NET, with the take-home lessons from VFS for Git, Scalar provides sort of a laboratory for experimenting with tactics and strategies to help Git scale better. Many recent scalability improvements in Git originate from Scalar, for example:
While providing an experimentation lab outside of Git, the intention of the Scalar project always was to ship its improvements into core Git (i.e. to "upstream" them). As the list above demonstrates, it worked.
It worked so much that there are essentially only very few bits and pieces that are not (yet) upstreamed. The remaining parts fall roughly into these categories:
- `scalar` executable itself
- `src/` subdirectory (which is the actual Git worktree), to encourage clear separation of tracked vs untracked files
- `git maintenance` [...] `scalar clone` or `scalar register`
- `gvfs-helper`
While the `gvfs-helper` part is very unlikely to ever make it into core Git, the remainder can easily be contributed in the form of `contrib/scalar/`.
This Pull Request adds these parts, in a neatly-structured thicket of topic branches, and it concludes the effort of three developers and almost two months.