forked from git-for-windows/git
-
Notifications
You must be signed in to change notification settings - Fork 92
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
maintenance: use random minute for scheduling #597
Merged
derrickstolee
merged 8 commits into
microsoft:vfs-2.41.0.3
from
derrickstolee:maintenance-random-minute-ms
Aug 14, 2023
Merged
maintenance: use random minute for scheduling #597
derrickstolee
merged 8 commits into
microsoft:vfs-2.41.0.3
from
derrickstolee:maintenance-random-minute-ms
Aug 14, 2023
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
derrickstolee
force-pushed
the
maintenance-random-minute-ms
branch
from
August 11, 2023 13:59
54d16b7
to
9cb89fa
Compare
I've updated this PR to match the upstream series that looks like it will merge. |
When we initially created background maintenance -- with its hourly, daily, and weekly schedules -- we considered the effects of all clients launching fetches to the server every hour on the hour. The worry of DDoSing server hosts was noted, but left as something we would consider for a future update. As background maintenance has gained more adoption over the past three years, our worries about DDoSing the big Git hosts has been unfounded. Those systems, especially those serving public repositories, are already resilient to thundering herds of much smaller scale. However, sometimes organizations spin up specific custom server infrastructure either in addition to or on top of their Git host. Some of these technologies are built for a different range of scale, and can hit concurrency limits sooner. Organizations with such custom infrastructures are more likely to recommend tools like `scalar` which furthers their adoption of background maintenance. To help solve for this, create get_random_minute() as a method to help Git select a random minute when creating schedules in the future. The integrations with this method do not yet exist, but will follow in future changes. To avoid multiple sources of randomness in the Git codebase, create a new helper function, git_rand(), that returns a random uint32_t. This is similar to how rand() returns a random nonnegative value, except it is based on csprng_bytes() which is cryptographic and will return values larger than RAND_MAX. One thing that is important for testability is that we notice when we are under a test scenario and return a predictable result. The schedules themselves are not checked for this value, but at least one launchctl test checks that we do not unnecessarily reboot the schedule if it has not changed from a previous version. Signed-off-by: Derrick Stolee <derrickstolee@github.com>
The get_random_minute() method was created to allow maintenance schedules to be fixed to a random minute of the hour. This randomness is only intended to spread out the load from a number of clients, but each client should have an hour between each maintenance cycle. Use get_random_minute() when constructing the schedules for launchctl. The format already includes a 'Minute' key which is modified from 0 to the random minute. Signed-off-by: Derrick Stolee <derrickstolee@github.com>
The get_random_minute() method was created to allow maintenance schedules to be fixed to a random minute of the hour. This randomness is only intended to spread out the load from a number of clients, but each client should have an hour between each maintenance cycle. Add this random minute to the Windows scheduler integration. We need only to modify the minute value for the 'StartBoundary' tag across the three schedules. Signed-off-by: Derrick Stolee <derrickstolee@github.com>
The get_random_minute() method was created to allow maintenance schedules to be fixed to a random minute of the hour. This randomness is only intended to spread out the load from a number of clients, but each client should have an hour between each maintenance cycle. Add this random minute to the cron integration. The cron schedule specification starts with a minute indicator, which was previously inserted as the "0" string but now takes the given minute as an integer parameter. Signed-off-by: Derrick Stolee <derrickstolee@github.com>
The systemd_timer_write_unit_templates() method writes a single template that is then used to start the hourly, daily, and weekly schedules with systemd. However, in order to schedule systemd maintenance on a given minute, these templates need to be replaced with specific schedules for each of these jobs. Before modifying the schedules, move the writing method above the systemd_timer_enable_unit() method, so we can write a specific schedule for each unit. The diff is computed smaller by showing systemd_timer_enable_unit() and systemd_timer_delete_units() move instead of systemd_timer_write_unit_templates() and systemd_timer_delete_unit_templates(). Signed-off-by: Derrick Stolee <derrickstolee@github.com>
The get_random_minute() method was created to allow maintenance schedules to be fixed to a random minute of the hour. This randomness is only intended to spread out the load from a number of clients, but each client should have an hour between each maintenance cycle. Add this random minute to the systemd integration. This integration is more complicated than similar changes for other schedulers because of a neat trick that systemd allows: templating. The previous implementation generated two template files with names of the form 'git-maintenance@.(timer|service)'. The '.timer' or '.service' indicates that this is a template that is picked up when we later specify '...@<schedule>.timer' or '...@<schedule>.service'. The '<schedule>' string is then used to insert into the template both the 'OnCalendar' schedule setting and the '--schedule' parameter of the 'git maintenance run' command. In order to set these schedules to a given minute, we can no longer use the 'hourly', 'daily', or 'weekly' strings for '<schedule>' and instead need to abandon the template model for the .timer files. We can still use templates for the .service files. For this reason, we split these writes into two methods. Modify the template with a custom schedule in the 'OnCalendar' setting. This schedule has some interesting differences from cron-like patterns, but is relatively easy to figure out from context. The one that might be confusing is that '*-*-*' is a date-based pattern, but this must be omitted when using 'Mon' to signal that we care about the day of the week. Monday is used since that matches the day used for the 'weekly' schedule used previously. Now that the timer files are not templates, we might want to abandon the '@' symbol in the file names. However, this would cause users with existing schedules to get two competing schedules due to different names. The work to remove the old schedule name is one thing that we can avoid by keeping the '@' symbol in our unit names. Since we are locked into this name, it makes sense that we keep the template model for the .service files. The rest of the change involves making sure we are writing these .timer and .service files before initializing the schedule with 'systemctl' and deleting the files when we are done. Some changes are also made to share the random minute along with a single computation of the execution path of the current Git executable. In addition, older Git versions may have written a 'git-maintenance@.timer' template file. Be sure to remove this when successfully enabling maintenance (or disabling maintenance). Signed-off-by: Derrick Stolee <derrickstolee@github.com>
The 'git maintenance run' command prevents concurrent runs in the same repository using a 'maintenance.lock' file. However, when using systemd the hourly maintenance runs the same time as the daily and weekly runs. (Similarly, daily maintenance runs at the same time as weekly maintenance.) These competing commands result in some maintenance not actually being run. This overlap was something we could not fix until we made the recent change to not use the builting 'hourly', 'daily', and 'weekly' schedules in systemd. We can adjust the schedules such that: 1. Hourly runs avoid the 0th hour. 2. Daily runs avoid Monday. This will keep maintenance runs from colliding when using systemd. Signed-off-by: Derrick Stolee <derrickstolee@github.com>
When running 'git maintenance start', the current pattern is to configure global config settings to enable maintenance on the current repository and set 'maintenance.auto' to false and _then_ to set up the schedule with the system scheduler. This has a problematic error condition: if the scheduler fails to initialize, the repository still will not use automatic maintenance due to the 'maintenance.auto' setting. Fix this gap by swapping the order of operations. If Git fails to initialize maintenance, then the config changes should never happen. Reported-by: Phillip Wood <phillip.wood123@gmail.com> Signed-off-by: Derrick Stolee <derrickstolee@github.com>
derrickstolee
force-pushed
the
maintenance-random-minute-ms
branch
from
August 11, 2023 14:14
9cb89fa
to
c5cd555
Compare
dscho
approved these changes
Aug 14, 2023
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❤️
jeffhostetler
pushed a commit
that referenced
this pull request
Aug 23, 2023
* [x] This is an early version of work already under review upstream. See gitgitgadget#1567 for the version submitted upstream. This is an early version for the `microsoft/git` fork so we could potentially include it in a release to our Microsoft partners. Upgrading to a version with these changes may help with some of the auth problems plaguing GVFS Cache Servers. (It's not a long-term fix, but would avoid having fingers pointed in this direction in the future.) At least on Windows, we should re-run `git maintenance start` as part of `scalar reconfigure` during the installer. On other platforms, we will need to rely on users slowly rotating through their local repositories. > This PR is a recreation of #593 now that the upstream change required rebasing to resolve header conflicts.
dscho
pushed a commit
that referenced
this pull request
Nov 3, 2023
* [x] This is an early version of work already under review upstream. See gitgitgadget#1567 for the version submitted upstream. This is an early version for the `microsoft/git` fork so we could potentially include it in a release to our Microsoft partners. Upgrading to a version with these changes may help with some of the auth problems plaguing GVFS Cache Servers. (It's not a long-term fix, but would avoid having fingers pointed in this direction in the future.) At least on Windows, we should re-run `git maintenance start` as part of `scalar reconfigure` during the installer. On other platforms, we will need to rely on users slowly rotating through their local repositories. > This PR is a recreation of #593 now that the upstream change required rebasing to resolve header conflicts.
dscho
pushed a commit
that referenced
this pull request
Nov 3, 2023
* [x] This is an early version of work already under review upstream. See gitgitgadget#1567 for the version submitted upstream. This is an early version for the `microsoft/git` fork so we could potentially include it in a release to our Microsoft partners. Upgrading to a version with these changes may help with some of the auth problems plaguing GVFS Cache Servers. (It's not a long-term fix, but would avoid having fingers pointed in this direction in the future.) At least on Windows, we should re-run `git maintenance start` as part of `scalar reconfigure` during the installer. On other platforms, we will need to rely on users slowly rotating through their local repositories. > This PR is a recreation of #593 now that the upstream change required rebasing to resolve header conflicts.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
See gitgitgadget#1567 for the version submitted upstream. This is an early version for the
microsoft/git
fork so we could potentially include it in a release to our Microsoft partners. Upgrading to a version with these changes may help with some of the auth problems plaguing GVFS Cache Servers. (It's not a long-term fix, but would avoid having fingers pointed in this direction in the future.)At least on Windows, we should re-run
git maintenance start
as part ofscalar reconfigure
during the installer. On other platforms, we will need to rely on users slowly rotating through their local repositories.