Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix the built-in FSMonitor, and run Scalar's Functional Tests as part of the automated builds #371

Conversation

dscho
Copy link
Member

@dscho dscho commented Jun 9, 2021

Without Scalar's Functional Tests, I would not have caught two bugs causing segmentation faults in the built-in FSMonitor (which I have fixed in time for Git for Windows v2.32.0) nor the bug where the FSMonitor just hangs (which I unfortunately did not catch in time for Git for Windows v2.32.0, but at least we can fix it in v2.32.0.vfs.*).

However, running the Functional Tests locally, I could not reproduce the bugs! The hang simply did not happen here, and the segmentation faults were only triggered when Trace2 was enabled (which I had not done locally). Therefore, the (epic) bug hunt really boiled down to running Scalar's Functional Tests using GitHub Actions, over and over and over again.

It was a bit cumbersome to run the tests using GitHub Actions, though, because the original setup required a successful run of the Azure Pipeline build git installers to build full installers for a given microsoft/git branch (which would have been a PR branch such as refs/pull/348/head), and then a PR branch of microsoft/scalar to point to that successful build. Not only does that take 30-45 minutes simply to build the installers, but there are three manual steps involved, each gated on the success of the previous step.

This PR adds a new GitHub workflow that makes all of that testing a lot nicer: just push a PR branch of microsoft/git and Scalar's Functional Tests will be built and run as part of the automated builds.

I used this very workflow to investigate the remaining problems of vfs-2.32.0 that were demonstrated by Scalar's Functional Tests:

  • I first ran it with vfs-2.32.0 (plus this workflow), then reverted the built-in FSMonitor v1 -> v2 changes and ran it again, to verify that the problems were due to this upgrade (as opposed to independent changes coming in via v2.31.1 -> v2.32.0).
  • Then, I added the built-in FSMonitor v1 -> v2 changes back, incrementally, to see where things break.
  • To use my time efficiently, I pushed these incremental changes without waiting for the Functional Tests to finish.
  • That way, two days and 55 runs later (each of which took between 25-60 minutes), I could pinpoint the exact change that caused the regression.

The result is the trio of fixup! fsmonitor--daemon: use a cookie file to sync with file system patches, which fixes the hang in Scalar's Functional Tests.

While at it, I also realized that the two fixups to built-in FSMonitor v2 that I integrated into Git for Windows v2.32.0 were actually only work-arounds, and in this PR I also provide a fix for the root cause (fixup! fsmonitor: introduce core.useBuiltinFSMonitor to call the daemon via IPC, which ensures that fsmonitor_ipc__send_query() is never called with a since_token that is NULL).

dscho added 6 commits June 9, 2021 11:30
…emon via IPC

In FSMonitor v1, we made sure to only use a valid `since_token` when
querying the FSMonitor. This condition was accidentally lost in v2, and
caused segmentation faults uncovered by Scalar's Functional Tests.

I had tried to fix this in https://github.com/git-for-windows/pull/3241,
but the fix was incomplete, and I had to follow up with
https://github.com/git-for-windows/pull/3258. However, it turns out that
both of them were actually only work-arounds; I should have dug deeper
to figure out _why_ the `since_token` was no longer guaranteed not to be
`NULL`, and I finally did.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Now that we have a correct fix where we guarantee again (just like
v1 of the built-in FSMonitor) that `since_token` is not `NULL`, we can
revert the work-arounds introduced by these two PRs:

- https://github.com/git-for-windows/pull/3241

- https://github.com/git-for-windows/pull/3258

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
When flushing the FSMonitor data, we do not actually need to wait for a
cookie file. In the worst case, we will over-report a bit.

If we _do_ wait for a cookie file, in the worst case we cause a hang
because the FSMonitor daemon will wait and wait even though the `.git/`
directory might be gone already.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
When the built-in FSMonitor receives an unexpected token, we do not
actually need to wait for a cookie file. There simply is no use for
waiting in this instance. It's not like the client will all of a sudden
realize that it sent an incorrect token and somehow make up a correct
token from thin air in a follow-up query.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
When the built-in FSMonitor receives an invalid v2 token, we do not
actually need to wait for a cookie file. There simply is no use for
waiting in this instance. It's not like the client will all of a sudden
realize that it sent an incorrect token and somehow make up a correct
token from thin air in a follow-up query.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This topic branch contains fixups for v2 of the built-in FSMonitor patch
series (plus a revert of the two PRs in which I had tried earlier to fix
things).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
@dscho
Copy link
Member Author

dscho commented Jun 9, 2021

Let's see whether I can accelerate the Windows job in Scalar's Functional Tests by using a "more minimal" Git for Windows SDK. I'm on it.

Scalar's Functional Test suite is pretty comprehensive, and caught more
than just one bug in the built-in FSMonitor that was missed by Git's own
test suite.

To benefit from this test suite, automatically run it on the `vfs-*`
branches.

Note: for simplicity, we're building Git from scratch in all matrix
jobs.

Also note: for speed, we are using `git-sdk-64-minimal`, even if it
lacks the `/bin/install` that we need to install Git's files; We're
providing a minimal shell script shim instead. Also, we do not need to
bother with the Tcl/Tk parts, therefore we're skipping them, too.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
@dscho dscho force-pushed the run-scalar-functional-tests-and-fix-built-in-fsmonitor branch from e23e0aa to 059c4b3 Compare June 9, 2021 10:43
@dscho dscho requested a review from derrickstolee June 9, 2021 10:44
Copy link
Collaborator

@derrickstolee derrickstolee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for bringing in the Scalar functional tests! This will make our workflow so much simpler.

I restarted your recent builds because it had the connection error from the Azure Repos side, not a real failure on our side.

- name: Check out Scalar's source code
uses: actions/checkout@v2
with:
fetch-depth: 0 # Indicate full history so Nerdbank.GitVersioning works.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Making a note for later: After we remove the product code from Scalar, we can relax this condition.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point!

@dscho
Copy link
Member Author

dscho commented Jun 9, 2021

I restarted your recent builds because it had the connection error from the Azure Repos side, not a real failure on our side.

Thank you!

I ran into this twice during my debugging session; This is a pretty decent false positive rate, so I am not super worried. But if it persists, we might have to go with the plan @jeffhostetler laid out for dealing with 503s or other network flakiness.

@dscho dscho merged commit cbcdcb1 into microsoft:vfs-2.32.0 Jun 9, 2021
@dscho dscho deleted the run-scalar-functional-tests-and-fix-built-in-fsmonitor branch June 9, 2021 13:41
derrickstolee pushed a commit that referenced this pull request Aug 3, 2021
…x-built-in-fsmonitor

Fix the built-in FSMonitor, and run Scalar's Functional Tests as part of the automated builds
derrickstolee pushed a commit that referenced this pull request Aug 3, 2021
…x-built-in-fsmonitor

Fix the built-in FSMonitor, and run Scalar's Functional Tests as part of the automated builds
dscho added a commit that referenced this pull request Aug 5, 2021
…x-built-in-fsmonitor

Fix the built-in FSMonitor, and run Scalar's Functional Tests as part of the automated builds
derrickstolee pushed a commit that referenced this pull request Aug 9, 2021
…x-built-in-fsmonitor

Fix the built-in FSMonitor, and run Scalar's Functional Tests as part of the automated builds
derrickstolee pushed a commit that referenced this pull request Aug 12, 2021
…x-built-in-fsmonitor

Fix the built-in FSMonitor, and run Scalar's Functional Tests as part of the automated builds
derrickstolee pushed a commit that referenced this pull request Aug 17, 2021
…x-built-in-fsmonitor

Fix the built-in FSMonitor, and run Scalar's Functional Tests as part of the automated builds
dscho added a commit that referenced this pull request Oct 30, 2021
…x-built-in-fsmonitor

Fix the built-in FSMonitor, and run Scalar's Functional Tests as part of the automated builds
derrickstolee pushed a commit that referenced this pull request Oct 30, 2021
…x-built-in-fsmonitor

Fix the built-in FSMonitor, and run Scalar's Functional Tests as part of the automated builds
derrickstolee pushed a commit that referenced this pull request Oct 31, 2021
…x-built-in-fsmonitor

Fix the built-in FSMonitor, and run Scalar's Functional Tests as part of the automated builds
derrickstolee pushed a commit that referenced this pull request Nov 4, 2021
…x-built-in-fsmonitor

Fix the built-in FSMonitor, and run Scalar's Functional Tests as part of the automated builds
derrickstolee pushed a commit that referenced this pull request Nov 10, 2021
…x-built-in-fsmonitor

Fix the built-in FSMonitor, and run Scalar's Functional Tests as part of the automated builds
derrickstolee pushed a commit that referenced this pull request Nov 15, 2021
…x-built-in-fsmonitor

Fix the built-in FSMonitor, and run Scalar's Functional Tests as part of the automated builds
ldennington pushed a commit to ldennington/git that referenced this pull request Jan 12, 2022
…ts-and-fix-built-in-fsmonitor

Fix the built-in FSMonitor, and run Scalar's Functional Tests as part of the automated builds
ldennington pushed a commit to ldennington/git that referenced this pull request Jan 19, 2022
…ts-and-fix-built-in-fsmonitor

Fix the built-in FSMonitor, and run Scalar's Functional Tests as part of the automated builds
ldennington pushed a commit to ldennington/git that referenced this pull request Jan 20, 2022
…ts-and-fix-built-in-fsmonitor

Fix the built-in FSMonitor, and run Scalar's Functional Tests as part of the automated builds
dscho added a commit that referenced this pull request Nov 3, 2023
…x-built-in-fsmonitor

Fix the built-in FSMonitor, and run Scalar's Functional Tests as part of the automated builds
dscho added a commit that referenced this pull request Nov 3, 2023
…x-built-in-fsmonitor

Fix the built-in FSMonitor, and run Scalar's Functional Tests as part of the automated builds
dscho added a commit that referenced this pull request Nov 3, 2023
…x-built-in-fsmonitor

Fix the built-in FSMonitor, and run Scalar's Functional Tests as part of the automated builds
dscho added a commit that referenced this pull request Nov 8, 2023
…x-built-in-fsmonitor

Fix the built-in FSMonitor, and run Scalar's Functional Tests as part of the automated builds
dscho added a commit that referenced this pull request Nov 14, 2023
…x-built-in-fsmonitor

Fix the built-in FSMonitor, and run Scalar's Functional Tests as part of the automated builds
dscho added a commit that referenced this pull request Nov 20, 2023
…x-built-in-fsmonitor

Fix the built-in FSMonitor, and run Scalar's Functional Tests as part of the automated builds
vdye pushed a commit that referenced this pull request Feb 27, 2024
…x-built-in-fsmonitor

Fix the built-in FSMonitor, and run Scalar's Functional Tests as part of the automated builds
dscho added a commit that referenced this pull request Apr 23, 2024
…x-built-in-fsmonitor

Fix the built-in FSMonitor, and run Scalar's Functional Tests as part of the automated builds
dscho added a commit that referenced this pull request Apr 23, 2024
…x-built-in-fsmonitor

Fix the built-in FSMonitor, and run Scalar's Functional Tests as part of the automated builds
dscho added a commit that referenced this pull request Apr 23, 2024
…x-built-in-fsmonitor

Fix the built-in FSMonitor, and run Scalar's Functional Tests as part of the automated builds
dscho added a commit that referenced this pull request Apr 23, 2024
…x-built-in-fsmonitor

Fix the built-in FSMonitor, and run Scalar's Functional Tests as part of the automated builds
dscho added a commit that referenced this pull request Apr 24, 2024
…x-built-in-fsmonitor

Fix the built-in FSMonitor, and run Scalar's Functional Tests as part of the automated builds
dscho added a commit that referenced this pull request Apr 29, 2024
…x-built-in-fsmonitor

Fix the built-in FSMonitor, and run Scalar's Functional Tests as part of the automated builds
dscho added a commit that referenced this pull request May 14, 2024
…x-built-in-fsmonitor

Fix the built-in FSMonitor, and run Scalar's Functional Tests as part of the automated builds
dscho added a commit that referenced this pull request May 14, 2024
…x-built-in-fsmonitor

Fix the built-in FSMonitor, and run Scalar's Functional Tests as part of the automated builds
dscho added a commit that referenced this pull request May 14, 2024
…x-built-in-fsmonitor

Fix the built-in FSMonitor, and run Scalar's Functional Tests as part of the automated builds
dscho added a commit that referenced this pull request Jun 3, 2024
…x-built-in-fsmonitor

Fix the built-in FSMonitor, and run Scalar's Functional Tests as part of the automated builds
dscho added a commit that referenced this pull request Jun 3, 2024
…x-built-in-fsmonitor

Fix the built-in FSMonitor, and run Scalar's Functional Tests as part of the automated builds
dscho added a commit that referenced this pull request Jun 3, 2024
…x-built-in-fsmonitor

Fix the built-in FSMonitor, and run Scalar's Functional Tests as part of the automated builds
dscho added a commit that referenced this pull request Jul 17, 2024
…x-built-in-fsmonitor

Fix the built-in FSMonitor, and run Scalar's Functional Tests as part of the automated builds
dscho added a commit that referenced this pull request Jul 17, 2024
…x-built-in-fsmonitor

Fix the built-in FSMonitor, and run Scalar's Functional Tests as part of the automated builds
dscho added a commit that referenced this pull request Jul 17, 2024
…x-built-in-fsmonitor

Fix the built-in FSMonitor, and run Scalar's Functional Tests as part of the automated builds
dscho added a commit that referenced this pull request Jul 18, 2024
…x-built-in-fsmonitor

Fix the built-in FSMonitor, and run Scalar's Functional Tests as part of the automated builds
mjcheetham pushed a commit that referenced this pull request Jul 23, 2024
…x-built-in-fsmonitor

Fix the built-in FSMonitor, and run Scalar's Functional Tests as part of the automated builds
dscho added a commit that referenced this pull request Jul 25, 2024
…x-built-in-fsmonitor

Fix the built-in FSMonitor, and run Scalar's Functional Tests as part of the automated builds
mjcheetham pushed a commit that referenced this pull request Jul 29, 2024
…x-built-in-fsmonitor

Fix the built-in FSMonitor, and run Scalar's Functional Tests as part of the automated builds
dscho added a commit that referenced this pull request Sep 18, 2024
…x-built-in-fsmonitor

Fix the built-in FSMonitor, and run Scalar's Functional Tests as part of the automated builds
dscho added a commit that referenced this pull request Sep 24, 2024
…x-built-in-fsmonitor

Fix the built-in FSMonitor, and run Scalar's Functional Tests as part of the automated builds
dscho added a commit that referenced this pull request Oct 8, 2024
…x-built-in-fsmonitor

Fix the built-in FSMonitor, and run Scalar's Functional Tests as part of the automated builds
mjcheetham pushed a commit that referenced this pull request Dec 3, 2024
…x-built-in-fsmonitor

Fix the built-in FSMonitor, and run Scalar's Functional Tests as part of the automated builds
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants