Add semaphore to limit subchannel connect to prevent race conditions #2422

Merged
merged 6 commits into grpc:master from jamesnk/connect-threadsafety
Apr 30, 2024

@JamesNK (Member) commented Apr 29, 2024

Addresses #2420

I have a theory that concurrent calls to ConnectTransportAsync could put a subchannel in a bad state. When two connect requests run at the same time, they race as they exit, and whichever exits last sets the subchannel's final status.

Before:

  1. Connect 1 starts.
  2. Connect 2 starts. Cancels connect 1.
  3. Connect 2 succeeds.
  4. Connect 2 exits and updates subchannel status to Ready.
  5. Connect 1 exits and updates subchannel status to TransientFailure, overwriting Ready even though connect 2 succeeded.
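
The "Before" sequence can be reproduced in a small sketch. This is a Python asyncio analogy, not the C# code in grpc-dotnet: the `Subchannel` class, `request_connect`, and the sleep timings are hypothetical and exist only to force the ordering described above.

```python
import asyncio


class Subchannel:
    """Hypothetical stand-in for a subchannel (not the grpc-dotnet type)."""

    def __init__(self):
        self.state = "Idle"
        self.history = []      # (caller, state) pairs in the order they were written
        self._attempt = None   # most recent connect task, if any

    def _set_state(self, who, state):
        self.state = state
        self.history.append((who, state))

    async def _attempt_connect(self, who, connect_time, exit_time):
        try:
            await asyncio.sleep(connect_time)   # stands in for opening a transport
            self._set_state(who, "Ready")
        except asyncio.CancelledError:
            await asyncio.sleep(exit_time)      # a cancelled attempt may exit slowly
            self._set_state(who, "TransientFailure")

    def request_connect(self, who, connect_time, exit_time=0.0):
        if self._attempt is not None:
            self._attempt.cancel()              # cancel, but do not wait for it to exit
        self._attempt = asyncio.ensure_future(
            self._attempt_connect(who, connect_time, exit_time))
        return self._attempt


async def run_unsynchronized():
    sub = Subchannel()
    first = sub.request_connect("connect 1", connect_time=1.0, exit_time=0.05)
    await asyncio.sleep(0.01)                   # let connect 1 start
    second = sub.request_connect("connect 2", connect_time=0.01)
    await asyncio.gather(first, second)
    return sub


if __name__ == "__main__":
    result = asyncio.run(run_unsynchronized())
    print(result.history, result.state)
```

With these timings, connect 2 writes Ready first and the cancelled connect 1 writes TransientFailure afterwards, so the sketch ends in TransientFailure despite the successful connection.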

This PR limits access to connect with a semaphore. A subchannel will have one connect in progress at a time.

After:

  1. Connect 1 acquires semaphore and starts.
  2. Connect 2 tries to start. Cancels connect 1. Waits on the semaphore for connect 1 to finish.
  3. Connect 1 exits and updates subchannel status to TransientFailure.
  4. Connect 1 releases semaphore.
  5. Connect 2 acquires semaphore.
  6. Connect 2 succeeds.
  7. Connect 2 exits and updates subchannel status to Ready.
  8. Connect 2 releases semaphore.
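
The "After" sequence can be sketched the same way. Again this is a Python asyncio analogy under stated assumptions, not the PR's C# implementation: `Subchannel`, `request_connect`, and the timings are hypothetical. The only idea taken from the PR is that a connect attempt holds a semaphore until it has finished updating the status.

```python
import asyncio


class Subchannel:
    """Hypothetical stand-in for a subchannel (not the grpc-dotnet type)."""

    def __init__(self):
        self.state = "Idle"
        self.history = []                            # (caller, state) pairs in write order
        self._attempt = None                         # most recent connect task, if any
        self._connect_lock = asyncio.Semaphore(1)    # one connect in progress at a time

    def _set_state(self, who, state):
        self.state = state
        self.history.append((who, state))

    async def _attempt_connect(self, who, connect_time, exit_time):
        # The semaphore is held until the status update is done, so a later
        # attempt cannot update the status before an earlier one has exited.
        async with self._connect_lock:
            try:
                await asyncio.sleep(connect_time)    # stands in for opening a transport
                self._set_state(who, "Ready")
            except asyncio.CancelledError:
                await asyncio.sleep(exit_time)       # slow exit after cancellation
                self._set_state(who, "TransientFailure")

    def request_connect(self, who, connect_time, exit_time=0.0):
        if self._attempt is not None:
            self._attempt.cancel()                   # interrupt the in-flight attempt
        self._attempt = asyncio.ensure_future(
            self._attempt_connect(who, connect_time, exit_time))
        return self._attempt


async def run_serialized():
    sub = Subchannel()
    first = sub.request_connect("connect 1", connect_time=1.0, exit_time=0.05)
    await asyncio.sleep(0.01)                        # let connect 1 take the semaphore
    second = sub.request_connect("connect 2", connect_time=0.01)
    await asyncio.gather(first, second)
    return sub


if __name__ == "__main__":
    result = asyncio.run(run_serialized())
    print(result.history, result.state)
```

Here connect 2 cannot begin until connect 1 has written TransientFailure and released the semaphore, so the last write is connect 2's Ready.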

Note: The new test is unrelated to this exact problem. I wrote it while considering another theory, but it's a good test to keep around.

@JamesNK merged commit 63914f2 into grpc:master Apr 30, 2024
5 checks passed
@JamesNK deleted the jamesnk/connect-threadsafety branch April 30, 2024 10:38
malandis added a commit to momentohq/client-sdk-dotnet that referenced this pull request May 24, 2024
This addresses regressions we saw when spawning many new
connections. Possibly fixed by grpc/grpc-dotnet#2422
oguzhand95 referenced this pull request in cerbos/cerbos-sdk-net May 31, 2024

This PR contains the following updates:

| Package | Change | Age | Adoption | Passing | Confidence |
|---|---|---|---|---|---|
| [Google.Protobuf](https://github.com/protocolbuffers/protobuf) | `3.26.1` -> `3.27.0` | [![age](https://developer.mend.io/api/mc/badges/age/nuget/Google.Protobuf/3.27.0?slim=true)](https://docs.renovatebot.com/merge-confidence/) | [![adoption](https://developer.mend.io/api/mc/badges/adoption/nuget/Google.Protobuf/3.27.0?slim=true)](https://docs.renovatebot.com/merge-confidence/) | [![passing](https://developer.mend.io/api/mc/badges/compatibility/nuget/Google.Protobuf/3.26.1/3.27.0?slim=true)](https://docs.renovatebot.com/merge-confidence/) | [![confidence](https://developer.mend.io/api/mc/badges/confidence/nuget/Google.Protobuf/3.26.1/3.27.0?slim=true)](https://docs.renovatebot.com/merge-confidence/) |
| [Grpc.Net.Client](https://github.com/grpc/grpc-dotnet) | `2.62.0` -> `2.63.0` | [![age](https://developer.mend.io/api/mc/badges/age/nuget/Grpc.Net.Client/2.63.0?slim=true)](https://docs.renovatebot.com/merge-confidence/) | [![adoption](https://developer.mend.io/api/mc/badges/adoption/nuget/Grpc.Net.Client/2.63.0?slim=true)](https://docs.renovatebot.com/merge-confidence/) | [![passing](https://developer.mend.io/api/mc/badges/compatibility/nuget/Grpc.Net.Client/2.62.0/2.63.0?slim=true)](https://docs.renovatebot.com/merge-confidence/) | [![confidence](https://developer.mend.io/api/mc/badges/confidence/nuget/Grpc.Net.Client/2.62.0/2.63.0?slim=true)](https://docs.renovatebot.com/merge-confidence/) |
| [Microsoft.NET.Test.Sdk](https://github.com/microsoft/vstest) | `17.9.0` -> `17.10.0` | [![age](https://developer.mend.io/api/mc/badges/age/nuget/Microsoft.NET.Test.Sdk/17.10.0?slim=true)](https://docs.renovatebot.com/merge-confidence/) | [![adoption](https://developer.mend.io/api/mc/badges/adoption/nuget/Microsoft.NET.Test.Sdk/17.10.0?slim=true)](https://docs.renovatebot.com/merge-confidence/) | [![passing](https://developer.mend.io/api/mc/badges/compatibility/nuget/Microsoft.NET.Test.Sdk/17.9.0/17.10.0?slim=true)](https://docs.renovatebot.com/merge-confidence/) | [![confidence](https://developer.mend.io/api/mc/badges/confidence/nuget/Microsoft.NET.Test.Sdk/17.9.0/17.10.0?slim=true)](https://docs.renovatebot.com/merge-confidence/) |

---

### Release Notes

<details>
<summary>grpc/grpc-dotnet (Grpc.Net.Client)</summary>

### [`v2.63.0`](https://github.com/grpc/grpc-dotnet/releases/tag/v2.63.0)

##### What's Changed

- Prevent block inside ResolveAsync from blocking
PollingResolver.Refresh by
[@JamesNK](https://github.com/JamesNK) in
[https://github.com/grpc/grpc-dotnet/pull/2385](https://github.com/grpc/grpc-dotnet/pull/2385)
- Bump follow-redirects from 1.15.4 to 1.15.6 in
/testassets/InteropTestsGrpcWebWebsite/Tests by
[@dependabot](https://github.com/dependabot) in
[https://github.com/grpc/grpc-dotnet/pull/2392](https://github.com/grpc/grpc-dotnet/pull/2392)
- Update microsoft-support.md by
[@JamesNK](https://github.com/JamesNK) in
[https://github.com/grpc/grpc-dotnet/pull/2403](https://github.com/grpc/grpc-dotnet/pull/2403)
- fix a couple typos in README.md by
[@jjanuszkiewicz](https://github.com/jjanuszkiewicz) in
[https://github.com/grpc/grpc-dotnet/pull/2397](https://github.com/grpc/grpc-dotnet/pull/2397)
- Interrupt existing subchannel connect attempt when reconnect is
requested by [@JamesNK](https://github.com/JamesNK) in
[https://github.com/grpc/grpc-dotnet/pull/2410](https://github.com/grpc/grpc-dotnet/pull/2410)
- Update Directory.Packages.props by
[@WeihanLi](https://github.com/WeihanLi) in
[https://github.com/grpc/grpc-dotnet/pull/2413](https://github.com/grpc/grpc-dotnet/pull/2413)
- [#2401](https://github.com/grpc/grpc-dotnet/issues/2401) Add
new TFM's so package dependency can be removed by
[@thompson-tomo](https://github.com/thompson-tomo) in
[https://github.com/grpc/grpc-dotnet/pull/2402](https://github.com/grpc/grpc-dotnet/pull/2402)
- support `ReadAllAsync` for netstandard2.0 by
[@WeihanLi](https://github.com/WeihanLi) in
[https://github.com/grpc/grpc-dotnet/pull/2411](https://github.com/grpc/grpc-dotnet/pull/2411)
- Fix ObjectDisposedException message by
[@drewnoakes](https://github.com/drewnoakes) in
[https://github.com/grpc/grpc-dotnet/pull/2415](https://github.com/grpc/grpc-dotnet/pull/2415)
- Enable multiple connections with WinHttpHandler by default by
[@JamesNK](https://github.com/JamesNK) in
[https://github.com/grpc/grpc-dotnet/pull/2416](https://github.com/grpc/grpc-dotnet/pull/2416)
- Fix memory leak when using call context propagation with cancellation
token by [@JamesNK](https://github.com/JamesNK) in
[https://github.com/grpc/grpc-dotnet/pull/2421](https://github.com/grpc/grpc-dotnet/pull/2421)
- Fix HTTP/3 test errors on .NET 6 by
[@JamesNK](https://github.com/JamesNK) in
[https://github.com/grpc/grpc-dotnet/pull/2423](https://github.com/grpc/grpc-dotnet/pull/2423)
- Add semaphore to limit subchannel connect to prevent race conditions
by [@JamesNK](https://github.com/JamesNK) in
[https://github.com/grpc/grpc-dotnet/pull/2422](https://github.com/grpc/grpc-dotnet/pull/2422)
- Don't capture async locals in resolver by
[@JamesNK](https://github.com/JamesNK) in
[https://github.com/grpc/grpc-dotnet/pull/2426](https://github.com/grpc/grpc-dotnet/pull/2426)
- Update Grpc.Tools to 2.63 by
[@apolcyn](https://github.com/apolcyn) in
[https://github.com/grpc/grpc-dotnet/pull/2429](https://github.com/grpc/grpc-dotnet/pull/2429)
- Bump 2.63.x branch to 2.63.0-pre1 by
[@apolcyn](https://github.com/apolcyn) in
[https://github.com/grpc/grpc-dotnet/pull/2430](https://github.com/grpc/grpc-dotnet/pull/2430)
- Fix build on v2.63.x - cherry pick
[https://github.com/grpc/grpc-dotnet/pull/2437](https://github.com/grpc/grpc-dotnet/pull/2437)
by [@apolcyn](https://github.com/apolcyn) in
[https://github.com/grpc/grpc-dotnet/pull/2442](https://github.com/grpc/grpc-dotnet/pull/2442)
- Bump 2.63.x to stable release by
[@apolcyn](https://github.com/apolcyn) in
[https://github.com/grpc/grpc-dotnet/pull/2440](https://github.com/grpc/grpc-dotnet/pull/2440)

##### New Contributors

- [@jjanuszkiewicz](https://github.com/jjanuszkiewicz) made
their first contribution in
[https://github.com/grpc/grpc-dotnet/pull/2397](https://github.com/grpc/grpc-dotnet/pull/2397)
- [@thompson-tomo](https://github.com/thompson-tomo) made their
first contribution in
[https://github.com/grpc/grpc-dotnet/pull/2402](https://github.com/grpc/grpc-dotnet/pull/2402)
- [@drewnoakes](https://github.com/drewnoakes) made their first
contribution in
[https://github.com/grpc/grpc-dotnet/pull/2415](https://github.com/grpc/grpc-dotnet/pull/2415)

**Full Changelog**:
grpc/grpc-dotnet@v2.62.0...v2.63.0

</details>

<details>
<summary>microsoft/vstest (Microsoft.NET.Test.Sdk)</summary>

### [`v17.10.0`](https://github.com/microsoft/vstest/releases/tag/v17.10.0)

##### What's Changed

- Add missing runtimeconfig.json file for 8.0 by
[@MarcoRossignoli](https://github.com/MarcoRossignoli) in
[https://github.com/microsoft/vstest/pull/4792](https://github.com/microsoft/vstest/pull/4792)
- Localized file check-in by OneLocBuild Task: Build definition ID 1222:
Build ID
[`2338548`](https://github.com/microsoft/vstest/commit/2338548) by
[@dotnet-bot](https://github.com/dotnet-bot) in
[https://github.com/microsoft/vstest/pull/4794](https://github.com/microsoft/vstest/pull/4794)
- Disable testhost prestart by
[@nohwnd](https://github.com/nohwnd) in
[https://github.com/microsoft/vstest/pull/4833](https://github.com/microsoft/vstest/pull/4833)
- Terminal logger fixes by [@nohwnd](https://github.com/nohwnd)
in
[https://github.com/microsoft/vstest/pull/4834](https://github.com/microsoft/vstest/pull/4834)
- Add RiscV64 by [@nohwnd](https://github.com/nohwnd) in
[https://github.com/microsoft/vstest/pull/4838](https://github.com/microsoft/vstest/pull/4838)
- Add deterministic source mapping storing for Microsoft.CodeCoverage by
[@jakubch1](https://github.com/jakubch1) in
[https://github.com/microsoft/vstest/pull/4849](https://github.com/microsoft/vstest/pull/4849)
- Fix terminal logger encoding & error by
[@nohwnd](https://github.com/nohwnd) in
[https://github.com/microsoft/vstest/pull/4853](https://github.com/microsoft/vstest/pull/4853)
- Update sourcebuild configuration to build net previous and net current
by [@Evangelink](https://github.com/Evangelink) in
[https://github.com/microsoft/vstest/pull/4856](https://github.com/microsoft/vstest/pull/4856)
- Updating version of Microsoft.VisualStudio.Interop to 17.10 by
[@MSLukeWest](https://github.com/MSLukeWest) in
[https://github.com/microsoft/vstest/pull/4866](https://github.com/microsoft/vstest/pull/4866)
- Add VSTEST_DIAG_VERBOSITY to help by
[@nohwnd](https://github.com/nohwnd) in
[https://github.com/microsoft/vstest/pull/4882](https://github.com/microsoft/vstest/pull/4882)
- Fix feature flag name by [@nohwnd](https://github.com/nohwnd)
in
[https://github.com/microsoft/vstest/pull/4885](https://github.com/microsoft/vstest/pull/4885)
- Improve terminal logger by
[@nohwnd](https://github.com/nohwnd) in
[https://github.com/microsoft/vstest/pull/4877](https://github.com/microsoft/vstest/pull/4877)
- Remove PackageLicenseFile preventing PackageLicenseExpression from
working by [@lahma](https://github.com/lahma) in
[https://github.com/microsoft/vstest/pull/4890](https://github.com/microsoft/vstest/pull/4890)
- Add GitHub Actions logger by
[@martincostello](https://github.com/martincostello) in
[https://github.com/microsoft/vstest/pull/4906](https://github.com/microsoft/vstest/pull/4906)
- Ensure to send a session complete event by
[@drognanar](https://github.com/drognanar) in
[https://github.com/microsoft/vstest/pull/4878](https://github.com/microsoft/vstest/pull/4878)
- specify Win10 + maxversiontested to enable xaml APIs to be used in
tests running under testhost.exe by
[@ChrisGuzak](https://github.com/ChrisGuzak) in
[https://github.com/microsoft/vstest/pull/4888](https://github.com/microsoft/vstest/pull/4888)
- Make VSTest repo buildable in VMR non-source-build by
[@ViktorHofer](https://github.com/ViktorHofer) in
[https://github.com/microsoft/vstest/pull/4920](https://github.com/microsoft/vstest/pull/4920)
- Migrate pipelines by [@nohwnd](https://github.com/nohwnd) in
[https://github.com/microsoft/vstest/pull/4921](https://github.com/microsoft/vstest/pull/4921)
- Add test name to MSBuild where we have frame. by
[@nohwnd](https://github.com/nohwnd) in
[https://github.com/microsoft/vstest/pull/4935](https://github.com/microsoft/vstest/pull/4935)
- \[rel/17.10] Add list of known TestingPlatform dlls by
[@nohwnd](https://github.com/nohwnd) in
[https://github.com/microsoft/vstest/pull/4982](https://github.com/microsoft/vstest/pull/4982)

And many infrastructure related changes and updates.

##### New Contributors

- [@ellahathaway](https://github.com/ellahathaway) made their
first contribution in
[https://github.com/microsoft/vstest/pull/4785](https://github.com/microsoft/vstest/pull/4785)
- [@MSLukeWest](https://github.com/MSLukeWest) made their first
contribution in
[https://github.com/microsoft/vstest/pull/4866](https://github.com/microsoft/vstest/pull/4866)
- [@lahma](https://github.com/lahma) made their first
contribution in
[https://github.com/microsoft/vstest/pull/4890](https://github.com/microsoft/vstest/pull/4890)
- [@ChrisGuzak](https://github.com/ChrisGuzak) made their first
contribution in
[https://github.com/microsoft/vstest/pull/4888](https://github.com/microsoft/vstest/pull/4888)

**Full Changelog**:
microsoft/vstest@v17.9.0...v17.10.0

</details>

---

### Configuration

📅 **Schedule**: Branch creation - "before 4am on Monday" (UTC),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

👻 **Immortal**: This PR will be recreated if closed unmerged. Get
[config help](https://github.com/renovatebot/renovate/discussions) if
that's undesired.

---


This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/cerbos/cerbos-sdk-net).

Signed-off-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
Signed-off-by: Oğuzhan Durgun <oguzhandurgun95@gmail.com>
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>