fix: Fix race condition in GrpcDirectStreamController #1537

Merged: 2 commits, Mar 23, 2023

@mutianf mutianf commented Mar 22, 2023

Fix a race condition in GrpcDirectStreamController that occurs when controller.request() is called while a retry attempt is starting. The race can cause an IllegalStateException:

java.lang.IllegalStateException: Not started

	at com.google.common.base.Preconditions.checkState(Preconditions.java:502)
	at io.grpc.internal.ClientCallImpl.request(ClientCallImpl.java:451)
	at com.google.api.gax.grpc.GrpcDirectStreamController.request(GrpcDirectStreamController.java:95)
	at com.google.api.gax.grpc.ExceptionResponseObserver$1.request(ExceptionResponseObserver.java:67)
	at com.google.api.gax.rpc.ServerStreamingAttemptCallable.onRequest(ServerStreamingAttemptCallable.java:345)
	at com.google.api.gax.rpc.ServerStreamingAttemptCallable.access$200(ServerStreamingAttemptCallable.java:97)
	at com.google.api.gax.rpc.ServerStreamingAttemptCallable$1.request(ServerStreamingAttemptCallable.java:168)
	at com.google.api.gax.tracing.TracedResponseObserver$1.request(TracedResponseObserver.java:83)
	at com.google.api.gax.rpc.ServerStreamingAttemptCallable.onRequest(ServerStreamingAttemptCallable.java:345)
	at com.google.api.gax.rpc.ServerStreamingAttemptCallable.access$200(ServerStreamingAttemptCallable.java:97)
	at com.google.api.gax.rpc.ServerStreamingAttemptCallable$1.request(ServerStreamingAttemptCallable.java:168)

This is triggered when GrpcDirectStreamController is created on a retry thread and controller.request() is called from another thread between these two statements:

this.hasStarted = true;

// controller.request() called from another thread

clientCall.start(new ResponseObserverAdapter(), new Metadata());
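For illustration, here is a minimal single-threaded replay of that interleaving. It is a sketch, not the gax code: FakeCall and both replay methods are hypothetical, and FakeCall only models the gRPC rule that request() is rejected before start().

```java
class StartOrderingSketch {
  /** Hypothetical stand-in for a gRPC call: request() is illegal before start(). */
  static final class FakeCall {
    private boolean started;

    void start() {
      started = true;
    }

    void request(int count) {
      if (!started) {
        throw new IllegalStateException("Not started");
      }
    }
  }

  /** Old ordering: the flag is set first, so a request can slip in before start(). */
  static String replayFlagBeforeStart() {
    FakeCall call = new FakeCall();
    boolean hasStarted = true; // retry thread sets the flag
    try {
      if (hasStarted) {
        call.request(1); // another thread sees the flag and forwards its request
      }
      call.start(); // retry thread starts the call, but too late
      return "ok";
    } catch (IllegalStateException e) {
      return e.getMessage();
    }
  }

  /** New ordering: the call is started before the flag is set. */
  static String replayStartBeforeFlag() {
    FakeCall call = new FakeCall();
    call.start(); // retry thread starts the call first
    boolean hasStarted = true; // only then is the flag set
    try {
      if (hasStarted) {
        call.request(1); // a request forwarded now reaches a started call
      }
      return "ok";
    } catch (IllegalStateException e) {
      return e.getMessage();
    }
  }

  public static void main(String[] args) {
    System.out.println(replayFlagBeforeStart()); // prints "Not started"
    System.out.println(replayStartBeforeFlag()); // prints "ok"
  }
}
```

The replay is deterministic on purpose: it fixes the order in which the two threads' steps happen rather than relying on real thread timing.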

@mutianf mutianf requested a review from a team as a code owner March 22, 2023 18:53
@product-auto-label product-auto-label bot added the size: m Pull request size is medium. label Mar 22, 2023

clientCall.start(new ResponseObserverAdapter(), new Metadata());

this.hasStarted = true;

Collaborator

Why was this line moved from before clientCall.start to after it? I don't know the exact use case of startCommon, but it could be an issue if clientCall.start takes some time to complete and startCommon is called again in the meantime.

Contributor Author

hasStarted is meant to be a barrier that has to open before we can interact with a stream: gRPC has the invariant that you can't call request() before a stream is started. Until the barrier is opened, we simply increment the request count. The original assumption was that request() could only be called from onStart() and onResponse(). However, that does not hold when retries are enabled, because the controller can be created on a different thread.

I don't think clientCall.start() taking a long time will be a problem here. GrpcDirectStreamController#start() is only called from here and here for bidi, which means that every callable.call() will create a new controller instance, so I don't think there's a race condition here.
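To make the barrier idea concrete, here is a rough sketch of a controller that counts requests until the call has started and then flushes them. It assumes a hypothetical Call interface and RecordingCall test double; the class and field names are illustrative and this is not the actual GrpcDirectStreamController code.

```java
class DeferredRequestController {
  /** Hypothetical minimal view of a call; not the real gRPC ClientCall API. */
  interface Call {
    void start();

    void request(int count);
  }

  /** Test double that enforces "no request() before start()" and counts requests. */
  static final class RecordingCall implements Call {
    private boolean started;
    private int requested;

    @Override
    public synchronized void start() {
      started = true;
    }

    @Override
    public synchronized void request(int count) {
      if (!started) {
        throw new IllegalStateException("Not started");
      }
      requested += count;
    }

    synchronized int requested() {
      return requested;
    }
  }

  private final Call call;
  private boolean hasStarted; // guarded by this
  private int pendingRequests; // guarded by this

  DeferredRequestController(Call call) {
    this.call = call;
  }

  /** Starts the call first, then opens the barrier and flushes deferred requests. */
  void start() {
    call.start();
    int toFlush;
    synchronized (this) {
      hasStarted = true;
      toFlush = pendingRequests;
      pendingRequests = 0;
    }
    if (toFlush > 0) {
      call.request(toFlush);
    }
  }

  /** Before the barrier opens, only the count is recorded; afterwards it is forwarded. */
  void request(int count) {
    synchronized (this) {
      if (!hasStarted) {
        pendingRequests += count;
        return;
      }
    }
    call.request(count);
  }

  public static void main(String[] args) {
    RecordingCall call = new RecordingCall();
    DeferredRequestController controller = new DeferredRequestController(call);
    controller.request(2); // deferred: the call has not started yet
    controller.start(); // starts the call, then flushes the 2 pending requests
    controller.request(3); // forwarded directly
    System.out.println(call.requested()); // prints 5
  }
}
```

Because the flag is flipped only after call.start() returns, and under the same lock that guards the pending count, a request() from another thread is either counted and flushed later or forwarded to a call that has already started.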

Collaborator

Thanks, it makes sense.

@blakeli0 blakeli0 added the owlbot:run Add this label to trigger the Owlbot post processor. label Mar 22, 2023
@gcf-owl-bot gcf-owl-bot bot removed the owlbot:run Add this label to trigger the Owlbot post processor. label Mar 22, 2023
@igorbernstein2 igorbernstein2 merged commit 17d133b into googleapis:main Mar 23, 2023
@mutianf mutianf deleted the race branch March 23, 2023 15:50