Skip to content
This repository has been archived by the owner on Sep 26, 2023. It is now read-only.

fix: Fix race condition in BatcherImpl flush #1200

Merged
merged 2 commits into from
Oct 5, 2020

Conversation

igorbernstein2
Copy link
Contributor

Currently the following race condition exists:

T1 - awaitAllOutstandingBatches checks that numOfOutstandingBatches > 0
T2 - onBatchCompletion decrements numOfOutstandingBatches
T2 - flushLock.notifyAll()
T1 - flushLock.wait()

so T1 will wait indefinitely

The fix is quite simple: make sure that the there batches to wait for after acquiring the lock

@igorbernstein2 igorbernstein2 requested a review from a team October 5, 2020 18:04
@google-cla google-cla bot added the cla: yes This human has signed the Contributor License Agreement. label Oct 5, 2020
@codecov
Copy link

codecov bot commented Oct 5, 2020

Codecov Report

Merging #1200 into master will increase coverage by 0.00%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff            @@
##             master    #1200   +/-   ##
=========================================
  Coverage     79.06%   79.07%           
- Complexity     1197     1198    +1     
=========================================
  Files           205      205           
  Lines          5268     5270    +2     
  Branches        435      436    +1     
=========================================
+ Hits           4165     4167    +2     
  Misses          930      930           
  Partials        173      173           
Impacted Files Coverage Δ Complexity Δ
.../java/com/google/api/gax/batching/BatcherImpl.java 98.06% <100.00%> (+0.02%) 17.00 <0.00> (+1.00)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 0976e20...ebad62f. Read the comment docs.

Copy link
Contributor

@miraleung miraleung left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vam PTAL

@vam-google vam-google self-requested a review October 5, 2020 18:39
Copy link
Contributor

@vam-google vam-google left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but how painful will it be to add a test for that?

@igorbernstein2
Copy link
Contributor Author

I think a test would be quite difficult, but I can try in a separate PR. This is currently a blocker for a customer, so I'd prefer to get this out asap.

Would it be possible to cut a release with this change either today or tomorrow?

@igorbernstein2
Copy link
Contributor Author

igorbernstein2 commented Oct 5, 2020

Added a stress test..wasnt as bad as I expected

Currently the following race condition exists:

T1 - awaitAllOutstandingBatches checks that numOfOutstandingBatches > 0
T2 - onBatchCompletion decrements numOfOutstandingBatches
T2 - flushLock.notifyAll()
T1 - flushLock.wait()

so T1 will wait indefinitely

The fix is quite simple: make sure that the there batches to wait for after acquiring the lock
@igorbernstein2 igorbernstein2 merged commit c6308c9 into googleapis:master Oct 5, 2020
@igorbernstein2 igorbernstein2 deleted the batcher-race branch October 5, 2020 21:38
@miraleung miraleung mentioned this pull request Oct 5, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
cla: yes This human has signed the Contributor License Agreement.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants