Potential data race #1092
Comments
Thanks for reporting this. I have taken a look.
What I've determined is that when there are documents in the batch, we internally synchronize by going over a channel, so the race detector can prove that the access is safe. However, when there are no documents in the batch, the loop runs zero times, we never go through that channel synchronization, and the race detector correctly points out the bug. The fix turns out to be straightforward: we can eliminate the unsynchronized access by the other goroutine when there are no documents, which removes the race. I'll be submitting a PR to address this issue shortly.
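To illustrate the reasoning above, here is a minimal sketch, not the actual bleve code. The names `analyze` and `processBatch` are hypothetical. It assumes a worker goroutine that ranges over the documents and sends one result per document, and a caller that receives one result per document. Each receive orders the worker's reads before whatever the caller does next; with zero documents there would be no receive at all, so the sketch simply does not start the worker in that case.

```go
package main

import "fmt"

// analyze is a hypothetical stand-in for per-document analysis.
func analyze(doc string) int { return len(doc) }

// processBatch starts a worker only when there is at least one document,
// so every worker that exists is synchronized with via the results channel.
func processBatch(docs []string) (total int, spawned bool) {
	if len(docs) == 0 {
		// With no documents the receive loop below would run zero times,
		// and nothing would order a worker's reads of docs before the
		// caller's later writes. Skipping the worker avoids that.
		return 0, false
	}
	results := make(chan int)
	go func() {
		for _, d := range docs {
			results <- analyze(d)
		}
	}()
	for range docs {
		// Each receive happens after the matching send in the worker.
		total += <-results
	}
	return total, true
}

func main() {
	fmt.Println(processBatch([]string{"ab", "cde"}))
	fmt.Println(processBatch(nil))
}
```

Running a program like this with `go run -race` is one way to check that the guarded version reports no race.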
Fixed by #1121
This is another variation of the race found and fixed in #1092. In that case the batch was empty, which meant we would skip the code that properly synchronized access. Our fix only handled that exact case (no data operations). However, there is another variation: if the batch contains only deletes (which are data ops), we still spawned the goroutine, but since there were no real updates, the synchronization code would again be skipped, and the data race could happen. The fix is to check the number of updates (computed earlier on the caller's goroutine, so it's safe) instead of the length of the IndexOps (which includes updates and deletes). The key is that we should only spawn the goroutine that will range over the batch in cases where we will synchronize by waiting for the analysis to complete (at least one update). Fixes #1149
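The difference between the two checks can be sketched as follows. This is an illustration under assumptions, not the bleve implementation: the `Batch` type here is a hypothetical simplification in which a nil value in `IndexOps` marks a delete, and `countUpdates` and `shouldSpawnAnalysis` are made-up helper names.

```go
package main

import "fmt"

// Batch is a hypothetical simplification of a batch of index operations.
// Assumption: a nil value marks a delete, a non-nil value marks an update.
type Batch struct {
	IndexOps map[string]*string
}

// countUpdates runs on the caller's goroutine, before any worker exists,
// so reading the batch here needs no extra synchronization.
func countUpdates(b *Batch) int {
	n := 0
	for _, doc := range b.IndexOps {
		if doc != nil {
			n++
		}
	}
	return n
}

// shouldSpawnAnalysis reports whether a worker goroutine would be started:
// only when at least one update exists whose analysis the caller waits on.
func shouldSpawnAnalysis(b *Batch) bool {
	return countUpdates(b) > 0
}

func main() {
	doc := "hello"
	deletesOnly := &Batch{IndexOps: map[string]*string{"a": nil, "b": nil}}
	mixed := &Batch{IndexOps: map[string]*string{"a": nil, "c": &doc}}

	// First value: the older check on len(IndexOps); second: the update count check.
	fmt.Println(len(deletesOnly.IndexOps) > 0, shouldSpawnAnalysis(deletesOnly))
	fmt.Println(len(mixed.IndexOps) > 0, shouldSpawnAnalysis(mixed))
}
```

For the delete-only batch the length check would still start the worker while the update count check would not, which is the case described above.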
#1091 contains sample code that can produce a data race.
I made several attempts to fix it but could not reach a satisfactory result.
Please have a look.