Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-14579][SQL]Fix the race condition in StreamExecution.processAllAvailable again #12582

Closed
wants to merge 2 commits into from
Closed

Conversation

zsxwing
Copy link
Member

@zsxwing zsxwing commented Apr 21, 2016

What changes were proposed in this pull request?

#12339 didn't fix the race condition. MemorySinkSuite is still flaky: https://amplab.cs.berkeley.edu/jenkins/job/spark-master-test-maven-hadoop-2.2/814/testReport/junit/org.apache.spark.sql.streaming/MemorySinkSuite/registering_as_a_table/

Here is an execution order to reproduce it.

Time Thread 1 MicroBatchThread
1 MemorySink.getOffset
2 availableOffsets ++= newData (availableOffsets is not changed here)
3 addData(newData)
4 Set noNewData to false in processAllAvailable
5 dataAvailable returns false
6 noNewData = true
7 noNewData is true so just return
8 assert results and fail
9 dataAvailable returns true so process the new batch

This PR expands the scope of awaitBatchLock.synchronized to eliminate the above race.

How was this patch tested?

test("stress test"). It always failed before this patch. And it will pass after applying this patch. Ignore this test in the PR as it takes several minutes to finish.

@zsxwing
Copy link
Member Author

zsxwing commented Apr 21, 2016

cc @marmbrus

@SparkQA
Copy link

SparkQA commented Apr 21, 2016

Test build #56583 has finished for PR 12582 at commit eddf9fd.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@marmbrus
Copy link
Contributor

You can add the test and mark it @Ignore.

@zsxwing
Copy link
Member Author

zsxwing commented Apr 22, 2016

Added the stress test

@SparkQA
Copy link

SparkQA commented Apr 22, 2016

Test build #56612 has finished for PR 12582 at commit 6fadd0f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@zsxwing
Copy link
Member Author

zsxwing commented Apr 22, 2016

cc @tdas

@zsxwing
Copy link
Member Author

zsxwing commented May 2, 2016

ping @marmbrus

@marmbrus
Copy link
Contributor

marmbrus commented May 2, 2016

Thanks, merging to master and 2.0

@asfgit asfgit closed this in a35a67a May 2, 2016
asfgit pushed a commit that referenced this pull request May 2, 2016
…llAvailable again

## What changes were proposed in this pull request?

#12339 didn't fix the race condition. MemorySinkSuite is still flaky: https://amplab.cs.berkeley.edu/jenkins/job/spark-master-test-maven-hadoop-2.2/814/testReport/junit/org.apache.spark.sql.streaming/MemorySinkSuite/registering_as_a_table/

Here is an execution order to reproduce it.

| Time        |Thread 1           | MicroBatchThread  |
|:-------------:|:-------------:|:-----:|
| 1 | |  `MemorySink.getOffset` |
| 2 | |  availableOffsets ++= newData (availableOffsets is not changed here)  |
| 3 | addData(newData)      |   |
| 4 | Set `noNewData` to `false` in  processAllAvailable |  |
| 5 | | `dataAvailable` returns `false`   |
| 6 | | noNewData = true |
| 7 | `noNewData` is true so just return | |
| 8 |  assert results and fail | |
| 9 |   | `dataAvailable` returns true so process the new batch |

This PR expands the scope of `awaitBatchLock.synchronized` to eliminate the above race.

## How was this patch tested?

test("stress test"). It always failed before this patch. And it will pass after applying this patch. Ignore this test in the PR as it takes several minutes to finish.

Author: Shixiong Zhu <shixiong@databricks.com>

Closes #12582 from zsxwing/SPARK-14579-2.

(cherry picked from commit a35a67a)
Signed-off-by: Michael Armbrust <michael@databricks.com>
@zsxwing zsxwing deleted the SPARK-14579-2 branch May 2, 2016 18:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants