[SPARK-48889][SS] testStream to unload state stores before finishing #47339

Closed
wants to merge 2 commits into from

@siying siying commented Jul 12, 2024

What changes were proposed in this pull request?

At the end of each testStream() call, unload all state stores from the executor.

Why are the changes needed?

Currently, after a test finishes, we neither unload the state stores nor disable the maintenance task. The maintenance task can therefore still run after the test and fail because the checkpoint directory has already been deleted. This might cause an issue and fail the next test.
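
To make the failure mode described above concrete, here is a minimal, self-contained sketch. It does not use Spark: `ToyMaintenance` and `MaintenanceRaceDemo` are hypothetical stand-ins for a background maintenance task and a test that deletes its checkpoint directory, and the 10 ms period is an arbitrary assumption chosen so the effect shows up quickly.

```scala
import java.nio.file.{Files, Path}
import java.util.concurrent.{Executors, TimeUnit}
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical stand-in for a periodic maintenance task; this is NOT Spark's API.
final class ToyMaintenance(checkpointDir: Path) {
  val failures = new AtomicInteger(0)
  private val scheduler = Executors.newSingleThreadScheduledExecutor()

  def start(): Unit = {
    val task = new Runnable {
      override def run(): Unit = {
        // Maintenance needs the checkpoint directory; record a failure if it is gone.
        if (!Files.exists(checkpointDir)) failures.incrementAndGet()
      }
    }
    scheduler.scheduleAtFixedRate(task, 10, 10, TimeUnit.MILLISECONDS)
  }

  // Cancel future runs and wait for any in-flight run to finish.
  def stop(): Unit = {
    scheduler.shutdownNow()
    scheduler.awaitTermination(1, TimeUnit.SECONDS)
  }
}

object MaintenanceRaceDemo {
  // Returns how many maintenance runs found the checkpoint directory missing.
  def run(stopBeforeDelete: Boolean): Int = {
    val dir = Files.createTempDirectory("toy-checkpoint")
    val maintenance = new ToyMaintenance(dir)
    maintenance.start()
    if (stopBeforeDelete) maintenance.stop() // the ordering this PR aims for
    Files.delete(dir)                        // the test removes its checkpoint directory
    Thread.sleep(200)                        // window in which a live task would still fire
    maintenance.stop()
    maintenance.failures.get()
  }
}
```

Calling `MaintenanceRaceDemo.run(stopBeforeDelete = true)` stops the task before the directory disappears, so no run can observe the missing directory. With `stopBeforeDelete = false`, the task keeps firing during the sleep window and records failures, which is the kind of leftover work that could disturb a following test.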

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Covered by existing tests.

Was this patch authored or co-authored using generative AI tooling?

No.

@HeartSaVioR HeartSaVioR left a comment

The change itself looks OK. I left one comment for informational purposes.

@@ -813,6 +813,7 @@ trait StreamTest extends QueryTest with SharedSparkSession with TimeLimits with
           case (key, None) => sparkSession.conf.unset(key)
         }
         sparkSession.streams.removeListener(listener)
+        StateStore.stop()
Shall we leave a code comment explaining why we put this here? We already call StateStore.stop() in afterEach in various test suites, and future reviewers will want to understand why we can't simply put StateStore.stop() in afterEach. (I get the reason; I just want to help future reviewers.)
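
As an illustration only (not part of the review or the merged patch), the sketch below shows the structural difference being discussed: cleanup placed inside a testStream-style helper runs at the end of every call, including calls that throw, without relying on each suite to define its own afterEach hook. `ToyStateStore` and `ToyStreamTest` are hypothetical stand-ins, the `try`/`finally` shape is an assumption about how such a helper is organized, and the comment wording is just one possible phrasing based on the PR description.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical stand-in; the real call in the diff is StateStore.stop().
object ToyStateStore {
  val loaded: ArrayBuffer[String] = ArrayBuffer.empty[String]
  def load(id: String): Unit = loaded += id
  def stop(): Unit = loaded.clear()
}

object ToyStreamTest {
  // Mirrors the shape of a testStream-style helper: run the body, then always clean up.
  def testStream(body: => Unit): Unit = {
    try {
      body
    } finally {
      // Unload state stores at the end of every testStream() call, so that no
      // maintenance work can run against a checkpoint directory that has
      // already been deleted.
      ToyStateStore.stop()
    }
  }
}
```

For example, `ToyStreamTest.testStream { ToyStateStore.load("store-0") }` leaves `ToyStateStore.loaded` empty afterwards, and the same holds if the body throws, because the cleanup sits in the `finally` branch.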

@HeartSaVioR HeartSaVioR left a comment

+1

@HeartSaVioR
https://github.com/siying/spark/runs/27529815714

This failed only in the Docker integration test org.apache.spark.sql.jdbc.OracleIntegrationSuite, which is unrelated.

@HeartSaVioR
Thanks! Merging to master/3.5/3.4 (if there's no merge conflict).

HeartSaVioR pushed a commit that referenced this pull request Jul 17, 2024
### What changes were proposed in this pull request?
At the end of each testStream() call, unload all state stores from the executor.

### Why are the changes needed?
Currently, after a test finishes, we neither unload the state stores nor disable the maintenance task. The maintenance task can therefore still run after the test and fail because the checkpoint directory has already been deleted. This might cause an issue and fail the next test.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Covered by existing tests.

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes #47339 from siying/SPARK-48889.

Authored-by: Siying Dong <siying.dong@databricks.com>
Signed-off-by: Jungtaek Lim <kabhwan.opensource@gmail.com>
(cherry picked from commit 3a24555)
Signed-off-by: Jungtaek Lim <kabhwan.opensource@gmail.com>
HeartSaVioR pushed a commit that referenced this pull request Jul 17, 2024
jingz-db pushed a commit to jingz-db/spark that referenced this pull request Jul 22, 2024
szehon-ho pushed a commit to szehon-ho/spark that referenced this pull request Aug 7, 2024
attilapiros pushed a commit to attilapiros/spark that referenced this pull request Oct 4, 2024