[SPARK-23390][SQL] Flaky Test Suite: FileBasedDataSourceSuite in Spark 2.3/hadoop 2.7 #20584
Conversation
```scala
val df = spark.read.format(format).load(
  new Path(basePath, "first").toString,
  new Path(basePath, "second").toString,
  new Path(basePath, "third").toString)

val fs = thirdPath.getFileSystem(spark.sparkContext.hadoopConfiguration)
// Make sure all data files are deleted and can't be opened.
files.foreach(f => fs.delete(f, false))
assert(fs.delete(thirdPath, true))
```
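The hardened check above can be illustrated with a Spark-free sketch using `java.nio.file`. This is only an analogy on the local filesystem: `DeleteAndVerify` and the temp paths are hypothetical stand-ins for `fs.delete` on `thirdPath`, not the actual test code.

```scala
import java.nio.file.{Files, NoSuchFileException, Path}

object DeleteAndVerify {
  def main(args: Array[String]): Unit = {
    // Stand-ins for thirdPath and its single data file.
    val dir: Path  = Files.createTempDirectory("third")
    val file: Path = Files.createTempFile(dir, "part-", ".orc")

    // Delete every file first, then the directory itself, mirroring
    // files.foreach(f => fs.delete(f, false)); assert(fs.delete(thirdPath, true)).
    assert(Files.deleteIfExists(file))
    assert(Files.deleteIfExists(dir))

    // The extra guarantee the PR adds: a deleted file can no longer be opened.
    val openable =
      try { Files.newInputStream(file).close(); true }
      catch { case _: NoSuchFileException => false }
    assert(!openable, "deleted file must not be openable")
    println("ok")
  }
}
```

On HDFS the semantics of `FileSystem.delete` can differ from the local filesystem, which is exactly the uncertainty the PR description speculates about.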
Hmmmm .. but it asserts true on the delete completion, so I would be surprised if it's something not guaranteed.
BTW, my rough wild guess was that case 2 (reading the file but not closing it) happens in the schema inference path.
LGTM. I would merge this first if Jenkins does not fail, and see whether it helps fix the flaky tests.
I won't get in the way, but I am less sure about this. I thought this was also flaky in the PR builder anyway.
According to the log, the leaked file stream was created when building the ORC columnar reader; schema inference is fine.
You are right. I have run out of ideas. LGTM too, as a try, if it happens more frequently in spark-branch-2.3-test-sbt-hadoop-2.7.
LGTM, seems plausible!
I am also thinking about this, and I agree. I am suspicious about the relationship between
Test build #87321 has finished for PR 20584 at commit
merging this to master/2.3. Thanks! |
…k 2.3/hadoop 2.7

## What changes were proposed in this pull request?

This test only fails with sbt on Hadoop 2.7. I can't reproduce it locally, but here is my speculation from looking at the code:

1. FileSystem.delete doesn't delete the directory entirely; somehow we can still open the file as a 0-length empty file. (just speculation)
2. ORC intentionally allows empty files, and the reader fails during reading without closing the file stream.

This PR improves the test to make sure all files are deleted and can't be opened.

## How was this patch tested?

N/A

Author: Wenchen Fan <wenchen@databricks.com>

Closes #20584 from cloud-fan/flaky-test.

(cherry picked from commit 6efd5d1)
Signed-off-by: Sameer Agarwal <sameerag@apache.org>
This is one of my speculations. There are 2 possibilities I can think of: 1) the task completion listener is not called before
For 1), it seems we've fixed it in c5a31d1. For 2), I'm not sure and may need help from the ORC folks.
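The second failure mode discussed in this thread, a reader whose construction fails without closing the stream it was handed, can be sketched in plain Scala. `FailingReader`, `TrackedStream`, and `CloseOnFailure` are hypothetical names for illustration, not Spark or ORC APIs.

```scala
import java.io.{Closeable, IOException}

// Hypothetical reader whose constructor throws after the stream was opened,
// e.g. on a 0-length "empty" ORC file. If nobody closes `in`, it leaks.
final class FailingReader(in: Closeable) {
  throw new IOException("malformed 0-length file")
}

object CloseOnFailure {
  // A stream that records whether close() was ever called.
  final class TrackedStream extends Closeable {
    var closed = false
    override def close(): Unit = closed = true
  }

  def main(args: Array[String]): Unit = {
    val in = new TrackedStream
    try {
      new FailingReader(in) // constructor fails before anyone owns the stream
    } catch {
      case _: IOException => in.close() // without this, the file stream leaks
    }
    assert(in.closed, "stream must be closed even when the reader fails")
    println("closed: " + in.closed)
  }
}
```

The same close-on-failure discipline is what a task completion listener provides at a coarser granularity: any stream still open when the task finishes gets cleaned up there instead.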
I think I rushed to take a look initially. Thanks for fixing this.
Great! https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-branch-2.3-test-sbt-hadoop-2.7/ becomes green again!!!
My bad. Thank you, guys. For the following, I'll investigate it more.

This patch helps.

For the following case, I'll make a PR for the Spark ORC columnar reader very soon.
I created a PR, #20590.