[SPARK-12517] add default RDD name for one created via sc.textFile #10456
The feature was first added at commit 7b877b2 but was later removed (probably by mistake) at commit fc8b581.
Here is the symptom.
Using spark-1.5.2-bin-hadoop2.6, I get:
scala> sc.textFile("/home/root/.bashrc").name
res5: String = null
scala> sc.binaryFiles("/home/root/.bashrc").name
res6: String = /home/root/.bashrc
While using Spark 1.3.1:
scala> sc.textFile("/home/root/.bashrc").name
res0: String = /home/root/.bashrc
scala> sc.binaryFiles("/home/root/.bashrc").name
res1: String = /home/root/.bashrc
@wyaron please file a JIRA and add its ID to the title of this PR. See how other patches are opened.
The changes here look OK to me.
ok to test
Test build #48271 has finished for PR 10456 at commit
LGTM as well. Could you add test cases for this change?
This change extends SparkContextSuite to verify that RDDs that are created using file paths have their default name set to the path.
Test build #48305 has finished for PR 10456 at commit
Test build #48307 has finished for PR 10456 at commit
Test build #48309 has finished for PR 10456 at commit
Test build #48310 has finished for PR 10456 at commit
    var targetPath = mockPath + "textFile"
    assert(sc.textFile(targetPath).name == targetPath)
Could you compare using === instead of ==? We can get better error messages with the === operator.
Test build #48312 has finished for PR 10456 at commit
@@ -274,6 +274,31 @@ class SparkContextSuite extends SparkFunSuite with LocalSparkContext {
}
}

test("Default path for file based RDDs is properly set (SPARK-12517)") {
Indentation is broken.
Test build #48313 has finished for PR 10456 at commit
Thanks for the quick CR! It seems that setName should be called AFTER the map is invoked, hence on the returned RDD. Here is the original change: I have committed the proposed change, which moves setName after the map. I've tested locally and now the name field is valid. Does it make sense?
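The ordering issue described above can be sketched with a toy model. This is only an illustration under stated assumptions: ToyRDD and SetNameOrder are hypothetical names, not Spark classes, and the model captures just two behaviours relevant here, namely that setName mutates and returns the same instance while map returns a new instance whose name is unset.

```scala
// Toy stand-in for an RDD: it only models the `name` field, `setName`, and `map`.
// ToyRDD is hypothetical and is not part of Spark.
final class ToyRDD[A](val data: Seq[A]) {
  var name: String = null

  // Sets the name on this instance and returns the same instance.
  def setName(n: String): this.type = { name = n; this }

  // Returns a brand-new instance; the name is not carried over.
  def map[B](f: A => B): ToyRDD[B] = new ToyRDD(data.map(f))
}

object SetNameOrder {
  val path = "/home/root/.bashrc"

  // The name is set on the intermediate dataset and then lost by map.
  def nameBeforeMap: String =
    new ToyRDD(Seq("a", "b")).setName(path).map(_.toUpperCase).name

  // The name is set on the dataset that is actually returned to the caller.
  def nameAfterMap: String =
    new ToyRDD(Seq("a", "b")).map(_.toUpperCase).setName(path).name
}
```

In this sketch, nameBeforeMap stays null while nameAfterMap carries the path, which mirrors the null name reported for sc.textFile in the PR description.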
Test build #48314 has finished for PR 10456 at commit
Yeah, I think it's a good catch.
LGTM
O.K, I'll merge this into
The feature was first added at commit: 7b877b2 but was later removed (probably by mistake) at commit: fc8b581.
This change sets the default path of RDDs created via sc.textFile(...) to the path argument.
Here is the symptom:
* Using spark-1.5.2-bin-hadoop2.6:
scala> sc.textFile("/home/root/.bashrc").name
res5: String = null
scala> sc.binaryFiles("/home/root/.bashrc").name
res6: String = /home/root/.bashrc
* while using Spark 1.3.1:
scala> sc.textFile("/home/root/.bashrc").name
res0: String = /home/root/.bashrc
scala> sc.binaryFiles("/home/root/.bashrc").name
res1: String = /home/root/.bashrc
Author: Yaron Weinsberg <wyaron@gmail.com>
Author: yaron <yaron@il.ibm.com>
Closes #10456 from wyaron/master.
(cherry picked from commit 73b70f0)
Signed-off-by: Kousuke Saruta <sarutak@oss.nttdata.co.jp>