SPARK-5425: Use synchronised methods in system properties to create SparkConf #4220
QA tests have started for PR 4220 at commit
|
It does look safer, since |
@srowen exactly - that was the idea :) |
QA tests have finished for PR 4220 at commit
|
Test FAILed. |
What is going on with these tests??? I've created three PRs - for 1.1, 1.2 and 1.3 and all of them failed in a very strange way. |
@@ -17,6 +17,10 @@

package org.apache.spark

import java.util.concurrent.{TimeUnit, Executors}
ultra nit: sort imports
LGTM. I was worried about |
retest this please. Hopefully those test failures were random; let's see. By the way, I think that if you want the exact same patch applied to multiple branches, the standard practice is to open just one PR and add a comment that it should be backported to the other branches. It's easy for committers to apply a patch to multiple branches, and it makes the PRs easier to track, though this does mean we assume that if the tests pass on master, they'll pass on the other branches. (Somebody correct me if I'm mistaken here ...) |
retest this please |
QA tests have started for PR 4220 at commit
|
QA tests have finished for PR 4220 at commit
|
Test FAILed. |
I kinda see what is going on with the tests now. A test case in SparkSubmitSuite adds a non-existent jar to the system properties. That property never gets cleared, so later jobs, which should submit successfully, are still trying to load that jar. I don't understand how your change is affecting this, though, or how this ever worked before. Maybe this change is exposing some lurking error, something which just "happened" to work before. It seems like having multiple apps futzing with the system properties at the same time is bound to create problems. |
Kinda diverging into off-topic territory, but...
Yeah, this is one of those things that keep ringing in my head all the time. I think we should have a plan to have some Spark-specific entry point (e.g. But that's for another PR, this one looks good as it is (perhaps after fixing the offending test). |
I found that there is a small difference between the outcome of my code and that of the original code. I don't know how it affects this test suite yet, but: the default Java wrapper over the system properties, which was used originally, uses |
It looks like there is The copy was created in this way: oldProperties = new Properties(System.getProperties) which did not initialize properties as they were in the original system properties but rather set them as defaults in new properties object, which in turn made The way I'm gonna fix |
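For readers following along, the behaviour described above can be reproduced with plain `java.util.Properties`. The following is a minimal sketch, not code from the PR; the object name `DefaultsVsEntries` and the property key are made up for illustration. It shows that `new Properties(base)` keeps `base` only as a fallback for lookups and copies none of its entries.

```scala
import java.util.Properties

// Hypothetical helper for illustration; not part of Spark.
object DefaultsVsEntries {
  // Builds a Properties object the way the old ResetSystemProperties did:
  // `base` becomes the *defaults* of the result, nothing is copied.
  def wrap(base: Properties): Properties = new Properties(base)

  def main(args: Array[String]): Unit = {
    val base = new Properties()
    base.setProperty("spark.test.key", "A")
    val wrapped = wrap(base)

    // The wrapper has no entries of its own...
    println(wrapped.size())                         // 0
    println(wrapped.containsKey("spark.test.key"))  // false
    // ...but string lookups fall through to the defaults.
    println(wrapped.getProperty("spark.test.key"))  // A

    // Later changes to `base` are visible through the wrapper,
    // so the wrapper is not a snapshot.
    base.setProperty("spark.test.key", "B")
    println(wrapped.getProperty("spark.test.key"))  // B
  }
}
```

Because the wrapper stays linked to the live object, restoring it later does not bring back the earlier state, which would be consistent with the leaked property seen in the failing tests.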
QA tests have started for PR 4220 at commit
|
QA tests have finished for PR 4220 at commit
|
Test PASSed. |
@@ -42,7 +43,7 @@ private[spark] trait ResetSystemProperties extends BeforeAndAfterEach { this: Su
   var oldProperties: Properties = null

   override def beforeEach(): Unit = {
-    oldProperties = new Properties(System.getProperties)
+    oldProperties = SerializationUtils.clone(System.getProperties)
Thanks for tracking this down. Can you please put a comment in here explaining why you need a clone? It is really subtle; I can easily see this getting reverted down the road if somebody doesn't know why it's there.
d821b15 to 74b4489
QA tests have started for PR 4220 at commit
|
lgtm |
QA tests have finished for PR 4220 at commit
|
Test FAILed. |
@JoshRosen can you take a look and maybe merge it? |
Could we get some closure on this? The broken |
@jacek-lewandowski a |
@srowen - unfortunately they are something more - they inherit from the |
@jacek-lewandowski I think Sean meant that you can do |
Look at this simple example:
val parent = new Properties()
parent.setProperty("test1", "A")
val child = new Properties(parent)
child.put("test2", "B")
val copy = new Properties()
copy.putAll(child)
child.getProperty("test1")
child.getProperty("test2")
copy.getProperty("test1")
copy.getProperty("test2")
which will result in:
In other words: |
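A runnable version of the example above, offered as a sketch and not as code from the PR. The object name `PutAllDropsDefaults` and the helper `shallowCopy` are hypothetical. The `Hashtable` type ascription is an assumption made for portability: some Scala/JDK combinations are known to report a direct `putAll` call on `Properties` as ambiguous.

```scala
import java.util.{Hashtable, Properties}

// Hypothetical helper for illustration; not part of Spark.
object PutAllDropsDefaults {
  // Copies only the entries stored directly in `source`.
  // Defaults are not entries, so putAll never sees them.
  def shallowCopy(source: Properties): Properties = {
    val copy = new Properties()
    // Ascribing the Hashtable type selects a single putAll overload
    // (assumption: needed only on some Scala/JDK combinations).
    (copy: Hashtable[AnyRef, AnyRef]).putAll(source)
    copy
  }

  def main(args: Array[String]): Unit = {
    val parent = new Properties()
    parent.setProperty("test1", "A")
    val child = new Properties(parent)
    child.setProperty("test2", "B")
    val copy = shallowCopy(child)

    println(child.getProperty("test1")) // A    (resolved through defaults)
    println(child.getProperty("test2")) // B
    println(copy.getProperty("test1"))  // null (the default was not copied)
    println(copy.getProperty("test2"))  // B
  }
}
```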
I see, makes sense, thanks for the details. |
Ah OK, interesting. On another note, does the clone make a deep copy? that's necessary I think? |
It serializes the object and then deserializes so I suppose this is a deep copy. For the
val parent = new Properties()
parent.setProperty("test1", "A")
val child = new Properties(parent)
child.put("test1", "C")
child.put("test2", "B")
child.getProperty("test1")
child.remove("test1")
child.getProperty("test1")
will give you
When you copy in the way you suggested, there will be |
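To make the deep-copy point concrete, here is a minimal sketch that uses only the JDK's own serialization instead of `SerializationUtils.clone`. The object name `PropertiesDeepCopy` and the method `deepCopy` are hypothetical, and the sketch is not the code used in the PR. It shows that a serialization round trip keeps both the explicit entries and the defaults, and that removing an overriding entry exposes the default again.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}
import java.util.Properties

// Hypothetical helper for illustration; not part of Spark or commons-lang.
object PropertiesDeepCopy {
  // Round-trips the object through Java serialization. The defaults are
  // held in an ordinary serializable field, so they are copied as well.
  def deepCopy(props: Properties): Properties = {
    val bytes = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(bytes)
    try out.writeObject(props) finally out.close()
    val in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray))
    try in.readObject().asInstanceOf[Properties] finally in.close()
  }

  def main(args: Array[String]): Unit = {
    val parent = new Properties()
    parent.setProperty("test1", "A")
    val child = new Properties(parent)
    child.setProperty("test1", "C")
    child.setProperty("test2", "B")

    val copy = deepCopy(child)
    println(copy.getProperty("test1")) // C (explicit entry wins)
    copy.remove("test1")
    println(copy.getProperty("test1")) // A (the copied default shows through)

    // The copy is detached from the originals.
    parent.setProperty("test1", "Z")
    println(copy.getProperty("test1")) // A
  }
}
```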
@JoshRosen who investigated this a bunch for tests |
I did a little digging and it looks like this ConcurrentModificationException error was fixed previously (8f1098a) but that the fix may have been lost as part of the migration to SparkConf. It looks like #3788 addressed a similar issue for another usage of System.getProperties. For uniformity's sake, what do you think about using Thanks for adding the regression test. The bug in ResetSystemProperties was a silly mistake on my part, which was introduced in https://issues.apache.org/jira/browse/SPARK-1010; thanks for fixing that. If / when you update this PR, mind adding a one- or two-sentence description that will become the commit message? Once we've done that and decided whether to use |
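As an illustration of the kind of race being discussed, the sketch below mutates a `Properties` object from one task while another task repeatedly takes a key snapshot with `stringPropertyNames`. It is not the regression test added in this PR; the object name `ConcurrentSnapshotSketch`, the method `stress`, and the key names are all hypothetical, and it works on a private `Properties` instance so that it does not touch the real system properties.

```scala
import java.util.Properties
import java.util.concurrent.{Callable, Executors, TimeUnit}

// Hypothetical stress sketch for illustration; not the test from the PR.
object ConcurrentSnapshotSketch {
  // Returns the number of snapshots taken. If either task threw,
  // the corresponding Future.get call rethrows the failure.
  def stress(iterations: Int): Int = {
    val props = new Properties()
    val pool = Executors.newFixedThreadPool(2)
    try {
      // Writer: keeps adding and removing keys.
      val writer = pool.submit(new Callable[Integer] {
        override def call(): Integer = {
          var i = 0
          while (i < iterations) {
            props.setProperty("key." + (i % 64), i.toString)
            props.remove("key." + ((i + 32) % 64))
            i += 1
          }
          i
        }
      })
      // Reader: iterates a key snapshot on every round.
      val reader = pool.submit(new Callable[Integer] {
        override def call(): Integer = {
          var snapshots = 0
          while (snapshots < iterations) {
            // The returned set is a separate copy, so iterating it
            // does not race with the writer.
            val names = props.stringPropertyNames().iterator()
            while (names.hasNext) names.next()
            snapshots += 1
          }
          snapshots
        }
      })
      writer.get(60, TimeUnit.SECONDS)
      reader.get(60, TimeUnit.SECONDS).intValue()
    } finally {
      pool.shutdownNow()
    }
  }
}
```

A call such as `ConcurrentSnapshotSketch.stress(10000)` should complete without a `ConcurrentModificationException`, because the reader never iterates the live table.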
|
This sounds reasonable to me; intuitively, it seems like we'd like the map returned from This is pretty subtle, though, so it would be good to add a comment and a mention of the JIRA number so that future readers can understand these subtleties and not unknowingly break things. |
Agreed, I'll make the changes and add the clarifying comments. |
… system properties - explicit + defaults
QA tests have started for PR 4220 at commit
|
QA tests have finished for PR 4220 at commit
|
Test PASSed. |
@JoshRosen is it ok to go now? |
@jacek-lewandowski This looks good to me, so I'll merge it in a few minutes. It looks like your PRs against the other branches have also been updated, so I'll pull those in, too. Thanks! |
…parkConf

SPARK-5425: Fixed usages of system properties

This patch fixes a few problems caused by the fact that the Scala wrapper over system properties is not thread-safe and is basically invalid because it doesn't take into account the default values which could have been set in the properties object. The problem is fixed by modifying the `Utils.getSystemProperties` method so that it uses the `stringPropertyNames` method of the `Properties` class, which is thread-safe (internally it creates a defensive copy in a synchronized method) and returns keys of the properties which were set explicitly and which are defined as defaults.

The other related problem, which is fixed here, was in the `ResetSystemProperties` mix-in. It created a copy of the system properties in the wrong way.

This patch also introduces a test case for thread-safeness of SparkConf creation.

Refer to the discussion in #4220 for more details.

Author: Jacek Lewandowski <lewandowski.jacek@gmail.com>

Closes #4220 from jacek-lewandowski/SPARK-5425-1.1 and squashes the following commits:

6c48a1f [Jacek Lewandowski] SPARK-5425: Modified Utils.getSystemProperties to return a map of all system properties - explicit + defaults
74b4489 [Jacek Lewandowski] SPARK-5425: Use SerializationUtils to save properties in ResetSystemProperties trait
685780e [Jacek Lewandowski] SPARK-5425: Use synchronised methods in system properties to create SparkConf
…parkConf

SPARK-5425: Fixed usages of system properties

This patch fixes a few problems caused by the fact that the Scala wrapper over system properties is not thread-safe and is basically invalid because it doesn't take into account the default values which could have been set in the properties object. The problem is fixed by modifying the `Utils.getSystemProperties` method so that it uses the `stringPropertyNames` method of the `Properties` class, which is thread-safe (internally it creates a defensive copy in a synchronized method) and returns keys of the properties which were set explicitly and which are defined as defaults.

The other related problem, which is fixed here, was in the `ResetSystemProperties` mix-in. It created a copy of the system properties in the wrong way.

This patch also introduces a test case for thread-safeness of SparkConf creation.

Refer to the discussion in #4220 for more details.

Author: Jacek Lewandowski <lewandowski.jacek@gmail.com>

Closes #4222 from jacek-lewandowski/SPARK-5425-1.3 and squashes the following commits:

03da61b [Jacek Lewandowski] SPARK-5425: Modified Utils.getSystemProperties to return a map of all system properties - explicit + defaults
8faf2ea [Jacek Lewandowski] SPARK-5425: Use SerializationUtils to save properties in ResetSystemProperties trait
71aa572 [Jacek Lewandowski] SPARK-5425: Use synchronised methods in system properties to create SparkConf
I've merged this PR into |
…parkConf

SPARK-5425: Fixed usages of system properties

This patch fixes a few problems caused by the fact that the Scala wrapper over system properties is not thread-safe and is basically invalid because it doesn't take into account the default values which could have been set in the properties object. The problem is fixed by modifying the `Utils.getSystemProperties` method so that it uses the `stringPropertyNames` method of the `Properties` class, which is thread-safe (internally it creates a defensive copy in a synchronized method) and returns keys of the properties which were set explicitly and which are defined as defaults.

The other related problem, which is fixed here, was in the `ResetSystemProperties` mix-in. It created a copy of the system properties in the wrong way.

This patch also introduces a test case for thread-safeness of SparkConf creation.

Refer to the discussion in #4220 for more details.

Author: Jacek Lewandowski <lewandowski.jacek@gmail.com>

Closes #4221 from jacek-lewandowski/SPARK-5425-1.2 and squashes the following commits:

87951a2 [Jacek Lewandowski] SPARK-5425: Modified Utils.getSystemProperties to return a map of all system properties - explicit + defaults
01dd5cb [Jacek Lewandowski] SPARK-5425: Use SerializationUtils to save properties in ResetSystemProperties trait
94aeacf [Jacek Lewandowski] SPARK-5425: Use synchronised methods in system properties to create SparkConf
SPARK-5425: Fixed usages of system properties

This patch fixes a few problems caused by the fact that the Scala wrapper over system properties is not thread-safe and is basically invalid because it doesn't take into account the default values which could have been set in the properties object. The problem is fixed by modifying the `Utils.getSystemProperties` method so that it uses the `stringPropertyNames` method of the `Properties` class, which is thread-safe (internally it creates a defensive copy in a synchronized method) and returns the keys of the properties which were set explicitly as well as of those which are defined as defaults.

The other related problem, which is fixed here, was in the `ResetSystemProperties` mix-in. It created a copy of the system properties in the wrong way.

This patch also introduces a test case for the thread safety of SparkConf creation.

Refer to the discussion in #4220 for more details.
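The approach described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual body of `Utils.getSystemProperties`: the object name `SystemPropertiesSnapshot` and the methods `snapshot` and `systemSnapshot` are hypothetical. It builds an immutable Scala map from the key set returned by `stringPropertyNames`, so both explicit entries and defaults are included.

```scala
import java.util.Properties

// Hypothetical stand-in for illustration; the real method lives in Spark's Utils.
object SystemPropertiesSnapshot {
  // Builds an immutable map of every string property, including defaults.
  def snapshot(props: Properties): Map[String, String] = {
    val builder = Map.newBuilder[String, String]
    // stringPropertyNames returns a separate set of keys, so this
    // loop never iterates the live table.
    val names = props.stringPropertyNames().iterator()
    while (names.hasNext) {
      val key = names.next()
      val value = props.getProperty(key)
      // A key may have been removed since the key set was taken.
      if (value != null) builder += (key -> value)
    }
    builder.result()
  }

  def systemSnapshot(): Map[String, String] = snapshot(System.getProperties)
}
```

For example, `SystemPropertiesSnapshot.systemSnapshot().get("java.version")` returns an `Option[String]` read from the snapshot, not from the live system properties.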