REL-505 merge Apache branch-1.1 bug fixes and add new ByteswapPartitioner #27
Commits on Nov 17, 2014
- Revert "[SPARK-4075] [Deploy] Jar url validation is not enough for Jar file"
  This reverts commit 098f83c.
  Andrew Or committed Nov 17, 2014 (commit b528367)
- Revert "[maven-release-plugin] prepare for next development iteration"
  This reverts commit 685bdd2.
  Andrew Or committed Nov 17, 2014 (commit cf8d0ef)
- Revert "[maven-release-plugin] prepare release v1.1.1-rc1"
  This reverts commit 72a4fdb.
  Andrew Or committed Nov 17, 2014 (commit e4f5695)
Commits on Nov 18, 2014
- [SPARK-4467] Partial fix for fetch failure in sort-based shuffle (1.1)
  This is the 1.1 version of apache#3302. There has been some refactoring in master, so we can't cherry-pick that PR.
  Author: Andrew Or <andrew@databricks.com>
  Closes apache#3330 from andrewor14/sort-fetch-fail and squashes the following commits:
  486fc49 [Andrew Or] Reset `elementsRead`
  Andrew Or committed Nov 18, 2014 (commit aa9ebda)
- [SPARK-4393] Fix memory leak in ConnectionManager ACK timeout TimerTasks; use HashedWheelTimer (for branch-1.1)
  This patch fixes a subtle memory leak in ConnectionManager's ACK timeout TimerTasks: in the old code, each TimerTask held a reference to the message being sent, and a cancelled TimerTask isn't necessarily garbage-collected until it is scheduled to run. This caused huge buildups of messages that weren't collected until their timeouts expired, leading to OOMs. The patch addresses this by capturing only the message ID in the TimerTask instead of the whole message, and by keeping a WeakReference to the promise in the TimerTask. It also switches to Netty's HashedWheelTimer, whose performance characteristics should be better for this use case.
  Author: Kousuke Saruta <sarutak@oss.nttdata.co.jp>
  Closes apache#3321 from sarutak/connection-manager-timeout-bugfix and squashes the following commits:
  786af91 [Kousuke Saruta] Fixed memory leak issue of ConnectionManager
  (commit 91b5fa8)
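The fix above is in Scala, but the pattern it describes is general: a pending timeout task should hold only a small identifier plus a weak reference to the payload, so a cancelled-but-not-yet-fired task cannot pin large messages in memory. A minimal Python sketch of that idea (the `Message` and task classes here are illustrative, not Spark's actual types):

```python
import weakref

class Message:
    """Stand-in for a large in-flight message (hypothetical)."""
    def __init__(self, msg_id, payload):
        self.id = msg_id
        self.payload = payload

class FixedTimeoutTask:
    # Capture only the message ID plus a weak reference, so the message
    # can be garbage-collected as soon as the send completes.
    def __init__(self, message):
        self.message_id = message.id
        self._ref = weakref.ref(message)

    def on_timeout(self):
        message = self._ref()      # None if the message was already collected
        if message is not None:
            print(f"timing out message {self.message_id}")

msg = Message(42, bytearray(1024))
task = FixedTimeoutTask(msg)
del msg                            # no strong reference left anywhere
assert task._ref() is None         # the timer task did not keep it alive
```

The leaky variant would store `self.message = message` instead, keeping the whole payload reachable until the timer wheel eventually discards the task.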
Commits on Nov 19, 2014
- [SPARK-4433] fix a racing condition in zipWithIndex
  Spark hangs with the following code:
  ~~~
  sc.parallelize(1 to 10).zipWithIndex.repartition(10).count()
  ~~~
  This is because ZippedWithIndexRDD triggers a job in getPartitions, and it causes a deadlock in DAGScheduler.getPreferredLocs (synced). The fix is to compute `startIndices` during construction. This should be applied to branch-1.0, branch-1.1, and branch-1.2. pwendell
  Author: Xiangrui Meng <meng@databricks.com>
  Closes apache#3291 from mengxr/SPARK-4433 and squashes the following commits:
  c284d9f [Xiangrui Meng] fix a racing condition in zipWithIndex
  (cherry picked from commit bb46046)
  Signed-off-by: Xiangrui Meng <meng@databricks.com>
  (commit ae9b1f6)
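The `startIndices` fix amounts to eagerly computing each partition's global offset as a cumulative sum of partition sizes, instead of lazily triggering a job from inside `getPartitions`. A minimal Python sketch of the computation (function and variable names are illustrative, not Spark's):

```python
from itertools import accumulate

def start_indices(partition_sizes):
    # Global start index of each partition: a cumulative sum of the
    # per-partition element counts, shifted right by one.
    return [0] + list(accumulate(partition_sizes))[:-1]

def zip_with_index(partitions):
    # Offsets are computed up front, "during construction", so indexing a
    # partition never needs to launch extra work that could deadlock.
    starts = start_indices([len(p) for p in partitions])
    return [
        [(elem, start + i) for i, elem in enumerate(part)]
        for part, start in zip(partitions, starts)
    ]

parts = [[10, 11, 12], [20], [30, 31]]
indexed = zip_with_index(parts)
# indexed → [[(10, 0), (11, 1), (12, 2)], [(20, 3)], [(30, 4), (31, 5)]]
```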
- [SPARK-4468][SQL] Backports apache#3334 to branch-1.1
  Author: Cheng Lian <lian@databricks.com>
  Closes apache#3338 from liancheng/spark-3334-for-1.1 and squashes the following commits:
  bd17512 [Cheng Lian] Backports apache#3334 to branch-1.1
  (commit f9739b9)
- [SPARK-4380] Log more precise number of bytes spilled (1.1)
  This is the branch-1.1 version of apache#3243.
  Author: Andrew Or <andrew@databricks.com>
  Closes apache#3355 from andrewor14/spill-log-bytes-1.1 and squashes the following commits:
  36ec152 [Andrew Or] Log more precise representation of bytes in spilling code
  Andrew Or committed Nov 19, 2014 (commit e22a759)
- Merge branch 'branch-1.1' of github.com:apache/spark into csd-1.1
  Conflicts: assembly/pom.xml bagel/pom.xml core/pom.xml examples/pom.xml external/flume-sink/pom.xml external/flume/pom.xml external/kafka/pom.xml external/mqtt/pom.xml external/twitter/pom.xml external/zeromq/pom.xml extras/kinesis-asl/pom.xml extras/spark-ganglia-lgpl/pom.xml graphx/pom.xml mllib/pom.xml pom.xml repl/pom.xml sql/catalyst/pom.xml sql/core/pom.xml sql/hive-thriftserver/pom.xml sql/hive/pom.xml streaming/pom.xml tools/pom.xml yarn/pom.xml yarn/stable/pom.xml
  (commit 1713c7e)
- [SPARK-4480] Avoid many small spills in external data structures (1.1)
  This is the branch-1.1 version of apache#3353. This requires a separate PR because the code in master has been refactored a little to eliminate duplicate code. I have tested this on a standalone cluster. The goal is to merge this into 1.1.1.
  Author: Andrew Or <andrew@databricks.com>
  Closes apache#3354 from andrewor14/avoid-small-spills-1.1 and squashes the following commits:
  f2e552c [Andrew Or] Fix tests
  7012595 [Andrew Or] Avoid many small spills
  Andrew Or committed Nov 19, 2014 (commit 16bf5f3)
- Commit aa3c794
- Commit 3693ae5
- Commit 1df1c1d
Commits on Nov 24, 2014
- Merge tag 'v1.1.1-rc2' of github.com:apache/spark into csd-1.1
  [maven-release-plugin] copy for tag v1.1.1-rc2
  Conflicts: assembly/pom.xml bagel/pom.xml core/pom.xml examples/pom.xml external/flume-sink/pom.xml external/flume/pom.xml external/kafka/pom.xml external/mqtt/pom.xml external/twitter/pom.xml external/zeromq/pom.xml extras/kinesis-asl/pom.xml extras/spark-ganglia-lgpl/pom.xml graphx/pom.xml mllib/pom.xml pom.xml repl/pom.xml sql/catalyst/pom.xml sql/core/pom.xml sql/hive-thriftserver/pom.xml sql/hive/pom.xml streaming/pom.xml tools/pom.xml yarn/pom.xml yarn/stable/pom.xml
  (commit b838cef)
- Commit 1b2b7dd
- Update versions to 1.1.2-SNAPSHOT
  Andrew Or committed Nov 24, 2014 (commit 6371737)
Commits on Nov 25, 2014
- [SPARK-4196][SPARK-4602][Streaming] Fix serialization issue in PairDStreamFunctions.saveAsNewAPIHadoopFiles
  Solves two JIRAs in one shot:
  - Makes the ForeachDStream created by saveAsNewAPIHadoopFiles serializable for checkpoints
  - Makes the default configuration object used by saveAsNewAPIHadoopFiles be Spark's Hadoop configuration
  Author: Tathagata Das <tathagata.das1565@gmail.com>
  Closes apache#3457 from tdas/savefiles-fix and squashes the following commits:
  bb4729a [Tathagata Das] Same treatment for saveAsHadoopFiles
  b382ea9 [Tathagata Das] Fix serialization issue in PairDStreamFunctions.saveAsNewAPIHadoopFiles.
  (cherry picked from commit 8838ad7)
  Signed-off-by: Tathagata Das <tathagata.das1565@gmail.com>
  (commit 7aa592c)
- Commit 1a7f414
Commits on Nov 27, 2014
- [Release] Automate generation of contributors list
  This commit provides a script that computes the contributors list by linking the GitHub commits with JIRA issues. Automatically translating GitHub usernames remains a TODO at this point.
  Andrew Or committed Nov 27, 2014 (commit a59c445)
Commits on Nov 28, 2014
- [BRANCH-1.1][SPARK-4626] Kill a task only if the executorId is (still) registered with the scheduler
  v1.1 backport for apache#3483
  Author: roxchkplusony <roxchkplusony@gmail.com>
  Closes apache#3503 from roxchkplusony/bugfix/4626-1.1 and squashes the following commits:
  234d350 [roxchkplusony] [SPARK-4626] Kill a task only if the executorId is (still) registered with the scheduler
  (commit f8a4fd3)
Commits on Nov 29, 2014
- [SPARK-4597] Use proper exception and reset variable in Utils.createTempDir()
  `File.exists()` and `File.mkdirs()` only throw `SecurityException`, not `IOException`. Also, when an exception is thrown, `dir` should be reset.
  Author: Liang-Chi Hsieh <viirya@gmail.com>
  Closes apache#3449 from viirya/fix_createtempdir and squashes the following commits:
  36cacbd [Liang-Chi Hsieh] Use proper exception and reset variable.
  (cherry picked from commit 49fe879)
  Signed-off-by: Josh Rosen <joshrosen@databricks.com>
  (commit 24b5c03)
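The "reset the variable on failure" part of the patch matters because the retry loop must never return a path left over from a failed attempt. A small Python sketch of the same retry shape (the helper name and naming scheme are illustrative, not Spark's Scala implementation):

```python
import os
import random
import tempfile

def create_temp_dir(root, max_attempts=10):
    """Try up to max_attempts times to create a unique directory under root."""
    attempt_dir = None
    for _ in range(max_attempts):
        try:
            candidate = os.path.join(root, f"spark-{random.getrandbits(32):08x}")
            os.makedirs(candidate)
            attempt_dir = candidate
            break
        except OSError:
            attempt_dir = None  # reset on failure, as the patch does for `dir`
    if attempt_dir is None:
        raise IOError(f"Failed to create a temp directory under {root}")
    return attempt_dir

root = tempfile.mkdtemp()
d = create_temp_dir(root)
```

Without the reset, a partially assigned path from a failed iteration could leak out of the loop as if it were a usable directory.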
Commits on Nov 30, 2014
- SPARK-2143 [WEB UI] Add Spark version to UI footer
  This PR adds the Spark version number to the UI footer; this is how it looks: ![screen shot 2014-11-21 at 22 58 40](https://cloud.githubusercontent.com/assets/822522/5157738/f4822094-7316-11e4-98f1-333a535fdcfa.png)
  Author: Sean Owen <sowen@cloudera.com>
  Closes apache#3410 from srowen/SPARK-2143 and squashes the following commits:
  e9b3a7a [Sean Owen] Add Spark version to footer
  (commit 1a2508b)
- [HOTFIX] Fix build break in 1a2508b
  org.apache.spark.SPARK_VERSION is new in 1.2; in earlier versions, we have to use SparkContext.SPARK_VERSION.
  (commit 90d90b2)
Commits on Dec 1, 2014
- [DOC] Fixes formatting typo in SQL programming guide
  Author: Cheng Lian <lian@databricks.com>
  Closes apache#3498 from liancheng/fix-sql-doc-typo and squashes the following commits:
  865ecd7 [Cheng Lian] Fixes formatting typo in SQL programming guide
  (cherry picked from commit 2a4d389)
  Signed-off-by: Josh Rosen <joshrosen@databricks.com>
  (commit 91eadd2)
Commits on Dec 2, 2014
- [SPARK-4686] Link to allowed master URLs is broken
  The link points to the old Scala programming guide; it should point to the submitting applications page. This should be backported to 1.1.2 (it's been broken as of 1.0).
  Author: Kay Ousterhout <kayousterhout@gmail.com>
  Closes apache#3542 from kayousterhout/SPARK-4686 and squashes the following commits:
  a8fc43b [Kay Ousterhout] [SPARK-4686] Link to allowed master URLs is broken
  (cherry picked from commit d9a148b)
  Signed-off-by: Kay Ousterhout <kayousterhout@gmail.com>
  (commit f333e4f)
Commits on Dec 3, 2014
- [Release] Translate unknown author names automatically
  Andrew Or committed Dec 3, 2014 (commit aec20af)
- [SPARK-4701] Typo in sbt/sbt
  Modified typo.
  Author: Masayoshi TSUZUKI <tsudukim@oss.nttdata.co.jp>
  Closes apache#3560 from tsudukim/feature/SPARK-4701 and squashes the following commits:
  ed2a3f1 [Masayoshi TSUZUKI] Another whitespace position error.
  1af3a35 [Masayoshi TSUZUKI] [SPARK-4701] Typo in sbt/sbt
  (cherry picked from commit 96786e3)
  Signed-off-by: Andrew Or <andrew@databricks.com>
  (commit e484b8a)
- [SPARK-4715][Core] Make sure tryToAcquire won't return a negative value
  ShuffleMemoryManager.tryToAcquire may return a negative value. The unit test demonstrates this bug; it outputs `0 did not equal -200 granted is negative`.
  Author: zsxwing <zsxwing@gmail.com>
  Closes apache#3575 from zsxwing/SPARK-4715 and squashes the following commits:
  a193ae6 [zsxwing] Make sure tryToAcquire won't return a negative value
  (commit af76954)
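The bug class here is a grant computed as `fair_share - in_use` going negative when a task is already over its share. A heavily simplified Python model of the clamp (this is not ShuffleMemoryManager's actual accounting, just the shape of the fix):

```python
def try_to_acquire(requested, in_use, max_memory, num_tasks):
    # Each task may take up to 1/num_tasks of the pool. The available
    # amount is clamped at zero so an over-committed task is granted 0
    # rather than a negative number of bytes.
    fair_share = max_memory // num_tasks
    available = max(0, fair_share - in_use)
    return min(requested, available)

# A task already over its fair share must be granted 0, not -200.
grant = try_to_acquire(requested=100, in_use=700, max_memory=1000, num_tasks=2)
# grant → 0
```

Without the `max(0, ...)` clamp, the same call would return `min(100, -200) = -200`, which downstream code would treat as memory to release.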
- [SPARK-4642] Add description about spark.yarn.queue to running-on-YARN document
  Added a description of `spark.yarn.queue`, and corrected the documented default value of `spark.yarn.submit.file.replication`.
  Author: Masayoshi TSUZUKI <tsudukim@oss.nttdata.co.jp>
  Closes apache#3500 from tsudukim/feature/SPARK-4642 and squashes the following commits:
  ce99655 [Masayoshi TSUZUKI] better gramatically.
  21cf624 [Masayoshi TSUZUKI] Removed intentionally undocumented properties.
  88cac9b [Masayoshi TSUZUKI] [SPARK-4642] Documents about running-on-YARN needs update
  (commit 3e3cd5a)
- [SPARK-4498][core] Don't transition ExecutorInfo to RUNNING until Driver adds Executor
  The ExecutorInfo only reaches the RUNNING state if the Driver is alive to send the ExecutorStateChanged message to the master. Otherwise, appInfo.resetRetryCount() is never called, and failing Executors will eventually exceed ApplicationState.MAX_NUM_RETRY, resulting in the application being removed from the master's accounting.
  Author: Mark Hamstra <markhamstra@gmail.com>
  Closes apache#3550 from markhamstra/SPARK-4498 and squashes the following commits:
  8f543b1 [Mark Hamstra] Don't transition ExecutorInfo to RUNNING until Executor is added by Driver
  (commit 17dfd41)
Commits on Dec 4, 2014
- [Release] Correctly translate contributors name in release notes
  This commit involves three main changes:
  (1) It separates the translation of contributor names from the generation of the contributors list. This is largely motivated by the GitHub API limit; even if we exceed this limit, we should at least be able to proceed manually as before. This is why the translation logic is abstracted into its own script, translate-contributors.py.
  (2) When we look for candidate replacements for invalid author names, we should look at the assignees of the associated JIRAs too. As a result, the intermediate file must keep track of these.
  (3) This provides an interactive mode with which the user can sit at the terminal and manually pick the candidate replacement that he/she thinks makes the most sense. As before, there is a non-interactive mode that picks the first candidate the script considers "valid."
  TODO: We should have a known_contributors file that stores known mappings so we don't have to go through all of this translation every time. This is also valuable because some contributors simply cannot be automatically translated.
  Conflicts: .gitignore
  Andrew Or committed Dec 4, 2014 (commit 6c53225)
- [SPARK-4253] Ignore spark.driver.host in yarn-cluster and standalone-cluster modes
  In yarn-cluster and standalone-cluster modes, we don't know where the driver will run until it is launched. If the `spark.driver.host` property is set on the submitting machine and propagated to the driver through SparkConf, this leads to errors when the driver launches. This patch fixes the issue by dropping the `spark.driver.host` property in SparkSubmit when running in a cluster deploy mode.
  Author: WangTaoTheTonic <barneystinson@aliyun.com>
  Author: WangTao <barneystinson@aliyun.com>
  Closes apache#3112 from WangTaoTheTonic/SPARK4253 and squashes the following commits:
  ed1a25c [WangTaoTheTonic] revert unrelated formatting issue
  02c4e49 [WangTao] add comment
  32a3f3f [WangTaoTheTonic] ingore it in SparkSubmit instead of SparkContext
  667cf24 [WangTaoTheTonic] document fix
  ff8d5f7 [WangTaoTheTonic] also ignore it in standalone cluster mode
  2286e6b [WangTao] ignore spark.driver.host in yarn-cluster mode
  (cherry picked from commit 8106b1e)
  Signed-off-by: Josh Rosen <joshrosen@databricks.com>
  Conflicts: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala
  (commit 5ac55c8)
- [SPARK-4745] Fix get_existing_cluster() function with multiple security groups
  The current get_existing_cluster() function would only find an instance belonging to a cluster if the instance's security groups == cluster_name + "-master" (or "-slaves"). This fix allows for multiple security groups by checking whether the cluster_name + "-master" security group is in the list of groups for a particular instance.
  Author: alexdebrie <alexdebrie1@gmail.com>
  Closes apache#3596 from alexdebrie/master and squashes the following commits:
  9d51232 [alexdebrie] Fix get_existing_cluster() function with multiple security groups
  (cherry picked from commit 794f3ae)
  Signed-off-by: Josh Rosen <joshrosen@databricks.com>
  (commit d01fdd3)
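The change described above is an equality-vs-membership test. A minimal Python sketch of the fixed lookup (the instance dicts and field names here are illustrative stand-ins for the EC2 API objects spark-ec2 actually uses):

```python
def cluster_instances(instances, cluster_name):
    # Old behavior: an instance matched only if its group list was exactly
    # [cluster_name + "-master"]; instances with extra security groups
    # attached were silently skipped. The fix is a membership test.
    masters = [i for i in instances if cluster_name + "-master" in i["groups"]]
    slaves = [i for i in instances if cluster_name + "-slaves" in i["groups"]]
    return masters, slaves

instances = [
    {"id": "i-1", "groups": ["prod-master", "extra-sg"]},  # extra group: still found
    {"id": "i-2", "groups": ["prod-slaves"]},
    {"id": "i-3", "groups": ["other-master"]},             # different cluster
]
masters, slaves = cluster_instances(instances, "prod")
# masters → i-1, slaves → i-2
```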
- [SPARK-4459] Change groupBy type parameter from K to U
  Please see https://issues.apache.org/jira/browse/SPARK-4459
  Author: Saldanha <saldaal1@phusca-l24858.wlan.na.novartis.net>
  Closes apache#3327 from alokito/master and squashes the following commits:
  54b1095 [Saldanha] [SPARK-4459] changed type parameter for keyBy from K to U
  d5f73c3 [Saldanha] [SPARK-4459] added keyBy test
  316ad77 [Saldanha] SPARK-4459 changed type parameter for groupBy from K to U.
  62ddd4b [Saldanha] SPARK-4459 added failing unit test
  (cherry picked from commit 743a889)
  Signed-off-by: Josh Rosen <joshrosen@databricks.com>
  (commit e98aa54)
- [SPARK-4652][DOCS] Add docs about spark-git-repo option
  There are cases when a work-in-progress Spark version needs to be run on an EC2 cluster. To make setting up this type of cluster easier, add a description of the --spark-git-repo option to the EC2 documentation.
  Author: lewuathe <lewuathe@me.com>
  Author: Josh Rosen <joshrosen@databricks.com>
  Closes apache#3513 from Lewuathe/doc-for-development-spark-cluster and squashes the following commits:
  6dae8ee [lewuathe] Wrap consistent with other descriptions
  cfaf9be [lewuathe] Add docs about spark-git-repo option
  (Editing / cleanup by Josh Rosen)
  (cherry picked from commit ab8177d)
  Signed-off-by: Josh Rosen <joshrosen@databricks.com>
  (commit bf637e0)
Commits on Dec 5, 2014
- [SPARK-4421] Wrong link in spark-standalone.html
  Modified the link to building Spark. (Backport version of apache#3279.)
  Author: Masayoshi TSUZUKI <tsudukim@oss.nttdata.co.jp>
  Closes apache#3280 from tsudukim/feature/SPARK-4421-2 and squashes the following commits:
  3b4d38d [Masayoshi TSUZUKI] [SPARK-4421] Wrong link in spark-standalone.html
  (commit b09382a)
- Fix typo in Spark SQL docs.
  Author: Andy Konwinski <andykonwinski@gmail.com>
  Closes apache#3611 from andyk/patch-3 and squashes the following commits:
  7bab333 [Andy Konwinski] Fix typo in Spark SQL docs.
  (cherry picked from commit 15cf3b0)
  Signed-off-by: Josh Rosen <joshrosen@databricks.com>
  (commit 8ee2d18)
- Merge branch 'branch-1.1' of github.com:apache/spark into csd-1.1
  Conflicts: assembly/pom.xml bagel/pom.xml core/pom.xml docs/_config.yml examples/pom.xml external/flume-sink/pom.xml external/flume/pom.xml external/kafka/pom.xml external/mqtt/pom.xml external/twitter/pom.xml external/zeromq/pom.xml extras/kinesis-asl/pom.xml extras/spark-ganglia-lgpl/pom.xml graphx/pom.xml mllib/pom.xml pom.xml repl/pom.xml sql/catalyst/pom.xml sql/core/pom.xml sql/hive-thriftserver/pom.xml sql/hive/pom.xml streaming/pom.xml tools/pom.xml yarn/alpha/pom.xml yarn/pom.xml yarn/stable/pom.xml
  (commit a290486)
Commits on Dec 8, 2014
- [SPARK-4764] Ensure that files are fetched atomically
  tempFile is created in the same directory as targetFile, so that the move from tempFile to targetFile is always atomic.
  Author: Christophe Préaud <christophe.preaud@kelkoo.com>
  Closes apache#2855 from preaudc/master and squashes the following commits:
  9ba89ca [Christophe Préaud] Ensure that files are fetched atomically
  54419ae [Christophe Préaud] Merge remote-tracking branch 'upstream/master'
  c6a5590 [Christophe Préaud] Revert commit 8ea871f
  7456a33 [Christophe Préaud] Merge remote-tracking branch 'upstream/master'
  8ea871f [Christophe Préaud] Ensure that files are fetched atomically
  (cherry picked from commit ab2abcb)
  Signed-off-by: Josh Rosen <rosenville@gmail.com>
  Conflicts: core/src/main/scala/org/apache/spark/util/Utils.scala
  (commit 16bc77b)
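The same-directory detail is the whole point: a rename within one filesystem is atomic, while a move across filesystems (e.g. from `/tmp`) degrades to copy-then-delete and can expose a half-written file. A Python sketch of the pattern (the function is illustrative, not Spark's `Utils.fetchFile`):

```python
import os
import tempfile

def fetch_atomically(data: bytes, target_path: str):
    # Write to a temp file in the *same directory* as the target, so the
    # final rename cannot cross a filesystem boundary and is atomic.
    target_dir = os.path.dirname(target_path)
    fd, tmp_path = tempfile.mkstemp(dir=target_dir)
    with os.fdopen(fd, "wb") as f:
        f.write(data)
    os.replace(tmp_path, target_path)  # atomic rename on POSIX

d = tempfile.mkdtemp()
target = os.path.join(d, "block.bin")
fetch_atomically(b"payload", target)
```

Readers of `target` see either the old contents or the complete new contents, never a partial write.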
Commits on Dec 9, 2014
- SPARK-3926 [CORE] Reopened: result of JavaRDD collectAsMap() is not serializable
  My original 'fix' didn't fix at all. Now, there's a unit test to check whether it works. Of the two options to really fix it -- copy the `Map` to a `java.util.HashMap`, or copy and modify Scala's implementation in `Wrappers.MapWrapper` -- I went with the latter.
  Author: Sean Owen <sowen@cloudera.com>
  Closes apache#3587 from srowen/SPARK-3926 and squashes the following commits:
  8586bb9 [Sean Owen] Remove unneeded no-arg constructor, and add additional note about copied code in LICENSE
  7bb0e66 [Sean Owen] Make SerializableMapWrapper actually serialize, and add unit test
  (cherry picked from commit e829bfa)
  Signed-off-by: Josh Rosen <joshrosen@databricks.com>
  (commit fe7d7a9)
- Commit 7bf3aa3
- [SPARK-4714] BlockManager.dropFromMemory() should check whether block has been removed after synchronizing on BlockInfo instance
  After synchronizing on the `info` lock in the `removeBlock`/`dropOldBlocks`/`dropFromMemory` methods in BlockManager, the block that `info` represents may have already been removed. The three methods share the same logic for taking the `info` lock:
  ```
  info = blockInfo.get(id)
  if (info != null) {
    info.synchronized {
      // do something
    }
  }
  ```
  So there is a chance that by the time a thread enters the `info.synchronized` block, `info` has already been removed from the `blockInfo` map by some other thread that entered `info.synchronized` first. The `removeBlock` and `dropOldBlocks` methods are idempotent, so it's safe for them to run on blocks that have already been removed. But in `dropFromMemory` this may be problematic, since it may drop block data that has already been removed into the disk store, and this calls data store operations that are not designed to handle missing blocks. This patch fixes the issue by adding a check to `dropFromMemory` for whether the block has been removed by a racing thread.
  Author: hushan[胡珊] <hushan@xiaomi.com>
  Closes apache#3574 from suyanNone/refine-block-concurrency and squashes the following commits:
  edb989d [hushan[胡珊]] Refine code style and comments position
  55fa4ba [hushan[胡珊]] refine code
  e57e270 [hushan[胡珊]] add check info is already remove or not while having gotten info.syn
  (cherry picked from commit 30dca92)
  Signed-off-by: Josh Rosen <joshrosen@databricks.com>
  (commit 9b99237)
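The fix is the classic "re-check after acquiring the lock" pattern: the lookup and the lock acquisition are not one atomic step, so the state must be validated again once the lock is held. A minimal Python sketch (the map and lock layout are illustrative, not BlockManager's actual data structures):

```python
import threading

# A registry of blocks, each guarded by its own per-block lock.
block_info = {"b1": {"lock": threading.Lock()}}

def drop_from_memory(block_id, dropped):
    info = block_info.get(block_id)
    if info is None:
        return
    with info["lock"]:
        # Re-check under the lock: a racing thread may have removed the
        # block between the lookup above and acquiring the lock.
        if block_info.get(block_id) is not info:
            return
        del block_info[block_id]
        dropped.append(block_id)

dropped = []
drop_from_memory("b1", dropped)
drop_from_memory("b1", dropped)  # block already gone: safely a no-op
```

Without the inner re-check, the second caller would operate on a block that no longer exists in the map, which is exactly the condition that broke `dropFromMemory`.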
Commits on Dec 10, 2014
- [SPARK-4772] Clear local copies of accumulators as soon as we're done with them
  Accumulators keep thread-local copies of themselves. These copies were only cleared at the beginning of a task. This meant that (a) the memory they used was tied up until the next task ran on that thread, and (b) if a thread died, the memory it had used for accumulators was locked up forever on that worker. This PR clears the thread-local copies of accumulators at the end of each task, in the task's finally block, to make sure they are cleaned up between tasks. It also stores them in a ThreadLocal object, so that if, for some reason, the thread dies, any memory they are using at the time should be freed up.
  Author: Nathan Kronenfeld <nkronenfeld@oculusinfo.com>
  Closes apache#3570 from nkronenfeld/Accumulator-Improvements and squashes the following commits:
  a581f3f [Nathan Kronenfeld] Change Accumulators to private[spark] instead of adding mima exclude to get around false positive in mima tests
  b6c2180 [Nathan Kronenfeld] Include MiMa exclude as per build error instructions - this version incompatibility should be irrelevent, as it will only surface if a master is talking to a worker running a different version of spark.
  537baad [Nathan Kronenfeld] Fuller refactoring as intended, incorporating JR's suggestions for ThreadLocal localAccums, and keeping clear(), but also calling it in tasks' finally block, rather than just at the beginning of the task.
  39a82f2 [Nathan Kronenfeld] Clear local copies of accumulators as soon as we're done with them
  (cherry picked from commit 94b377f)
  Signed-off-by: Josh Rosen <joshrosen@databricks.com>
  Conflicts: core/src/main/scala/org/apache/spark/Accumulators.scala core/src/main/scala/org/apache/spark/executor/Executor.scala
  (commit 6dcafa7)
- [SPARK-4771][Docs] Document standalone cluster supervise mode
  tdas looks like streaming already refers to the supervise mode. The link from there is broken though.
  Author: Andrew Or <andrew@databricks.com>
  Closes apache#3627 from andrewor14/document-supervise and squashes the following commits:
  9ca0908 [Andrew Or] Wording changes
  2b55ed2 [Andrew Or] Document standalone cluster supervise mode
  Andrew Or committed Dec 10, 2014 (commit 273f2c8)
- [SPARK-4759] Fix driver hanging from coalescing partitions
  The driver sometimes hangs when we coalesce RDD partitions; see the JIRA for details and a reproduction. This is because our use of the empty string as the default preferred location in `CoalescedRDDPartition` causes the `TaskSetManager` to schedule the corresponding task on host `""` (empty string). The intended semantics, however, are that the partition has no preferred location, and the TSM should schedule the corresponding task accordingly.
  Author: Andrew Or <andrew@databricks.com>
  Closes apache#3633 from andrewor14/coalesce-preferred-loc and squashes the following commits:
  e520d6b [Andrew Or] Oops
  3ebf8bd [Andrew Or] A few comments
  f370a4e [Andrew Or] Fix tests
  2f7dfb6 [Andrew Or] Avoid using empty string as default preferred location
  (cherry picked from commit 4f93d0c)
  Signed-off-by: Andrew Or <andrew@databricks.com>
  Andrew Or committed Dec 10, 2014 (commit 396de67)
Commits on Dec 14, 2014
- fixed spelling errors in documentation
  Changed "form" to "from" in 3 documentation entries for Kafka integration.
  Author: Peter Klipfel <peter@klipfel.me>
  Closes apache#3691 from peterklipfel/master and squashes the following commits:
  0fe7fc5 [Peter Klipfel] fixed spelling errors in documentation
  (cherry picked from commit 2a2983f)
  Signed-off-by: Josh Rosen <joshrosen@databricks.com>
  (commit 0faea17)
Commits on Dec 16, 2014
- SPARK-785 [CORE] ClosureCleaner not invoked on most PairRDDFunctions
  This looked like perhaps a simple and important one. `combineByKey` looks like it should clean its arguments' closures, and that in turn covers apparently all remaining functions in `PairRDDFunctions` which delegate to it.
  Author: Sean Owen <sowen@cloudera.com>
  Closes apache#3690 from srowen/SPARK-785 and squashes the following commits:
  8df68fe [Sean Owen] Clean context of most remaining functions in PairRDDFunctions, which ultimately call combineByKey
  (cherry picked from commit 2a28bc6)
  Signed-off-by: Josh Rosen <joshrosen@databricks.com>
  (commit fa3b3e3)
- SPARK-4814 [CORE] Enable assertions in SBT, Maven tests / AssertionError from Hive's LazyBinaryInteger
  This enables assertions for the Maven and SBT builds, but overrides the Hive module to not enable assertions.
  Author: Sean Owen <sowen@cloudera.com>
  Closes apache#3692 from srowen/SPARK-4814 and squashes the following commits:
  caca704 [Sean Owen] Disable assertions just for Hive
  f71e783 [Sean Owen] Enable assertions for SBT and Maven build
  (cherry picked from commit 81112e4)
  Signed-off-by: Josh Rosen <joshrosen@databricks.com>
  Conflicts: pom.xml
  (commit 892685b)
Commits on Dec 17, 2014
-
[Release] Major improvements to generate contributors script
This commit introduces several major improvements to the script that generates the contributors list for release notes, notably:

(1) Use release tags instead of a range of commits. Across branches, commits are not actually strictly two-dimensional, so it is not sufficient to specify a start hash and an end hash; otherwise, we end up counting commits that were already merged in an older branch.

(2) Match PR numbers in addition to commit hashes. This is related to the first point: if a PR is already merged in an older minor release tag, it should be filtered out here. This requires some intelligent regex parsing of the commit description in addition to relying on the GitHub API.

(3) Relax the author validity check. The old code failed on a name with many middle names; the test was just too strict.

(4) Use GitHub authentication. This allows us to make far more requests through the GitHub API than before (5000 per hour as opposed to 60).

(5) Translate from the GitHub username, not the commit author name. This is important because the commit author name is not always configured correctly by the user. For instance, the username "falaki" used to resolve to just "Hossein", which was treated as a GitHub username and translated to something else that is completely arbitrary.

(6) Add an option to use the untranslated name. If there is no satisfactory candidate to replace the untranslated name with, at least allow the user to not translate it.
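Point (2) above, matching PR numbers inside commit descriptions, comes down to a small regex over the message text. The sketch below is a hypothetical Java illustration of that parsing step only; it is not the release script's actual code, and the class and method names are invented for the example:

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PrNumberExtractor {
    // Merged commits typically reference their PR as "Closes #1234" or
    // "apache#1234"; both forms contain "#<digits>", which is all we match.
    private static final Pattern PR_REF = Pattern.compile("#(\\d+)");

    /** Returns the first PR number referenced in a commit message, if any. */
    public static Optional<Integer> prNumber(String commitMessage) {
        Matcher m = PR_REF.matcher(commitMessage);
        return m.find() ? Optional.of(Integer.parseInt(m.group(1)))
                        : Optional.empty();
    }
}
```

With a set of PR numbers already attributed to an older release tag, filtering is then a simple membership check on the extracted number.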
Andrew Or committed Dec 17, 2014
Commit: 581f866
-
[Release] Cache known author translations locally
This bypasses unnecessary calls to the Github and JIRA API. Additionally, having a local cache allows us to remember names that we had to manually discover ourselves.
Andrew Or committed Dec 17, 2014
Commit: 991748d
-
[Release] Update contributors list format and sort it
Additionally, we now warn the user when a duplicate author name arises, in which case he/she needs to resolve it manually. Conflicts: .rat-excludes
Andrew Or committed Dec 17, 2014
Commit: 0efd691
-
[HOTFIX] Fix RAT exclusion for known_translations file
Author: Josh Rosen <joshrosen@databricks.com> Closes apache#3719 from JoshRosen/rat-fix and squashes the following commits: 1542886 [Josh Rosen] [HOTFIX] Fix RAT exclusion for known_translations file (cherry picked from commit 3d0c37b) Signed-off-by: Josh Rosen <joshrosen@databricks.com>
Commit: c15e7f2
Commits on Dec 18, 2014
-
Commit: bed4807
Commits on Dec 19, 2014
-
[SPARK-4884]: Improve Partition docs
Rewording was based on this discussion: http://apache-spark-developers-list.1001551.n3.nabble.com/RDD-data-flow-td9804.html This is the associated JIRA ticket: https://issues.apache.org/jira/browse/SPARK-4884 Author: Madhu Siddalingaiah <madhu@madhu.com> Closes apache#3722 from msiddalingaiah/master and squashes the following commits: 79e679f [Madhu Siddalingaiah] [DOC]: improve documentation 51d14b9 [Madhu Siddalingaiah] Merge remote-tracking branch 'upstream/master' 38faca4 [Madhu Siddalingaiah] Merge remote-tracking branch 'upstream/master' cbccbfe [Madhu Siddalingaiah] Documentation: replace <b> with <code> (again) 332f7a2 [Madhu Siddalingaiah] Documentation: replace <b> with <code> cd2b05a [Madhu Siddalingaiah] Merge remote-tracking branch 'upstream/master' 0fc12d7 [Madhu Siddalingaiah] Documentation: add description for repartitionAndSortWithinPartitions (cherry picked from commit d5a596d) Signed-off-by: Josh Rosen <joshrosen@databricks.com>
Commit: f4e6ffc
-
SPARK-3428. TaskMetrics for running tasks is missing GC time metrics
Author: Sandy Ryza <sandy@cloudera.com> Closes apache#3684 from sryza/sandy-spark-3428 and squashes the following commits: cb827fe [Sandy Ryza] SPARK-3428. TaskMetrics for running tasks is missing GC time metrics (cherry picked from commit 283263f) Signed-off-by: Josh Rosen <joshrosen@databricks.com>
Commit: 2d66463
-
[SPARK-4896] don’t redundantly overwrite executor JAR deps
Author: Ryan Williams <ryan.blake.williams@gmail.com> Closes apache#2848 from ryan-williams/fetch-file and squashes the following commits: c14daff [Ryan Williams] Fix copy that was changed to a move inadvertently 8e39c16 [Ryan Williams] code review feedback 788ed41 [Ryan Williams] don’t redundantly overwrite executor JAR deps (cherry picked from commit 7981f96) Signed-off-by: Josh Rosen <joshrosen@databricks.com> Conflicts: core/src/main/scala/org/apache/spark/util/Utils.scala
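The core idea of the fix above is to skip re-fetching a dependency when an identical copy is already on disk. The following is a minimal Java sketch of that check; it is illustrative only, with invented names, and Spark's actual logic in `Utils.fetchFile` differs in detail:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Arrays;

public class CachedFetch {
    /**
     * Copies src to dest only when dest is missing or its contents differ,
     * returning true when a copy actually happened. This mirrors the idea
     * of not redundantly overwriting an executor's JAR dependencies.
     */
    public static boolean copyIfChanged(Path src, Path dest) throws IOException {
        if (Files.exists(dest)
                && Files.size(dest) == Files.size(src)
                && Arrays.equals(Files.readAllBytes(dest), Files.readAllBytes(src))) {
            return false; // identical copy already in place; skip the overwrite
        }
        Files.copy(src, dest, StandardCopyOption.REPLACE_EXISTING);
        return true;
    }
}
```

For large jars a real implementation would compare checksums instead of full byte arrays, but the shape of the guard is the same.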
Commit: 546a239
Commits on Dec 20, 2014
-
SPARK-2641: Passing num executors to spark arguments from properties …
…file Since we can set spark executor memory and executor cores using a properties file, we should also be allowed to set the executor instances. Author: Kanwaljit Singh <kanwaljit.singh@guavus.com> Closes apache#1657 from kjsingh/branch-1.0 and squashes the following commits: d8a5a12 [Kanwaljit Singh] SPARK-2641: Fixing how spark arguments are loaded from properties file for num executors Conflicts: core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala
Kanwaljit Singh authored and Andrew Or committed Dec 20, 2014
Commit: 3597c2e
-
[Minor] Build Failed: value defaultProperties not found
Maven build failed: value defaultProperties not found. Maybe related to this PR: apache@1d64812. andrewor14, can you look at this problem? Author: huangzhaowei <carlmartinmax@gmail.com> Closes apache#3749 from SaintBacchus/Mvn-Build-Fail and squashes the following commits: 8e2917c [huangzhaowei] Build Failed: value defaultProperties not found (cherry picked from commit a764960) Signed-off-by: Josh Rosen <joshrosen@databricks.com>
Commit: e5f2752
Commits on Dec 22, 2014
-
[SPARK-4818][Core] Add 'iterator' to reduce memory consumed by join
In Scala, `map` and `flatMap` on an `Iterable` copy the contents of the `Iterable` into a new `Seq`. For example,

```scala
val iterable = Seq(1, 2, 3).map(v => {
  println(v)
  v
})
println("Iterable map done")

val iterator = Seq(1, 2, 3).iterator.map(v => {
  println(v)
  v
})
println("Iterator map done")
```

outputs

```
1
2
3
Iterable map done
Iterator map done
```

(the iterator version prints nothing up front because its `map` is lazy). So we should use `iterator` to reduce the memory consumed by join. Found by Johannes Simon in http://mail-archives.apache.org/mod_mbox/spark-user/201412.mbox/%3C5BE70814-9D03-4F61-AE2C-0D63F2DE4446%40mail.de%3E Author: zsxwing <zsxwing@gmail.com> Closes apache#3671 from zsxwing/SPARK-4824 and squashes the following commits: 48ee7b9 [zsxwing] Remove the explicit types 95d59d6 [zsxwing] Add 'iterator' to reduce memory consumed by join (cherry picked from commit c233ab3) Signed-off-by: Josh Rosen <joshrosen@databricks.com> Conflicts: core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala
Commit: 3bce43f
Commits on Dec 23, 2014
-
[SPARK-4802] [streaming] Remove receiverInfo once receiver is de-regi…
…stered Once the streaming receiver is de-registered at the executor, the `ReceiverTrackerActor` needs to remove the corresponding receiverInfo entry from the `receiverInfo` map at `ReceiverTracker`. Author: Ilayaperumal Gopinathan <igopinathan@pivotal.io> Closes apache#3647 from ilayaperumalg/receiverInfo-RTracker and squashes the following commits: 6eb97d5 [Ilayaperumal Gopinathan] Polishing based on the review 3640c86 [Ilayaperumal Gopinathan] Remove receiverInfo once receiver is de-registered (cherry picked from commit 10d69e9) Signed-off-by: Tathagata Das <tathagata.das1565@gmail.com> Conflicts: streaming/src/main/scala/org/apache/spark/streaming/scheduler/ReceiverTracker.scala
Commit: b1de461
Commits on Dec 24, 2014
-
[SPARK-4606] Send EOF to child JVM when there's no more data to read.
Author: Marcelo Vanzin <vanzin@cloudera.com> Closes apache#3460 from vanzin/SPARK-4606 and squashes the following commits: 031207d [Marcelo Vanzin] [SPARK-4606] Send EOF to child JVM when there's no more data to read. (cherry picked from commit 7e2deb7) Signed-off-by: Josh Rosen <joshrosen@databricks.com>
Commit: dd0287c
Commits on Dec 26, 2014
-
[SPARK-4537][Streaming] Expand StreamingSource to add more metrics
Add `processingDelay`, `schedulingDelay` and `totalDelay` for the last completed batch. Add `lastReceivedBatchRecords` and `totalReceivedBatchRecords` to the received records counting. Author: jerryshao <saisai.shao@intel.com> Closes apache#3466 from jerryshao/SPARK-4537 and squashes the following commits: 00f5f7f [jerryshao] Change the code style and add totalProcessedRecords 44721a6 [jerryshao] Further address the comments c097ddc [jerryshao] Address the comments 02dd44f [jerryshao] Fix the addressed comments c7a9376 [jerryshao] Expand StreamingSource to add more metrics (cherry picked from commit f205fe4) Signed-off-by: Tathagata Das <tathagata.das1565@gmail.com>
Commit: d21347d
Commits on Dec 27, 2014
-
[SPARK-4952][Core]Handle ConcurrentModificationExceptions in SparkEnv…
….environmentDetails Author: GuoQiang Li <witgo@qq.com> Closes apache#3788 from witgo/SPARK-4952 and squashes the following commits: d903529 [GuoQiang Li] Handle ConcurrentModificationExceptions in SparkEnv.environmentDetails (cherry picked from commit 080ceb7) Signed-off-by: Patrick Wendell <pwendell@gmail.com>
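A common remedy for `ConcurrentModificationException` when reporting environment details is to iterate over a private clone of the system properties rather than the live, concurrently-mutated `Hashtable`. The sketch below illustrates that remedy in Java; it is an illustration of the failure class, not necessarily the exact change made in this patch:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

public class SafeEnvSnapshot {
    /**
     * Iterating System.getProperties() directly can throw
     * ConcurrentModificationException if another thread sets a property
     * mid-iteration. Properties.clone() is synchronized on the underlying
     * Hashtable, so iterating the clone is safe.
     */
    public static Map<String, String> snapshotProperties() {
        Properties copy = (Properties) System.getProperties().clone();
        Map<String, String> result = new HashMap<>();
        for (String name : copy.stringPropertyNames()) {
            result.put(name, copy.getProperty(name));
        }
        return result;
    }
}
```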
Commit: 3442b7b
Commits on Dec 30, 2014
-
[HOTFIX] Add SPARK_VERSION to Spark package object.
This helps to avoid build breaks when backporting patches that use org.apache.spark.SPARK_VERSION.
Commit: d5e0a45
-
[SPARK-4882] Register PythonBroadcast with Kryo so that PySpark works…
… with KryoSerializer This PR fixes an issue where PySpark broadcast variables caused NullPointerExceptions if KryoSerializer was used. The fix is to register PythonBroadcast with Kryo so that it's deserialized with a KryoJavaSerializer. Author: Josh Rosen <joshrosen@databricks.com> Closes apache#3831 from JoshRosen/SPARK-4882 and squashes the following commits: 0466c7a [Josh Rosen] Register PythonBroadcast with Kryo. d5b409f [Josh Rosen] Enable registrationRequired, which would have caught this bug. 069d8a7 [Josh Rosen] Add failing test for SPARK-4882 (cherry picked from commit efa80a5) Signed-off-by: Josh Rosen <joshrosen@databricks.com>
Commit: 822a0b4
-
Revert "[SPARK-4882] Register PythonBroadcast with Kryo so that PySpa…
…rk works with KryoSerializer" This reverts commit 822a0b4. This fix does not apply to branch-1.1 or branch-1.0, since PythonBroadcast is new in 1.2.
Commit: d6b8d2c
-
[SPARK-4813][Streaming] Fix the issue that ContextWaiter didn't handl…
…e 'spurious wakeup' Used `Condition` to rewrite `ContextWaiter` because it provides a convenient API `awaitNanos` for timeout. Author: zsxwing <zsxwing@gmail.com> Closes apache#3661 from zsxwing/SPARK-4813 and squashes the following commits: 52247f5 [zsxwing] Add explicit unit type be42bcf [zsxwing] Update as per review suggestion e06bd4f [zsxwing] Fix the issue that ContextWaiter didn't handle 'spurious wakeup' (cherry picked from commit 6a89782) Signed-off-by: Tathagata Das <tathagata.das1565@gmail.com>
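The `awaitNanos` pattern mentioned above handles spurious wakeups by re-checking the condition predicate in a loop and carrying the remaining timeout across iterations. A minimal Java sketch of such a waiter follows; the names are illustrative and do not match `ContextWaiter`'s actual API:

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

public class Waiter {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition done = lock.newCondition();
    private boolean stopped = false;

    public void stop() {
        lock.lock();
        try {
            stopped = true;
            done.signalAll();
        } finally {
            lock.unlock();
        }
    }

    /**
     * Waits until stop() is called or the timeout elapses. The while loop
     * re-checks the predicate after every wakeup, so a spurious wakeup
     * cannot cause a premature return, and awaitNanos returns the
     * remaining time so the total timeout is preserved across iterations.
     */
    public boolean waitForStop(long timeoutMs) throws InterruptedException {
        lock.lock();
        try {
            long remaining = TimeUnit.MILLISECONDS.toNanos(timeoutMs);
            while (!stopped && remaining > 0) {
                remaining = done.awaitNanos(remaining);
            }
            return stopped;
        } finally {
            lock.unlock();
        }
    }
}
```

The same loop written with `Object.wait(ms)` would have to recompute the remaining time by hand, which is exactly the convenience `awaitNanos` provides.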
Commit: eac740e
Commits on Dec 31, 2014
-
[SPARK-1010] Clean up uses of System.setProperty in unit tests
Several of our tests call System.setProperty (or test code which implicitly sets system properties) and don't always reset/clear the modified properties, which can create ordering dependencies between tests and cause hard-to-diagnose failures. This patch removes most uses of System.setProperty from our tests, since in most cases we can use SparkConf to set these configurations (there are a few exceptions, including the tests of SparkConf itself). For the cases where we continue to use System.setProperty, this patch introduces a `ResetSystemProperties` ScalaTest mixin class which snapshots the system properties before individual tests and automatically restores them on test completion / failure. See the block comment at the top of the ResetSystemProperties class for more details. Author: Josh Rosen <joshrosen@databricks.com> Closes apache#3739 from JoshRosen/cleanup-system-properties-in-tests and squashes the following commits: 0236d66 [Josh Rosen] Replace setProperty uses in two example programs / tools 3888fe3 [Josh Rosen] Remove setProperty use in LocalJavaStreamingContext 4f4031d [Josh Rosen] Add note on why SparkSubmitSuite needs ResetSystemProperties 4742a5b [Josh Rosen] Clarify ResetSystemProperties trait inheritance ordering. 0eaf0b6 [Josh Rosen] Remove setProperty call in TaskResultGetterSuite. 7a3d224 [Josh Rosen] Fix trait ordering 3fdb554 [Josh Rosen] Remove setProperty call in TaskSchedulerImplSuite bee20df [Josh Rosen] Remove setProperty calls in SparkContextSchedulerCreationSuite 655587c [Josh Rosen] Remove setProperty calls in JobCancellationSuite 3f2f955 [Josh Rosen] Remove System.setProperty calls in DistributedSuite cfe9cce [Josh Rosen] Remove use of system properties in SparkContextSuite 8783ab0 [Josh Rosen] Remove TestUtils.setSystemProperty, since it is subsumed by the ResetSystemProperties trait.
633a84a [Josh Rosen] Remove use of system properties in FileServerSuite 25bfce2 [Josh Rosen] Use ResetSystemProperties in UtilsSuite 1d1aa5a [Josh Rosen] Use ResetSystemProperties in SizeEstimatorSuite dd9492b [Josh Rosen] Use ResetSystemProperties in AkkaUtilsSuite b0daff2 [Josh Rosen] Use ResetSystemProperties in BlockManagerSuite e9ded62 [Josh Rosen] Use ResetSystemProperties in TaskSchedulerImplSuite 5b3cb54 [Josh Rosen] Use ResetSystemProperties in SparkListenerSuite 0995c4b [Josh Rosen] Use ResetSystemProperties in SparkContextSchedulerCreationSuite c83ded8 [Josh Rosen] Use ResetSystemProperties in SparkConfSuite 51aa870 [Josh Rosen] Use withSystemProperty in ShuffleSuite 60a63a1 [Josh Rosen] Use ResetSystemProperties in JobCancellationSuite 14a92e4 [Josh Rosen] Use withSystemProperty in FileServerSuite 628f46c [Josh Rosen] Use ResetSystemProperties in DistributedSuite 9e3e0dd [Josh Rosen] Add ResetSystemProperties test fixture mixin; use it in SparkSubmitSuite. 4dcea38 [Josh Rosen] Move withSystemProperty to TestUtils class. (cherry picked from commit 352ed6b) Signed-off-by: Josh Rosen <joshrosen@databricks.com> Conflicts: core/src/test/scala/org/apache/spark/ShuffleSuite.scala core/src/test/scala/org/apache/spark/SparkConfSuite.scala core/src/test/scala/org/apache/spark/SparkContextSchedulerCreationSuite.scala core/src/test/scala/org/apache/spark/SparkContextSuite.scala core/src/test/scala/org/apache/spark/storage/BlockManagerSuite.scala core/src/test/scala/org/apache/spark/util/UtilsSuite.scala external/flume/src/test/java/org/apache/spark/streaming/LocalJavaStreamingContext.java external/mqtt/src/test/java/org/apache/spark/streaming/LocalJavaStreamingContext.java external/twitter/src/test/java/org/apache/spark/streaming/LocalJavaStreamingContext.java external/zeromq/src/test/java/org/apache/spark/streaming/LocalJavaStreamingContext.java tools/src/main/scala/org/apache/spark/tools/StoragePerfTester.scala
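The snapshot/restore idea behind `ResetSystemProperties` fits in a few lines. The Java sketch below is illustrative only; the real mixin is a ScalaTest trait whose before/after hooks do the equivalent of `snapshot()` and `restore()` here:

```java
import java.util.Properties;

public class SystemPropertySnapshot {
    private Properties saved;

    /** Clones the current system properties, like the trait's "before" hook. */
    public void snapshot() {
        saved = (Properties) System.getProperties().clone();
    }

    /**
     * Restores the saved snapshot, like the trait's "after" hook, so any
     * properties a test set or modified are discarded and cannot leak
     * into subsequent tests.
     */
    public void restore() {
        System.setProperties(saved);
    }
}
```

Cloning before the test matters: restoring the *same* `Properties` object the test mutated would restore nothing.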
Commit: babcafa
-
[SPARK-4298][Core] - The spark-submit cannot read Main-Class from Man…
…ifest. Resolves a bug where the `Main-Class` from a .jar file wasn't being read in properly. This was caused by the fact that the `primaryResource` object was a URI and needed to be normalized through a call to `.getPath` before it could be passed into the `JarFile` object. Author: Brennon York <brennon.york@capitalone.com> Closes apache#3561 from brennonyork/SPARK-4298 and squashes the following commits: 5e0fce1 [Brennon York] Use string interpolation for error messages, moved comment line from original code to above its necessary code segment 14daa20 [Brennon York] pushed mainClass assignment into match statement, removed spurious spaces, removed { } from case statements, removed return values c6dad68 [Brennon York] Set case statement to support multiple jar URI's and enabled the 'file' URI to load the main-class 8d20936 [Brennon York] updated to reset the error message back to the default a043039 [Brennon York] updated to split the uri and jar vals 8da7cbf [Brennon York] fixes SPARK-4298 (cherry picked from commit 8e14c5e) Signed-off-by: Josh Rosen <joshrosen@databricks.com> Conflicts: core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala
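The underlying bug was passing a URI string where `JarFile` expects a filesystem path; calling `getPath` on the URI first resolves it. A small Java sketch of reading `Main-Class` the corrected way (the jar and class names used in the test are hypothetical):

```java
import java.io.File;
import java.io.IOException;
import java.net.URI;
import java.util.jar.Attributes;
import java.util.jar.JarFile;
import java.util.jar.Manifest;

public class MainClassReader {
    /**
     * Reads Main-Class from a jar's manifest. JarFile wants a filesystem
     * path, so a file: URI must first be reduced via getPath() -- which
     * was the root cause of the original bug.
     */
    public static String mainClass(URI jarUri) throws IOException {
        try (JarFile jar = new JarFile(new File(jarUri.getPath()))) {
            Manifest manifest = jar.getManifest();
            if (manifest == null) {
                return null; // jar has no manifest, hence no Main-Class
            }
            return manifest.getMainAttributes().getValue(Attributes.Name.MAIN_CLASS);
        }
    }
}
```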
Commit: 08d4f70
-
[HOTFIX] Disable Spark UI in SparkSubmitSuite tests
This should fix a major cause of build breaks when running many parallel tests.
Commit: 1034707
Commits on Jan 1, 2015
-
[SPARK-5035] [Streaming] ReceiverMessage trait should extend Serializ…
…able Spark Streaming's ReceiverMessage trait should extend Serializable in order to fix a subtle bug that only occurs when running on a real cluster: if you attempt to send a fire-and-forget message to a remote Akka actor and that message cannot be serialized, then this seems to lead to more-or-less silent failures. As an optimization, Akka skips message serialization for messages sent within the same JVM. As a result, Spark's unit tests will never fail due to non-serializable Akka messages, but these will cause mostly-silent failures when running on a real cluster. Before this patch, here was the code for ReceiverMessage:

```scala
/** Messages sent to the NetworkReceiver. */
private[streaming] sealed trait ReceiverMessage
private[streaming] object StopReceiver extends ReceiverMessage
```

Since ReceiverMessage does not extend Serializable and StopReceiver is a regular `object`, not a `case object`, StopReceiver will throw serialization errors. As a result, graceful receiver shutdown is broken on real clusters (and local-cluster mode) but works in local modes.

If you want to reproduce this, try running the word count example from the Streaming Programming Guide in the Spark shell:

```scala
import org.apache.spark._
import org.apache.spark.streaming._
import org.apache.spark.streaming.StreamingContext._

val ssc = new StreamingContext(sc, Seconds(10))
// Create a DStream that will connect to hostname:port, like localhost:9999
val lines = ssc.socketTextStream("localhost", 9999)
// Split each line into words
val words = lines.flatMap(_.split(" "))
// Count each word in each batch
val pairs = words.map(word => (word, 1))
val wordCounts = pairs.reduceByKey(_ + _)
// Print the first ten elements of each RDD generated in this DStream to the console
wordCounts.print()
ssc.start()
Thread.sleep(10000)
ssc.stop(true, true)
```

Prior to this patch, this would work correctly in local mode but fail when running against a real cluster (it would report that some receivers were not shut down). Author: Josh Rosen <joshrosen@databricks.com> Closes apache#3857 from JoshRosen/SPARK-5035 and squashes the following commits: 71d0eae [Josh Rosen] [SPARK-5035] ReceiverMessage trait should extend Serializable. (cherry picked from commit fe6efac) Signed-off-by: Tathagata Das <tathagata.das1565@gmail.com>
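A Java analogue of the fix makes the failure mode concrete: the message type must be serializable, and the singleton message must survive a serialization round trip (Java enum singletons behave here like Scala `case object`s). This is an illustration under those assumptions, not Spark's actual code:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class MessageSerialization {
    /** The fix in miniature: the message type extends Serializable. */
    interface ReceiverMessage extends Serializable {}

    /** Enum singletons are serializable and deserialize to the same instance. */
    enum StopReceiver implements ReceiverMessage { INSTANCE }

    /** Round-trips a message the way a remote actor send would force. */
    public static Object roundTrip(Object msg)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(msg);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return in.readObject();
        }
    }
}
```

A plain non-serializable singleton would throw `NotSerializableException` in `roundTrip`, which is the cluster-only failure the commit describes.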
Commit: 61eb9be
-
[HOTFIX] Bind web UI to ephemeral port in DriverSuite
The job launched by DriverSuite should bind the web UI to an ephemeral port, since it looks like port contention in this test has caused a large number of Jenkins failures when many builds are started simultaneously. Our tests already disable the web UI, but this doesn't affect subprocesses launched by our tests. In this case, I've opted to bind to an ephemeral port instead of disabling the UI because disabling features in this test may mask its ability to catch certain bugs. See also: e24d3a9 Author: Josh Rosen <joshrosen@databricks.com> Closes apache#3873 from JoshRosen/driversuite-webui-port and squashes the following commits: 48cd05c [Josh Rosen] [HOTFIX] Bind web UI to ephemeral port in DriverSuite. (cherry picked from commit 0128398) Signed-off-by: Josh Rosen <joshrosen@databricks.com>
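Binding to port 0 is how a JVM asks the OS for an ephemeral port; setting Spark's `spark.ui.port` to 0 works the same way. A minimal Java sketch of the mechanism:

```java
import java.io.IOException;
import java.net.ServerSocket;

public class EphemeralPort {
    /**
     * Binding to port 0 lets the OS pick any free port, avoiding the
     * contention seen when many parallel test runs race for a fixed port.
     * The chosen port is read back from the bound socket.
     */
    public static int bindEphemeral() throws IOException {
        try (ServerSocket socket = new ServerSocket(0)) {
            return socket.getLocalPort();
        }
    }
}
```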
Commit: c532acf
Commits on Jan 4, 2015
-
[SPARK-4787] Stop SparkContext if a DAGScheduler init error occurs
Author: Dale <tigerquoll@outlook.com> Closes apache#3809 from tigerquoll/SPARK-4787 and squashes the following commits: 5661e01 [Dale] [SPARK-4787] Ensure that call to stop() doesn't lose the exception by using a finally block. 2172578 [Dale] [SPARK-4787] Stop context properly if an exception occurs during DAGScheduler initialization. (cherry picked from commit 3fddc94) Signed-off-by: Josh Rosen <joshrosen@databricks.com>
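The shape of the fix, stopping the partially-initialized context without losing the original exception, can be sketched as follows. The names here are hypothetical stand-ins; Spark's `SparkContext` constructor differs in detail:

```java
public class SafeInit {
    public static boolean stopped = false;

    /** Stand-in for DAGScheduler construction blowing up. */
    static void failingInit() {
        throw new IllegalStateException("DAGScheduler init failed");
    }

    /** Stand-in for SparkContext.stop(): release partially-acquired resources. */
    static void stop() {
        stopped = true;
    }

    /**
     * The pattern from the fix: if initialization throws, stop the
     * partially-constructed context, and make sure a failure inside
     * stop() cannot swallow the original exception.
     */
    public static void init() {
        try {
            failingInit();
        } catch (RuntimeException e) {
            try {
                stop();
            } catch (RuntimeException suppressed) {
                e.addSuppressed(suppressed); // keep the root cause primary
            }
            throw e;
        }
    }
}
```

Without the inner guard, an exception thrown by `stop()` would replace the root-cause exception, which is exactly what the "finally block" part of the commit prevents.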
Commit: 6d1ca23
Commits on Jan 7, 2015
-
[SPARK-5132][Core]Correct stage Attempt Id key in stageInfofromJson
SPARK-5132: stageInfoToJson writes the key "Stage Attempt Id", but stageInfoFromJson was reading "Attempt Id". Author: hushan[胡珊] <hushan@xiaomi.com> Closes apache#3932 from suyanNone/json-stage and squashes the following commits: 41419ab [hushan[胡珊]] Correct stage Attempt Id key in stageInfofromJson (cherry picked from commit d345ebe) Signed-off-by: Josh Rosen <joshrosen@databricks.com>
Commit: 55325af
Commits on Jan 9, 2015
-
Commit: c07a691
-
Commit: 677281e
-
Commit: 470f026
-
Commit: 99910b1