
Commit: Document the two new settings

Evan Chan committed Apr 3, 2014
1 parent b92752b commit e3c408e
Showing 2 changed files with 30 additions and 13 deletions.
2 changes: 1 addition & 1 deletion core/src/main/scala/org/apache/spark/deploy/worker/Worker.scala
@@ -67,7 +67,7 @@ private[spark] class Worker(
// How often worker will clean up old app folders
val CLEANUP_INTERVAL_MILLIS = conf.getLong("spark.worker.cleanup_interval", 60 * 30) * 1000
// TTL for app folders/data; after TTL expires it will be cleaned up
-val APP_DATA_RETENTION_SECS = conf.getLong("spark.worker.app_data_ttl", 15 * 24 * 3600)
+val APP_DATA_RETENTION_SECS = conf.getLong("spark.worker.app_data_ttl", 7 * 24 * 3600)

// Index into masterUrls that we're currently trying to register with.
var masterIndex = 0
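
For reference, here is a minimal standalone sketch (the object and variable names are illustrative, not part of the commit) of how the two values above resolve with the new defaults: the cleanup interval is read in seconds and converted to milliseconds, while the app data TTL stays in seconds.

```scala
import org.apache.spark.SparkConf

// Illustrative sketch only: mirrors the two config reads shown in the diff above.
object WorkerCleanupDefaults {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()

    // How often the worker scans for old application work dirs:
    // configured in seconds, used internally in milliseconds.
    val cleanupIntervalMillis =
      conf.getLong("spark.worker.cleanup_interval", 60 * 30) * 1000 // default: 1,800,000 ms (30 minutes)

    // How long application work dirs are retained before becoming eligible for cleanup, in seconds.
    val appDataRetentionSecs =
      conf.getLong("spark.worker.app_data_ttl", 7 * 24 * 3600) // default: 604,800 s (7 days)

    println(s"cleanup every $cleanupIntervalMillis ms; retain app dirs for $appDataRetentionSecs s")
  }
}
```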
41 changes: 29 additions & 12 deletions docs/configuration.md
@@ -161,13 +161,13 @@ Apart from these, the following properties are also available, and may be useful
<td>spark.ui.acls.enable</td>
<td>false</td>
<td>
Whether Spark web UI ACLs are enabled. If enabled, this checks whether the user has
access permissions to view the web UI. See <code>spark.ui.view.acls</code> for more details.
Also note that this requires the user to be known; if the user comes across as null, no checks
are done. Filters can be used to authenticate and set the user.
</td>
</tr>
<tr>
<td>spark.ui.view.acls</td>
<td>Empty</td>
<td>
@@ -276,10 +276,10 @@ Apart from these, the following properties are also available, and may be useful
<td>spark.serializer.objectStreamReset</td>
<td>10000</td>
<td>
When serializing using org.apache.spark.serializer.JavaSerializer, the serializer caches
objects to prevent writing redundant data; however, that stops garbage collection of those
objects. By calling 'reset' you flush that info from the serializer, allowing old
objects to be collected. To turn off this periodic reset, set it to a value of <= 0.
By default it will reset the serializer every 10,000 objects.
</td>
</tr>
@@ -375,7 +375,7 @@ Apart from these, the following properties are also available, and may be useful
<td>spark.akka.heartbeat.interval</td>
<td>1000</td>
<td>
This is set to a larger value to disable the failure detector that comes built into Akka. It can be enabled again if you plan to use this feature (not recommended). A larger interval value in seconds reduces network overhead, while a smaller value (~1 s) might be more informative for Akka's failure detector. Tune this in combination with `spark.akka.heartbeat.pauses` and `spark.akka.failure-detector.threshold` if you need to. The only positive use case for the failure detector is that a sensitive detector can help evict rogue executors quickly; however, this is usually unnecessary, as GC pauses and network lags are expected in a real Spark cluster. Moreover, enabling it leads to a lot of heartbeat exchanges between nodes, flooding the network.
</td>
</tr>
<tr>
@@ -430,7 +430,7 @@ Apart from these, the following properties are also available, and may be useful
<td>spark.broadcast.blockSize</td>
<td>4096</td>
<td>
Size of each piece of a block in kilobytes for <code>TorrentBroadcastFactory</code>.
Too large a value decreases parallelism during broadcast (makes it slower); however, if it is too small, <code>BlockManager</code> might take a performance hit.
</td>
</tr>
@@ -555,28 +555,28 @@ Apart from these, the following properties are also available, and may be useful
the driver.
</td>
</tr>
<tr>
<td>spark.authenticate</td>
<td>false</td>
<td>
Whether Spark authenticates its internal connections. See <code>spark.authenticate.secret</code> if not
running on YARN.
</td>
</tr>
<tr>
<td>spark.authenticate.secret</td>
<td>None</td>
<td>
Set the secret key used for Spark to authenticate between components. This needs to be set if
not running on YARN and authentication is enabled.
</td>
</tr>
<tr>
<td>spark.core.connection.auth.wait.timeout</td>
<td>30</td>
<td>
Number of seconds for the connection to wait for authentication to occur before timing
out and giving up.
</td>
</tr>
<tr>
@@ -586,6 +586,23 @@ Apart from these, the following properties are also available, and may be useful
Number of cores to allocate for each task.
</td>
</tr>
<tr>
<td>spark.worker.cleanup_interval</td>
<td>1800 (30 minutes)</td>
<td>
Controls the interval, in seconds, at which the worker cleans up old application work dirs
on the local machine.
</td>
</tr>
<tr>
<td>spark.worker.app_data_ttl</td>
<td>7 * 24 * 3600 (7 days)</td>
<td>
The number of seconds to retain application work directories on each worker. This is a time-to-live
and should depend on the amount of available disk space you have. Application logs and jars are
downloaded to each application work dir. Over time, the work dirs can quickly fill up disk space,
especially if you run jobs very frequently. (See the configuration sketch after this table.)
</td>
</tr>
</table>
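
To make the two new worker settings concrete, here is a hedged sketch of overriding them. It assumes the worker's SparkConf is constructed with defaults enabled, so it picks up `spark.*` Java system properties set on the worker's JVM at launch; the object name and the chosen values below are purely illustrative.

```scala
import org.apache.spark.SparkConf

// Hypothetical example: tighten cleanup on a disk-constrained worker.
object TightWorkerCleanupExample {
  def main(args: Array[String]): Unit = {
    // Stand-in for passing -Dspark.worker.cleanup_interval=600 (etc.) to the worker JVM.
    sys.props("spark.worker.cleanup_interval") = (10 * 60).toString   // scan every 10 minutes
    sys.props("spark.worker.app_data_ttl")     = (24 * 3600).toString // keep app dirs for 1 day

    val conf = new SparkConf() // loadDefaults = true: copies spark.* system properties
    println(conf.get("spark.worker.cleanup_interval")) // 600
    println(conf.get("spark.worker.app_data_ttl"))     // 86400
  }
}
```

Shorter values reclaim disk sooner but risk removing logs and jars for applications you may still want to inspect; longer values keep more history at the cost of disk space.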

## Viewing Spark Properties
