
Commit: Document the two new settings

Evan Chan committed Apr 3, 2014
1 parent b92752b commit e3c408e
Showing 2 changed files with 30 additions and 13 deletions.
2 changes: 1 addition & 1 deletion core/src/main/scala/org/apache/spark/deploy/worker/Worker.scala
@@ -67,7 +67,7 @@ private[spark] class Worker(
// How often worker will clean up old app folders
val CLEANUP_INTERVAL_MILLIS = conf.getLong("spark.worker.cleanup_interval", 60 * 30) * 1000
// TTL for app folders/data; after TTL expires it will be cleaned up
-val APP_DATA_RETENTION_SECS = conf.getLong("spark.worker.app_data_ttl", 15 * 24 * 3600)
+val APP_DATA_RETENTION_SECS = conf.getLong("spark.worker.app_data_ttl", 7 * 24 * 3600)

// Index into masterUrls that we're currently trying to register with.
var masterIndex = 0
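
For reference, here is a minimal standalone sketch (the object and variable names are illustrative, not part of the commit) of how the two values above resolve with the new defaults: the cleanup interval is read in seconds and converted to milliseconds, while the app data TTL stays in seconds.

```scala
import org.apache.spark.SparkConf

// Illustrative sketch only: mirrors the two config reads shown in the diff above.
object WorkerCleanupDefaults {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()

    // How often the worker scans for old application work dirs:
    // configured in seconds, used internally in milliseconds.
    val cleanupIntervalMillis =
      conf.getLong("spark.worker.cleanup_interval", 60 * 30) * 1000 // default: 1,800,000 ms (30 minutes)

    // How long application work dirs are retained before becoming eligible for cleanup, in seconds.
    val appDataRetentionSecs =
      conf.getLong("spark.worker.app_data_ttl", 7 * 24 * 3600) // default: 604,800 s (7 days)

    println(s"cleanup every $cleanupIntervalMillis ms; retain app dirs for $appDataRetentionSecs s")
  }
}
```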
41 changes: 29 additions & 12 deletions docs/configuration.md
@@ -161,13 +161,13 @@ Apart from these, the following properties are also available, and may be useful
<td>spark.ui.acls.enable</td>
<td>false</td>
<td>
Whether Spark web UI ACLs are enabled. If enabled, this checks whether the user has
access permissions to view the web UI. See <code>spark.ui.view.acls</code> for more details.
Also note that this requires the user to be known; if the user comes across as null, no checks
are done. Filters can be used to authenticate and set the user.
</td>
</tr>
<tr>
<td>spark.ui.view.acls</td>
<td>Empty</td>
<td>
@@ -276,10 +276,10 @@ Apart from these, the following properties are also available, and may be useful
<td>spark.serializer.objectStreamReset</td>
<td>10000</td>
<td>
When serializing using org.apache.spark.serializer.JavaSerializer, the serializer caches
objects to prevent writing redundant data; however, that stops garbage collection of those
objects. By calling 'reset' you flush that info from the serializer, allowing old
objects to be collected. To turn off this periodic reset, set it to a value of <= 0.
By default it will reset the serializer every 10,000 objects.
</td>
</tr>
@@ -375,7 +375,7 @@ Apart from these, the following properties are also available, and may be useful
<td>spark.akka.heartbeat.interval</td>
<td>1000</td>
<td>
This is set to a larger value to disable the failure detector that comes built into Akka. It can be enabled again if you plan to use this feature (not recommended). A larger interval value in seconds reduces network overhead, while a smaller value (~1 s) might be more informative for Akka's failure detector. Tune this in combination with `spark.akka.heartbeat.pauses` and `spark.akka.failure-detector.threshold` if you need to. The only positive use case for the failure detector is that a sensitive detector can help evict rogue executors quickly; however, this is usually unnecessary, as GC pauses and network lags are expected in a real Spark cluster. Moreover, enabling it leads to a lot of heartbeat exchanges between nodes, flooding the network.
</td>
</tr>
<tr>
@@ -430,7 +430,7 @@ Apart from these, the following properties are also available, and may be useful
<td>spark.broadcast.blockSize</td>
<td>4096</td>
<td>
Size of each piece of a block in kilobytes for <code>TorrentBroadcastFactory</code>.
Too large a value decreases parallelism during broadcast (makes it slower); however, if it is too small, <code>BlockManager</code> might take a performance hit.
</td>
</tr>
@@ -555,28 +555,28 @@ Apart from these, the following properties are also available, and may be useful
the driver.
</td>
</tr>
<tr>
<td>spark.authenticate</td>
<td>false</td>
<td>
Whether Spark authenticates its internal connections. See <code>spark.authenticate.secret</code> if not
running on YARN.
</td>
</tr>
<tr>
<td>spark.authenticate.secret</td>
<td>None</td>
<td>
Set the secret key used for Spark to authenticate between components. This needs to be set if
not running on YARN and authentication is enabled.
</td>
</tr>
<tr>
<td>spark.core.connection.auth.wait.timeout</td>
<td>30</td>
<td>
Number of seconds for the connection to wait for authentication to occur before timing
out and giving up.
</td>
</tr>
<tr>
@@ -586,6 +586,23 @@ Apart from these, the following properties are also available, and may be useful
Number of cores to allocate for each task.
</td>
</tr>
<tr>
<td>spark.worker.cleanup_interval</td>
<td>1800 (30 minutes)</td>
<td>
Controls the interval, in seconds, at which the worker cleans up old application work dirs
on the local machine.
</td>
</tr>
<tr>
<td>spark.worker.app_data_ttl</td>
<td>7 * 24 * 3600 (7 days)</td>
<td>
The number of seconds to retain application work directories on each worker. This is a time-to-live
and should depend on the amount of available disk space you have. Application logs and jars are
downloaded to each application work dir. Over time, the work dirs can quickly fill up disk space,
especially if you run jobs very frequently. (See the configuration sketch after this table.)
</td>
</tr>
</table>
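
To make the two new worker settings concrete, here is a hedged sketch of overriding them. It assumes the worker's SparkConf is constructed with defaults enabled, so it picks up `spark.*` Java system properties set on the worker's JVM at launch; the object name and the chosen values below are purely illustrative.

```scala
import org.apache.spark.SparkConf

// Hypothetical example: tighten cleanup on a disk-constrained worker.
object TightWorkerCleanupExample {
  def main(args: Array[String]): Unit = {
    // Stand-in for passing -Dspark.worker.cleanup_interval=600 (etc.) to the worker JVM.
    sys.props("spark.worker.cleanup_interval") = (10 * 60).toString   // scan every 10 minutes
    sys.props("spark.worker.app_data_ttl")     = (24 * 3600).toString // keep app dirs for 1 day

    val conf = new SparkConf() // loadDefaults = true: copies spark.* system properties
    println(conf.get("spark.worker.cleanup_interval")) // 600
    println(conf.get("spark.worker.app_data_ttl"))     // 86400
  }
}
```

Shorter values reclaim disk sooner but risk removing logs and jars for applications you may still want to inspect; longer values keep more history at the cost of disk space.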

## Viewing Spark Properties
