aws · jamesls · Feb 25, 2015 · Feb 24, 2015 · Feb 25, 2015 · kyleknap
diff --git a/awscli/topics/s3-config.rst b/awscli/topics/s3-config.rst
@@ -0,0 +1,136 @@
+:title: AWS CLI S3 Configuration
+:description: Advanced configuration for AWS S3 Commands
+:category: S3
+:related command: s3 cp, s3 sync, s3 mv, s3 rm
+
+The ``aws s3`` transfer commands, which include the ``cp``, ``sync``, ``mv``,
+and ``rm`` commands, have additional configuration values you can use to
+control S3 transfers.  This topic guide discusses these parameters as well as
+best practices and guidelines for setting these values.
+
+Before discussing the specifics of these values, note that these values are
+entirely optional.  You should be able to use the ``aws s3`` transfer commands
+without having to configure any of these values.  These configuration values
+are provided in the case where you need to modify one of these values, either
+for performance reasons or to account for the specific environment where these
+``aws s3`` commands are being run.
+
+
+Configuration Values
+====================
+
+These are the configuration values you can set for S3:
+
+* ``max_concurrent_requests`` - The maximum number of concurrent requests.
+* ``max_queue_size`` - The maximum number of tasks in the task queue.
+* ``multipart_threshold`` - The size threshold the CLI uses for multipart
+  transfers of individual files.
+* ``multipart_chunksize`` - When using multipart transfers, this is the chunk
+  size that the CLI uses for multipart transfers of individual files.
+
+These values must be set under the top level ``s3`` key in the AWS Config File,
+which has a default location of ``~/.aws/config``.  Below is an example
+configuration::
+
+    [profile development]
+    aws_access_key_id=foo
+    aws_secret_access_key=bar
+    s3 =
+      max_concurrent_requests = 20
+      max_queue_size = 10000
+      multipart_threshold = 64MB
+      multipart_chunksize = 16MB
+
+Note that all the S3 configuration values are indented and nested under the top
+level ``s3`` key.
+
+You can also set these values programatically using the ``aws configure set``
+command.  For example, to set the above values for the default profile, you
+could instead run these commands::
+
+    $ aws configure set default.s3.max_concurrent_requests 20
+    $ aws configure set default.s3.max_queue_size 10000
+    $ aws configure set default.s3.multipart_threshold 64MB
+    $ aws configure set default.s3.multipart_chunksize 16MB
+
+
+max_concurrent_requests
+-----------------------
+
+**Default** - ``10``
+
+The ``aws s3`` transfer commands are multithreaded.  At any given time,
+multiple requests to Amazon S3 are in flight.  For example, if you are
+uploading a directory via ``aws s3 cp localdir s3://bucket/ --recursive``, the
+AWS CLI could be uploading the local files ``localdir/file1``,
+``localdir/file2``, and ``localdir/file3`` in parallel.  The
+``max_concurrent_requests`` specifies the maximum number of transfer commands
+that are allowed at any given time.
+
+You may need to change this value for a few reasons:
+
+* Decreasing this value - On some environments, the default of 10 concurrent
+  requests can overwhelm a system.  This may cause connection timeouts or
+  slow the responsiveness of the system.  Lowering this value will make the
+  S3 transfer commands less resource intensive.  The tradeoff is that
+  S3 transfers may take longer to complete.
+* Increasing this value - In some scenarios, you may want the S3 transfers
+  to complete as quickly as possible, using as much network bandwidth
+  as necessary.  In this scenario, the default number of concurrent requests
+  may not be sufficient to utilize all the network bandwidth available.
+  Increasing this value may improve the time it takes to complete an
+  S3 transfer.
+
+
+max_queue_size
+--------------
+
+**Default** - ``1000``
+
+The AWS CLI internally uses a producer consumer model, where we queue up S3
+tasks that are then executed by consumers, which in this case utilize a bound
+thread pool, controlled by ``max_concurrent_requests``.  A task generally maps
+to a single S3 operation.  For example, as task could be a ``PutObjectTask``,
+or a ``GetObjectTask``, or an ``UploadPartTask``.  The enqueuing rate can be
+much faster than the rate at which consumers are executing tasks.  To avoid
+unbounded growth, the task queue size is capped to a specific size.  This
+configuration value changes the value of that maximum number.
+
+You generally will not need to change this value.  This value also corresponds
+to the number of tasks we are aware of that need to be executed.  This means
+that by default we can only see 1000 tasks ahead.  Until the S3 command knows
+the total number of tasks executed, the progress line will show a total of
+``...``.  Increasing this value means that we will be able to more quickly know
+the total number of tasks needed, assuming that the enqueuing rate is quicker
+than the rate of task consumption.  The tradeoff is that a larger max queue
+size will require more memory.
+
+
+multipart_threshold
+-------------------
+
+**Default** - ``8MB``
+
+When uploading, downloading, or copying a file, the S3 commands
+will switch to multipart operations if the file reaches a given
+size threshold.  The ``multipart_threshold`` controls this value.
+You can specify this value in one of two ways:
+
+* The file size in bytes.  For example, ``1048576``.
+* The file size with a size suffix.  You can use ``KB``, ``MB``, ``GB``,
+  ``TB``.  For example: ``10MB``, ``1GB``.  Note that S3 imposes
+  constraints on valid values that can be used for multipart
+  operations.
+
+
+multipart_chunksize
+-------------------
+
+**Default** - ``8MB``
+
+Once the S3 commands have decided to use multipart operations, the
+file is divided into chunks.  This configuration option specifies what
+the chunk size (also referred to as the part size) should be.  This
+value can specified using the same semantics as ``multipart_threshold``,
+that is either as the number of bytes as an integer, or using a size
+suffix.