Document data streams and custom index lifecycle policies #6553

Merged 7 commits on Nov 22, 2021
85 changes: 83 additions & 2 deletions docs/data-streams.asciidoc
@@ -1,4 +1,85 @@
[[apm-data-streams]]
== Data streams
=== Data streams

// to do: fill with content. placeholder for external links for now
****
{agent} uses data streams to store append-only time series data across multiple indices.
Data streams are well-suited for logs, metrics, traces, and other continuously generated data,
and offer a host of benefits over other indexing strategies:

* Reduced number of fields per index
* More granular data control
* Flexible naming scheme
* Fewer ingest permissions required

See the {fleet-guide}/data-streams.html[Fleet and Elastic Agent Guide] to learn more.
****

[discrete]
[[apm-data-streams-naming-scheme]]
=== Data stream naming scheme

APM data follows the `<type>-<dataset>-<namespace>` naming scheme.
The `type` and `dataset` are predefined by the APM integration,
but the `namespace` is your opportunity to customize how different types of data are stored in {es}.
There is no recommendation for what to use as your namespace--it is intentionally flexible.
For example, you might create namespaces for each of your environments,
like `dev`, `staging`, and `prod`.
Or, you might create namespaces that correspond to strategic business units within your organization.
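
For example, assuming `dev` and `prod` namespaces (illustrative names only), application trace data would be written to data streams such as:

[source,text]
----
traces-apm-dev
traces-apm-prod
----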

[discrete]
[[apm-data-streams-list]]
=== APM data streams

By type, the APM data streams are:

Traces::

Traces are made up of {apm-guide-ref}/data-model.html[spans and transactions].
Traces are stored in the following data streams:

- Application traces: `traces-apm-<namespace>`
- RUM and iOS agent application traces: `traces-apm.rum-<namespace>`

Metrics::

Metrics include application-based metrics and basic system metrics.
Metrics are stored in the following data streams:

- APM internal metrics: `metrics-apm.internal-<namespace>`
- APM profiling metrics: `metrics-apm.profiling-<namespace>`
- Application metrics: `metrics-apm.app.<service.name>-<namespace>`
+
Application metrics include the instrumented service's name--defined in each APM agent's
configuration--in the data stream name.
Service names therefore must follow certain index naming rules; see the example after this list.
+
[%collapsible]
.Service name rules
====
* Service names are case-insensitive and must be unique.
For example, you cannot have a service named `Foo` and another named `foo`.
* Special characters will be removed from service names and replaced with underscores (`_`).
Special characters include:
+
[source,text]
----
'\\', '/', '*', '?', '"', '<', '>', '|', ' ', ',', '#', ':', '-'
----
====

Logs::

Logs include application error events and application logs.
Logs are stored in the following data streams:

- APM error/exception logging: `logs-apm.error-<namespace>`
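
As a sketch of how the naming scheme and the service name rules combine, a service named `My Front-End` reporting under the `prod` namespace (both names illustrative) would have its application metrics stored in a data stream similar to:

[source,text]
----
metrics-apm.app.my_front_end-prod
----

The space and hyphen are replaced with underscores, and the name is treated case-insensitively.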

[discrete]
[[apm-data-streams-next]]
=== What's next?

* Data streams define not only how data is stored in {es}, but also how data is retained over time.
See <<ilm-how-to>> to learn how to create your own data retention policies.

* See <<manage-storage>> for information on APM storage and processing costs,
processing and performance, and other index management features.
9 changes: 0 additions & 9 deletions docs/how-to.asciidoc
@@ -4,20 +4,11 @@
Learn how to perform common APM configuration and management tasks.

* <<source-map-how-to>>
* <<ilm-how-to>>
* <<jaeger-integration>>
* <<ingest-pipelines>>
* <<manage-storage>>
* <<apm-tune-elasticsearch>>

include::./source-map-how-to.asciidoc[]

include::./ilm-how-to.asciidoc[]

include::./jaeger-integration.asciidoc[]

include::./ingest-pipelines.asciidoc[]

include::./manage-storage.asciidoc[]

include::./apm-tune-elasticsearch.asciidoc[]
bmorelli25 marked this conversation as resolved.
170 changes: 158 additions & 12 deletions docs/ilm-how-to.asciidoc
@@ -1,18 +1,164 @@
[[ilm-how-to]]
=== Index lifecycle management (ILM)
=== Index lifecycle management

// todo: add more context and an example
Index lifecycle policies allow you to automate the
lifecycle of your APM indices as they grow and age.
A default policy is applied to each APM data stream,
but can be customized depending on your business needs.

++++
<titleabbrev>Customize index lifecycle management</titleabbrev>
++++
See {ref}/index-lifecycle-management.html[ILM: Manage the index lifecycle] to learn more.

The index lifecycle management (ILM) feature in {es} allows you to automate the
lifecycle of your APM Server indices as they grow and age.
ILM is enabled by default, and a default policy is applied to all APM indices.
[discrete]
[[index-lifecycle-policies-default]]
=== Default policies

To view and edit these index lifecycle policies in {kib},
select *Stack Management* / *Index Lifecycle Management*.
Search for `apm`.
The table below describes the default index lifecycle policy applied to each APM data stream.
Each policy includes a rollover and delete definition:

See {ref}/getting-started-index-lifecycle-management.html[manage the index lifecycle] for more information.
* **Rollover**: Using rollover indices prevents a single index from growing too large and optimizes indexing and search performance. Rollover (writing to a new index) occurs after either an age or size threshold is met.
* **Delete**: The delete phase permanently removes the index after a time threshold is met.

[cols="1,1,1",options="header"]
|===
|Data stream
|Rollover after
|Delete after

|`traces-apm`
|30 days / 50 GB
|10 days

|`traces-apm.rum`
|30 days / 50 GB
|90 days

|`metrics-apm.profiling`
|30 days / 50 GB
|10 days

|`metrics-apm.internal`
|30 days / 50 GB
|90 days

|`metrics-apm.app`
|30 days / 50 GB
|90 days

|`logs-apm.error`
|30 days / 50 GB
|10 days

|===

The APM index lifecycle policies can be viewed in {kib}.
Navigate to *Stack Management* / *Index Lifecycle Management*, and search for `apm`.
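
To check which policy is currently applied to a data stream's backing indices, you can also use the ILM explain API from **Dev Tools** (shown here for a data stream in the `default` namespace); the response lists the policy and current lifecycle phase for each backing index:

[source,bash]
----
GET traces-apm-default/_ilm/explain
----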

[discrete]
[[data-streams-custom-policy]]
=== Configure a custom index lifecycle policy

When the APM package is installed, Fleet creates a default `*@custom` component template for each data stream.
The easiest way to configure a custom index lifecycle policy per data stream is to edit this template.

This tutorial explains how to apply a custom index lifecycle policy to the `traces-apm` data stream.

[discrete]
[[data-streams-custom-one]]
=== Step 1: View data streams

The **Data Streams** view in {kib} shows you the data streams,
index templates, and index lifecycle policies associated with a given integration.

. Navigate to **Stack Management** > **Index Management** > **Data Streams**.
. Search for `traces-apm` to see all data streams associated with APM trace data.
. In this example, I only have one data stream because I'm only using the `default` namespace.
You may have more if your setup includes multiple namespaces.
+
[role="screenshot"]
image::images/data-stream-overview.png[Data streams info]
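
If you prefer the API, you can also list the matching data streams from **Dev Tools**; the output will vary with the namespaces in your setup:

[source,bash]
----
GET _data_stream/traces-apm-*
----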

[discrete]
[[data-streams-custom-two]]
=== Step 2: Create an index lifecycle policy

. Navigate to **Stack Management** > **Index Lifecycle Policies**.
. Click **Create policy**.

Name your new policy. For this tutorial, I've chosen `custom-traces-apm-policy`.
Customize the policy to your liking, and when you're done, click **Save policy**.
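
Alternatively, the policy can be created from **Dev Tools**. The phase settings below are only an example; adjust the rollover and delete thresholds to match your retention needs:

[source,bash]
----
PUT _ilm/policy/custom-traces-apm-policy
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_age": "30d",
            "max_size": "50gb"
          }
        }
      },
      "delete": {
        "min_age": "30d",
        "actions": {
          "delete": {}
        }
      }
    }
  }
}
----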

[discrete]
[[data-streams-custom-three]]
=== Step 3: Apply the index lifecycle policy

To apply your new index lifecycle policy to the `traces-apm-*` data stream,
edit the `<data-stream-name>@custom` component template.

. Click on the **Component Template** tab and search for `traces-apm`.
. Select the `traces-apm@custom` template and click **Manage** > **Edit**.
. Under **Index settings**, set the ILM policy name created in the previous step:
+
[source,json]
----
{
"lifecycle": {
"name": "custom-traces-apm-policy"
}
}
----
. Continue to **Review** and ensure your request looks similar to the image below.
If it does, click **Create component template**.
+
[role="screenshot"]
image::images/create-component-template.png[Create component template]
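
The same change can also be sketched with the component template API. Note that this request replaces the body of the existing `traces-apm@custom` component template, so include any settings or mappings you have already added to it:

[source,bash]
----
PUT _component_template/traces-apm@custom
{
  "template": {
    "settings": {
      "index.lifecycle.name": "custom-traces-apm-policy"
    }
  }
}
----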

[discrete]
[[data-streams-custom-four]]
=== Step 4: Roll over the data stream (optional)

To confirm that the data stream is now using the new index template and ILM policy,
you can either repeat <<data-streams-custom-one,step one>>, or navigate to **Dev Tools** and run the following:

[source,bash]
----
GET /_data_stream/traces-apm-default <1>
----
<1> The name of the data stream we've been working with, appended with your `<namespace>`

The result should include the following:

[source,json]
----
{
"data_streams" : [
{
...
"template" : "traces-apm-default", <1>
"ilm_policy" : "custom-traces-apm-policy", <2>
...
}
]
}
----
<1> The name of the index template used by this data stream
<2> The name of the custom index lifecycle policy created in <<data-streams-custom-two,step two>> and applied to the component template in <<data-streams-custom-three,step three>>

New ILM policies only take effect when new indices are created,
so you must either wait for a rollover to occur (usually after 30 days or when the index size reaches 50 GB),
or force a rollover using the {ref}/indices-rollover-index.html[{es} rollover API]:

[source,bash]
----
POST /traces-apm-default/_rollover/
----
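
A successful rollover returns a response similar to the following; the backing index names and generation numbers will differ in your deployment:

[source,json]
----
{
  "acknowledged" : true,
  "shards_acknowledged" : true,
  "old_index" : ".ds-traces-apm-default-2021.11.22-000001",
  "new_index" : ".ds-traces-apm-default-2021.11.22-000002",
  "rolled_over" : true,
  "dry_run" : false
}
----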

[discrete]
[[data-streams-custom-policy-namespace]]
=== Namespace-level index lifecycle policies

It is also possible to create more granular index lifecycle policies that apply to individual namespaces.
This process is similar to the tutorial above, but includes cloning and modifying the existing index template to use
a new `*@custom` component template.

For more information on this process, see
{fleet-guide}/data-streams.html#data-streams-ilm-tutorial[Tutorial: Customize data retention for integrations].
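
As a rough sketch only, a namespace-specific `*@custom` component template can point at its own lifecycle policy. The template and policy names below are illustrative, and the cloned index template for that namespace must list the component template in its `composed_of` array:

[source,bash]
----
PUT _component_template/traces-apm@custom-production
{
  "template": {
    "settings": {
      "index.lifecycle.name": "custom-traces-apm-production-policy"
    }
  }
}
----
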
Binary file added docs/images/create-component-template.png
Binary file added docs/images/data-stream-overview.png
4 changes: 2 additions & 2 deletions docs/integrations-index.asciidoc
@@ -28,9 +28,9 @@ include::features.asciidoc[]

include::how-to.asciidoc[]

include::input-apm.asciidoc[]
include::manage-storage.asciidoc[]

include::data-streams.asciidoc[]
include::input-apm.asciidoc[]

include::secure-agent-communication.asciidoc[]

67 changes: 13 additions & 54 deletions docs/manage-storage.asciidoc
@@ -1,13 +1,17 @@
[[manage-storage]]
=== Manage storage
== Manage storage

* <<storage-guide>>
* <<processing-and-performance>>
* <<reduce-apm-storage>>
* <<manage-indices-in-kibana>>
* <<update-data>>
{agent} uses <<apm-data-streams,data streams>> to store time series data across multiple indices.
Each data stream ships with a customizable <<ilm-how-to,index lifecycle policy>> that automates data retention as your indices grow and age.

The <<storage-guide,storage and sizing guide>> attempts to define a "typical" storage reference for Elastic APM,
and there are additional settings you can tweak to <<reduce-apm-storage,reduce storage>>,
or to <<apm-tune-elasticsearch,tune data ingestion in Elasticsearch>>.

include::./data-streams.asciidoc[]

include::./ilm-how-to.asciidoc[]

[float]
[[storage-guide]]
=== Storage and sizing guide

@@ -71,53 +75,6 @@ APM data compresses quite well, so the storage cost in Elasticsearch will be con

NOTE: These examples were indexing the same data over and over with minimal variation. Because of that, the compression ratios observed of 80-90% are somewhat optimistic.

[float]
[[processing-and-performance]]
=== Processing and performance

APM Server performance depends on a number of factors: memory and CPU available,
network latency, transaction sizes, workload patterns,
agent and server settings, versions, and protocol.

Let's look at a simple example that makes the following assumptions:

* The load is generated in the same region as where APM Server and Elasticsearch are deployed.
* We're using the default settings in cloud.
* A small number of agents are reporting.

This leaves us with relevant variables like payload and instance sizes.
See the table below for approximations.
As a reminder, events are
<<data-model-transactions,transactions>> and
<<data-model-spans,spans>>.

[options="header"]
|=======================================================================
|Transaction/Instance |512Mb Instance |2Gb Instance |8Gb Instance
|Small transactions

_5 spans with 5 stack frames each_ |600 events/second |1200 events/second |4800 events/second
|Medium transactions

_15 spans with 15 stack frames each_ |300 events/second |600 events/second |2400 events/second
|Large transactions

_30 spans with 30 stack frames each_ |150 events/second |300 events/second |1400 events/second
|=======================================================================

In other words, a 512 Mb instance can process \~3 Mbs per second,
while an 8 Gb instance can process ~20 Mbs per second.

APM Server is CPU bound, so it scales better from 2 Gb to 8 Gb than it does from 512 Mb to 2 Gb.
This is because larger instance types in Elastic Cloud come with much more computing power.

Don't forget that the APM Server is stateless.
Several instances running do not need to know about each other.
This means that with a properly sized Elasticsearch instance, APM Server scales out linearly.

NOTE: RUM deserves special consideration. The RUM agent runs in browsers, and there can be many thousands reporting to an APM Server with very variable network latency.

[float]
[[reduce-apm-storage]]
=== Reduce storage

@@ -212,3 +169,5 @@ POST *-apm-*/_update_by_query?expand_wildcards=all
// CONSOLE

TIP: Remember to also change the service name in the {apm-agents-ref}/index.html[APM agent configuration].

include::./apm-tune-elasticsearch.asciidoc[]