Merge pull request #3818 from cockroachdb/replicaton-zones-for-indexes
Document replication zones for secondary indexes
Jesse Seldess committed Oct 4, 2018
2 parents 575f8d6 + 517aa44 commit 97b8fa1
Show file tree
Hide file tree
Showing 10 changed files with 224 additions and 61 deletions.
4 changes: 3 additions & 1 deletion v2.0/architecture/distribution-layer.md
@@ -39,7 +39,9 @@ Each node caches values of the `meta2` range it has accessed before, which optim

After the node's meta ranges is the KV data your cluster stores.

-This data is broken up into 64MiB sections of contiguous key-space known as ranges. This size represents a sweet spot for us between a size that's small enough to move quickly between nodes, but large enough to store a meaningfully contiguous set of data whose keys are more likely to be accessed together. These ranges are then shuffled around your cluster to ensure survivability.
+Each table and its secondary indexes initially map to a single range, where each key-value pair in the range represents a single row in the table (also called the primary index because the table is sorted by the primary key) or a single row in a secondary index. As soon as a range reaches 64 MiB in size, it splits into two ranges. This process continues as a table and its indexes continue growing. Once a table is split across multiple ranges, it's likely that the table and secondary indexes will be stored in separate ranges. However, a range can still contain data for both the table and a secondary index.
+
+The default 64 MiB range size is a sweet spot for us: small enough for ranges to move quickly between nodes, but large enough to store a meaningfully contiguous set of data whose keys are more likely to be accessed together. These ranges are then shuffled around your cluster to ensure survivability.

These ranges are replicated (in the aptly named Replication Layer), and have the addresses of each replica stored in the `meta2` range.

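To see the table-to-range mapping described in this hunk on a live cluster, you can list the ranges backing a table. A minimal sketch, assuming a hypothetical `users` table (`SHOW EXPERIMENTAL_RANGES` is available in the v2.0-era releases this file documents):

```sql
-- List the ranges currently storing a table's data, along with each
-- range's replicas and leaseholder. The table name is hypothetical.
SHOW EXPERIMENTAL_RANGES FROM TABLE users;
```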
128 changes: 105 additions & 23 deletions v2.0/configure-replication-zones.md

Large diffs are not rendered by default.
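The rendered diff for this file is hidden, but per the PR title and the renamed anchors elsewhere in this commit, it documents applying replication zones to secondary indexes. A hedged sketch of what such a configuration looks like; the index name and YAML payload are illustrative, and the syntax is assumed from the v2.1-era `EXPERIMENTAL CONFIGURE ZONE` statements rather than taken from the hidden diff:

```sql
-- Give a secondary index its own zone config, separate from the table's.
-- Index name and replica count are hypothetical.
ALTER INDEX rides@rides_city_idx
    EXPERIMENTAL CONFIGURE ZONE 'num_replicas: 5';
```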

2 changes: 1 addition & 1 deletion v2.0/create-table.md
@@ -88,7 +88,7 @@ By default, tables are created in the default replication zone but can be placed

## Row-Level Replication <span class="version-tag">New in v2.0</span>

-CockroachDB allows [enterprise users](enterprise-licensing.html) to [define table partitions](partitioning.html), thus providing row-level control of how and where the data is stored. See [Create a Replication Zone for a Table Partition](configure-replication-zones.html#create-a-replication-zone-for-a-table-partition-new-in-v2-0) for more information.
+CockroachDB allows [enterprise users](enterprise-licensing.html) to [define table partitions](partitioning.html), thus providing row-level control of how and where the data is stored. See [Create a Replication Zone for a Table Partition](configure-replication-zones.html#create-a-replication-zone-for-a-table-or-secondary-index-partition-new-in-v2-0) for more information.

{{site.data.alerts.callout_info}}The primary key required for partitioning is different from the conventional primary key. To define the primary key for partitioning, prefix the unique identifier(s) in the primary key with all columns you want to partition and subpartition the table on, in the order in which you want to nest your subpartitions. See <a href=partitioning.html#partition-using-primary-key>Partition using Primary Key</a> for more details.{{site.data.alerts.end}}

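A minimal sketch of the primary-key requirement in the callout above: the partition column `city` prefixes the unique identifier `id` (the table and partition values are hypothetical):

```sql
CREATE TABLE users (
    city STRING NOT NULL,
    id UUID NOT NULL DEFAULT gen_random_uuid(),
    name STRING,
    -- Columns to partition on must come first in the primary key.
    PRIMARY KEY (city, id)
) PARTITION BY LIST (city) (
    PARTITION us_west VALUES IN ('seattle', 'los angeles'),
    PARTITION us_east VALUES IN ('new york', 'boston')
);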
3 changes: 1 addition & 2 deletions v2.0/performance-tuning.md
@@ -2,7 +2,6 @@
title: Performance Tuning
summary: Essential techniques for getting fast reads and writes in a single- and multi-region CockroachDB deployment.
toc: true

---

This tutorial shows you essential techniques for getting fast reads and writes in CockroachDB, starting with a single-region deployment and expanding into multiple regions.
@@ -1777,7 +1776,7 @@ For this service, the most effective technique for improving read and write late
The `rides` table contains 1 million rows, so dropping this index will take a few minutes.
{{site.data.alerts.end}}

-7. Now [create replication zones](configure-replication-zones.html#create-a-replication-zone-for-a-table-partition-new-in-v2-0) to require city data to be stored on specific nodes based on node locality.
+7. Now [create replication zones](configure-replication-zones.html#create-a-replication-zone-for-a-table-or-secondary-index-partition-new-in-v2-0) to require city data to be stored on specific nodes based on node locality.

City | Locality
-----|---------
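The city-to-locality table above is truncated in this view. As a sketch of what this step sets up, a partition's replicas can be constrained to nodes whose locality matches its city; the partition and locality names are hypothetical, and the syntax follows the v2.1-era statements (the v2.0 tutorial itself uses the `cockroach zone set` CLI):

```sql
-- Keep the new_york partition's replicas on nodes started with
-- --locality=...,zone=us-east1-b. Names are illustrative.
ALTER PARTITION new_york OF TABLE rides
    EXPERIMENTAL CONFIGURE ZONE 'constraints: [+zone=us-east1-b]';
```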
2 changes: 1 addition & 1 deletion v2.0/recommended-production-settings.md
@@ -56,7 +56,7 @@ Term | Definition

- For best resilience:
- Use many smaller nodes instead of fewer larger ones. Recovery from a failed node is faster when data is spread across more nodes.
-- Use [zone configs](configure-replication-zones.html) to increase the replication factor from 3 (the default) to 5. This is especially recommended if you are using local disks rather than a cloud providers' network-attached disks that are often replicated underneath the covers, because local disks have a greater risk of failure. You can do this for the [entire cluster](configure-replication-zones.html#edit-the-default-replication-zone) or for specific [databases](configure-replication-zones.html#create-a-replication-zone-for-a-database), [tables](configure-replication-zones.html#create-a-replication-zone-for-a-table), or [rows](configure-replication-zones.html#create-a-replication-zone-for-a-table-partition-new-in-v2-0) (enterprise-only).
+- Use [zone configs](configure-replication-zones.html) to increase the replication factor from 3 (the default) to 5. This is especially recommended if you are using local disks rather than a cloud provider's network-attached disks, which are often replicated under the covers, because local disks have a greater risk of failure. You can do this for the [entire cluster](configure-replication-zones.html#edit-the-default-replication-zone) or for specific [databases](configure-replication-zones.html#create-a-replication-zone-for-a-database), [tables](configure-replication-zones.html#create-a-replication-zone-for-a-table), or [rows](configure-replication-zones.html#create-a-replication-zone-for-a-table-or-secondary-index-partition-new-in-v2-0) (enterprise-only).
{{site.data.alerts.callout_danger}}
{% include {{page.version.version}}/known-limitations/system-range-replication.md %}
{{site.data.alerts.end}}
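As a sketch of the resilience recommendation in the bullet above, raising the default replication factor to 5 looks roughly like this (v2.1-era SQL; in v2.0 the same change is made with the `cockroach zone set` CLI):

```sql
-- Raise the cluster-wide default replication factor from 3 to 5.
ALTER RANGE default EXPERIMENTAL CONFIGURE ZONE 'num_replicas: 5';
```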
9 changes: 5 additions & 4 deletions v2.1/architecture/distribution-layer.md
@@ -10,10 +10,9 @@ The distribution layer of CockroachDB's architecture provides a unified view of
If you haven't already, we recommend reading the [Architecture Overview](overview.html).
{{site.data.alerts.end}}


## Overview

-To make all data in your cluster accessible from any node, CockroachDB stores data in a monolithic sorted map of key-value pairs. This keyspace describes all of the data in your cluster, as well as its location, and is divided into what we call "ranges", contiguous chunks of the keyspace, so that every key can always be found in a single range.
+To make all data in your cluster accessible from any node, CockroachDB stores data in a monolithic sorted map of key-value pairs. This key-space describes all of the data in your cluster, as well as its location, and is divided into what we call "ranges", contiguous chunks of the key-space, so that every key can always be found in a single range.

CockroachDB implements a sorted map to enable:

@@ -41,9 +40,11 @@ Each node caches values of the `meta2` range it has accessed before, which optim

After the node's meta ranges is the KV data your cluster stores.

-This data is broken up into 64MiB sections of contiguous key-space known as ranges. This size represents a sweet spot for us between a size that's small enough to move quickly between nodes, but large enough to store a meaningfully contiguous set of data whose keys are more likely to be accessed together. These ranges are then shuffled around your cluster to ensure survivability.
+Each table and its secondary indexes initially map to a single range, where each key-value pair in the range represents a single row in the table (also called the primary index because the table is sorted by the primary key) or a single row in a secondary index. As soon as a range reaches 64 MiB in size, it splits into two ranges. This process continues as a table and its indexes continue growing. Once a table is split across multiple ranges, it's likely that the table and secondary indexes will be stored in separate ranges. However, a range can still contain data for both the table and a secondary index.
+
+The default 64 MiB range size is a sweet spot for us: small enough for ranges to move quickly between nodes, but large enough to store a meaningfully contiguous set of data whose keys are more likely to be accessed together. These ranges are then shuffled around your cluster to ensure survivability.

-These ranges are replicated (in the aptly named replication layer), and have the addresses of each replica stored in the `meta2` range.
+These table ranges are replicated (in the aptly named replication layer), and have the addresses of each replica stored in the `meta2` range.
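Since a secondary index's rows can end up in ranges separate from the table's primary index, they can also be inspected independently. A sketch, assuming a hypothetical `users` table with a `users_name_idx` index:

```sql
-- List the ranges backing a secondary index; once the table has split,
-- these are often distinct from the primary index's ranges.
SHOW EXPERIMENTAL_RANGES FROM INDEX users@users_name_idx;
```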

### Using the monolithic sorted map
