Merge branch 'master' into linasn/bootstrap-profiler
* master:
  [DOCS] Update to cluster docs (#3084)
  [dbnode][coordinator] Ensure docs limit is propagated for search and aggregate RPCs (#3108)
  [query] Take bounds into account for list endpoints (#3110)
  Add warning to changing blocksize (#3096)
  Add support for dynamic query limit overriding (#3090)
  [tests] test setups exported to allow us to use it from other packages (#3042)
  [query] Implemented Graphite's pow function (#3048)
  [dbnode] Direct conversion of encoded tags to doc.Metadata (#3087)
  [tests] Skip flaky TestWatchNoLeader (#3106)
  [dbnode] Faster search of tag bytes in convert.FromSeriesIDAndTags (#3075)
  Replace bytes.Compare() == 0 with bytes.Equal() (#3101)
  Capture seekerMgr instead Rlock (#3104)
  [m3db] Check bloom filter before stream request allocation (#3103)
soundvibe committed Jan 22, 2021
2 parents f21f851 + 1125eb4 commit f700c04
Showing 67 changed files with 3,857 additions and 551 deletions.
1 change: 1 addition & 0 deletions go.mod
@@ -110,6 +110,7 @@ require (
golang.org/x/sys v0.0.0-20201009025420-dfb3f7c4e634
golang.org/x/tools v0.0.0-20201013201025-64a9e34f3752
google.golang.org/grpc v1.29.1
google.golang.org/protobuf v1.23.0
gopkg.in/go-ini/ini.v1 v1.57.0 // indirect
gopkg.in/go-playground/assert.v1 v1.2.1 // indirect
gopkg.in/go-playground/validator.v9 v9.7.0
56 changes: 1 addition & 55 deletions site/content/cluster/binaries_cluster.md
@@ -10,30 +10,7 @@ This guide shows you the steps involved in creating an M3 cluster using M3 binar
This guide assumes you have read the [quickstart](/docs/quickstart/binaries), and builds upon the concepts in that guide.
{{% /notice %}}

## M3 Architecture

Here's a typical M3 deployment:

<!-- TODO: Update image -->

![Typical Deployment](/cluster_architecture.png)

An M3 deployment typically has two main node types:

- **Coordinator node**: `m3coordinator` nodes coordinate reads and writes across all nodes in the cluster. The coordinator is a lightweight process and does not store any data. This role typically runs alongside a Prometheus instance, or as part of a collector agent such as StatsD.
- **Storage node**: The `m3dbnode` processes are the workhorses of M3; they store data and serve reads and writes.

An `m3coordinator` node exposes two ports:

- `7201` to manage the cluster topology; you make most API calls to this port
- `7203` for Prometheus to scrape the metrics produced by M3DB and M3Coordinator
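
For example, assuming a coordinator running locally on these default ports, you can reach both:

```shell
# Cluster management API on port 7201 (here, reading the current placement)
curl http://localhost:7201/api/v1/services/m3db/placement

# Metrics endpoint on port 7203, scraped by Prometheus
curl http://localhost:7203/metrics
```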

## Prerequisites

M3 uses [etcd](https://etcd.io/) as a distributed key-value store for the following functions:

- Update cluster configuration in real time
- Manage placements for distributed and sharded clusters
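
As a quick sanity check before continuing (a sketch that assumes etcd listens on its default client port 2379 on localhost), you can verify the cluster responds:

```shell
# Confirm the etcd cluster is reachable on the default client port
etcdctl --endpoints=http://127.0.0.1:2379 endpoint health
```
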
{{< fileinclude file="cluster-architecture.md" >}}

## Download and Install a Binary

@@ -52,8 +29,6 @@ You can download the latest release as [pre-compiled binaries from the M3 GitHub

## Provision a Host

Enough background. Let's create a real cluster!

M3 in production can run on local or cloud-based VMs, or bare-metal servers. M3 supports all popular Linux distributions (Ubuntu, RHEL, CentOS); [let us know](https://github.com/m3db/m3/issues/new/choose) if you have any issues with your preferred distribution.

### Network
@@ -236,35 +211,6 @@ curl -X POST {{% apiendpoint %}}database/create -d '{

If you need to set up multiple namespaces, you can run the command above multiple times with different namespace configurations.
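
For example, a hypothetical second namespace with a longer retention might look like the following (the namespace name and retention are placeholders, and any host or shard settings from your original request are omitted here):

```shell
# Hypothetical second namespace with a longer retention period.
# Namespace name and retention are placeholders; include the same host and
# shard settings you used in the original database/create request.
curl -X POST {{% apiendpoint %}}database/create -d '{
  "type": "cluster",
  "namespaceName": "metrics_long_term",
  "retentionTime": "720h"
}'
```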

### Ready a Namespace
<!-- TODO: Why?> -->
Once a namespace has finished bootstrapping, you must mark it as ready to receive traffic by using the _{{% apiendpoint %}}namespace/ready_ endpoint.

{{< tabs name="ready_namespaces" >}}
{{% tab name="Command" %}}

{{< codeinclude file="docs/includes/quickstart/ready-namespace.sh" language="shell" >}}

{{% /tab %}}
{{% tab name="Output" %}}

```json
{
"ready": true
}
```

{{% /tab %}}
{{< /tabs >}}

### Replication factor

We recommend a replication factor of **3**, with each replica spread across failure domains such as a physical server rack, data center or availability zone. Read our [replication factor recommendations](/docs/operational_guide/replication_and_deployment_in_zones) for more details.

### Shards

Read the [placement configuration guide](/docs/operational_guide/placement_configuration) to determine the appropriate number of shards to specify.

{{< fileinclude file="cluster-common-steps.md" >}}

<!-- ## Next Steps
288 changes: 279 additions & 9 deletions site/content/cluster/kubernetes_cluster.md
@@ -14,28 +14,30 @@ This guide assumes you have read the [quickstart](/docs/quickstart/docker), and
We recommend you use [our Kubernetes operator](/docs/operator/operator) to deploy M3 to a cluster. It is a more streamlined setup that uses [custom resource definitions](https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/) to automatically handle operations such as managing cluster placements.
{{% /notice %}}

{{< fileinclude file="cluster-architecture.md" >}}

## Prerequisites

- A running Kubernetes cluster.
- For local testing, you can use [minikube](https://kubernetes.io/docs/tasks/tools/install-minikube/), [Docker desktop](https://www.docker.com/products/docker-desktop), or [a script we provide](https://raw.githubusercontent.com/m3db/m3db-operator/master/scripts/kind-create-cluster.sh) to start a three-node cluster with [Kind](https://kind.sigs.k8s.io/docs/user/quick-start/), as shown below.
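
For example, either of the following brings up a small local cluster to test against (the cluster name and memory setting are arbitrary choices, not requirements):

```shell
# Option 1: a single-node minikube cluster
minikube start --memory=4096

# Option 2: a Kind cluster (the linked script creates a similar 3-node setup)
kind create cluster --name m3-local
```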

## Create An etcd Cluster
{{% notice note %}}
The rest of this guide uses minikube; you may need to change some of the steps to suit your local cluster.
{{% /notice %}}

## Create an etcd Cluster

M3 stores its cluster placements and runtime metadata in [etcd](https://etcd.io) and needs a running etcd cluster to communicate with.

We have example services and stateful sets you can use, but feel free to use your own configuration and change any later instructions accordingly.

```shell
kubectl apply -f https://raw.githubusercontent.com/m3db/m3db-operator/master/example/etcd/etcd-minikube.yaml
kubectl apply -f https://raw.githubusercontent.com/m3db/m3db-operator/master/example/etcd/etcd-basic.yaml
```

If the etcd cluster is running on your local machine, update your _/etc/hosts_ file to match the domains specified in the `etcd` `--initial-cluster` argument. For example, to match the `StatefulSet` declaration in the _etcd-minikube.yaml_ file above, add:

```text
$(minikube ip) etcd-0.etcd
$(minikube ip) etcd-1.etcd
$(minikube ip) etcd-2.etcd
```
{{% notice tip %}}
Depending on what you use to run a cluster on your local machine, you may need to update your _/etc/hosts_ file to match the domains specified in the `etcd` `--initial-cluster` argument. For example, to match the `StatefulSet` declaration in the _etcd-minikube.yaml_ file above, these are `etcd-0.etcd`, `etcd-1.etcd`, and `etcd-2.etcd`.
{{% /notice %}}
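
For example, with minikube you might append those entries as follows (a sketch; using `sudo` and three etcd pods is an assumption about your local setup):

```shell
# Map the etcd pod hostnames to the minikube VM's IP address
for i in 0 1 2; do
  echo "$(minikube ip) etcd-${i}.etcd" | sudo tee -a /etc/hosts
done
```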

Verify that the cluster is running with something like the Kubernetes dashboard, or the command below:
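
The check itself is collapsed in this view; a simple way to verify (assuming the example manifest labels its etcd pods with `app=etcd`) is:

```shell
# List the etcd pods; all three should eventually show READY 1/1
kubectl get pods -l app=etcd
```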

@@ -83,4 +85,272 @@ kubectl delete m3dbcluster simple-cluster

By default, the operator uses [finalizers](https://kubernetes.io/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions/#finalizers) to delete the placement and namespaces associated with a cluster before deleting the custom resources themselves. If you do not want this behavior, set `keepEtcdDataOnDelete` to `true` in the cluster configuration.
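
For example (a sketch that assumes the field sits directly under the cluster `spec`), you could set it on an existing cluster with `kubectl patch`:

```shell
# Keep etcd placement/namespace data when the cluster resource is deleted
kubectl patch m3dbcluster simple-cluster --type merge \
  -p '{"spec": {"keepEtcdDataOnDelete": true}}'
```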

<!-- TODO: Placement, same as Binaries? -->


## Organizing Data with Placements and Namespaces

A time series database (TSDB) typically consists of a single node (or instance) that stores metrics data. This setup is simple to use but struggles to scale over time as the quantity of metrics data written and read increases.

As a distributed TSDB, M3 helps solve this problem by spreading metrics data, and demand for that data, across multiple nodes in a cluster. M3 does this by splitting the data into shards and distributing those shards across the nodes in the cluster.

<!-- TODO: Find an image -->

If you've worked with a distributed database before, these concepts are probably familiar to you, but M3 uses different terminology for some of them.

- Every cluster has **one** placement that maps shards to nodes in the cluster.
- A cluster can have **0 or more** namespaces that are similar conceptually to tables in other databases, and each node serves every namespace for the shards it owns.

<!-- TODO: Image -->

For example, if the cluster placement states that node A owns shards 1, 2, and 3, then node A owns shards 1, 2, 3 for all configured namespaces in the cluster. Each namespace has its own configuration options, including a name and retention time for the data.

## Create a Placement and Namespace

This guide uses the _{{% apiendpoint %}}database/create_ endpoint, which creates a namespace and, based on the `type` argument, also creates the placement if it doesn't already exist.

You can create [placements](/docs/operational_guide/placement_configuration/) and [namespaces](/docs/operational_guide/namespace_configuration/#advanced-hard-way) separately if you need more control over their settings.
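
The included command isn't shown inline in this diff; a minimal sketch of the kind of request it makes (the `type`, namespace name, and retention below are assumptions matching the example output further down, not necessarily the script's exact contents) looks like this:

```shell
# A hypothetical database/create request; the real included script may differ.
# "local" creates a single-node placement like the example output below.
curl -X POST {{% apiendpoint %}}database/create -d '{
  "type": "local",
  "namespaceName": "default",
  "retentionTime": "12h"
}' | jq .
```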

In another terminal, use the following command.

{{< tabs name="create_placement_namespace" >}}
{{< tab name="Command" >}}

{{< codeinclude file="docs/includes/create-database.sh" language="shell" >}}

{{< /tab >}}
{{% tab name="Output" %}}

```json
{
"namespace": {
"registry": {
"namespaces": {
"default": {
"bootstrapEnabled": true,
"flushEnabled": true,
"writesToCommitLog": true,
"cleanupEnabled": true,
"repairEnabled": false,
"retentionOptions": {
"retentionPeriodNanos": "43200000000000",
"blockSizeNanos": "1800000000000",
"bufferFutureNanos": "120000000000",
"bufferPastNanos": "600000000000",
"blockDataExpiry": true,
"blockDataExpiryAfterNotAccessPeriodNanos": "300000000000",
"futureRetentionPeriodNanos": "0"
},
"snapshotEnabled": true,
"indexOptions": {
"enabled": true,
"blockSizeNanos": "1800000000000"
},
"schemaOptions": null,
"coldWritesEnabled": false,
"runtimeOptions": null
}
}
}
},
"placement": {
"placement": {
"instances": {
"m3db_local": {
"id": "m3db_local",
"isolationGroup": "local",
"zone": "embedded",
"weight": 1,
"endpoint": "127.0.0.1:9000",
"shards": [
{
"id": 0,
"state": "INITIALIZING",
"sourceId": "",
"cutoverNanos": "0",
"cutoffNanos": "0"
},
{
"id": 63,
"state": "INITIALIZING",
"sourceId": "",
"cutoverNanos": "0",
"cutoffNanos": "0"
}
],
"shardSetId": 0,
"hostname": "localhost",
"port": 9000,
"metadata": {
"debugPort": 0
}
}
},
"replicaFactor": 1,
"numShards": 64,
"isSharded": true,
"cutoverTime": "0",
"isMirrored": false,
"maxShardSetId": 0
},
"version": 0
}
}
```

{{% /tab %}}
{{< /tabs >}}

Placement initialization can take a minute or two. Once all the shards have the `AVAILABLE` state, the node has finished bootstrapping, and you should see the following messages in the node console output.

<!-- TODO: Fix these timestamps -->

```shell
{"level":"info","ts":1598367624.0117292,"msg":"bootstrap marking all shards as bootstrapped","namespace":"default","namespace":"default","numShards":64}
{"level":"info","ts":1598367624.0301404,"msg":"bootstrap index with bootstrapped index segments","namespace":"default","numIndexBlocks":0}
{"level":"info","ts":1598367624.0301914,"msg":"bootstrap success","numShards":64,"bootstrapDuration":0.049208827}
{"level":"info","ts":1598367624.03023,"msg":"bootstrapped"}
```

You can check on the status by calling the _{{% apiendpoint %}}services/m3db/placement_ endpoint:

{{< tabs name="check_placement" >}}
{{% tab name="Command" %}}

```shell
curl {{% apiendpoint %}}services/m3db/placement | jq .
```

{{% /tab %}}
{{% tab name="Output" %}}

```json
{
"placement": {
"instances": {
"m3db_local": {
"id": "m3db_local",
"isolationGroup": "local",
"zone": "embedded",
"weight": 1,
"endpoint": "127.0.0.1:9000",
"shards": [
{
"id": 0,
"state": "AVAILABLE",
"sourceId": "",
"cutoverNanos": "0",
"cutoffNanos": "0"
},
{
"id": 63,
"state": "AVAILABLE",
"sourceId": "",
"cutoverNanos": "0",
"cutoffNanos": "0"
}
],
"shardSetId": 0,
"hostname": "localhost",
"port": 9000,
"metadata": {
"debugPort": 0
}
}
},
"replicaFactor": 1,
"numShards": 64,
"isSharded": true,
"cutoverTime": "0",
"isMirrored": false,
"maxShardSetId": 0
},
"version": 2
}
```

{{% /tab %}}
{{< /tabs >}}

{{% notice tip %}}
[Read more about the bootstrapping process](/docs/operational_guide/bootstrapping_crash_recovery/).
{{% /notice %}}

### Ready a Namespace
<!-- TODO: Why?> -->
Once a namespace has finished bootstrapping, you must mark it as ready to receive traffic by using the _{{% apiendpoint %}}services/m3db/namespace/ready_ endpoint.
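
The included script isn't shown inline here; the request it makes is essentially a POST naming the namespace (a sketch, assuming the namespace is called `default`):

```shell
# Mark the "default" namespace as ready to receive traffic
curl -X POST {{% apiendpoint %}}services/m3db/namespace/ready -d '{
  "name": "default"
}'
```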

{{< tabs name="ready_namespaces" >}}
{{% tab name="Command" %}}

{{% codeinclude file="docs/includes/quickstart/ready-namespace.sh" language="shell" %}}

{{% /tab %}}
{{% tab name="Output" %}}

```json
{
"ready": true
}
```

{{% /tab %}}
{{< /tabs >}}

### View Details of a Namespace

You can also view the attributes of all namespaces by calling the _{{% apiendpoint %}}services/m3db/namespace_ endpoint:

{{< tabs name="check_namespaces" >}}
{{% tab name="Command" %}}

```shell
curl {{% apiendpoint %}}services/m3db/namespace | jq .
```

{{% notice tip %}}
Add `?debug=1` to the request to convert the nanosecond values in the output into human-readable units.
{{% /notice %}}
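
For example:

```shell
# Same request, with nanosecond values converted to readable units
curl "{{% apiendpoint %}}services/m3db/namespace?debug=1" | jq .
```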

{{% /tab %}}
{{% tab name="Output" %}}

```json
{
"registry": {
"namespaces": {
"default": {
"bootstrapEnabled": true,
"flushEnabled": true,
"writesToCommitLog": true,
"cleanupEnabled": true,
"repairEnabled": false,
"retentionOptions": {
"retentionPeriodNanos": "43200000000000",
"blockSizeNanos": "1800000000000",
"bufferFutureNanos": "120000000000",
"bufferPastNanos": "600000000000",
"blockDataExpiry": true,
"blockDataExpiryAfterNotAccessPeriodNanos": "300000000000",
"futureRetentionPeriodNanos": "0"
},
"snapshotEnabled": true,
"indexOptions": {
"enabled": true,
"blockSizeNanos": "1800000000000"
},
"schemaOptions": null,
"coldWritesEnabled": false,
"runtimeOptions": null
}
}
}
}
```

{{% /tab %}}
{{< /tabs >}}

{{< fileinclude file="cluster-common-steps.md" >}}