diff --git a/deploy/aws/.gitignore b/deploy/aws/.gitignore index 2fa90f5444..e460c42302 100644 --- a/deploy/aws/.gitignore +++ b/deploy/aws/.gitignore @@ -3,3 +3,4 @@ credentials/ terraform.tfstate terraform.tfstate.backup .terraform.tfstate.lock.info +kubeconfig_*.yaml \ No newline at end of file diff --git a/deploy/aws/README.md b/deploy/aws/README.md index 65aaecb9a2..5068e0ec94 100644 --- a/deploy/aws/README.md +++ b/deploy/aws/README.md @@ -41,10 +41,10 @@ Before deploying a TiDB cluster on AWS EKS, make sure the following requirements The default setup will create a new VPC and a t2.micro instance as bastion machine, and an EKS cluster with the following ec2 instances as worker nodes: -* 3 m5d.xlarge instances for PD -* 3 i3.2xlarge instances for TiKV -* 2 c4.4xlarge instances for TiDB -* 1 c5.xlarge instance for monitor +* 3 m5.large instances for PD +* 3 c5d.4xlarge instances for TiKV +* 2 c5.4xlarge instances for TiDB +* 1 c5.2xlarge instance for monitor Use the following commands to set up the cluster: @@ -76,7 +76,7 @@ monitor_endpoint = http://abd299cc47af411e98aae02938da0762-1989524000.us-east-2. region = us-east-2 tidb_dns = abd2e3f7c7af411e98aae02938da0762-17499b76b312be02.elb.us-east-2.amazonaws.com tidb_port = 4000 -tidb_version = v3.0.0-rc.1 +tidb_version = v3.0.0 ``` > **Note:** You can use the `terraform output` command to get the output again. @@ -86,7 +86,7 @@ tidb_version = v3.0.0-rc.1 To access the deployed TiDB cluster, use the following commands to first `ssh` into the bastion machine, and then connect it via MySQL client (replace the `<>` parts with values from the output): ``` shell -ssh -i credentials/k8s-prod-.pem ec2-user@ +ssh -i credentials/.pem ec2-user@ mysql -h -P -u root ``` @@ -118,12 +118,12 @@ The initial Grafana login credentials are: To upgrade the TiDB cluster, edit the `variables.tf` file with your preferred text editor and modify the `tidb_version` variable to a higher version, and then run `terraform apply`. -For example, to upgrade the cluster to version 3.0.0-rc.1, modify the `tidb_version` to `v3.0.0-rc.2`: +For example, to upgrade the cluster to version 3.0.0, modify the `tidb_version` to `v3.0.0`: ``` variable "tidb_version" { description = "tidb cluster version" - default = "v3.0.0-rc.2" + default = "v3.0.0" } ``` @@ -131,12 +131,12 @@ For example, to upgrade the cluster to version 3.0.0-rc.1, modify the `tidb_vers ## Scale -To scale the TiDB cluster, edit the `variables.tf` file with your preferred text editor and modify the `tikv_count` or `tidb_count` variable to your desired count, and then run `terraform apply`. +To scale the TiDB cluster, edit the `variables.tf` file with your preferred text editor and modify the `default_cluster_tikv_count` or `default_cluster_tidb_count` variable to your desired count, and then run `terraform apply`. For example, to scale out the cluster, you can modify the number of TiDB instances from 2 to 3: ``` - variable "tidb_count" { + variable "default_cluster_tidb_count" { default = 4 } ``` @@ -145,7 +145,7 @@ For example, to scale out the cluster, you can modify the number of TiDB instanc ## Customize -You can change default values in `variables.tf` (such as the cluster name and image versions) as needed. +You can change default values in `variables.tf` (such as the default cluster name and image versions) as needed.
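As a small illustrative sketch (not part of the original setup), you can also keep such overrides in a `terraform.tfvars` file instead of editing `variables.tf` in place; Terraform loads this file automatically on `terraform apply`. The variable names below are the ones referenced by `clusters.tf` in this module, and the concrete values are placeholders:

```hcl
# terraform.tfvars -- example overrides for the default cluster
default_cluster_name       = "my-cluster"
default_cluster_version    = "v3.0.0"
default_cluster_tidb_count = 3
default_cluster_tikv_count = 5
```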
### Customize AWS related resources @@ -159,12 +159,91 @@ The TiDB version and component count are also configurable in variables.tf, you Currently, the instance type of TiDB cluster component is not configurable because PD and TiKV relies on [NVMe SSD instance store](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ssd-instance-store.html), different instance types have different disks. -### Customize TiDB parameters +### Customize TiDB Cluster -Currently, there are not many customizable TiDB parameters. And there are two ways to customize the parameters: +The values file ([`./tidb-cluster/values/default.yaml`](./tidb-cluster/values/default.yaml)) provides proper defaults for TiDB clusters in EKS. You can specify an overriding values file in [`clusters.tf`](./clusters.tf) for each TiDB cluster. Values in this file will override the default ones. -* Before deploying the cluster, you can directly modify the `templates/tidb-cluster-values.yaml.tpl` file and then deploy the cluster with customized configs. -* After the cluster is running, you must run `terraform apply` again every time you make changes to the `templates/tidb-cluster-values.yaml.tpl` file, or the cluster will still be using old configs. +For example, the default cluster specifies `./default-cluster.yaml` as the overriding values file, and enables the ConfigMap rollout feature in this file. + +In EKS, some values cannot be customized in the values file as usual, including the cluster version, replicas, node selectors, and taints. These variables are controlled by Terraform instead, in favor of consistency. To customize these variables, you can edit [`clusters.tf`](./clusters.tf) and change the variables of each `./tidb-cluster` module directly. + +### Customize TiDB Operator + +You can customize TiDB Operator by specifying a Helm values file through the `operator_values` variable. For example: + +```hcl +variable "operator_values" { + description = "The helm values of TiDB Operator" + default = file("operator_values.yaml") +} +``` + +## Multiple TiDB Cluster Management + +An instance of the `./tidb-cluster` module corresponds to a TiDB cluster in the EKS cluster. If you want to add a new TiDB cluster, you can edit `./clusters.tf` and add a new instance of the `./tidb-cluster` module: + +```hcl +module "example-cluster" { + source = "./tidb-cluster" + + # The target EKS, required + eks = local.default_eks + # The subnets of node pools of this TiDB cluster, required + subnets = local.default_subnets + # TiDB cluster name, required + cluster_name = "example-cluster" + + # Helm values file + override_values = file("example-cluster.yaml") + # TiDB cluster version + cluster_version = "v3.0.0" + # SSH key of cluster nodes + ssh_key_name = module.key-pair.key_name + # PD replica number + pd_count = 3 + # PD instance type + pd_instance_type = "t2.xlarge" + # TiKV replica number + tikv_count = 3 + # TiKV instance type + tikv_instance_type = "t2.xlarge" + # The storage class used by TiKV; if the TiKV instance type does not have a local SSD, you should change it to another storage class + # TiDB replica number + tidb_count = 2 + # TiDB instance type + tidb_instance_type = "t2.xlarge" + # Monitor instance type + monitor_instance_type = "t2.xlarge" + # The version of tidb-cluster helm chart + tidb_cluster_chart_version = "v1.0.0-beta.3" +} + +module "other-cluster" { + source = "./tidb-cluster" + + cluster_name = "other-cluster" + override_values = file("other-cluster.yaml") + #...... +} +``` + +> **Note:** +> +> The `cluster_name` of each cluster must be unique.
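For reference, an overriding values file such as the `example-cluster.yaml` mentioned above is a plain Helm values file for the tidb-cluster chart; whatever it sets is merged on top of `./tidb-cluster/values/default.yaml`. A minimal sketch, using only keys that already appear in this module's default values and in `default-cluster.yaml` (the concrete values are placeholders):

```yaml
# example-cluster.yaml -- overrides merged on top of ./tidb-cluster/values/default.yaml
enableConfigMapRollout: true
tidb:
  logLevel: warn
monitor:
  storage: 200Gi
```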
+ +You can refer to [./tidb-cluster/variables.tf](./tidb-cluster/variables.tf) for the complete configuration reference of the `./tidb-cluster` module. + +You can get the DNS names of the TiDB service and the Grafana service via kubectl. If you want terraform to print this information like the `default-cluster`, you can add `output` sections in `outputs.tf`: + +```hcl +output "example-cluster_tidb-dns" { + value = module.example-cluster.tidb_dns +} + +output "example-cluster_monitor-dns" { + value = module.example-cluster.monitor_dns +} +``` ## Destroy @@ -174,4 +253,35 @@ It may take some while to finish destroying the cluster. $ terraform destroy ``` -> **Note:** You have to manually delete the EBS volumes in AWS console after running `terraform destroy` if you do not need the data on the volumes anymore. +> **Note:** +> +> This will destroy your EKS cluster along with all the TiDB clusters you deployed on it. + +> **Note:** +> +> You have to manually delete the EBS volumes in the AWS console after running `terraform destroy` if you do not need the data on the volumes anymore. + +## Advanced Guide: Use the tidb-cluster and tidb-operator Modules + +Under the hood, this terraform module composes two sub-modules: + +- [tidb-operator](./tidb-operator/README.md), which provisions the Kubernetes control plane for TiDB clusters +- [tidb-cluster](./tidb-cluster/README.md), which provisions a TiDB cluster in the target Kubernetes cluster + +You can use these modules separately in your own terraform scripts, either by referencing them locally or by publishing them to your terraform module registry. + +For example, if you create a terraform module in `/deploy/aws/staging`, you can reference the tidb-operator and tidb-cluster modules as follows: + +```hcl +module "setup-control-plane" { + source = "../tidb-operator" +} + +module "tidb-cluster-a" { + source = "../tidb-cluster" +} + +module "tidb-cluster-b" { + source = "../tidb-cluster" +} +``` diff --git a/deploy/aws/aws-key-pair/main.tf b/deploy/aws/aws-key-pair/main.tf new file mode 100644 index 0000000000..875f0a7d22 --- /dev/null +++ b/deploy/aws/aws-key-pair/main.tf @@ -0,0 +1,43 @@ +locals { + public_key_filename = "${var.path}/${var.name}.pub" + private_key_filename = "${var.path}/${var.name}.pem" +} + +resource "tls_private_key" "generated" { + algorithm = "RSA" +} + +resource "aws_key_pair" "generated" { + key_name = var.name + public_key = tls_private_key.generated.public_key_openssh + + lifecycle { + ignore_changes = [key_name] + } +} + +resource "local_file" "public_key_openssh" { + count = var.path != "" ? 1 : 0 + content = tls_private_key.generated.public_key_openssh + filename = local.public_key_filename +} + +resource "local_file" "private_key_pem" { + count = var.path != "" ? 1 : 0 + content = tls_private_key.generated.private_key_pem + filename = local.private_key_filename +} + +resource "null_resource" "chmod" { + count = var.path != "" ?
1 : 0 + depends_on = [local_file.private_key_pem] + + triggers = { + key = tls_private_key.generated.private_key_pem + } + + provisioner "local-exec" { + command = "chmod 600 ${local.private_key_filename}" + } +} + diff --git a/deploy/aws/aws-key-pair/outputs.tf b/deploy/aws/aws-key-pair/outputs.tf new file mode 100644 index 0000000000..da32317b7e --- /dev/null +++ b/deploy/aws/aws-key-pair/outputs.tf @@ -0,0 +1,20 @@ +output "key_name" { + value = aws_key_pair.generated.key_name +} + +output "public_key_openssh" { + value = tls_private_key.generated.public_key_openssh +} + +output "private_key_pem" { + value = tls_private_key.generated.private_key_pem +} + +output "public_key_filepath" { + value = local.public_key_filename +} + +output "private_key_filepath" { + value = local.private_key_filename +} + diff --git a/deploy/aws/aws-key-pair/variables.tf b/deploy/aws/aws-key-pair/variables.tf new file mode 100644 index 0000000000..6f392c3440 --- /dev/null +++ b/deploy/aws/aws-key-pair/variables.tf @@ -0,0 +1,8 @@ +variable "name" { + description = "Unique name for the key, should also be a valid filename. This will prefix the public/private key." +} + +variable "path" { + description = "Path to a directory where the public and private key will be stored." + default = "" +} diff --git a/deploy/aws/aws-key-pair/versions.tf b/deploy/aws/aws-key-pair/versions.tf new file mode 100644 index 0000000000..ac97c6ac8e --- /dev/null +++ b/deploy/aws/aws-key-pair/versions.tf @@ -0,0 +1,4 @@ + +terraform { + required_version = ">= 0.12" +} diff --git a/deploy/aws/aws-tutorial.tfvars b/deploy/aws/aws-tutorial.tfvars index beba5bdc9c..6f71319587 100644 --- a/deploy/aws/aws-tutorial.tfvars +++ b/deploy/aws/aws-tutorial.tfvars @@ -1,11 +1,10 @@ -pd_instance_type = "c5d.large" -tikv_instance_type = "c5d.large" -tidb_instance_type = "c4.large" -monitor_instance_type = "c5.large" +default_cluster_pd_instance_type = "c5d.large" +default_cluster_tikv_instance_type = "c5d.large" +default_cluster_tidb_instance_type = "c4.large" +default_cluster_monitor_instance_type = "c5.large" -pd_count = 1 -tikv_count = 1 -tidb_count = 1 +default_cluster_pd_count = 1 +default_cluster_tikv_count = 1 +default_cluster_tidb_count = 1 -cluster_name = "aws_tutorial" -tikv_root_volume_size = "50" \ No newline at end of file +default_cluster_name = "aws-tutorial" diff --git a/deploy/aws/bastion.tf b/deploy/aws/bastion.tf new file mode 100644 index 0000000000..23ad7c1734 --- /dev/null +++ b/deploy/aws/bastion.tf @@ -0,0 +1,33 @@ +resource "aws_security_group" "ssh" { + name = "${var.eks_name}-bastion" + description = "Allow SSH access for bastion instance" + vpc_id = var.create_vpc ? module.vpc.vpc_id : var.vpc_id + ingress { + from_port = 22 + to_port = 22 + protocol = "tcp" + cidr_blocks = var.bastion_ingress_cidr + } + egress { + from_port = 0 + to_port = 0 + protocol = "-1" + cidr_blocks = ["0.0.0.0/0"] + } +} + +module "ec2" { + source = "terraform-aws-modules/ec2-instance/aws" + + version = "2.3.0" + name = "${var.eks_name}-bastion" + instance_count = var.create_bastion ?
1 : 0 + ami = data.aws_ami.amazon-linux-2.id + instance_type = var.bastion_instance_type + key_name = module.key-pair.key_name + associate_public_ip_address = true + monitoring = false + user_data = file("bastion-userdata") + vpc_security_group_ids = [aws_security_group.ssh.id] + subnet_ids = local.default_subnets +} \ No newline at end of file diff --git a/deploy/aws/charts/tidb-cluster b/deploy/aws/charts/tidb-cluster deleted file mode 120000 index 326d382104..0000000000 --- a/deploy/aws/charts/tidb-cluster +++ /dev/null @@ -1 +0,0 @@ -../../../charts/tidb-cluster \ No newline at end of file diff --git a/deploy/aws/charts/tidb-operator b/deploy/aws/charts/tidb-operator deleted file mode 120000 index a45f172da2..0000000000 --- a/deploy/aws/charts/tidb-operator +++ /dev/null @@ -1 +0,0 @@ -../../../charts/tidb-operator \ No newline at end of file diff --git a/deploy/aws/clusters.tf b/deploy/aws/clusters.tf new file mode 100644 index 0000000000..124a09edac --- /dev/null +++ b/deploy/aws/clusters.tf @@ -0,0 +1,59 @@ +resource "local_file" "kubeconfig" { + depends_on = [module.tidb-operator.eks] + sensitive_content = module.tidb-operator.eks.kubeconfig + filename = module.tidb-operator.eks.kubeconfig_filename +} + +# The helm provider for TiDB clusters must be configured at the top level, otherwise removing a cluster will fail because +# its helm provider configuration is removed along with it. +provider "helm" { + alias = "eks" + insecure = true + # service_account = "tiller" + install_tiller = false # currently this doesn't work, so we install tiller in the local-exec provisioner. See https://github.com/terraform-providers/terraform-provider-helm/issues/148 + kubernetes { + config_path = local_file.kubeconfig.filename + } +} + +# TiDB cluster declaration example +#module "example-cluster" { +# source = "./tidb-cluster" +# eks = local.default_eks +# subnets = local.default_subnets +# +# # NOTE: cluster_name cannot be changed after creation +# cluster_name = "demo-cluster" +# cluster_version = "v3.0.0" +# ssh_key_name = module.key-pair.key_name +# pd_count = 1 +# pd_instance_type = "t2.xlarge" +# tikv_count = 1 +# tikv_instance_type = "t2.xlarge" +# tidb_count = 1 +# tidb_instance_type = "t2.xlarge" +# monitor_instance_type = "t2.xlarge" +# # yaml file that is passed to helm to customize the release +# override_values = file("values/example.yaml") +#} + +module "default-cluster" { + providers = { + helm = "helm.eks" + } + source = "./tidb-cluster" + eks = local.default_eks + subnets = local.default_subnets + + cluster_name = var.default_cluster_name + cluster_version = var.default_cluster_version + ssh_key_name = module.key-pair.key_name + pd_count = var.default_cluster_pd_count + pd_instance_type = var.default_cluster_pd_instance_type + tikv_count = var.default_cluster_tikv_count + tikv_instance_type = var.default_cluster_tikv_instance_type + tidb_count = var.default_cluster_tidb_count + tidb_instance_type = var.default_cluster_tidb_instance_type + monitor_instance_type = var.default_cluster_monitor_instance_type + override_values = file("default-cluster.yaml") +} diff --git a/deploy/aws/data.tf b/deploy/aws/data.tf index ea7fbfc547..054c461d3a 100644 --- a/deploy/aws/data.tf +++ b/deploy/aws/data.tf @@ -1,53 +1,13 @@ -data "aws_availability_zones" "available" {} +data "aws_availability_zones" "available" { +} data "aws_ami" "amazon-linux-2" { - most_recent = true - - owners = ["amazon"] + most_recent = true - filter { - name = "name" - values = ["amzn2-ami-hvm-*-x86_64-gp2"] - } -} + owners =
["amazon"] -data "template_file" "tidb_cluster_values" { - template = "${file("${path.module}/templates/tidb-cluster-values.yaml.tpl")}" - vars { - cluster_version = "${var.tidb_version}" - pd_replicas = "${var.pd_count}" - tikv_replicas = "${var.tikv_count}" - tidb_replicas = "${var.tidb_count}" - monitor_enable_anonymous_user = "${var.monitor_enable_anonymous_user}" + filter { + name = "name" + values = ["amzn2-ami-hvm-*-x86_64-gp2"] } } - -# kubernetes provider can't use computed config_path right now, see issue: -# https://github.com/terraform-providers/terraform-provider-kubernetes/issues/142 -# so we don't use kubernetes provider to retrieve tidb and monitor connection info, -# instead we use external data source. -# data "kubernetes_service" "tidb" { -# depends_on = ["helm_release.tidb-cluster"] -# metadata { -# name = "tidb-cluster-${var.cluster_name}-tidb" -# namespace = "tidb" -# } -# } - -# data "kubernetes_service" "monitor" { -# depends_on = ["helm_release.tidb-cluster"] -# metadata { -# name = "tidb-cluster-${var.cluster_name}-grafana" -# namespace = "tidb" -# } -# } - -data "external" "tidb_service" { - depends_on = ["null_resource.wait-tidb-ready"] - program = ["bash", "-c", "kubectl --kubeconfig credentials/kubeconfig_${var.cluster_name} get svc -n tidb tidb-cluster-${var.cluster_name}-tidb -ojson | jq '.status.loadBalancer.ingress[0]'"] -} - -data "external" "monitor_service" { - depends_on = ["null_resource.wait-tidb-ready"] - program = ["bash", "-c", "kubectl --kubeconfig credentials/kubeconfig_${var.cluster_name} get svc -n tidb tidb-cluster-${var.cluster_name}-grafana -ojson | jq '.status.loadBalancer.ingress[0]'"] -} diff --git a/deploy/aws/default-cluster.yaml b/deploy/aws/default-cluster.yaml new file mode 100644 index 0000000000..00f5302c63 --- /dev/null +++ b/deploy/aws/default-cluster.yaml @@ -0,0 +1 @@ +enableConfigMapRollout: true \ No newline at end of file diff --git a/deploy/aws/main.tf b/deploy/aws/main.tf index 634c55a37c..014633de51 100644 --- a/deploy/aws/main.tf +++ b/deploy/aws/main.tf @@ -1,242 +1,51 @@ provider "aws" { - region = "${var.region}" + region = var.region } -module "key-pair" { - source = "cloudposse/key-pair/aws" - version = "0.3.2" - - name = "${var.cluster_name}" - namespace = "k8s" - stage = "prod" - ssh_public_key_path = "${path.module}/credentials/" - generate_ssh_key = "true" - private_key_extension = ".pem" - chmod_command = "chmod 600 %v" +locals { + default_subnets = split(",", var.create_vpc ? join(",", module.vpc.private_subnets) : join(",", var.subnets)) + default_eks = module.tidb-operator.eks } -resource "aws_security_group" "ssh" { - name = "${var.cluster_name}" - description = "Allow SSH access for bastion instance" - vpc_id = "${var.create_vpc ? 
module.vpc.vpc_id : var.vpc_id}" - ingress { - from_port = 22 - to_port = 22 - protocol = "tcp" - cidr_blocks = "${var.ingress_cidr}" - } - egress { - from_port = 0 - to_port = 0 - protocol = "-1" - cidr_blocks = ["0.0.0.0/0"] - } +module "key-pair" { + source = "./aws-key-pair" + name = var.eks_name + path = "${path.module}/credentials/" } module "vpc" { source = "terraform-aws-modules/vpc/aws" - version = "1.60.0" - name = "${var.cluster_name}" - cidr = "${var.vpc_cidr}" - create_vpc = "${var.create_vpc}" - azs = ["${data.aws_availability_zones.available.names[0]}", "${data.aws_availability_zones.available.names[1]}", "${data.aws_availability_zones.available.names[2]}"] - private_subnets = "${var.private_subnets}" - public_subnets = "${var.public_subnets}" + + version = "2.6.0" + name = var.eks_name + cidr = var.vpc_cidr + create_vpc = var.create_vpc + azs = data.aws_availability_zones.available.names + private_subnets = var.private_subnets + public_subnets = var.public_subnets enable_nat_gateway = true single_nat_gateway = true # The following tags are required for ELB private_subnet_tags = { - "kubernetes.io/cluster/${var.cluster_name}" = "shared" + "kubernetes.io/cluster/${var.eks_name}" = "shared" } public_subnet_tags = { - "kubernetes.io/cluster/${var.cluster_name}" = "shared" + "kubernetes.io/cluster/${var.eks_name}" = "shared" } vpc_tags = { - "kubernetes.io/cluster/${var.cluster_name}" = "shared" + "kubernetes.io/cluster/${var.eks_name}" = "shared" } } -module "ec2" { - source = "terraform-aws-modules/ec2-instance/aws" - version = "1.21.0" - name = "${var.cluster_name}-bastion" - instance_count = "${var.create_bastion ? 1:0}" - ami = "${data.aws_ami.amazon-linux-2.id}" - instance_type = "${var.bastion_instance_type}" - key_name = "${module.key-pair.key_name}" - associate_public_ip_address = true - monitoring = false - user_data = "${file("bastion-userdata")}" - vpc_security_group_ids = ["${aws_security_group.ssh.id}"] - subnet_ids = "${split(",", var.create_vpc ? join(",", module.vpc.public_subnets) : join(",", var.public_subnet_ids))}" - - tags = { - app = "tidb" - } -} +module "tidb-operator" { + source = "./tidb-operator" -module "eks" { - # source = "terraform-aws-modules/eks/aws" - # version = "2.3.1" - # We can not use cluster autoscaler for pod with local PV due to the limitations listed here: - # https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/FAQ.md#i-have-a-couple-of-pending-pods-but-there-was-no-scale-up - # so we scale out by updating auto-scaling-group desired_capacity directly via the patched version of aws eks module - source = "github.com/tennix/terraform-aws-eks?ref=v2.3.1-patch" - cluster_name = "${var.cluster_name}" - cluster_version = "${var.k8s_version}" + eks_name = var.eks_name + eks_version = var.eks_version + operator_version = var.operator_version config_output_path = "credentials/" - subnets = "${split(",", var.create_vpc ? join(",", module.vpc.private_subnets) : join(",", var.private_subnet_ids))}" - vpc_id = "${var.create_vpc ? 
module.vpc.vpc_id : var.vpc_id}" - - # instance types: https://aws.amazon.com/ec2/instance-types/ - # instance prices: https://aws.amazon.com/ec2/pricing/on-demand/ - - worker_groups = [ - { - # pd - name = "pd_worker_group" - key_name = "${module.key-pair.key_name}" - # WARNING: if you change instance type, you must also modify the corresponding disk mounting in pd-userdata.sh script - # instance_type = "c5d.xlarge" # 4c, 8G, 100G NVMe SSD - instance_type = "${var.pd_instance_type}" # m5d.xlarge 4c, 16G, 150G NVMe SSD - root_volume_size = "50" # rest NVMe disk for PD data - public_ip = false - kubelet_extra_args = "--register-with-taints=dedicated=pd:NoSchedule --node-labels=dedicated=pd" - asg_desired_capacity = "${var.pd_count}" - asg_max_size = "${var.pd_count + 2}" - additional_userdata = "${file("userdata.sh")}" - }, - { # tikv - name = "tikv_worker_group" - key_name = "${module.key-pair.key_name}" - # WARNING: if you change instance type, you must also modify the corresponding disk mounting in tikv-userdata.sh script - instance_type = "${var.tikv_instance_type}" # i3.2xlarge 8c, 61G, 1.9T NVMe SSD - root_volume_type = "gp2" - root_volume_size = "100" - public_ip = false - kubelet_extra_args = "--register-with-taints=dedicated=tikv:NoSchedule --node-labels=dedicated=tikv" - asg_desired_capacity = "${var.tikv_count}" - asg_max_size = "${var.tikv_count + 2}" - additional_userdata = "${file("userdata.sh")}" - }, - { # tidb - name = "tidb_worker_group" - key_name = "${module.key-pair.key_name}" - instance_type = "${var.tidb_instance_type}" # c4.4xlarge 16c, 30G - root_volume_type = "gp2" - root_volume_size = "100" - public_ip = false - kubelet_extra_args = "--register-with-taints=dedicated=tidb:NoSchedule --node-labels=dedicated=tidb" - asg_desired_capacity = "${var.tidb_count}" - asg_max_size = "${var.tidb_count + 2}" - }, - { # monitor - name = "monitor_worker_group" - key_name = "${module.key-pair.key_name}" - instance_type = "${var.monitor_instance_type}" # c5.xlarge 4c, 8G - root_volume_type = "gp2" - root_volume_size = "100" - public_ip = false - asg_desired_capacity = 1 - asg_max_size = 3 - } - ] - - worker_group_count = "4" - - tags = { - app = "tidb" - } -} - -# kubernetes and helm providers rely on EKS, but terraform provider doesn't support depends_on -# follow this link https://github.com/hashicorp/terraform/issues/2430#issuecomment-370685911 -# we have the following hack -resource "local_file" "kubeconfig" { - # HACK: depends_on for the helm and kubernetes provider - # Passing provider configuration value via a local_file - depends_on = ["module.eks"] - sensitive_content = "${module.eks.kubeconfig}" - filename = "${path.module}/credentials/kubeconfig_${var.cluster_name}" -} - -# kubernetes provider can't use computed config_path right now, see issue: -# https://github.com/terraform-providers/terraform-provider-kubernetes/issues/142 -# so we don't use kubernetes provider to retrieve tidb and monitor connection info, -# instead we use external data source. -# provider "kubernetes" { -# config_path = "${local_file.kubeconfig.filename}" -# } - -provider "helm" { - insecure = true - # service_account = "tiller" - # install_tiller = true # currently this doesn't work, so we install tiller in the local-exec provisioner. 
See https://github.com/terraform-providers/terraform-provider-helm/issues/148 - kubernetes { - config_path = "${local_file.kubeconfig.filename}" - } -} - -resource "null_resource" "setup-env" { - depends_on = ["module.eks"] - - provisioner "local-exec" { - working_dir = "${path.module}" - command = < 8, default thread pool size for coprocessors - # will be set to tikv.resources.limits.cpu * 0.8. - # readpoolCoprocessorConcurrency: 8 - - # scheduler's worker pool size, should increase it in heavy write cases, - # also should less than total cpu cores. - # storageSchedulerWorkerPoolSize: 4 - -tidb: - replicas: ${tidb_replicas} - # The secret name of root password, you can create secret with following command: - # kubectl create secret generic tidb-secret --from-literal=root= --namespace= - # If unset, the root password will be empty and you can set it after connecting - # passwordSecretName: tidb-secret - # initSql is the SQL statements executed after the TiDB cluster is bootstrapped. - # initSql: |- - # create database app; - image: "pingcap/tidb:${cluster_version}" - # Image pull policy. - imagePullPolicy: IfNotPresent - logLevel: info - preparedPlanCacheEnabled: false - preparedPlanCacheCapacity: 100 - # Enable local latches for transactions. Enable it when - # there are lots of conflicts between transactions. - txnLocalLatchesEnabled: false - txnLocalLatchesCapacity: "10240000" - # The limit of concurrent executed sessions. - tokenLimit: "1000" - # Set the memory quota for a query in bytes. Default: 32GB - memQuotaQuery: "34359738368" - # The limitation of the number for the entries in one transaction. - # If using TiKV as the storage, the entry represents a key/value pair. - # WARNING: Do not set the value too large, otherwise it will make a very large impact on the TiKV cluster. - # Please adjust this configuration carefully. - txnEntryCountLimit: "300000" - # The limitation of the size in byte for the entries in one transaction. - # If using TiKV as the storage, the entry represents a key/value pair. - # WARNING: Do not set the value too large, otherwise it will make a very large impact on the TiKV cluster. - # Please adjust this configuration carefully. - txnTotalSizeLimit: "104857600" - # enableBatchDml enables batch commit for the DMLs - enableBatchDml: false - # check mb4 value in utf8 is used to control whether to check the mb4 characters when the charset is utf8. - checkMb4ValueInUtf8: true - # treat-old-version-utf8-as-utf8mb4 use for upgrade compatibility. Set to true will treat old version table/column UTF8 charset as UTF8MB4. - treatOldVersionUtf8AsUtf8mb4: true - # lease is schema lease duration, very dangerous to change only if you know what you do. - lease: 45s - # Max CPUs to use, 0 use number of CPUs in the machine. 
- maxProcs: 0 - resources: - limits: {} - # cpu: 16000m - # memory: 16Gi - requests: {} - # cpu: 12000m - # memory: 12Gi - nodeSelector: - dedicated: tidb - # kind: tidb - # zone: cn-bj1-01,cn-bj1-02 - # region: cn-bj1 - tolerations: - - key: dedicated - operator: Equal - value: tidb - effect: "NoSchedule" - maxFailoverCount: 3 - service: - type: LoadBalancer - exposeStatus: true - annotations: - service.beta.kubernetes.io/aws-load-balancer-internal: 0.0.0.0/0 - service.beta.kubernetes.io/aws-load-balancer-type: nlb - # separateSlowLog: true - slowLogTailer: - image: busybox:1.26.2 - resources: - limits: - cpu: 100m - memory: 50Mi - requests: - cpu: 20m - memory: 5Mi - - # tidb plugin configuration - plugin: - # enable plugin or not - enable: false - # the start argument to specify the folder containing - directory: /plugins - # the start argument to specify the plugin id (name "-" version) that needs to be loaded, e.g. 'conn_limit-1'. - list: ["whitelist-1"] - -# mysqlClient is used to set password for TiDB -# it must has Python MySQL client installed -mysqlClient: - image: tnir/mysqlclient - imagePullPolicy: IfNotPresent - -monitor: - create: true - # Also see rbac.create - # If you set rbac.create to false, you need to provide a value here. - # If you set rbac.create to true, you should leave this empty. - # serviceAccount: - persistent: true - storageClassName: ebs-gp2 - storage: 500Gi - grafana: - create: true - image: grafana/grafana:6.0.1 - imagePullPolicy: IfNotPresent - logLevel: info - resources: - limits: {} - # cpu: 8000m - # memory: 8Gi - requests: {} - # cpu: 4000m - # memory: 4Gi - username: admin - password: admin - config: - # Configure Grafana using environment variables except GF_PATHS_DATA, GF_SECURITY_ADMIN_USER and GF_SECURITY_ADMIN_PASSWORD - # Ref https://grafana.com/docs/installation/configuration/#using-environment-variables - GF_AUTH_ANONYMOUS_ENABLED: %{ if monitor_enable_anonymous_user }"true"%{ else }"false"%{ endif } - GF_AUTH_ANONYMOUS_ORG_NAME: "Main Org." - GF_AUTH_ANONYMOUS_ORG_ROLE: "Viewer" - # if grafana is running behind a reverse proxy with subpath http://foo.bar/grafana - # GF_SERVER_DOMAIN: foo.bar - # GF_SERVER_ROOT_URL: "%(protocol)s://%(domain)s/grafana/" - service: - type: LoadBalancer - prometheus: - image: prom/prometheus:v2.2.1 - imagePullPolicy: IfNotPresent - logLevel: info - resources: - limits: {} - # cpu: 8000m - # memory: 8Gi - requests: {} - # cpu: 4000m - # memory: 4Gi - service: - type: NodePort - reserveDays: 12 - # alertmanagerURL: "" - nodeSelector: {} - # kind: monitor - # zone: cn-bj1-01,cn-bj1-02 - # region: cn-bj1 - tolerations: [] - # - key: node-role - # operator: Equal - # value: tidb - # effect: "NoSchedule" - -binlog: - pump: - create: false - replicas: 1 - image: "pingcap/tidb-binlog:${cluster_version}" - imagePullPolicy: IfNotPresent - logLevel: info - # storageClassName is a StorageClass provides a way for administrators to describe the "classes" of storage they offer. - # different classes might map to quality-of-service levels, or to backup policies, - # or to arbitrary policies determined by the cluster administrators. - # refer to https://kubernetes.io/docs/concepts/storage/storage-classes - storageClassName: local-storage - storage: 10Gi - syncLog: true - # a integer value to control expiry date of the binlog data, indicates for how long (in days) the binlog data would be stored. 
- # must bigger than 0 - gc: 7 - # number of seconds between heartbeat ticks (in 2 seconds) - heartbeatInterval: 2 - - drainer: - create: false - image: "pingcap/tidb-binlog:${cluster_version}" - imagePullPolicy: IfNotPresent - logLevel: info - # storageClassName is a StorageClass provides a way for administrators to describe the "classes" of storage they offer. - # different classes might map to quality-of-service levels, or to backup policies, - # or to arbitrary policies determined by the cluster administrators. - # refer to https://kubernetes.io/docs/concepts/storage/storage-classes - storageClassName: local-storage - storage: 10Gi - # parallel worker count (default 16) - workerCount: 16 - # the interval time (in seconds) of detect pumps' status (default 10) - detectInterval: 10 - # disbale detect causality - disableDetect: false - # disable dispatching sqls that in one same binlog; if set true, work-count and txn-batch would be useless - disableDispatch: false - # # disable sync these schema - ignoreSchemas: "INFORMATION_SCHEMA,PERFORMANCE_SCHEMA,mysql,test" - # if drainer donesn't have checkpoint, use initial commitTS to initial checkpoint - initialCommitTs: 0 - # enable safe mode to make syncer reentrant - safeMode: false - # number of binlog events in a transaction batch (default 20) - txnBatch: 20 - # downstream storage, equal to --dest-db-type - # valid values are "mysql", "pb", "kafka" - destDBType: pb - mysql: {} - # host: "127.0.0.1" - # user: "root" - # password: "" - # port: 3306 - # # Time and size limits for flash batch write - # timeLimit: "30s" - # sizeLimit: "100000" - kafka: {} - # only need config one of zookeeper-addrs and kafka-addrs, will get kafka address if zookeeper-addrs is configed. - # zookeeperAddrs: "127.0.0.1:2181" - # kafkaAddrs: "127.0.0.1:9092" - # kafkaVersion: "0.8.2.0" - -scheduledBackup: - create: false - binlogImage: "pingcap/tidb-binlog:${cluster_version}" - binlogImagePullPolicy: IfNotPresent - # https://github.com/tennix/tidb-cloud-backup - mydumperImage: pingcap/tidb-cloud-backup:20190610 - mydumperImagePullPolicy: IfNotPresent - # storageClassName is a StorageClass provides a way for administrators to describe the "classes" of storage they offer. - # different classes might map to quality-of-service levels, or to backup policies, - # or to arbitrary policies determined by the cluster administrators. 
- # refer to https://kubernetes.io/docs/concepts/storage/storage-classes - storageClassName: local-storage - storage: 100Gi - # https://kubernetes.io/docs/tasks/job/automated-tasks-with-cron-jobs/#schedule - schedule: "0 0 * * *" - # https://kubernetes.io/docs/tasks/job/automated-tasks-with-cron-jobs/#suspend - suspend: false - # https://kubernetes.io/docs/tasks/job/automated-tasks-with-cron-jobs/#jobs-history-limits - successfulJobsHistoryLimit: 3 - failedJobsHistoryLimit: 1 - # https://kubernetes.io/docs/tasks/job/automated-tasks-with-cron-jobs/#starting-deadline - startingDeadlineSeconds: 3600 - # https://github.com/maxbube/mydumper/blob/master/docs/mydumper_usage.rst#options - options: "--chunk-filesize=100" - # secretName is the name of the secret which stores user and password used for backup - # Note: you must give the user enough privilege to do the backup - # you can create the secret by: - # kubectl create secret generic backup-secret --from-literal=user=root --from-literal=password= - secretName: backup-secret - # backup to gcp - gcp: {} - # bucket: "" - # secretName is the name of the secret which stores the gcp service account credentials json file - # The service account must have read/write permission to the above bucket. - # Read the following document to create the service account and download the credentials file as credentials.json: - # https://cloud.google.com/docs/authentication/production#obtaining_and_providing_service_account_credentials_manually - # And then create the secret by: kubectl create secret generic gcp-backup-secret --from-file=./credentials.json - # secretName: gcp-backup-secret - - # backup to ceph object storage - ceph: {} - # endpoint: "" - # bucket: "" - # secretName is the name of the secret which stores ceph object store access key and secret key - # You can create the secret by: - # kubectl create secret generic ceph-backup-secret --from-literal=access_key= --from-literal=secret_key= - # secretName: ceph-backup-secret - - # backup to s3 - s3: {} - # region: "" - # bucket: "" - # secretName is the name of the secret which stores s3 object store access key and secret key - # You can create the secret by: - # kubectl create secret generic s3-backup-secret --from-literal=access_key= --from-literal=secret_key= - # secretName: s3-backup-secret - -metaInstance: "{{ $labels.instance }}" -metaType: "{{ $labels.type }}" -metaValue: "{{ $value }}" diff --git a/deploy/aws/tidb-cluster/README.md b/deploy/aws/tidb-cluster/README.md new file mode 100644 index 0000000000..4166ae6c78 --- /dev/null +++ b/deploy/aws/tidb-cluster/README.md @@ -0,0 +1,7 @@ +The `tidb-cluster` module for AWS spins up a TiDB cluster in the specified `EKS` cluster. The following resources will be provisioned: + +- An auto scaling group for PD +- An auto scaling group for TiKV +- An auto scaling group for TiDB +- An auto scaling group for Monitoring +- A `TidbCluster` custom resource diff --git a/deploy/aws/tidb-cluster/cluster.tf b/deploy/aws/tidb-cluster/cluster.tf new file mode 100644 index 0000000000..40d3980375 --- /dev/null +++ b/deploy/aws/tidb-cluster/cluster.tf @@ -0,0 +1,139 @@ +resource "null_resource" "wait-tiller-ready" { + depends_on = [var.eks] + + provisioner "local-exec" { + working_dir = path.cwd + command = < /dev/null 2>&1; then + echo "disk /dev/nvme${i}n1 already parted, skipping" + else + echo "disk /dev/nvme${i}n1 is not parted" + if ! 
blkid /dev/nvme${i}n1 > /dev/null; then + echo "/dev/nvme${i}n1 not formatted" + mkfs -t ext4 /dev/nvme${i}n1 + DISK_UUID=$(blkid -s UUID -o value /dev/nvme${i}n1) + mkdir -p /mnt/local-ssd/$DISK_UUID + echo UUID=`blkid -s UUID -o value /dev/nvme${i}n1` /mnt/local-ssd/$DISK_UUID ext4 defaults 0 2 | tee -a /etc/fstab + fi + fi + fi + fi +done + +# mount local ssd disks +mount -a + +# ZONE=$(curl -s http://169.254.169.254/latest/meta-data/placement/availability-zone) +# AWS_DEFAULT_REGION=$(echo $ZONE | awk '{print substr($0, 1, length($0)-1)}') \ No newline at end of file diff --git a/deploy/aws/pd-userdata.sh b/deploy/aws/tidb-cluster/templates/userdata.sh.tpl similarity index 61% rename from deploy/aws/pd-userdata.sh rename to deploy/aws/tidb-cluster/templates/userdata.sh.tpl index d45802cdd8..db015227aa 100644 --- a/deploy/aws/pd-userdata.sh +++ b/deploy/aws/tidb-cluster/templates/userdata.sh.tpl @@ -1,3 +1,8 @@ +#!/bin/bash -xe + +# Allow user supplied pre userdata code +${pre_userdata} + # set ulimits cat < /etc/security/limits.d/99-tidb.conf root soft nofile 1000000 @@ -12,14 +17,8 @@ sed -i 's/LimitNPROC=infinity/LimitNPROC=1048576/' /etc/systemd/system/docker.se systemctl daemon-reload systemctl restart docker -# format and mount nvme disk -if grep nvme1n1 /etc/fstab; then - echo "disk already mounted" -else - mkfs -t ext4 /dev/nvme1n1 - mkdir -p /mnt/disks/ssd1 - cat <> /etc/fstab -/dev/nvme1n1 /mnt/disks/ssd1 ext4 defaults,nofail,noatime,nodelalloc 0 2 -EOF - mount -a -fi +# Bootstrap and join the cluster +/etc/eks/bootstrap.sh --b64-cluster-ca '${cluster_auth_base64}' --apiserver-endpoint '${endpoint}' ${bootstrap_extra_args} --kubelet-extra-args '${kubelet_extra_args}' '${cluster_name}' + +# Allow user supplied userdata code +${additional_userdata} diff --git a/deploy/aws/tidb-cluster/values/default.yaml b/deploy/aws/tidb-cluster/values/default.yaml new file mode 100644 index 0000000000..e38634ffa0 --- /dev/null +++ b/deploy/aws/tidb-cluster/values/default.yaml @@ -0,0 +1,27 @@ +# Basic customization for the tidb-cluster chart that suits the AWS environment +timezone: UTC + +pd: + logLevel: info + storageClassName: ebs-gp2 +tikv: + logLevel: info + storageClassName: local-storage + syncLog: true +tidb: + logLevel: info + service: + type: LoadBalancer + annotations: + service.beta.kubernetes.io/aws-load-balancer-internal: '0.0.0.0/0' + service.beta.kubernetes.io/aws-load-balancer-type: nlb + +monitor: + storage: 100Gi + storageClassName: ebs-gp2 + persistent: true + grafana: + config: + GF_AUTH_ANONYMOUS_ENABLED: "true" + service: + type: LoadBalancer \ No newline at end of file diff --git a/deploy/aws/tidb-cluster/variables.tf b/deploy/aws/tidb-cluster/variables.tf new file mode 100644 index 0000000000..5c6e315232 --- /dev/null +++ b/deploy/aws/tidb-cluster/variables.tf @@ -0,0 +1,162 @@ +variable "subnets" { + description = "A list of subnets to place the EKS cluster and workers within." + type = list(string) +} + +variable "tags" { + description = "A map of tags to add to all resources." + type = map(string) + default = {} +} + +variable "worker_groups" { + description = "A list of maps defining worker group configurations to be defined using AWS Launch Configurations. See workers_group_defaults for valid keys." + type = list(map(string)) + + default = [ + { + name = "default" + }, + ] +} + +variable "worker_group_count" { + description = "The number of maps contained within the worker_groups list."
+ type = string + default = "1" +} + +variable "workers_group_defaults" { + description = "Override default values for target groups. See workers_group_defaults_defaults in local.tf for valid keys." + type = map(string) + default = {} +} + +variable "worker_group_tags" { + description = "A map defining extra tags to be applied to the worker group ASG." + type = map(list(string)) + + default = { + default = [] + } +} + +variable "worker_groups_launch_template" { + description = "A list of maps defining worker group configurations to be defined using AWS Launch Templates. See workers_group_defaults for valid keys." + type = list(map(string)) + + default = [ + { + name = "default" + }, + ] +} + +variable "worker_group_launch_template_count" { + description = "The number of maps contained within the worker_groups_launch_template list." + type = string + default = "0" +} + +variable "workers_group_launch_template_defaults" { + description = "Override default values for target groups. See workers_group_defaults_defaults in local.tf for valid keys." + type = map(string) + default = {} +} + +variable "worker_group_launch_template_tags" { + description = "A map defining extra tags to be applied to the worker group template ASG." + type = map(list(string)) + + default = { + default = [] + } +} + +variable "worker_ami_name_filter" { + description = "Additional name filter for AWS EKS worker AMI. Default behaviour will get latest for the cluster_version but could be set to a release from amazon-eks-ami, e.g. \"v20190220\"" + default = "v*" +} + +variable "worker_additional_security_group_ids" { + description = "A list of additional security group ids to attach to worker instances" + type = list(string) + default = [] +} + +variable "local_exec_interpreter" { + description = "Command to run for local-exec resources. Must be a shell-style interpreter. If you are on Windows Git Bash is a good choice." + type = list(string) + default = ["/bin/sh", "-c"] +} + +variable "iam_path" { + description = "If provided, all IAM roles will be created on this path." 
+ default = "/" +} + + + + +variable "tidb_cluster_chart_version" { + description = "tidb-cluster chart version" + default = "v1.0.0-beta.3" +} + +variable "cluster_name" { + type = string + description = "tidb cluster name" +} + +variable "cluster_version" { + type = string + default = "v3.0.0-rc.2" +} + +variable "ssh_key_name" { + type = string +} + +variable "pd_count" { + type = number + default = 1 +} + +variable "tikv_count" { + type = number + default = 1 +} + +variable "tidb_count" { + type = number + default = 1 +} + +variable "pd_instance_type" { + type = string + default = "c5d.large" +} + +variable "tikv_instance_type" { + type = string + default = "c5d.large" +} + +variable "tidb_instance_type" { + type = string + default = "c5d.large" +} + +variable "monitor_instance_type" { + type = string + default = "c5d.large" +} + +variable "override_values" { + type = string + default = "" +} + +variable "eks" { + description = "eks info" +} diff --git a/deploy/aws/tidb-cluster/workers.tf b/deploy/aws/tidb-cluster/workers.tf new file mode 100644 index 0000000000..1fab065160 --- /dev/null +++ b/deploy/aws/tidb-cluster/workers.tf @@ -0,0 +1,174 @@ +# Worker Groups using Launch Configurations + +resource "aws_autoscaling_group" "workers" { + name_prefix = "${var.eks.cluster_id}-${lookup(local.tidb_cluster_worker_groups[count.index], "name", count.index)}" + desired_capacity = lookup( + local.tidb_cluster_worker_groups[count.index], + "asg_desired_capacity", + local.workers_group_defaults["asg_desired_capacity"], + ) + max_size = lookup( + local.tidb_cluster_worker_groups[count.index], + "asg_max_size", + local.workers_group_defaults["asg_max_size"], + ) + min_size = lookup( + local.tidb_cluster_worker_groups[count.index], + "asg_min_size", + local.workers_group_defaults["asg_min_size"], + ) + force_delete = false + launch_configuration = element(aws_launch_configuration.workers.*.id, count.index) + vpc_zone_identifier = split( + ",", + coalesce( + lookup(local.tidb_cluster_worker_groups[count.index], "subnets", ""), + local.workers_group_defaults["subnets"], + ), + ) + protect_from_scale_in = false + count = local.worker_group_count + placement_group = "" # The name of the placement group into which to launch the instances, if any. + + tags = concat( + [ + { + key = "Name" + value = "${var.eks.cluster_id}-${lookup(local.tidb_cluster_worker_groups[count.index], "name", count.index)}-eks_asg" + propagate_at_launch = true + }, + { + key = "kubernetes.io/cluster/${var.eks.cluster_id}" + value = "owned" + propagate_at_launch = true + }, + { + key = "k8s.io/cluster-autoscaler/${lookup( + local.tidb_cluster_worker_groups[count.index], + "autoscaling_enabled", + local.workers_group_defaults["autoscaling_enabled"], + ) == 1 ? "enabled" : "disabled"}" + value = "true" + propagate_at_launch = false + }, + { + key = "k8s.io/cluster-autoscaler/node-template/resources/ephemeral-storage" + value = "${lookup( + local.tidb_cluster_worker_groups[count.index], + "root_volume_size", + local.workers_group_defaults["root_volume_size"], + )}Gi" + propagate_at_launch = false + }, + ], + local.asg_tags, + var.worker_group_tags[contains( + keys(var.worker_group_tags), + lookup(local.tidb_cluster_worker_groups[count.index], "name", count.index), + ) ? 
lookup(local.tidb_cluster_worker_groups[count.index], "name", count.index) : "default"], + ) + + + lifecycle { + create_before_destroy = true + # ignore_changes = ["desired_capacity"] + } +} + +resource "aws_launch_configuration" "workers" { + name_prefix = "${var.eks.cluster_id}-${lookup(local.tidb_cluster_worker_groups[count.index], "name", count.index)}" + associate_public_ip_address = lookup( + local.tidb_cluster_worker_groups[count.index], + "public_ip", + local.workers_group_defaults["public_ip"], + ) + security_groups = concat([var.eks.worker_security_group_id], var.worker_additional_security_group_ids, compact( + split( + ",", + lookup( + local.tidb_cluster_worker_groups[count.index], + "additional_security_group_ids", + local.workers_group_defaults["additional_security_group_ids"], + ), + ), + )) + iam_instance_profile = element(var.eks.worker_iam_instance_profile_names, count.index) + image_id = lookup( + local.tidb_cluster_worker_groups[count.index], + "ami_id", + local.workers_group_defaults["ami_id"], + ) + instance_type = lookup( + local.tidb_cluster_worker_groups[count.index], + "instance_type", + local.workers_group_defaults["instance_type"], + ) + key_name = lookup( + local.tidb_cluster_worker_groups[count.index], + "key_name", + local.workers_group_defaults["key_name"], + ) + user_data_base64 = base64encode(element(data.template_file.userdata.*.rendered, count.index)) + ebs_optimized = lookup( + local.tidb_cluster_worker_groups[count.index], + "ebs_optimized", + lookup( + local.ebs_optimized, + lookup( + local.tidb_cluster_worker_groups[count.index], + "instance_type", + local.workers_group_defaults["instance_type"], + ), + false, + ), + ) + enable_monitoring = lookup( + local.tidb_cluster_worker_groups[count.index], + "enable_monitoring", + local.workers_group_defaults["enable_monitoring"], + ) + spot_price = lookup( + local.tidb_cluster_worker_groups[count.index], + "spot_price", + local.workers_group_defaults["spot_price"], + ) + placement_tenancy = lookup( + local.tidb_cluster_worker_groups[count.index], + "placement_tenancy", + local.workers_group_defaults["placement_tenancy"], + ) + count = local.worker_group_count + + lifecycle { + create_before_destroy = true + } + + root_block_device { + volume_size = lookup( + local.tidb_cluster_worker_groups[count.index], + "root_volume_size", + local.workers_group_defaults["root_volume_size"], + ) + volume_type = lookup( + local.tidb_cluster_worker_groups[count.index], + "root_volume_type", + local.workers_group_defaults["root_volume_type"], + ) + iops = lookup( + local.tidb_cluster_worker_groups[count.index], + "root_iops", + local.workers_group_defaults["root_iops"], + ) + delete_on_termination = true + } +} + +resource "null_resource" "tags_as_list_of_maps" { + count = length(keys(var.tags)) + + triggers = { + key = element(keys(var.tags), count.index) + value = element(values(var.tags), count.index) + propagate_at_launch = "true" + } +} diff --git a/deploy/aws/tidb-cluster/workers_launch_template.tf b/deploy/aws/tidb-cluster/workers_launch_template.tf new file mode 100644 index 0000000000..b767d7dce4 --- /dev/null +++ b/deploy/aws/tidb-cluster/workers_launch_template.tf @@ -0,0 +1,299 @@ +# Worker Groups using Launch Templates + +resource "aws_autoscaling_group" "workers_launch_template" { + name_prefix = "${var.eks.cluster_id}-${lookup( + var.worker_groups_launch_template[count.index], + "name", + count.index, + )}" + desired_capacity = lookup( + var.worker_groups_launch_template[count.index], + "asg_desired_capacity", + 
local.workers_group_launch_template_defaults["asg_desired_capacity"], + ) + max_size = lookup( + var.worker_groups_launch_template[count.index], + "asg_max_size", + local.workers_group_launch_template_defaults["asg_max_size"], + ) + min_size = lookup( + var.worker_groups_launch_template[count.index], + "asg_min_size", + local.workers_group_launch_template_defaults["asg_min_size"], + ) + force_delete = lookup( + var.worker_groups_launch_template[count.index], + "asg_force_delete", + local.workers_group_launch_template_defaults["asg_force_delete"], + ) + + mixed_instances_policy { + instances_distribution { + on_demand_allocation_strategy = lookup( + var.worker_groups_launch_template[count.index], + "on_demand_allocation_strategy", + local.workers_group_launch_template_defaults["on_demand_allocation_strategy"], + ) + on_demand_base_capacity = lookup( + var.worker_groups_launch_template[count.index], + "on_demand_base_capacity", + local.workers_group_launch_template_defaults["on_demand_base_capacity"], + ) + on_demand_percentage_above_base_capacity = lookup( + var.worker_groups_launch_template[count.index], + "on_demand_percentage_above_base_capacity", + local.workers_group_launch_template_defaults["on_demand_percentage_above_base_capacity"], + ) + spot_allocation_strategy = lookup( + var.worker_groups_launch_template[count.index], + "spot_allocation_strategy", + local.workers_group_launch_template_defaults["spot_allocation_strategy"], + ) + spot_instance_pools = lookup( + var.worker_groups_launch_template[count.index], + "spot_instance_pools", + local.workers_group_launch_template_defaults["spot_instance_pools"], + ) + spot_max_price = lookup( + var.worker_groups_launch_template[count.index], + "spot_max_price", + local.workers_group_launch_template_defaults["spot_max_price"], + ) + } + + launch_template { + launch_template_specification { + launch_template_id = element( + aws_launch_template.workers_launch_template.*.id, + count.index, + ) + version = "$Latest" + } + + override { + instance_type = lookup( + var.worker_groups_launch_template[count.index], + "instance_type", + local.workers_group_launch_template_defaults["instance_type"], + ) + } + + override { + instance_type = lookup( + var.worker_groups_launch_template[count.index], + "override_instance_type", + local.workers_group_launch_template_defaults["override_instance_type"], + ) + } + } + } + + vpc_zone_identifier = split( + ",", + coalesce( + lookup( + var.worker_groups_launch_template[count.index], + "subnets", + "", + ), + local.workers_group_launch_template_defaults["subnets"], + ), + ) + protect_from_scale_in = lookup( + var.worker_groups_launch_template[count.index], + "protect_from_scale_in", + local.workers_group_launch_template_defaults["protect_from_scale_in"], + ) + + count = var.worker_group_launch_template_count + + tags = concat( + [ + { + key = "Name" + value = "${var.eks.cluster_id}-${lookup( + var.worker_groups_launch_template[count.index], + "name", + count.index, + )}-eks_asg" + propagate_at_launch = true + }, + { + key = "kubernetes.io/cluster/${var.eks.cluster_id}" + value = "owned" + propagate_at_launch = true + }, + { + key = "k8s.io/cluster-autoscaler/${lookup( + var.worker_groups_launch_template[count.index], + "autoscaling_enabled", + local.workers_group_launch_template_defaults["autoscaling_enabled"], + ) == 1 ? 
"enabled" : "disabled"}" + value = "true" + propagate_at_launch = false + }, + { + key = "k8s.io/cluster-autoscaler/node-template/resources/ephemeral-storage" + value = "${lookup( + var.worker_groups_launch_template[count.index], + "root_volume_size", + local.workers_group_launch_template_defaults["root_volume_size"], + )}Gi" + propagate_at_launch = false + }, + ], + local.asg_tags, + var.worker_group_launch_template_tags[contains( + keys(var.worker_group_launch_template_tags), + lookup( + var.worker_groups_launch_template[count.index], + "name", + count.index, + ), + ) ? lookup( + var.worker_groups_launch_template[count.index], + "name", + count.index, + ) : "default"], + ) + + lifecycle { + create_before_destroy = true + + ignore_changes = [desired_capacity] + } +} + +resource "aws_launch_template" "workers_launch_template" { + name_prefix = "${var.eks.cluster_id}-${lookup( + var.worker_groups_launch_template[count.index], + "name", + count.index, + )}" + + network_interfaces { + associate_public_ip_address = lookup( + var.worker_groups_launch_template[count.index], + "public_ip", + local.workers_group_launch_template_defaults["public_ip"], + ) + security_groups = concat([var.eks.worker_security_group_id], var.worker_additional_security_group_ids, compact( + split( + ",", + lookup( + var.worker_groups_launch_template[count.index], + "additional_security_group_ids", + local.workers_group_launch_template_defaults["additional_security_group_ids"], + ), + ), + )) + } + + iam_instance_profile { + name = element( + aws_iam_instance_profile.workers_launch_template.*.name, + count.index, + ) + } + + image_id = lookup( + var.worker_groups_launch_template[count.index], + "ami_id", + local.workers_group_launch_template_defaults["ami_id"], + ) + instance_type = lookup( + var.worker_groups_launch_template[count.index], + "instance_type", + local.workers_group_launch_template_defaults["instance_type"], + ) + key_name = lookup( + var.worker_groups_launch_template[count.index], + "key_name", + local.workers_group_launch_template_defaults["key_name"], + ) + user_data = base64encode( + element( + data.template_file.launch_template_userdata.*.rendered, + count.index, + ), + ) + ebs_optimized = lookup( + var.worker_groups_launch_template[count.index], + "ebs_optimized", + lookup( + local.ebs_optimized, + lookup( + var.worker_groups_launch_template[count.index], + "instance_type", + local.workers_group_launch_template_defaults["instance_type"], + ), + false, + ), + ) + + monitoring { + enabled = lookup( + var.worker_groups_launch_template[count.index], + "enable_monitoring", + local.workers_group_launch_template_defaults["enable_monitoring"], + ) + } + + placement { + tenancy = lookup( + var.worker_groups_launch_template[count.index], + "placement_tenancy", + local.workers_group_launch_template_defaults["placement_tenancy"], + ) + } + + count = var.worker_group_launch_template_count + + lifecycle { + create_before_destroy = true + } + + block_device_mappings { + device_name = data.aws_ami.eks_worker.root_device_name + + ebs { + volume_size = lookup( + var.worker_groups_launch_template[count.index], + "root_volume_size", + local.workers_group_launch_template_defaults["root_volume_size"], + ) + volume_type = lookup( + var.worker_groups_launch_template[count.index], + "root_volume_type", + local.workers_group_launch_template_defaults["root_volume_type"], + ) + iops = lookup( + var.worker_groups_launch_template[count.index], + "root_iops", + local.workers_group_launch_template_defaults["root_iops"], + ) + 
encrypted = lookup( + var.worker_groups_launch_template[count.index], + "root_encrypted", + local.workers_group_launch_template_defaults["root_encrypted"], + ) + kms_key_id = lookup( + var.worker_groups_launch_template[count.index], + "kms_key_id", + local.workers_group_launch_template_defaults["kms_key_id"], + ) + delete_on_termination = true + } + } +} + +resource "aws_iam_instance_profile" "workers_launch_template" { + name_prefix = var.eks.cluster_id + role = lookup( + var.worker_groups_launch_template[count.index], + "iam_role_id", + local.workers_group_launch_template_defaults["iam_role_id"], + ) + count = var.worker_group_launch_template_count + path = var.iam_path +} diff --git a/deploy/aws/tidb-operator/README.md b/deploy/aws/tidb-operator/README.md new file mode 100644 index 0000000000..6b565f945d --- /dev/null +++ b/deploy/aws/tidb-operator/README.md @@ -0,0 +1,7 @@ +The `tidb-operator` module for AWS spins up a control plane for TiDB in Kubernetes. The following resources will be provisioned: + +- An EKS cluster +- An auto scaling group to run the control pods listed below +- TiDB operator, including `tidb-controller-manager` and `tidb-scheduler` +- local-volume-provisioner +- Tiller for Helm \ No newline at end of file diff --git a/deploy/aws/tidb-operator/main.tf b/deploy/aws/tidb-operator/main.tf new file mode 100644 index 0000000000..e44e72d1e7 --- /dev/null +++ b/deploy/aws/tidb-operator/main.tf @@ -0,0 +1,89 @@ +module "eks" { + source = "terraform-aws-modules/eks/aws" + + cluster_name = var.eks_name + cluster_version = var.eks_version + vpc_id = var.vpc_id + config_output_path = var.config_output_path + subnets = var.subnets + + tags = { + app = "tidb" + } + + worker_groups = [ + { + name = "${var.eks_name}-control" + key_name = var.ssh_key_name + instance_type = var.default_worker_group_instance_type + public_ip = false + asg_desired_capacity = var.default_worker_group_instance_count + asg_max_size = var.default_worker_group_instance_count + 2 + }, + ] +} + +# The kubernetes and helm providers rely on the EKS cluster, but terraform does not support depends_on for providers. +# As a workaround (see https://github.com/hashicorp/terraform/issues/2430#issuecomment-370685911), +# we write the kubeconfig to a local file so that resources referencing it implicitly depend on the EKS cluster. +resource "local_file" "kubeconfig" { + depends_on = [module.eks] + sensitive_content = module.eks.kubeconfig + filename = module.eks.kubeconfig_filename +} +
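For quick orientation, here is a rough sketch of how the root module might instantiate this new `./tidb-operator` module with the variables it declares in its `variables.tf`. The exact wiring lives in the top-level `*.tf` files of this change; the `module.vpc` reference and the `config_output_path` value below are illustrative assumptions, not code from this PR:

```hcl
# Hypothetical call site in the root module (the "vpc" module name and the paths are assumptions).
module "tidb-operator" {
  source = "./tidb-operator"

  eks_name             = var.eks_name
  eks_version          = var.eks_version
  operator_version     = var.operator_version
  operator_helm_values = var.operator_values
  config_output_path   = "credentials/"                                  # assumed output location for the kubeconfig
  subnets              = local.default_subnets                           # subnets of the new or existing VPC
  vpc_id               = var.create_vpc ? module.vpc.vpc_id : var.vpc_id # "vpc" module name is an assumption
  ssh_key_name         = module.key-pair.key_name
}
```

The intent is that every knob the module exposes (`eks_name`, `operator_version`, `default_worker_group_instance_type`, and so on) can be driven from the top-level variables, so a single `terraform apply` at the root still provisions the whole control plane.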
+provider "helm" { + alias = "initial" + insecure = true + # service_account = "tiller" + install_tiller = false # currently this doesn't work, so we install tiller in the local-exec provisioner. See https://github.com/terraform-providers/terraform-provider-helm/issues/148 + kubernetes { + config_path = local_file.kubeconfig.filename + } +} + +resource "null_resource" "setup-env" { + depends_on = [local_file.kubeconfig] + + provisioner "local-exec" { + working_dir = path.module + command = <<EOS +echo "${module.eks.kubeconfig}" > kube_config.yaml +kubectl apply -f https://raw.githubusercontent.com/pingcap/tidb-operator/v1.0.0-beta.3/manifests/crd.yaml +kubectl apply -f https://raw.githubusercontent.com/pingcap/tidb-operator/v1.0.0-beta.3/manifests/tiller-rbac.yaml +kubectl apply -f manifests/local-volume-provisioner.yaml +kubectl apply -f manifests/gp2-storageclass.yaml +helm init --service-account tiller --upgrade --wait +until helm ls; do + echo "Waiting for tiller to be ready" + sleep 5 +done +rm kube_config.yaml +EOS + environment = { + KUBECONFIG = "kube_config.yaml" + } + } +} + +data "helm_repository" "pingcap" { + provider = "helm.initial" + depends_on = ["null_resource.setup-env"] + name = "pingcap" + url = "http://charts.pingcap.org/" +} + +resource "helm_release" "tidb-operator" { + provider = "helm.initial" + depends_on = ["null_resource.setup-env"] + + repository = data.helm_repository.pingcap.name + chart = "tidb-operator" + version = var.operator_version + namespace = "tidb-admin" + name = "tidb-operator" + values = [var.operator_helm_values] +} + + + diff --git a/deploy/aws/manifests/gp2-storageclass.yaml b/deploy/aws/tidb-operator/manifests/gp2-storageclass.yaml similarity index 100% rename from deploy/aws/manifests/gp2-storageclass.yaml rename to deploy/aws/tidb-operator/manifests/gp2-storageclass.yaml diff --git a/deploy/aws/manifests/local-volume-provisioner.yaml b/deploy/aws/tidb-operator/manifests/local-volume-provisioner.yaml similarity index 85% rename from deploy/aws/manifests/local-volume-provisioner.yaml rename to deploy/aws/tidb-operator/manifests/local-volume-provisioner.yaml index a20799869b..b8bc32f713 100644 --- a/deploy/aws/manifests/local-volume-provisioner.yaml +++ b/deploy/aws/tidb-operator/manifests/local-volume-provisioner.yaml @@ -14,8 +14,8 @@ metadata: data: storageClassMap: | local-storage: - hostDir: /mnt/disks - mountDir: /mnt/disks + hostDir: /mnt/local-ssd + mountDir: /mnt/local-ssd --- apiVersion: extensions/v1beta1 @@ -36,13 +36,10 @@ spec: spec: tolerations: - key: dedicated - operator: Equal - value: pd - effect: "NoSchedule" - - key: dedicated - operator: Equal - value: tikv + operator: Exists effect: "NoSchedule" + nodeSelector: + pingcap.com/aws-local-ssd: "true" serviceAccountName: local-storage-admin containers: - image: "quay.io/external_storage/local-volume-provisioner:v2.2.0" @@ -71,22 +68,16 @@ spec: - mountPath: /etc/provisioner/config name: provisioner-config readOnly: true - # mounting /dev in DinD environment would fail - # - mountPath: /dev - # name: provisioner-dev - - mountPath: /mnt/disks + - mountPath: /mnt/local-ssd name: local-disks mountPropagation: "HostToContainer" volumes: - name: provisioner-config configMap: name: local-provisioner-config - # - name: provisioner-dev - # hostPath: - # path: /dev - name: local-disks hostPath: - path: /mnt/disks + path: /mnt/local-ssd --- apiVersion: v1 diff --git a/deploy/aws/tidb-operator/outputs.tf b/deploy/aws/tidb-operator/outputs.tf new file mode 100644 index 0000000000..47785e4544 --- /dev/null +++ b/deploy/aws/tidb-operator/outputs.tf @@ -0,0 +1,3 @@ +output "eks" { + value = module.eks +} \ No newline at end of file diff --git a/deploy/aws/tidb-operator/variables.tf b/deploy/aws/tidb-operator/variables.tf new file mode
100644 index 0000000000..1d2aebceb3 --- /dev/null +++ b/deploy/aws/tidb-operator/variables.tf @@ -0,0 +1,53 @@ +variable "eks_name" { + description = "Name of the EKS cluster. Also used as a prefix in names of related resources." + type = string +} + +variable "eks_version" { + description = "Kubernetes version to use for the EKS cluster." + type = string + default = "1.12" +} + +variable "operator_version" { + description = "TiDB Operator version" + type = string + default = "v1.0.0-beta.3" +} + +variable "operator_helm_values" { + description = "Operator helm values" + type = string + default = "" +} + +variable "config_output_path" { + description = "Where to save the Kubectl config file (if `write_kubeconfig = true`). Should end in a forward slash `/` ." + type = string + default = "./" +} + +variable "subnets" { + description = "A list of subnets to place the EKS cluster and workers within." + type = list(string) +} + +variable "vpc_id" { + description = "VPC where the cluster and workers will be deployed." + type = string +} + +variable "default_worker_group_instance_type" { + description = "The instance type of default worker groups, this group will be used to run tidb-operator" + default = "m4.large" +} + +variable "default_worker_group_instance_count" { + description = "The instance count of default worker groups, this group will be used to run tidb-operator" + default = 1 +} + +variable "ssh_key_name" { + type = string +} + diff --git a/deploy/aws/tikv-userdata.sh b/deploy/aws/tikv-userdata.sh deleted file mode 100644 index d71c5b9512..0000000000 --- a/deploy/aws/tikv-userdata.sh +++ /dev/null @@ -1,25 +0,0 @@ -# set system ulimits -cat < /etc/security/limits.d/99-tidb.conf -root soft nofile 1000000 -root hard nofile 1000000 -root soft core unlimited -root soft stack 10240 -EOF -# config docker ulimits -cp /usr/lib/systemd/system/docker.service /etc/systemd/system/docker.service -sed -i 's/LimitNOFILE=infinity/LimitNOFILE=1048576/' /etc/systemd/system/docker.service -sed -i 's/LimitNPROC=infinity/LimitNPROC=1048576/' /etc/systemd/system/docker.service -systemctl daemon-reload -systemctl restart docker - -# format and mount nvme disk -if grep nvme0n1 /etc/fstab; then - echo "disk already mounted" -else - mkfs -t ext4 /dev/nvme0n1 - mkdir -p /mnt/disks/ssd1 - cat <> /etc/fstab -/dev/nvme0n1 /mnt/disks/ssd1 ext4 defaults,nofail,noatime,nodelalloc 0 2 -EOF - mount -a -fi diff --git a/deploy/aws/userdata.sh b/deploy/aws/userdata.sh deleted file mode 100644 index 123ba40add..0000000000 --- a/deploy/aws/userdata.sh +++ /dev/null @@ -1,35 +0,0 @@ -# set ulimits -cat < /etc/security/limits.d/99-tidb.conf -root soft nofile 1000000 -root hard nofile 1000000 -root soft core unlimited -root soft stack 10240 -EOF -# config docker ulimit -cp /usr/lib/systemd/system/docker.service /etc/systemd/system/docker.service -sed -i 's/LimitNOFILE=infinity/LimitNOFILE=1048576/' /etc/systemd/system/docker.service -sed -i 's/LimitNPROC=infinity/LimitNPROC=1048576/' /etc/systemd/system/docker.service -systemctl daemon-reload -systemctl restart docker - -# format and mount nvme disk -if grep nvme0n1 /etc/fstab || grep nvme1n1 /etc/fstab; then - echo "disk already mounted" -else - if mkfs -t ext4 /dev/nvme1n1 ; then - - mkdir -p /mnt/disks/ssd1 - cat <> /etc/fstab -/dev/nvme1n1 /mnt/disks/ssd1 ext4 defaults,nofail,noatime,nodelalloc 0 2 -EOF - mount -a - else - mkfs -t ext4 /dev/nvme0n1 - mkdir -p /mnt/disks/ssd1 - cat <> /etc/fstab -/dev/nvme0n1 /mnt/disks/ssd1 ext4 defaults,nofail,noatime,nodelalloc 0 2 -EOF 
- mount -a - fi -fi - diff --git a/deploy/aws/variables.tf b/deploy/aws/variables.tf index 8465920c06..4364c2d2e3 100644 --- a/deploy/aws/variables.tf +++ b/deploy/aws/variables.tf @@ -1,122 +1,117 @@ variable "region" { - description = "aws region" - default = "us-east-2" + description = "AWS region" + # supported regions: + # US: us-east-1, us-east-2, us-west-2 + # Asia Pacific: ap-south-1, ap-northeast-2, ap-southeast-1, ap-southeast-2, ap-northeast-1 + # Europe: eu-central-1, eu-west-1, eu-west-2, eu-west-3, eu-north-1 + default = "us-west-2" } -variable "ingress_cidr" { - description = "IP CIDR that allowed to access bastion ec2 instance" - default = ["0.0.0.0/0"] # Note: Please restrict your ingress to only necessary IPs. Opening to 0.0.0.0/0 can lead to security vulnerabilities. +variable "eks_name" { + description = "Name of the EKS cluster. Also used as a prefix in names of related resources." + default = "my-cluster" +} + +variable "eks_version" { + description = "Kubernetes version to use for the EKS cluster." + default = "1.12" +} + +variable "operator_version" { + description = "TiDB operator version" + default = "v1.0.0-beta.3" +} + +variable "operator_values" { + description = "The helm values of TiDB Operator" + default = "" } # Please note that this is only for manually created VPCs, deploying multiple EKS # clusters in one VPC is NOT supported now. variable "create_vpc" { - description = "Create a new VPC or not. If there is an existing VPC that you'd like to use, set this value to `false` and adjust `vpc_id`, `private_subnet_ids` and `public_subnet_ids` to the existing ones." - default = true + description = "Create a new VPC or not. If true, vpc_cidr/private_subnets/public_subnets must be set correctly; otherwise, vpc_id/subnets must be set correctly" + default = true } variable "vpc_cidr" { - description = "The network to use within the VPC. This value is ignored if `create_vpc=false`." - default = "10.0.0.0/16" + description = "VPC CIDR, must be set correctly if create_vpc is true" + default = "10.0.0.0/16" } variable "private_subnets" { - description = "The networks to use for private subnets. This value is ignored if `create_vpc=false`." - type = "list" - default = ["10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"] + description = "VPC private subnets, must be set correctly if create_vpc is true" + type = list(string) + default = ["10.0.16.0/20", "10.0.32.0/20", "10.0.48.0/20"] } variable "public_subnets" { - description = "The networks to use for public subnets. This value is ignored if `create_vpc=false`." - type = "list" - default = ["10.0.4.0/24", "10.0.5.0/24", "10.0.6.0/24"] + description = "VPC public subnets, must be set correctly if create_vpc is true" + type = list(string) + default = ["10.0.64.0/20", "10.0.80.0/20", "10.0.96.0/20"] } variable "vpc_id" { - description = "ID of the existing VPC. This value is ignored if `create_vpc=true`." - type = "string" - default = "vpc-c679deae" + description = "VPC ID, must be set correctly if create_vpc is false" + type = string + default = "" }
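As a usage note for the VPC-related variables above: with the new scheme, `create_vpc` decides which group of inputs is honored. A minimal `terraform.tfvars` sketch for the two modes might look like this (all IDs below are placeholders, not values from this change):

```hcl
# Mode 1: reuse an existing VPC -- vpc_id and subnets must be set.
create_vpc = false
vpc_id     = "vpc-0123456789abcdef0"                                    # placeholder VPC ID
subnets    = ["subnet-aaaa1111", "subnet-bbbb2222", "subnet-cccc3333"]  # placeholder subnet IDs

# Mode 2: let terraform create the VPC -- the CIDR settings are used instead.
# create_vpc      = true
# vpc_cidr        = "10.0.0.0/16"
# private_subnets = ["10.0.16.0/20", "10.0.32.0/20", "10.0.48.0/20"]
# public_subnets  = ["10.0.64.0/20", "10.0.80.0/20", "10.0.96.0/20"]
```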
-# To use the same subnets for both private and public usage, -# just set their values identical. -variable "private_subnet_ids" { - description = "The subnet ID(s) of the existing private networks. This value is ignored if `create_vpc=true`." - type = "list" - default = ["subnet-899e79f3", "subnet-a72d80cf", "subnet-a76d34ea"] +variable "subnets" { + description = "subnet ID list, must be set correctly if create_vpc is false" + type = list(string) + default = [] } - -variable "public_subnet_ids" { - description = "The subnet ID(s) of the existing public networks. This value is ignored if `create_vpc=true`." - type = "list" - default = ["subnet-899e79f3", "subnet-a72d80cf", "subnet-a76d34ea"] +variable "bastion_ingress_cidr" { + description = "IP CIDR list allowed to access the bastion ec2 instance" + default = ["0.0.0.0/0"] # Note: Please restrict your ingress to only necessary IPs. Opening to 0.0.0.0/0 can lead to security vulnerabilities. } variable "create_bastion" { description = "Create bastion ec2 instance to access TiDB cluster" - default = true -} - -variable "bastion_ami" { - description = "bastion ami id" - default = "ami-0cd3dfa4e37921605" + default = true } variable "bastion_instance_type" { description = "bastion ec2 instance type" - default = "t2.micro" -} - -variable "cluster_name" { - description = "eks cluster name" - default = "my-cluster" + default = "t2.micro" } -variable "k8s_version" { - description = "eks cluster version" - default = "1.12" +# For compatibility with the AWS tutorials +variable "default_cluster_version" { + default = "v3.0.0" } -variable "tidb_version" { - description = "tidb cluster version" - default = "v3.0.0-rc.1" -} - -variable "pd_count" { +variable "default_cluster_pd_count" { default = 3 } -variable "tikv_count" { +variable "default_cluster_tikv_count" { default = 3 } -variable "tidb_count" { +variable "default_cluster_tidb_count" { default = 2 } -// Be careful about changing the instance types, it may break the user data and local volume setup -variable "pd_instance_type" { - default = "m5d.xlarge" +variable "default_cluster_pd_instance_type" { + default = "m5.xlarge" } -variable "tikv_instance_type" { - default = "i3.2xlarge" +variable "default_cluster_tikv_instance_type" { + default = "c5d.4xlarge" } -variable "tidb_instance_type" { - default = "c4.4xlarge" +variable "default_cluster_tidb_instance_type" { + default = "c5.4xlarge" } -variable "monitor_instance_type" { - default = "c5.xlarge" +variable "default_cluster_monitor_instance_type" { + default = "c5.2xlarge" } -variable "tikv_root_volume_size" { - default = "100" +variable "default_cluster_name" { + default = "my-cluster" } -variable "monitor_enable_anonymous_user" { - description = "Whether enabling anonymous user visiting for monitoring" - default = false -}
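To tie the renamed `default_cluster_*` variables together, here is a hedged sketch of a `terraform.tfvars` that resizes the default cluster; every value is illustrative rather than a recommendation from this change, and any variable left out keeps the default shown above:

```hcl
# Illustrative overrides for the default TiDB cluster (values are examples only).
default_cluster_name    = "my-cluster"
default_cluster_version = "v3.0.0"

# Component counts
default_cluster_pd_count   = 3
default_cluster_tikv_count = 5
default_cluster_tidb_count = 3

# Instance types
default_cluster_tidb_instance_type    = "c5.4xlarge"
default_cluster_monitor_instance_type = "c5.2xlarge"
```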