
Refine aliyun and aws cloud tidb configurations #492

Merged
merged 6 commits into from
May 21, 2019

Conversation

aylei
Contributor

@aylei aylei commented May 14, 2019

Signed-off-by: Aylei rayingecho@gmail.com

What problem does this PR solve?

Improve the user experience of terraform deployment:

  • Use tidb 2.1.8 by default for Aliyun
  • Better defaults for Aliyun kubernetes cluster
  • Disable the Grafana anonymous user by default (both AWS and Aliyun expose Grafana to the internet by default)
  • Add an option to enable Grafana anonymous user

What is changed and how it works?

The Terraform scripts are changed and more options are exposed.
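The flow from a terraform option into the helm values can be sketched roughly like this (the file layout and wiring below are an illustrative assumption, not the exact code in this PR):

```hcl
# variables.tf — expose the option with a conservative default
# (hypothetical sketch; the variable name follows the one added in this PR)
variable "monitor_enable_anonymous_user" {
  description = "Enable anonymous user access to Grafana"
  default     = false
}

# main.tf — render tidb-cluster-values.yaml from the .tpl template,
# substituting the terraform variable into the helm values
data "template_file" "tidb_cluster_values" {
  template = file("${path.module}/templates/tidb-cluster-values.yaml.tpl")

  vars = {
    monitor_enable_anonymous_user = var.monitor_enable_anonymous_user
  }
}
```

Because terraform apply re-renders the values file from the template, direct edits to the generated values.yaml are overwritten on the next apply.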

Check List

Tests

  • Manual test (add detailed scripts or steps below)
$ terraform apply

@AstroProfundis @jlerche @tennix PTAL

aylei added 2 commits May 14, 2019 15:35
Signed-off-by: Aylei <rayingecho@gmail.com>
Signed-off-by: Aylei <rayingecho@gmail.com>
@aylei aylei changed the title Refine aliyun cloud tidb configurations Refine aliyun and aws cloud tidb configurations May 14, 2019
Signed-off-by: Aylei <rayingecho@gmail.com>
@@ -105,3 +105,7 @@ variable "tikv_root_volume_size" {
default = "100"
}

variable "monitor_enable_anonymous_user" {
Contributor

Why create this variable in terraform? Can't the user change it directly in the helm values? If we start doing this, we will eventually have to create a terraform variable for every helm configuration.

Contributor Author

Yes, I don't like this direction either, but I cannot figure out a better way.

The problem is that if we guide the user to customize the values.yaml, we also have to document that their modifications will be overridden once they run terraform apply again, which renders a new values.yaml from the tf vars. This can be troublesome and error-prone for users.

I do like the direction that the terraform scripts should only bootstrap the cluster and that users should eventually maintain the tidb cluster by application-level means like values.yaml; sadly, we haven't yet solved the cluster scaling problem outside terraform, as mentioned in #436.

What do you think?

Contributor

Can they change tidb-cluster-values.yaml.tpl directly and use that as their values file?

Contributor

In the long term, the current flow of information is backwards. Terraform should just provision the K8s clusters and node pools.

  1. terraform installs the K8s cluster (actually we should also support attaching our node pools to an existing cluster)
  2. helm install
  3. refresh terraform with helm values (number of pd, tikv, tidb)

Then maintenance is steps 2 & 3:

  • helm upgrade
  • refresh-script
  • terraform plan
  • terraform apply

Step 3 shouldn't actually be needed; instead, just an API call or button click to scale out the node group should suffice. It's a problem that terraform wants to be in charge of setting the number of nodes in an autoscaling group, and I know that on AWS this is supposed to be fixed in a newer version of terraform.

Contributor Author

@aylei aylei May 16, 2019

I agree. As an experienced Kubernetes user, I would prefer to manage my tidb cluster this way, which is more flexible and predictable.

At the same time, encapsulating everything in a terraform script to provide a "one-click deploy" is valuable for users who are not familiar with Kubernetes and Helm, and it is an especially good fit for presentations.

So I suggest adding an option to skip the helm install of tidb-cluster in terraform to enable this terraform + helm workflow, while keeping the "one-click deploy" behavior as the default.
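Such an option could be sketched like this, assuming the chart is installed through the helm provider's helm_release resource (the variable and resource names here are hypothetical, not from this PR):

```hcl
# Hypothetical flag gating the in-terraform helm install
variable "create_tidb_cluster_release" {
  description = "Set to false to skip installing the tidb-cluster release from terraform and manage it with helm directly"
  default     = true
}

# The release is only created when the flag is true; with the flag off,
# terraform still bootstraps the infrastructure and tidb-operator
resource "helm_release" "tidb_cluster" {
  count     = var.create_tidb_cluster_release ? 1 : 0
  name      = "tidb-cluster"
  chart     = "pingcap/tidb-cluster"
  namespace = "tidb"
}
```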

Contributor Author

Yes, of course. I'd consider this a workaround, because it exposes implementation details to the user. Whether or not we document this method, we should provide a more formal way to leverage the full customization of helm values, for example by separating the terraform script into two modules, infrastructure & helm, as you mentioned in #494.

Contributor

I don't think we should have a goal of hiding helm values as an implementation detail: that makes it impossible to customize the existing deployment. What is the downside of having users modify the helm values config file?

Contributor Author

The helm values are not the implementation detail; the template file (tidb-cluster-values.yaml.tpl) is. If users edit the template file, they are maintaining the terraform script on their own: they have to track and merge upstream changes manually, and there's always an indirection even when the change in question doesn't require an infrastructure change (e.g. changing the config).
Having said that, it is probably not a serious problem in real-world usage. So the dominant reason I think we should eventually avoid documenting "modify the .tpl file on your own" is that #494 will expose the values.yaml to users in a more formal and flexible way (e.g. if a future deployment requires multiple tidb clusters in one kubernetes cluster).

Contributor

How does the end user make the changes they need to the helm values in the existing setup? We do need a way to handle this workflow now.

Member

I think we should limit this setup to deploying and managing only one tidb cluster. For managing multiple tidb clusters in a single k8s cluster, we may provide another terraform script that only sets up the k8s cluster and tidb-operator itself.

In this case, it makes more sense to configure the tidb clusters using variables.tf.

Contributor

@jlerche jlerche left a comment

LGTM

Member

@tennix tennix left a comment

LGTM

@aylei
Contributor Author

aylei commented May 16, 2019

/run-e2e-tests

@aylei
Contributor Author

aylei commented May 16, 2019

/run-e2e-tests

@tennix tennix merged commit 8df3140 into pingcap:master May 21, 2019
4 participants