
Refactor kubernetes client config initialization and fix in-cluster config #690

Merged (9 commits) on Feb 3, 2020

Conversation

@pdecat (Contributor) commented on Nov 14, 2019

This PR is an alternative to #686 that intends to refactor the Kubernetes client config initialization and fix the in-cluster configuration.

if v, ok := d.GetOk("host"); ok {
	cfg.Host = v.(string)
	overrides.ClusterInfo.Server = v.(string)
}
@pdecat (Author) commented on the diff above, Nov 14, 2019:

This change causes a regression when passing a bare IP address instead of a URL, e.g. when using this:

provider "kubernetes" {
  version = "1.10.1-dev3"

  load_config_file = false
  host             = data.google_container_cluster.cluster0.endpoint
  token            = data.google_client_config.default.access_token
  cluster_ca_certificate = base64decode(
    data.google_container_cluster.cluster0.master_auth[0].cluster_ca_certificate,
  )
}

I believe that's because overrides.ClusterInfo.Server always expects a URL: https://github.com/kubernetes/client-go/blob/v12.0.0/tools/clientcmd/api/types.go#L68-L69

cfg.Host, on the other hand, is more flexible and also accepts a bare host or host:port pair: https://github.com/kubernetes/client-go/blob/v12.0.0/rest/config.go#L53-L57
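
For illustration only, here is a minimal sketch (not the provider's actual code; normalizeHost is a made-up helper) of how a bare host or IP could be turned into a URL before being assigned to overrides.ClusterInfo.Server:

package main

import (
	"fmt"
	"net/url"
	"strings"
)

// normalizeHost turns a bare host or IP such as "35.190.0.1" into an https
// URL that the clientcmd overrides accept, while leaving full URLs untouched.
func normalizeHost(host string) (string, error) {
	if !strings.Contains(host, "://") {
		host = "https://" + host
	}
	u, err := url.Parse(host)
	if err != nil {
		return "", fmt.Errorf("invalid host %q: %v", host, err)
	}
	return u.String(), nil
}

func main() {
	for _, h := range []string{"35.190.0.1", "https://35.190.0.1:443"} {
		s, err := normalizeHost(h)
		fmt.Println(s, err) // "https://35.190.0.1 <nil>" then "https://35.190.0.1:443 <nil>"
	}
}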

@pdecat (Author) added:

A work-around is to use this:

  host             = "https://${data.google_container_cluster.cluster0.endpoint}:443"

But I'm experimenting with ways to avoid this.

@pdecat (Author) added:

Resolved.

@pdecat (Author) added:

Not the proper way to do it.

@pdecat (Contributor, Author) commented on Nov 16, 2019:

So far, I've tested this successfully in the following situations:

A subset of acceptance tests

# make testacc TEST=./kubernetes TESTARGS='-run=TestAccKubernetesConfigMap -count=1'
==> Checking that code complies with gofmt requirements...
TF_ACC=1 go test ./kubernetes -v -run=TestAccKubernetesConfigMap -count=1 -timeout 120m
go: finding k8s.io/client-go v12.0.0+incompatible
go: finding k8s.io/client-go v12.0.0+incompatible
go: finding github.com/terraform-providers/terraform-provider-google v2.17.0+incompatible
go: finding github.com/terraform-providers/terraform-provider-aws v2.32.0+incompatible
go: finding github.com/terraform-providers/terraform-provider-google v2.17.0+incompatible
go: finding github.com/terraform-providers/terraform-provider-aws v2.32.0+incompatible
=== RUN   TestAccKubernetesConfigMap_basic
--- PASS: TestAccKubernetesConfigMap_basic (1.38s)
=== RUN   TestAccKubernetesConfigMap_importBasic
--- PASS: TestAccKubernetesConfigMap_importBasic (0.57s)
=== RUN   TestAccKubernetesConfigMap_binaryData
--- PASS: TestAccKubernetesConfigMap_binaryData (0.79s)
=== RUN   TestAccKubernetesConfigMap_generatedName
--- PASS: TestAccKubernetesConfigMap_generatedName (0.49s)
=== RUN   TestAccKubernetesConfigMap_importGeneratedName
--- PASS: TestAccKubernetesConfigMap_importGeneratedName (0.57s)
PASS
ok      github.com/terraform-providers/terraform-provider-kubernetes/kubernetes 3.832s

Using kubeconfig against GKE 1.13.11-gke.5

provider "kubernetes" {
  version = "1.10.1-dev6"

  config_path = var.gke_kubeconfig_gitlab-ci
}

Using access token against GKE 1.13.11-gke.14

provider "kubernetes" {
  version = "1.10.1-dev6"

  load_config_file = false
  host             = data.google_container_cluster.cluster0.endpoint
  token            = data.google_client_config.default.access_token
  cluster_ca_certificate = base64decode(
    data.google_container_cluster.cluster0.master_auth[0].cluster_ca_certificate,
  )
}

In-cluster with minikube and kubernetes 1.16

https://github.com/terraform-providers/terraform-provider-kubernetes/pull/690/files#diff-52766c12b9ab0b44de46fcc3af48a543R10-R17

Cross-cluster with minikube and kubernetes 1.16 targeting minikube and kubernetes 1.15

https://github.com/terraform-providers/terraform-provider-kubernetes/pull/690/files#diff-2841352a525f3b955007a50733253d5bR11-R18
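
For readers less familiar with the in-cluster API that these two scenarios exercise, here is a minimal standalone sketch (not code from this PR) of how client-go picks up in-cluster credentials; it only prints the resolved API server address:

package main

import (
	"fmt"

	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

func main() {
	// InClusterConfig reads the service account token and CA certificate that
	// Kubernetes mounts under /var/run/secrets/kubernetes.io/serviceaccount,
	// plus the KUBERNETES_SERVICE_HOST / KUBERNETES_SERVICE_PORT environment
	// variables, so it only works when run inside a pod.
	cfg, err := rest.InClusterConfig()
	if err != nil {
		panic(err)
	}
	clientset, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		panic(err)
	}
	_ = clientset // a real program would make API calls here
	fmt.Println("API server:", cfg.Host)
}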

Configurations not tested yet

  • username / password
  • client_key
  • exec
  • insecure
  • mixes of all of the above and kubeconfig files

@alexsomesan (Member) commented:
I'll take this one for a ride in CI, across the three major cloud platforms.

@houqp (Contributor) commented on Dec 2, 2019:

@pdecat @alexsomesan what's the state of this PR? Is there anything I could do to help move it forward?

@pdecat (Contributor, Author) commented on Dec 2, 2019:

At this stage, I believe it mostly needs more testing and a complete review.

@pdecat changed the title from "WIP: refactor kubernetes client config initialization and fix in-cluster config" to "Refactor kubernetes client config initialization and fix in-cluster config" on Dec 2, 2019
@houqp (Contributor) commented on Jan 16, 2020:

@alexsomesan anything we could do to help move this forward?

@alexsomesan (Member) commented:
I've tested this and it works as specified. I'm a bit on the fence about whether we should fail when running in-cluster unless load_config_file is set, but the alternative of automatically picking up the in-cluster config would surely confuse a lot of inexperienced users. Overall, I agree this is the right compromise.

On the other hand, I have one suggestion about the example based on a Job. I think it should be more explicit about the fact that it is not illustrating the default in-cluster scenario, and explain that external credentials are being injected and an external API endpoint is being used. I have a feeling the in-cluster API is little-known enough that this clarification will save confusion going forward.

@alexsomesan (Member) commented:
@houqp, aiming to add this to the upcoming release, likely on Monday.

@pdecat (Contributor, Author) commented on Jan 31, 2020:

> I've tested this and it works as specified.

Good news 🎉

> I'm a bit on the fence about whether we should fail when running in-cluster unless load_config_file is set, but the alternative of automatically picking up the in-cluster config would surely confuse a lot of inexperienced users. Overall, I agree this is the right compromise.

What about adding explicit messages with INFO or WARN log level during provider initialization to give users hints about where the configuration came from and why in-cluster config was not tried?
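
As an illustration only (this is not code from the PR, and the variable names are made up), such hints could look something like this, using the log.Printf("[LEVEL] ...") convention that Terraform providers rely on for leveled logging:

package main

import "log"

func main() {
	// Hypothetical flags standing in for the provider's resolved settings.
	loadConfigFile := false
	explicitCredentials := true

	switch {
	case loadConfigFile:
		log.Printf("[INFO] Kubernetes provider: using configuration from a kubeconfig file")
	case explicitCredentials:
		log.Printf("[INFO] Kubernetes provider: using credentials from provider attributes")
		log.Printf("[WARN] Kubernetes provider: in-cluster configuration was not attempted because explicit credentials were supplied")
	default:
		log.Printf("[INFO] Kubernetes provider: attempting in-cluster configuration")
	}
}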

> On the other hand, I have one suggestion about the example based on a Job. I think it should be more explicit about the fact that it is not illustrating the default in-cluster scenario, and explain that external credentials are being injected and an external API endpoint is being used. I have a feeling the in-cluster API is little-known enough that this clarification will save confusion going forward.

We can indeed expand a bit on the description of the job example.

Maybe referencing some other existing documentation on the matter could help.

Feel free to add what you think is best; I'm not sure I'll have time to address this before Monday.

@alexsomesan (Member) commented:
Sounds good, @pdecat. I can take care of the changes.
I like the idea of exposing the config source with log messages.

@alexsomesan (Member) commented:
I've given it more thought and I think the examples included here all have their merits. I'd like to keep them as they are. I'll follow up with a simpler in-cluster example in addition to these.

@alexsomesan (Member) left a review:

Tested manually and in CI. Works as expected.

@pdecat deleted the fix-in-cluster branch on Feb 3, 2020
@pdecat restored the fix-in-cluster branch on Feb 18, 2020
The conversation was locked and limited to collaborators on Apr 21, 2020
@pdecat deleted the fix-in-cluster branch on Jul 16, 2021
Development

Successfully merging this pull request may close these issues.

Unauthorized after update to v1.10 using token auth
3 participants