Error downloading the cluster config - config.zip: no such file or directory #2806

Closed
ocofaigh opened this issue Jul 1, 2021 · 12 comments
Labels: GoldenEye, service/Kubernetes Service

Comments

@ocofaigh
Contributor

ocofaigh commented Jul 1, 2021

We hit a scenario where two ibm_container_cluster_config data sources that run at the same time appear to clash with each other and produce the error below:

╷
│ Error: Error downloading the cluster config [c3en2rss08p3fuce3dk0]: open /Users/conall/3b823a3d3a844d109ac91ffba785f0b3eef39cb03dd513926c4e396a81f755e7_c3en2rss08p3fuce3dk0_k8sconfig/config.zip: no such file or directory
│ 
│   with data.ibm_container_cluster_config.cluster_config,
│   on main.tf line 96, in data "ibm_container_cluster_config" "cluster_config":
│   96: data "ibm_container_cluster_config" "cluster_config" {
│ 
╵

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or other comments that do not add relevant new information or questions, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Terraform CLI and Terraform IBM Provider Version

$ terraform -v
Terraform v0.15.3
on darwin_amd64
+ provider registry.terraform.io/hashicorp/helm v2.2.0
+ provider registry.terraform.io/hashicorp/kubernetes v2.3.2
+ provider registry.terraform.io/hashicorp/local v2.1.0
+ provider registry.terraform.io/hashicorp/null v3.1.0
+ provider registry.terraform.io/hashicorp/time v0.7.1
+ provider registry.terraform.io/ibm-cloud/ibm v1.27.1

Affected Resource(s)

  • ibm_container_cluster_config

Terraform Configuration Files

Please include all Terraform configurations required to reproduce the bug. Bug reports without a functional reproduction may be closed without investigation.

terraform {
  required_version = ">= 0.15"
  required_providers {
    ibm = {
      source  = "ibm-cloud/ibm"
      version = ">= 1.27.0"
    }
  }
}

provider "ibm" {
  ibmcloud_api_key = var.ibmcloud_api_key
}

variable "ibmcloud_api_key" {
  description = "APIkey that's associated with the account to use, set via environment variable TF_VAR_ibmcloud_api_key"
  type        = string
  sensitive   = true
}

variable "cluster_id" {
  type        = string
  description = "OpenShift cluster ID"
}

data "ibm_container_cluster_config" "cluster" {
  cluster_name_id   = var.cluster_id
}

data "ibm_container_cluster_config" "cluster_config" {
  cluster_name_id   = var.cluster_id
}

Expected Behavior

The provider should handle this scenario without producing an error.

Actual Behavior

The following error:

╷
│ Error: Error downloading the cluster config [c3en2rss08p3fuce3dk0]: open /Users/conall/3b823a3d3a844d109ac91ffba785f0b3eef39cb03dd513926c4e396a81f755e7_c3en2rss08p3fuce3dk0_k8sconfig/config.zip: no such file or directory
│ 
│   with data.ibm_container_cluster_config.cluster_config,
│   on main.tf line 96, in data "ibm_container_cluster_config" "cluster_config":
│   96: data "ibm_container_cluster_config" "cluster_config" {
│ 
╵

Steps to Reproduce

  1. terraform apply

@kavya498
Collaborator

kavya498 commented Jul 1, 2021

@ocofaigh ,

data "ibm_container_cluster_config" "cluster" {
  cluster_name_id   = var.cluster_id
}

data "ibm_container_cluster_config" "cluster_config" {
  cluster_name_id   = var.cluster_id
}

It looks like you are running two config data sources against the same cluster.
Can you please set a different config_dir for each data source?

data "ibm_container_cluster_config" "cluster" {
  cluster_name_id = var.cluster_id
  config_dir      = "/Users/conall/cluster"
}

data "ibm_container_cluster_config" "cluster_config" {
  cluster_name_id = var.cluster_id
  config_dir      = "/Users/conall/cluster_config"
}

@kavya498 kavya498 added the service/Kubernetes Service Issues related to Kubernetes Service Issues label Jul 1, 2021
@vburckhardt

vburckhardt commented Jul 1, 2021

@kavya498 - When 3rd party modules are used, someone assembling multiple modules together cannot change the "ibm_container_cluster_config" "cluster" block inside them. I think the IBM Cloud Provider should handle this transparently, without requiring a specific config_dir to be set.

@ocofaigh
Contributor Author

ocofaigh commented Jul 1, 2021

Using config_dir does prevent the issue, but as @vburckhardt mentioned, what happens if 3rd party modules use ibm_container_cluster_config without config_dir? We would have no control over them. Perhaps there is some way to have the provider handle this scenario?

@kavya498
Collaborator

kavya498 commented Jul 2, 2021

This happens because the data sources are read in parallel. We will look into the issue.
As a workaround, please add depends_on between the two data sources or modules, or provide a different config_dir for each data source.
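
A minimal sketch of the depends_on workaround, assuming the two data sources from the original report. An explicit dependency should make Terraform read them one after the other instead of in parallel. The module names and source paths in the second half are hypothetical and only illustrate the module-level form, which requires Terraform 0.13 or later.

```hcl
# Data source form: the second lookup waits for the first one.
data "ibm_container_cluster_config" "cluster" {
  cluster_name_id = var.cluster_id
}

data "ibm_container_cluster_config" "cluster_config" {
  cluster_name_id = var.cluster_id

  # Serializes the two reads so they do not write config.zip concurrently.
  depends_on = [data.ibm_container_cluster_config.cluster]
}

# Module form (hypothetical modules that each contain their own
# ibm_container_cluster_config data source and cannot be edited).
module "first_consumer" {
  source     = "./modules/first_consumer" # hypothetical path
  cluster_id = var.cluster_id
}

module "second_consumer" {
  source     = "./modules/second_consumer" # hypothetical path
  cluster_id = var.cluster_id

  # Everything in this module is evaluated after module.first_consumer.
  depends_on = [module.first_consumer]
}
```

Note that a module-level depends_on may defer the data sources inside that module until apply time, so plans can show more "known after apply" values than before.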

@ocofaigh
Contributor Author

ocofaigh commented Oct 4, 2021

This may have been a side effect of the race conditions that were fixed in IBM-Cloud/bluemix-go#318.
Going to test it out.

@ocofaigh
Contributor Author

ocofaigh commented Oct 5, 2021

No joy. When we remove the workaround, the issue reproduces:

TestTerraformDeployProxyExample 2021-10-05T08:55:34Z logger.go:66: │ Error: Error downloading the cluster config [c5e0l86r0c1fl881fmpg]: open /root/bc8ada65bbaa80d441045db3daf5db99a1bcb168a3fcac6aa31a5451c03eceb1_c5e0l86r0c1fl881fmpg_k8sconfig/config.zip: no such file or directory
TestTerraformDeployProxyExample 2021-10-05T08:55:34Z logger.go:66: │ 
TestTerraformDeployProxyExample 2021-10-05T08:55:34Z logger.go:66: │   with module.deploy_proxy.module.kubeconfig.data.ibm_container_cluster_config.cluster_config,
TestTerraformDeployProxyExample 2021-10-05T08:55:34Z logger.go:66: │   on .terraform/modules/deploy_proxy.kubeconfig/main.tf line 18, in data "ibm_container_cluster_config" "cluster_config":
TestTerraformDeployProxyExample 2021-10-05T08:55:34Z logger.go:66: │   18: data "ibm_container_cluster_config" "cluster_config" {
TestTerraformDeployProxyExample 2021-10-05T08:55:34Z logger.go:66: │ 
TestTerraformDeployProxyExample 2021-10-05T08:55:34Z logger.go:66: ╵
TestTerraformDeployProxyExample 2021-10-05T08:55:34Z retry.go:99: Returning due to fatal error: FatalError{Underlying: error while running command: exit status 1; ╷
│ Error: Error downloading the cluster config [c5e0l86r0c1fl881fmpg]: open /root/bc8ada65bbaa80d441045db3daf5db99a1bcb168a3fcac6aa31a5451c03eceb1_c5e0l86r0c1fl881fmpg_k8sconfig/config.zip: no such file or directory
│ 
│   with module.deploy_proxy.module.kubeconfig.data.ibm_container_cluster_config.cluster_config,
│   on .terraform/modules/deploy_proxy.kubeconfig/main.tf line 18, in data "ibm_container_cluster_config" "cluster_config":
│   18: data "ibm_container_cluster_config" "cluster_config" {
│ 
╵}
    destroy.go:11: 
        	Error Trace:	destroy.go:11
        	            				terraform_deploy_proxy_example_test.go:50
        	Error:      	Received unexpected error:
        	            	FatalError{Underlying: error while running command: exit status 1; ╷
        	            	│ Error: Error downloading the cluster config [c5e0l86r0c1fl881fmpg]: open /root/bc8ada65bbaa80d441045db3daf5db99a1bcb168a3fcac6aa31a5451c03eceb1_c5e0l86r0c1fl881fmpg_k8sconfig/config.zip: no such file or directory
        	            	│ 
        	            	│   with module.deploy_proxy.module.kubeconfig.data.ibm_container_cluster_config.cluster_config,
        	            	│   on .terraform/modules/deploy_proxy.kubeconfig/main.tf line 18, in data "ibm_container_cluster_config" "cluster_config":
        	            	│   18: data "ibm_container_cluster_config" "cluster_config" {
        	            	│ 
        	            	╵}
        	Test:       	TestTerraformDeployProxyExample

For now, we will continue to use the workaround (that is, providing a different config_dir for each data source).

@vburckhardt

Opened a PR at IBM-Cloud/bluemix-go#328 that appears to fix the issue.

@ocofaigh
Contributor Author

@kavya498 We have found an issue with the workaround...

If you explicitly set a config_dir value and then delete that directory after a successful apply, any subsequent plan / apply / destroy attempt fails with:

Error: Kubernetes cluster unreachable: invalid configuration: no configuration has been provided, try setting KUBERNETES_MASTER environment variable

However, if you do not set config_dir and let the ibm_container_cluster_config data lookup use the default directory, the kubeconfig file and directory are recreated on every plan / apply / destroy if they no longer exist.
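
For context, the "Kubernetes cluster unreachable" error above is raised by a provider that reads the kubeconfig written by the data source. A sketch of that wiring, assuming the helm provider is configured from the data source's config_file_path attribute. The thread does not show the actual provider block, so the directory value and block layout here are illustrative only.

```hcl
data "ibm_container_cluster_config" "cluster_config" {
  cluster_name_id = var.cluster_id

  # Illustrative value: the directory must already exist when the
  # data source is read, otherwise no kubeconfig is written.
  config_dir = "/Users/conall/cluster_config"
}

provider "helm" {
  kubernetes {
    # If the kubeconfig is missing, the provider has no configuration
    # to load and reports the cluster as unreachable.
    config_path = data.ibm_container_cluster_config.cluster_config.config_file_path
  }
}
```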

Can you help push to get IBM-Cloud/bluemix-go#328 merged so we can stop using the broken workaround? Thanks

@hkantare
Collaborator

@ocofaigh We are reviewing the PR, and the fix will be available as part of next week's release.

@ocofaigh
Contributor Author

ocofaigh commented Oct 22, 2021

The reason we saw the issue described in the comment above, where the kubeconfig was not regenerated when using config_dir, is that we were creating the directory with Terraform. If that directory is deleted, a subsequent terraform plan does not recreate it, so the data lookup fails because it cannot find the directory to write the kubeconfig file into.

So what we need to do to ensure we don't hit either of the issues is, for the value of config_dir:

@ocofaigh
Contributor Author

Latest updates on the permanent fix -> IBM-Cloud/bluemix-go#328 (comment)

@ocofaigh
Contributor Author

ocofaigh commented Nov 1, 2021

Fix looks good in v1.35.0 (#3264). Closing

@ocofaigh ocofaigh closed this as completed Nov 1, 2021