[BUG] terraform state gets corrupted without ability to restore #1082

Open
im-dim opened this issue Aug 30, 2023 · 4 comments
@im-dim
im-dim commented Aug 30, 2023

Contact us

For any immediate issues or help, reach out to us at NetScaler-AutomationToolkit@cloud.com!

Bug Report

I get the error below if a configuration object was created by Terraform and later deleted from the VPX (for any reason).

Error: [ERROR] FindResourceArrayWithParams: non zero errorcode 461

After that you can't apply or destroy anything. It looks like the only way to continue is to delete the Terraform state, rebuild the VPX, and then re-apply the Citrix config.

To Reproduce
Steps to reproduce the behavior:

  1. For example, bind an ECC curve to the VPN vservers:
    resource "citrixadc_sslvserver_ecccurve_binding" "P_256" {
      for_each = {
        for vpnvserver in local.vservers : "${vpnvserver.product_name}-${vpnvserver.product_env}" => vpnvserver
      }
      ecccurvename = "P_256"
      vservername  = citrixadc_vpnvserver.vpnvserver["${each.value.product_name}-${each.value.product_env}"].name
    }

  2. Remove the ECC curve binding from the config on the device:
    unbind ssl vserver XXX -eccCurveName P_256
    unbind ssl vserver YYY -eccCurveName P_256

  3. Run terraform apply

  4. Error I am facing on the console


**Expected behaviour**
Config object should be re-created if it's not found on the device.

**Environment (please fill the following information):**
 - OS: Mac
 - Terraform v1.4.0
 - terraform-provider-citrixadc version 1.36.0
 - go1.19 darwin/arm64

@ravager-dk
Contributor

This is actually a misconception about how Terraform works. Yes, some providers can handle this situation, but for complex resource types this becomes problematic. The correct flow is to remove the deleted resources from the state using `terraform state rm`: https://developer.hashicorp.com/terraform/cli/commands/state/rm
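A minimal sketch of that flow for the binding in this report. The `for_each` keys (`app1-prod`, `app2-prod`) are hypothetical, since the contents of `local.vservers` are not shown in the thread, and the `run` and `binding_address` helpers exist only for this illustration. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Hypothetical for_each keys ("<product_name>-<product_env>"); the real
# ones come from local.vservers, which the report does not show.
KEYS="app1-prod app2-prod"

# Build the state address of one instance of the for_each resource.
binding_address() {
  printf 'citrixadc_sslvserver_ecccurve_binding.P_256["%s"]' "$1"
}

# Print the command by default; set DRY_RUN=0 to execute it instead.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Drop each stale binding from the state. Nothing is changed on the device.
for key in $KEYS; do
  run terraform state rm "$(binding_address "$key")"
done
```

When typing the command by hand, wrap the address in single quotes so the shell keeps the inner double quotes. Once the stale entries are gone from the state, the next `terraform apply` should plan the bindings as new resources.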

An alternative solution is to recreate the resource in the NetScaler directly and then import it into the Terraform state.
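A sketch of the import route, after the binding has been recreated on the NetScaler by hand. The import ID format `<vservername>,<ecccurvename>` is an assumption here and should be checked against the provider documentation for the version in use. The key `app1-prod` is hypothetical, and `XXX` is the placeholder vserver name from the report.

```shell
#!/bin/sh
# ASSUMPTION: the provider expects "<vservername>,<ecccurvename>" as the
# import ID for this binding. Verify against the provider docs before use.
binding_import_id() {
  printf '%s,%s' "$1" "$2"
}

KEY="app1-prod"   # hypothetical for_each key
VSERVER="XXX"     # placeholder vserver name, as in the report
ADDRESS="citrixadc_sslvserver_ecccurve_binding.P_256[\"$KEY\"]"

# Print the command rather than running it, so the sketch is safe to execute.
printf "terraform import '%s' '%s'\n" "$ADDRESS" "$(binding_import_id "$VSERVER" P_256)"
```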

Using Terraform to continuously configure your infrastructure requires you to only make changes to the resources through Terraform.

@im-dim
Author

im-dim commented Sep 11, 2023

> This is actually a misconception about how Terraform works. Yes, some providers can handle this situation, but for complex resource types this becomes problematic. The correct flow is to remove the deleted resources from the state using "Terraform state rm" https://developer.hashicorp.com/terraform/cli/commands/state/rm
>
> An alternative solution is to recreate the resource in the NetScaler directly and then import it into the Terraform state.
>
> Using Terraform to continuously configure your infrastructure requires you to only make changes to the resources through Terraform.

The reported issue is just one example. We have ended up with many corrupted tfstate files when running VPX failover tests between AZs, and the only way out was to a) delete the tfstate on S3, b) delete the lock from DynamoDB, and c) rebuild both VPXes.

To get a more stable environment, we switched to template files, but the problem there is that you can't delete some resources when parameters change (e.g. when removing a VIP).
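For reference, a hedged sketch of the S3 and DynamoDB part of that recovery, with a backup taken first. The bucket, key, and table names are hypothetical placeholders, and the `lock_key_json` and `run` helpers exist only for this illustration. It assumes the standard S3 backend layout, where the lock item is stored under `LockID` = `<bucket>/<key>`. `terraform force-unlock <LOCK_ID>` is the supported way to release a stale lock; deleting the DynamoDB item is the manual equivalent.

```shell
#!/bin/sh
# All three values are hypothetical placeholders, not taken from this thread.
BUCKET="my-tf-state"
KEY="netscaler/terraform.tfstate"
TABLE="terraform-locks"

# Build the DynamoDB key of the lock item ("<bucket>/<key>").
lock_key_json() {
  printf '{"LockID":{"S":"%s/%s"}}' "$1" "$2"
}

# Print the command by default; set DRY_RUN=0 to execute it instead.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# 1. Keep a copy of the state before touching anything.
run aws s3 cp "s3://$BUCKET/$KEY" ./terraform.tfstate.backup

# 2. Release the stale lock by deleting its DynamoDB item.
run aws dynamodb delete-item --table-name "$TABLE" --key "$(lock_key_json "$BUCKET" "$KEY")"
```

Keeping the backup makes it possible to remove only the stale entries from the state later instead of deleting the whole file.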

@im-dim
Author

im-dim commented Sep 14, 2023

You should be able to reproduce this issue by, say, "corrupting" both VPXes (terminating the instances) and then trying to rebuild them by re-running Terraform...

This should produce a number of errors like the ones below; Terraform will fail, and you'll be left with a CORRUPTED tfstate.

 Error: [ERROR] FindResourceArrayWithParams: non zero errorcode 344
 Error: [ERROR] FindResourceArrayWithParams: non zero errorcode 461

And the only way to restore the environment is to delete the tfstate on S3, delete the lock from DynamoDB, and re-init the environment.

Is that expected?

Shouldn't the absence of a resource be detected and a new one created, as is done for any other AWS resource?
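For the rebuilt-VPX scenario, one option short of deleting the whole tfstate is to drop only the `citrixadc_*` entries from it, following the `terraform state rm` suggestion earlier in the thread. A minimal sketch, assuming every NetScaler object in the state is stale after the rebuild; the helper names `only_citrixadc` and `remove_from_state` are made up for this illustration, and the script prints commands unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Keep only addresses of citrixadc_* entries, including ones inside
# modules. Data sources of the provider match as well.
only_citrixadc() {
  grep -E '(^|\.)citrixadc_[a-z0-9_]+\.' || true
}

# Read addresses on stdin; print (or run) one `terraform state rm` each.
remove_from_state() {
  while IFS= read -r addr; do
    if [ "${DRY_RUN:-1}" = "1" ]; then
      printf "terraform state rm '%s'\n" "$addr"
    else
      # </dev/null keeps terraform from consuming the address list on stdin.
      terraform state rm "$addr" </dev/null
    fi
  done
}

# Usage (needs an initialised working directory):
#   terraform state list | only_citrixadc | remove_from_state
```

The AWS resources stay in the state, so the next `terraform apply` should plan only the NetScaler configuration as new.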

@kaiAsmOne

I do believe there is some gold for you to discover in @ravager-dk's comment. I do LARGE deploys in Azure and I do not have this issue, but I do have to do Terraform state work as ravager-dk suggests.

My roadmap is to move to canary or blue/green deploys only in the future. Whenever I make a change to a NetScaler, I deploy a fresh NetScaler from code. Once it is deployed, I switch the Azure LB in front of the NetScaler over to it. Blue/green or canary will make issues like this nonexistent.
