Using element(blah.*.param, count.index) sometimes returns the wrong element #795

Closed
knuckolls opened this issue Jan 14, 2015 · 2 comments · Fixed by #1016

@knuckolls
Contributor

I have been hitting a problem where roughly 30% of runs fail intermittently because of a tainted resource. It took me a while to get a test case with the right logging in place to prove what was wrong. The bug exists in both master and the f-formalize-interps branch. I have a branch with my fix that I'll PR to master. It may be a quick fix for the f-formalize-interps branch as well, if this is the sort of solution we want to merge.

Anyway, here's the problem. We use a VPC, so we have to connect to the private_ip of the instance for provisioning. Unfortunately, this bug may crop up in other places, so it needs a bigger fix than just adding a private_ip toggle for AWS.

With this configuration:

resource "aws_instance" "zookeeper_aws_instance" {
    count = "${var.zookeeper_aws_count}"

    ami             = "${var.ami}"
    instance_type   = "${var.zookeeper_aws_instance_type}"
    key_name        = "${var.key_name}"
    security_groups = ["${var.aws_security_group}"]
    subnet_id       = "${var.aws_subnet_id}"

    connection {
        host     = "${element(aws_instance.zookeeper_aws_instance.*.private_ip, count.index)}"
        key_file = "${var.key_file}"
        user     = "ubuntu"
        timeout  = "30s"
    }

    provisioner "file" {
        source = "${path.module}/scripts/foo.sh"
        destination = "/tmp/foo.sh"
    }
    provisioner "remote-exec" {
        inline = [
            "chmod +x /tmp/foo.sh",
            "sudo /tmp/foo.sh"
        ]
    }
}

This will sometimes copy the file to the wrong host, because the element interpolation in the connection block returns the wrong value. Then, when it tries to remote-exec that script on the correct host, it fails because the script is not found.

Here's a run that shows the problem, taken from the f-formalize-interps branch with added logging. You can see what I added here:

Banno@bb9feea

2015/01/14 11:39:40 ==> interpolationFuncElement: args => []interface {}{"10.10.1.75", "1"}
module.env.aws_instance.zookeeper_aws_instance.1: Provisioning with 'file'...
module.env.aws_instance.zookeeper_aws_instance.1 (file): private_ip attribute: "10.10.1.75"
module.env.aws_instance.zookeeper_aws_instance.1 (file): connection:host attribute: "10.10.1.75"
<snip>

^^ instance.1 provisions to the correct node.

2015/01/14 11:39:45 ==> interpolationFuncElement: args => []interface {}{"10.10.1.75B780FFEC-B661-4EB8-9236-A01737AD98B610.10.1.49", "2"}
module.env.aws_instance.zookeeper_aws_instance.2: Provisioning with 'file'...
module.env.aws_instance.zookeeper_aws_instance.2 (file): private_ip attribute: "10.10.1.49"
module.env.aws_instance.zookeeper_aws_instance.2 (file): connection:host attribute: "10.10.1.75"
<snip>

^^ instance.2 is where we see the bug. It provisions to the wrong node: its private_ip is 10.10.1.49, but connection:host resolves to 10.10.1.75.

Below is the rest of the run.

2015/01/14 11:39:49 ==> interpolationFuncElement: args => []interface {}{"10.10.1.75B780FFEC-B661-4EB8-9236-A01737AD98B610.10.1.49", "1"}
module.env.aws_instance.zookeeper_aws_instance.1: Provisioning with 'remote-exec'...
2015/01/14 11:39:49 terraform-provisioner-remote-exec: 2015/01/14 11:39:49 reconnecting to TCP connection for SSH
module.env.aws_instance.zookeeper_aws_instance.1 (remote-exec): Connecting to remote host via SSH...
module.env.aws_instance.zookeeper_aws_instance.1 (remote-exec):   Host: 10.10.1.49
module.env.aws_instance.zookeeper_aws_instance.1 (remote-exec):   User: ubuntu
module.env.aws_instance.zookeeper_aws_instance.1 (remote-exec):   Password: false
module.env.aws_instance.zookeeper_aws_instance.1 (remote-exec):   Private key: true
2015/01/14 11:39:49 terraform-provisioner-file: 2015/01/14 11:39:49 opening new ssh session
<snip>
2015/01/14 11:39:50 ==> interpolationFuncElement: args => []interface {}{"10.10.1.75B780FFEC-B661-4EB8-9236-A01737AD98B610.10.1.49", "2"}
module.env.aws_instance.zookeeper_aws_instance.2: Provisioning with 'remote-exec'...
module.env.aws_instance.zookeeper_aws_instance.2 (remote-exec): Connecting to remote host via SSH...
module.env.aws_instance.zookeeper_aws_instance.2 (remote-exec):   Host: 10.10.1.75
module.env.aws_instance.zookeeper_aws_instance.2 (remote-exec):   User: ubuntu
module.env.aws_instance.zookeeper_aws_instance.2 (remote-exec):   Password: false
module.env.aws_instance.zookeeper_aws_instance.2 (remote-exec):   Private key: true
<snip>
module.env.aws_instance.zookeeper_aws_instance.1 (remote-exec): Connected! Executing scripts...
<snip>
module.env.aws_instance.zookeeper_aws_instance.2 (remote-exec): Connected! Executing scripts...
<snip>
2015/01/14 11:39:51 ==> interpolationFuncElement: args => []interface {}{"10.10.1.102B780FFEC-B661-4EB8-9236-A01737AD98B610.10.1.75B780FFEC-B661-4EB8-9236-A01737AD98B610.10.1.49", "0"}
module.env.aws_instance.zookeeper_aws_instance.0: Provisioning with 'file'...
module.env.aws_instance.zookeeper_aws_instance.0 (file): private_ip attribute: "10.10.1.102"
module.env.aws_instance.zookeeper_aws_instance.0 (file): connection:host attribute: "10.10.1.102"
2015/01/14 11:39:51 terraform-provisioner-remote-exec: 2015/01/14 11:39:51 SCP session complete, closing stdin pipe.
2015/01/14 11:39:51 terraform-provisioner-remote-exec: 2015/01/14 11:39:51 Waiting for SSH session to complete.
module.env.aws_instance.zookeeper_aws_instance.1 (remote-exec): sudo: unable to resolve host ip-10-10-1-49
module.env.aws_instance.zookeeper_aws_instance.1 (remote-exec): sudo: /tmp/foo.sh: command not found
2015/01/14 11:39:51 terraform-provisioner-remote-exec: 2015/01/14 11:39:51 remote command exited with '1': /tmp/script.sh
2015/01/14 11:39:51 [ERROR] Error walking 'aws_instance.zookeeper_aws_instance.1': 1 error(s) occurred:

* Script exited with non-zero exit status: 1
module.env.aws_instance.zookeeper_aws_instance.1: Creation complete
<snip>
module.env.aws_instance.zookeeper_aws_instance.2: Creation complete
<snip>
2015/01/14 11:39:56 ==> interpolationFuncElement: args => []interface {}{"10.10.1.102B780FFEC-B661-4EB8-9236-A01737AD98B610.10.1.49", "0"}
module.env.aws_instance.zookeeper_aws_instance.0: Provisioning with 'remote-exec'...
module.env.aws_instance.zookeeper_aws_instance.0 (remote-exec): Connecting to remote host via SSH...
<snip>
module.env.aws_instance.zookeeper_aws_instance.0 (remote-exec):   Host: 10.10.1.102
module.env.aws_instance.zookeeper_aws_instance.0 (remote-exec):   User: ubuntu
module.env.aws_instance.zookeeper_aws_instance.0 (remote-exec):   Password: false
module.env.aws_instance.zookeeper_aws_instance.0 (remote-exec):   Private key: true
<snip>
2015/01/14 11:39:58 [INFO] Apply walk complete
2015/01/14 11:39:58 [INFO] Writing backup state to: terraform.tfstate.backup
<snip>
module.env.aws_instance.zookeeper_aws_instance.0: Creation complete
2015/01/14 11:39:58 waiting for all plugin processes to complete...
Error applying plan:

1 error(s) occurred:

* Script exited with non-zero exit status: 1

Terraform does not automatically rollback in the face of errors.
Instead, your Terraform state file has been partially updated with
any resources that successfully completed. Please address the error
above and apply again to incrementally change your infrastructure.
<snip>

As you can see, when resources are applied in parallel and use element() interpolation, there is a chance that not all of the resources have been fully initialized, so "aws_instance.zookeeper_aws_instance.*.private_ip" doesn't return all of the instances as you'd expect. The list that element() receives is then shorter than count, and count.index no longer lines up with the instance it is supposed to select.

You can see the fix I wrote on master in #794. I think it solves the problem for all cases, but there may be an edge case I've missed. All the fix does is preserve the correct length of the list by filling in the uninitialized instances with a placeholder.
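The placeholder idea can be sketched roughly as follows. This is an illustration only, not the code in #794: `paddedList`, `placeholder`, and the map of known values are hypothetical names, and `element` again assumes wrap-around indexing.

```go
package main

import "fmt"

// placeholder is a hypothetical marker for an instance whose attribute
// has not been computed yet.
const placeholder = "<pending>"

// paddedList builds a list of exactly count entries. Each known value is
// placed at its own instance index; missing instances get the placeholder.
func paddedList(known map[int]string, count int) []string {
	out := make([]string, count)
	for i := range out {
		if v, ok := known[i]; ok {
			out[i] = v
		} else {
			out[i] = placeholder
		}
	}
	return out
}

// element is a simplified stand-in for element() with wrap-around indexing.
func element(list []string, index int) string {
	return list[index%len(list)]
}

func main() {
	// Same situation as in the log: instance.0 is not initialized yet.
	known := map[int]string{1: "10.10.1.75", 2: "10.10.1.49"}
	list := paddedList(known, 3)

	fmt.Println("instance.1 ->", element(list, 1))
	fmt.Println("instance.2 ->", element(list, 2))
	fmt.Println("instance.0 ->", element(list, 0))
}
```

Because the list always has count entries, each instance's index points at its own slot, and an instance that is not ready yet yields the placeholder rather than another instance's address.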

@mitchellh
Contributor

This is fixed in #1016, pending review.

@ghost

ghost commented May 4, 2020

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@ghost ghost locked and limited conversation to collaborators May 4, 2020