
EBS Volume Attachment Destroy/Recreate state issue in 0.7.4 #9000

Closed

sepulworld opened this issue Sep 22, 2016 · 7 comments

Comments

sepulworld commented Sep 22, 2016

Terraform Version

0.7.4

Affected Resource(s)

aws_volume_attachment

Terraform Configuration Files

resource "aws_instance" "kafka" {
  count             = "${var.kafka_node_count}"
  ami               = "${lookup(var.kafka_amis, count.index)}"
  instance_type     = "${var.kafka_instance_type}"
  iam_instance_profile = "${aws_iam_instance_profile.core_instance_profile.name}"
  vpc_security_group_ids = ["${aws_security_group.sg-bastion.id}"]
  source_dest_check = false
  subnet_id         = "${element(split(",", module.vpc.public_subnets), count.index)}"
  private_ip = "${lookup(var.kafka_private_ips, count.index)}"
  key_name          = "${var.ssh_pem}"
  depends_on        = ["aws_instance.zookeeper", "aws_instance.bastion", "aws_ebs_volume.kafka_ebs"]
  tags {
    Name = "kafka${count.index}-${var.service}-${var.environment}"
    Environment = "${var.environment}"
  }
  user_data         = "{\"consul_master\":\"consul-${var.environment}.${var.domain_name}\", \"role\": \"kafka\", \"domain\": \"${var.service}-${var.environment}\", \"cluster_name\": \"${var.service}-${var.environment}\"}"
  lifecycle {
    ignore_changes = ["user_data"]
  }
}

resource "aws_ebs_volume" "kafka_ebs" {
  count = "${var.kafka_node_count}"
  availability_zone = "${lookup(var.kafka_ebs_volume_zones, count.index)}"
  size = "${var.kafka_ebs_volume_size}"
  type = "${var.kafka_ebs_volume_type}"
}

# Can't deploy aws_volume_attachment via count because of bug: https://github.com/hashicorp/terraform/issues/3449

resource "aws_volume_attachment" "kafka_ebs_att_0" {
  device_name = "/dev/xvdz"
  volume_id = "${aws_ebs_volume.kafka_ebs.0.id}"
  instance_id = "${aws_instance.kafka.0.id}"
  provisioner "file" {
    source = "remote_scripts/setup_ebs.sh"
    destination = "/tmp/setup_ebs.sh"
    connection {
      host = "${aws_instance.kafka.0.public_ip}"
      user = "ubuntu"
      private_key = "${file("${var.ssh_pem_location}")}"
    }
  }
}

resource "aws_volume_attachment" "kafka_ebs_att_1" {
  device_name = "/dev/xvdz"
  volume_id = "${aws_ebs_volume.kafka_ebs.1.id}"
  instance_id = "${aws_instance.kafka.1.id}"
  provisioner "file" {
    source = "remote_scripts/setup_ebs.sh"
    destination = "/tmp/setup_ebs.sh"
    connection {
      host = "${aws_instance.kafka.1.public_ip}"
      user = "ubuntu"
      private_key = "${file("${var.ssh_pem_location}")}"
    }
  }
}

resource "aws_volume_attachment" "kafka_ebs_att_2" {
  device_name = "/dev/xvdz"
  volume_id = "${aws_ebs_volume.kafka_ebs.2.id}"
  instance_id = "${aws_instance.kafka.2.id}"
  provisioner "file" {
    source = "remote_scripts/setup_ebs.sh"
    destination = "/tmp/setup_ebs.sh"
    connection {
      host = "${aws_instance.kafka.2.public_ip}"
      user = "ubuntu"
      private_key = "${file("${var.ssh_pem_location}")}"
    }
  }
}

Expected Behavior

If aws_instance.kafka.0 is destroyed through the AWS console but the attached EBS volume aws_ebs_volume.kafka_ebs.0 is left intact, you would expect the next terraform apply to recreate the destroyed EC2 instance (aws_instance.kafka.0) and reattach aws_ebs_volume.kafka_ebs.0 by recreating the aws_volume_attachment.kafka_ebs_att_0 resource.

Actual Behavior

If aws_instance.kafka.0 is destroyed through the AWS console but the attached EBS volume aws_ebs_volume.kafka_ebs.0 is left intact, the subsequent terraform apply attempts to destroy and recreate aws_volume_attachment.kafka_ebs_att_0 and errors with:

  • aws_volume_attachment.kafka_ebs_att_0: Failed to detach Volume (vol-fdda0075) from Instance (i-04027ff7bb5614c41): IncorrectState: Volume 'vol-fdda0075' is in the 'available' state.

This is new behavior after upgrading from 0.7.2 to 0.7.4.

hgontijo (Contributor) commented Sep 26, 2016

@sepulworld Since you manually removed an aws_instance via the AWS console, have you tried reconciling the Terraform state, i.e. terraform state rm aws_volume_attachment.<id>, before running terraform apply?
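For reference, a minimal sketch of that flow against the config above (the attachment address comes from the original report; adjust it to whichever attachment is stale):

# Drop the stale attachment from state so Terraform stops trying to detach a
# volume that is already 'available', then let apply rebuild the instance and attachment.
terraform state rm aws_volume_attachment.kafka_ebs_att_0
terraform apply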

@kishorenc

This issue is unfortunate, especially given #2957. We re-assign EBS volumes when we rotate instances, which is quite a common pattern. Because of this issue we have to do a terraform state rm to flush Terraform's "memory" before we can re-attach a volume to a new instance.

stack72 (Contributor) commented Nov 2, 2016

Hi folks

I believe that PR #9792 takes care of this issue - it will allow us to skip the detachment of an EBS volume and let the instance take care of it.
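A minimal sketch of what that would look like against the original config, assuming the skip_destroy argument that the PR introduces:

resource "aws_volume_attachment" "kafka_ebs_att_0" {
  device_name  = "/dev/xvdz"
  volume_id    = "${aws_ebs_volume.kafka_ebs.0.id}"
  instance_id  = "${aws_instance.kafka.0.id}"

  # Skip the explicit detach on destroy; instance termination releases the volume.
  skip_destroy = true
}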

Please let me know your thoughts on this

Paul

clstokes (Contributor) commented Jan 22, 2017

@stack72 I think skip_destroy would work, but when running an apply after an instance has been destroyed or tainted, Terraform errors out seemingly before the skip_destroy logic is reached.

To reproduce:

  1. terraform apply the config linked below
  2. terraform taint aws_instance.main.1
  3. terraform apply

Error:

...
aws_volume_attachment.data_att.1: Still creating... (10s elapsed)
aws_volume_attachment.data_att.1: Creation complete
Error applying plan:

2 error(s) occurred:

* aws_volume_attachment.data_att.0: [WARN] Error attaching volume (vol-05be5996fc82db4e3) to instance (i-06fa8407fee424c26), message: "vol-05be5996fc82db4e3 is already attached to an instance", code: "VolumeInUse"
* aws_volume_attachment.data_att.2: [WARN] Error attaching volume (vol-0ac2fcd156504448c) to instance (i-0a7641c1c933cf907), message: "vol-0ac2fcd156504448c is already attached to an instance", code: "VolumeInUse"
...

Full config and log are at https://gist.github.com/clstokes/06487cb02dea5e46b538bbcfc4007dea.

This is with Terraform v0.8.4.

Updated to add Terraform version.

stack72 (Contributor) commented Feb 1, 2017

Hi all

This has been fixed by #11060.

I added all the debug info there to show this was the case, and I talked through what was happening with @clstokes.

This will be part of Terraform 0.8.6.

Paul

stack72 closed this as completed Feb 1, 2017
cl0udgeek commented Jun 26, 2017

Not sure if this is another bug or I'm just doing it wrong... but for some reason, Terraform wants to destroy all of my EBS volumes even if I just taint one node.

Config:

resource "aws_instance" "influxdata" {
  count         = "${var.ec2-count-influx-data}"
  ami           = "${module.amis.rhel73_id}"
  instance_type = "${var.ec2-type-influx-data}"

  vpc_security_group_ids = ["${var.sg-ids}"]
  subnet_id              = "${element(module.infra.subnet,count.index)}"
  key_name               = "${var.KeyName}"

  tags {
    Name               = "influx-data-node-0${count.index}"
    ASV                = "${module.infra.ASV}"
    CMDBEnvironment    = "${module.infra.CMDBEnvironment}"
    OwnerContact       = "${module.infra.OwnerContact}"
    custodian_downtime = "off"
    OwnerEid           = "${var.OwnerEid}"
  }

  connection {
    private_key = "${file("/Users/influx_east.pem")}" #qa env east
    user        = "ec2-user"
  }

  provisioner "remote-exec" {
    inline = ["echo just checking for ssh. ttyl. bye."]
  }

  provisioner "remote-exec" {
    when   = "destroy"
    inline = [
      "sudo service influx-data stop",
      "sudo unmount /dev/xvdg"
    ]
    connection {
      user = "ec2-user"
      host = "${self.private_ip}"
      private_key = "${file("/Users/influx_east.pem")}"
    }
  }
}

resource "aws_ebs_volume" "influxdata_ebs" {
  count             = "${var.ec2-count-influx-meta}"
  availability_zone = "${element(var.cds-qa-east,count.index)}"
  size = "1024"
  type = "io1"
  iops = 3000
  encrypted = true
  tags {
    Name               = "influxdata-ebs-0${count.index}"
    Sys                = "${module.infra.ASV}"
    OwnerContact       = "${module.infra.OwnerContact}"
    Owner           = "${var.OwnerEid}"
  }
}

resource "aws_volume_attachment" "influx_ebs_att" {
  count = 3
  device_name = "/dev/xvdg"
  volume_id = "${element(aws_ebs_volume.influxdata_ebs.*.id, count.index)}"
  instance_id = "${element(aws_instance.influxdata.*.id, count.index)}"
}

I run terraform taint aws_instance.influxdata.0 to rebuild just one instance.

Expected Behavior:

Terraform should stop the service, unmount the volume, detach the volume, recreate the instance, and reattach the volume.

Actual Behavior:

Terraform tries to destroy all EBS volumes, and I get this error:


3 error(s) occurred:

* aws_volume_attachment.influx_ebs_att[1] (destroy): 1 error(s) occurred:

* aws_volume_attachment.influx_ebs_att.1: Error waiting for Volume (vol-04c306280e9b6c953) to detach from Instance: i-003a4db9ccfb4af68
* aws_volume_attachment.influx_ebs_att[0] (destroy): 1 error(s) occurred:

* aws_volume_attachment.influx_ebs_att.0: Error waiting for Volume (vol-054725609c55a35d6) to detach from Instance: i-078c714d85eb77afe
* aws_volume_attachment.influx_ebs_att[2] (destroy): 1 error(s) occurred:

* aws_volume_attachment.influx_ebs_att.2: Error waiting for Volume (vol-0ccce3d93122eb233) to detach from Instance: i-0c380a9cae915d8a3

Terraform does not automatically rollback in the face of errors.
Instead, your Terraform state file has been partially updated with
any resources that successfully completed. Please address the error
above and apply again to incrementally change your infrastructure.

I'm using version 0.9.8 as well...

Here is the plan too:

"apply" is called, Terraform can't guarantee this is what will execute.

~ aws_elb.influxdata-elb
    instances.#: "" => "<computed>"

-/+ aws_instance.influxdata.0 (tainted)
    ami:                               "ami-916c4387" => "ami-916c4387"
    associate_public_ip_address:       "false" => "<computed>"
    availability_zone:                 "us-east-1a" => "<computed>"
    ebs_block_device.#:                "1" => "<computed>"
    ephemeral_block_device.#:          "0" => "<computed>"
    instance_state:                    "running" => "<computed>"
    instance_type:                     "r4.2xlarge" => "r4.2xlarge"
    ipv6_address_count:                "" => "<computed>"
    ipv6_addresses.#:                  "0" => "<computed>"
    key_name:                          "influx_east" => "influx_east"
    network_interface.#:               "0" => "<computed>"
    network_interface_id:              "eni-dae72b76" => "<computed>"
    placement_group:                   "" => "<computed>"
    primary_network_interface_id:      "eni-dae72b76" => "<computed>"
    private_dns:                       "ip-10-47-0-18.da.co.com" => "<computed>"
    private_ip:                        "10.47.0.18" => "<computed>"
    public_dns:                        "" => "<computed>"
    public_ip:                         "" => "<computed>"
    root_block_device.#:               "1" => "<computed>"
    security_groups.#:                 "0" => "<computed>"
    source_dest_check:                 "true" => "true"
    subnet_id:                         "subnet-875fadab" => "subnet-875fadab"
    tags.%:                            "6" => "6"
    tags.CMDBEnvironment:              "STREAMDATAPLATFORMAWS" => "ENVNPSTREAMDATAPLATFORMAWS"
    tags.Name:                         "influx-data-node-00" => "influx-data-node-00"
    tags.OwnerEid:                     "943" => "943"
    tenancy:                           "default" => "<computed>"
    volume_tags.%:                     "7" => "<computed>"
    vpc_security_group_ids.#:          "3" => "3"
    vpc_security_group_ids.1889494443: "sg-86a071f7" => "sg-86a071f7"
    vpc_security_group_ids.528573618:  "sg-07a57476" => "sg-07a57476"
    vpc_security_group_ids.787016340:  "sg-b5fd02c4" => "sg-b5fd02c4"

-/+ aws_volume_attachment.influx_ebs_att.0
    device_name:  "/dev/xvdg" => "/dev/xvdg"
    force_detach: "" => "<computed>"
    instance_id:  "i-078c714d85eb77afe" => "${element(aws_instance.influxdata.*.id, count.index)}" (forces new resource)
    skip_destroy: "" => "<computed>"
    volume_id:    "vol-054725609c55a35d6" => "vol-054725609c55a35d6"

-/+ aws_volume_attachment.influx_ebs_att.1
    device_name:  "/dev/xvdg" => "/dev/xvdg"
    force_detach: "" => "<computed>"
    instance_id:  "i-003a4db9ccfb4af68" => "${element(aws_instance.influxdata.*.id, count.index)}" (forces new resource)
    skip_destroy: "" => "<computed>"
    volume_id:    "vol-04c306280e9b6c953" => "vol-04c306280e9b6c953"

-/+ aws_volume_attachment.influx_ebs_att.2
    device_name:  "/dev/xvdg" => "/dev/xvdg"
    force_detach: "" => "<computed>"
    instance_id:  "i-0c380a9cae915d8a3" => "${element(aws_instance.influxdata.*.id, count.index)}" (forces new resource)
    skip_destroy: "" => "<computed>"
    volume_id:    "vol-0ccce3d93122eb233" => "vol-0ccce3d93122eb233"

+ local_file.inventory-meta
    content:  "[meta]\n${join(\"\\n\",aws_instance.influxmeta.*.private_ip)}\n\n[data]\n${join(\"\\n\",aws_instance.influxdata.*.private_ip)}\n"
    filename: "inventory"
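
In the plan above, each attachment's instance_id is re-interpolated through element() over the whole aws_instance.influxdata.*.id list, so tainting a single instance appears to force all three attachments to be recreated. A minimal workaround sketch, assuming the count stays at three and mirroring the per-index attachments used earlier in this thread (only index 0 shown):

resource "aws_volume_attachment" "influx_ebs_att_0" {
  device_name = "/dev/xvdg"
  volume_id   = "${aws_ebs_volume.influxdata_ebs.0.id}"
  instance_id = "${aws_instance.influxdata.0.id}"
}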

ghost commented Apr 8, 2020

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

ghost locked and limited conversation to collaborators Apr 8, 2020