Terraform crashes when using data source aws_ami #8786

Closed
moritzheiber opened this issue Sep 12, 2016 · 15 comments

Comments

@moritzheiber

moritzheiber commented Sep 12, 2016

I'm getting consistent crashes when trying to use the aws_ami data source. Configuration is linked below.

Terraform Version

Terraform v0.7.3 (UPDATED: Also happening on 0.7.4)

Affected Resource(s)

  • aws.data.aws_ami

Terraform Configuration Files

Encrypted with the public key corresponding to 51852D87348FFC4C:
https://gist.github.com/moritzheiber/51bec104f5fcad898deda6e1197415bb

Debug Output/Panic Output

Encrypted with the public key corresponding to 51852D87348FFC4C:
https://gist.github.com/moritzheiber/ac2862637b306655b06eb1426f13fbfa

Expected Behavior

The aws_ami data source should provide the latest AMI ID matching the regex 'consul-base-ami\\ *'.

Actual Behavior

The terraform process is crashing.

Steps to Reproduce

  1. terraform plan

Important Factoids

The AMI names follow this Packer pattern: default-consul-ami {{isotime \"2006-01-02T15_04_05\"}}.

References

Might be related to #7910

@kwilczynski
Contributor

@moritzheiber hi there! Sorry about this!

Would you kindly provide decrypted logs? Not everyone who works on Terraform is a HashiCorp employee, so not everyone has access to the private key, which hinders any attempt at troubleshooting the issue. Thank you in advance.

@moritzheiber
Author

I'm happy to send the encrypted logs to anyone who's willing to tender their public keys to me. Unfortunately, contractual obligations prevent me from posting the respective information in the open, sorry :(

@moritzheiber
Author

If this presents a challenge in resolving this bug, I'll try to reproduce it with a stack unrelated to the one from the initial report. However, that might not cover the use case I originally reported this with.

@kwilczynski
Contributor

@moritzheiber hello there! Have you been able to make any progress with allowing us to see the logs? Even the stack trace from the original logs would be a starting point (and it does not include any sensitive information, etc.).

@apparentlymart
Contributor

@moritzheiber if you are able, I think the most helpful part to share would be the main headline from the Go error message and ideally at least a couple lines out of the stack trace that follows it. That part should include only information that is part of the public Terraform binary and not anything sensitive about your configuration, and would give us a pointer to what section of code is crashing in order to theorize about what might be causing it.

If you're not sure which part of the log output I mean, take a look at what I fished out of the debug log in #6441. It will likely be a line starting with panic:, followed by a stack trace under a heading like goroutine nnn [running]:. Note that the arguments shown there come from the memory of the running process, so feel free to redact everything between the parentheses on lines shaped like the one below if you are concerned that they may disclose something you don't want to share:

github.com/hashicorp/terraform/builtin/providers/aws.forwardedValuesHash(0xb72000, 0x82268b890, 0x800000001)
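
For anyone who wants to automate that redaction, here is a minimal Go sketch. It assumes frame lines end with a parenthesised list of hex values (optionally with a trailing `...`), as in the line above. The names `frameArgs` and `redactFrame` are made up for illustration and are not part of Terraform.

```go
package main

import (
	"fmt"
	"regexp"
)

// frameArgs matches the argument list at the end of a stack frame line,
// e.g. "(0xb72000, 0x82268b890, 0x800000001)" or "(0x1, 0x2, ...)".
// Receiver types such as "(*Resource)" are not matched because they do
// not consist of hex values.
var frameArgs = regexp.MustCompile(`\((?:0x[0-9a-f]+|\.\.\.)(?:, (?:0x[0-9a-f]+|\.\.\.))*\)$`)

// redactFrame replaces the argument values of one frame line with "(...)"
// and returns every other line unchanged.
func redactFrame(line string) string {
	return frameArgs.ReplaceAllString(line, "(...)")
}

func main() {
	line := "github.com/hashicorp/terraform/builtin/providers/aws.forwardedValuesHash(0xb72000, 0x82268b890, 0x800000001)"
	// Prints the function name followed by "(...)".
	fmt.Println(redactFrame(line))
}
```

The file-path lines of a trace (the ones ending in something like `+0x1a1`) pass through untouched, so the result still shows which section of code crashed.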

@moritzheiber
Author

I've "renegotiated" (believe me, if it were up to me I would've posted the plaintext output log ages ago) and was able to convince the stakeholders to let me paste the panic output:

panic: runtime error: invalid memory address or nil pointer dereference
2016/09/23 16:32:42 [DEBUG] plugin: terraform: [signal SIGSEGV: segmentation violation code=0x1 addr=0x8 pc=0x6b8497]
2016/09/23 16:32:42 [DEBUG] plugin: terraform: 
2016/09/23 16:32:42 [DEBUG] plugin: terraform: goroutine 88 [running]:
2016/09/23 16:32:42 [DEBUG] plugin: terraform: panic(0x28a0d20, 0xc420012110)
2016/09/23 16:32:42 [DEBUG] plugin: terraform:  /opt/go/src/runtime/panic.go:500 +0x1a1
2016/09/23 16:32:42 [DEBUG] plugin: terraform: github.com/hashicorp/terraform/builtin/providers/aws.dataSourceAwsAmiRead(0xc420466c00, 0x25feb00, 0xc420464180, 0xad00000000000028, 0x0)
2016/09/23 16:32:42 [DEBUG] plugin: terraform:  /opt/gopath/src/github.com/hashicorp/terraform/builtin/providers/aws/data_source_aws_ami.go:242 +0x387
2016/09/23 16:32:42 [DEBUG] plugin: terraform: github.com/hashicorp/terraform/helper/schema.(*Resource).ReadDataApply(0xc4204e1c20, 0xc42049bce0, 0x25feb00, 0xc420464180, 0xc4203bb338, 0x1, 0x0)
2016/09/23 16:32:42 [DEBUG] plugin: terraform:  /opt/gopath/src/github.com/hashicorp/terraform/helper/schema/resource.go:207 +0xda
2016/09/23 16:32:42 [DEBUG] plugin: terraform: github.com/hashicorp/terraform/helper/schema.(*Provider).ReadDataApply(0xc4203b4ba0, 0xc42027ffc0, 0xc42049bce0, 0x0, 0x18, 0x18)
2016/09/23 16:32:42 [DEBUG] plugin: terraform:  /opt/gopath/src/github.com/hashicorp/terraform/helper/schema/provider.go:317 +0x91
2016/09/23 16:32:42 [DEBUG] plugin: terraform: github.com/hashicorp/terraform/plugin.(*ResourceProviderServer).ReadDataApply(0xc4202044a0, 0xc420596660, 0xc420596c30, 0x0, 0x0)
2016/09/23 16:32:42 [DEBUG] plugin: terraform:  /opt/gopath/src/github.com/hashicorp/terraform/plugin/resource_provider.go:537 +0x4e
2016/09/23 16:32:42 [DEBUG] plugin: terraform: reflect.Value.call(0xc420466420, 0xc4204121e8, 0x13, 0x2e6520f, 0x4, 0xc4206d1ed0, 0x3, 0x3, 0x26d90c0, 0xc4204169c0, ...)
2016/09/23 16:32:42 [DEBUG] plugin: terraform:  /opt/go/src/reflect/value.go:434 +0x5c8
2016/09/23 16:32:42 [DEBUG] plugin: terraform: reflect.Value.Call(0xc420466420, 0xc4204121e8, 0x13, 0xc4206d1ed0, 0x3, 0x3, 0xc420584394, 0x100000000, 0xc42046d701)
2016/09/23 16:32:42 [DEBUG] plugin: terraform:  /opt/go/src/reflect/value.go:302 +0xa4
2016/09/23 16:32:42 [DEBUG] plugin: terraform: net/rpc.(*service).call(0xc42039bd40, 0xc42039bd00, 0xc4204f44a0, 0xc42001e500, 0xc420314c60, 0x2600a80, 0xc420596660, 0x16, 0x2600ac0, 0xc420596c30, ...)
2016/09/23 16:32:42 [DEBUG] plugin: terraform:  /opt/go/src/net/rpc/server.go:383 +0x148
2016/09/23 16:32:42 [DEBUG] plugin: terraform: created by net/rpc.(*Server).ServeCodec
2016/09/23 16:32:42 [DEBUG] plugin: terraform:  /opt/go/src/net/rpc/server.go:477 +0x421

Also, I'll try to get a redacted version of the debug/crash log posted here.

The trace output above was generated using Terraform 0.7.4.

@moritzheiber
Author

moritzheiber commented Sep 23, 2016

This is 100% reproducible by taking the AMI naming convention posted at the beginning of this ticket and applying a simple regex. The part of the configuration for the data source is:

data "aws_ami" "our_ami" {
  most_recent = true
  name_regex  = "^our-ami"
}

The corresponding packer pattern would be:

{
  "builders": [
    {
      "type": "amazon-ebs",
      "region": "eu-central-1",
      "source_ami": "ami-f955a096",
      "instance_type": "t2.micro",
      "ssh_username": "ec2-user",
      "ami_name": "our-ami {{isotime \"2006-01-02T15_04_05\"}}",
      "iam_instance_profile": "our_service"
    }
  ]
}

The resulting AMI image's name would be (for today): our-ami 2016-09-23T16_43_20
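
To make the naming scheme concrete, here is a small Go sketch that rebuilds such a name and checks it against the regex from the data source. It assumes that Packer's isotime formats the build time in UTC using the given Go time layout. `amiName`, `matchesOurAMI` and `exampleBuild` are illustrative names only.

```go
package main

import (
	"fmt"
	"regexp"
	"time"
)

// exampleBuild is the build time behind the example name above.
var exampleBuild = time.Date(2016, time.September, 23, 16, 43, 20, 0, time.UTC)

// amiName mimics the template "our-ami {{isotime \"2006-01-02T15_04_05\"}}".
// Assumption: isotime renders the time in UTC with this Go layout.
func amiName(t time.Time) string {
	return "our-ami " + t.UTC().Format("2006-01-02T15_04_05")
}

// matchesOurAMI applies the same pattern as name_regex = "^our-ami".
func matchesOurAMI(name string) bool {
	return regexp.MustCompile("^our-ami").MatchString(name)
}

func main() {
	name := amiName(exampleBuild)
	fmt.Println(name, matchesOurAMI(name)) // our-ami 2016-09-23T16_43_20 true
}
```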

@kwilczynski
Contributor

@moritzheiber hi there! I REALLY appreciate this. I work for an enterprise myself, so I understand the pain you had to go through to get this.

@kwilczynski
Contributor

kwilczynski commented Sep 23, 2016

@moritzheiber thanks to the panic and other data you have provided, I was able to narrow the issue down to the following Amazon Machine Image (AMI), which seems not to have a name set. I am not entirely sure whether this is even possible to achieve in EC2 (probably a bug?). The following image metadata causes Terraform to panic with a nil pointer dereference:

$ aws ec2 describe-images --filters Name=image-id,Values=ami-1111ec7e --region eu-central-1
{
    "Images": [
        {
            "VirtualizationType": "hvm",
            "Hypervisor": "xen",
            "ImageId": "ami-1111ec7e",
            "State": "available",
            "BlockDeviceMappings": [
                {
                    "DeviceName": "sdb",
                    "VirtualName": "ephemeral0"
                },
                {
                    "DeviceName": "sdc",
                    "VirtualName": "ephemeral1"
                }
            ],
            "Architecture": "x86_64",
            "ImageLocation": "cloudtest-images-eu-central-1/maestro-or-resultsservice/58.3/image.manifest.xml",
            "RootDeviceType": "instance-store",
            "OwnerId": "851601128636",
            "RootDeviceName": "/dev/sda1",
            "CreationDate": "2016-09-21T15:22:54.000Z",
            "Public": true,
            "ImageType": "machine"
        }
    ]
}

Basically, since you are using a regular expression to narrow the results, the following loop gets executed to walk all the images available and match against the expression:

data_source_aws_ami.go#L240-L245

        r := regexp.MustCompile(nameRegex.(string))
        for _, image := range resp.Images {
            if r.MatchString(*image.Name) == true {
                filteredImages = append(filteredImages, image)
            }
        }

When the loop reaches the image with ID ami-1111ec7e, which has no name set, image.Name is nil, so this line panics when it dereferences the name field from the response:

            if r.MatchString(*image.Name) == true {

I will provide a fix shortly.
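
As an illustration of the kind of guard that avoids this panic, here is a self-contained Go sketch. It is not the actual patch. `Image` is a hypothetical stand-in for the SDK's image type and `filterByName` is a made-up helper. The only thing it assumes about the real type is that Name is an optional pointer field.

```go
package main

import (
	"fmt"
	"regexp"
)

// Image is a hypothetical stand-in for the SDK's image type; only the
// optional, pointer-typed Name field matters here.
type Image struct {
	ID   string
	Name *string
}

// filterByName keeps the images whose name matches nameRegex. Images
// without a name are skipped instead of being dereferenced.
func filterByName(images []*Image, nameRegex string) []*Image {
	r := regexp.MustCompile(nameRegex)
	var filtered []*Image
	for _, image := range images {
		if image.Name == nil {
			continue // nameless image: cannot match, must not dereference
		}
		if r.MatchString(*image.Name) {
			filtered = append(filtered, image)
		}
	}
	return filtered
}

func main() {
	name := "our-ami 2016-09-23T16_43_20"
	images := []*Image{
		{ID: "ami-1111ec7e"},              // no name, like the image above
		{ID: "ami-00000000", Name: &name}, // hypothetical ID
	}
	for _, image := range filterByName(images, "^our-ami") {
		fmt.Println(image.ID) // prints only ami-00000000
	}
}
```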

@moritzheiber
Author

Oh, so the function iterates over all available AMIs, not just the ones created by or belonging to our account?

That explains it.

I reckon a way of limiting the pool of AMI IDs to walk through would be convenient, if it doesn't exist already (i.e. has_to_belong_to = <account_id> or something).

Good find.

@kwilczynski
Contributor

kwilczynski commented Sep 23, 2016

I have an addendum:

I ran a query for the account IDs 851601128636 and 003046273657, which own these images, and found that none of the following have a name set:

  • ami-1111ec7e
    cloudtest-images-eu-central-1/maestro-or-resultsservice/58.3/image.manifest.xml
  • ami-1b08fc74
    cloudtest-images-eu-central-1/maestro-or-resultsservice/54.7/image.manifest.xml
  • ami-2602343b
    trustance-eu-central-1/0.9.1/ami.img.manifest.xml
  • ami-360df059
    cloudtest-images-eu-central-1/maestro-or-resultsservice/58.3/image.manifest.xml
  • ami-54cdfd49
    cloudtest-images-eu-central-1/maestro-or-resultsservice/54.5/image.manifest.xml
  • ami-6d938001
    cloudtest-images-eu-central-1/maestro-or-resultsservice/56.1/image.manifest.xml
  • ami-70cfdc1c
    cloudtest-images-eu-central-1/maestro-or-resultsservice/56.0/image.manifest.xml
  • ami-b8cdfda5
    cloudtest-images-eu-central-1/maestro-or-resultsservice/53.1/image.manifest.xml
  • ami-bd7f82d2
    cloudtest-images-eu-central-1/maestro-or-resultsservice/58.3/image.manifest.xml
  • ami-d02212cd
    cloudtest-images-eu-central-1/maestro-or-resultsservice/54.4/image.manifest.xml
  • ami-ea2212f7
    cloudtest-images-eu-central-1/maestro-or-resultsservice/53.0/image.manifest.xml

@kwilczynski
Contributor

@moritzheiber hi there!

You could try to use your Amazon account ID with the owners attribute. Perhaps setting it to self could also do the trick.

@kwilczynski
Contributor

@moritzheiber hi there! I was wondering whether you were able to apply the owners attribute to speed things up and narrow the search scope?

@apparentlymart
Contributor

I merged @kwilczynski's patch in #9033, so this issue should be resolved in the next release of Terraform.

As the documentation page says...

This filtering is done locally on what AWS returns, and could have a performance impact if the result is large. It is recommended to combine this with other options to narrow down the list AWS returns.

...it's suggested to use name_regex only in conjunction with one of the other options, such as owners, or else it will scan every public AMI available in the region to find matches. name_regex is provided only as a convenience, because the filtering built in to the AWS API is quite rudimentary and not suitable for all cases.

With the simple pattern that was posted earlier, I believe a filter block would actually achieve the same result with the filtering done on the server side:

filter {
  name   = "name"
  values = ["our-ami*"]
}

...though best to also constrain with the owners attribute unless you're intentionally trying to match community AMIs, since the filtering can be slow even when it's applied on the AWS side. (I suspect that on their end they can implement certain constraints via proper indices but then implement the ad-hoc filter blocks by scanning the results just like Terraform is doing for name_regex.)
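
To illustrate why the wildcard filter and the regex select the same names for this naming scheme, here is a rough Go sketch that emulates the wildcard locally. It assumes that in AWS filter values `*` matches zero or more characters and `?` matches exactly one, and it ignores backslash escaping. `globToRegexp` and `globMatch` are hypothetical helpers, not part of Terraform or the AWS SDK.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// globToRegexp converts a wildcard pattern into an anchored regular
// expression. Assumption: "*" means zero or more characters and "?"
// means exactly one; every other character is matched literally.
func globToRegexp(pattern string) *regexp.Regexp {
	quoted := regexp.QuoteMeta(pattern) // escapes "*" to `\*` and "?" to `\?`
	quoted = strings.ReplaceAll(quoted, `\*`, `.*`)
	quoted = strings.ReplaceAll(quoted, `\?`, `.`)
	return regexp.MustCompile("^" + quoted + "$")
}

// globMatch reports whether name matches the wildcard pattern.
func globMatch(pattern, name string) bool {
	return globToRegexp(pattern).MatchString(name)
}

func main() {
	name := "our-ami 2016-09-23T16_43_20"
	fmt.Println(globMatch("our-ami*", name))                      // true
	fmt.Println(regexp.MustCompile("^our-ami").MatchString(name)) // true
	fmt.Println(globMatch("our-ami*", "not-our-ami"))             // false
}
```

A prefix regex like ^our-ami and the wildcard our-ami* both accept any name that starts with our-ami, which is why the filter block can replace name_regex in this particular case.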

With all of that said, I'm going to close this now since I think #9033 took care of this. If there's more here then do feel free to reopen. Thanks again for the bug report, and thanks to @kwilczynski for the investigation and the fix!

@ghost

ghost commented Apr 22, 2020

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@ghost ghost locked and limited conversation to collaborators Apr 22, 2020