[Bug]: AWS Batch 'panic: interface conversion: interface {} is nil, not map[string]interface {}' #38710

Closed
justingodden opened this issue Aug 6, 2024 · 5 comments · Fixed by #38716
Labels
bug: Addresses a defect in current functionality.
crash: Results from or addresses a Terraform crash or kernel panic.
service/vpc: Issues and PRs that pertain to the vpc service.

@justingodden

Terraform Core Version

1.9.3

AWS Provider Version

5.61.0

Affected Resource(s)

aws_batch_job_definition

Expected Behavior

I am using the AWS Batch TF module: https://registry.terraform.io/modules/terraform-aws-modules/batch/aws/latest

But I believe the problem is with the underlying aws_batch_job_definition resource.

I expect to be able to create the resources with the batch module.

Actual Behavior

Creating the resources initially with terraform apply works just fine. But even if the code remains completely unchanged, when running terraform apply again, the provider crashes.

The panic appears to originate in the internal/service/batch.needsJobDefUpdate function (job_definition.go:569 in the stack trace below).

Not the same issue, but looks similar: #22660, #17284

Relevant Error/Panic Output Snippet

Stack trace from the terraform-provider-aws_v5.61.0_x5 plugin:

panic: interface conversion: interface {} is nil, not map[string]interface {}

goroutine 3020 [running]:
github.com/hashicorp/terraform-provider-aws/internal/service/batch.needsJobDefUpdate(0xc00324e000)
        github.com/hashicorp/terraform-provider-aws/internal/service/batch/job_definition.go:569 +0x1074
github.com/hashicorp/terraform-provider-aws/internal/service/batch.jobDefinitionCustomizeDiff({0x13f75fa0?, 0x208b4340?}, 0xc00324e000, {0xe?, 0xc0026548f0?})
        github.com/hashicorp/terraform-provider-aws/internal/service/batch/job_definition.go:464 +0x3a
github.com/hashicorp/terraform-provider-aws/internal/service/batch.ResourceJobDefinition.Sequence.func23({0x170b2cc8, 0xc003980b10}, 0xc00324e000, {0x14e571c0, 0xc0026548f0})
        github.com/hashicorp/terraform-plugin-sdk/v2@v2.34.0/helper/customdiff/compose.go:69 +0x84
github.com/hashicorp/terraform-provider-aws/internal/provider.New.(*wrappedResource).CustomizeDiff.func5({0x170b2cc8?, 0xc00386a7e0?}, 0xc00324e000, {0x14e571c0, 0xc0026548f0})
        github.com/hashicorp/terraform-provider-aws/internal/provider/intercept.go:186 +0x63
github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema.schemaMap.Diff(0xc000f34060, {0x170b2cc8, 0xc00386a7e0}, 0xc003902c30, 0xc004722280, 0xc00022b9f8, {0x14e571c0, 0xc0026548f0}, 0x0)
        github.com/hashicorp/terraform-plugin-sdk/v2@v2.34.0/helper/schema/schema.go:698 +0x4b4
github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema.(*Resource).SimpleDiff(0x170b30f8?, {0x170b2cc8?, 0xc00386a7e0?}, 0xc003902c30, 0xc00386a810?, {0x14e571c0?, 0xc0026548f0?})
        github.com/hashicorp/terraform-plugin-sdk/v2@v2.34.0/helper/schema/resource.go:990 +0xdb
github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema.(*GRPCProviderServer).PlanResourceChange(0xc00426f068, {0x170b2cc8?, 0xc00386a6f0?}, 0xc003ca5540)
        github.com/hashicorp/terraform-plugin-sdk/v2@v2.34.0/helper/schema/grpc_provider.go:858 +0xbe8
github.com/hashicorp/terraform-plugin-mux/tf5muxserver.(*muxServer).PlanResourceChange(0xc0013f8f50, {0x170b2cc8?, 0xc00386a420?}, 0xc003ca5540)
        github.com/hashicorp/terraform-plugin-mux@v0.16.0/tf5muxserver/mux_server_PlanResourceChange.go:73 +0x2ad
github.com/hashicorp/terraform-plugin-go/tfprotov5/tf5server.(*server).PlanResourceChange(0xc000619c20, {0x170b2cc8?, 0xc00384fb60?}, 0xc004562600)
        github.com/hashicorp/terraform-plugin-go@v0.23.0/tfprotov5/tf5server/server.go:825 +0x3f0
github.com/hashicorp/terraform-plugin-go/tfprotov5/internal/tfplugin5._Provider_PlanResourceChange_Handler({0x14c3fb20, 0xc000619c20}, {0x170b2cc8, 0xc00384fb60}, 0xc004562580, 0x0)
        github.com/hashicorp/terraform-plugin-go@v0.23.0/tfprotov5/internal/tfplugin5/tfplugin5_grpc.pb.go:500 +0x1a6
google.golang.org/grpc.(*Server).processUnaryRPC(0xc001296000, {0x170b2cc8, 0xc00384fad0}, {0x17116620, 0xc0002ba300}, 0xc00385eb40, 0xc002700120, 0x208231e0, 0x0)
        google.golang.org/grpc@v1.63.2/server.go:1369 +0xdf8
google.golang.org/grpc.(*Server).handleStream(0xc001296000, {0x17116620, 0xc0002ba300}, 0xc00385eb40)
        google.golang.org/grpc@v1.63.2/server.go:1780 +0xe8b
google.golang.org/grpc.(*Server).serveStreams.func2.1()
        google.golang.org/grpc@v1.63.2/server.go:1019 +0x8b
created by google.golang.org/grpc.(*Server).serveStreams.func2 in goroutine 65
        google.golang.org/grpc@v1.63.2/server.go:1030 +0x125

Error: The terraform-provider-aws_v5.61.0_x5 plugin crashed!

This is always indicative of a bug within the plugin. It would be immensely
helpful if you could report the crash with the plugin's maintainers so that it
can be fixed. The output above should help diagnose the issue.

Terraform Configuration Files

resource "aws_security_group" "this" {
  name   = "aws_batch_compute_environment_security_group"
  vpc_id = var.vpc_id
  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

module "batch" {
  source = "terraform-aws-modules/batch/aws"

  create_instance_iam_role      = true
  instance_iam_role_name        = "batch-role"
  instance_iam_role_path        = "/batch/"
  instance_iam_role_description = "IAM instance role/profile for AWS Batch ECS instance(s)"
  instance_iam_role_additional_policies = [
    "arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore"
  ]

  create_service_iam_role      = true
  service_iam_role_name        = "batch-service-role"
  service_iam_role_path        = "/batch/"
  service_iam_role_description = "IAM service role for AWS Batch"

  compute_environments = {
    ec2_gpu = {
      name_prefix = "ec2_gpu"

      compute_resources = {
        type           = "EC2"
        min_vcpus      = 4
        max_vcpus      = 4
        desired_vcpus  = 4
        instance_types = ["g4dn.xlarge"]

        ec2_configuration = {
          image_type = "ECS_AL2_NVIDIA"
        }

        security_group_ids = [aws_security_group.this.id]
        subnets            = var.private_subnets
      }
    }
  }

  job_queues = {
    batch_queue = {
      name                     = "BatchQueue"
      state                    = "ENABLED"
      priority                 = 1
      create_scheduling_policy = false
    }
  }

  job_definitions = {
    nginx = {
      name = "nginx"
      type = "container"

      container_properties = jsonencode({
        image = "nginx"

        resourceRequirements = [
          { type = "VCPU", value = "4" },
          { type = "MEMORY", value = "15000" },
          { type = "GPU", value = "1" }
        ]
      })
    }
  }
}

Steps to Reproduce

terraform apply
yes

terraform apply

Debug Output

No response

Panic Output

No response

Important Factoids

No response

References

No response

Would you like to implement a fix?

None

@justingodden justingodden added the bug Addresses a defect in current functionality. label Aug 6, 2024

github-actions bot commented Aug 6, 2024

Community Note

Voting for Prioritization

  • Please vote on this issue by adding a 👍 reaction to the original post to help the community and maintainers prioritize this request.
  • Please see our prioritization guide for information on how we prioritize.
  • Please do not leave "+1" or other comments that do not add relevant new information or questions, they generate extra noise for issue followers and do not help prioritize the request.

Volunteering to Work on This Issue

  • If you are interested in working on this issue, please leave a comment.
  • If this would be your first contribution, please review the contribution guide.

@github-actions github-actions bot added crash Results from or addresses a Terraform crash or kernel panic. service/vpc Issues and PRs that pertain to the vpc service. labels Aug 6, 2024
@terraform-aws-provider terraform-aws-provider bot added the needs-triage Waiting for first response or review from a maintainer. label Aug 6, 2024
@justingodden
Author

Update: adding either a retry strategy or a timeout to the job definition, as in the following code, resolved it.

attempt_duration_seconds = 60
retry_strategy = {
  attempts = 3
  evaluate_on_exit = {
    retry_error = {
      action       = "RETRY"
      on_exit_code = 1
    }
    exit_success = {
      action       = "EXIT"
      on_exit_code = 0
    }
  }
}

I'm no Go expert, but I think the problem is an unchecked type assertion on a nil value on this line.
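
For readers less familiar with Go, the panic message in the trace is what the runtime emits when a single-value type assertion is applied to a nil interface value. The following is a minimal sketch of that failure mode and of the comma-ok form that avoids it. It is not the provider's code: the helper names (`firstBlockUnsafe`, `firstBlockSafe`, `panicMessage`) are hypothetical, and modeling an unset optional block as a one-element list holding nil is an assumption made for illustration.

```go
package main

import "fmt"

// firstBlockUnsafe mimics an unchecked type assertion on the first element
// of a list-of-blocks value. Hypothetical helper, not the provider's code.
func firstBlockUnsafe(blocks []interface{}) map[string]interface{} {
	return blocks[0].(map[string]interface{}) // panics if blocks[0] is nil
}

// firstBlockSafe uses the comma-ok form, so a nil or unexpected element
// yields (nil, false) instead of a panic.
func firstBlockSafe(blocks []interface{}) (map[string]interface{}, bool) {
	if len(blocks) == 0 {
		return nil, false
	}
	m, ok := blocks[0].(map[string]interface{})
	return m, ok
}

// panicMessage runs f and returns the recovered panic message,
// or "" if f returned normally.
func panicMessage(f func()) (msg string) {
	defer func() {
		if r := recover(); r != nil {
			msg = fmt.Sprint(r)
		}
	}()
	f()
	return ""
}

func main() {
	// Assumption: an unset optional block shows up as a list holding nil.
	unset := []interface{}{nil}

	// Prints the same message as the crash reported in this issue.
	fmt.Println(panicMessage(func() { firstBlockUnsafe(unset) }))

	if _, ok := firstBlockSafe(unset); !ok {
		fmt.Println("block not set, skipping comparison")
	}
}
```

Under these assumptions, the workaround above would fit this pattern: once the retry strategy or timeout is set, the value being asserted is presumably no longer nil, so the unchecked assertion succeeds.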

@jar-b jar-b removed the needs-triage Waiting for first response or review from a maintainer. label Aug 7, 2024

github-actions bot commented Aug 7, 2024

Warning

This issue has been closed, meaning that any additional comments are hard for our team to see. Please assume that the maintainers will not see them.

Ongoing conversations amongst community members are welcome, however, the issue will be locked after 30 days. Moving conversations to another venue, such as the AWS Provider forum, is recommended. If you have additional concerns, please open a new issue, referencing this one where needed.

@github-actions github-actions bot added this to the v5.62.0 milestone Aug 7, 2024

github-actions bot commented Aug 9, 2024

This functionality has been released in v5.62.0 of the Terraform AWS Provider. Please see the Terraform documentation on provider versioning or reach out if you need any assistance upgrading.

For further feature requests or bug reports with this functionality, please create a new GitHub issue following the template. Thank you!


github-actions bot commented Sep 9, 2024

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 9, 2024