Windows worker fleets failing to start #295

Closed
horsmand opened this issue Jan 22, 2021 · 0 comments · Fixed by #296
Labels
bug This issue is a bug.

Comments

@horsmand
Contributor

Deploying a worker fleet (or launching a new instance into an existing worker fleet) that uses any Windows AMI fails because of a bug in a script that we execute on the host as part of its initialization process (in the UserData). The bug affects all versions of RFDK in production and every worker fleet that deploys using an AMI with a Windows operating system.

The failing script installs and configures the CloudWatch agent on the instance. Because it fails, the script that configures the Deadline worker to connect to the render queue and then starts the worker never runs. The health check then fails, which causes the CDK deployment to fail and roll back.

Because the failure prevents the CloudWatch agent from being set up properly, no logs are uploaded to CloudWatch, so nothing is viewable from the AWS Console before the host gets terminated.
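
To illustrate the failure chain described above, here is a minimal sketch in Python. It is not the actual UserData (which runs PowerShell scripts on the host); every function name is a hypothetical stand-in, and the only assumption it encodes is that the initialization steps run in order and a failed step stops the ones after it.

```python
from typing import Callable, List, Tuple


def run_init_steps(steps: List[Tuple[str, Callable[[], None]]]) -> List[str]:
    """Run steps in order and stop at the first failure.

    Returns the names of the steps that completed successfully.
    """
    completed: List[str] = []
    for name, step in steps:
        try:
            step()
        except Exception:
            # A failed step aborts everything that was queued after it.
            break
        completed.append(name)
    return completed


def configure_cloudwatch_agent() -> None:
    # Hypothetical stand-in for the CloudWatch agent script that fails on Windows.
    raise RuntimeError("CloudWatch agent installation failed")


def configure_worker() -> None:
    # Hypothetical stand-in for the script that connects the Deadline worker
    # to the render queue and starts it.
    pass


def health_check_passes(completed: List[str]) -> bool:
    # Assumption: the health check can only pass once the worker is configured.
    return "configure_worker" in completed


completed = run_init_steps([
    ("configure_cloudwatch_agent", configure_cloudwatch_agent),
    ("configure_worker", configure_worker),
])
deployment_ok = health_check_passes(completed)
```

In this sketch the worker step is never reached, so `deployment_ok` is false, which corresponds to the failed health check and the rollback of the CDK deployment.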

Log statement observed, which signals that we're falling back to the latest version of the CloudWatch agent rather than the version we try to pin to: https://github.com/aws/aws-rfdk/blob/mainline/packages/aws-rfdk/lib/core/scripts/powershell/configureCloudWatchAgent.ps1#L26
Error message observed: https://github.com/aws/aws-rfdk/blob/mainline/packages/aws-rfdk/lib/core/scripts/powershell/configureCloudWatchAgent.ps1#L52

Environment

  • CDK CLI Version: all
  • CDK Framework Version: all
  • RFDK Version: all
  • Deadline Version: all

This is 🐛 Bug Report

@horsmand horsmand added bug This issue is a bug. needs-triage This issue or PR still needs to be triaged. and removed needs-triage This issue or PR still needs to be triaged. labels Jan 22, 2021
horsmand referenced this issue in horsmand/aws-rfdk Jan 22, 2021
Fixes #295

This fixes a bug that is preventing the deployment of any worker
instance that is using an AMI with a Windows OS.
@horsmand horsmand self-assigned this Jan 22, 2021
ddneilson pushed a commit that referenced this issue Jan 22, 2021
Fixes #295

This fixes a bug that is preventing the deployment of any worker
instance that is using an AMI with a Windows OS.