fix(deadline): allow traffic from RenderQueue to UsageBasedLicensing #617

Merged
merged 1 commit into from
Oct 21, 2021

jusiskin
Contributor

Problem

When rendering jobs with Deadline Third-Party Usage Based Licensing (UBL) on a fleet of 50 workers, the API endpoints served by the RenderQueue construct lock up and eventually return 503 Service Unavailable responses. The health checks on the Application Load Balancer detect this and signal ECS to replace the task, causing a render farm disruption.

Solution

The Deadline Remote Connection Server (RCS) request handlers were found to be blocked while attempting to communicate with the Deadline License Forwarder deployed by the UsageBasedLicensing construct. RFDK does not create a security group rule to allow this traffic from the RenderQueue to the UsageBasedLicensing construct. The RCS uses a long connection timeout, and with a sufficiently large number of workers rendering jobs, the request-handling threads become blocked waiting for these connections to time out.
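As a rough illustration of why a long connection timeout can exhaust the request handlers, here is a back-of-the-envelope model based on Little's law (handlers in use ≈ arrival rate × time each request holds a handler). Only the fleet size of 50 comes from this PR. The request rate, timeout, and pool size are hypothetical placeholders, not values measured on the farm or taken from Deadline.

```typescript
// Hypothetical model parameters; none of these names exist in RFDK or Deadline.
interface FarmModel {
  workers: number;                    // size of the worker fleet
  requestsPerWorkerPerSecond: number; // assumed rate of requests that reach the License Forwarder
  connectTimeoutSeconds: number;      // assumed connection timeout
  handlerThreads: number;             // assumed size of the request handler pool
}

// Little's law: average number of requests in flight = arrival rate x holding time.
// Each request that cannot connect holds a handler for the full timeout.
function blockedHandlers(m: FarmModel): number {
  const arrivalRate = m.workers * m.requestsPerWorkerPerSecond;
  const demand = arrivalRate * m.connectTimeoutSeconds;
  return Math.min(m.handlerThreads, demand);
}

// The pool is saturated once every handler is stuck waiting on a timeout.
function isSaturated(m: FarmModel): boolean {
  return blockedHandlers(m) >= m.handlerThreads;
}

const farm50: FarmModel = {
  workers: 50,
  requestsPerWorkerPerSecond: 0.5,
  connectTimeoutSeconds: 20,
  handlerThreads: 100,
};
const farm5: FarmModel = { ...farm50, workers: 5 };

console.log(blockedHandlers(farm50), isSaturated(farm50)); // 100 true
console.log(blockedHandlers(farm5), isSaturated(farm5));   // 50 false
```

Under these assumed numbers, a small fleet leaves spare handlers while a fleet of 50 pins the whole pool, which matches the observation that the problem only shows up with enough workers rendering at once.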

The solution is to add a security group rule when creating a UsageBasedLicensing construct instance to allow this traffic and avoid connection timeouts.
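The effect of the missing rule can be sketched with a small, dependency-free model of security group ingress. This is not the code changed by this PR and it does not use the CDK API. The class, function, and port number below are hypothetical and exist only for illustration. The model relies on one general fact: a security group silently drops traffic it does not allow, so the caller sees a timeout rather than an immediate refusal.

```typescript
// Hypothetical placeholder port; not taken from this PR or from Deadline documentation.
const HYPOTHETICAL_FORWARDER_PORT = 17004;

interface IngressRule {
  sourceId: string;
  port: number;
}

// Minimal stand-in for a security group that only tracks ingress rules.
class SecurityGroupModel {
  private readonly ingress: IngressRule[] = [];

  constructor(public readonly id: string) {}

  // Allow inbound traffic on `port` from members of `source`.
  allowFrom(source: SecurityGroupModel, port: number): void {
    this.ingress.push({ sourceId: source.id, port });
  }

  permits(source: SecurityGroupModel, port: number): boolean {
    return this.ingress.some((r) => r.sourceId === source.id && r.port === port);
  }
}

type ConnectResult = "connected" | "timed-out";

// Disallowed traffic is dropped, so the attempt times out instead of being refused.
// "connected" assumes something is listening on the port.
function attemptConnection(
  from: SecurityGroupModel,
  to: SecurityGroupModel,
  port: number,
): ConnectResult {
  return to.permits(from, port) ? "connected" : "timed-out";
}

const renderQueueSg = new SecurityGroupModel("render-queue");
const licensingSg = new SecurityGroupModel("usage-based-licensing");

const before = attemptConnection(renderQueueSg, licensingSg, HYPOTHETICAL_FORWARDER_PORT);
licensingSg.allowFrom(renderQueueSg, HYPOTHETICAL_FORWARDER_PORT);
const after = attemptConnection(renderQueueSg, licensingSg, HYPOTHETICAL_FORWARDER_PORT);

console.log(before, after); // timed-out connected
```

In a CDK app, a rule like this is typically expressed through the `connections` objects of the two constructs rather than by editing security groups by hand.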

Testing

  • Synthesis tests were added to cover this behavior
  • The AWS-All-In-Infrastructure-Basic example app was configured to use UBL. It deployed successfully and all RenderQueue performance degradations and disruptions disappeared

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license

@jusiskin added the bug (This issue is a bug.) and contribution/core (This is a PR that came from AWS.) labels Oct 21, 2021
@ddneilson self-requested a review October 21, 2021 17:59
Contributor

@ddneilson left a comment

Nice work tracking this one down, Josh!

LGTM

@horsmand merged commit dfbf88f into aws:mainline Oct 21, 2021