Abstract out node disposal #1686
# When there's more than 1 message in the pool queue
operator=ComparisonOperationType.GREATER_THAN,
operator=ComparisonOperationType.GREATER_THAN_OR_EQUAL,
This was a bug: new nodes would not spin up when there was only 1 message in the queue.
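To illustrate why the operator matters, here is a minimal sketch in plain Python. It does not call the Azure SDK: the `Comparison` enum, the `should_scale_out` helper, and the threshold of 1 are assumptions made for illustration, inferred from the comment above.

```python
import operator
from enum import Enum


class Comparison(Enum):
    """Hypothetical stand-in for the SDK's ComparisonOperationType."""

    GREATER_THAN = "GreaterThan"
    GREATER_THAN_OR_EQUAL = "GreaterThanOrEqual"


# Map each comparison to the matching Python operator.
_OPS = {
    Comparison.GREATER_THAN: operator.gt,
    Comparison.GREATER_THAN_OR_EQUAL: operator.ge,
}


def should_scale_out(message_count: int, threshold: int, op: Comparison) -> bool:
    """Return True when the queue depth satisfies the scale-out rule."""
    return _OPS[op](message_count, threshold)


# Assumption: the rule's threshold is 1 message.
THRESHOLD = 1

if __name__ == "__main__":
    for count in (0, 1, 2):
        old = should_scale_out(count, THRESHOLD, Comparison.GREATER_THAN)
        new = should_scale_out(count, THRESHOLD, Comparison.GREATER_THAN_OR_EQUAL)
        print(f"messages={count} old_rule={old} new_rule={new}")
```

Under that assumed threshold, a queue holding exactly one message satisfies only the `GREATER_THAN_OR_EQUAL` rule, which matches the behavior described in the comment.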
@@ -123,23 +123,47 @@ def create_auto_scale_profile(min: int, max: int, queue_uri: str) -> AutoscalePr
metric_trigger=MetricTrigger(
metric_name="ApproximateMessageCount",
metric_resource_uri=queue_uri,
# Check every minute
time_grain=timedelta(minutes=1),
# Check every 15 minutes
Some of these timing numbers were tuned after reading this guidance on autoscaling: https://docs.microsoft.com/en-us/azure/architecture/best-practices/auto-scaling
Please add this link as a comment in the code.
These are just default values, right? We're (eventually) providing a way for the admin to customize these values? Or do we expect them to use the portal for tweaks?
I think the values that we should include in the CLI are:
- `max`/`default` - required to create the scale set in the first place
- `min` - optional but defaults to 1
- `scale-out-amount`/`scale-out-cooldown`/`scale-in-amount`/`scale-in-cooldown` - will vary by scale set and use case, so it's convenient to have them configurable. Ex: busier systems will want bigger `scale-{in|out}-amount` values; if nodes take long to set up, they'll want longer cooldowns.

We can keep the current values as defaults since I think they're appropriate for a less busy deployment.
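One way the proposed options could look, sketched with Python's standard `argparse`. This is not the project's actual CLI: the program name, the `dest` names, the integer types, and the `min <= max` check are assumptions for illustration. The four scale amounts and cooldowns default to `None` here, meaning "fall back to the existing built-in values", since the concrete defaults are not stated in this thread.

```python
import argparse
from typing import List


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical command name; the real CLI may be structured differently.
    parser = argparse.ArgumentParser(prog="scaleset-create")
    # Required to create the scale set in the first place.
    parser.add_argument("--max", type=int, required=True, dest="max_size")
    parser.add_argument("--default", type=int, required=True, dest="default_size")
    # Optional, defaults to 1 as suggested in the comment.
    parser.add_argument("--min", type=int, default=1, dest="min_size")
    # Tunables that vary by scale set; None means "use the built-in default".
    for name in (
        "scale-out-amount",
        "scale-out-cooldown",
        "scale-in-amount",
        "scale-in-cooldown",
    ):
        parser.add_argument(f"--{name}", type=int, default=None)
    return parser


def parse_settings(argv: List[str]) -> argparse.Namespace:
    args = build_parser().parse_args(argv)
    # Assumed sanity check, not taken from the PR.
    if args.min_size > args.max_size:
        raise ValueError("--min must not exceed --max")
    return args


if __name__ == "__main__":
    settings = parse_settings(["--max", "10", "--default", "2", "--scale-out-amount", "5"])
    print(settings)
```

Leaving the tunables unset keeps the current behavior, while a busier deployment could pass a larger `--scale-out-amount` or a longer cooldown.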
logging.info(
SCALESET_LOG_PREFIX + "unexpected scaleset size, resizing. "
"scaleset_id:%s expected:%d actual:%d",
self.scaleset_id,
self.size,
size,
)
self.set_state(ScalesetState.resize)
Synchronizing the recorded number of instances in a scale set with Azure doesn't require resizing.
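A rough sketch of that idea: record the size Azure reports without triggering a state transition. The `ScalesetRecord` class, its fields, and the `sync_size` helper are hypothetical stand-ins, not the service's actual model.

```python
import logging
from dataclasses import dataclass


@dataclass
class ScalesetRecord:
    """Hypothetical stand-in for the service's scale set model."""

    scaleset_id: str
    size: int
    state: str = "running"


def sync_size(record: ScalesetRecord, actual_size: int) -> bool:
    """Adopt the size reported by Azure; return True if the record changed."""
    if record.size == actual_size:
        return False
    logging.info(
        "scaleset size changed, syncing. scaleset_id:%s expected:%d actual:%d",
        record.scaleset_id,
        record.size,
        actual_size,
    )
    # Only the stored size is updated; the state is left untouched,
    # so no resize operation is requested.
    record.size = actual_size
    return True
```

The point of the sketch is that the mismatch is treated as new information to store, not as a condition to correct.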
Summary of the Pull Request
Introduces a new way for nodes to be reaped from a scale set. This allows Azure autoscale to scale in nodes when appropriate.
PR Checklist
Info on Pull Request
This PR includes:
- `resize` state that is no longer necessary
Validation Steps Performed