feat(batch): set default spot allocation strategy to `SPOT_PRICE_CAPACITY_OPTIMIZED` (#26731)

https://aws.amazon.com/about-aws/whats-new/2023/08/aws-batch-price-capacity-optimized-allocation-strategy-spot-instances/ and https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-fleet-allocation-strategy.html

`SPOT_PRICE_CAPACITY_OPTIMIZED` is now recommended over `SPOT_CAPACITY_OPTIMIZED`; make it the new default, while the construct is still in alpha.

BREAKING CHANGE: Compute Environments that use spot instances and do not set an allocation strategy now default to `SPOT_PRICE_CAPACITY_OPTIMIZED` instead of `SPOT_CAPACITY_OPTIMIZED`.

----

*By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license*
comcalvi authored Aug 12, 2023
1 parent ce2f844 commit e0ca252
Showing 11 changed files with 487 additions and 145 deletions.
21 changes: 13 additions & 8 deletions packages/@aws-cdk/aws-batch-alpha/README.md
@@ -128,19 +128,23 @@ computeEnv.addInstanceClass(ec2.InstanceClass.R4);

#### Allocation Strategies

-| Allocation Strategy | Optimized for | Downsides |
-| ----------------------- | ------------- | ----------------------------- |
-| BEST_FIT | Cost | May limit throughput |
-| BEST_FIT_PROGRESSIVE | Throughput | May increase cost |
-| SPOT_CAPACITY_OPTIMIZED | Least interruption | Only useful on Spot instances |
+| Allocation Strategy | Optimized for | Downsides |
+| ----------------------- | ------------- | ----------------------------- |
+| BEST_FIT | Cost | May limit throughput |
+| BEST_FIT_PROGRESSIVE | Throughput | May increase cost |
+| SPOT_CAPACITY_OPTIMIZED | Least interruption | Only useful on Spot instances |
+| SPOT_PRICE_CAPACITY_OPTIMIZED | Least interruption + Price | Only useful on Spot instances |

Batch provides different Allocation Strategies to help it choose which instances to provision.
If your workflow tolerates interruptions, you should enable `spot` on your `ComputeEnvironment`
-and use `SPOT_CAPACITY_OPTIMIZED` (this is the default if `spot` is enabled).
+and use `SPOT_PRICE_CAPACITY_OPTIMIZED` (this is the default if `spot` is enabled).
This will tell Batch to choose the instance types from the ones you’ve specified that have
-the most spot capacity available to minimize the chance of interruption.
+the most spot capacity available to minimize the chance of interruption and have the lowest price.
To get the most benefit from your spot instances,
you should allow Batch to choose from as many different instance types as possible.
+If you only care about minimal interruptions and do not want Batch to optimize for cost, use
+`SPOT_CAPACITY_OPTIMIZED`. `SPOT_PRICE_CAPACITY_OPTIMIZED` is recommended over `SPOT_CAPACITY_OPTIMIZED`
+for most use cases.

If your workflow does not tolerate interruptions and you want to minimize your costs at the expense
of potentially longer waiting times, use `AllocationStrategy.BEST_FIT`.
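
The guidance in this hunk amounts to a small decision procedure. A rough sketch follows, under stated assumptions: `Workload`, `Recommendation`, and `recommendStrategy` are hypothetical names introduced only for illustration and are not part of the CDK, and the fallback to `BEST_FIT_PROGRESSIVE` is inferred from the table's "Throughput" row rather than from the paragraph itself.

```typescript
// Hypothetical helper, not part of the CDK: encodes the README guidance above.
type Strategy =
  | 'BEST_FIT'
  | 'BEST_FIT_PROGRESSIVE'
  | 'SPOT_CAPACITY_OPTIMIZED'
  | 'SPOT_PRICE_CAPACITY_OPTIMIZED';

interface Workload {
  toleratesInterruptions: boolean;
  // With spot: set to false if only the interruption rate matters, not price.
  optimizeForPrice?: boolean;
  // Without spot: accept longer waiting times in exchange for lower cost.
  acceptLongerWaits?: boolean;
}

interface Recommendation {
  spot: boolean;
  strategy: Strategy;
}

function recommendStrategy(w: Workload): Recommendation {
  if (w.toleratesInterruptions) {
    // Interruption-tolerant workloads: enable spot; price-aware unless opted out.
    const strategy: Strategy = w.optimizeForPrice === false
      ? 'SPOT_CAPACITY_OPTIMIZED'
      : 'SPOT_PRICE_CAPACITY_OPTIMIZED';
    return { spot: true, strategy };
  }
  // Assumption: without spot, fall back to the throughput-oriented strategy
  // unless the caller explicitly trades waiting time for cost.
  return {
    spot: false,
    strategy: w.acceptLongerWaits ? 'BEST_FIT' : 'BEST_FIT_PROGRESSIVE',
  };
}
```

For instance, `recommendStrategy({ toleratesInterruptions: true })` selects spot with `SPOT_PRICE_CAPACITY_OPTIMIZED`, which matches the new default.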
@@ -189,7 +193,8 @@ const computeEnv = new batch.ManagedEc2EcsComputeEnvironment(this, 'myEc2Compute
You can specify the maximum and minimum vCPUs a managed `ComputeEnvironment` can have at any given time.
Batch will *always* maintain `minvCpus` worth of instances in your ComputeEnvironment, even if it is not executing any jobs,
and even if it is disabled. Batch will scale the instances up to `maxvCpus` worth of instances as
-jobs exit the JobQueue and enter the ComputeEnvironment. If you use `AllocationStrategy.BEST_FIT_PROGRESSIVE` or `AllocationStrategy.SPOT_CAPACITY_OPTIMIZED`,
+jobs exit the JobQueue and enter the ComputeEnvironment. If you use `AllocationStrategy.BEST_FIT_PROGRESSIVE`,
+`AllocationStrategy.SPOT_PRICE_CAPACITY_OPTIMIZED`, or `AllocationStrategy.SPOT_CAPACITY_OPTIMIZED`,
Batch may exceed `maxvCpus`; it will never exceed `maxvCpus` by more than a single instance type. This example configures a
`minvCpus` of 10 and a `maxvCpus` of 100:
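
The README's own example falls outside this hunk. Separately, the overshoot rule described above can be made concrete with a small calculation. This is a sketch under an assumption: it reads "a single instance type" as "at most one additional instance of the largest size Batch may choose". The function name `worstCaseVcpus` and the instance sizes are hypothetical and chosen only for illustration.

```typescript
// Hypothetical helper: upper bound on provisioned vCPUs, assuming Batch
// overshoots `maxvCpus` by at most one instance of the largest allowed size.
function worstCaseVcpus(maxvCpus: number, instanceVcpuSizes: number[]): number {
  if (instanceVcpuSizes.length === 0) {
    throw new Error('at least one instance size is required');
  }
  // Find the largest per-instance vCPU count among the allowed sizes.
  const largest = instanceVcpuSizes.reduce((a, b) => (b > a ? b : a));
  return maxvCpus + largest;
}

// Illustrative sizes only: instances with 4, 16, and 64 vCPUs.
const bound = worstCaseVcpus(100, [4, 16, 64]); // 164
```

Under that reading, a tighter ceiling follows from restricting the compute environment to smaller instance sizes.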

@@ -452,6 +452,15 @@ export enum AllocationStrategy {
    * you should allow Batch to choose from as many different instance types as possible.
    */
   SPOT_CAPACITY_OPTIMIZED = 'SPOT_CAPACITY_OPTIMIZED',
+
+  /**
+   * The price and capacity optimized allocation strategy looks at both price and capacity
+   * to select the Spot Instance pools that are the least likely to be interrupted
+   * and have the lowest possible price.
+   *
+   * The Batch team recommends this over `SPOT_CAPACITY_OPTIMIZED` in most instances.
+   */
+  SPOT_PRICE_CAPACITY_OPTIMIZED = 'SPOT_PRICE_CAPACITY_OPTIMIZED',
 }

 /**
@@ -1145,7 +1154,9 @@ function createSpotFleetRole(scope: Construct): IRole {
 function determineAllocationStrategy(id: string, allocationStrategy?: AllocationStrategy, spot?: boolean): AllocationStrategy | undefined {
   let result = allocationStrategy;
   if (!allocationStrategy) {
-    result = spot ? AllocationStrategy.SPOT_CAPACITY_OPTIMIZED : AllocationStrategy.BEST_FIT_PROGRESSIVE;
+    result = spot ? AllocationStrategy.SPOT_PRICE_CAPACITY_OPTIMIZED : AllocationStrategy.BEST_FIT_PROGRESSIVE;
+  } else if (allocationStrategy === AllocationStrategy.SPOT_PRICE_CAPACITY_OPTIMIZED && !spot) {
+    throw new Error(`Managed ComputeEnvironment '${id}' specifies 'AllocationStrategy.SPOT_PRICE_CAPACITY_OPTIMIZED' without using spot instances`);
   } else if (allocationStrategy === AllocationStrategy.SPOT_CAPACITY_OPTIMIZED && !spot) {
     throw new Error(`Managed ComputeEnvironment '${id}' specifies 'AllocationStrategy.SPOT_CAPACITY_OPTIMIZED' without using spot instances`);
   }
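
The defaulting and validation rules in this hunk can be exercised in isolation. What follows is a minimal, self-contained sketch rather than the library's code: the enum is redeclared locally so the snippet runs without the CDK, the two spot-only checks are folded into one condition, and the final `return` is an assumption because the end of the function falls outside the hunk.

```typescript
// Local stand-in for the library's enum (assumption: same member names and values).
enum AllocationStrategy {
  BEST_FIT = 'BEST_FIT',
  BEST_FIT_PROGRESSIVE = 'BEST_FIT_PROGRESSIVE',
  SPOT_CAPACITY_OPTIMIZED = 'SPOT_CAPACITY_OPTIMIZED',
  SPOT_PRICE_CAPACITY_OPTIMIZED = 'SPOT_PRICE_CAPACITY_OPTIMIZED',
}

function determineAllocationStrategy(
  id: string,
  allocationStrategy?: AllocationStrategy,
  spot?: boolean,
): AllocationStrategy {
  if (!allocationStrategy) {
    // No explicit choice: the default now depends on whether spot is enabled.
    return spot
      ? AllocationStrategy.SPOT_PRICE_CAPACITY_OPTIMIZED
      : AllocationStrategy.BEST_FIT_PROGRESSIVE;
  }
  // Both spot strategies are rejected when spot instances are not in use.
  const isSpotOnly =
    allocationStrategy === AllocationStrategy.SPOT_PRICE_CAPACITY_OPTIMIZED ||
    allocationStrategy === AllocationStrategy.SPOT_CAPACITY_OPTIMIZED;
  if (isSpotOnly && !spot) {
    throw new Error(
      `Managed ComputeEnvironment '${id}' specifies 'AllocationStrategy.${allocationStrategy}' without using spot instances`,
    );
  }
  // Assumed ending: an explicit, valid choice is returned unchanged.
  return allocationStrategy;
}
```

In this sketch, passing `AllocationStrategy.SPOT_CAPACITY_OPTIMIZED` together with `spot: true` returns the previous default unchanged, which suggests how the behavior from before this commit can be kept.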
@@ -1,5 +1,5 @@
 {
-  "version": "32.0.0",
+  "version": "33.0.0",
   "files": {
     "21fbb51d7b23f6a6c262b46a9caee79d744a3ac019fd45422d988b96d44b2a22": {
       "source": {
@@ -1,15 +1,15 @@
 {
-  "version": "32.0.0",
+  "version": "33.0.0",
   "files": {
-    "81f3134124cef368d56ccabda586dbcbef39a78089edd14c9d641cbcb4e0bad2": {
+    "c107f22b1a273d6b3e98ae47d04dfc2c17295a01e96b0b2a69ceaaad3ec33905": {
       "source": {
         "path": "batch-stack.template.json",
         "packaging": "file"
       },
       "destinations": {
         "current_account-current_region": {
           "bucketName": "cdk-hnb659fds-assets-${AWS::AccountId}-${AWS::Region}",
-          "objectKey": "81f3134124cef368d56ccabda586dbcbef39a78089edd14c9d641cbcb4e0bad2.json",
+          "objectKey": "c107f22b1a273d6b3e98ae47d04dfc2c17295a01e96b0b2a69ceaaad3ec33905.json",
           "assumeRoleArn": "arn:${AWS::Partition}:iam::${AWS::AccountId}:role/cdk-hnb659fds-file-publishing-role-${AWS::AccountId}-${AWS::Region}"
         }
       }