Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore(ebs): set default volumeType to gp3(under feature flag) #29934

Merged
merged 15 commits into from
Apr 30, 2024

Conversation

pahud
Copy link
Contributor

@pahud pahud commented Apr 23, 2024

Issue # (if applicable)

As the EBS console is now having gp3 as the default volumeType, this PR set the default volumeType to gp3 if undefined under feature flag.

Closes #29931

Reason for this change

Description of changes

Description of how you validated changes

I have deployed the sample below and verified the volume type is gp3 from console.

import { Stack, App, Size, aws_ec2 as ec2 } from 'aws-cdk-lib';
import * as cxapi from 'aws-cdk-lib/cx-api';

const app = new App();

const stack = new Stack(app, 'demo-stack');

stack.node.setContext(cxapi.EBS_DEFAULT_GP3, true);

// should create a gp3 volume
new ec2.Volume(stack, 'Volume', {
  availabilityZone: 'us-east-1a',
  size: Size.gibibytes(500),
});

Checklist


By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license

@github-actions github-actions bot added effort/medium Medium work item – several days of effort feature-request A feature should be added or improved. p2 labels Apr 23, 2024
@aws-cdk-automation aws-cdk-automation requested a review from a team April 23, 2024 16:20
@mergify mergify bot added the contribution/core This is a PR that came from AWS. label Apr 23, 2024
@@ -128,7 +127,8 @@ The following json shows the current recommended set of flags, as `cdk init` wou
"@aws-cdk/aws-codepipeline:crossAccountKeysDefaultValueToFalse": true,
"@aws-cdk/aws-codepipeline:defaultPipelineTypeToV2": true,
"@aws-cdk/aws-kms:reduceCrossAccountRegionPolicyScope": true,
"@aws-cdk/aws-eks:nodegroupNameAttribute": true
"@aws-cdk/aws-eks:nodegroupNameAttribute": true,
"@aws-cdk/aws-ec2:ebsDefaultGp3Volume": true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason not to have something like "@aws-cdk/aws-ec2:ebsDefaultVolume": "gp3"? The current setup makes future changes in the default volume type a little more awkward, we'll have to have multiple aws-cdk/aws-ec2:ebsDefaultXXXVolume flags, make sure there are no conflicts, etc.

Copy link
Contributor Author

@pahud pahud Apr 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's say if we introduce "@aws-cdk/aws-ec2:ebsDefaultVolume": "gp3" in this PR and after a while AWS announces gp4 as a new default and we modify this feature flag to "@aws-cdk/aws-ec2:ebsDefaultVolume": "gp4", this would cause breaking changes for those opting in gp3 and we still need a check to see if they have explicitly enable this flag as gp3. I am not sure if that would be easier to maintain. Another consideration is conventionally our feature flags are boolean except for target partitions and we generally don't update value of existing feature flags. Will check this with the maintainers.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we modify this feature flag to "@aws-cdk/aws-ec2:ebsDefaultVolume": "gp4", this would cause breaking changes for those opting in gp3 and we still need a check to see if they have explicitly enable this flag as gp3

Wouldn't we only change the recommended value to gp4? Users that opted in to gp3 would be in the same situation as those who enabled ebsDefaultGp3Volume if we introduce ebsDefaultGp4Volume. Arguably even worse off, since they would both need to disable ebsDefaultGp3Volume and enable ebsDefaultGp4Volume, vs just updating ebsDefaultVolume to gp4.

I'm leaving it up to you guys, I might not be seeing the whole picture, just throwing it out there 👍

@pahud pahud marked this pull request as ready for review April 23, 2024 19:54
@TheRealAmazonKendra
Copy link
Contributor

If we make this switch without a feature flag, what is the behavior of this change? Obviously there will be a replacement of instances but will this cause downtime? If we can just update it, no reason not to.

@pahud
Copy link
Contributor Author

pahud commented Apr 23, 2024

@TheRealAmazonKendra Let me check and update here.

@pahud
Copy link
Contributor Author

pahud commented Apr 23, 2024

@TheRealAmazonKendra

CloudFormation would trigger in-place volume update but this is what Amazon Q said:

It is generally not recommended to update a gp2 volume to gp3 while it is mounted and in use. Here are a few things to keep in mind:

Migrating the volume will temporarily impact availability and performance while the update is in progress. This could cause downtime or errors for applications using the volume.

The migration process modifies properties like IOPS and throughput provisioned on the volume. This could lead to a performance mismatch if the volume is actively being used during migration.

It is always better to take necessary precautions like making snapshots, detaching volumes, updating, and then reattaching to avoid any data corruption or loss during migration.

Some best practices for migrating from gp2 to gp3 include:

Take a snapshot of the gp2 volume to have a backup.

Detach the gp2 volume from the instance to avoid I/O activity during migration.

Modify the volume type to gp3 using the AWS CLI or console and select appropriate IOPS and throughput.

Reattach the updated gp3 volume to the instance once migration is complete.

So I guess we probably still need a feature flag for that.

@aws-cdk-automation aws-cdk-automation added the pr/needs-maintainer-review This PR needs a review from a Core Team Member label Apr 24, 2024
@TheRealAmazonKendra
Copy link
Contributor

I was looking around to see what the EOL date is for these instance types (there is none) and found this https://aws.amazon.com/blogs/storage/migrate-your-amazon-ebs-volumes-from-gp2-to-gp3-and-save-up-to-20-on-costs/. It seems that there isn't actually downtime when a replacement happens but that there may be other considerations we need to take into account if someone is migrating.

@TheRealAmazonKendra
Copy link
Contributor

@TheRealAmazonKendra
Copy link
Contributor

All that to say, I totally agree that we need to add a feature flag here but we also probably need to add a bit more than just that. Our documentation should make these limitations clear. I'm not sure if there's much else we can do to assist in a smooth upgrade.

From the fact that gp3 was released in 2020 and we haven't had any new ones since then, I'm a bit torn about how much we need to future proof this.

If we want to really future proof it, we could update the default here to be LATEST and say in the documentation that we will use the latest volume type available at the time of implementation. To ensure we're not updating that version without the user intending to, we could save that version in context and require the user to clear that out in order to upgrade. I'm modeling this idea off of what we do with Amazon Linux AMIs.

This approach would still require a feature flag but it would ensure that we aren't having to update that feature flag and/or overwrite it in the future.

Thoughts on this approach? If it feels like overkill, I'm not opposed to how you've added the feature flag as is. Perhaps we should check with the EBS team to see if there's anything on the horizon for gp4 or similar and to see if there's anything thorny we need to know about updating via CloudFormation so we can warn users.

@pahud
Copy link
Contributor Author

pahud commented Apr 24, 2024

To ensure we're not updating that version without the user intending to, we could save that version in context and require the user to clear that out in order to upgrade. I'm modeling this idea off of what we do with Amazon Linux AMIs.

I'm interested to know more about this. Are you referring latestamazonlinux2() which essentially lookup the value from ssm parameter through a context provider and cache in the context? Is there any existing context provider we can reuse for this use case? Adding a new context provider just for this seems a little bit overkill to me.

I am not a big fan of LATEST and I prefer explicit, clear and self-explanatory construct props without users having to deal with the cache invalidation. For example, if I use LATEST for the default ebs volume type, I won't know what exactly volume type would be picked by CDK from my code and I'll need to check if there's something cached in my local context and double check my synthesized template. But if I explicitly set GP3 as the default with feature flag I have the confidence it would always be GP3 unless I explicitly set something else.

RE gp4, by the time gp4 is announced as the default volume type and we are still at CDK v2, we could introduce ebsDefaultGp4Volume which has high priority over ebsDefaultGp3Volume if both exists and we could have a helper function to determine the default type if multiple feature flags are enabled. The advantage is that the user code would be very clear and self-explanatory without ambiguity. I am also fine with ebsDefaultVolume: gp4 as mentioned above by @nmussy but that seems more like a context variable to switch the defaults rather than a feature flag to me and I can't find any existing similar use cases from our code base.

Copy link
Contributor

@GavinZZ GavinZZ left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pahud
Copy link
Contributor Author

pahud commented Apr 24, 2024

@GavinZZ DONE

@GavinZZ
Copy link
Contributor

GavinZZ commented Apr 25, 2024

Read through all the comments and I actually like the idea proposed by @nmussy. Haven't thought about the implementation difficulty but my first instinct is that we don't need anything else to support string type value in feature flag. When gp4 is out, we can switch the recommended value to gp4 and stacks created before shouldn't get impacted while new stacks get default support using gp4.

Note that I also prefer specifying explicit value instead of LATEST, users would have to look into the code for an additional step to figure out the exact version LATEST refers to.

This was referenced Apr 26, 2024
Copy link
Contributor

@GavinZZ GavinZZ left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, discussed with the team and decided to continue using boolean value for feature flags.

Copy link
Contributor

mergify bot commented Apr 29, 2024

Thank you for contributing! Your pull request will be updated from main and then merged automatically (do not update manually, and be sure to allow changes to be pushed to your fork).

@aws-cdk-automation aws-cdk-automation removed the pr/needs-maintainer-review This PR needs a review from a Core Team Member label Apr 29, 2024
Copy link
Contributor

mergify bot commented Apr 29, 2024

Thank you for contributing! Your pull request will be updated from main and then merged automatically (do not update manually, and be sure to allow changes to be pushed to your fork).

Copy link
Contributor

mergify bot commented Apr 30, 2024

Thank you for contributing! Your pull request will be updated from main and then merged automatically (do not update manually, and be sure to allow changes to be pushed to your fork).

@aws-cdk-automation
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: AutoBuildv2Project1C6BFA3F-wQm2hXv2jqQv
  • Commit ID: 2082d37
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@mergify mergify bot merged commit bbde879 into aws:main Apr 30, 2024
9 checks passed
Copy link
Contributor

mergify bot commented Apr 30, 2024

Thank you for contributing! Your pull request will be updated from main and then merged automatically (do not update manually, and be sure to allow changes to be pushed to your fork).

@aws-cdk-automation
Copy link
Collaborator

Comments on closed issues and PRs are hard for our team to see. If you need help, please open a new issue that references this one.

@aws aws locked as resolved and limited conversation to collaborators Jul 25, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
contribution/core This is a PR that came from AWS. effort/medium Medium work item – several days of effort feature-request A feature should be added or improved. p2
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ec2: change default Volume.volumeType and/or change generic EbsDeviceVolumeType
5 participants