Backport of [QT-525] and [QT-530] into release/1.11.x #20160

Merged: ryancragun merged 2 commits into release/1.11.x from backport/qt-525/highly-settling-pangolin on Apr 23, 2023
Conversation
hc-github-team-secure-vault-core (Collaborator) force-pushed the backport/qt-525/highly-settling-pangolin branch 2 times, most recently from e186ac8 to b668681 on April 13, 2023 at 20:53
ryancragun force-pushed the backport/qt-525/highly-settling-pangolin branch from 1a1ed35 to 12f7264 on April 23, 2023 at 22:37
ryancragun changed the title from "Backport of [QT-525] enos: use spot instances for Vault targets into release/1.11.x" to "Backport of [QT-525] and [QT-530] into release/1.11.x" on Apr 23, 2023
The previous strategy for provisioning infrastructure targets was to use the cheapest instances that could reliably perform as Vault cluster nodes. With this change we introduce a new model for target node infrastructure: we've replaced on-demand instances with a spot fleet. While the spot price fluctuates based on dynamic pricing, capacity, region, instance type, and platform, cost savings for our most common combinations range between 20% and 70%.

This change only includes spot fleet targets for Vault clusters. We'll be updating our Consul backend bidding in another PR.

* Create a new `vault_cluster` module that handles installing, configuring, initializing, and unsealing Vault clusters.
* Create a `target_ec2_instances` module that can provision a group of on-demand instances.
* Create a `target_ec2_spot_fleet` module that can bid on a fleet of spot instances.
* Extend every Enos scenario to utilize the spot fleet target acquisition strategy and the `vault_cluster` module.
* Update our Enos CI modules to handle both the `aws-nuke` permissions and the privileges to provision spot fleets.
* Only use us-east-1 and us-west-2 in our scenario matrices, as costs there are lower than in us-west-1.

Signed-off-by: Ryan Cragun <me@ryan.ec>
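For readers unfamiliar with spot fleets, the sketch below shows roughly what a module like `target_ec2_spot_fleet` might wrap using the AWS provider's `aws_spot_fleet_request` resource. This is a minimal illustration, not the module's actual code; the variable names, instance type, and capacity are assumptions.

```hcl
# Minimal sketch: bid on a small fleet of spot instances to serve as
# Vault cluster targets. All variables and values are placeholders.
resource "aws_spot_fleet_request" "vault_targets" {
  iam_fleet_role                      = var.spot_fleet_role_arn # hypothetical input
  target_capacity                     = 3                       # one instance per cluster node
  allocation_strategy                 = "lowestPrice"           # favor the cheapest capacity pool
  terminate_instances_with_expiration = true

  launch_specification {
    ami                    = var.ami_id              # hypothetical input
    instance_type          = "t3.medium"             # example type only
    subnet_id              = var.subnet_id           # hypothetical input
    vpc_security_group_ids = [var.security_group_id] # hypothetical input
  }
}
```

Because the fleet bids across capacity pools rather than requesting fixed on-demand instances, the effective price tracks the spot market, which is where the quoted 20% to 70% savings come from.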
The security groups that allow access to remote machines in Enos scenarios were configured to allow only port 22 (SSH) from the public IP address of the machine executing the Enos scenario. To achieve this we previously utilized the `enos_environment.public_ip_address` attribute.

Sometime in mid-March we started seeing sporadic SSH i/o timeout errors when attempting to execute Enos resources against SSH transport targets. We've only ever seen this when communicating from Azure-hosted runners to AWS-hosted machines. While testing we were able to confirm that in some cases the public IP address resolved using DNS over UDP4 against the Google and OpenDNS name servers did not match what was resolved using the HTTPS/TCP IP address service hosted by AWS. The Enos data source was implemented so that we'd attempt resolution with a single name server and only try the next one if the previous name server could not produce a result. We'd then allow-list that single IP address. That's a problem if we can resolve two different public IP addresses depending on which endpoint we ask.

This change utilizes the new `enos_environment.public_ip_addresses` attribute and its changed behavior. The data source now attempts to resolve our public IP address via name servers hosted by Google, OpenDNS, Cloudflare, and AWS. We then return the unique set of these IP addresses and allow-list all of them in our security group. Our hope is that this resolves the i/o timeout errors, which appear to be caused by the security group black-holing our attempted access because the IP we resolved does not match the one we're actually exiting with.

Signed-off-by: Ryan Cragun <me@ryan.ec>
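As a rough illustration of the allow-listing described above, the sketch below feeds every address in `enos_environment.public_ip_addresses` into an SSH ingress rule. The security group reference is a placeholder and this is not the scenario's actual module code.

```hcl
# Minimal sketch: allow SSH from every public IP address the Enos
# environment data source resolves, since different name servers may
# return different addresses for the same runner.
data "enos_environment" "localhost" {}

resource "aws_security_group_rule" "ssh_from_runner" {
  type              = "ingress"
  from_port         = 22
  to_port           = 22
  protocol          = "tcp"
  cidr_blocks       = [for ip in data.enos_environment.localhost.public_ip_addresses : "${ip}/32"]
  security_group_id = aws_security_group.target.id # placeholder security group
}
```

Allow-listing the full set of resolved addresses means a mismatch between resolvers no longer causes the security group to silently drop the connection.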
ryancragun force-pushed the backport/qt-525/highly-settling-pangolin branch from 12f7264 to 2a6437e on April 23, 2023 at 22:40
ryancragun approved these changes on Apr 23, 2023