Add machine type availability checks to slurm-gcp-v6-nodeset #2962
The boolean logic in the check block does not enforce the correct behavior.
To make the terraform_data resource and precondition useful, it should be inserted into the terraform resource graph.
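One possible reading of this suggestion, as a minimal sketch: have something the module exports depend on the validation resource, so the precondition is evaluated before downstream consumers use the nodeset. The variable `machine_type_available`, the resource name, and the `nodeset` output shape are hypothetical and are not taken from the PR.

```hcl
# Hypothetical input standing in for the real availability check.
variable "machine_type_available" {
  type    = bool
  default = true
}

# Validation-only resource; it creates no cloud infrastructure.
resource "terraform_data" "machine_type_zone_validation" {
  lifecycle {
    precondition {
      condition     = var.machine_type_available
      error_message = "The requested machine type is not available in any of the specified zones."
    }
  }
}

# Hypothetical output. depends_on ties consumers of this output to the
# validation resource, so the check is ordered ahead of them in the graph.
output "nodeset" {
  value      = { name = "example-nodeset" }
  depends_on = [terraform_data.machine_type_zone_validation]
}
```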
Please also change versions.tf to require terraform 1.4
Done.
Please add one final helpful hint to the user. Then squash your commits and this should be ready!
Done. Verified the error as well
6c9bcfc to dcfd1ce
This is a big improvement in user experience. Thank you for the submission!
715f9f2 into GoogleCloudPlatform:develop
Issue
Blueprint YAML files allow users to specify machine type and zone combinations that do not exist. Infrastructure is provisioned by Terraform during ./ghpc deploy, but autoscaling may fail later if the bulk insert APIs find no capacity in the specified zones.
Approach
Created a Terraform precondition that verifies the machine type is available in at least one of the specified zones. If it is not available in any of them, bulk insert has no feasible way to allocate machines, so Terraform exits during plan/apply.
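The check described above might look roughly like the following. This is a sketch under stated assumptions, not the PR's actual code: the variable names, the local name, and the error message wording are hypothetical. It relies on the google provider's google_compute_machine_types data source and on terraform_data, which needs Terraform 1.4 or later.

```hcl
terraform {
  required_version = ">= 1.4" # terraform_data was introduced in 1.4
}

# Hypothetical inputs; the real module's variable names may differ.
variable "machine_type" {
  type = string
}

variable "zones" {
  type = set(string)
}

# Look up the machine type in each candidate zone.
data "google_compute_machine_types" "by_zone" {
  for_each = var.zones
  filter   = "name = \"${var.machine_type}\""
  zone     = each.value
}

locals {
  # Zones in which the lookup returned at least one match.
  zones_with_machine_type = [
    for zone, result in data.google_compute_machine_types.by_zone :
    zone if length(result.machine_types) > 0
  ]
}

# Validation-only resource: plan/apply fails if no zone offers the type.
resource "terraform_data" "machine_type_zone_validation" {
  lifecycle {
    precondition {
      condition     = length(local.zones_with_machine_type) > 0
      error_message = "Machine type ${var.machine_type} is not available in any of the zones: ${join(", ", var.zones)}."
    }
  }
}
```

The condition passes as soon as one zone offers the machine type, which matches the "at least one zone" rule; it does not require every listed zone to be valid.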
Testing
Ran ./ghpc create and ./ghpc deploy on this configuration. The specified machine type does not exist in either zone, us-central1-b or us-central1-c. Verified that Terraform exits and that the error message is as expected.
config:
output:
Added 1 valid zone (us-central1-a) and verified that the infrastructure is provisioned correctly.
Submission Checklist
Please take the following actions before submitting this pull request.