
Allow setting GPUs limits in cluster autoscaling #807

Closed
parthmishra opened this issue Feb 1, 2021 · 3 comments
Labels: enhancement (New feature or request), P2 (high priority issues), triaged (Scoped and ready for work)

Comments

@parthmishra

When configuring node auto-provisioning, it would be helpful to also be able to include GPU-specific resource limits. Currently, this module only allows setting CPU and memory limits, while the underlying API also accepts GPU-specific resource type strings.

I imagine supporting this option may require a slight change to the cluster_autoscaling block, since the resource type can be any valid GPU type identifier. Example usage might look something like:

cluster_autoscaling = {
    enabled       = true
    min_cpu_cores = 0
    max_cpu_cores = 128
    min_memory_gb = 0
    max_memory_gb = 2000
    min_gpu       = 0
    max_gpu       = 16
    gpu_type      = "nvidia-tesla-k80"
}
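For context, the underlying google_container_cluster resource in the Google provider expresses these limits as repeated resource_limits blocks, where a GPU type is simply passed as the resource_type string alongside "cpu" and "memory". A rough sketch of what the module would generate (cluster name and surrounding arguments are illustrative only, not part of this module):

```hcl
resource "google_container_cluster" "example" {
  name     = "gpu-autoscaling-demo"
  location = "us-central1"

  cluster_autoscaling {
    enabled = true

    # CPU and memory limits, which the module already supports today
    resource_limits {
      resource_type = "cpu"
      minimum       = 0
      maximum       = 128
    }
    resource_limits {
      resource_type = "memory"
      minimum       = 0
      maximum       = 2000
    }

    # GPU limits: the accelerator type itself is the resource_type string
    resource_limits {
      resource_type = "nvidia-tesla-k80"
      minimum       = 0
      maximum       = 16
    }
  }
}
```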

Is something like this currently possible? Apologies if this has been asked before. Thanks!

@parthmishra parthmishra changed the title Allow setting GPUs limits in cluster autoscaling configuring Allow setting GPUs limits in cluster autoscaling Feb 1, 2021
@morgante
Contributor

morgante commented Feb 2, 2021

No, it's not currently possible, but definitely a good idea.

I don't currently have time to implement this, but we would be happy to review a PR adding it.

@morgante morgante added the enhancement, P2, and triaged labels Feb 2, 2021
JamesTimms pushed commits to WayhomeUK/terraform-google-kubernetes-engine that referenced this issue (Jun 30 – Jul 8, 2021):

* add gpu node autoscaling support for top level module
* add gpu node autoscaling support for beta-private-cluster module
* updater example/node_pool cluster_autoscaling var to work with gpu_resources
* Format gpu_resource to meet linter requirements
* Update examples/node_pool/
* Add v16.0 upgrade guide
* Update node_pool test to specify `gpu_resources`
* updates upgrade guide

Co-authored-by: Bharath KKB <bharathkrishnakb@gmail.com>
bharathkkb added a commit that referenced this issue Jul 9, 2021

* feature: add gpu node autoscaling support for all modules (#807)

Co-authored-by: Bharath KKB <bharathkrishnakb@gmail.com>
@max-sixty

Is this closed by #944?

@morgante
Contributor

Yes, I think so.

CPL-markus pushed a commit to WALTER-GROUP/terraform-google-kubernetes-engine that referenced this issue Jul 15, 2024 (terraform-google-modules#944)