GKE autopilot is always created with default service account II #9505
Comments
@slevenick is there any update on this? |
I'm not sure how to proceed with this. This bug is due to a weird interaction between autopilot & the default service account field. Basically, the API is not respecting the request that is sent with the service account. I'm not sure how gcloud is setting up the autopilot cluster with a non-default service account successfully. Can you capture the HTTP requests to see if that is happening in a single request, or if there is a later update to apply the service account? |
Hi, I ran into the same problem. @slevenick, is there any update on this subject? Best regards. |
Sorry for late answer @slevenick - I was on vacation... I executed this:
This is the request:
|
Hi, I ran into the same issue: I am not able to assign a custom service account to an Autopilot GKE cluster with Terraform v1.0.1. @slevenick, is there any update on this subject? Regards, |
Hi, |
@nilsoulinou I created GKE cluster via @venkykuberan @slevenick is this still considered active? |
@tSte are you saying that you can't create a GKE cluster in Autopilot mode with a non-default service account directly with the google provider, and that you have to create it with a gcloud command and then import it with Terraform? If so, I think this issue is still active, because I expect this to be possible with Terraform alone, without manual steps. |
@lrk you're right - all of our clusters are currently created via |
Are there any updates on this thread regarding the ability to use a non-default SA to provision an Autopilot GKE cluster? |
The issue occurs because Terraform is using a deprecated field to set up the service account, while the API no longer respects this field when the cluster type is Autopilot. The following payload to the API will create the cluster successfully:
However, Terraform generates the following payload:
The difference between these two is, the former is using the Perhaps the Terraform provider should move away from the deprecated property to avoid not only this issue but also any other future ones, @slevenick. There is already a TODO item here for that :) |
Thinking about this a little bit more, I believe the API should not simply ignore the field although it is deprecated. I have also created an issue https://issuetracker.google.com/issues/219237911. Impacted people may consider starring the issue. |
@slevenick: Updating assignment because I think this has gone inactive, please correct this if you're still working on it!
The TODO in that file was for another tool that the MM generator used to be used for- Terraform's implementation is handwritten. #7185 and #4963 (roughly) track potential removal of the field. We haven't gone forward with it because of the projected impact- requiring users to rewrite configs, and recreating their clusters if they get it wrong- and the lack of signal from the API that they'll actually remove the field. The API respecting the service account in one case and not the other is confusing and frustrating as both those messages should have created the same cluster- thanks for filing upstream. I think there's a workaround in the provider today, luckily, as you should be able to create clusters with |
Hi all, the underlying API issue seems to be resolved according to here: https://issuetracker.google.com/issues/219237911#comment3 If someone can confirm that on Terraform side this also fixed the issue, then this one can be closed. |
Hi all, Regards Nils |
Hi, Terraform v1.1.7
with the following terraform block:
Do you need any additional information? Nils |
If you feel the issue was not fixed, please drop a comment to https://issuetracker.google.com/issues/219237911#comment3 |
I've recently run into this issue myself. Below are my findings Terraform v1.1.5
Like in #9505 (comment) I noticed the payload that was being generated for a new autopilot cluster was the following:
Looking at the documentation to create a cluster at [1] it lists the command to be used as
So that means that the Terraform provider is using the default cluster creation API [2], which doesn't list any flags to specify autopilot, when it should be using [3] instead. I've verified that using the following command will create an Autopilot cluster with a correct service account.
While I see that there is discussion of a deprecation at [4], it seems like a quicker solution may be to use the API specified in [3], which currently works. [1] https://cloud.google.com/kubernetes-engine/docs/how-to/creating-an-autopilot-cluster#gcloud |
) fixes hashicorp/terraform-provider-google#9505 Signed-off-by: Modular Magician <magic-modules@google.com>
I don't think this is fixed. I've built the provider with #13024 and am trying to provision an Autopilot cluster. We'd previously deleted the default GCE SA from the project entirely, and get
even when specifying a custom SA. |
Hey @mgoodness! Would you mind providing the terraform config so that we can use to reproduce the error and also the debug log if possible? |
Sure thing! Let me know if there's anything else I can try/share. |
Thanks @mgoodness - looking into this, I don't think it's related to the TF implementation. There may be a dependency issue at initialization. If the default SA exists, you should still see your workloads scheduled on nodes using your provided SA but we may still be looking for the default anyway. |
@JeremyOT So more of a GCP API issue? Worth opening a ticket with them...somewhere? I should note that we are able to provision non-Autopilot clusters using (essentially) the same TF config, even with the missing default SA. Seems to only be an issue with AP. |
FYI: This fix was just released as part of https://github.com/hashicorp/terraform-provider-google/releases/tag/v4.44.0 🎉 Sadly, the following tf config is now accepted, but returns an API error:
    resource "google_container_cluster" "my_autopilot_cluster" {
      ## other config
      enable_autopilot = true
      cluster_autoscaling {
        auto_provisioning_defaults {
          service_account = google_service_account.my_account.email
        }
      }
    }
Is the config wrong? |
@johanneswuerbach I'm getting the same error you are, with a similar config. Did you manage to find a fix? |
Sadly not, I think this needs to be reopened @shuyama1 |
This appears to be a server-side issue. The TF config passes the correct parameters to the backend, but the initial nodes created at bootstrapping still use the default SA. New nodes created as workloads are added do use the supplied SA, but this is preventing proper startup when the default SA is deleted. A fix is in progress. |
+1 here. Using beta-autopilot-private-cluster, I added the block:
    cluster_autoscaling {
      auto_provisioning_defaults {
        service_account = google_service_account.my_account.email
      }
    }
I re-deployed the cluster and the new nodes are still being created with the default service account. |
@diegosucaria (nice pilot pic :) ) this is what I was referring to in my comment above. If you deploy workloads and additional nodes are created, they should use your supplied SA - I just verified with both google and google-beta at 4.44.1. |
@diegosucaria I think that is a problem in the module. I filed terraform-google-modules/terraform-google-kubernetes-engine#1488 It seems like their module config (cluster.tf) doesn't even set the service account anywhere. |
Yes, that is correct. I had to do a local copy and added the I still cannot get the new nodes to use the non-default service account. (new nodes for my workloads) It is not a critical problem, but it goes against the best practices we recommend |
@diegosucaria If you have bandwidth, happy to review a PR fixing this in the module too. Some context in terraform-google-modules/terraform-google-kubernetes-engine#1488 |
Looks like it's a module issue now, and a ticket is filed against https://github.com/terraform-google-modules/terraform-google-kubernetes-engine. I'm therefore closing this issue (after reopening it). Sorry for the confusion. Please let me know if the issue still occurs in the provider and this issue needs to be reopened. |
I don't understand how this resolves #9505 (comment). Does it mean that you can't change the service account of an existing cluster? Shouldn't this parameter be set to require a recreation in this case? |
Hey all! This is a little messy, so I checked with @JeremyOT to summarise what's up with this issue. tl;dr: Specifying a service account through The API currently supports specifying service accounts through a few places, and one of those ( The method that was unblocked in GoogleCloudPlatform/magic-modules#6733 (& released in
As raised in #9505 (comment) it was discovered there was an issue with the server-side implementation- the default node pool continues to be created with the default SA. I can doubly confirm a fix is in progress, but can't speak to an exact timeline in this thread (sorry!). The method that Between a server-side fix on the way and a workaround requiring a code change + release + provider upgrade (a best case of a week, but more likely two due to release timing), the best path forward appears to be to wait for the server-side fix to roll out. |
Just wanted to comment here since I made a previous comment claiming that it wasn't working (that I since deleted). I'm using the following in my terraform config and I found that it is working as expected as far as I can tell. I'm not sure how to 100% confirm which service account is actually being used on the autopilot nodes (they seem hidden from the GCP console UI), but I was dealing with some permission issues which didn't resolve until I added The only downside is, I think, that you still have to keep the default compute service account active since autopilot requires it in other ways. As JeremyOT mentioned, I think you need to make sure the nodes scale to 0, or rebuild your cluster, since only new nodes will be created with the correct service account.
    enable_autopilot = true
    cluster_autoscaling {
      auto_provisioning_defaults {
        service_account = "my-service-account@my-project.iam.gserviceaccount.com"
      }
    }
|
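One partial way to check which service account the cluster is configured with is to expose the attribute as a Terraform output. This is only a sketch: it assumes the cluster resource is named `google_container_cluster.my_autopilot_cluster` as in the config posted earlier in this thread, and the output name is hypothetical. It shows what the cluster API reports after apply, not what each individual node actually runs as.

```hcl
# Hypothetical output name; the resource address is assumed to match
# the google_container_cluster block from earlier in the thread.
output "autopilot_node_service_account" {
  description = "Service account reported in the cluster's auto-provisioning defaults"

  # cluster_autoscaling and auto_provisioning_defaults are single-item
  # list blocks, so each is indexed with [0].
  value = google_container_cluster.my_autopilot_cluster.cluster_autoscaling[0].auto_provisioning_defaults[0].service_account
}
```

After `terraform apply`, `terraform output autopilot_node_service_account` prints the stored value. If it shows the default compute service account rather than the custom one, the setting did not take effect at the cluster level.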
Ah, I lost a qualifier at some point when writing my message. I'll edit it back in. My understanding is that the default node pool doesn't respect the setting, but future ones will at the moment. I believe that's the case preventing you from using autopilot w/ the account removed. Once the server-side fix rolls out, the default node pool should respect the setting as well. |
I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. |
Note: This should have rolled out fully by now, and a configuration like the following will apply to all nodes, including the default node pool:
|
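The configuration referred to in the note above was not preserved in this copy of the thread. A minimal sketch, pieced together from the blocks posted earlier in the discussion, might look like the following. The account ID, cluster name, and region are hypothetical placeholders, and this is not necessarily the exact configuration the maintainer posted.

```hcl
# All names and the region below are hypothetical; adjust to your project.
resource "google_service_account" "my_account" {
  account_id   = "autopilot-nodes"
  display_name = "Autopilot node service account"
}

resource "google_container_cluster" "my_autopilot_cluster" {
  name     = "my-autopilot-cluster"
  location = "us-central1" # Autopilot clusters are regional

  enable_autopilot = true

  # The custom service account is supplied through the auto-provisioning
  # defaults, as in the configs shared earlier in this thread.
  cluster_autoscaling {
    auto_provisioning_defaults {
      service_account = google_service_account.my_account.email
    }
  }

  # Assumption: some provider versions may also need an empty
  # ip_allocation_policy {} block for Autopilot clusters.
}
```

Per the discussion above, this relies on provider v4.44.0 or later and on the server-side fix having rolled out. The service account also needs whatever IAM roles your nodes require, which are not shown here.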
Community Note
If an issue is assigned to the modular-magician user, it is either in the process of being autogenerated, or is planned to be autogenerated soon. If an issue is assigned to a user, that user is claiming responsibility for the issue. If an issue is assigned to hashibot, a community member has claimed the issue already.
This is a duplicate of #8918 (see #8918 (comment)). Sorry for creating this, but I don't seem to have rights to re-open the original issue (?) and there doesn't seem to be any activity there.