feat: add autoscaler local exec option #895

Open

cwiggs wants to merge 9 commits into main

Conversation

@cwiggs (Contributor) commented Feb 8, 2024

This commit adds the ability to deploy the k8s cluster autoscaler using local-exec rather than remote-exec. Using local-exec is helpful when you don't use the operator/bastion features of this module. Now, if you set the cluster_autoscaler_remote_exec variable to false, Terraform will run a `kubectl apply` command on the same machine where you are running Terraform.

More info in this issue: #894

Signed-off-by: Chris Wiggins <5607419+cwiggs@users.noreply.github.com>
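For illustration, a minimal consumer-side sketch of the option described above; only the variable name comes from this PR, while the module source and everything else are placeholders rather than the module's full interface:

# Hypothetical usage sketch: opting into the local-exec path.
# Only cluster_autoscaler_remote_exec comes from this PR; the rest is a placeholder.
module "oke" {
  source = "oracle-terraform-modules/oke/oci"

  # false => apply the cluster autoscaler from the machine running Terraform
  # instead of from the operator/bastion host.
  cluster_autoscaler_remote_exec = false
}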

The oracle-contributor-agreement bot added the OCA Verified label (All contributors have signed the Oracle Contributor Agreement) on Feb 8, 2024.

}

provisioner "local-exec" {
  inline = ["cat ${local.cluster_autoscaler_manifest} | kubectl apply --dry-run='client' -f -"]
}

(A review comment on this hunk was marked as outdated.)

@@ -118,7 +127,7 @@ data "helm_template" "cluster_autoscaler" {
}

resource "null_resource" "cluster_autoscaler" {

@cwiggs (Contributor, Author) commented:

I think it's cleaner to change this to remote_cluster_autoscaler so it matches the new local_cluster_autoscaler; however, I didn't want to break anything upstream. Options:

  • Change the name and add a moved {} block (see the sketch below).
  • Leave the name as it is.

Any thoughts?
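A sketch of the first option above, assuming the rename to remote_cluster_autoscaler; the moved block migrates existing state to the new address instead of destroying and recreating the resource:

# Sketch of option 1: rename the resource and record the move so existing
# state follows the new name rather than being destroyed and recreated.
moved {
  from = null_resource.cluster_autoscaler
  to   = null_resource.remote_cluster_autoscaler
}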

A Contributor replied:

In that case, all extensions will need this kind of feature. I like your idea of using a variable to control remote or local exec. Perhaps just use the same null resource but different conditional blocks based on the variable?
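A rough sketch of that idea; Terraform cannot make provisioner blocks themselves conditional, so this approximates the suggestion with two count-gated variants of the resource driven by the proposed variable (names are illustrative, not this module's code):

# Approximation of the suggestion: gate two variants of the resource on the variable.
resource "null_resource" "cluster_autoscaler_remote" {
  count = var.cluster_autoscaler_remote_exec ? 1 : 0
  # remote-exec provisioner running against the operator host would go here
}

resource "null_resource" "cluster_autoscaler_local" {
  count = var.cluster_autoscaler_remote_exec ? 0 : 1
  # local-exec provisioner running kubectl apply on the Terraform host would go here
}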

@cwiggs (Contributor, Author) replied:

I ended up going with the helm_release resource; I think that is the better approach. I attempted to use a separate null_resource with a local-exec provisioner, but I would also have to add a local_file resource to create the manifest and then apply it with kubectl. Using kubectl makes sense for the remote case, where you can't use the helm_release resource, but locally you can use helm_release directly.
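A hedged sketch of that helm_release shape: the resource name, release name, and namespace match the error output later in this thread, while the chart repository and the count gating are assumptions rather than the PR's exact code.

# Sketch only: install the chart locally via the helm provider when the
# remote-exec path is disabled. Repository and gating are assumptions.
resource "helm_release" "local_cluster_autoscaler" {
  count      = var.cluster_autoscaler_remote_exec ? 0 : 1
  name       = "cluster-autoscaler"
  namespace  = "kube-system"
  repository = "https://kubernetes.github.io/autoscaler"
  chart      = "cluster-autoscaler"
}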

A Member replied:

I don't see the helm provider being initialized.
I recommend using a kube-config datasource and creating a local kube-config file instead of relying on the existing one and the currently configured context.

Note that this approach assumes that OCI CLI is already installed and configured locally.
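A sketch of that suggestion, assuming the OCI provider's kube-config datasource and a locally written config file; the generated kube-config authenticates through the OCI CLI, which is why the CLI must be installed and configured locally. var.cluster_id and the file path are placeholders.

# Sketch: fetch a kube-config for the cluster, write it locally, and point
# the helm provider at that file instead of the user's existing context.
data "oci_containerengine_cluster_kube_config" "local" {
  cluster_id = var.cluster_id
}

resource "local_sensitive_file" "kube_config" {
  content  = data.oci_containerengine_cluster_kube_config.local.content
  filename = "${path.root}/generated/kubeconfig"
}

provider "helm" {
  kubernetes {
    config_path = local_sensitive_file.kube_config.filename
  }
}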

(A follow-up comment from @cwiggs was marked as outdated.)

}

provisioner "local-exec" {
  command = "cat ${local.cluster_autoscaler_manifest} | kubectl apply --dry-run='client' -f -"
}

@cwiggs (Contributor, Author) commented:

cat-ing that local variable into kubectl apply doesn't seem to work. I get an error like the following:

│       serviceAccountName: cluster-autoscaler-oci-oke-cluster-autoscaler                                      
│       tolerations:                                                                                                           
│         []                                                                                                                   
│  | kubectl apply --dry-run='client' -f -': exit status 2. Output: : -: not found                             
│ /bin/sh: 81: -: not found                                    
│ /bin/sh: 82: -: not found                                                                                                    
│ /bin/sh: 83: -: not found                                    
│ /bin/sh: 84: -: not found 

Maybe there are newlines in the variable that are messing it up?

Options:

  • Use the helm provider that is already in versions.tf to just apply the helm chart.
  • Use the local_file resource to actually save the manifest YAML, which would have a similar effect to what the remote-exec is doing (see the sketch below).
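A sketch of the second option, under the assumption that writing the rendered manifest to disk first avoids interpolating a multi-line string into the shell command; resource and path names are illustrative:

# Sketch of option 2: materialize the rendered manifest with local_file,
# then apply the file so no multi-line string reaches the shell.
resource "local_file" "cluster_autoscaler_manifest" {
  content  = local.cluster_autoscaler_manifest
  filename = "${path.module}/generated/cluster-autoscaler.yaml"
}

resource "null_resource" "local_cluster_autoscaler" {
  provisioner "local-exec" {
    command = "kubectl apply --dry-run=client -f ${local_file.cluster_autoscaler_manifest.filename}"
  }
}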

cwiggs and others added 2 commits February 8, 2024 15:16

This commit updates the autoscaler so that, when you want it to run locally, it uses the helm_release resource instead of a null_resource with a local-exec provisioner.

@cwiggs marked this pull request as ready for review February 8, 2024 22:57

@cwiggs (Contributor, Author) commented Feb 8, 2024

I was able to get this working with the helm_release resource. You can read more about why I used that resource in the comment above.

I did run into an issue when trying to apply the helm package locally after I had deployed it via the remote method. It looks like when the autoscaler is deployed via the manifest generated by the helm_template resource and applied with kubectl, some Helm annotations are missing. Since the annotations are missing, when you then try to apply via the helm_release resource, Terraform throws the error below. I was able to work around it by adding the annotations manually; some of them can be added via the helm chart values, but not all of them.

module.oke.module.extensions[0].helm_release.local_cluster_autoscaler[0]: Creating...
╷
│ Error: rendered manifests contain a resource that already exists. Unable to continue with install: Deployment "cluster-autoscaler-oci-oke-cluster-autoscaler" in namespace "kube-system" exists and cannot be imported into the current release: invalid ownership metadata; annotation validation error: missing key "meta.helm.sh/release-name": must be set to "cluster-autoscaler"; annotation validation error: missing key "meta.helm.sh/release-namespace": must be set to "kube-system"
│
│   with module.oke.module.extensions[0].helm_release.local_cluster_autoscaler[0],
│   on .terraform/modules/oke/modules/extensions/autoscaler.tf line 161, in resource "helm_release" "local_cluster_autoscaler":
│  161: resource "helm_release" "local_cluster_autoscaler" {

TL;DR: if you are switching from a remote to a local install of the cluster autoscaler Helm chart, you are going to have some manual work to do.

Right now I only need this feature to work with the cluster autoscaler, but I could see adding it to the other extensions.

Hopefully this works for y'all; let me know if I need to make any changes.

@cwiggs requested a review from hyder February 8, 2024 23:12

@cwiggs (Contributor, Author) commented Mar 8, 2024

Can I get a maintainer to take a look at this PR? I've been using it in production for a few weeks.

@mailbox171 commented Mar 9, 2024 via email

Sorry. Problem solved. I found out I had the Google auth app set up, and it worked. Thanks. You can close the ticket. -- F.Costa

(In reply to the notification email for Chris Wiggins' comment of Fri, Mar 8, 2024, 23:17: "Can I get a maintainer to take a look at this PR? I've been using it in production for a few weeks.")

@cwiggs (Contributor, Author) commented Mar 19, 2024

wrong PR?

@cwiggs requested a review from robo-cap March 19, 2024 23:13