Installing cert-manager with CRDs using helm hangs pulumi #1222
Comments
#1130 seems related - sometimes it would hang on planning, sometimes on applying, depending on whether I had run pulumi refresh and some related operations. |
In #1219 there's a hint that downgrading cert-manager to 0.15.2 may help - I will try that, but something is clearly wrong in the pulumi-kubernetes process, and verbose logging doesn't help much in figuring out what. |
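For reference, a minimal sketch of what pinning the chart to 0.15.2 could look like, using the same Helm support shown later in this thread - assuming the chart comes from the https://charts.jetstack.io repo and that the CRDs are installed by the chart; the exact version string may need a "v" prefix depending on how the chart is published:

import * as k8s from "@pulumi/kubernetes";

// Sketch only: pin cert-manager to the 0.15.2 chart instead of floating on the latest release.
const certManagerPinned = new k8s.helm.v3.Chart("certmanager", {
    chart: "cert-manager",
    namespace: "cert-manager",
    version: "0.15.2",
    fetchOpts: {
        repo: "https://charts.jetstack.io",
    },
    values: {
        installCRDs: true,
    },
});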
hmm ok I think I found the issue upstream here: https://cert-manager.io/docs/installation/upgrading/upgrading-0.15-0.16/ Kubernetes bug: kubernetes/kubernetes#91615 |
It's not only with Helm; I got a hang by doing this:

import * as k8s from "@pulumi/kubernetes";

export const certManager = new k8s.yaml.ConfigFile("cert-manager", {
    file:
        "https://github.com/jetstack/cert-manager/releases/download/v0.16.0/cert-manager.yaml",
}); |
I'm encountering the same issue. Here is my code
|
After further investigation, this issue appears to be triggered by the Once the fix has merged, I'll update the Pulumi k8s provider's dependency. For now, I'd suggest sticking with a previous version of
Edit: In the interest of fixing this more quickly, I forked the upstream repo and applied the fix in the fork. I'll cut a release with the fix on Monday. |
cert-manager v1.0.0 is still having various issues with pulumi. |
Same here with |
I tested this again this morning with the latest k8s provider release (v2.6.1) and did not encounter the reported hangs. I expected this to be fixed by the changes in #1223, so can you verify that you're using a recent version of the provider? Here's the code that deployed successfully for me:

import * as k8s from "@pulumi/kubernetes";

export const certManager = new k8s.yaml.ConfigFile("cert-manager", {
    file: "https://github.com/jetstack/cert-manager/releases/download/v1.0.1/cert-manager.yaml",
});

It also worked with a Helm deployment:

import * as k8s from "@pulumi/kubernetes";
const certManagerNamespace = new k8s.core.v1.Namespace("certmanager", { metadata: { name: "cert-manager" } });
const certmanager = new k8s.helm.v3.Chart("certmanager", {
chart: "cert-manager",
namespace: "cert-manager",
version: "1.0.1",
fetchOpts: {
repo: "https://charts.jetstack.io"
},
values: {
installCRDs: true,
}
}); |
@lblackstone Tried multiple things, using the latest versions of all tools (kubectl and pulumi). My cluster is in Azure. Changing the cluster size had some effect on installing cert-manager with Pulumi, but it still ended up hanging forever (I waited ~30 minutes, with just one other service deployment in addition to cert-manager). I didn't want to give it more tries, since stopping the hung Pulumi process corrupted the state, and I needed to remove all cluster resources and then recreate them :( |
We actually just abandoned cert-manager and went for Azure Front Door for certificate generation/ingress. Cert-manager seems to be more trouble than it's worth, and tearing down clusters always causes pulumi stack issues when cert-manager resources are present. Here's an example component resource I'm using for Front Door:

export interface frontDoorOpts extends pulumi.CustomResourceOptions {
name: string,
port: number,
https?: boolean,
additionalRoutingRules?:pulumi.Input<azure.types.input.frontdoor.FrontdoorRoutingRule>[]
}
export class FrontDoor extends pulumi.ComponentResource {
public readonly publicIp:azure.network.PublicIp;
public readonly frontDoor:azure.frontdoor.Frontdoor;
public readonly frontDoorHttps:azure.frontdoor.CustomHttpsConfiguration;
public readonly frontDoorCustomHttps:azure.frontdoor.CustomHttpsConfiguration;
public readonly dns:azure.dns.CNameRecord;
constructor(name:string, opts:frontDoorOpts) {
super("pkg:index:fd", name, {}, opts);
this.publicIp = new azure.network.PublicIp(`${name}-public-ip`, {
name: `${name}-ip`,
resourceGroupName: resourceGroup.name,
location: config.get('location'),
allocationMethod: 'Static',
sku: 'Standard',
tags: globalTags
}, { parent: this })
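// CNAME record pointing the custom hostname at the Front Door default endpoint.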
this.dns = new azure.dns.CNameRecord(`${name}-dns`, {
name: opts.name,
zoneName: zone.name,
resourceGroupName: zone.resourceGroupName,
ttl: 300,
record: `${name}-ingress.azurefd.net`
}, {parent: this});
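// Front Door profile: one backend pool targeting the public IP, default and custom-domain frontend endpoints, an HTTPS forwarding rule, and an HTTP-to-HTTPS redirect rule.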
this.frontDoor = new azure.frontdoor.Frontdoor(`${name}-frontdoor`, {
name: `${name}-ingress`,
resourceGroupName: resourceGroup.name,
backendPools: [{
name,
loadBalancingName: name,
healthProbeName: name,
backends: [{
address: this.publicIp.ipAddress,
httpPort: opts.port,
httpsPort: opts.port,
hostHeader: `${opts.name}.${subdomain}`,
}]
}],
frontendEndpoints: [{
name,
hostName: `${name}-ingress.azurefd.net`
}, {
name: `${name}custom`,
hostName: `${opts.name}.${subdomain}`
}],
backendPoolHealthProbes: [{
name,
protocol: opts.https ? 'Https' : 'Http'
}],
backendPoolLoadBalancings: [{
name,
}],
enforceBackendPoolsCertificateNameCheck: false,
routingRules: [{
name,
acceptedProtocols: ['Https'],
frontendEndpoints: [name, `${name}custom`],
patternsToMatches: ['/*'],
forwardingConfiguration: {
forwardingProtocol: opts.https ? 'HttpsOnly' : 'HttpOnly',
backendPoolName: name
}
}, {
name: 'redirect',
acceptedProtocols: ['Http'],
frontendEndpoints: [name, `${name}custom`],
patternsToMatches: ['/*'],
redirectConfiguration: {
redirectProtocol: 'HttpsOnly',
redirectType: 'Moved'
}
},
...opts.additionalRoutingRules ? opts.additionalRoutingRules : []
],
}, {parent: this, dependsOn: [this.dns, kubernetesCluster]});
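// Front Door-managed HTTPS for the default frontend endpoint.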
this.frontDoorHttps = new azure.frontdoor.CustomHttpsConfiguration(`${name}-https`, {
frontendEndpointId: this.frontDoor.frontendEndpoints.apply(frontendEndpoints => frontendEndpoints[0].id || '/subscriptions/random-id/resourceGroups/fake-rg/providers/Microsoft.FrontDoor/frontendEndpoints/shi'),
customHttpsProvisioningEnabled: true,
resourceGroupName: resourceGroup.name,
customHttpsConfiguration: {
certificateSource: 'FrontDoor'
}
}, {parent: this, dependsOn: this.frontDoor})
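// Front Door-managed HTTPS for the custom-domain frontend endpoint.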
this.frontDoorCustomHttps = new azure.frontdoor.CustomHttpsConfiguration(`${name}-custom-https`, {
frontendEndpointId: this.frontDoor.frontendEndpoints.apply(frontendEndpoints => frontendEndpoints[1].id || '/subscriptions/random-id/resourceGroups/fake-rg/providers/Microsoft.FrontDoor/frontendEndpoints/shi'),
customHttpsProvisioningEnabled: true,
resourceGroupName: resourceGroup.name,
customHttpsConfiguration: {
certificateSource: 'FrontDoor'
}
}, {parent: this, dependsOn: this.frontDoor});
this.registerOutputs({
dns: this.dns,
frontDoor: this.frontDoor,
frontDoorHttps: this.frontDoorHttps,
frontDoorCustomHttps: this.frontDoorCustomHttps,
publicIp: this.publicIp
})
}
}

Then I just pass the public IP created in this Front Door resource into my services:

const frontDoor = new FrontDoor(`${name}-frontdoor`, {
name: ingressName,
port
});
const service = new k8s.core.v1.Service(`${name}-app-service`, {
metadata: {
namespace,
name,
labels: {
app: selector
},
annotations: serviceAnnotations
},
spec: {
externalTrafficPolicy: ingressName ? 'Cluster' : 'Local',
loadBalancerIP: frontDoor.publicIp.ipAddress,
ports: [{
port,
protocol: 'TCP'
}],
selector: {
app: selector
},
sessionAffinity: 'None',
type: 'LoadBalancer'
}
}, { provider: kubernetesProvider, deleteBeforeReplace: true, dependsOn: serviceDependencies, customTimeouts: {
create: '1h'
} }); |
Problem description
I used this setup for a while and it was just fine.
At some point, problems started where pulumi up would just hang forever working on cert-manager (maybe it was "caused" by a new Helm release of cert-manager, since I didn't specify a version).
I tried deleting CRDs and the namespace, then doing pulumi refresh, removing the above code from index.ts - and then my deploy worked just fine.
Whenever I tried to apply that cert-manager code cleanly on the same cluster, it would hang again.
I noticed that while it hangs, the pulumi-kubernetes process just keeps using 100% CPU forever...
Running strace -ff -p $PID on that process showed only a spam of timer-related syscalls; I saw no network or I/O activity...
I have a feeling that it's related to the CRDs and that installing them separately would solve the problem, but I haven't checked yet (a sketch of that approach follows below).
It might be related to finalizers as well, as patching or removing CRDs may cause finalizers to hang forever.
I tried deleting the CRDs once and it would hang unless I removed the finalizers first.
On a second try, on a clean setup, it would delete them just fine.
I am not convinced it's even trying to patch these CRDs, but I am 100% sure it hangs while "working" on one of them.
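For what it's worth, here is a minimal, untested sketch of installing the CRDs separately from the chart - assuming the release publishes a standalone cert-manager.crds.yaml manifest alongside cert-manager.yaml and that the chart's bundled CRDs can be disabled via installCRDs:

import * as k8s from "@pulumi/kubernetes";

// Sketch only: apply the CRDs from the standalone manifest first
// (the URL pattern follows the release manifests used elsewhere in this thread).
const certManagerCrds = new k8s.yaml.ConfigFile("cert-manager-crds", {
    file: "https://github.com/jetstack/cert-manager/releases/download/v1.0.1/cert-manager.crds.yaml",
});

// ...then deploy the chart without its bundled CRDs, waiting on the CRDs above.
// (Namespace creation is omitted here; see the Helm example earlier in the thread.)
const certManager = new k8s.helm.v3.Chart("certmanager", {
    chart: "cert-manager",
    namespace: "cert-manager",
    version: "1.0.1",
    fetchOpts: { repo: "https://charts.jetstack.io" },
    values: { installCRDs: false },
}, { dependsOn: certManagerCrds });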
Also, I noticed that during the deploy, DigitalOcean's Kubernetes API server got into a super aggressive throttling mode, where it would start dropping connections even before the handshake.
I am not sure yet what the deal with the throttling is; I'm waiting on a ticket response.
It may be that pulumi was spamming the server all the time because of a bug, or it may be that their limits are just incorrectly set.
If you think throttling can cause pulumi-kubernetes to go into a 100% CPU loop, maybe that's the bug here?
I tried running with maximum verbosity but found nothing interesting, except maybe serialization debug output containing the CRDs.
Errors & Logs
Affected product version(s)
Latest pulumi and kubernetes plugin.
DigitalOcean Kubernetes.
Latest Helm 3
Reproducing the issue
Suggestions for a fix