Network inaccessible when pod starting, when using Azure CNI + Calico #2750
Comments
Hi alpeb, AKS bot here 👋 I might be just a bot, but I'm told my suggestions are normally quite good, as such:
|
Triage required from @Azure/aks-pm |
@aanandr, @phealy would you be able to assist?

Issue Details

What happened:
Linkerd's control-plane pods have a sidecar proxy that starts before the main container. The proxy manifest has a postStart hook that blocks the creation of the main container until the proxy is ready (a sketch of that pattern appears at the end of this report). Only with the combination Azure CNI + Calico, the proxy container appears to have no network, which prevents it from ever becoming ready, so pod startup never completes (the pod remains in status ContainerCreating).

What you expected to happen:
The network should be available as soon as the pod starts.

How to reproduce it:
apiVersion: v1
kind: Pod
metadata:
  name: curl
spec:
  containers:
  - image: curlimages/curl
    name: curl
    command: [ "sh", "-c", "--" ]
    args: [ "while true; do curl -k https://10.0.0.1; done;" ]
    lifecycle:
      postStart:
        exec:
command: [ "sh", "-c", "--", "while true; do sleep 30; done;" ] The pod remains in the ContainerCreating status as expected, but the curl command times out. This status forbids us from checking the pod's logs through
If we remove the lifecycle snippet, then curl works as expected:
Anything else we need to know?:
If we try with Kubenet + Calico, or with Azure CNI without Calico, there is no issue.

Environment:
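For context, the postStart hook mentioned above is a container lifecycle hook that blocks until the sidecar proxy answers its readiness endpoint. A minimal sketch of that pattern (the port, path, and command are assumptions for illustration, not the exact manifest Linkerd injects):

# Sketch only: block further container startup until the proxy reports ready.
lifecycle:
  postStart:
    exec:
      command: [ "sh", "-c", "until wget -qO- http://localhost:4191/ready; do sleep 1; done" ]

To reproduce the triggering cluster configuration itself, the Azure CNI + Calico combination can presumably be created along these lines (resource group and cluster names are placeholders):

az aks create --resource-group <rg> --name <cluster> --network-plugin azure --network-policy calico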
|
We have the same problem in our new AKS setup and are very interested in a solution to this ASAP. We have disabled CNI in the meantime and worked around it. |
Are your node pools running Windows or Linux? Docker or containerd? |
@rnemeth90 we are running AKS with containerd and Ubuntu |
Action required from @Azure/aks-pm |
Issue needing attention of @Azure/aks-leads |
I also have a similar issue, but in a production cluster. It would be great to get a fix, or at least a good workaround that we can also apply in a production cluster. I can't recreate the cluster to switch from CNI to Kubenet. |
Issue needing attention of @Azure/aks-leads |
Same issue |
Issue needing attention of @Azure/aks-leads |
Is there still no fix for this? |
This issue has been automatically marked as stale because it has not had any activity for 60 days. It will be closed if no further activity occurs within 15 days of this comment. |
Issue needing attention of @Azure/aks-leads |
My team and I just spent 4 weeks figuring out that we are hitting this. Disappointed. The lack of confirmation is also not giving a great feeling about this. |
This issue has been automatically marked as stale because it has not had any activity for 60 days. It will be closed if no further activity occurs within 15 days of this comment. |
This issue will now be closed because it hasn't had any activity for 7 days after being marked stale. alpeb, feel free to comment again in the next 7 days to reopen, or open a new issue after that time if you still have a question/issue or suggestion. |