Windows app deployment failing with Flannel #6171

Closed
mdrahman-suse opened this issue Jun 10, 2024 · 2 comments

@mdrahman-suse

Environmental Info:
RKE2 Version:

Fails on the latest commits, likely on all versions
rke2 version v1.29.5+dev.05931752 (0593175)
go version go1.21.9 X:boringcrypto

Node(s) CPU architecture, OS, and Version:

# linux server and agent
cat /etc/os-release
NAME="Oracle Linux Server"
VERSION="9.3"

# Windows agent
PS C:\Users\Administrator> Get-ComputerInfo

WindowsEditionId                                        : ServerDatacenter
WindowsInstallationType                                 : Server
WindowsProductName                                      : Windows Server 2019 Datacenter
WindowsRegisteredOrganization                           : Amazon.com
WindowsRegisteredOwner                                  : EC2
WindowsVersion                                          : 1809

Cluster Configuration:

1 Linux server, 1 Linux agent, 1 Windows Server 2019 agent

Describe the bug:

When an RKE2 cluster is created with a Windows node and cni: flannel, deploying a Windows app fails: the pod does not start and reports an error.

Steps To Reproduce:

  • Install RKE2 (commit 0593175) on the server and agent nodes with cni: flannel
  • Ensure the cluster is up and running
  • Deploy a Windows app (windows-app)
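
The reproduction steps can be sketched roughly as follows. This is a minimal sketch, not the exact setup used in this report: the `cni: flannel` setting comes from the issue and the deployment name is inferred from the pod names in the logs, while the helper function names, the container image, the labels, and the replica count are assumptions for illustration.

```shell
#!/bin/sh
# Sketch only: helper names, image, and labels are assumptions, not taken from the issue.

# Write an RKE2 config that selects Flannel as the CNI.
# $1: config directory (RKE2 reads /etc/rancher/rke2/config.yaml by default).
write_rke2_config() {
  conf_dir="${1:-/etc/rancher/rke2}"
  mkdir -p "$conf_dir"
  printf 'cni: flannel\n' > "$conf_dir/config.yaml"
}

# Write a Deployment manifest that is scheduled onto Windows nodes only.
# $1: output file for the manifest.
write_windows_app_manifest() {
  cat > "$1" <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: windows-app-deployment
spec:
  replicas: 2
  selector:
    matchLabels:
      app: windows-app
  template:
    metadata:
      labels:
        app: windows-app
    spec:
      nodeSelector:
        kubernetes.io/os: windows
      containers:
      - name: windows-app
        # Hypothetical image; choose one that matches the node's Windows build (1809 here).
        image: mcr.microsoft.com/windows/servercore:ltsc2019
        command: ["powershell", "-Command", "Start-Sleep -Seconds 86400"]
EOF
}
```

On the server node, `write_rke2_config` would run before the `rke2-server` service starts; once the cluster is up, `write_windows_app_manifest windows-app.yaml && kubectl apply -f windows-app.yaml` deploys the workload.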

Expected behavior:

  • The pod for windows-app is created successfully

Actual behavior:

  • The Windows pod fails to start with an error, and its status remains ContainerCreating

Additional context / logs:

  • Pod log
NAMESPACE     NAME                                                                  READY   STATUS              RESTARTS   AGE
default       client-deployment-5846fc994f-jwwqh                                    1/1     Running             0          22m
default       client-deployment-5846fc994f-n64m8                                    1/1     Running             0          22m
default       windows-app-deployment-6964ff4fb8-677rv                               0/1     ContainerCreating   0          21m
default       windows-app-deployment-6964ff4fb8-kb8gd                               0/1     ContainerCreating   0          21m

$ kubectl describe -n default pod/windows-app-deployment-6964ff4fb8-677rv
...
Warning  FailedCreatePodSandBox  19m                   kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "aebfc2f57944a4bd65be153e750058003654f4b8e4e4534a1b389898d18a8408": plugin type="flannel" failed (add): loadFlannelSubnetEnv failed: open /run/flannel/subnet.env: The system cannot find the path specified.
  • Windows log
...
6/10/2024 2:55:46 PM Running RKE2 kube-proxy [--bind-address=xxx.xx.x.89 --enable-dsr=true --feature-gates=WinDSR=true --network-name=flannel.4096 --source-vip=10.42.2.2 --cluster-cidr=10.42.0.0/16 --healthz-bind-address=127.0.0.1 --hostname-override=ip-ac1f0259 --kubeconfig=C:\var\lib\rancher\rke2\agent\kubeproxy.kubeconfig --proxy-mode=kernelspace]
6/10/2024 2:56:01 PM Flanneld has an error: exit status 1. Check c:\var\lib\rancher\rke2\agent\logs\flanneld.log for extra information
6/10/2024 2:56:01 PM Flanneld exited
  • flanneld logs
...
E0610 14:56:01.286034    4232 main.go:343] Error registering network: failed to create VXLAN network: Interface bound to flannel.4096 took too long to get ready. Please check your network host configuration: timeout, failed to get net interface for HostComputeNetwork flannel.4096 (172.31.2.89): context deadline exceeded
I0610 14:56:01.286034    4232 main.go:432] Stopping shutdownHandler...
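
To triage output like the above, the pod listing can be filtered for stuck pods and the flanneld log for error-level lines. A minimal sketch, assuming the column layout of `kubectl get pods -A` shown in the pod log (NAMESPACE, NAME, READY, STATUS, ...) and the klog line format seen in the flanneld log (error lines start with `E` followed by the month and day); the function names `stuck_pods` and `flannel_errors` are hypothetical.

```shell
# Print namespace/name for every pod whose STATUS column is ContainerCreating.
# Reads `kubectl get pods -A` style output on stdin and skips the header row.
stuck_pods() {
  awk 'NR > 1 && $4 == "ContainerCreating" { print $1 "/" $2 }'
}

# Print only error-level klog lines (prefix "Emmdd ") read from stdin.
flannel_errors() {
  grep -E '^E[0-9]{4} '
}
```

For example, `kubectl get pods -A | stuck_pods` lists the pods to pass to `kubectl describe`, and on a copy of the log file, `flannel_errors < flanneld.log` keeps only the lines such as the VXLAN registration failure.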
mdrahman-suse added the kind/bug label on Jun 10, 2024
@manuelbuil

flannel-io/flannel#1996

@mdrahman-suse

Validated on master branch with commit 3aaa16c

Testing

Validation

$ rke2 -v
rke2 version v1.30.1+dev.3aaa16c9 (3aaa16c9b17da45e9f3475ba5011ed90a49a2e42)
go version go1.22.2 X:boringcrypto
  • Cluster is up and deployment is successful
$ kgn
NAME                                          STATUS   ROLES                       AGE   VERSION
ip-172-31-10-243.us-east-2.compute.internal   Ready    control-plane,etcd,master   16m   v1.30.1+rke2r1
ip-172-31-13-193.us-east-2.compute.internal   Ready    <none>                      14m   v1.30.1+rke2r1
ip-ac1f0f2b                                   Ready    <none>                      9m    v1.30.1

$ kgp
NAMESPACE     NAME                                                                   READY   STATUS      RESTARTS   AGE
default       windows-app-deployment-6dcc4cb997-7xx5f                                1/1     Running     0          30s
default       windows-app-deployment-6dcc4cb997-t2wft                                1/1     Running     0          30s
...
kube-system   kube-flannel-ds-g2f46                                                  1/1     Running     0          64m
kube-system   kube-flannel-ds-nbpt6                                                  1/1     Running     0          66m
