
fix: NodeGetVolumeStats crash #576

Merged

Conversation

andyzhangx
Member

What type of PR is this?
/kind bug

What this PR does / why we need it:
fix: NodeGetVolumeStats crash

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

fix: NodeGetVolumeStats crash

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/bug Categorizes issue or PR as related to a bug. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Dec 27, 2023
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: andyzhangx

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Dec 27, 2023
@k8s-ci-robot k8s-ci-robot added the size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. label Dec 27, 2023
@andyzhangx andyzhangx force-pushed the fix-NodeGetVolumeStats-crash branch from b9383e3 to a64df6f on December 27, 2023 13:56
@andyzhangx andyzhangx merged commit 562c128 into kubernetes-csi:master Dec 27, 2023
18 of 19 checks passed
@GImmekerAFP

I still get an error:

nfs Driver Name: nfs.csi.k8s.io
nfs Driver Version: v4.6.0
nfs Git Commit: ""
nfs Go Version: go1.21.5
node-driver-registrar I1228 09:41:49.049245 1 node_register.go:88] Skipping HTTP server because endpoint is set to: ""
nfs Platform: linux/amd64
nfs
nfs Streaming logs below:
node-driver-registrar I1228 09:41:49.921746 1 main.go:90] Received GetInfo call: &InfoRequest{}
nfs I1228 09:51:02.352588 1 mount_linux.go:274] Cannot create temp dir to detect safe 'not mounted' behavior: mkdir /tmp/kubelet-detect-safe-umount2491460017: read-only file system
node-driver-registrar I1228 09:41:49.957760 1 main.go:101] Received NotifyRegistrationStatus call: &RegistrationStatus{PluginRegistered:true,Error:,}
nfs I1228 09:51:02.359290 1 server.go:117] Listening for connections on address: &net.UnixAddr{Name:"//csi/csi.sock", Net:"unix"}
nfs I1228 09:52:32.489562 1 utils.go:109] GRPC call: /csi.v1.Node/NodeUnpublishVolume
nfs I1228 09:52:32.489966 1 utils.go:109] GRPC call: /csi.v1.Node/NodeUnpublishVolume
nfs I1228 09:52:32.489975 1 utils.go:109] GRPC call: /csi.v1.Node/NodeUnpublishVolume
nfs I1228 09:52:32.490053 1 utils.go:109] GRPC call: /csi.v1.Node/NodeUnpublishVolume
nfs I1228 09:52:32.489556 1 utils.go:109] GRPC call: /csi.v1.Node/NodeUnpublishVolume
nfs I1228 09:52:32.489606 1 utils.go:109] GRPC call: /csi.v1.Node/NodeUnpublishVolume
nfs I1228 09:52:32.490072 1 utils.go:110] GRPC request: {"target_path":"/var/lib/kubelet/pods/96694ff6-6a64-4776-8881-eef719b26666/volumes/kubernetes.io~csi/nfs-pv/mount","volume_id":"nfs-server-nfs-server-provisioner.nfs-server.svc.cluster.l
nfs I1228 09:52:32.495933 1 nodeserver.go:172] NodeUnpublishVolume: unmounting volume nfs-server-nfs-server-provisioner.nfs-server.svc.cluster.local/export/pvc-084cfb0d-cb0a-4089-a6ce-5b29049f4347 on /var/lib/kubelet/pods/96694ff6-6a64-4776-8
nfs I1228 09:52:32.495954 1 nodeserver.go:177] force unmount nfs-server-nfs-server-provisioner.nfs-server.svc.cluster.local/export/pvc-084cfb0d-cb0a-4089-a6ce-5b29049f4347 on /var/lib/kubelet/pods/96694ff6-6a64-4776-8881-eef719b26666/volumes/
nfs I1228 09:52:32.490116 1 utils.go:110] GRPC request: {"target_path":"/var/lib/kubelet/pods/a074f476-ff01-45a5-ae06-baa5aa88eb55/volumes/kubernetes.io~csi/nfs-pv/mount","volume_id":"nfs-server-nfs-server-provisioner.nfs-server.svc.cluster.l
nfs I1228 09:52:32.496126 1 nodeserver.go:172] NodeUnpublishVolume: unmounting volume nfs-server-nfs-server-provisioner.nfs-server.svc.cluster.local/export/pvc-084cfb0d-cb0a-4089-a6ce-5b29049f4347 on /var/lib/kubelet/pods/a074f476-ff01-45a5-a
nfs I1228 09:52:32.496167 1 nodeserver.go:177] force unmount nfs-server-nfs-server-provisioner.nfs-server.svc.cluster.local/export/pvc-084cfb0d-cb0a-4089-a6ce-5b29049f4347 on /var/lib/kubelet/pods/a074f476-ff01-45a5-ae06-baa5aa88eb55/volumes/
nfs I1228 09:52:32.489994 1 utils.go:110] GRPC request: {"target_path":"/var/lib/kubelet/pods/8a7cf686-3350-4800-a51f-f3700c398097/volumes/kubernetes.io~csi/nfs-pv/mount","volume_id":"nfs-server-nfs-server-provisioner.nfs-server.svc.cluster.l
nfs I1228 09:52:32.496515 1 nodeserver.go:172] NodeUnpublishVolume: unmounting volume nfs-server-nfs-server-provisioner.nfs-server.svc.cluster.local/export/pvc-084cfb0d-cb0a-4089-a6ce-5b29049f4347 on /var/lib/kubelet/pods/8a7cf686-3350-4800-a
nfs I1228 09:52:32.496574 1 nodeserver.go:177] force unmount nfs-server-nfs-server-provisioner.nfs-server.svc.cluster.local/export/pvc-084cfb0d-cb0a-4089-a6ce-5b29049f4347 on /var/lib/kubelet/pods/8a7cf686-3350-4800-a51f-f3700c398097/volumes/
nfs I1228 09:52:32.489798 1 utils.go:110] GRPC request: {"target_path":"/var/lib/kubelet/pods/ecb9844b-c5f3-47c0-a274-fd427e4a29d3/volumes/kubernetes.io~csi/nfs-pv/mount","volume_id":"nfs-server-nfs-server-provisioner.nfs-server.svc.cluster.l
nfs I1228 09:52:32.496666 1 nodeserver.go:172] NodeUnpublishVolume: unmounting volume nfs-server-nfs-server-provisioner.nfs-server.svc.cluster.local/export/pvc-084cfb0d-cb0a-4089-a6ce-5b29049f4347 on /var/lib/kubelet/pods/ecb9844b-c5f3-47c0-a
nfs I1228 09:52:32.496682 1 nodeserver.go:177] force unmount nfs-server-nfs-server-provisioner.nfs-server.svc.cluster.local/export/pvc-084cfb0d-cb0a-4089-a6ce-5b29049f4347 on /var/lib/kubelet/pods/ecb9844b-c5f3-47c0-a274-fd427e4a29d3/volumes/
nfs I1228 09:52:32.489986 1 utils.go:110] GRPC request: {"target_path":"/var/lib/kubelet/pods/f1acd50f-c497-4851-a9b2-7b5e31f67bd5/volumes/kubernetes.io~csi/nfs-pv/mount","volume_id":"nfs-server-nfs-server-provisioner.nfs-server.svc.cluster.l
nfs I1228 09:52:32.496750 1 nodeserver.go:172] NodeUnpublishVolume: unmounting volume nfs-server-nfs-server-provisioner.nfs-server.svc.cluster.local/export/pvc-084cfb0d-cb0a-4089-a6ce-5b29049f4347 on /var/lib/kubelet/pods/f1acd50f-c497-4851-a
nfs I1228 09:52:32.496775 1 nodeserver.go:177] force unmount nfs-server-nfs-server-provisioner.nfs-server.svc.cluster.local/export/pvc-084cfb0d-cb0a-4089-a6ce-5b29049f4347 on /var/lib/kubelet/pods/f1acd50f-c497-4851-a9b2-7b5e31f67bd5/volumes/
nfs I1228 09:52:32.493730 1 utils.go:110] GRPC request: {"target_path":"/var/lib/kubelet/pods/c6bf0deb-aedd-4e30-b175-90578ccfeca3/volumes/kubernetes.io~csi/nfs-pv/mount","volume_id":"nfs-server-nfs-server-provisioner.nfs-server.svc.cluster.l
nfs I1228 09:52:32.496933 1 nodeserver.go:172] NodeUnpublishVolume: unmounting volume nfs-server-nfs-server-provisioner.nfs-server.svc.cluster.local/export/pvc-084cfb0d-cb0a-4089-a6ce-5b29049f4347 on /var/lib/kubelet/pods/c6bf0deb-aedd-4e30-b
nfs I1228 09:52:32.497026 1 nodeserver.go:177] force unmount nfs-server-nfs-server-provisioner.nfs-server.svc.cluster.local/export/pvc-084cfb0d-cb0a-4089-a6ce-5b29049f4347 on /var/lib/kubelet/pods/c6bf0deb-aedd-4e30-b175-90578ccfeca3/volumes/
nfs panic: interface conversion: interface {} is *csi.NodeGetVolumeStatsResponse, not csi.NodeGetVolumeStatsResponse
nfs
nfs goroutine 104 [running]:
nfs github.com/kubernetes-csi/csi-driver-nfs/pkg/nfs.(*NodeServer).NodeGetVolumeStats(0xc0004de000, {0xc000457890?, 0x4105a5?}, 0xc00028b4a0)
nfs /workspace/pkg/nfs/nodeserver.go:219 +0xb74
nfs github.com/container-storage-interface/spec/lib/go/csi._Node_NodeGetVolumeStats_Handler.func1({0x18c6a18, 0xc000874450}, {0x15a16e0?, 0xc00028b4a0})
nfs /workspace/vendor/github.com/container-storage-interface/spec/lib/go/csi/csi.pb.go:7106 +0x72
nfs github.com/kubernetes-csi/csi-driver-nfs/pkg/nfs.logGRPC({0x18c6a18, 0xc000874450}, {0x15a16e0?, 0xc00028b4a0?}, 0xc0002234c0, 0xc000308618)
nfs /workspace/pkg/nfs/utils.go:112 +0x3a9
nfs github.com/container-storage-interface/spec/lib/go/csi._Node_NodeGetVolumeStats_Handler({0x1563e40?, 0xc0004de000}, {0x18c6a18, 0xc000874450}, 0xc000843200, 0x176c5b0)
nfs /workspace/vendor/github.com/container-storage-interface/spec/lib/go/csi/csi.pb.go:7108 +0x135
nfs google.golang.org/grpc.(*Server).processUnaryRPC(0xc000502000, {0x18c6a18, 0xc0008743c0}, {0x18cc120, 0xc000750000}, 0xc00038ea20, 0xc00049c2d0, 0x23db280, 0x0)
nfs /workspace/vendor/google.golang.org/grpc/server.go:1372 +0xe03
nfs google.golang.org/grpc.(*Server).handleStream(0xc000502000, {0x18cc120, 0xc000750000}, 0xc00038ea20)
nfs /workspace/vendor/google.golang.org/grpc/server.go:1783 +0xfec
nfs google.golang.org/grpc.(*Server).serveStreams.func2.1()
nfs /workspace/vendor/google.golang.org/grpc/server.go:1016 +0x59
nfs created by google.golang.org/grpc.(*Server).serveStreams.func2 in goroutine 130
nfs /workspace/vendor/google.golang.org/grpc/server.go:1027 +0x115
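
For context, the panic above is a failed Go type assertion: an interface{} value that holds a *csi.NodeGetVolumeStatsResponse (a pointer) is asserted to the value type csi.NodeGetVolumeStatsResponse. The following is a minimal sketch of the failure mode and the safe form, using a stand-in struct rather than the driver's actual code; the names here are illustrative, not taken from the PR diff.

```go
package main

import "fmt"

// Stand-in for csi.NodeGetVolumeStatsResponse from
// github.com/container-storage-interface/spec/lib/go/csi.
type NodeGetVolumeStatsResponse struct{ Available int64 }

func main() {
	// A cache that stores values as interface{} and holds a pointer:
	var cached interface{} = &NodeGetVolumeStatsResponse{Available: 42}

	// Asserting to the value type panics with the message seen above:
	// "interface conversion: interface {} is *NodeGetVolumeStatsResponse,
	//  not NodeGetVolumeStatsResponse"
	// resp := cached.(NodeGetVolumeStatsResponse) // panics

	// Asserting to the pointer type that was actually stored, with the
	// comma-ok form so a mismatch degrades gracefully instead of panicking:
	if resp, ok := cached.(*NodeGetVolumeStatsResponse); ok {
		fmt.Println(resp.Available)
	}
}
```

Asserting to the type that was actually stored, or treating a failed comma-ok assertion as a cache miss, avoids the crash.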

@andyzhangx
Member Author

@GImmekerAFP I think you are still using the old canary image. Can you set imagePullPolicy: Always and delete the driver daemonset pod to make sure it pulls the latest image?

imagePullPolicy: "IfNotPresent"
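
A sketch of that change on the node DaemonSet follows; the container and image names here are assumptions and may differ per deployment.

```yaml
# Illustrative excerpt of a csi-nfs node DaemonSet pod spec; adjust
# container and image names to match your deployment.
spec:
  template:
    spec:
      containers:
        - name: nfs
          image: registry.k8s.io/sig-storage/nfsplugin:canary
          imagePullPolicy: Always  # re-pull the image whenever the pod is recreated
```

With imagePullPolicy: Always set, deleting the daemonset pods forces kubelet to pull the image again when they are recreated.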

@GImmekerAFP

Hello,
First of all, happy new year!

Had some time to test this today, and the fix seems to work as expected.
Sorry that it took a little longer, but I couldn't get the imagePullPolicy correct due to the way we deploy things here.
I don't see the system load going up anymore, so the problem is fixed.

(Any idea when the fix will be released?)

With kind regards,
Gerben Immeker.

@andyzhangx
Member Author

@GImmekerAFP I could cut a new release in the middle of this month.

@GImmekerAFP

OK, thank you for your answer.

That would be great, I will keep an eye out for the release.

With kind regards,
Gerben Immeker.
