CSI: handlePluginEvent nil pointer dereference #15478

Closed
the-nando opened this issue Dec 6, 2022 · 2 comments · Fixed by #15518

@the-nando
Contributor

the-nando commented Dec 6, 2022

Nomad version

Nomad v1.4.1+ent

Operating system and Environment details

Issue

I'm running the AWS EFS CSI driver on the Nomad clients, using the Docker image amazon/aws-efs-csi-driver:v1.3.8 deployed as a system job.
During some testing the Docker container started crashing, which in turn caused both the Nomad server and all the Nomad clients to panic:

Nomad Server (leader at the time - Issue fixed in #15095)

2022-12-05T15:39:08.698Z [ERROR] nomad.volumes_watcher: error releasing volume claims: namespace=ns volume_id=datastore-uploads
error=
| 9 errors occurred:
| \t* could not detach from node: node detach volume: EOF
| \t* could not detach from node: node detach volume: EOF
| \t* could not detach from node: node detach volume: EOF
| \t* could not detach from node: rpc error: node detach volume: CSI.NodeDetachVolume: plugin aws-efs for type csi-node not found
| \t* could not detach from node: rpc error: node detach volume: EOF
| \t* could not detach from node: No path to node
| \t* could not detach from node: No path to node
| \t* could not detach from node: No path to node
| \t* could not detach from node: No path to node
|

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x110 pc=0x1d647f3]

goroutine 132161661 [running]:
github.com/hashicorp/nomad/nomad/volumewatcher.(*volumeWatcher).volumeReapImpl(0x0?, 0x0)
github.com/hashicorp/nomad/nomad/volumewatcher/volume_watcher.go:208 +0x33
github.com/hashicorp/nomad/nomad/volumewatcher.(*volumeWatcher).volumeReap(0xc00488f2c0, 0xc002088ea0?)
github.com/hashicorp/nomad/nomad/volumewatcher/volume_watcher.go:193 +0x58
github.com/hashicorp/nomad/nomad/volumewatcher.(*volumeWatcher).watch(0xc00488f2c0)
github.com/hashicorp/nomad/nomad/volumewatcher/volume_watcher.go:121 +0x5f
created by github.com/hashicorp/nomad/nomad/volumewatcher.(*volumeWatcher).Start
github.com/hashicorp/nomad/nomad/volumewatcher/volume_watcher.go:89 +0x125

Nomad Client (all of them)

2022-12-05T15:39:03.399Z [INFO]  client.driver_mgr.docker: stopped container: container_id=f18a02d8dd6801407bf5f70aa578a9391d1ab9f7c0bb9e249f57efe418cfc465 driver=docker
2022-12-05T15:39:08.480Z [INFO]  client.driver_mgr.docker: stopped container: container_id=f9eabc3ddaa317e45e610733b131d40a1fc6f06ccf3edb31cd5106694451799b driver=docker
2022-12-05T15:39:08.495Z [INFO]  agent: (runner) stopping
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x1230631]

goroutine 156 [running]:
github.com/hashicorp/nomad/client/pluginmanager/csimanager.(*csiManager).handlePluginEvent(0xc000b86e00, 0xc0015dde18)
	github.com/hashicorp/nomad/client/pluginmanager/csimanager/manager.go:157 +0xf1
github.com/hashicorp/nomad/client/pluginmanager/csimanager.(*csiManager).runLoop(0xc000b86e00)
	github.com/hashicorp/nomad/client/pluginmanager/csimanager/manager.go:112 +0x1f7
created by github.com/hashicorp/nomad/client/pluginmanager/csimanager.(*csiManager).Run
	github.com/hashicorp/nomad/client/pluginmanager/csimanager/manager.go:96 +0x56

Reproduction steps

The trigger for the issue seems to be the aws-efs plugin suddenly becoming unavailable on all clients.

Expected Result

No panic.
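
For illustration only, here is a minimal sketch of the kind of defensive check that would turn a crash like the client one above into a returned error. It is not taken from the Nomad source: the types `pluginInfo`, `pluginEvent` and `manager`, and the method `handleEvent`, are hypothetical stand-ins, and the assumption that the event (or its plugin info) can be nil when a plugin disappears abruptly is mine, not something confirmed by the trace.

```go
package main

import (
	"errors"
	"fmt"
)

// pluginInfo and pluginEvent are hypothetical stand-ins, not Nomad types.
type pluginInfo struct {
	Name string
}

type pluginEvent struct {
	Type string      // "registered" or "deregistered" (assumed values)
	Info *pluginInfo // assumed to be nil when the plugin vanished abruptly
}

// manager tracks the plugin instances it currently knows about.
type manager struct {
	instances map[string]*pluginInfo
}

var errNilEvent = errors.New("nil plugin event or missing plugin info")

// handleEvent checks every pointer before dereferencing it, so a malformed
// event yields an error instead of a nil pointer dereference.
func (m *manager) handleEvent(ev *pluginEvent) error {
	if ev == nil || ev.Info == nil {
		return errNilEvent
	}
	switch ev.Type {
	case "registered":
		m.instances[ev.Info.Name] = ev.Info
	case "deregistered":
		delete(m.instances, ev.Info.Name)
	default:
		return fmt.Errorf("unknown event type %q", ev.Type)
	}
	return nil
}

func main() {
	m := &manager{instances: map[string]*pluginInfo{}}

	// A well-formed registration is stored.
	fmt.Println(m.handleEvent(&pluginEvent{Type: "registered", Info: &pluginInfo{Name: "aws-efs"}}))

	// An event without plugin info, or a nil event, is rejected without panicking.
	fmt.Println(m.handleEvent(&pluginEvent{Type: "deregistered"}))
	fmt.Println(m.handleEvent(nil))
}
```

With a guard like this the run loop could log the error and continue, which matches the expected result of the agent not panicking.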

@the-nando the-nando changed the title CSI: volumewatcher nil pointer dereference CSI: handlePluginEvent nil pointer dereference Dec 6, 2022
@tgross
Member

tgross commented Dec 6, 2022

Thanks for opening this issue @nandolone!

@github-actions

I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Apr 12, 2023