csi: fix handling of garbage collected node in node unpublish
When a node is garbage collected, we assume that the volume is no
longer attached to it and ignore the `ErrUnknownNode` error. But we
used `errors.Is` to check for a wrapped error, and RPC flattens the
errors during serialization. This results in an error check that works
in automated testing but not in real clusters. Use a string contains
check instead.
tgross committed Mar 22, 2022
1 parent 879e137 commit 8084e33
Showing 2 changed files with 7 additions and 2 deletions.
3 changes: 3 additions & 0 deletions .changelog/12350.txt
@@ -0,0 +1,3 @@
```release-note:bug
csi: Fixed a bug where garbage collected nodes would block releasing a volume
```
6 changes: 4 additions & 2 deletions nomad/csi_endpoint.go
@@ -1,9 +1,9 @@
 package nomad
 
 import (
-	"errors"
 	"fmt"
 	"net/http"
+	"strings"
 	"time"
 
 	metrics "github.com/armon/go-metrics"
@@ -741,7 +741,9 @@ func (v *CSIVolume) nodeUnpublishVolumeImpl(vol *structs.CSIVolume, claim *struc
 		// we should only get this error if the Nomad node disconnects and
 		// is garbage-collected, so at this point we don't have any reason
 		// to operate as though the volume is attached to it.
-		if !errors.Is(err, structs.ErrUnknownNode) {
+		// note: errors.Is cannot be used because the RPC call breaks
+		// error wrapping.
+		if !strings.Contains(err.Error(), structs.ErrUnknownNode.Error()) {
 			return fmt.Errorf("could not detach from node: %w", err)
 		}
 	}
