[BUG]: Couldn't fence volumes #199

Closed
eanjtab opened this issue Feb 21, 2022 · 3 comments
Labels
area/csi-powerflex: Issue pertains to the CSI Driver for Dell EMC PowerFlex
area/csm-resiliency: Issue pertains to the CSM Resiliency module
type/bug: Something isn't working. This is the default label associated with a bug issue.

eanjtab commented Feb 21, 2022

Bug Description

A "Couldn't fence volumes" error message was observed for a pod when a worker node (WN) was shut down.

Logs

Logs were shared by email.

Screenshots

No response

Additional Environment Information

No response

Steps to Reproduce

These steps are not guaranteed to reproduce the issue, but they are worth attempting.

  1. Spin up a PostgreSQL database pod.
  2. Shut down the worker node (WN). Do not reboot it.
  3. Observe the pods and check whether they report the "couldn't fence volumes" error.

Expected Behavior

The pods should be successfully evacuated (deleted and recreated).

CSM Driver(s)

  1. PowerFlex CSI driver image version: csi-vxflexos:v1.5.0-1-a838fe22
  2. Podmon image version: podmon:v0.1.0-2-a838fe22

Installation Type

No response

Container Storage Modules Enabled

  1. PowerFlex CSI driver image version: csi-vxflexos:v1.5.0-1-a838fe22
  2. Podmon image version: podmon:v0.1.0-2-a838fe22

Container Orchestrator

OS/Version: SUSE, 15-SP2
Kubernetes Version: v1.21.1

Operating System

OS/Version: SUSE, 15-SP2
Kubernetes Version: v1.21.1

eanjtab added the needs-triage and type/bug labels Feb 21, 2022
prablr79 added the area/csi-powerflex label Feb 22, 2022
alikdell self-assigned this Feb 22, 2022
alikdell removed the needs-triage label Feb 22, 2022
alikdell (Contributor) commented

Engineering is looking into this.

hoppea2 added the area/csm-resiliency label Feb 23, 2022
hoppea2 added this to the v1.2.0 milestone Feb 23, 2022

rbo54 commented Feb 23, 2022

The logs for this issue are in driver.logs.20220218_1049.tgz and have been triaged. We have developed a fix that we think will resolve the problem, although so far we have been unable to reproduce it in-house. The problem was caused by a node that was missing its driver annotation, which is used to determine the CSI Node ID for the node. The fix uses a previously saved copy of the annotation.
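
For readers following along, the fallback described above can be illustrated with a minimal sketch. This is not the actual podmon code. It assumes the node ID lives in the `csi.volume.kubernetes.io/nodeid` annotation as a JSON map keyed by driver name, and that the driver name is `csi-vxflexos.dellemc.com`; the issue names neither, and `NodeIDCache` is a hypothetical helper.

```python
import json
from typing import Dict, Optional

# Assumptions for illustration only: the issue does not name the annotation
# key or the driver name used by the real fix.
NODE_ID_ANNOTATION = "csi.volume.kubernetes.io/nodeid"
DRIVER_NAME = "csi-vxflexos.dellemc.com"


class NodeIDCache:
    """Hypothetical helper that remembers the last CSI node ID seen per node."""

    def __init__(self) -> None:
        self._saved: Dict[str, str] = {}

    def resolve(self, node_name: str, annotations: Dict[str, str]) -> Optional[str]:
        """Return the CSI node ID, falling back to the saved copy if needed."""
        raw = annotations.get(NODE_ID_ANNOTATION)
        if raw:
            try:
                node_id = json.loads(raw).get(DRIVER_NAME)
            except (ValueError, AttributeError):
                # Malformed annotation: treat it the same as a missing one.
                node_id = None
            if node_id:
                # Annotation is present and usable: save it for later.
                self._saved[node_name] = node_id
                return node_id
        # Annotation missing or unusable: use the previously saved copy, if any.
        return self._saved.get(node_name)


cache = NodeIDCache()
healthy = {NODE_ID_ANNOTATION: json.dumps({DRIVER_NAME: "node-id-1"})}
first = cache.resolve("worker-1", healthy)  # annotation present, ID is saved
second = cache.resolve("worker-1", {})      # annotation gone, saved ID is reused
```

In this sketch a node that has never been seen with a valid annotation still resolves to `None`, so the caller would have to handle that case separately.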

alikdell (Contributor) commented Mar 2, 2022

The fix will be available in the next CSM for Resiliency release.
