
Blank disk with block volume mode stuck in ImportScheduled for WFFC storage class #2915

Closed
lzang opened this issue Oct 4, 2023 · 17 comments · Fixed by #2917

lzang commented Oct 4, 2023

What happened:
Created a blank DV with block volume mode for a WFFC storage class, and found that it is stuck in the ImportScheduled phase.

What you expected to happen:
Expected the disk to reach the Succeeded phase.

How to reproduce it (as minimally and precisely as possible):
Steps to reproduce the behavior.

  1. Have a storage class that supports block mode and WFFC
  2. Create a blank DV as below:
apiVersion: cdi.kubevirt.io/v1beta1
kind: DataVolume
metadata:
  name: empty-disk
spec:
  source:
    blank: {}
  storage:
    accessModes:
    - ReadWriteMany
    resources:
      requests:
        storage: 10Gi
    storageClassName: block-sc
    volumeMode: Block
  3. Observe that the DV is stuck in the ImportScheduled phase (see the check below).
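A quick way to observe the stuck phase (a minimal check, assuming the DV above was created in the current namespace):

# The PHASE column is expected to stay at ImportScheduled instead of reaching Succeeded.
kubectl get datavolume empty-disk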

Additional context:

Adding the annotation cdi.kubevirt.io/storage.usePopulator: "false" to the DV makes the import fall back to the legacy (pre-populators) flow and bypasses the issue.
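For reference, a sketch of the reproduction manifest with this workaround applied; only the annotation is added, everything else is unchanged from the manifest above:

apiVersion: cdi.kubevirt.io/v1beta1
kind: DataVolume
metadata:
  name: empty-disk
  annotations:
    # Workaround described above: opt out of the populators flow so the import
    # falls back to the legacy path.
    cdi.kubevirt.io/storage.usePopulator: "false"
spec:
  source:
    blank: {}
  storage:
    accessModes:
    - ReadWriteMany
    resources:
      requests:
        storage: 10Gi
    storageClassName: block-sc
    volumeMode: Block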

Environment:

  • CDI version (use kubectl get deployments cdi-deployment -o yaml): v1.57.0
lzang added the kind/bug label Oct 4, 2023

alromeros (Collaborator) commented:

Hey @lzang, thanks for the issue! This might be a bug in our populators flow; I'll take a look and let you know when I find something.

akalenyu (Collaborator) commented Oct 4, 2023

I have opened a PR to hopefully address this, though I had no luck reproducing it without the bind.immediate annotation.

I would appreciate it if you could attach the CDI CR and confirm that the logs in the PR description match the ones in your cdi-deployment pod.

lzang (Author) commented Oct 4, 2023

@akalenyu Thank you for looking at this! Sorry, I forgot to mention that we didn't enable the HonorWaitForFirstConsumer feature gate in CDI; we basically have an empty CDI config. Is that the reason we hit this issue?

> I have opened a PR to hopefully address this, though I had no luck reproducing it without the bind.immediate annotation.
>
> I would appreciate it if you could attach the CDI CR and confirm that the logs in the PR description match the ones in your cdi-deployment pod.

awels (Member) commented Oct 4, 2023

The standard CDI CR should look something like this:

apiVersion: cdi.kubevirt.io/v1beta1
kind: CDI
metadata:
  name: cdi
spec:
  config:
    featureGates:
    - HonorWaitForFirstConsumer
  imagePullPolicy: IfNotPresent
  infra:
    nodeSelector:
      kubernetes.io/os: linux
    tolerations:
    - key: CriticalAddonsOnly
      operator: Exists
  workload:
    nodeSelector:
      kubernetes.io/os: linux

awels (Member) commented Oct 4, 2023

The only real difference is that any DataVolume associated with it goes into the WaitForFirstConsumer phase instead of Pending. This tells KubeVirt it is safe to start the VM, and the population will take place at that point.

lzang (Author) commented Oct 4, 2023

> The standard CDI CR should look something like this:
>
> apiVersion: cdi.kubevirt.io/v1beta1
> kind: CDI
> metadata:
>   name: cdi
> spec:
>   config:
>     featureGates:
>     - HonorWaitForFirstConsumer
>   imagePullPolicy: IfNotPresent
>   infra:
>     nodeSelector:
>       kubernetes.io/os: linux
>     tolerations:
>     - key: CriticalAddonsOnly
>       operator: Exists
>   workload:
>     nodeSelector:
>       kubernetes.io/os: linux

Yes, that is what we have, except for the feature gate configuration. I would expect it to work even without the feature gate enabled.

lzang (Author) commented Oct 4, 2023

> The only real difference is that any DataVolume associated with it goes into the WaitForFirstConsumer phase instead of Pending. This tells KubeVirt it is safe to start the VM, and the population will take place at that point.

But for the image download case, I think it makes sense to populate the disk even if no VM uses it.

awels (Member) commented Oct 4, 2023

> > The only real difference is that any DataVolume associated with it goes into the WaitForFirstConsumer phase instead of Pending. This tells KubeVirt it is safe to start the VM, and the population will take place at that point.
>
> But for the image download case, I think it makes sense to populate the disk even if no VM uses it.

Not with WaitForFirstConsumer; we want the disk image to end up in the right location so the scheduler has maximum flexibility when scheduling the VM. If we populate immediately (let's say the WFFC is because of local storage), the VM will have to end up on whatever node the disk was downloaded to. If we wait to populate, the scheduler can place the VM on any node, and we download to that node at that point.

awels (Member) commented Oct 4, 2023

It gets even more interesting with multiple disks on a VM: if we import immediately (or create blank disks immediately on random nodes), it is entirely possible the VM cannot run because the disks are scattered across nodes. The feature gate lets CDI pass the WFFC status on to KubeVirt, so CDI won't populate the disks until the VM is scheduled to run on a particular node. And since it is the workload that determines where the disks end up, they all end up on the same node.

awels (Member) commented Oct 4, 2023

Note that if, for whatever reason, you do want to import the image without a VM using it (for example because you are going to clone it for an actual VM), you can pass an annotation to the DataVolume that forces CDI to import it to a random node.
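For illustration, a sketch of a blank DataVolume with such an annotation; the full annotation key is assumed here to be cdi.kubevirt.io/storage.bind.immediate.requested (the bind.immediate annotation referenced earlier in this thread), and the rest of the manifest reuses the reproduction example:

apiVersion: cdi.kubevirt.io/v1beta1
kind: DataVolume
metadata:
  name: empty-disk
  annotations:
    # Assumed key for the bind.immediate annotation mentioned in this thread;
    # it asks CDI to bind and populate immediately instead of honoring WFFC.
    cdi.kubevirt.io/storage.bind.immediate.requested: "true"
spec:
  source:
    blank: {}
  storage:
    accessModes:
    - ReadWriteMany
    resources:
      requests:
        storage: 10Gi
    storageClassName: block-sc
    volumeMode: Block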

akalenyu (Collaborator) commented Oct 4, 2023

> I would expect it to work even without the feature gate enabled.

Yes, we do have a loose end in that regard; you are right. With the feature gate disabled (or with the bind.immediate annotation, the two are equivalent) we basically don't provide the WFFC-override functionality. This differs from other CDI flows, where we do.

I would just make sure that this WFFC-override functionality is something you actually want. As @awels mentions, it is usually not the best idea. It makes sense for several use cases, like setting up a golden image where no VM will be started from it.

awels (Member) commented Oct 4, 2023

We should probably make the HonorWaitForFirstConsumer feature gate the default at this point, since all our tests assume it, and it should be a much better experience for users with it enabled.

lzang (Author) commented Oct 6, 2023

> It gets even more interesting with multiple disks on a VM: if we import immediately (or create blank disks immediately on random nodes), it is entirely possible the VM cannot run because the disks are scattered across nodes. The feature gate lets CDI pass the WFFC status on to KubeVirt, so CDI won't populate the disks until the VM is scheduled to run on a particular node. And since it is the workload that determines where the disks end up, they all end up on the same node.

I am aware of this issue, but in general we prefer to use non-local storage for serious use cases so that VMs can be moved around freely.

lzang (Author) commented Oct 6, 2023

> > I would expect it to work even without the feature gate enabled.
>
> Yes, we do have a loose end in that regard; you are right. With the feature gate disabled (or with the bind.immediate annotation, the two are equivalent) we basically don't provide the WFFC-override functionality. This differs from other CDI flows, where we do.
>
> I would just make sure that this WFFC-override functionality is something you actually want. As @awels mentions, it is usually not the best idea. It makes sense for several use cases, like setting up a golden image where no VM will be started from it.

Yes, this is one of the cases where we want the disk to be prepared instead of waiting to be bound. Also, conceptually it makes sense to prepare a disk when we create it rather than deferring its creation, since disks and VMs are separate entities.

akalenyu (Collaborator) commented Oct 8, 2023

> Also, conceptually it makes sense to prepare a disk when we create it rather than deferring its creation, since disks and VMs are separate entities.

I'm not sure that holds with lazy (WFFC) binding; pods and PVCs behave the same way.

> this is one of the cases where we want the disk to be prepared instead of waiting to be bound

Hmm, I would get that for stray golden disks, but we're talking about a blank disk that is only of use in a VM.

lzang (Author) commented Oct 9, 2023

> > Also, conceptually it makes sense to prepare a disk when we create it rather than deferring its creation, since disks and VMs are separate entities.
>
> I'm not sure that holds with lazy (WFFC) binding; pods and PVCs behave the same way.
>
> > this is one of the cases where we want the disk to be prepared instead of waiting to be bound
>
> Hmm, I would get that for stray golden disks, but we're talking about a blank disk that is only of use in a VM.

I would suggest not debating which behavior is better here, but instead fixing the issue: a blank disk with block volume mode does not work with a WFFC storage class unless the HonorWaitForFirstConsumer feature gate is enabled.

aglitke (Member) commented Oct 9, 2023

/assign @akalenyu
