Errors when doing concurrent host assisted clones #2550

Closed
akalenyu opened this issue Jan 22, 2023 · 10 comments
Labels
kind/bug, lifecycle/rotten (denotes an issue or PR that has aged beyond stale and will be auto-closed)

Comments

akalenyu (Collaborator) commented Jan 22, 2023

What happened:
Host-assisted clone source pods keep erroring when running concurrent dumb (host-assisted) clones.
With 2 concurrent clones, they seem to converge slowly toward success.

What you expected to happen:
Restarts should only be an edge case for host-assisted clones.

How to reproduce it (as minimally and precisely as possible):
Patch the storage profile to use host-assisted cloning (cloneStrategy: copy).
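One way to do this (a sketch; assumes the rook-ceph-block StorageProfile exists and that a merge patch of spec.cloneStrategy is enough on your CDI version):

# Force host-assisted (copy) cloning for the rook-ceph-block storage class
kubectl patch storageprofile rook-ceph-block --type merge \
  -p '{"spec":{"cloneStrategy":"copy"}}'

The resulting profile should look roughly like this: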

# k get storageprofile rook-ceph-block -o yaml
apiVersion: cdi.kubevirt.io/v1beta1
kind: StorageProfile
metadata:
...
  name: rook-ceph-block
spec:
  claimPropertySets:
  - accessModes:
    - ReadWriteOnce
    volumeMode: Filesystem
  cloneStrategy: copy
status:
  claimPropertySets:
  - accessModes:
    - ReadWriteOnce
    volumeMode: Filesystem
  cloneStrategy: copy
  provisioner: rook-ceph.rbd.csi.ceph.com
  storageClass: rook-ceph-block

Source DV YAML:

apiVersion: cdi.kubevirt.io/v1beta1
kind: DataVolume
metadata:
  name: golden-snap-source
  namespace: golden-ns
spec:
  source:
      http:
         url: http://.../Fedora-Cloud-Base-34-1.2.x86_64.qcow2
  pvc:
    accessModes:
    - ReadWriteOnce
    resources:
      requests:
        storage: 8Gi
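To create the source and wait for the import to finish, something like the following should work (a sketch; the file name golden-snap-source.yaml is hypothetical, and it assumes the golden-ns namespace already exists and that the DataVolume exposes a Ready condition):

# Create the source DataVolume and wait until the import completes
kubectl create -f golden-snap-source.yaml
kubectl wait dv/golden-snap-source -n golden-ns --for=condition=Ready --timeout=10m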

Run the concurrent clones:

#!/bin/bash

for (( c=0; c<2; c++ ))
do
  cat << __EOF__ | ./cluster-up/kubectl.sh create -f -
---
apiVersion: cdi.kubevirt.io/v1beta1
kind: DataVolume
metadata:
  name: cloned-datavolume-${c}
  namespace: default
spec:
  source:
    pvc:
      namespace: golden-ns
      name: golden-snap-source
  storage:
    accessModes:
      - ReadWriteOnce
    resources:
      requests:
        storage: 9Gi
__EOF__
done
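To watch the clones and the source-pod restarts while the script runs, a simple watch on the DataVolumes is enough (a sketch; the PHASE/PROGRESS/RESTARTS columns match the kubectl output shown below):

# Watch clone progress and restart counts across namespaces
./cluster-up/kubectl.sh get dv -A -w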

Additional context:
Errors:

I0122 14:58:03.462143      10 prometheus.go:72] 25.53
F0122 14:58:03.536422      10 clone-source.go:41] Subprocess did not execute successfully, result is: '\x02'
disk.img
/usr/bin/tar: disk.img: Read error at byte 1119744, while reading 512 bytes: Permission denied
/usr/bin/tar: Exiting with failure status due to previous errors

NAMESPACE   NAME                  PHASE             PROGRESS   RESTARTS   AGE
default     cloned-datavolume-1   CloneInProgress   24.66%     3          2m13s
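To capture the failing source-pod logs, list the pods in the source namespace and dump the clone-source pod's log (a sketch; the exact pod name and labels depend on the CDI version, so the pod name below is a placeholder):

# Find the host-assisted clone source pod in the source namespace and dump its log
kubectl get pods -n golden-ns
kubectl logs -n golden-ns <clone-source-pod-name>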

Environment:

  • CDI version (use kubectl get deployments cdi-deployment -o yaml): N/A
  • Kubernetes version (use kubectl version): N/A
  • DV specification: N/A
  • Cloud provider or hardware configuration: N/A
  • OS (e.g. from /etc/os-release): N/A
  • Kernel (e.g. uname -a): N/A
  • Install tools: N/A
  • Others: N/A
akalenyu (Collaborator, Author) commented Feb 6, 2023

A couple of notes from the grooming meeting:

akalenyu added a commit to akalenyu/containerized-data-importer that referenced this issue Feb 6, 2023
Also, make some changes to the test to catch
kubevirt#2550

Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
akalenyu added a commit to akalenyu/containerized-data-importer that referenced this issue Feb 7, 2023
Also, make some changes to the test to catch
kubevirt#2550

Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
akalenyu added a commit to akalenyu/containerized-data-importer that referenced this issue Mar 26, 2023
Also, make some changes to the test to catch
kubevirt#2550

Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
kubevirt-bot pushed a commit that referenced this issue Apr 12, 2023
Also, make some changes to the test to catch
#2550

Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
kubevirt-bot (Contributor) commented:

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

/lifecycle stale

kubevirt-bot added the lifecycle/stale label on May 7, 2023
akalenyu (Collaborator, Author) commented May 7, 2023

/remove-lifecycle stale

kubevirt-bot removed the lifecycle/stale label on May 7, 2023
alromeros (Collaborator) commented:

Haven't tested this yet, but maybe it's been fixed with the recent populator refactoring?

kubevirt-bot (Contributor) commented:

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

/lifecycle stale

kubevirt-bot added the lifecycle/stale label on Oct 29, 2023
akalenyu (Collaborator, Author) commented:

/remove-lifecycle stale

kubevirt-bot removed the lifecycle/stale label on Oct 29, 2023
kubevirt-bot (Contributor) commented:

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

/lifecycle stale

kubevirt-bot added the lifecycle/stale label on Jan 27, 2024
kubevirt-bot (Contributor) commented:

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

/lifecycle rotten

kubevirt-bot added the lifecycle/rotten label and removed the lifecycle/stale label on Feb 26, 2024
kubevirt-bot (Contributor) commented:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

/close

kubevirt-bot (Contributor) commented:

@kubevirt-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
