Errors when doing concurrent host assisted clones #2550

Closed
akalenyu opened this issue Jan 22, 2023 · 10 comments
Labels
kind/bug, lifecycle/rotten (denotes an issue or PR that has aged beyond stale and will be auto-closed)

Comments

akalenyu (Collaborator) commented Jan 22, 2023

What happened:
Host-assisted clone source pods keep erroring when running concurrent dumb (host-assisted) clones.
With 2 concurrent clones, they seem to converge slowly toward success.

What you expected to happen:
Restarts should only be an edge case for host-assisted clones.

How to reproduce it (as minimally and precisely as possible):
Patch the storage profile to use host-assisted cloning (cloneStrategy: copy).
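One way to do this (a sketch; assumes the rook-ceph-block StorageProfile exists and that a merge patch of spec.cloneStrategy is enough on your CDI version):

# Force host-assisted (copy) cloning for the rook-ceph-block storage class
kubectl patch storageprofile rook-ceph-block --type merge \
  -p '{"spec":{"cloneStrategy":"copy"}}'

The resulting profile should look roughly like this: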

# k get storageprofile rook-ceph-block -o yaml
apiVersion: cdi.kubevirt.io/v1beta1
kind: StorageProfile
metadata:
...
  name: rook-ceph-block
spec:
  claimPropertySets:
  - accessModes:
    - ReadWriteOnce
    volumeMode: Filesystem
  cloneStrategy: copy
status:
  claimPropertySets:
  - accessModes:
    - ReadWriteOnce
    volumeMode: Filesystem
  cloneStrategy: copy
  provisioner: rook-ceph.rbd.csi.ceph.com
  storageClass: rook-ceph-block

Source DV YAML:

apiVersion: cdi.kubevirt.io/v1beta1
kind: DataVolume
metadata:
  name: golden-snap-source
  namespace: golden-ns
spec:
  source:
      http:
         url: http://.../Fedora-Cloud-Base-34-1.2.x86_64.qcow2
  pvc:
    accessModes:
    - ReadWriteOnce
    resources:
      requests:
        storage: 8Gi
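To create the source and wait for the import to finish, something like the following should work (a sketch; the file name golden-snap-source.yaml is hypothetical, and it assumes the golden-ns namespace already exists and that the DataVolume exposes a Ready condition):

# Create the source DataVolume and wait until the import completes
kubectl create -f golden-snap-source.yaml
kubectl wait dv/golden-snap-source -n golden-ns --for=condition=Ready --timeout=10m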

Run the concurrent clones:

#!/bin/bash

for (( c=0; c<2; c++ ))
do
  cat << __EOF__ | ./cluster-up/kubectl.sh create -f -
---
apiVersion: cdi.kubevirt.io/v1beta1
kind: DataVolume
metadata:
  name: cloned-datavolume-${c}
  namespace: default
spec:
  source:
    pvc:
      namespace: golden-ns
      name: golden-snap-source
  storage:
    accessModes:
      - ReadWriteOnce
    resources:
      requests:
        storage: 9Gi
__EOF__
done
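To watch the clones and the source-pod restarts while the script runs, a simple watch on the DataVolumes is enough (a sketch; the PHASE/PROGRESS/RESTARTS columns match the kubectl output shown below):

# Watch clone progress and restart counts across namespaces
./cluster-up/kubectl.sh get dv -A -w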

Additional context:
Errors:

I0122 14:58:03.462143      10 prometheus.go:72] 25.53
F0122 14:58:03.536422      10 clone-source.go:41] Subprocess did not execute successfully, result is: '\x02'
disk.img
/usr/bin/tar: disk.img: Read error at byte 1119744, while reading 512 bytes: Permission denied
/usr/bin/tar: Exiting with failure status due to previous errors

NAMESPACE   NAME                  PHASE             PROGRESS   RESTARTS   AGE
default     cloned-datavolume-1   CloneInProgress   24.66%     3          2m13s
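To capture the failing source-pod logs, list the pods in the source namespace and dump the clone-source pod's log (a sketch; the exact pod name and labels depend on the CDI version, so the pod name below is a placeholder):

# Find the host-assisted clone source pod in the source namespace and dump its log
kubectl get pods -n golden-ns
kubectl logs -n golden-ns <clone-source-pod-name>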

Environment:

  • CDI version (use kubectl get deployments cdi-deployment -o yaml): N/A
  • Kubernetes version (use kubectl version): N/A
  • DV specification: N/A
  • Cloud provider or hardware configuration: N/A
  • OS (e.g. from /etc/os-release): N/A
  • Kernel (e.g. uname -a): N/A
  • Install tools: N/A
  • Others: N/A
akalenyu (Collaborator, Author) commented Feb 6, 2023

A couple of notes from the grooming meeting:

akalenyu added a commit to akalenyu/containerized-data-importer that referenced this issue Feb 6, 2023
Also, make some changes to the test to catch
kubevirt#2550

Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
akalenyu added a commit to akalenyu/containerized-data-importer that referenced this issue Feb 7, 2023
Also, make some changes to the test to catch
kubevirt#2550

Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
akalenyu added a commit to akalenyu/containerized-data-importer that referenced this issue Mar 26, 2023
Also, make some changes to the test to catch
kubevirt#2550

Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
kubevirt-bot pushed a commit that referenced this issue Apr 12, 2023
Also, make some changes to the test to catch
#2550

Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
kubevirt-bot (Contributor) commented:

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

/lifecycle stale

kubevirt-bot added the lifecycle/stale label on May 7, 2023
akalenyu (Collaborator, Author) commented May 7, 2023

/remove-lifecycle stale

kubevirt-bot removed the lifecycle/stale label on May 7, 2023
alromeros (Collaborator) commented:

Haven't tested this yet, but maybe it's been fixed with the recent populator refactoring?

kubevirt-bot (Contributor) commented:

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

/lifecycle stale

kubevirt-bot added the lifecycle/stale label on Oct 29, 2023
akalenyu (Collaborator, Author) commented:

/remove-lifecycle stale

kubevirt-bot removed the lifecycle/stale label on Oct 29, 2023
kubevirt-bot (Contributor) commented:

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

/lifecycle stale

kubevirt-bot added the lifecycle/stale label on Jan 27, 2024
kubevirt-bot (Contributor) commented:

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

/lifecycle rotten

kubevirt-bot added the lifecycle/rotten label and removed the lifecycle/stale label on Feb 26, 2024
kubevirt-bot (Contributor) commented:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

/close

kubevirt-bot (Contributor) commented:

@kubevirt-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
