
CDI import from URL is significantly slower than a manual wget + virtctl image-upload #2809

Closed
cloudymax opened this issue Jul 15, 2023 · 52 comments · Fixed by #2832

@cloudymax

cloudymax commented Jul 15, 2023

What happened:
There is a massive performance difference between CDI image import methods, which makes the declarative workflow for PVCs/DataVolumes unusable. I'm REALLY trying to make KubeVirt work as a replacement for QEMU wrapped in Bash, but the slowness of drive creation is making it impossible to achieve buy-in from others.

You can find my full setup here: https://github.com/cloudymax/argocd-nvidia-lab/tree/main/kubevirt

Disk configs are specifically here: https://github.com/small-hack/argocd/tree/main/kubevirt/disks

CDI configuration is here: https://github.com/small-hack/argocd/tree/main/kubevirt/cdi

This has been tested across multiple hardware types:

  • Hetzner AX102 Instance with R97950X3D + NVME (Raid0)

  • Dell XPS with i7-11700 + NVME (no raid)

  • Hetzner instance with i7 7700 + SATA6 SSDs (Raid0)

  • Has been tested on K3s (bare-metal and in a KVM VPS)

  • Has been tested with local-path and Longhorn storage classes

  • The CDI has been tested with nodePort and LoadBalancer type

  • I have increased the resources available to the CDI importer, but it had no effect; the process will not consume more resources.

Method: wget + virtctl
Storage Class: local-path
Time: 22 seconds

Method: wget + virtctl
Storage Class: longhorn
Time: 56 seconds

Command used:

export VOLUME_NAME=debian12-pvc
export NAMESPACE="default"
export STORAGE_CLASS="longhorn"
export ACCESS_MODE="ReadWriteMany"
export IMAGE_URL="https://cloud.debian.org/images/cloud/bookworm/daily/latest/debian-12-generic-amd64-daily.qcow2"
export IMAGE_PATH=debian-12-generic-amd64-daily.qcow2
export VOLUME_TYPE=pvc
export SIZE=120Gi
export PROXY_ADDRESS=$(kubectl get svc cdi-uploadproxy-loadbalancer -n cdi -o json | jq --raw-output '.spec.clusterIP')
# $(kubectl get svc cdi-uploadproxy -n cdi -o json | jq --raw-output 

time wget -O $IMAGE_PATH $IMAGE_URL && \
time virtctl image-upload $VOLUME_TYPE $VOLUME_NAME \
    --size=$SIZE \
    --image-path=$IMAGE_PATH \
    --uploadproxy-url=https://$PROXY_ADDRESS:443 \
    --namespace=$NAMESPACE \
    --storage-class=$STORAGE_CLASS \
    --access-mode=$ACCESS_MODE \
    --insecure --force-bind

When using the declarative method, the process takes over 20 minutes.

Logs attached:
Explore-logs-2023-07-15 10 36 08.txt

---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: "debian"
  labels:
    app: containerized-data-importer
  annotations:
    cdi.kubevirt.io/storage.bind.immediate.requested: "true"
    cdi.kubevirt.io/storage.import.endpoint: "https://cloud.debian.org/images/cloud/bookworm/daily/latest/debian-12-generic-amd64-daily.qcow2"
spec:
  storageClassName: local-path
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 120Gi

What you expected to happen:

I would expect this process to be MUCH faster. For example, when performing this task using just plain QEMU + Bash it takes only a couple of seconds, and the VM is booted and ready to log in in ~30 seconds total on the same hardware mentioned above.

Source: https://github.com/cloudymax/Scrap-Metal/blob/main/virtual-machines/vm.sh

> bash vm.sh create-cloud-vm test
[2023-07-15 10:55:18] 📞 Setting networking options.
[2023-07-15 10:55:18]  - Static IP selected.
bridge exists
tap0 exists.
[2023-07-15 10:55:18] j🖥 Set graphics options based on gpu presence.
[2023-07-15 10:55:18]  - GPU attached
[2023-07-15 10:55:18] 📂 Creating VM directory.
[2023-07-15 10:55:18]  - Done!
[2023-07-15 10:55:18] 🔐 Create an SSH key for the VM admin user
[2023-07-15 10:55:20]  - Done.
[2023-07-15 10:55:20] ⬇️ Downloading cloud image...
debian-12-generic-amd64 100%[===============================>] 361.08M  67.0MB/s    in 8.3s
[2023-07-15 10:55:29] 📈 Expanding image
[2023-07-15 10:55:29]  - Done!
[2023-07-15 10:55:29] 👤 Generating user data
[2023-07-15 08:55:29] 🔎 Checking for required utilities...
[2023-07-15 08:55:29]  - All required utilities are installed.
[2023-07-15 08:55:29] 📝 Creating user-data file
[2023-07-15 08:55:29] 📝 Checking against the cloud-inint schema...
[2023-07-15 08:55:29] Valid cloud-config: user-data.yaml
[2023-07-15 08:55:29]  - Done.
[2023-07-15 10:55:30] 🌱 Generating seed iso containing user-data
[2023-07-15 10:55:30]  - Done!
[2023-07-15 10:55:30] 🌥 Creating cloud-image based VM
[2023-07-15 10:55:30] Watching progress:
[2023-07-15 10:55:54]  - Cloud-init complete.. 22.4 -----+ 3:ce:10:8c:52:c8:eb:9
download_cloud_image(){
  log "⬇️ Downloading cloud image..."
    wget -c -O "$CLOUD_IMAGE_NAME" "$CLOUD_IMAGE_URL" -q --show-progress
}

# Expand the size of the disk image 
expand_cloud_image(){
  log "📈 Expanding image"

  export CLOUD_IMAGE_FILE_TYPE=$(echo "${CLOUD_IMAGE_NAME#*.}")

  case $CLOUD_IMAGE_FILE_TYPE in
    "img")
      echo "img"
      qemu-img create -b ${CLOUD_IMAGE_NAME} -f qcow2 \
          -F qcow2 disk.qcow2 \
          "$DISK_SIZE" 1> /dev/null
      ;;
    "qcow2")
      echo "qcow2"
      qemu-img create -b ${CLOUD_IMAGE_NAME} -f qcow2 \
          -F qcow2 disk.qcow2 \
          "$DISK_SIZE" 1> /dev/null
      ;;
    *)
      echo "error"
      exit
  esac

  log " - Done!"
}
...

# start the cloud-init backed VM
create_ubuntu_cloud_vm(){
  log "🌥 Creating cloud-image based VM"
  if tmux has-session -t "${VM_NAME}_session" 2>/dev/null; then
    echo "session exists"
  else
    tmux new-session -d -s "${VM_NAME}_session"
    tmux send-keys -t "${VM_NAME}_session" "sudo qemu-system-x86_64  \
      -machine accel=kvm,type=q35 \
      -cpu host,kvm="off",hv_vendor_id="null" \
      -smp $SMP,sockets="$SOCKETS",cores="$PHYSICAL_CORES",threads="$THREADS",maxcpus=$SMP \
      -m "$MEMORY" \
      $VGA_OPT
      $PCI_GPU
      $NETDEV
      $DEVICE
      -object iothread,id=io1 \
      -device virtio-blk-pci,drive=disk0,iothread=io1 \
      -drive if=none,id=disk0,cache=none,format=qcow2,aio=threads,file=disk.qcow2 \
      -drive if=virtio,format=raw,file=seed.img,index=0,media=disk  \
      -bios /usr/share/ovmf/OVMF.fd \
      -usbdevice tablet \
      -vnc $HOST_ADDRESS:$VNC_PORT \
      $@" ENTER
      watch_progress
  fi
}

Environment:

  • CDI version (use kubectl get deployments cdi-deployment -o yaml): v1.54.3
  • Kubernetes version (use kubectl version): v1.27.3+k3s1
  • DV specification: local-path and Longhorn
  • Cloud provider or hardware configuration: None - bare metal
  • OS (e.g. from /etc/os-release): Debian12 and Ubuntu 22.04
  • Kernel (e.g. uname -a): 6.1 and 5.15
  • Install tools: N/A
  • Others: N/A
@akalenyu
Collaborator

akalenyu commented Jul 16, 2023

Hey! Thanks for reporting this

This sounds similar to previous issues we had like #2358,
but the changes that tackle those did not get backported to your CDI version (v1.54.3).
The author of the PR that fixes this (#2701) mentions:

On my machine this reduces a Ubuntu cloud image import from 90 minutes to 1 minute.

If you want to try that change, you can install CDI 1.57 RC

@cloudymax
Author

Hey! Thanks for reporting this

This sounds similar to previous issues we had like #2358, but the changes that tackle those did not get backported to your CDI version (v1.54.3). The author of the PR that fixes this (#2701) mentions:

On my machine this reduces a Ubuntu cloud image import from 90 minutes to 1 minute.

If you want to try that change, you can install CDI 1.57 RC

Thanks for the info, I'll try that out asap 🙏

@rwmjones
Contributor

If the change doesn't work, can you let me know the Ubuntu image URL that you're trying to download and the exact version of nbdkit?

@cloudymax
Author

cloudymax commented Jul 21, 2023

Hey! Thanks for reporting this

This sounds similar to previous issues we had like #2358, but the changes that tackle those did not get backported to your CDI version (v1.54.3). The author of the PR that fixes this (#2701) mentions:

On my machine this reduces a Ubuntu cloud image import from 90 minutes to 1 minute.

If you want to try that change, you can install CDI 1.57 RC

I'm actually running into a lot of issues attempting to install 1.57 RC; the operator fails to start with the error:
cdi-apiserver-signer-bundle not found

still troubleshooting 🤷

@cloudymax
Author

cloudymax commented Jul 21, 2023

@akalenyu @rwmjones

After making the requested changes I'm still seeing that importing takes 20+ minutes.

I upgraded to the latest kubevirt version and latest cdi:

  • CDI v1.57.0-rc1
  • Kubevirt v1.0.0-beta.0

tried with:

---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: "debian"
  labels:
    app: containerized-data-importer
  annotations:
    cdi.kubevirt.io/storage.bind.immediate.requested: "true"
    cdi.kubevirt.io/storage.import.endpoint: "https://cloud.debian.org/images/cloud/bookworm/daily/latest/debian-12-generic-amd64-daily.qcow2"
spec:
  storageClassName: longhorn
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 120Gi
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: "jammy"
  labels:
    app: containerized-data-importer
  annotations:
    cdi.kubevirt.io/storage.bind.immediate.requested: "true"
    cdi.kubevirt.io/storage.import.endpoint: "https://cloud-images.ubuntu.com/jammy/current/jammy-server-cloudimg-amd64.img"
spec:
  storageClassName: local-path
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 120Gi

tried on longhorn and local-path, same results.

Logs:

I0721 08:35:51.752839       1 importer.go:103] Starting importer
I0721 08:35:51.753025       1 importer.go:172] begin import process
I0721 08:35:52.123544       1 data-processor.go:356] Calculating available size
I0721 08:35:52.127447       1 data-processor.go:368] Checking out file system volume size.
I0721 08:35:52.127553       1 data-processor.go:376] Request image size not empty.
I0721 08:35:52.127561       1 data-processor.go:381] Target size 126210801664.
I0721 08:35:52.127803       1 nbdkit.go:310] Waiting for nbdkit PID.
I0721 08:35:52.628451       1 nbdkit.go:331] nbdkit ready.
I0721 08:35:52.628475       1 data-processor.go:255] New phase: Convert
I0721 08:35:52.628485       1 data-processor.go:261] Validating image
I0721 08:35:52.878701       1 nbdkit.go:293] Log line from nbdkit: nbdkit: curl[1]: error: readahead: warning: underlying plugin does not support NBD_CMD_CACHE or PARALLEL thread model, so the filter won't do anything
E0721 08:35:53.394136       1 prlimit.go:155] failed to kill the process; os: process already finished
I0721 08:35:53.396582       1 qemu.go:258] 0.00
I0721 08:35:53.396986       1 nbdkit.go:293] Log line from nbdkit: nbdkit: curl[2]: error: readahead: warning: underlying plugin does not support NBD_CMD_CACHE or PARALLEL thread model, so the filter won't do anything
I0721 08:36:08.120708       1 qemu.go:258] 1.00
I0721 08:36:21.590231       1 qemu.go:258] 2.01

I'll keep playing around with things and see if I can get it to work or at least break in a different way

@rwmjones
Contributor

Can you run nbdkit --version

@cloudymax
Author

cloudymax commented Jul 21, 2023

Sure,

on the host machine:

> nbdkit --version
-bash: nbdkit: command not found

in the container:

> nbdkit --version
nbdkit 1.34.1 (nbdkit-1.34.1-1.el9)

cloudymax changed the title from "CDI import from URL is massively slower than a manual wget + virtctl image-upload" to "CDI import from URL is significantly slower than a manual wget + virtctl image-upload" on Jul 21, 2023
@rwmjones
Contributor

The fundamental problem is still qemu-img convert lacking support for NBD multi-conn, which is not something we can fix in nbdkit.

But anyway, a while back I had this patch to nbdkit to enable a parallel curl plugin. It had some unfortunate effects because of fairness, which I describe in the thread there. I looked at it again and I think we should give it another go, so I pushed this commit:

https://gitlab.com/nbdkit/nbdkit/-/commit/f2163754860d041c4cb12dace90591c280eccae8

With this patch and upping the default number of connections (connections=8) it goes from 20 mins to about 4 mins, which, while still not great (see above about qemu), is probably the best we can do until qemu gets fixed.

@rwmjones
Contributor

I've finally worked out what's going on here. About 3/4 of the time is spent doing 302 redirects from the original URL located on cloud.debian.org to the mirror (which for me happens to be saimei.ftp.acc.umu.se, but I guess that depends where you are located in the world).

The way curl works is it doesn't cache 302 redirects, so basically on every single block that we request we're doing a whole new redirect.

If you "manually" resolve the redirect first then for me it is ~ 4 times faster, and near the speed of plain wget.

It turns out we've actually hit this bug before in a slightly different context:

https://bugzilla.redhat.com/show_bug.cgi?id=2013000

I proposed a patch for that bug, adding an effective-url=true flag which would do the pre-resolving once inside the curl plugin. There were various problems with that patch which you can read about here:

https://listman.redhat.com/archives/libguestfs/2021-October/026882.html

I wonder if CDI can do a simple HEAD request to the site that is submitted in order to resolve redirects, and then use the resolved URI when opening nbdkit? It should be fairly trivial if CDI has access to the curl library. I can suggest some code for this if you like. It would also be useful to give better errors for things like 404s.
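
For illustration, a minimal sketch of what such a pre-resolve step could look like, assuming the importer used Go's standard net/http client rather than libcurl (the resolveRedirects helper below is hypothetical, not existing CDI code):

package main

import (
	"fmt"
	"net/http"
)

// resolveRedirects issues a single HEAD request, lets the HTTP client follow
// any 3xx redirects, and returns the final URL so the importer could hand an
// already-resolved endpoint to nbdkit. (Hypothetical helper, not CDI code.)
func resolveRedirects(rawURL string) (string, error) {
	resp, err := http.Head(rawURL) // the default client follows redirects
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	if resp.StatusCode >= 400 {
		// Surface 404s and similar errors early instead of failing later in nbdkit.
		return "", fmt.Errorf("HEAD %s: unexpected status %s", rawURL, resp.Status)
	}
	// resp.Request.URL is the URL that actually produced the response,
	// i.e. the target of the last redirect.
	return resp.Request.URL.String(), nil
}

func main() {
	resolved, err := resolveRedirects("https://cloud.debian.org/images/cloud/bookworm/daily/latest/debian-12-generic-amd64-daily.qcow2")
	if err != nil {
		panic(err)
	}
	fmt.Println(resolved)
}

The resolved URL would then be passed as the url= parameter when nbdkit is started, so curl never has to chase the redirect on every block.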

@akalenyu
Collaborator

I've finally worked out what's going on here. About 3/4 of the time is spent doing 302 redirects from the original URL located on cloud.debian.org to the mirror (which for me happens to be saimei.ftp.acc.umu.se, but I guess that depends where you are located in the world).

The way curl works is it doesn't cache 302 redirects, so basically on every single block that we request we're doing a whole new redirect.

If you "manually" resolve the redirect first then for me it is ~ 4 times faster, and near the speed of plain wget.

It turns out we've actually hit this bug before in a slightly different context:

https://bugzilla.redhat.com/show_bug.cgi?id=2013000

I proposed a patch for that bug, adding an effective-url=true flag which would do the pre-resolving once inside the curl plugin. There were various problems with that patch which you can read about here:

https://listman.redhat.com/archives/libguestfs/2021-October/026882.html

I wonder if CDI can do a simple HEAD request to the site that is submitted in order to resolve redirects, and then use the resolved URI when opening nbdkit? It should be fairly trivial if CDI has access to the curl library. I can suggest some code for this if you like. It would also be useful to give better errors for things like 404s.

Interesting!
I see some old PRs where we pulled the plug on CDI doing the resolves:
#1981

@awels may have more context

@rwmjones
Contributor

The two issues are overlapping but slightly distinct. In BZ 2013000 / #1981, the problem was with Fedora redirects being flaky, sometimes resolving to a 404 mirror. We decided in the end to solve that by using the nbdkit-retry-request-filter, which basically retries single requests until we get a good mirror. In that case pre-resolving the URL was thought to be bad (because we might pre-resolve it to a 404, and then it wouldn't work at all), and so #1981 was not merged. In this case the Debian mirrors all appear to be good (i.e. they don't resolve to 404s that I've seen), but the issue is that we're doing the resolves on every block (curl internally doesn't cache them) and they happen to be especially slow on the central Debian server, so pre-resolving would be a good thing.

@cloudymax
Author

cloudymax commented Jul 24, 2023

This also affects Debian Live ISOs, Ubuntu cloud and Live ISOs, and Windows ISOs in my testing.

Examples here:
https://github.com/small-hack/argocd/tree/main/kubevirt/disks

Source urls used:

----
Debian:
  12: "https://cloud.debian.org/images/cloud/bookworm/daily/latest/debian-12-generic-amd64-daily.qcow2"
Ubuntu:
  focal: "https://cloud-images.ubuntu.com/focal/current/focal-server-cloudimg-amd64.img"
  jammy: "https://cloud-images.ubuntu.com/jammy/current/jammy-server-cloudimg-amd64.img"
  lunar: "https://cloud-images.ubuntu.com/lunar/current/lunar-server-cloudimg-amd64.img"
Windows:
  10: "https://www.itechtics.com/?dl_id=173"
  11: "https://www.itechtics.com/?dl_id=168"

Happy to apply any patches and/or try new URLs to provide more data 👍

I can also grant access to a test cluster if that would help

@akalenyu
Collaborator

akalenyu commented Jul 24, 2023

https://www.itechtics.com/?dl_id=173

IIUC it could be helpful to use a resolved URL to verify this indeed helps in these cases, so for the Windows images, for example, you could try inputting the result of this into CDI:

curl -LIs -o /dev/null -w %{url_effective} https://www.itechtics.com/?dl_id=173

@rwmjones
Contributor

Yes, please stick any URLs you find which are problematic into a comment here.

The Debian one, as discussed. You can just replace the URL in the configuration file with one which has been resolved (curl -I https://cloud.debian.org/images/cloud/bookworm/daily/latest/debian-12-generic-amd64-daily.qcow2), which should mostly fix the problem.

The Ubuntu ones are pretty weird. For wget I get 2m13. For nbdkit the numbers vary wildly from a best case of 2m5 (faster than wget) to 8 minutes or more. It is not redirecting, so the problem is distinct from the Debian URL. I'm still looking at this and the Windows ones.

@rwmjones
Contributor

The problem with the Windows ones is the redirects, same as Debian. @akalenyu's suggestion is the right one there.

@rwmjones
Contributor

rwmjones commented Jul 24, 2023

The Ubuntu download issue is now completely unreproducible. nbdkit is about 1-2 seconds slower than wget for me. Edit: nbdkit is faster than wget for me (1m21 for wget vs 1m16 for nbdkit, both numbers fairly stable).

I'm going to say at this point it might be some kind of load balancing issue at their end. I'll run the same command tomorrow morning to see if I get a different server and can reproduce it.

@cloudymax
Author

Thanks so much for all your hard work on this @rwmjones @akalenyu

I'll be testing the resolved URLs today; I tried last night but ended up hitting the pod limit on our test node 😓

@rwmjones
Contributor

nbdkit is even faster this morning ...

$ time ./nbdkit -r -U - curl https://cloud-images.ubuntu.com/lunar/current/lunar-server-cloudimg-amd64.img --run 'nbdcopy -p $uri null:'
 █ 100% [****************************************]
real	1m10.973s
user	0m1.613s
sys	0m1.976s

I should note that I'm running a slightly modified nbdkit. The curl plugin uses a parallel thread model which seems to make it a bit faster. And I gave in and wrote a qcow2 filter.

@cloudymax
Author

cloudymax commented Jul 25, 2023

nbdkit is even faster this morning ...

$ time ./nbdkit -r -U - curl https://cloud-images.ubuntu.com/lunar/current/lunar-server-cloudimg-amd64.img --run 'nbdcopy -p $uri null:'
 █ 100% [****************************************]
real	1m10.973s
user	0m1.613s
sys	0m1.976s

I should note that I'm running a slightly modified nbdkit. The curl plugin uses a parallel thread model which seems to make it a bit faster. And I gave in and wrote a qcow2 filter.

I'm able to achieve similar numbers to this if I download and install nbdkit on the host machine and run the same command.
However, when the process runs in a container after being triggered by creating a PVC, the creation time is 12 minutes.

This is faster than the 20+ minutes it was prior to using a resolved URL for the CDI, but it's still nearly 10x longer than running the command directly on the host, so the declarative workflow still isn't viable, unfortunately.

Debian12 CDI

Time: 12:03

url: "https://laotzu.ftp.acc.umu.se/images/cloud/bookworm/latest/debian-12-generic-amd64.qcow2"

Method:

kubectl apply -f debian12-pvc.yaml

debian12-pvc.yaml:

---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: debian12-pvc
  namespace: vm0
  labels:
    app: containerized-data-importer
  annotations:
    cdi.kubevirt.io/storage.bind.immediate.requested: "true"
    cdi.kubevirt.io/storage.import.endpoint: "https://laotzu.ftp.acc.umu.se/images/cloud/bookworm/latest/debian-12-generic-amd64.qcow2"
spec:
  storageClassName: local-path
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 2Gi

Debian12 NBDkit

Time: 1:18

url: "https://laotzu.ftp.acc.umu.se/images/cloud/bookworm/latest/debian-12-generic-amd64.qcow2"

Version:

nbdkit --version
nbdkit 1.32.5

Method:

time nbdkit -r -U - curl https://laotzu.ftp.acc.umu.se/images/cloud/bookworm/latest/debian-12-generic-amd64.qcow2 --run 'nbdcopy --no-extents -p "$uri" null:'
█ 100% [****************************************]

real    1m18.763s
user    0m0.476s
sys     0m0.515s

Ubuntu 22.04 CDI

Time: 11:52

url: "https://cloud-images.ubuntu.com/jammy/current/jammy-server-cloudimg-amd64.img""

Method:

kubectl apply -f jammy-pvc.yaml

jammy-pvc.yaml:

---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: jammy-pvc
  namespace: vm0
  labels:
    app: containerized-data-importer
  annotations:
    cdi.kubevirt.io/storage.bind.immediate.requested: "true"
    cdi.kubevirt.io/storage.import.endpoint: "https://cloud-images.ubuntu.com/jammy/current/jammy-server-cloudimg-amd64.img"
spec:
  storageClassName: local-path
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 2Gi

Ubuntu 22.04 NBDkit

Time: 2:21

url: "https://cloud-images.ubuntu.com/jammy/current/jammy-server-cloudimg-amd64.img"

Version:

nbdkit --version
nbdkit 1.32.5

Method:

time nbdkit -r -U - curl https://cloud-images.ubuntu.com/jammy/current/jammy-server-cloudimg-amd64.img --run 'nbdcopy --no-extents -p "$uri" null:'
█ 100% [****************************************]

real    2m21.936s
user    0m5.608s
sys     0m1.067s

@awels
Member

awels commented Jul 25, 2023

CDI is using a fairly old version of nbdkit. It looks like 1.34.1-1; we should look into upgrading to a more recent version.

@rwmjones
Contributor

Does CDI run nbdkit inside a VM, or is it running straight in a container on the host? I don't think running inside a VM should be a problem (there's no virt involved with nbdkit), but it would be good to understand exactly where it is running.

nbdkit 1.30's curl plugin lacks at least:

  • Connection pool support
  • Multi-conn support
  • Parallel thread model (not in 1.34)

@cloudymax
Author

Does CDI run nbdkit inside a VM, or is it running straight in a container on the host? I don't think running inside a VM should be a problem (there's no virt involved with nbdkit), but it would be good to understand exactly where it is running.

nbdkit 1.30's curl plugin lacks at least:

* Connection pool support

* Multi-conn support

* Parallel thread model (not in 1.34)

My tests were run on a bare-metal install of K3s, so the only virtualization layer is the container 👍

@awels
Member

awels commented Jul 25, 2023

CDI runs inside a container on the host. So I just double-checked: it looks like 1.57 is running nbdkit 1.34, and 1.56 is running 1.30. And I think we need to add a plugin to 1.34 that is not needed in 1.30.

@cloudymax could you try 1.56 instead of 1.57-rc1 to see if it works better?

@rwmjones
Contributor

rwmjones commented Jul 25, 2023

@cloudymax I think the Ubuntu image case is really different from the Debian & Windows image cases (caused by redirects). Probably we need a new bug for that.

Are you able to reliably reproduce slow download of the Ubuntu image with nbdkit >= 1.32 (ideally 1.34)? I'm still getting faster than wget performance this afternoon.

$ time nbdkit -r -U - curl https://cloud-images.ubuntu.com/lunar/current/lunar-server-cloudimg-amd64.img --run 'nbdcopy -p $uri null:'

As everyone else is in different places around the world you might hit a different Ubuntu mirror and get the super-slow case.

If you get the super-slow case, then add these flags and capture the output (which is likely to be very large):

-f -v -D curl.verbose=1

@awels
Member

awels commented Jul 25, 2023

I just tried with CDI 1.56 and it is still super slow there....

@cloudymax
Author

cloudymax commented Jul 25, 2023

@cloudymax I think the Ubuntu image case is really different from the Debian & Windows image cases (caused by redirects). Probably we need a new bug for that.

Are you able to reliably reproduce slow download of the Ubuntu image with nbdkit >= 1.32 (ideally 1.34)? I'm still getting faster than wget performance this afternoon.

$ time nbdkit -r -U - curl https://cloud-images.ubuntu.com/lunar/current/lunar-server-cloudimg-amd64.img --run 'nbdcopy -p $uri null:'

As everyone else is in different places around the world you might hit a different Ubuntu mirror and get the super-slow case.

If you get the super-slow case, then add these flags and capture the output (which is likely to be very large):

-f -v -D curl.verbose=1

The comparison is not nbdkit vs wget as a race to download the image fastest at the command line on the host. Even if it were, nbdkit taking +/- 1 minute is still far slower than the 7 seconds it takes for wget to download the image + 25 seconds for virtctl to create the PVC and then re-upload the image.

NBD

time nbdkit -r -U - curl https://cloud-images.ubuntu.com/lunar/current/lunar-server-cloudimg-amd64.img --run 'nbdcopy -p $uri null:'
█ 100% [****************************************]

real    2m48.143s
user    0m6.562s
sys     0m1.317s

wget

export VOLUME_NAME=jammy-pvc
export NAMESPACE="default"
export STORAGE_CLASS="local-path"
export ACCESS_MODE="ReadWriteOnce"
export IMAGE_URL="https://cloud-images.ubuntu.com/lunar/current/lunar-server-cloudimg-amd64.img "
export IMAGE_PATH="lunar-server-cloudimg-amd64.img"
export VOLUME_TYPE=pvc
export SIZE=5Gi
export PROXY_ADDRESS=$(kubectl get svc cdi-uploadproxy-loadbalancer -n cdi -o json | jq --raw-output '.spec.clusterIP')
# $(kubectl get svc cdi-uploadproxy -n cdi -o json | jq --raw-output

time wget -O $IMAGE_PATH $IMAGE_URL && \
time virtctl image-upload $VOLUME_TYPE $VOLUME_NAME \
    --size=$SIZE \
    --image-path=$IMAGE_PATH \
    --uploadproxy-url=https://$PROXY_ADDRESS:443 \
    --namespace=$NAMESPACE \
    --storage-class=$STORAGE_CLASS \
    --access-mode=$ACCESS_MODE \
    --insecure --force-bind
--2023-07-25 16:16:45--  https://cloud-images.ubuntu.com/lunar/current/lunar-server-cloudimg-amd64.img
Resolving cloud-images.ubuntu.com (cloud-images.ubuntu.com)... 2620:2d:4000:1::17, 2620:2d:4000:1::1a, 185.125.190.40, ...
Connecting to cloud-images.ubuntu.com (cloud-images.ubuntu.com)|2620:2d:4000:1::17|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 816316416 (778M) [application/octet-stream]
Saving to: ‘lunar-server-cloudimg-amd64.img’

lunar-server-cloudimg-a 100%[===============================>] 778.50M   103MB/s    in 7.9s

2023-07-25 16:16:53 (99.1 MB/s) - ‘lunar-server-cloudimg-amd64.img’ saved [816316416/816316416]


real    0m7.993s
user    0m0.404s
sys     0m0.893



PVC default/jammy-pvc not found
PersistentVolumeClaim default/jammy-pvc created
Waiting for PVC debian12-pvc upload pod to be ready...
Pod now ready
Uploading data to https://10.43.88.176:443

 778.50 MiB / 778.50 MiB [==========================================================] 100.00% 1s

Uploading data completed successfully, waiting for processing to complete, you can hit ctrl-c without interrupting the progress
Processing completed successfully
Uploading lunar-server-cloudimg-amd64.img completed successfully

real    0m25.071s
user    0m0.203s
sys     0m0.291s

This is all irrelevant though, because the actual issue I have is that time-to-creation of a new PVC via the CDI when using a declarative workflow is consistently over 10 minutes, regardless of image size, URL, disk speed, download bandwidth, or CPU core count.

Something is wrong within the containerized CDI process that is not present on the command line.

@rwmjones
Contributor

real 2m48.143s

Which version of nbdkit is this?

@cloudymax
Author

cloudymax commented Jul 25, 2023

real 2m48.143s

Which version of nbdkit is this?

On the host the version is nbdkit 1.32.5 and it takes 1-2 minutes; it is the latest version of nbdkit available through Debian apt.

In the container it is nbdkit 1.34.1 (nbdkit-1.34.1-1.el9), which is the version provided by CDI 1.57-rc1. That one takes 10+ minutes.

Can you please test in a cluster using the actual CDI declarative flow and not on the command line?

@awels
Member

awels commented Jul 25, 2023

This is what is started inside the CDI pod
Start nbdkit with: ['--foreground' '--readonly' '--exit-with-parent' '-U' '/tmp/nbdkit.sock' '--pidfile' '/tmp/nbdkit.pid' '--filter=readahead' '--filter=retry' '-r' 'curl' 'header=User-Agent: cdi-nbdkit-importer' 'retry-exponential=no' 'url=https://cloud-images.ubuntu.com/jammy/current/jammy-server-cloudimg-amd64.img']

@rwmjones
Contributor

Thanks for that. I believe my local tests are limited by my network, so I grabbed a Beaker machine which I can use instead which has a much better internet connection (and I think is physically located in the US). I can reproduce the issue much more easily now.

@awels
Member

awels commented Jul 25, 2023

I eliminated nbdkit from the equation and ran the Go http client to directly get the data from inside the pod. It was still slow until I increased the amount of memory available to the pod, so I used this YAML to create the DataVolume:

apiVersion: cdi.kubevirt.io/v1beta1
kind: DataVolume
metadata:
  name: jammy-pvc
  annotations:
    cdi.kubevirt.io/storage.bind.immediate.requested: "true"
spec:
  source:
    http:
      # certConfigMap with an empty cert is to work around this issue:
      # https://github.com/kubevirt/containerized-data-importer/issues/2358
      certConfigMap: empty-cert
      url: https://cloud-images.ubuntu.com/jammy/current/jammy-server-cloudimg-amd64.img
  pvc:
    accessModes:
      - ReadWriteOnce
    resources:
      requests:
        storage: 5Gi
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: empty-cert
data:
  ca.pem: ""

And I increased the amount of memory allocated to the CDI importer pod by modifying the CDI CR to look like this:

apiVersion: cdi.kubevirt.io/v1beta1
kind: CDI
metadata:
  name: cdi
spec:
  config:
    featureGates:
    - HonorWaitForFirstConsumer
    podResourceRequirements:
      limits:
        cpu: "4"
        memory: 2Gi
      requests:
        cpu: "1"
        memory: 250Mi
  imagePullPolicy: IfNotPresent
  infra:
    nodeSelector:
      kubernetes.io/os: linux
    tolerations:
    - key: CriticalAddonsOnly
      operator: Exists
  workload:
    nodeSelector:
      kubernetes.io/os: linux

In particular, note the podResourceRequirements, which increased the amount of memory and CPU available. I went a little overboard; it doesn't require that much, but apparently the defaults are just not enough. This allowed me to import using the Go http client into the scratch space, and convert from there. It took maybe a minute, if that, on my local machine, which has a 1G fiber connection. The empty cert is to trick CDI into thinking it cannot use nbdkit and fall back to the scratch-space method, which is a straight http download and then convert.

@awels
Member

awels commented Jul 25, 2023

Thanks for that. I believe my local tests are limited by my network, so I grabbed a Beaker machine which I can use instead which has a much better internet connection (and I think is physically located in the US). I can reproduce the issue much more easily now.

Note that the client CDI is using is qemu-img, which is likely the problem in this scenario... 8.0.0-4 is the current version for 1.57

@rwmjones
Contributor

Quick question: Does the container environment have broken IPv6? The Ubuntu URL resolves to IPv6 then IPv4, and curl tries the IPv6 addresses first. In the Beaker test environment IPv6 is broken and this causes a delay. I added a feature upstream to force IPv4 which improves performance.

Anyway, I'm on holiday today, will take another look at this tomorrow.

@cloudymax
Author

Quick question: Does the container environment have broken IPv6? The Ubuntu URL resolves to IPv6 then IPv4, and curl tries the IPv6 addresses first. In the Beaker test environment IPv6 is broken and this causes a delay. I added a feature upstream to force IPv4 which improves performance.

Anyway, I'm on holiday today, will take another look at this tomorrow.

I can't provide an authoritative answer here. I can say that the host machine has functional IPv6 (it's the default for Hetzner instances). I'll try to get some more info on this 👍

@rwmjones
Contributor

I reimplemented the plugin using curl "multi" handle support, and the results are way better: https://listman.redhat.com/archives/libguestfs/2023-July/032167.html (see patch 2 for some test timings)

@cloudymax
Author

apiVersion: cdi.kubevirt.io/v1beta1
kind: DataVolume
metadata:
  name: jammy-pvc
  annotations:
    cdi.kubevirt.io/storage.bind.immediate.requested: "true"
spec:
  source:
    http:
      # certConfigMap with an empty cert is to work around this issue:
      # https://github.com/kubevirt/containerized-data-importer/issues/2358
      certConfigMap: empty-cert
      url: https://cloud-images.ubuntu.com/jammy/current/jammy-server-cloudimg-amd64.img
  pvc:
    accessModes:
      - ReadWriteOnce
    resources:
      requests:
        storage: 5Gi
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: empty-cert
data:
  ca.pem: ""

I was able to replicate this on my side and get 1:02 for the PVC creation, which gets me unblocked on my end for now. Thanks a bunch!

@cloudymax
Author

I reimplemented the plugin using curl "multi" handle support, and the results are way better: https://listman.redhat.com/archives/libguestfs/2023-July/032167.html (see patch 2 for some test timings)

awesome work 🙌

@rwmjones
Contributor

rwmjones commented Aug 1, 2023

nbdkit changes are upstream in 1.35.8, mainly this commit: https://gitlab.com/nbdkit/nbdkit/-/commit/a74b289ee15a7c75dceb8a96403f1fa8ebd72e88

I did some tests with the URLs under discussion and we are better than wget for the Debian and Ubuntu cases. The Windows URL is weird, more below.

https://cloud-images.ubuntu.com/lunar/current/lunar-server-cloudimg-amd64.img
(HTTP/1.1)

nbdkit before change: 180s
nbdkit after change, connections=16: 42.4s
nbdkit after change, connections=32: 21.6s
wget: 20.2s
nbdkit after change, connections=64: 17.2s

https://gemmei.ftp.acc.umu.se/images/cloud/bookworm/daily/latest/debian-12-generic-amd64-daily.qcow2
(HTTP/1.1)

nbdkit before change: 100.6s
nbdkit after change, connections=16: 23.7s
wget: 20.9s
nbdkit after change, connections=32: 14.0s
nbdkit after change, connections=64: 9.1s

https://www.itechtics.com/?dl_id=173
This URL redirects twice, one to windowstan.com which is proxied by cloudflare, and a second time to software.download.prss.microsoft.com. Unfortunately the second redirect is not an actual HTTP redirect, but a message in the body of the 502 error message that cloudflare frequently sends if you attempt any automated download. So I did these tests using the microsoft.com URL directly. The microsoft site uses HTTP/2.

nbdkit before change if you use the windowstan.com URL: over 80 minutes, and very unreliable
nbdkit before change if you use the microsoft.com URL: 3m03
nbdkit after change, connections=1: 3m24
nbdkit after change, connections=2: 3m05
nbdkit after change, connections=16: 3m04
wget: 1m04

Some tips:

  • "Pre-resolving" the URLs helps greatly.
  • If IPv6 is broken, use ipresolve=v4 to force IPv4 which will save a bit of time.
  • The connections parameter controls the number of HTTP connections to the remote server. Reducing it puts less load on (is friendlier to) the server; increasing it makes things faster. The default is 16.
  • HTTP/2 is way more efficient than HTTP/1.1. We will now use HTTP/2 multiplexing, usually up to 100 streams per HTTP connection. curl ought to default to HTTP/2 automatically if available.

Oh and we found and fixed an actual bug in curl: curl/curl#11557

@awels
Member

awels commented Aug 1, 2023

Great. Unfortunately, the latest version available in CentOS 9 Stream seems to be 1.34.1, and since our container is built using a CentOS 9 Stream image we can't grab 1.35.8 yet.

@rwmjones
Contributor

rwmjones commented Aug 1, 2023

Totally understood. If you would like to test this, I could do a scratch build in CentOS Stream -- if you have a way to consume that.

For getting the change into RHEL in general we'll need a BZ. I would like to have the change sit in Fedora for a while to get some testing, as it's an almost complete rewrite of the plugin.

@awels
Member

awels commented Aug 1, 2023

We have some automation that essentially just pulls the latest released RPMs and updates our build to use that. So we don't really have a good way to consume scratch builds like that. So for now I think we will use scratch space, and once we can consume the fix, we can re-enable nbdkit for those flows where it makes sense.

@rwmjones
Contributor

rwmjones commented Aug 1, 2023

Do you (or can you) pull from C9S repos? I could provide an update there very quickly. It's getting it into RHEL which is the hard part.

@awels
Member

awels commented Aug 1, 2023

We normally pull from http://mirror.stream.centos.org/9-stream/

@rwmjones
Contributor

rwmjones commented Aug 1, 2023

Ok, let me see what I can do today/tomorrow.

@rwmjones
Contributor

rwmjones commented Aug 1, 2023

There should be something to test here in about an hour: https://kojihub.stream.centos.org/koji/taskinfo?taskID=2603284

I'm not sure exactly if those builds go directly into http://mirror.stream.centos.org/9-stream/ or if they have to go through RHEL OSCI processes first. I guess in the best case there will be at least some kind of compose delay which probably takes a day or two.

@awels
Member

awels commented Aug 1, 2023

Cool, I will check in a few days then. In the meantime I'm switching to scratch space for 1.57, and once that is available I will use nbdkit again in main, and likely 1.58.

@awels
Member

awels commented Aug 4, 2023

So I grabbed the nbdkit version from there and I am trying it, and I think I am missing some plugins or filters. Which plugins/filters should I use?

@rwmjones
Contributor

rwmjones commented Aug 4, 2023

I think only nbdkit-retry-request-filter is relevant. It will retry single requests if the server is unavailable.

Note that performance is still going to suck unless you "pre-resolve" the URLs. Suggested above was a command to do this:

curl -LIs -o /dev/null -w %{url_effective} URL

(or equivalent libcurl APIs).

Edit: So re-reading the question, I wonder if you didn't have the separate nbdkit-curl-plugin package installed? Since curl is an extra dependency (ie. the libcurl library) we don't include the curl plugin in the basic-plugins package.

Edit #2: It looks as if the new version isn't present on https://mirror.stream.centos.org/9-stream/AppStream/x86_64/os/Packages/ yet. The new package is 1.34.2-1.el9

@awels
Member

awels commented Aug 4, 2023

I grabbed the following rpms since we were using them before:

https://kojihub.stream.centos.org/kojifiles/packages/nbdkit/1.34.2/1.el9/x86_64/nbdkit-basic-filters-1.34.2-1.el9.x86_64.rpm
https://kojihub.stream.centos.org/kojifiles/packages/nbdkit/1.34.2/1.el9/x86_64/nbdkit-curl-plugin-1.34.2-1.el9.x86_64.rpm
https://kojihub.stream.centos.org/kojifiles/packages/nbdkit/1.34.2/1.el9/x86_64/nbdkit-gzip-filter-1.34.2-1.el9.x86_64.rpm
https://kojihub.stream.centos.org/kojifiles/packages/nbdkit/1.34.2/1.el9/x86_64/nbdkit-server-1.34.2-1.el9.x86_64.rpm

I don't think we are using the gzip filter anymore, but I grabbed it just in case. Pre-resolving made it a little faster, but not significantly. If I curl the image it takes about 7 seconds locally. I still think the problem is likely in the qemu-img we are using.

@awels
Member

awels commented Aug 4, 2023

This is the nbdkit command running in the pod I0804 19:36:36.403006 1 nbdkit.go:251] Start nbdkit with: ['--foreground' '--readonly' '--exit-with-parent' '-U' '/tmp/nbdkit.sock' '--pidfile' '/tmp/nbdkit.pid' '--filter=cache' '--filter=readahead' '--filter=retry' '-r' 'curl' 'header=User-Agent: cdi-nbdkit-importer' 'retry-exponential=no' 'url=https://southfront.mm.fcix.net/fedora/linux/releases/38/Cloud/x86_64/images/Fedora-Cloud-Base-38-1.6.x86_64.qcow2']

@rwmjones
Contributor

rwmjones commented Aug 4, 2023

I think you are right about qemu-img unfortunately (but see below).

Using https://southfront.mm.fcix.net/fedora/linux/releases/38/Cloud/x86_64/images/Fedora-Cloud-Base-38-1.6.x86_64.qcow2, from a machine in Beaker, I get:

wget: 14.3
nbdkit (old, nbdcopy): 55.3
nbdkit (new, nbdcopy, connections=16): 15.5
nbdkit (new, nbdcopy, connections=32): 9.9
nbdkit (new, nbdcopy, connections=64): 8.2
nbdkit (old, qemu-img): 5m05
nbdkit (new, qemu-img, connections=16): 4m09
nbdkit (new, qemu-img, connections=32): 4m16

Test command:

time nbdkit -r -U - curl 'https://southfront.mm.fcix.net/fedora/linux/releases/38/Cloud/x86_64/images/Fedora-Cloud-Base-38-1.6.x86_64.qcow2' --run 'nbdcopy -p $uri null:' connections=XX

(for qemu tests replace nbdcopy with qemu-img convert ... /tmp/out)

nbdkit has a qcow2 filter now. Using this filter to do the conversion with nbdcopy to do the copy:

nbdkit (new, qcow2dec, nbdcopy, connections=16): 1m25
nbdkit (new, qcow2dec, nbdcopy, connections=32): 41.5
nbdkit (new, qcow2dec, nbdcopy, connections=64): 25.0

wget + qemu-img convert to raw takes 20 seconds, but I didn't spend any time optimizing the qcow2dec filter.

About the command line ...

I would get rid of --filter=cache --filter=readahead as they are unlikely to be doing anything useful with the new plugin.

--filter=retry is fine, but --filter=retry-request might be used instead, although for the new curl filter it'll make no difference since all requests are now stateless and handled by a background thread.

-r and --readonly both do the same thing, but it's not a problem to have both.

-U /tmp/nbdkit.sock --pidfile /tmp/nbdkit.pid - placing these files in well-known locations is risky (insecure temporary file vulnerability). I guess this is in a container so nothing else ought to be running, but maybe defence in depth would be better.

@awels
Member

awels commented Aug 15, 2023

@cloudymax Just released 1.57.0, which should not require the 'cert' hack shown above. It should be a similar speed.
