Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve http import flow to decide whether to use scratch space or not #3219

Merged
merged 1 commit into from
Apr 26, 2024

Conversation

alromeros
Copy link
Collaborator

What this PR does / why we need it:

Follow up for #3212.

Due to some performance issues discussed in #2809, we stopped using nbdkit for most http imports and decided to use our custom scratch space method for most imports. This new behavior introduced minor differences in some specific flows, such as stopping the conversion of uncompressed raw images.

The conversion process made the actual size of raw images significantly smaller, probably because of qemu's handling of sparse images. The import of raw images without conversion ended up causing failures in some tests, as imported images had a significant increase in size.

This PR aims to fix this behavior by using scratch space with more import flows: All archived kubevirt imgs will be transfered direcly to file, while all non-archived kubevirt imgs will be imported to scratch.

Example:

Fresh image import before this PR:

sh-5.1$ qemu-img info disk.img 
image: disk.img
file format: raw
virtual size: 70 GiB (75161927680 bytes)
disk size: 55 GiB

Fresh image import after this PR (due to convert):

$ qemu-img info disk.img 
image: disk.img
file format: raw
virtual size: 70 GiB (75161927680 bytes)
disk size: 9.95 GiB

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes # https://issues.redhat.com/browse/CNV-36026

Special notes for your reviewer:

Check #2809 and #2832 for more context about the original change.
Release note:

Bugfix: Use scratch space when importing non-archived images 

@kubevirt-bot kubevirt-bot added dco-signoff: yes Indicates the PR's author has DCO signed all their commits. release-note Denotes a PR that will be considered when it comes time to generate release notes. labels Apr 22, 2024
case "xz":
r, err = fr.xzReader()
if err == nil {
fr.Archived = true
fr.ArchiveXz = true
}
case "qcow2":
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note for reviewer: Just moving this to group archived/converted formats together.

@@ -133,7 +133,7 @@ func (hs *HTTPDataSource) Info() (ProcessingPhase, error) {
if hs.contentType == cdiv1.DataVolumeArchive {
return ProcessingPhaseTransferDataDir, nil
}
if !hs.readers.Convert {
if hs.readers.Archived {
return ProcessingPhaseTransferDataFile, nil
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I say we transfer to scratch for archived as well so I think we can delete this if

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I fear we might uncover issues like the one this PR addressed (#2845), but I'm ok with the change. Let's see how tests behave.

@mhenriks
Copy link
Member

wonder if the number of cases in which we don't use scratch for import (I think just registry node pull?) is so limited that we should get rid of the "recreate pod with scratch space" flow? import controller can be in charge of deciding when scratch is NOT needed

Since we dropped the use of nbdkit we've started prioritizing the use of scratch space for most import flows.

However, this new behavior introduced minor differences such as stop converting raw images, which caused inconsistencies in some tests.

This commit improves the importer flow to better determine whether to use scratch space or not.

Signed-off-by: Alvaro Romero <alromero@redhat.com>
@alromeros
Copy link
Collaborator Author

/retest-required

@@ -133,19 +133,13 @@ func (hs *HTTPDataSource) Info() (ProcessingPhase, error) {
if hs.contentType == cdiv1.DataVolumeArchive {
return ProcessingPhaseTransferDataDir, nil
}
if !hs.readers.Convert {
return ProcessingPhaseTransferDataFile, nil
}
if pullMethod, _ := util.ParseEnvVar(common.ImporterPullMethod, false); pullMethod == string(cdiv1.RegistryPullNode) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Planning to try getting rid of nbdkit here? I don't think it's necessary

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Didn't this PR (#2845) address some bug for using scratch space with pull node imports? @akalenyu wdyt?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The major change in that PR is returning "convert" for registry node pull. Can still do that. I just don't think we need nbdkit

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't have much time to test this today but I added a commit to get rid of nbdkit for pull node. Let's see how tests behave and I'll do more testing tomorrow.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mhenriks seems the version of qemu-img we use doesn't support running convert/info on http endpoints, that's why we need either scratch space or nbdkit. It works on my local version though (odd since I'm using an older version), but since we expect to backport this I think we should keep using nbdkit here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That does not sound correct to me. What is the error you see? Can you check the logs to see what args we are we passing to qemu-img?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a simple qemu-img info --output=json http://localhost:8100/tinyCore.iso returns qemu-img: Could not open 'http://localhost:8100/tinyCore.iso': Unknown protocol 'http'. I've tried creating a debugging pod and manually running the command inside and it happens the same. However, running it locally works.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah the container must not have qemu-block-curl package.

Since updating rpms in old branches is annoying. We can hold off getting rid of nbdkit in this pr but I think we should get rid of it in a subsequent pr that is not backported

@alromeros
Copy link
Collaborator Author

alromeros commented Apr 26, 2024

wonder if the number of cases in which we don't use scratch for import (I think just registry node pull?) is so limited that we should get rid of the "recreate pod with scratch space" flow? import controller can be in charge of deciding when scratch is NOT needed

I like that idea, other than archive imports, vddk and registry node pull I think we can use scratch for every import type. We can leave that for other PR but good thing to consider for sure.

@mhenriks
Copy link
Member

/lgtm
/approve

Thanks @alromeros, this PR is something that can be backported. For the furure we should:

  1. Get rid of nbdkit, add qemu-img-block-curl
  2. Looks like import-controller can determine exactly when scratch space is required so let's get rid of the "pod restarts to get scratch space" machinery

@kubevirt-bot kubevirt-bot added the lgtm Indicates that a PR is ready to be merged. label Apr 26, 2024
@kubevirt-bot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: mhenriks

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@kubevirt-bot kubevirt-bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 26, 2024
@kubevirt-bot kubevirt-bot merged commit cd56b6c into kubevirt:main Apr 26, 2024
18 checks passed
@alromeros
Copy link
Collaborator Author

/cherrypick release-v1.57
/cherrypick release-v1.58
/cherrypick release-v1.59

@kubevirt-bot
Copy link
Contributor

@alromeros: new pull request created: #3222

In response to this:

/cherrypick release-v1.57
/cherrypick release-v1.58
/cherrypick release-v1.59

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@alromeros
Copy link
Collaborator Author

/cherrypick release-v1.58

@kubevirt-bot
Copy link
Contributor

@alromeros: new pull request created: #3223

In response to this:

/cherrypick release-v1.58

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@alromeros
Copy link
Collaborator Author

/cherrypick release-v1.59

@kubevirt-bot
Copy link
Contributor

@alromeros: new pull request created: #3224

In response to this:

/cherrypick release-v1.59

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. dco-signoff: yes Indicates the PR's author has DCO signed all their commits. lgtm Indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/S
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants