[release-v1.57] Ensure priority class is assigned to pod populating pvc prime #2871
* Enable empty schedule in DataImportCron (kubevirt#2711) Allow disabling DataImportCron schedule and support external trigger Signed-off-by: Ido Aharon <iaharon@redhat.com> * expand upon kubevirt#2721 (kubevirt#2731) Need to replace requeue bool with requeue duration Signed-off-by: Michael Henriksen <mhenriks@redhat.com> * Add clone from snapshot functionalities to clone-populator (kubevirt#2724) * Add clone from snapshot functionalities to the clone populator Signed-off-by: Alvaro Romero <alromero@redhat.com> * Update clone populator unit tests to cover clone from snapshot capabilities Signed-off-by: Alvaro Romero <alromero@redhat.com> * Fix storage class assignation in temp-source claim for host-assisted clone from snapshot This commit also includes other minor and styling-related fixes Signed-off-by: Alvaro Romero <alromero@redhat.com> --------- Signed-off-by: Alvaro Romero <alromero@redhat.com> * Prepare CDI testing for the upcoming non-CSI lane (kubevirt#2730) * Update functional tests to skip incompatible default storage classes Signed-off-by: Alvaro Romero <alromero@redhat.com> * Enable the use of non-csi HPP in testing lanes This commit modifies several scripts to allow the usage of classic HPP as the default SC in tests. This allows us to test our non-populator flow with a non-csi provisioner. 
Signed-off-by: Alvaro Romero <alromero@redhat.com> --------- Signed-off-by: Alvaro Romero <alromero@redhat.com> * Allow snapshots as format for DataImportCron created sources (kubevirt#2700) * StorageProfile API for declaring format of resulting cron disk images Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com> * Integrate recommended format in dataimportcron controller Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com> * Take snapclass existence into consideration when populating cloneStrategy and sourceFormat Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com> --------- Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com> * Remove leader election test (kubevirt#2745) Now that we are using the standard k8s leases from the controller runtime library, there is no need to test our implementation as it is no longer in use. This will save some testing time and random failures. Signed-off-by: Alexander Wels <awels@redhat.com> * Integration of Data volume using CDI populators (kubevirt#2722) * move cleanup out of dv deletion It seemed off to call cleanup in the prepare function just because we don't call cleanup unless the dv is deleting. Instead we check in the clenup function itself if it should be done: in this 2 specific cases in case of deletion and in case the dv succeeded. The cleanup will be used in future commit also for population cleanup which we also want to happen not only on deletion. Signed-off-by: Shelly Kagan <skagan@redhat.com> * Use populator if csi storage class exists Add new datavolume phase PendingPopulation to indicate wffc when using populators, this new phase will be used in kubevirt in order to know that there is no need for dummy pod to pass wffc phase and that the population will occur once creating the vm. Signed-off-by: Shelly Kagan <skagan@redhat.com> * Update population targetPVC with pvc prime annotations The annotations will be used to update dv that uses the populators. 
Signed-off-by: Shelly Kagan <skagan@redhat.com> * Adjust UT with new behavior Signed-off-by: Shelly Kagan <skagan@redhat.com> * updates after review Signed-off-by: Shelly Kagan <skagan@redhat.com> * Fix import populator report progress The import pod should be taken from pvcprime Signed-off-by: Shelly Kagan <skagan@redhat.com> * Prevent requeue upload dv when failing to find progress report pod Signed-off-by: Shelly Kagan <skagan@redhat.com> * Remove size inflation in populators The populators are handling existing PVCs. The PVC already has a defined requested size, inflating the PVC' with fsoverhead will only be on the PVC' spec and will not reflect on the target PVC, this seems undesired. Instead if the populators is using by PVC that the datavolume controller created the inflation will happen there if needed. Signed-off-by: Shelly Kagan <skagan@redhat.com> * Adjust functional tests to handle dvs using populators Signed-off-by: Shelly Kagan <skagan@redhat.com> * Fix clone test Signed-off-by: Shelly Kagan <skagan@redhat.com> * add shouldUpdateProgress variable to know if need to update progress Signed-off-by: Shelly Kagan <skagan@redhat.com> * Change update of annotation from denied list to allowed list Instead if checking if the annotation on pvcPrime is not desired go over desired list and if the annotation exists add it. 
Signed-off-by: Shelly Kagan <skagan@redhat.com> * fix removing annotations from pv when rebinding Signed-off-by: Shelly Kagan <skagan@redhat.com> * More fixes and UT Signed-off-by: Shelly Kagan <skagan@redhat.com> * a bit more updates and UTs Signed-off-by: Shelly Kagan <skagan@redhat.com> --------- Signed-off-by: Shelly Kagan <skagan@redhat.com> * Run bazelisk run //robots/cmd/uploader:uploader -- -workspace /home/prow/go/src/github.com/kubevirt/project-infra/../containerized-data-importer/WORKSPACE -dry-run=false (kubevirt#2751) Signed-off-by: kubevirt-bot <kubevirtbot@redhat.com> * Allow dynamic linked build for non bazel build (kubevirt#2753) The current script always passes the static ldflag to the compiler which will result in a static binary. We would like to be able to build dynamic libraries instead. cdi-containerimage-server has to be static because we are copying it into the context of a container disk container which is most likely based on a scratch container and has no libraries for us to use. Signed-off-by: Alexander Wels <awels@redhat.com> * Disable DV GC by default (kubevirt#2754) * Disable DV GC by default DataVolume garbage collection is a nice feature, but unfortunately it violates fundamental principle of Kubernetes. CR should not be auto-deleted when it completes its role (Job with TTLSecondsAfter- Finished is an exception), and once CR was created we can assume it is there until explicitly deleted. In addition, CR should keep idempotency, so the same CR manifest can be applied multiple times, as long as it is a valid update (e.g. DataVolume validation webhook does not allow updating the spec). When GC is enabled, some systems (e.g GitOps / ArgoCD) may require a workaround (DV annotation deleteAfterCompletion = "false") to prevent GC and function correctly. 
On the next kubevirt-bot Bump kubevirtci PR (with bump-cdi), it will fail on all kubevirtci lanes with tests referring DVs, as the tests IsDataVolumeGC() looks at CDIConfig Spec.DataVolumeTTLSeconds and assumes default is enabled. This should be fixed there. Signed-off-by: Arnon Gilboa <agilboa@redhat.com> * Fix test waiting for PVC deletion with UID Signed-off-by: Arnon Gilboa <agilboa@redhat.com> * Fix clone test assuming DV was GCed Signed-off-by: Arnon Gilboa <agilboa@redhat.com> * Fix DIC controller DV/PVC deletion when snapshot is ready Signed-off-by: Arnon Gilboa <agilboa@redhat.com> --------- Signed-off-by: Arnon Gilboa <agilboa@redhat.com> --------- Signed-off-by: Ido Aharon <iaharon@redhat.com> Signed-off-by: Michael Henriksen <mhenriks@redhat.com> Signed-off-by: Alvaro Romero <alromero@redhat.com> Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com> Signed-off-by: Alexander Wels <awels@redhat.com> Signed-off-by: Shelly Kagan <skagan@redhat.com> Signed-off-by: kubevirt-bot <kubevirtbot@redhat.com> Signed-off-by: Arnon Gilboa <agilboa@redhat.com> Co-authored-by: Ido Aharon <iaharon@redhat.com> Co-authored-by: Michael Henriksen <mhenriks@redhat.com> Co-authored-by: alromeros <alromero@redhat.com> Co-authored-by: akalenyu <akalenyu@redhat.com> Co-authored-by: Shelly Kagan <skagan@redhat.com> Co-authored-by: kubevirt-bot <kubevirtbot@redhat.com> Co-authored-by: Arnon Gilboa <agilboa@redhat.com>
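The change of default described above can be pictured with a minimal Go sketch. It assumes, hypothetically, that garbage collection counts as enabled only when `DataVolumeTTLSeconds` is explicitly set to a non-negative value; the `cdiConfigSpec` type and the `isDataVolumeGCEnabled` helper are illustrative stand-ins, not CDI's actual types or functions.

```go
package main

import "fmt"

// cdiConfigSpec is a hypothetical stand-in for the relevant part of the
// CDIConfig spec; only the field named in the commit message is modeled.
type cdiConfigSpec struct {
	DataVolumeTTLSeconds *int32
}

// isDataVolumeGCEnabled assumes GC is off unless a non-negative TTL is set
// explicitly. This is an assumption about the new default, not CDI's code.
func isDataVolumeGCEnabled(spec cdiConfigSpec) bool {
	return spec.DataVolumeTTLSeconds != nil && *spec.DataVolumeTTLSeconds >= 0
}

func main() {
	ttl := int32(0)
	// No TTL configured: under the assumed default, GC is disabled.
	fmt.Println(isDataVolumeGCEnabled(cdiConfigSpec{}))
	// TTL explicitly set: GC is enabled.
	fmt.Println(isDataVolumeGCEnabled(cdiConfigSpec{DataVolumeTTLSeconds: &ttl}))
}
```

A test helper that treats an unset TTL as "enabled" would give the opposite answer for the first case, which is the kind of mismatch the commit message warns about for the kubevirtci lanes.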
We were hitting a panic because we're passing the original DV object to renderPvcSpec (source is nil) instead of the mutated DV which has sourceRef converted (source not nil) Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com> Co-authored-by: Alex Kalenyuk <akalenyu@redhat.com>
Signed-off-by: Arnon Gilboa <agilboa@redhat.com> Co-authored-by: Arnon Gilboa <agilboa@redhat.com>
Signed-off-by: Arnon Gilboa <agilboa@redhat.com> Co-authored-by: Arnon Gilboa <agilboa@redhat.com>
kubevirt#2783) * remove CSI clone bye bye Signed-off-by: Michael Henriksen <mhenriks@redhat.com> * no more smart clone Signed-off-by: Michael Henriksen <mhenriks@redhat.com> * PVC clone same namespace Signed-off-by: Michael Henriksen <mhenriks@redhat.com> * cross namespace pvc clone Signed-off-by: Michael Henriksen <mhenriks@redhat.com> * various fixes to get some functional tests to work Signed-off-by: Michael Henriksen <mhenriks@redhat.com> * delete smart clone controller again somehow reappeared after rebase Signed-off-by: Michael Henriksen <mhenriks@redhat.com> * mostly pvc clone functional test fixes make sure size detect pod only runs on kubevirt content type clone populator was skipping last round of applying pvc' annotations various func test fixes review comments Signed-off-by: Michael Henriksen <mhenriks@redhat.com> * more various test fixes host clone phase should (implicitly) wait for clone source pod to exit Signed-off-by: Michael Henriksen <mhenriks@redhat.com> * remove "smart" clone from snapshot Signed-off-by: Michael Henriksen <mhenriks@redhat.com> * DataVolume clone from snapshot uses populator Signed-off-by: Michael Henriksen <mhenriks@redhat.com> * improve clone populator/datavolume coordination on "running" condition For host clone, not much changes, values still coming from annotations on host clone PVC For smart/csi clone the DataVolume will be "running" if not in pending or error phase Will have the same values for terminal "completed" state regardless of clone type Signed-off-by: Michael Henriksen <mhenriks@redhat.com> * unit tests for pvc/snapshot clone controllers Signed-off-by: Michael Henriksen <mhenriks@redhat.com> * remove skipped test added in kubevirt#2759 Signed-off-by: Michael Henriksen <mhenriks@redhat.com> * attempt address AfterSuite and generate-verify failures Signed-off-by: Michael Henriksen <mhenriks@redhat.com> * handle snapshot clone with no target size specified also add more validation to some snapshot clone tests 
Signed-off-by: Michael Henriksen <mhenriks@redhat.com> * remove Patch calls Using the controller runtime Patch API with controller runtime cached client seems to be a pretty bad fit At least given the way the CR API is designed where an old object is compared to new. I like patch in theory though and will revisit Signed-off-by: Michael Henriksen <mhenriks@redhat.com> * Clone populator should plan and execute even if PVC is bound It was possible to miss "preallocation applied" annotation otherwise Signed-off-by: Michael Henriksen <mhenriks@redhat.com> * add long term token to datavolume Signed-off-by: Michael Henriksen <mhenriks@redhat.com> * Rename ProgressReporter to StatusReporter Should have been done back when annotations were added to "progress" Also, if pvc is bound do not call phase Reconcile functions only Status Signed-off-by: Michael Henriksen <mhenriks@redhat.com> --------- Signed-off-by: Michael Henriksen <mhenriks@redhat.com> Co-authored-by: Michael Henriksen <mhenriks@redhat.com>
…virt#2785) * dataimportcron: Pass dynamic credential support label (kubevirt#2760) * dataimportcron: code change: Use better matchers in tests Signed-off-by: Andrej Krejcir <akrejcir@redhat.com> * dataimportcron: Pass dynamic credential support label The label is passed from DataImportCron to DataVolume and DataSource. Signed-off-by: Andrej Krejcir <akrejcir@redhat.com> --------- Signed-off-by: Andrej Krejcir <akrejcir@redhat.com> * Add DataImportCron snapshot sources docs (kubevirt#2747) Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com> * add akalenyu as approver, some others as reviewers (kubevirt#2766) Signed-off-by: Michael Henriksen <mhenriks@redhat.com> * Run `make rpm-deps` (kubevirt#2741) * Run make rpm-deps Signed-off-by: Maya Rashish <mrashish@redhat.com> * Avoid overlayfs error message by using vfs driver Signed-off-by: Maya Rashish <mrashish@redhat.com> --------- Signed-off-by: Maya Rashish <mrashish@redhat.com> * Fix Destructive test lane failure - missing pod following recreate of CDI (kubevirt#2744) Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com> * [WIP] Handle nil ptr in dataimportcron controller (kubevirt#2769) Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com> * Revert some gomega error checking that produce confusing output (kubevirt#2772) One of these tests flakes, but the error is hard to debug because gomega will yell about `Unexpected non-nil/non-zero argument at index 0` Instead of showing the error. 
Apparently this is intended: https://github.com/onsi/gomega/pull/480/files#diff-e696deff1a5be83ad03053b772926cb325cede3b33222fa76c2bb1fcf2efd809R186-R190 Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com> * Run bazelisk run //robots/cmd/uploader:uploader -- -workspace /home/prow/go/src/github.com/kubevirt/project-infra/../containerized-data-importer/WORKSPACE -dry-run=false (kubevirt#2770) Signed-off-by: kubevirt-bot <kubevirtbot@redhat.com> * [CI] Add metrics name linter (kubevirt#2774) Signed-off-by: Aviv Litman <alitman@redhat.com> --------- Signed-off-by: Andrej Krejcir <akrejcir@redhat.com> Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com> Signed-off-by: Michael Henriksen <mhenriks@redhat.com> Signed-off-by: Maya Rashish <mrashish@redhat.com> Signed-off-by: kubevirt-bot <kubevirtbot@redhat.com> Signed-off-by: Aviv Litman <alitman@redhat.com> Co-authored-by: Andrej Krejcir <akrejcir@gmail.com> Co-authored-by: Michael Henriksen <mhenriks@redhat.com> Co-authored-by: Maya Rashish <mrashish@redhat.com> Co-authored-by: kubevirt-bot <kubevirtbot@redhat.com> Co-authored-by: Aviv Litman <64130977+avlitman@users.noreply.github.com>
…rs (kubevirt#2793) * Allow ImmediateBind annotation when using populators When a PVC that uses populators has this annotation, we do not wait for it to be scheduled and proceed with the process. When a DataVolume that uses populators has the annotation, it is passed to the PVC. We do not enter PendingPopulation when the created PVC has the annotation. In addition, when the honorWaitForFirstConsumer feature gate is disabled, we put the immediateBind annotation on the target PVC. Populators can now be used when the annotation is present or the feature gate is disabled. Signed-off-by: Shelly Kagan <skagan@redhat.com> * Add functional tests to population using PVCs Signed-off-by: Shelly Kagan <skagan@redhat.com> * Support immediate binding with clone datavolume Signed-off-by: Shelly Kagan <skagan@redhat.com> * Pass allowed annotations from target pvc to pvc prime These annotations are used by the import/upload/clone pods to define network configurations. Signed-off-by: Shelly Kagan <skagan@redhat.com> --------- Signed-off-by: Shelly Kagan <skagan@redhat.com> Co-authored-by: Shelly Kagan <skagan@redhat.com>
Signed-off-by: Alexander Wels <awels@redhat.com>
Signed-off-by: Alvaro Romero <alromero@redhat.com> Co-authored-by: Alvaro Romero <alromero@redhat.com>
Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com> Co-authored-by: Alex Kalenyuk <akalenyu@redhat.com>
This may be the reason for the flakes we're seeing: https://storage.googleapis.com/kubevirt-prow/pr-logs/pull/kubevirt_containerized-data-importer/2774/pull-containerized-data-importer-e2e-hpp-latest/1675408112015642624/artifacts/1_pvs.log With GC on we may want to delete the DV. We may have more cases like this one scattered around, so let's keep an eye Not sure how populators DVs handle that situation but we see that the test PV is simply gone from artifacts Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com> Co-authored-by: Alex Kalenyuk <akalenyu@redhat.com>
* Add documentation for cdi populators Signed-off-by: Shelly Kagan <skagan@redhat.com> * populators doc updates after review Signed-off-by: Shelly Kagan <skagan@redhat.com> --------- Signed-off-by: Shelly Kagan <skagan@redhat.com> Co-authored-by: Shelly Kagan <skagan@redhat.com>
Per recommendation from ceph CSI team, switching over to doing CSI clones by default for ceph provisioners. Implementation is very similar to today's smart cloning via temporary snapshot. Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com> Co-authored-by: Alex Kalenyuk <akalenyu@redhat.com>
…; add missing events (kubevirt#2824) * Annotate PVC with host-assisted clone fallback reason and add missing events Signed-off-by: Arnon Gilboa <agilboa@redhat.com> * Add PVC clone fallback annotation and event when DV controller is using the non-CSI clone path Signed-off-by: Arnon Gilboa <agilboa@redhat.com> * Add func tests Signed-off-by: Arnon Gilboa <agilboa@redhat.com> --------- Signed-off-by: Arnon Gilboa <agilboa@redhat.com> Co-authored-by: Arnon Gilboa <agilboa@redhat.com>
regression introduced in kubevirt#2700 Signed-off-by: Arnon Gilboa <agilboa@redhat.com> Co-authored-by: Arnon Gilboa <agilboa@redhat.com>
… all cases (kubevirt#2833) This commit fixes a race condition in CDI populators so we attempt to get PVC' even if the population source doesn't exist. This new behavior allows us to correctly delete the PVC' once the population has succeeded, even if the population source doesn't exist anymore. Signed-off-by: Alvaro Romero <alromero@redhat.com>
…kubevirt#2841) * Remove older nbdkit * When converting, always import via scratch space instead of nbdkit. Once we are able to get nbdkit 1.35.8 or newer we can revert this change, since that version will include improvements to the download speed. * Disable metrics test for import because straight import doesn't return total, and this means the metrics are disabled. * Fix broken functional tests * Address review comments * Additional review comments. Fixed functional test that was not doing the right thing while running the test. * Always set preallocation on block devices when directly writing to the device --------- Signed-off-by: Alexander Wels <awels@redhat.com>
Signed-off-by: Michael Henriksen <mhenriks@redhat.com> Co-authored-by: Michael Henriksen <mhenriks@redhat.com>
* Adjust tests for WFFC ceph lane - We now support namespace transfer with WFFC due to using populators underneath - AnnCloneType/SourceInUse only appear when target binds - CloneFromSnapshotSourceInProgress only appears on non WFFC storage ATM Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com> * Specify wffc storage class variation of ceph explicitly for snapshot/csiclone/block Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com> * Increase test timeout Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com> --------- Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com> Co-authored-by: Alex Kalenyuk <akalenyu@redhat.com>
When importing via node container runtime cache, we always have the image handy locally. This manifests itself in the form of a bug where we loop over
```bash
E0813 13:32:38.443088 1 data-processor.go:251] scratch space required and none found
E0813 13:32:38.443102 1 importer.go:181] scratch space required and none found
```
on registry node pull imports where images are not raw
Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com> Co-authored-by: Alex Kalenyuk <akalenyu@redhat.com>
Signed-off-by: Alexander Wels <awels@redhat.com>
…d metrics names to meet the metrics naming conventions. The old metrics names will not be available after this fix. (kubevirt#2850) Signed-off-by: Aviv Litman <alitman@redhat.com> Co-authored-by: Aviv Litman <alitman@redhat.com>
Seen in practice when deleting CDI CR in issue kubevirt#2852 Signed-off-by: Maya Rashish <mrashish@redhat.com> Co-authored-by: Maya Rashish <mrashish@redhat.com>
…ubevirt#2863) * bump k8s.io/client-go dep for discovery client fixes k8s.io/client-go [v0.26.0, v0.26.3) was impacted by a regression in discovery client behavior kubernetes/kubernetes#118361 (comment) for details We are probably not hitting this due to not testing 1.27 upstream yet, or don't have the custom metric endpoints that send these nils in the response. (Reproduces on OpenShift ECs for example) * make generate --------- Signed-off-by: Alex Kalenyuk <akalenyu@redhat.com>
This commit ups the CPU request for all our installed components (cdi-deployment, cdi-apiserver, cdi-uploadproxy, cdi-operator) from 10m (1% of a core) to 100m (10% of a core). The main driver of this is BZ: 2216038. Without this change, it is pretty easy to create a large number of concurrent clone operations and get token timeout errors. Upping resource requests and concurrency addresses the issue in a very direct way. Signed-off-by: Michael Henriksen <mhenriks@redhat.com> Co-authored-by: Michael Henriksen <mhenriks@redhat.com>
Currently there is a repeating reconcile error due to exponential backoff: "DataVolume.storage spec is missing accessMode and no storageClass to choose profile", however the error is not needed as we already have SC watch for this case, e.g. when the DV is using the default SC but no default SC is configured yet. Signed-off-by: Arnon Gilboa <agilboa@redhat.com> Co-authored-by: Arnon Gilboa <agilboa@redhat.com>
…rt#2864) * Ensure priority class is assigned to pod populating pvc prime Priority class was not being passed to the pvc prime from the PVC and thus was not being added to the importer pod populating the pvc prime. There is a list of allowed annotations that can be passed and the priority class annotation was not in it. This commit adds the annotation to the allowed list. Cleaned up unneeded log argument to a function related to passing the annotations. Signed-off-by: Alexander Wels <awels@redhat.com> * Restore the logger as it was logging the pvc name as well as the log message. Modified the name to make it clearer this is the case. Signed-off-by: Alexander Wels <awels@redhat.com> * Have upload also use the populator, can't do clone because the pod disappears too quickly without retain Signed-off-by: Alexander Wels <awels@redhat.com> * Test for priority class in upload as well as fix typo in cloning test. Signed-off-by: Alexander Wels <awels@redhat.com> --------- Signed-off-by: Alexander Wels <awels@redhat.com>
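The allow-list mechanism this commit describes can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the annotation keys, the `allowedAnnotations` slice, and the `copyAllowedAnnotations` helper are all hypothetical names chosen for the example, and plain maps stand in for the PVC and pvc prime objects.

```go
package main

import "fmt"

// Hypothetical annotation keys, used for illustration only; the real keys
// are defined in the CDI source and may differ.
const (
	annPriorityClassName = "cdi.kubevirt.io/storage.pod.priorityclassname"
	annPodNetwork        = "k8s.v1.cni.cncf.io/networks"
)

// allowedAnnotations is the allow list: only these keys are copied from the
// target PVC to pvc prime. The fix amounts to adding the priority class key.
var allowedAnnotations = []string{annPodNetwork, annPriorityClassName}

// copyAllowedAnnotations walks the allow list and copies each key that is
// present on src into dst. Keys not on the list are never copied.
func copyAllowedAnnotations(src, dst map[string]string) {
	for _, key := range allowedAnnotations {
		if val, ok := src[key]; ok {
			dst[key] = val
		}
	}
}

func main() {
	targetPVC := map[string]string{
		annPriorityClassName:    "high-priority",
		"example.com/unrelated": "ignored",
	}
	pvcPrime := map[string]string{}
	copyAllowedAnnotations(targetPVC, pvcPrime)
	fmt.Println(len(pvcPrime), pvcPrime[annPriorityClassName])
}
```

Iterating over the allow list rather than over the source annotations keeps the copy safe by default: an annotation that nobody listed is simply never propagated, which is also why a missing entry shows up as a silently dropped priority class.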
Signed-off-by: Alexander Wels <awels@redhat.com>
PR needs rebase. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
@awels: The following tests failed, say
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
What this PR does / why we need it:
Manual backport of #2864
Which issue(s) this PR fixes (optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged):
Fixes #
Special notes for your reviewer:
Release note: