Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Upgrade base image to centos9 and support both docker on docker and p… #4036

Merged

Conversation

michaellevy101
Copy link
Contributor

@michaellevy101 michaellevy101 commented Jun 28, 2022

For NMState performance improvement, we are upgrading our base images to Centos9 (Assisted Service and Assisted Service build).
This PR deals with the Assisted Service build container only.
Assisted Service build images create containers (e.g. Postgres databases) using a container tool.
Currently assisted-service fully support docker as a container tool. This PR supports using podman as a container tool.
The idea is to expose the host socket to the container and the container uses a thin client to work against the host.
The container detects the right tool by skipper CONTAINER_RUNTIME_COMMAND env parameter (automatically sets by skipper) or CONTAINER_TOOL (manually sets by the user) and then for podman find the right version.
A podman4 cli client (podman-remote or podman -r) does not have backward compatibility and cannot be used with a podman3 server.
We overcome this by installing two podman-remote versions (3 and 4) and selecting the correct one at runtime.

List all the issues related to this PR

  • New Feature
  • Enhancement
  • Bug fix
  • Tests
  • Documentation
  • CI/CD

What environments does this code impact?

  • Automation (CI, tools, etc)
  • Cloud
  • Operator Managed Deployments
  • None

How was this code tested?

  • assisted-test-infra environment
  • dev-scripts environment
  • Reviewer's test appreciated
  • Waiting for CI to do a full test run
  • Manual (Elaborate on how it was tested)
  • No tests needed

Assignees

/cc @
/cc @

Checklist

  • Title and description added to both, commit and PR.
  • Relevant issues have been associated (see CONTRIBUTING guide)
  • Reviewers have been listed
  • This change does not require a documentation update (docstring, docs, README, etc)
  • Does this change include unit-tests (note that code changes require unit-tests)

Reviewers Checklist

  • Are the title and description (in both PR and commit) meaningful and clear?
  • Is there a bug required (and linked) for this change?
  • Should this PR be backported?

@michaellevy101
Copy link
Contributor Author

/hold not ready yet

@openshift-ci openshift-ci bot added do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jun 28, 2022
@openshift-ci openshift-ci bot added kind/dependency-change Categorizes issue or PR as related to changing dependencies approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Jun 28, 2022
@michaellevy101 michaellevy101 force-pushed the centos9-upgrade branch 2 times, most recently from d948a04 to e21815a Compare June 28, 2022 15:32
Copy link
Contributor

@nmagnezi nmagnezi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you end up with:
No package docker-ce-cli available.

because openshift/release swapped the image with something else. Please check.

@openshift-ci openshift-ci bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 29, 2022
@openshift-ci openshift-ci bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 29, 2022
@michaellevy101 michaellevy101 force-pushed the centos9-upgrade branch 3 times, most recently from 4b20a5a to daab395 Compare June 30, 2022 17:50
@openshift-ci openshift-ci bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 4, 2022
@osherdp
Copy link
Contributor

osherdp commented Jul 4, 2022

/retest

@osherdp
Copy link
Contributor

osherdp commented Jul 4, 2022

Ah merd, didn't notice there are git conflicts

@openshift-ci openshift-ci bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 5, 2022
@openshift-ci openshift-ci bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 7, 2022
@michaellevy101 michaellevy101 force-pushed the centos9-upgrade branch 3 times, most recently from 4191709 to 9e1a848 Compare July 7, 2022 14:10
@openshift-ci openshift-ci bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 7, 2022
@michaellevy101
Copy link
Contributor Author

/retest

@nmagnezi
Copy link
Contributor

I'd like to update you that the latest PR revision worked okay for me locally.
Tested with:

  1. Docker installed on the host, running docker commands via non-root user (sudoer)
  2. Skipper version 1.29.2

Tested okay:

  1. lint
  2. unit
  3. subsystem

Dockerfile.assisted-service-build Outdated Show resolved Hide resolved
docs/dev/podman.md Outdated Show resolved Hide resolved
@osherdp
Copy link
Contributor

osherdp commented Jul 25, 2022

For some reason skipper make deploy-test is stuck for me (with podman) on:

[3/4] STEP 14/15: RUN CGO_ENABLED=0 GOFLAGS="" GO111MODULE=on go build -o /build/assisted-service-admission cmd/webadmission/main.go
--> Using cache 5dd66ad8b03c4f810ac7e86c31da0703c60c3e2ad7a5cbf40c3087b7b7cc1d9f
--> 5dd66ad8b03
[3/4] STEP 15/15: RUN CGO_ENABLED=0 GOFLAGS="" GO111MODULE=on go build -o /build/agent-installer-client cmd/agentbasedinstaller/client/main.go
--> Using cache 5192ded21e6c450ef948d232992048f442042ca29334243faca35c63a9050d08
--> 5192ded21e6
[4/4] STEP 1/16: FROM quay.io/centos/centos:stream8
[4/4] STEP 2/16: RUN dnf install -y libvirt-libs nmstate &&    dnf clean all

@michaellevy101 michaellevy101 force-pushed the centos9-upgrade branch 3 times, most recently from 2d4afef to db67afc Compare July 25, 2022 08:30
@michaellevy101
Copy link
Contributor Author

/hold updating test-infra
openshift/assisted-test-infra#1767

@michaellevy101 michaellevy101 force-pushed the centos9-upgrade branch 3 times, most recently from b71a58e to c007e1a Compare July 25, 2022 13:40
@nmagnezi
Copy link
Contributor

Tested the latest revision. found some issues:

skipper make lint
Tested okay.

skipper make unit-test

Successfully tagged assisted-service-build:latest
useradd: failure while writing changes to /etc/subuid
CONTAINER_RUNTIME_COMMAND doesn't set on old skipper version -> default to docker. Upgrade your skipper to the latest version docker ps -q --filter "name=postgres" | xargs -r CONTAINER_RUNTIME_COMMAND doesn't set on old skipper version -> default to docker. Upgrade your skipper to the latest version docker kill && sleep 3
/bin/sh: line 1: CONTAINER_RUNTIME_COMMAND: command not found
make: *** [Makefile:488: run-db-container] Error 127

skipper deploy-test

Successfully tagged assisted-service-build:latest
useradd: failure while writing changes to /etc/subuid
python3 ./tools/wait_for_cluster_info.py --namespace "assisted-installer"
2022-07-26 08:04:30.673117 DEBUG - kubectl --namespace assisted-installer cluster-info
Kubernetes control plane is running at https://192.168.49.2:8443

To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
mkdir -p /home/nmagnezi/go/src/github.com/assisted-service/build/assisted-installer
(cd ./tools && go run auth_keys_generator.go -keys-dir=/home/nmagnezi/go/src/github.com/assisted-service/build/assisted-installer)
Keys already generated. To re-generate, delete files: /home/nmagnezi/go/src/github.com/assisted-service/build/assisted-installer/auth-test-pub.json, /home/nmagnezi/go/src/github.com/assisted-service/build/assisted-installer/auth-test.json, /home/nmagnezi/go/src/github.com/assisted-service/build/assisted-installer/auth-token*String
# Temporary hack that updates the local k8s(e.g minikube) with the latest image.
# Should be replaced after installing a local registry
./hack/update_local_image.sh
+ source hack/utils.sh
++ kubectl -n assisted-installer config current-context
++ cut -d- -f1
+ CLUSTER_CONTEXT=minikube
+ case "${CLUSTER_CONTEXT}" in
+ command -v minikube
+++ get_container_runtime_command
+++ '[' -z '' ']'
+++ running_from_skipper
+++ '[' -n x ']'
+++ sed s/-remote//
hack/utils.sh: line 12: CONTAINER_RUNTIME_COMMAND: unbound variable
++ SHELL=/bin/bash
++ minikube -env
flag needs an argument: 'v' in -v
Usage of minikube:
      --add_dir_header                   If true, adds the file directory to the header of the log messages
      --alsologtostderr                  log to standard error as well as files
  -h, --help
      --log_backtrace_at traceLocation   when logging hits line file:N, emit a stack trace (default :0)
      --log_dir string                   If non-empty, write log files in this directory
      --log_file string                  If non-empty, use this log file
      --log_file_max_size uint           Defines the maximum size a log file can grow to. Unit is megabytes. If the value is 0, the maximum file size is unlimited. (default 1800)
      --logtostderr                      log to standard error instead of files (default true)
      --one_output                       If true, only write logs to their native severity level (vs also writing to each lower severity level)
      --skip_headers                     If true, avoid header prefixes in the log messages
      --skip_log_headers                 If true, avoid headers when opening log files
      --stderrthreshold severity         logs at or above this threshold go to stderr (default 2)
  -v, --v Level                          number for the log level verbosity
      --vmodule moduleSpec               comma-separated list of pattern=N settings for file-filtered logging
+ eval flag needs an argument: ''\''v'\''' in -v
++ flag needs an argument: v in -v
./hack/update_local_image.sh: line 40: flag: command not found
make: *** [Makefile:237: _update-local-k8s-image] Error 127

@michaellevy101 michaellevy101 force-pushed the centos9-upgrade branch 2 times, most recently from 51da617 to bd1e6bc Compare July 26, 2022 14:45
@michaellevy101
Copy link
Contributor Author

/cc @osherdp @adriengentil

docs/dev/podman.md Outdated Show resolved Hide resolved
docs/dev/podman.md Outdated Show resolved Hide resolved
docs/dev/podman.md Outdated Show resolved Hide resolved
@nmagnezi
Copy link
Contributor

Tested the latest revision with skipper 1.29.2 and docker:

  1. lint tested okay
  2. The subsystem did run; some tests fail, but it is probably unrelated. However, I do see some permission issues:
Summarizing 4 Failures:

[Fail] test authorization with cluster editor role [It] can delete cluster
/home/nmagnezi/go/src/github.com/assisted-service/subsystem/authz_test.go:124

[Fail] test authorization with cluster editor role [It] can update cluster
/home/nmagnezi/go/src/github.com/assisted-service/subsystem/authz_test.go:130

[Fail] test authorization regular user [It] can't update not owned cluster, can only read cluster
/home/nmagnezi/go/src/github.com/assisted-service/subsystem/authz_test.go:188

[Fail] Leader tests [It] cleaning old leader lock configmap
/home/nmagnezi/go/src/github.com/assisted-service/subsystem/leader_test.go:230

Ran 270 of 273 Specs in 2165.581 seconds
FAIL! -- 266 Passed | 4 Failed | 0 Pending | 3 Skipped

DONE 1 tests, 1 failure in 2168.453s
make[3]: Entering directory '/home/nmagnezi/go/src/github.com/assisted-service'
hack/utils.sh: line 13: /dev/stderr: Permission denied
hack/utils.sh: line 13: /dev/stderr: Permission denied
make _coverage
make[4]: Entering directory '/home/nmagnezi/go/src/github.com/assisted-service'
hack/utils.sh: line 13: /dev/stderr: Permission denied
hack/utils.sh: line 13: /dev/stderr: Permission denied
make[4]: Nothing to be done for '_coverage'.
make[4]: Leaving directory '/home/nmagnezi/go/src/github.com/assisted-service'
make[3]: Leaving directory '/home/nmagnezi/go/src/github.com/assisted-service'
make[2]: *** [Makefile:457: _test] Error 1
make[2]: Leaving directory '/home/nmagnezi/go/src/github.com/assisted-service'
make[1]: *** [Makefile:425: _run_subsystem_test] Error 2
make[1]: Leaving directory '/home/nmagnezi/go/src/github.com/assisted-service'
make: *** [Makefile:413: test] Error 2
[nmagnezi@rdu-infra-edge-03 assisted-service]$ client_loop: send disconnect: Broken pipe
  1. unit tests did run, also saw permission issues:
DONE 62 tests in 72.766s
make _post_unit_test
make[2]: Entering directory '/home/nmagnezi/go/src/github.com/assisted-service'
hack/utils.sh: line 13: /dev/stderr: Permission denied
hack/utils.sh: line 13: /dev/stderr: Permission denied
make _unit_test_coverage
make[3]: Entering directory '/home/nmagnezi/go/src/github.com/assisted-service'
hack/utils.sh: line 13: /dev/stderr: Permission denied
hack/utils.sh: line 13: /dev/stderr: Permission denied
make[3]: Nothing to be done for '_unit_test_coverage'.
make[3]: Leaving directory '/home/nmagnezi/go/src/github.com/assisted-service'
make[2]: Leaving directory '/home/nmagnezi/go/src/github.com/assisted-service'
make[1]: Leaving directory '/home/nmagnezi/go/src/github.com/assisted-service'
docker kill postgres

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Jul 31, 2022
@openshift-ci
Copy link

openshift-ci bot commented Jul 31, 2022

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: michaellevy101, osherdp

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [michaellevy101,osherdp]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci
Copy link

openshift-ci bot commented Jul 31, 2022

@michaellevy101: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/edge-e2e-ai-operator-ztp-converged daab395 link true /test edge-e2e-ai-operator-ztp-converged

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@michaellevy101
Copy link
Contributor Author

/retest

@michaellevy101
Copy link
Contributor Author

/unhold

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jul 31, 2022
@openshift-merge-robot openshift-merge-robot merged commit b894970 into openshift:master Jul 31, 2022
danielerez pushed a commit to danielerez/assisted-service that referenced this pull request Oct 15, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. kind/dependency-change Categorizes issue or PR as related to changing dependencies lgtm Indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants