Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

multi-architecture builds and publish #1681

Merged
merged 11 commits into from
Jan 16, 2024

Conversation

jon-nfc
Copy link
Contributor

@jon-nfc jon-nfc commented Jan 8, 2024

SUMMARY

Add changes related to multi-architecture builds and publishing.

The desired endstate is that there is officially built and supported multi-architecture container images that can be built from a single command make docker-buildx and that on quay.io/ansible/awx-operator those same muli-architecture images are available for consumption.

ISSUE TYPE

  • New or Enhanced Feature

ADDITIONAL INFORMATION

This PR does not have the intention of changing the release workflow. The intent is that as part of the established workflow, that multi-architecture containers are built and published for use by the community.

Adjustment was made to ensure that there is ever only one image built for a given git commit and is used for the entire workflow. This prevents the scenario where it's as if a witch has casted a spell upon you, when at the end of the day there were two or more images that exist for the same commit and you were the poor sole who can't figure out what is wrong because the image you have is "Good to go!!" Bottom line no one wants to debug black magic (be in a scenario where two images exist for the same code but are different enough to make you pull your hair out). As each commit is stored on GHCR, it becomes the single source of truth for any containers built, especially for debugging purposes.

Changes to Github Actions

  • devel push to devel branch builds multi-arch containers and tags it with the git commit and stores it on GHCR in addition to the existing step of pushing said manifest to quay.io with the devel tag

  • stage when triggered no longer builds it's own image. pulls the image from GHCR built in devel action and follows the existing flow of tagging with the specified version and pushing back to GHCR

  • promote Github Release No change, however now uses the multi-arch manifest.

Other Changes

reference: ansible/eda-server-operator#158

  • makefile now has a target docker-buildx this target provides the basis for all multi-arch container builds. As buildx creates a manifest with all container architectures included, as part of the same command will automagic push the image to the tagged registry. This command can also be used in development

    • from a terminal make docker-buildx
    • To customize the image name and registry IMG=registry/name/name2:tag make docker-buildx.
    • To specifiy different architectures to build PLATFORMS=linux/arm64,linux/amd64 make docker-buildx
    • multiple variables can be addded to the command should it be desired.

Fixes issues

@jon-nfc jon-nfc force-pushed the multi-arch-builds branch 2 times, most recently from 4935911 to 506369a Compare January 8, 2024 12:16
@jon-nfc
Copy link
Contributor Author

jon-nfc commented Jan 8, 2024

- name: Build and publish bundle to operator-hub
working-directory: awx-operator-${{ env.VERSION }}
env:
IMG_REPOSITORY: ${{ env.IMAGE_REGISTRY }}/${{ env.IMAGE_REGISTRY_ORGANIZATION }}
GITHUB_TOKEN: ${{ secrets.AWX_AUTO_GITHUB_TOKEN }}
run: |
git config --global user.email "awx-automation@redhat.com"
git config --global user.name "AWX Automation"
./hack/publish-to-operator-hub.sh

which leads to

# Build bundle and catalog images
make bundle-build bundle-push BUNDLE_IMG=$BUNDLE_IMG IMG=$OPERATOR_IMG
make catalog-build catalog-push CATALOG_IMG=$CATALOG_IMG BUNDLE_IMGS=$BUNDLE_IMG BUNDLE_IMG=$BUNDLE_IMG IMG=$OPERATOR_IMG

As part of the publish there is another build and push of the image. Can this be safely removed? AS the complete manifest is already pushed to quay.io in the promote action

awx-operator/Makefile

Lines 235 to 237 in 07427be

.PHONY: catalog-build
catalog-build: opm ## Build a catalog image.
$(OPM) index add --container-tool ${CONTAINER_CMD} --mode semver --tag $(CATALOG_IMG) --bundles $(BUNDLE_IMGS) $(FROM_INDEX_OPT)

Also I noticed that step make catalog-build uses something called opm, what is it and in it's current usage, can it handle docker manifests?

@jon-nfc jon-nfc marked this pull request as ready for review January 8, 2024 13:48
@jon-nfc
Copy link
Contributor Author

jon-nfc commented Jan 8, 2024

Don't know who to ping, PR ready for review.

also require someone in the know to answer my above question

@rooftopcellist
Copy link
Member

@jon-nfc

As part of the publish there is another build and push of the image. Can this be safely removed? AS the complete manifest is already pushed to quay.io in the promote action

The publish-to-operator-hub.sh script is for adding the awx-operator version to https://operatorhub.io/operator/awx-operator so it can be installed with OLM. The bundle image is needed for OLM installs, it contains content from the manifests and crd directories (critically, the CSV).

  • bundle image build - This image is separate from the operator image. It is necessary for OLM installs, but shouldn't be any trouble for multi-arch.

Bundle images are not architecture-specific. They contain only plaintext Kubernetes manifests and operator metadata. (Reference)[https://sdk.operatorframework.io/docs/advanced-topics/multi-arch/#operator-lifecycle-manager]

  • catalog image build - not strictly necessary, but it is a nice-to-have because it allows for easier testing of OLM-specific parameter changes. Also handy for demos of new features because there can sometimes be a 1-2 day lag for the operator getting published to OperatorHub.io after a release.

The opm tool is a cli tool for inspecting and managing catalog images. More information can be found here.

A catalog image is a 3rd image that contains pointers to operator bundle versions it knows about, which populates the list of available operators that can be installed on Openshift or an OLM enabled k8s cluster. The newer catalog images are just plaintext on the inside as far as I know (file-based-catalogs) as far as I know.

This is not strictly needed for users as the folks from https://github.com/k8s-operatorhub/community-operators maintain a catalog image that gets installed as part of OLM when you install an operator from OperatorHub.io. So I don't think we need to worry about multi-arch for these images.

@rooftopcellist
Copy link
Member

@jon-nfc it looks like there are a few linting issues that need to be cleaned up.

./.github/workflows/promote.yaml
  Warning: 32:76 [comments] too few spaces before comment
  Warning: 40:76 [comments] too few spaces before comment
  Error: 49:171 [line-length] line too long (176 > 170 characters)
  Error: 54:171 [line-length] line too long (171 > 170 characters)

./.github/workflows/stage.yml
  Warning: 58:76 [comments] too few spaces before comment
  Error: 67:171 [line-length] line too long (216 > 170 characters)

./.github/workflows/devel.yaml
  Warning: 17:76 [comments] too few spaces before comment
  Warning: 25:76 [comments] too few spaces before comment
  Error: [39](https://github.com/ansible/awx-operator/actions/runs/7447379325/job/20266238173?pr=1681#step:6:43):171 [line-length] line too long (176 > 170 characters)
  Error: [40](https://github.com/ansible/awx-operator/actions/runs/7447379325/job/20266238173?pr=1681#step:6:44):1 [empty-lines] too many blank lines (1 > 0)

jon-nfc added a commit to jon-nfc/awx-operator that referenced this pull request Jan 9, 2024
As there are ENV vars set by the build process, there needs to be a way of passing them at build time

PR ansible#1681
jon-nfc added a commit to jon-nfc/awx-operator that referenced this pull request Jan 9, 2024
As part of the release process, env vars for the operator and awx must be set. As such the image needs to be rebuilt.

PR ansible#1681
jon-nfc added a commit to jon-nfc/awx-operator that referenced this pull request Jan 9, 2024
.github/workflows/stage.yml Show resolved Hide resolved
.github/workflows/stage.yml Show resolved Hide resolved
@jon-nfc
Copy link
Contributor Author

jon-nfc commented Jan 9, 2024

This is not strictly needed for users as the folks from https://github.com/k8s-operatorhub/community-operators maintain a catalog image that gets installed as part of OLM when you install an operator from OperatorHub.io. So I don't think we need to worry about multi-arch for these images.

I though as much.

@rooftopcellist
Copy link
Member

rooftopcellist commented Jan 10, 2024

@jon-nfc I think this is a good approach. I didn't spot any issues while reading this through except for one more linting error CI is complaining about. I would like to run through a mock release on my fork first though before merging. I'll plan on doing that tomorrow.

./.github/workflows/devel.yaml
  Error: 42:1 [empty-lines] too many blank lines (1 > 0)

cc @TheRealHaoLiu for review as well.

jon-nfc added a commit to jon-nfc/awx-operator that referenced this pull request Jan 10, 2024
42:1 [empty-lines] too many blank lines (1 > 0)

PR ansible#1681
@jon-nfc
Copy link
Contributor Author

jon-nfc commented Jan 10, 2024

@jon-nfc I think this is a good approach. I didn't spot any issues while reading this through except for one more linting error CI is complaining about. I would like to run through a mock release on my fork first though before merging. I'll plan on doing that tomorrow.

./.github/workflows/devel.yaml
  Error: 42:1 [empty-lines] too many blank lines (1 > 0)

cc @TheRealHaoLiu for review as well.

Righto, I'll have everything good to go in the next few hours.

tomorrow

depends on where you are, where i am it could still be yesterday for you.

IMAGE_TAG_BASE=ghcr.io/${{ github.repository_owner }}/awx-operator \
VERSION=${{ github.event.inputs.version }} make docker-build docker-push
--build-arg OPERATOR_VERSION=${{ github.event.inputs.version }}" \
IMG=ghcr.io/${{ github.repository_owner }}/${{ github.repository }}:${{ github.event.inputs.version }} \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
IMG=ghcr.io/${{ github.repository_owner }}/${{ github.repository }}:${{ github.event.inputs.version }} \
IMG=ghcr.io/${{ github.repository }}:${{ github.event.inputs.version }} \

We need to remove ${{ github.repository_owner }} here because it is already there from when it is set during the checkout:

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

- name: Checkout awx-operator
uses: actions/checkout@v3
with:
repository: ${{ github.repository_owner }}/awx-operator
path: awx-operator

I did not know this, however note taken.

question, re variable preceedence

- name: Checkout awx
uses: actions/checkout@v3
with:
repository: ${{ github.repository_owner }}/awx
path: awx
- name: Checkout awx-operator
uses: actions/checkout@v3
with:
repository: ${{ github.repository_owner }}/awx-operator
path: awx-operator

there are two checkouts, which variable takes precedence? first set, last set?

jon-nfc added a commit to jon-nfc/awx-operator that referenced this pull request Jan 11, 2024
molecule (--skip-tags=replicas): .github/workflows/devel.yaml#L41
41:65 [new-line-at-end-of-file] no new line character at the end of file

PR ansible#1681
jon-nfc added a commit to jon-nfc/awx-operator that referenced this pull request Jan 11, 2024
molecule (--skip-tags=replicas): .github/workflows/devel.yaml#L41
41:65 [new-line-at-end-of-file] no new line character at the end of file

PR ansible#1681
run: |
docker buildx imagetools create \
ghcr.io/${{ github.repository_owner }}/${{ github.repository }}:${{ github.sha }} \
--tag quay.io/ansible/${{ github.repository }}:devel
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't know if there is an issue with the linter but it inconsistent.

A default set of platforms is specified and will build those if var `PLATFORMS` is not specified on the CLI

Issue ansible#31 fixes ansible#1680
1. builds containers and stores @ghcr with tag that is git commit.
2. pushes the image to quay.io/ansible with tag devel.

Issue ansible#31
It's not good practice to rebuild images as the scenario exists where the same code base builds different images. As such only use images from a single build point.

1. Pulls the image that matches the git commit
2. Tag image with the specified version and push to ghcr

Issue ansible#31
…i-arch manifest

1. Pulls the complete manifest containing all architectures that matches the version from ghcr
2. Tag manifest with the specified version and publish to quay.io
3. Tag manifest as 'latest' and publish to quay.io

fixes ansible#31
As there are ENV vars set by the build process, there needs to be a way of passing them at build time

PR ansible#1681
As part of the release process, env vars for the operator and awx must be set. As such the image needs to be rebuilt.

PR ansible#1681
42:1 [empty-lines] too many blank lines (1 > 0)

PR ansible#1681
molecule (--skip-tags=replicas): .github/workflows/devel.yaml#L41
41:65 [new-line-at-end-of-file] no new line character at the end of file

PR ansible#1681
@jon-nfc
Copy link
Contributor Author

jon-nfc commented Jan 16, 2024

branch updated with rebase.

PR ready for review

Comment on lines 50 to 51
ghcr.io/${{ github.repository_owner }}/${{ github.repository }}:${TAG_NAME} \
--tag quay.io/ansible/${{ github.repository }}:${TAG_NAME}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

based on https://docs.github.com/en/enterprise-cloud@latest/actions/learn-github-actions/variables#default-environment-variables

GITHUB_REPOSITORY | The owner and repository name. For example, octocat/Hello-World.

i think we need to take in a variable for "which quay repo to push this to"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@TheRealHaoLiu

i have an outstanding question above re this from the review that @rooftopcellist did. That's why the discussions are not resolved as the next reviewer can see and resolve as completed and as appropriate.

#1681 (comment). this question still remains!!!!

@rooftopcellist
Copy link
Member

@jon-nfc ,

@TheRealHaoLiu and I tested out this workflow on our fork today and got it working with the changes in this PR.

Mainly, we modified how it runs so that we could test it on a fork and push the images to our personal quay registry so as not to affect the official awx-operator images while testing.

There were also a few variable refs that needed to be changed for the workflow to work. Notably:

  • ghcr.io/${{ github.repository_owner }}/${{ github.repository }} evaluates to ghcr.io/ansible/ansible/awx-operator, which failed.

I also changed how it creates the draft release so that it uses the github action for it rather than custom logic to git the GitHub API.

I opened a PR to your branch here:

Note that a variable needs to be created in the GitHub repo if you want to override the default when triggering the promote pipeline by publishing the draft release:
image

Here is the successful test release on my fork:

@rooftopcellist rooftopcellist merged commit 7e2c2bf into ansible:devel Jan 16, 2024
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants