latest helm chart doesn't get recreated with the update of the scylla-operator images since Dec 1, 2023 leading to failures #1690

Closed
vponomaryov opened this issue Jan 17, 2024 · 11 comments
Assignees
Labels
kind/bug Categorizes issue or PR as related to a bug. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release.

Comments

@vponomaryov
Contributor

vponomaryov commented Jan 17, 2024

What happened?

The GKE deployments started failing with the following error in the scylla-operator pods:

2024/01/14 00:44:06 maxprocs: Leaving GOMAXPROCS=2: CPU quota undefined
I0114 00:44:06.949317       1 clientcmd/merged_client_builder.go:121] Using in-cluster configuration
Error: operator image can't be empty

The cause is an incompatibility between the latest Helm chart (Dec 1, 2023) and the latest operator image (Jan 15, 2024):
Screenshot from 2024-01-17 14-58-19

Screenshot from 2024-01-17 15-03-45

What did you expect to happen?

The scylla-operator should start correctly.

How can we reproduce it (as minimally and precisely as possible)?

Deploy the latest scylla-operator using the Helm chart.

Scylla Operator version

v1.12.0-alpha.0-144-g60f7824

Kubernetes platform name and version

GKE, v1.27.3

Please attach the must-gather archive.

Logs:

Jenkins job URL
Argus

Anything else we need to know?

No response

@vponomaryov vponomaryov added the kind/bug Categorizes issue or PR as related to a bug. label Jan 17, 2024
@scylla-operator-bot scylla-operator-bot bot added the needs-priority Indicates a PR lacks a `priority/foo` label and requires one. label Jan 17, 2024
@vponomaryov
Contributor Author

The bug was not yet present on January 7.
So one of the commits merged in the last 10 days introduced the bug described here.

@vponomaryov
Contributor Author

vponomaryov commented Jan 17, 2024

Scylla-operator deployment:

apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "1"
    meta.helm.sh/release-name: scylla-operator
    meta.helm.sh/release-namespace: scylla-operator
  creationTimestamp: "2024-01-14T00:33:03Z"
  generation: 1
  labels:
    app.kubernetes.io/instance: scylla-operator
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: scylla-operator
  name: scylla-operator
  namespace: scylla-operator
  resourceVersion: "10087"
  uid: 0656d1bd-280a-46bb-a896-5831cd947926
spec:
  ...
  template:
    metadata:
      creationTimestamp: null
      labels:
        app.kubernetes.io/instance: scylla-operator
        app.kubernetes.io/name: scylla-operator
    spec:
      affinity:
        ...
      containers:
      - args:
        - operator
        - --loglevel=4
        env:
        - name: SCYLLA_OPERATOR_POD_NAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.name
        image: scylladb/scylla-operator:latest
        imagePullPolicy: IfNotPresent
        name: scylla-operator
        resources:
          requests:
            cpu: 100m
            memory: 20Mi
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      serviceAccount: scylla-operator
      serviceAccountName: scylla-operator
      terminationGracePeriodSeconds: 10
status:
  ...

The SCYLLA_OPERATOR_IMAGE env var is absent, but it is expected according to the following:

It looks like something drops this env var and adds SCYLLA_OPERATOR_POD_NAME in its place.
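One way to confirm the missing env var from a dumped manifest is sketched below. It is only an illustration: the helper name `missing_env_vars` is hypothetical, and the JSON is a hand-trimmed copy of the Deployment above, in the shape that `kubectl get deployment -o json` would produce.

```python
import json


def missing_env_vars(deployment, container_name, required):
    """Return the names in `required` that the named container does not define."""
    containers = deployment["spec"]["template"]["spec"]["containers"]
    for container in containers:
        if container.get("name") == container_name:
            defined = {entry["name"] for entry in container.get("env", [])}
            return [name for name in required if name not in defined]
    raise ValueError(f"container {container_name!r} not found")


# Hand-trimmed copy of the Deployment from this comment (assumption: only the
# fields needed for the check are kept).
manifest = json.loads("""
{
  "spec": {
    "template": {
      "spec": {
        "containers": [
          {
            "name": "scylla-operator",
            "image": "scylladb/scylla-operator:latest",
            "env": [
              {
                "name": "SCYLLA_OPERATOR_POD_NAME",
                "valueFrom": {"fieldRef": {"fieldPath": "metadata.name"}}
              }
            ]
          }
        ]
      }
    }
  }
}
""")

required = ["SCYLLA_OPERATOR_POD_NAME", "SCYLLA_OPERATOR_IMAGE"]
print(missing_env_vars(manifest, "scylla-operator", required))
# prints ['SCYLLA_OPERATOR_IMAGE']
```

Run against a Deployment rendered from a rebuilt chart, the same check would be expected to print an empty list.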

@vponomaryov
Contributor Author

The following change to the Helm spec has not actually been applied: 7d9d80e

So, @tnozicka and @zimnx, I assume that the latest Helm chart was not recreated.

@vponomaryov
Contributor Author

Here is the reason:
Screenshot from 2024-01-17 14-58-19

The latest Helm chart dates from Dec 1, 2023!

@vponomaryov vponomaryov changed the title from "Scylla-operator pods fail with the Error: operator image can't be empty error using helm" to "latest helm chart doesn't get recreated with the update of the scylla-operator images since Dec 1, 2023" Jan 17, 2024
@vponomaryov vponomaryov changed the title from "latest helm chart doesn't get recreated with the update of the scylla-operator images since Dec 1, 2023" to "latest helm chart doesn't get recreated with the update of the scylla-operator images since Dec 1, 2023 leading to failures" Jan 17, 2024
@vponomaryov
Contributor Author

Updated the bug description.

@tnozicka
Member

I agree that seems to be a publishing issue

@vponomaryov
Contributor Author

This bug should be closed only once a new latest Helm chart has been rebuilt.
So we need to wait for one of the PRs to be merged in the scylla-operator repo.

@vponomaryov vponomaryov reopened this Jan 18, 2024
@ku9nov

ku9nov commented Jan 18, 2024

I have the same problem with DOKS.

Thank you for bringing up this topic for discussion.

@vponomaryov
Contributor Author

@tnozicka

$ helm repo add scylla-operator https://storage.googleapis.com/scylla-operator-charts/latest
$ helm search repo scylla-operator/scylla-operator --devel --versions -o yaml | grep 76
  version: v1.11.0-10-g92763c2
  version: v1.10.0-alpha.0-11-g7a2ef76-nightly
  version: v1.9.0-alpha.1-60-ga449676-nightly
  version: v1.9.0-alpha.1-60-ga449676
  version: v1.9.0-alpha.1-4-g2d76516
  version: v1.7.0-alpha.0-76-g44b41f9
  version: v1.7.0-alpha.0-39-gaebf760-nightly
  version: v1.7.0-alpha.0-39-gaebf760
  version: v1.7.0-alpha.0-37-g6869766-nightly
  version: v1.7.0-alpha.0-37-g6869766
  version: v1.7.0-alpha.0-23-g830a762
  version: v1.5.0-alpha.0-53-gc2a76eb
  version: v1.4.0-beta.0-4-g0764538

The expected new version, v1.12.0-alpha.1-76-g9fbd70d-latest, is not listed by the helm binary.
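The same check can be scripted against the JSON output (`-o json`) instead of grepping YAML. The sketch below is only an illustration: `find_versions` is a hypothetical helper, the sample holds two entries copied from the listing above, and it assumes each entry in the JSON output carries `name` and `version` fields.

```python
import json


def find_versions(search_output, needle):
    """Return chart versions from `helm search repo ... -o json` output that contain `needle`."""
    entries = json.loads(search_output)
    return [e["version"] for e in entries if needle in e.get("version", "")]


# Two entries taken from the listing above, rendered as JSON (assumed shape).
sample = json.dumps([
    {"name": "scylla-operator/scylla-operator", "version": "v1.11.0-10-g92763c2"},
    {"name": "scylla-operator/scylla-operator", "version": "v1.10.0-alpha.0-11-g7a2ef76-nightly"},
])

expected = "v1.12.0-alpha.1-76-g9fbd70d-latest"
matches = find_versions(sample, "76")
print(expected in matches)
# prints False
```

In practice `search_output` would come from running the `helm search repo` command above with `-o json` and capturing its stdout.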

@vponomaryov vponomaryov reopened this Jan 22, 2024
@tnozicka
Member

should be fixed now https://prow.scylla-operator.scylladb.com/view/gs/scylla-operator-prow/logs/post-scylla-operator-master-helm/1749707363855634432#1:helm-build-build-log.txt%3A17
(I see this chart being present in the index in the GCS bucket)


3 participants