Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Multiple running deployments reported in a test - invariant violated #16003

Closed
smarterclayton opened this issue Aug 26, 2017 · 19 comments
Closed
Assignees
Labels
component/apps kind/test-flake Categorizes issue or PR as related to test flakes. priority/P2 sig/master
Milestone

Comments

@smarterclayton
Copy link
Contributor

https://ci.openshift.redhat.com/jenkins/job/zz_origin_gce_image/406/testReport/junit/(root)/Extended/deploymentconfigs_with_revision_history_limits__Conformance__should_never_persist_more_old_deployments_than_acceptable_after_being_observed_by_the_controller/

Extended.deploymentconfigs with revision history limits [Conformance] should never persist more old deployments than acceptable after being observed by the controller (from (empty))

/go/src/github.com/openshift/origin/test/extended/deployments/deployments.go:891
the current deployment needs to have finished before attempting to trigger a new deployment through configuration change
Expected error:
    <*errors.errorString | 0xc420895260>: {
        s: "found multiple running deployments: [history-limit-5 history-limit-7]",
    }
    found multiple running deployments: [history-limit-5 history-limit-7]
not to have occurred
/go/src/github.com/openshift/origin/test/extended/deployments/deployments.go:851
Standard Output

[BeforeEach] [Top Level]
  /go/src/github.com/openshift/origin/test/extended/util/test.go:52
[BeforeEach] deploymentconfigs
  /go/src/github.com/openshift/origin/vendor/k8s.io/kubernetes/test/e2e/framework/framework.go:130
STEP: Creating a kubernetes client
Aug 26 19:11:01.814: INFO: >>> kubeConfig: /tmp/cluster-admin.kubeconfig
STEP: Building a namespace api object
Aug 26 19:11:02.438: INFO: configPath is now "/tmp/extended-test-cli-deployment-v5q4b-cvh4c-user.kubeconfig"
Aug 26 19:11:02.438: INFO: The user is now "extended-test-cli-deployment-v5q4b-cvh4c-user"
Aug 26 19:11:02.438: INFO: Creating project "extended-test-cli-deployment-v5q4b-cvh4c"
STEP: Waiting for a default service account to be provisioned in namespace
[It] should never persist more old deployments than acceptable after being observed by the controller
  /go/src/github.com/openshift/origin/test/extended/deployments/deployments.go:891
Aug 26 19:11:03.779: INFO: Running 'oc create --config=/tmp/extended-test-cli-deployment-v5q4b-cvh4c-user.kubeconfig --namespace=extended-test-cli-deployment-v5q4b-cvh4c -f /tmp/fixture-testdata-dir982692598/test/extended/testdata/deployments/deployment-history-limit.yaml -o name'
Aug 26 19:12:03.108: INFO: Latest rollout of dc/history-limit (rc/history-limit-1) is complete.
Aug 26 19:12:03.108: INFO: 00: triggering a new deployment with config change
Aug 26 19:12:03.108: INFO: Running 'oc set env --config=/tmp/extended-test-cli-deployment-v5q4b-cvh4c-user.kubeconfig --namespace=extended-test-cli-deployment-v5q4b-cvh4c dc/history-limit A=0'
Aug 26 19:12:16.334: INFO: Latest rollout of dc/history-limit (rc/history-limit-2) is complete.
Aug 26 19:12:16.334: INFO: 01: triggering a new deployment with config change
Aug 26 19:12:16.334: INFO: Running 'oc set env --config=/tmp/extended-test-cli-deployment-v5q4b-cvh4c-user.kubeconfig --namespace=extended-test-cli-deployment-v5q4b-cvh4c dc/history-limit A=1'
Aug 26 19:12:17.141: INFO: Latest rollout of dc/history-limit (rc/history-limit-2) is complete.
Aug 26 19:12:17.141: INFO: 02: triggering a new deployment with config change
Aug 26 19:12:17.141: INFO: Running 'oc set env --config=/tmp/extended-test-cli-deployment-v5q4b-cvh4c-user.kubeconfig --namespace=extended-test-cli-deployment-v5q4b-cvh4c dc/history-limit A=2'
Aug 26 19:12:38.779: INFO: Latest rollout of dc/history-limit (rc/history-limit-4) is complete.
Aug 26 19:12:38.779: INFO: 03: triggering a new deployment with config change
Aug 26 19:12:38.779: INFO: Running 'oc set env --config=/tmp/extended-test-cli-deployment-v5q4b-cvh4c-user.kubeconfig --namespace=extended-test-cli-deployment-v5q4b-cvh4c dc/history-limit A=3'
Aug 26 19:12:49.002: INFO: Latest rollout of dc/history-limit (rc/history-limit-5) is complete.
Aug 26 19:12:49.002: INFO: 04: triggering a new deployment with config change
Aug 26 19:12:49.002: INFO: Running 'oc set env --config=/tmp/extended-test-cli-deployment-v5q4b-cvh4c-user.kubeconfig --namespace=extended-test-cli-deployment-v5q4b-cvh4c dc/history-limit A=4'
Aug 26 19:13:00.100: INFO: Latest rollout of dc/history-limit (rc/history-limit-6) is complete.
Aug 26 19:13:00.100: INFO: 05: triggering a new deployment with config change
Aug 26 19:13:00.100: INFO: Running 'oc set env --config=/tmp/extended-test-cli-deployment-v5q4b-cvh4c-user.kubeconfig --namespace=extended-test-cli-deployment-v5q4b-cvh4c dc/history-limit A=5'
Aug 26 19:13:01.567: INFO: Latest rollout of dc/history-limit (rc/history-limit-6) is complete.
Aug 26 19:13:01.567: INFO: 06: triggering a new deployment with config change
Aug 26 19:13:01.567: INFO: Running 'oc set env --config=/tmp/extended-test-cli-deployment-v5q4b-cvh4c-user.kubeconfig --namespace=extended-test-cli-deployment-v5q4b-cvh4c dc/history-limit A=6'
[AfterEach] with revision history limits [Conformance]
  /go/src/github.com/openshift/origin/test/extended/deployments/deployments.go:839
Aug 26 19:13:02.546: INFO: Running 'oc get --config=/tmp/extended-test-cli-deployment-v5q4b-cvh4c-user.kubeconfig --namespace=extended-test-cli-deployment-v5q4b-cvh4c dc/history-limit -o yaml'
Aug 26 19:13:02.957: INFO: 
apiVersion: v1
kind: DeploymentConfig
metadata:
  creationTimestamp: 2017-08-26T23:11:05Z
  generation: 16
  name: history-limit
  namespace: extended-test-cli-deployment-v5q4b-cvh4c
  resourceVersion: "9742"
  selfLink: /oapi/v1/namespaces/extended-test-cli-deployment-v5q4b-cvh4c/deploymentconfigs/history-limit
  uid: d61d6960-8ab3-11e7-9509-42010a800004
spec:
  replicas: 1
  revisionHistoryLimit: 3
  selector:
    name: history-limit
  strategy:
    activeDeadlineSeconds: 21600
    resources: {}
    rollingParams:
      intervalSeconds: 1
      maxSurge: 25%
      maxUnavailable: 25%
      timeoutSeconds: 600
      updatePeriodSeconds: 1
    type: Rolling
  template:
    metadata:
      creationTimestamp: null
      labels:
        name: history-limit
    spec:
      containers:
      - command:
        - /bin/sleep
        - "100"
        env:
        - name: A
          value: "6"
        image: docker.io/alpine:3.6
        imagePullPolicy: IfNotPresent
        name: myapp
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
  test: false
  triggers:
  - type: ConfigChange
status:
  availableReplicas: 1
  conditions:
  - lastTransitionTime: 2017-08-26T23:12:02Z
    lastUpdateTime: 2017-08-26T23:12:02Z
    message: Deployment config has minimum availability.
    status: "True"
    type: Available
  - lastTransitionTime: 2017-08-26T23:13:02Z
    lastUpdateTime: 2017-08-26T23:13:02Z
    message: replication controller "history-limit-7" is waiting for pod "history-limit-7-deploy"
      to run
    status: Unknown
    type: Progressing
  details:
    causes:
    - type: ConfigChange
    message: config change
  latestVersion: 8
  observedGeneration: 15
  readyReplicas: 1
  replicas: 1
  unavailableReplicas: 0
  updatedReplicas: 0

Aug 26 19:13:03.236: INFO: Running 'oc get --config=/tmp/extended-test-cli-deployment-v5q4b-cvh4c-user.kubeconfig --namespace=extended-test-cli-deployment-v5q4b-cvh4c rc/history-limit-3 -o yaml'
Aug 26 19:13:03.638: INFO: 
apiVersion: v1
kind: ReplicationController
metadata:
  annotations:
    kubectl.kubernetes.io/desired-replicas: "1"
    kubectl.kubernetes.io/update-source-id: history-limit-2:f8dc3f4f-8ab3-11e7-9509-42010a800004
    openshift.io/deployer-pod.name: history-limit-3-deploy
    openshift.io/deployment-config.latest-version: "3"
    openshift.io/deployment-config.name: history-limit
    openshift.io/deployment.cancelled: "true"
    openshift.io/deployment.phase: Failed
    openshift.io/deployment.replicas: "0"
    openshift.io/deployment.status-reason: newer deployment was found running
    openshift.io/encoded-deployment-config: |
      {"kind":"DeploymentConfig","apiVersion":"v1","metadata":{"name":"history-limit","namespace":"extended-test-cli-deployment-v5q4b-cvh4c","selfLink":"/apis/apps.openshift.io/v1/namespaces/extended-test-cli-deployment-v5q4b-cvh4c/deploymentconfigs/history-limit","uid":"d61d6960-8ab3-11e7-9509-42010a800004","resourceVersion":"6914","generation":6,"creationTimestamp":"2017-08-26T23:11:05Z"},"spec":{"strategy":{"type":"Rolling","rollingParams":{"updatePeriodSeconds":1,"intervalSeconds":1,"timeoutSeconds":600,"maxUnavailable":"25%","maxSurge":"25%"},"resources":{},"activeDeadlineSeconds":21600},"triggers":[{"type":"ConfigChange"}],"replicas":1,"revisionHistoryLimit":3,"test":false,"selector":{"name":"history-limit"},"template":{"metadata":{"creationTimestamp":null,"labels":{"name":"history-limit"}},"spec":{"containers":[{"name":"myapp","image":"docker.io/alpine:3.6","command":["/bin/sleep","100"],"env":[{"name":"A","value":"1"}],"resources":{},"terminationMessagePath":"/dev/termination-log","terminationMessagePolicy":"File","imagePullPolicy":"IfNotPresent"}],"restartPolicy":"Always","terminationGracePeriodSeconds":30,"dnsPolicy":"ClusterFirst","securityContext":{},"schedulerName":"default-scheduler"}}},"status":{"latestVersion":3,"observedGeneration":5,"replicas":1,"updatedReplicas":1,"availableReplicas":1,"unavailableReplicas":0,"details":{"message":"config change","causes":[{"type":"ConfigChange"}]},"conditions":[{"type":"Available","status":"True","lastUpdateTime":"2017-08-26T23:12:02Z","lastTransitionTime":"2017-08-26T23:12:02Z","message":"Deployment config has minimum availability."},{"type":"Progressing","status":"True","lastUpdateTime":"2017-08-26T23:12:15Z","lastTransitionTime":"2017-08-26T23:12:13Z","reason":"NewReplicationControllerAvailable","message":"replication controller \"history-limit-2\" successfully rolled out"}],"readyReplicas":1}}
  creationTimestamp: 2017-08-26T23:12:17Z
  generation: 3
  labels:
    openshift.io/deployment-config.name: history-limit
  name: history-limit-3
  namespace: extended-test-cli-deployment-v5q4b-cvh4c
  ownerReferences:
  - apiVersion: apps.openshift.io/v1
    blockOwnerDeletion: true
    controller: true
    kind: DeploymentConfig
    name: history-limit
    uid: d61d6960-8ab3-11e7-9509-42010a800004
  resourceVersion: "7830"
  selfLink: /api/v1/namespaces/extended-test-cli-deployment-v5q4b-cvh4c/replicationcontrollers/history-limit-3
  uid: 00e3b427-8ab4-11e7-9509-42010a800004
spec:
  replicas: 0
  selector:
    deployment: history-limit-3
    deploymentconfig: history-limit
    name: history-limit
  template:
    metadata:
      annotations:
        openshift.io/deployment-config.latest-version: "3"
        openshift.io/deployment-config.name: history-limit
        openshift.io/deployment.name: history-limit-3
      creationTimestamp: null
      labels:
        deployment: history-limit-3
        deploymentconfig: history-limit
        name: history-limit
    spec:
      containers:
      - command:
        - /bin/sleep
        - "100"
        env:
        - name: A
          value: "1"
        image: docker.io/alpine:3.6
        imagePullPolicy: IfNotPresent
        name: myapp
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
status:
  observedGeneration: 3
  replicas: 0

Aug 26 19:13:03.638: INFO: Running 'oc get --config=/tmp/extended-test-cli-deployment-v5q4b-cvh4c-user.kubeconfig --namespace=extended-test-cli-deployment-v5q4b-cvh4c rc/history-limit-4 -o yaml'
Aug 26 19:13:04.126: INFO: 
apiVersion: v1
kind: ReplicationController
metadata:
  annotations:
    kubectl.kubernetes.io/original-replicas: "1"
    openshift.io/deployer-pod.name: history-limit-4-deploy
    openshift.io/deployment-config.latest-version: "4"
    openshift.io/deployment-config.name: history-limit
    openshift.io/deployment.phase: Complete
    openshift.io/deployment.replicas: ""
    openshift.io/deployment.status-reason: config change
    openshift.io/encoded-deployment-config: |
      {"kind":"DeploymentConfig","apiVersion":"v1","metadata":{"name":"history-limit","namespace":"extended-test-cli-deployment-v5q4b-cvh4c","selfLink":"/apis/apps.openshift.io/v1/namespaces/extended-test-cli-deployment-v5q4b-cvh4c/deploymentconfigs/history-limit","uid":"d61d6960-8ab3-11e7-9509-42010a800004","resourceVersion":"6993","generation":8,"creationTimestamp":"2017-08-26T23:11:05Z"},"spec":{"strategy":{"type":"Rolling","rollingParams":{"updatePeriodSeconds":1,"intervalSeconds":1,"timeoutSeconds":600,"maxUnavailable":"25%","maxSurge":"25%"},"resources":{},"activeDeadlineSeconds":21600},"triggers":[{"type":"ConfigChange"}],"replicas":1,"revisionHistoryLimit":3,"test":false,"selector":{"name":"history-limit"},"template":{"metadata":{"creationTimestamp":null,"labels":{"name":"history-limit"}},"spec":{"containers":[{"name":"myapp","image":"docker.io/alpine:3.6","command":["/bin/sleep","100"],"env":[{"name":"A","value":"2"}],"resources":{},"terminationMessagePath":"/dev/termination-log","terminationMessagePolicy":"File","imagePullPolicy":"IfNotPresent"}],"restartPolicy":"Always","terminationGracePeriodSeconds":30,"dnsPolicy":"ClusterFirst","securityContext":{},"schedulerName":"default-scheduler"}}},"status":{"latestVersion":4,"observedGeneration":7,"replicas":1,"updatedReplicas":0,"availableReplicas":1,"unavailableReplicas":0,"details":{"message":"config change","causes":[{"type":"ConfigChange"}]},"conditions":[{"type":"Available","status":"True","lastUpdateTime":"2017-08-26T23:12:02Z","lastTransitionTime":"2017-08-26T23:12:02Z","message":"Deployment config has minimum availability."},{"type":"Progressing","status":"Unknown","lastUpdateTime":"2017-08-26T23:12:17Z","lastTransitionTime":"2017-08-26T23:12:17Z","message":"replication controller \"history-limit-3\" is waiting for pod \"history-limit-3-deploy\" to run"}],"readyReplicas":1}}
  creationTimestamp: 2017-08-26T23:12:27Z
  generation: 3
  labels:
    openshift.io/deployment-config.name: history-limit
  name: history-limit-4
  namespace: extended-test-cli-deployment-v5q4b-cvh4c
  ownerReferences:
  - apiVersion: apps.openshift.io/v1
    blockOwnerDeletion: true
    controller: true
    kind: DeploymentConfig
    name: history-limit
    uid: d61d6960-8ab3-11e7-9509-42010a800004
  resourceVersion: "9718"
  selfLink: /api/v1/namespaces/extended-test-cli-deployment-v5q4b-cvh4c/replicationcontrollers/history-limit-4
  uid: 06dd7f1b-8ab4-11e7-9509-42010a800004
spec:
  replicas: 0
  selector:
    deployment: history-limit-4
    deploymentconfig: history-limit
    name: history-limit
  template:
    metadata:
      annotations:
        openshift.io/deployment-config.latest-version: "4"
        openshift.io/deployment-config.name: history-limit
        openshift.io/deployment.name: history-limit-4
      creationTimestamp: null
      labels:
        deployment: history-limit-4
        deploymentconfig: history-limit
        name: history-limit
    spec:
      containers:
      - command:
        - /bin/sleep
        - "100"
        env:
        - name: A
          value: "2"
        image: docker.io/alpine:3.6
        imagePullPolicy: IfNotPresent
        name: myapp
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
status:
  observedGeneration: 3
  replicas: 0

Aug 26 19:13:04.126: INFO: Running 'oc get --config=/tmp/extended-test-cli-deployment-v5q4b-cvh4c-user.kubeconfig --namespace=extended-test-cli-deployment-v5q4b-cvh4c rc/history-limit-5 -o yaml'
Aug 26 19:13:04.504: INFO: 
apiVersion: v1
kind: ReplicationController
metadata:
  annotations:
    kubectl.kubernetes.io/original-replicas: "1"
    openshift.io/deployer-pod.name: history-limit-5-deploy
    openshift.io/deployment-config.latest-version: "5"
    openshift.io/deployment-config.name: history-limit
    openshift.io/deployment.phase: Complete
    openshift.io/deployment.replicas: ""
    openshift.io/deployment.status-reason: config change
    openshift.io/encoded-deployment-config: |
      {"kind":"DeploymentConfig","apiVersion":"v1","metadata":{"name":"history-limit","namespace":"extended-test-cli-deployment-v5q4b-cvh4c","selfLink":"/apis/apps.openshift.io/v1/namespaces/extended-test-cli-deployment-v5q4b-cvh4c/deploymentconfigs/history-limit","uid":"d61d6960-8ab3-11e7-9509-42010a800004","resourceVersion":"8338","generation":10,"creationTimestamp":"2017-08-26T23:11:05Z"},"spec":{"strategy":{"type":"Rolling","rollingParams":{"updatePeriodSeconds":1,"intervalSeconds":1,"timeoutSeconds":600,"maxUnavailable":"25%","maxSurge":"25%"},"resources":{},"activeDeadlineSeconds":21600},"triggers":[{"type":"ConfigChange"}],"replicas":1,"revisionHistoryLimit":3,"test":false,"selector":{"name":"history-limit"},"template":{"metadata":{"creationTimestamp":null,"labels":{"name":"history-limit"}},"spec":{"containers":[{"name":"myapp","image":"docker.io/alpine:3.6","command":["/bin/sleep","100"],"env":[{"name":"A","value":"3"}],"resources":{},"terminationMessagePath":"/dev/termination-log","terminationMessagePolicy":"File","imagePullPolicy":"IfNotPresent"}],"restartPolicy":"Always","terminationGracePeriodSeconds":30,"dnsPolicy":"ClusterFirst","securityContext":{},"schedulerName":"default-scheduler"}}},"status":{"latestVersion":5,"observedGeneration":9,"replicas":1,"updatedReplicas":1,"availableReplicas":1,"unavailableReplicas":0,"details":{"message":"config change","causes":[{"type":"ConfigChange"}]},"conditions":[{"type":"Available","status":"True","lastUpdateTime":"2017-08-26T23:12:02Z","lastTransitionTime":"2017-08-26T23:12:02Z","message":"Deployment config has minimum availability."},{"type":"Progressing","status":"True","lastUpdateTime":"2017-08-26T23:12:38Z","lastTransitionTime":"2017-08-26T23:12:35Z","reason":"NewReplicationControllerAvailable","message":"replication controller \"history-limit-4\" successfully rolled out"}],"readyReplicas":1}}
  creationTimestamp: 2017-08-26T23:12:39Z
  generation: 3
  labels:
    openshift.io/deployment-config.name: history-limit
  name: history-limit-5
  namespace: extended-test-cli-deployment-v5q4b-cvh4c
  ownerReferences:
  - apiVersion: apps.openshift.io/v1
    blockOwnerDeletion: true
    controller: true
    kind: DeploymentConfig
    name: history-limit
    uid: d61d6960-8ab3-11e7-9509-42010a800004
  resourceVersion: "9720"
  selfLink: /api/v1/namespaces/extended-test-cli-deployment-v5q4b-cvh4c/replicationcontrollers/history-limit-5
  uid: 0e1555c4-8ab4-11e7-9509-42010a800004
spec:
  replicas: 0
  selector:
    deployment: history-limit-5
    deploymentconfig: history-limit
    name: history-limit
  template:
    metadata:
      annotations:
        openshift.io/deployment-config.latest-version: "5"
        openshift.io/deployment-config.name: history-limit
        openshift.io/deployment.name: history-limit-5
      creationTimestamp: null
      labels:
        deployment: history-limit-5
        deploymentconfig: history-limit
        name: history-limit
    spec:
      containers:
      - command:
        - /bin/sleep
        - "100"
        env:
        - name: A
          value: "3"
        image: docker.io/alpine:3.6
        imagePullPolicy: IfNotPresent
        name: myapp
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
status:
  observedGeneration: 3
  replicas: 0

Aug 26 19:13:04.504: INFO: Running 'oc get --config=/tmp/extended-test-cli-deployment-v5q4b-cvh4c-user.kubeconfig --namespace=extended-test-cli-deployment-v5q4b-cvh4c rc/history-limit-6 -o yaml'
Aug 26 19:13:04.921: INFO: 
apiVersion: v1
kind: ReplicationController
metadata:
  annotations:
    openshift.io/deployer-pod.name: history-limit-6-deploy
    openshift.io/deployment-config.latest-version: "6"
    openshift.io/deployment-config.name: history-limit
    openshift.io/deployment.phase: Complete
    openshift.io/deployment.replicas: ""
    openshift.io/deployment.status-reason: config change
    openshift.io/encoded-deployment-config: |
      {"kind":"DeploymentConfig","apiVersion":"v1","metadata":{"name":"history-limit","namespace":"extended-test-cli-deployment-v5q4b-cvh4c","selfLink":"/apis/apps.openshift.io/v1/namespaces/extended-test-cli-deployment-v5q4b-cvh4c/deploymentconfigs/history-limit","uid":"d61d6960-8ab3-11e7-9509-42010a800004","resourceVersion":"9043","generation":12,"creationTimestamp":"2017-08-26T23:11:05Z"},"spec":{"strategy":{"type":"Rolling","rollingParams":{"updatePeriodSeconds":1,"intervalSeconds":1,"timeoutSeconds":600,"maxUnavailable":"25%","maxSurge":"25%"},"resources":{},"activeDeadlineSeconds":21600},"triggers":[{"type":"ConfigChange"}],"replicas":1,"revisionHistoryLimit":3,"test":false,"selector":{"name":"history-limit"},"template":{"metadata":{"creationTimestamp":null,"labels":{"name":"history-limit"}},"spec":{"containers":[{"name":"myapp","image":"docker.io/alpine:3.6","command":["/bin/sleep","100"],"env":[{"name":"A","value":"4"}],"resources":{},"terminationMessagePath":"/dev/termination-log","terminationMessagePolicy":"File","imagePullPolicy":"IfNotPresent"}],"restartPolicy":"Always","terminationGracePeriodSeconds":30,"dnsPolicy":"ClusterFirst","securityContext":{},"schedulerName":"default-scheduler"}}},"status":{"latestVersion":6,"observedGeneration":11,"replicas":1,"updatedReplicas":1,"availableReplicas":1,"unavailableReplicas":0,"details":{"message":"config change","causes":[{"type":"ConfigChange"}]},"conditions":[{"type":"Available","status":"True","lastUpdateTime":"2017-08-26T23:12:02Z","lastTransitionTime":"2017-08-26T23:12:02Z","message":"Deployment config has minimum availability."},{"type":"Progressing","status":"True","lastUpdateTime":"2017-08-26T23:12:48Z","lastTransitionTime":"2017-08-26T23:12:45Z","reason":"NewReplicationControllerAvailable","message":"replication controller \"history-limit-5\" successfully rolled out"}],"readyReplicas":1}}
  creationTimestamp: 2017-08-26T23:12:49Z
  generation: 2
  labels:
    openshift.io/deployment-config.name: history-limit
  name: history-limit-6
  namespace: extended-test-cli-deployment-v5q4b-cvh4c
  ownerReferences:
  - apiVersion: apps.openshift.io/v1
    blockOwnerDeletion: true
    controller: true
    kind: DeploymentConfig
    name: history-limit
    uid: d61d6960-8ab3-11e7-9509-42010a800004
  resourceVersion: "9532"
  selfLink: /api/v1/namespaces/extended-test-cli-deployment-v5q4b-cvh4c/replicationcontrollers/history-limit-6
  uid: 14256b41-8ab4-11e7-9509-42010a800004
spec:
  replicas: 1
  selector:
    deployment: history-limit-6
    deploymentconfig: history-limit
    name: history-limit
  template:
    metadata:
      annotations:
        openshift.io/deployment-config.latest-version: "6"
        openshift.io/deployment-config.name: history-limit
        openshift.io/deployment.name: history-limit-6
      creationTimestamp: null
      labels:
        deployment: history-limit-6
        deploymentconfig: history-limit
        name: history-limit
    spec:
      containers:
      - command:
        - /bin/sleep
        - "100"
        env:
        - name: A
          value: "4"
        image: docker.io/alpine:3.6
        imagePullPolicy: IfNotPresent
        name: myapp
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
status:
  availableReplicas: 1
  fullyLabeledReplicas: 1
  observedGeneration: 2
  readyReplicas: 1
  replicas: 1

Aug 26 19:13:04.921: INFO: Running 'oc get --config=/tmp/extended-test-cli-deployment-v5q4b-cvh4c-user.kubeconfig --namespace=extended-test-cli-deployment-v5q4b-cvh4c rc/history-limit-7 -o yaml'
Aug 26 19:13:05.526: INFO: 
apiVersion: v1
kind: ReplicationController
metadata:
  annotations:
    kubectl.kubernetes.io/desired-replicas: "1"
    openshift.io/deployer-pod.name: history-limit-7-deploy
    openshift.io/deployment-config.latest-version: "7"
    openshift.io/deployment-config.name: history-limit
    openshift.io/deployment.cancelled: "true"
    openshift.io/deployment.phase: Pending
    openshift.io/deployment.replicas: "0"
    openshift.io/deployment.status-reason: newer deployment was found running
    openshift.io/encoded-deployment-config: |
      {"kind":"DeploymentConfig","apiVersion":"v1","metadata":{"name":"history-limit","namespace":"extended-test-cli-deployment-v5q4b-cvh4c","selfLink":"/apis/apps.openshift.io/v1/namespaces/extended-test-cli-deployment-v5q4b-cvh4c/deploymentconfigs/history-limit","uid":"d61d6960-8ab3-11e7-9509-42010a800004","resourceVersion":"9672","generation":14,"creationTimestamp":"2017-08-26T23:11:05Z"},"spec":{"strategy":{"type":"Rolling","rollingParams":{"updatePeriodSeconds":1,"intervalSeconds":1,"timeoutSeconds":600,"maxUnavailable":"25%","maxSurge":"25%"},"resources":{},"activeDeadlineSeconds":21600},"triggers":[{"type":"ConfigChange"}],"replicas":1,"revisionHistoryLimit":3,"test":false,"selector":{"name":"history-limit"},"template":{"metadata":{"creationTimestamp":null,"labels":{"name":"history-limit"}},"spec":{"containers":[{"name":"myapp","image":"docker.io/alpine:3.6","command":["/bin/sleep","100"],"env":[{"name":"A","value":"5"}],"resources":{},"terminationMessagePath":"/dev/termination-log","terminationMessagePolicy":"File","imagePullPolicy":"IfNotPresent"}],"restartPolicy":"Always","terminationGracePeriodSeconds":30,"dnsPolicy":"ClusterFirst","securityContext":{},"schedulerName":"default-scheduler"}}},"status":{"latestVersion":7,"observedGeneration":13,"replicas":1,"updatedReplicas":1,"availableReplicas":1,"unavailableReplicas":0,"details":{"message":"config change","causes":[{"type":"ConfigChange"}]},"conditions":[{"type":"Available","status":"True","lastUpdateTime":"2017-08-26T23:12:02Z","lastTransitionTime":"2017-08-26T23:12:02Z","message":"Deployment config has minimum availability."},{"type":"Progressing","status":"True","lastUpdateTime":"2017-08-26T23:12:59Z","lastTransitionTime":"2017-08-26T23:12:56Z","reason":"NewReplicationControllerAvailable","message":"replication controller \"history-limit-6\" successfully rolled out"}],"readyReplicas":1}}
  creationTimestamp: 2017-08-26T23:13:01Z
  generation: 1
  labels:
    openshift.io/deployment-config.name: history-limit
  name: history-limit-7
  namespace: extended-test-cli-deployment-v5q4b-cvh4c
  ownerReferences:
  - apiVersion: apps.openshift.io/v1
    blockOwnerDeletion: true
    controller: true
    kind: DeploymentConfig
    name: history-limit
    uid: d61d6960-8ab3-11e7-9509-42010a800004
  resourceVersion: "9745"
  selfLink: /api/v1/namespaces/extended-test-cli-deployment-v5q4b-cvh4c/replicationcontrollers/history-limit-7
  uid: 1b6eeadc-8ab4-11e7-9509-42010a800004
spec:
  replicas: 0
  selector:
    deployment: history-limit-7
    deploymentconfig: history-limit
    name: history-limit
  template:
    metadata:
      annotations:
        openshift.io/deployment-config.latest-version: "7"
        openshift.io/deployment-config.name: history-limit
        openshift.io/deployment.name: history-limit-7
      creationTimestamp: null
      labels:
        deployment: history-limit-7
        deploymentconfig: history-limit
        name: history-limit
    spec:
      containers:
      - command:
        - /bin/sleep
        - "100"
        env:
        - name: A
          value: "5"
        image: docker.io/alpine:3.6
        imagePullPolicy: IfNotPresent
        name: myapp
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
status:
  observedGeneration: 1
  replicas: 0
@smarterclayton
Copy link
Contributor Author

smarterclayton commented Aug 27, 2017

@smarterclayton
Copy link
Contributor Author

I was unable to find an occurrence of this in the last two days except today (twice, it looks like)

@mfojtik
Copy link
Contributor

mfojtik commented Aug 28, 2017

@tnozicka FYI

@tnozicka
Copy link
Contributor

@mfojtik I am already checking what's wrong with that. Also https://github.com/openshift/origin/pull/14954/files#diff-f2c8446f523f5b04af6def6670658089R939 started suddenly failing on the weekend but that might be just a coincidence)

@tnozicka tnozicka added the kind/test-flake Categorizes issue or PR as related to test flakes. label Aug 28, 2017
@tnozicka
Copy link
Contributor

tnozicka commented Aug 28, 2017

The reason why it's failing now might be OOMs like:

Aug 26 19:13:06.210: INFO: At 2017-08-26 19:12:48 -0400 EDT - event for history-limit-5-deploy: {kubelet ci-primg406-ig-n-ftqj} Killing: Killing container with id docker://deployment:Need to kill Pod

Still looking into the reason why the controller breaks the invariant in this case. Working theory is that it considers that deployment (RC) failed and continues with a new deploymnet but then the previously killed deploy-pod gets rescheduled to another node and runs again

Aug 26 19:13:06.210: INFO: At 2017-08-26 19:13:00 -0400 EDT - event for history-limit-5-deploy: {default-scheduler } Scheduled: Successfully assigned history-limit-5-deploy to ci-primg406-ig-n-hs6q

@smarterclayton
Copy link
Contributor Author

smarterclayton commented Aug 28, 2017 via email

@tnozicka
Copy link
Contributor

tnozicka commented Sep 4, 2017

for the record the reason why it fails: #16129 (comment)

@tnozicka
Copy link
Contributor

tnozicka commented Oct 6, 2017

I am lowering this to P2 because we just merged #14910 that should lower the changes of this happening significantly to the level it always was. There is still a very subtle race condition dependent on caches being out of sync that is addressed in #16129 which is waiting on reviews.

@smarterclayton
Copy link
Contributor Author

smarterclayton commented Oct 6, 2017 via email

@tnozicka
Copy link
Contributor

tnozicka commented Oct 6, 2017

@smarterclayton the DC controller was creating the same deployment twice which I think was the cause of this issue - see #16671 that was fixed by #14910

The subtle race being fixed in #16129 was always there but I don't mind having that for 3.7 if there is someone to review

@smarterclayton
Copy link
Contributor Author

smarterclayton commented Oct 6, 2017 via email

@tnozicka tnozicka added this to the 3.8.0 milestone Oct 9, 2017
@tnozicka
Copy link
Contributor

tnozicka commented Oct 9, 2017

Talked to @mfojtik and we are leaving this for early 3.8 as that's fairly big change this late and it's not a regression.

@bortek
Copy link

bortek commented Jan 12, 2018

I am seeing this in After upgrade to Openshift 3.7 There is always one more cancelled deployment with reason "newer deployment was found running".

Is there a bugreport for this to RH in Bugzilla?

@tnozicka
Copy link
Contributor

@bortek this race should be extremely rare and has always been there. The confirmation would be to see 2 deployer pods for that deployment running at the same time.

If so that is more likely caused by rouge kubelet:

The usual suspect for this would be 2 subsequent triggers (configChange, imageChange). Best to file an issue with steps to reproduce so we can look into it.

@bortek
Copy link

bortek commented Jan 15, 2018

I am pretty sure now I am ah hitting this bug. I tested on two different applications in different namespaces and they both behave like that. Yes they have both configChange & imageChange triggers. It happens when I start a new build which when completed initiates new deployment. It does not help if I disable configChange trigger. (Cant disable imageChange as I need it). Steps to reproduce... I just start a new build and wait till its done followed by a deploy. Here is a screenshot how it looks like.

https://photos.app.goo.gl/kkBD04ur0VaRHsaQ2

@tnozicka
Copy link
Contributor

tnozicka commented Jan 15, 2018

@bortek Unless you see 2 deployer pods running at the same time it isn't this bug. Say like

while true; do echo "=========" && oc get pods -o go-template --template '{{ range $i, $elem := .items }}{{ printf "%s - %s\n" .metadata.name .status.phase }}{{end}}'; done

Please open a new bug stating your exact openshift version, and the output of oc get dc,rc,po,is,build -o yaml so we can determine why it's happening for you. Master logs would help as well.

openshift-merge-robot added a commit that referenced this issue Jan 23, 2018
Automatic merge from submit-queue (batch tested with PRs 18233, 18068, 18228, 18227).

UPSTREAM: 58547: Send correct resource version for delete events from watch cache

Backport of kubernetes/kubernetes#58547

Watch cache was returning incorrect (old) ResourceVersion on "deleted" events breaking informers that were going back in time. This fixes it.

/assign @liggitt 
/cc @mfojtik 

Fixes #17581 #16003 and likely others
openshift-publish-robot pushed a commit to openshift/kubernetes that referenced this issue Jan 25, 2018
…e-fix-58547

Automatic merge from submit-queue (batch tested with PRs 18233, 18068, 18228, 18227).

UPSTREAM: 58547: Send correct resource version for delete events from watch cache

Backport of kubernetes#58547

Watch cache was returning incorrect (old) ResourceVersion on "deleted" events breaking informers that were going back in time. This fixes it.

/assign @liggitt
/cc @mfojtik

Fixes openshift/origin#17581 openshift/origin#16003 and likely others

Origin-commit: 042a63f8c1effc2fb911ce2cf494458872e9f8a3
openshift-publish-robot pushed a commit to openshift/kubernetes that referenced this issue Feb 27, 2018
…e-fix-58547

Automatic merge from submit-queue (batch tested with PRs 18233, 18068, 18228, 18227).

UPSTREAM: 58547: Send correct resource version for delete events from watch cache

Backport of kubernetes#58547

Watch cache was returning incorrect (old) ResourceVersion on "deleted" events breaking informers that were going back in time. This fixes it.

/assign @liggitt
/cc @mfojtik

Fixes openshift/origin#17581 openshift/origin#16003 and likely others

Origin-commit: 042a63f8c1effc2fb911ce2cf494458872e9f8a3
openshift-publish-robot pushed a commit to openshift/kubernetes-apiserver that referenced this issue Feb 28, 2018
Automatic merge from submit-queue (batch tested with PRs 18233, 18068, 18228, 18227).

UPSTREAM: 58547: Send correct resource version for delete events from watch cache

Backport of kubernetes/kubernetes#58547

Watch cache was returning incorrect (old) ResourceVersion on "deleted" events breaking informers that were going back in time. This fixes it.

/assign @liggitt
/cc @mfojtik

Fixes openshift/origin#17581 openshift/origin#16003 and likely others

Origin-commit: 042a63f8c1effc2fb911ce2cf494458872e9f8a3


Kubernetes-commit: b1d49808af3db35be42e4b705953d656a21bc201
@skynardo
Copy link

@bortek - we had same issue after upgrading from origin 3.6 to 3.7. This was the issue: #18293

@tnozicka
Copy link
Contributor

watch cache issue and kubelet should now be fixed; haven't seen this for a while

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
component/apps kind/test-flake Categorizes issue or PR as related to test flakes. priority/P2 sig/master
Projects
None yet
Development

No branches or pull requests

5 participants