
Comprehensive fixes for deployment image triggering #1433

Merged

Conversation

ironcladlou
Contributor

The current deployment image triggering logic does not account for various edge cases related to image IDs (or the lack thereof). Define all the triggering scenarios in a comprehensive unit test and plug the holes in the triggering code so that triggers fire consistently.

One known side effect of the current flawed code is mis-firing triggers that update the container to a nil image ID, causing the races described in #1443.

Base all comparisons on image ID, and add new state to the image change trigger params to store the last image ID used for triggering.

@ironcladlou
Contributor Author

@smarterclayton @ncdc @pmorie PTAL.

@ncdc
Contributor

ncdc commented Mar 24, 2015

I'm wondering if we could just call ParseDockerImageReference for both container.Image and latest.DockerImageReference and then compare the 2?

@ironcladlou
Contributor Author

I'm wondering if we could just call ParseDockerImageReference for both container.Image and latest.DockerImageReference and then compare the 2?

If ImageRepository.DockerImageReference always eventually contains the latest state (including image ID), I don't see why not. But that's not saying much, since I'm still learning all the IR mechanics on the fly. 😁
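A minimal sketch of the compare-after-parsing idea, assuming a toy parser; the real ParseDockerImageReference in pkg/image/api handles registries, namespaces, and many more edge cases than this:

```go
package main

import (
	"fmt"
	"strings"
)

// imageRef is an illustrative stand-in for OpenShift's DockerImageReference.
type imageRef struct {
	Repository string
	Tag        string
	ID         string // image ID or digest when pulled by ID
}

// parseRef splits "repo:tag" or "repo@id" pull specs. Purely illustrative.
func parseRef(spec string) imageRef {
	if i := strings.LastIndex(spec, "@"); i != -1 {
		return imageRef{Repository: spec[:i], ID: spec[i+1:]}
	}
	// Only treat a trailing ":suffix" as a tag; a colon inside a
	// registry host:port segment is followed by a "/".
	if i := strings.LastIndex(spec, ":"); i != -1 && !strings.Contains(spec[i:], "/") {
		return imageRef{Repository: spec[:i], Tag: spec[i+1:]}
	}
	return imageRef{Repository: spec, Tag: "latest"}
}

// sameImage applies the suggestion above: parse both pull specs, then
// compare the parsed structures rather than the raw strings.
func sameImage(a, b string) bool {
	return parseRef(a) == parseRef(b)
}

func main() {
	fmt.Println(sameImage("joe/awesome:latest", "joe/awesome:latest"))  // true
	fmt.Println(sameImage("joe/awesome:latest", "joe/awesome@abcd1234")) // false
}
```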

@ironcladlou
Contributor Author

Talked with @ncdc; there's more work to do here still. Hashing out unit tests currently.

@ncdc
Contributor

ncdc commented Mar 24, 2015

It's not ImageRepository.DockerImageReference - it's the DockerImageReferences created by parsing container.Image and latest.DockerImageReference. Very confusing, I know :-).

status.tags.items[n].dockerImageReference will always be a pull spec. Sometimes it will just be a pull spec to a tag, sometimes it will be a pull spec to a "pull by id" tag, and once we have v2 registries and Docker 1.6, it will be a pull spec to an image by its digest, for true pull by id semantics.

@ironcladlou
Contributor Author

@ncdc Do you think the new test support in ironcladlou@face432 is sufficient to express our scenarios?

@ncdc
Contributor

ncdc commented Mar 24, 2015

@ironcladlou I think so

@ironcladlou ironcladlou changed the title Fall back to pullspecs for IR triggering Comprehensive fixes for deployment image triggering Mar 25, 2015
@smarterclayton
Contributor

Status on this?

@ironcladlou
Contributor Author

Paused it while @ncdc was out to make progress on deployment hooks, will be revisiting it this afternoon.

@ironcladlou
Contributor Author

@ncdc helped me hash this out. I'm going to make a stateless utility function in pkg/image/api/helper.go, something like:

func TagDiffers(tag string, pullspec string, repo ImageRepository) bool

The table tests in pkg/deploy/controller/imagechange/controller_test.go#TestHandle_matchScenarios are currently conflating tests for the diff logic and the controller's core triggering logic. I'll use them as a guide to make tests for TagDiffers and simplify the controller test.

cc @bparees for builds.

@ironcladlou
Contributor Author

Never mind, the proposed solution still isn't right. We're working on it.

@ironcladlou ironcladlou force-pushed the deploy-trigger-test-fix branch 5 times, most recently from 1db139a to b1d69e8 Compare March 27, 2015 17:55
@ironcladlou
Contributor Author

Latest branch has big refactorings to the API, image trigger controller, and generator to implement the correct behavior. The e2e tests are already working, but I need to go back and re-do the unit tests.

@ironcladlou ironcladlou force-pushed the deploy-trigger-test-fix branch 4 times, most recently from 9df3d81 to 29a2ba5 Compare March 30, 2015 19:36
@ironcladlou
Contributor Author

Okay, @ncdc @smarterclayton and @pmorie.

I had to rewrite the deployment image change trigger controller, the generator, and all the related tests. I also had to make an additive/backwards-compatible API change to add a status struct to DeploymentTriggerImageChangeParams so we can correctly track the last triggered image ID.

The changes are significant, but as far as I know, the behavior is actually correct now in terms of ImageRepository interactions. Hopefully you all can do some auditing.

In the meantime, [test].

@ironcladlou
Contributor Author

I was also able to remove a metric ton of test cruft during the course of this by consolidating test fixtures and removing dead tests.

@smarterclayton
Contributor

So I have a question... Why is the pull spec in the pod template not sufficient? Why do we need this other field?

@ncdc
Contributor

ncdc commented Mar 30, 2015

Why is the pull spec in the pod template not sufficient? Why do we need this other field?

We need the image id, which won't be in the pull spec if the registry it comes from doesn't support pull by id.

@smarterclayton
Contributor

Why do we need the image id? If the pull spec doesn't change you can't guarantee you'll get a new image (although you can guarantee you'll get a new pod, you can't force the node to pull a new image).


@ncdc
Contributor

ncdc commented Mar 30, 2015

True, but if the ImagePullPolicy is the default, it will always pull. This is the easiest way to be consistent for all our image change trigger semantics. The code was getting too ugly and too hard to reason through. Going forward in my IS PR, the TagEvents in Status.Tags[tag] must have both DockerImageReference and Image (id) to help enable this. Dan and I spent multiple hours talking through this on IRC and the phone, and this way is easy to understand and follow.

@ncdc
Contributor

ncdc commented Mar 30, 2015

In that case image id isn't helping, you just need a way to trigger the deployment. The important distinction here is that only a change to the image pull spec guarantees a deployment. If we import his registry but don't change the tag, when we trigger a new deployment we're not promising he's going to get a new image.

Agreed, unless the pull policy is set correctly, but we can't easily guarantee that.

He has to rev the tag himself.

From what to what? If the current TagEvent for latest is DockerImageReference=joe/awesome:latest, Image=abcd1234 and he pushes a new image, the only thing that changes in the new TagEvent is Image. The pull spec, DockerImageReference, hasn't changed.
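The distinction can be sketched as follows; tagEvent is an illustrative stand-in for the real TagEvent type, with field names mirroring the discussion above:

```go
package main

import "fmt"

// tagEvent is an illustrative stand-in for the ImageStream TagEvent
// discussed above: a pull spec plus the image ID it resolved to.
type tagEvent struct {
	DockerImageReference string // pull spec
	Image                string // image ID
}

// changedByPullSpec misses a re-push to the same tag: the pull spec
// joe/awesome:latest stays identical even though the bits changed.
func changedByPullSpec(prev, curr tagEvent) bool {
	return prev.DockerImageReference != curr.DockerImageReference
}

// changedByImageID catches it, because a new push produces a new ID.
func changedByImageID(prev, curr tagEvent) bool {
	return prev.Image != curr.Image
}

func main() {
	prev := tagEvent{"joe/awesome:latest", "abcd1234"}
	curr := tagEvent{"joe/awesome:latest", "ef567890"} // new push, same tag
	fmt.Println(changedByPullSpec(prev, curr)) // false
	fmt.Println(changedByImageID(prev, curr))  // true
}
```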

@smarterclayton
Contributor

He must physically change the tag on the Docker end. I'm provisionally OK with someone setting PullAlways (or maybe we need one which is PullAtLeastOnce) after they trigger a deploy, but I'm not sure we require this change in order to reflect that.


@ncdc
Contributor

ncdc commented Mar 30, 2015

He must physically change the tag on the docker end

Is this what you mean?

  1. docker build -t joe/awesome:revision2 .
  2. docker push joe/awesome:revision2
  3. push button to manually sync status.tags
  4. update ImageStream, set spec.tags[latest] = ImageStreamTag(awesome:revision2)

at this point, with a trigger watching ImageStreamTag(awesome:latest), the pull spec for status.tags[latest].Items[0].DockerImageReference is now joe/awesome:revision2, which is different from before (joe/awesome:latest), thus triggering a deployment.

Isn't it a whole lot easier just to trigger on image id? 😄 We can always educate users to set the pull policy to always in this situation.

@smarterclayton
Contributor


You don't need to physically store the image ID to trigger a build - that happens automatically as long as you can make at least one edit to the deployment config (or just to the latest version, but that's a bit tacky right now).

@ncdc
Contributor

ncdc commented Mar 30, 2015

So are you saying then that all that needs to happen is:

  1. docker build -t joe/awesome:latest
  2. docker push joe/awesome:latest
  3. update deployment config
  4. profit?


// If the tag event doesn't have an image ID yet, we can't determine
// whether there was really a change
if len(latestEvent.Image) == 0 {
Contributor

Why not? Why can't you just use the pull spec like Builds?

Contributor

This actually won't be an issue once my ImageStream PR goes in - Image won't ever be empty

Contributor

What if I want to tag an arbitrary docker repository on the docker hub?


Contributor

You have to create an ImageStream for it


Contributor

You have to create an ImageStream for it

I don't think we are on the same page about that.

Contributor

@ironcladlou I talked to @smarterclayton and agree that a user shouldn't have to go through the hoops of creating an ImageStream that points to an external Docker image repository just to be able to deploy something. Let's talk tomorrow about how to trigger off of pull spec instead of image (it shouldn't be too much of a change).

Contributor Author

Sounds good, will catch up with you and figure out how to refactor.

@smarterclayton
Contributor

Is this going to fix the integration test race?

@ironcladlou
Contributor Author

I re-centered the logic around TagEvent.Status.DockerImageReference rather than .Image. PTAL @smarterclayton @ncdc

@ironcladlou
Contributor Author

Not sure yet exactly what failed in that jenkins run, so I want to run this through [test] again.

@ironcladlou
Contributor Author

Still not sure what's going on with jenkins. Unit, integration, and e2e tests are passing.

@ironcladlou
Contributor Author

#1550 is what is up with Jenkins.

@smarterclayton PTAL, I think your comments are addressed.

firedTriggersForConfig[config.Name] = append(firedTriggersForConfig[config.Name], params)
}
// Ensure a change occurred
if latestEvent.DockerImageReference != params.LastTriggeredImage {
Contributor

If latestEvent.DockerImageReference is empty you shouldn't trigger an update.

Contributor

It will never be empty

Contributor Author

This is now accounted for.
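The accounted-for guard might look like this sketch; shouldTrigger and the params struct are illustrative, and only LastTriggeredImage echoes a field name from the diff above. Skip empty references, then compare against the last value that fired the trigger:

```go
package main

import "fmt"

// imageChangeParams sketches the trigger-param state discussed in this
// thread; only LastTriggeredImage echoes the real field name.
type imageChangeParams struct {
	Tag                string
	LastTriggeredImage string
}

// shouldTrigger guards against an empty pull spec (never trigger) and
// against re-firing on a value that already triggered; when it fires,
// it records the new value for the next comparison.
func shouldTrigger(params *imageChangeParams, latestRef string) bool {
	if latestRef == "" {
		return false // nothing to compare yet
	}
	if latestRef == params.LastTriggeredImage {
		return false // ensure a change occurred
	}
	params.LastTriggeredImage = latestRef
	return true
}

func main() {
	p := &imageChangeParams{Tag: "latest"}
	fmt.Println(shouldTrigger(p, ""))                      // false: empty ref
	fmt.Println(shouldTrigger(p, "joe/awesome:revision2")) // true: new ref
	fmt.Println(shouldTrigger(p, "joe/awesome:revision2")) // false: already fired
}
```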

@smarterclayton
Contributor

LGTM, squash

Rewrite deployment image triggering and generator logic to correctly
interact with ImageRepository objects.
@ironcladlou
Contributor Author

Squashed

@smarterclayton
Contributor

[merge]

@openshift-bot
Contributor

Evaluated for origin up to ab496df

@openshift-bot
Contributor

continuous-integration/openshift-jenkins/test SUCCESS (https://ci.openshift.redhat.com/jenkins/job/test_pull_requests_openshift3/1648/)

@openshift-bot
Contributor

continuous-integration/openshift-jenkins/merge SUCCESS (https://ci.openshift.redhat.com/jenkins/job/test_pull_requests_openshift3/1648/) (Image: devenv-fedora_1188)

openshift-bot pushed a commit that referenced this pull request Apr 2, 2015
@openshift-bot openshift-bot merged commit 71639c1 into openshift:master Apr 2, 2015
@ironcladlou ironcladlou deleted the deploy-trigger-test-fix branch April 2, 2015 14:35
@smarterclayton smarterclayton added the kind/bug Categorizes issue or PR as related to a bug. label Apr 8, 2015
@smarterclayton smarterclayton modified the milestone: 0.5.0 (beta3) Apr 23, 2015
jpeeler added a commit to jpeeler/origin that referenced this pull request Oct 23, 2017
jpeeler added a commit to jpeeler/origin that referenced this pull request Oct 24, 2017
jpeeler pushed a commit to jpeeler/origin that referenced this pull request Feb 1, 2018