Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bump observedGeneration first #6163

Closed
wants to merge 1 commit into from
Closed

Conversation

dgerd
Copy link

@dgerd dgerd commented Dec 5, 2019

Move observedGeneration bump to the top of reconcile()

Fixes #4937

Move observedGeneration bump to the top of reconcile()

Fixes knative#4937
@knative-prow-robot knative-prow-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Dec 5, 2019
@googlebot googlebot added the cla: yes Indicates the PR's author has signed the CLA. label Dec 5, 2019
@knative-prow-robot knative-prow-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. area/API API objects and controllers area/networking labels Dec 5, 2019
Copy link
Contributor

@knative-prow-robot knative-prow-robot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dgerd: 0 warnings.

In response to this:

Move observedGeneration bump to the top of reconcile()

Fixes #4937

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@knative-prow-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dgerd

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@knative-prow-robot knative-prow-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Dec 5, 2019
@knative-metrics-robot
Copy link

The following is the coverage report on the affected files.
Say /test pull-knative-serving-go-coverage to re-run this coverage report

File Old Coverage New Coverage Delta
pkg/reconciler/route/route.go 81.2% 81.1% -0.1

@knative-test-reporter-robot

The following jobs failed:

Test name Triggers Retries
pull-knative-serving-unit-tests 0/3

Failed non-flaky tests preventing automatic retry of pull-knative-serving-unit-tests:

pkg/activator/net.TestMultipleActivators

@dgerd
Copy link
Author

dgerd commented Dec 5, 2019

/test pull-knative-serving-unit-tests

@dprotaso
Copy link
Member

dprotaso commented Dec 6, 2019

/lgtm

Probably worth noting in some 'best-practice' location that we should always set the observed generation prior to any operation that can cause the reconciliation to 'fail'

@knative-prow-robot knative-prow-robot added the lgtm Indicates that a PR is ready to be merged. label Dec 6, 2019
@dgerd
Copy link
Author

dgerd commented Dec 6, 2019

/hold

I am going to hold this for the next release since the release is so close.

I have looked through the code that depends on observedGeneration and don't see any potential problems moving this. Unit and integration tests pass. I also played around creating and updating services with Kn without problem.

@dprotaso My plan is to start with these controllers (Service, Configuration, Route, and Revision) and then will do the same with SKS, PA, etc. Once we have serving and eventing all doing it at the beginning it probably makes sense to make this part of the framework in pkg so people don't have to worry about observedGeneration at all.

/cc @mattmoor

@knative-prow-robot knative-prow-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Dec 6, 2019
@dgerd dgerd changed the title [WIP] Bump observedGeneration first Bump observedGeneration first Dec 6, 2019
@knative-prow-robot knative-prow-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Dec 6, 2019
@mattmoor
Copy link
Member

mattmoor commented Dec 8, 2019

Thanks for holding. I'm not sure this is completely correct either... What I'm imagining is a flow like:

InitConditions()
SetObservedGeneration()
if err := SomethingThatCouldFail(); err != nil {
   return err
}
// More

When the error-path is taken, suppose our previous state was Ready: True and that hasn't been downgraded to reflect what should maybe be Ready: Unknown due the the error (maybe that happens in // More?).

To offer an alternative: would it be bad if we didn't update ObservedGeneration when Reconcile returns an error? What if we changed out boilerplate to:

	config := original.DeepCopy()
	reconcileErr := c.reconcile(ctx, config)

	// If the reconcile succeeded, then we always update observed generation.
	if reconcileErr == nil {
		config.Status.ObservedGeneration = original.Generation
	}

	// Same from here on.
	if equality.Semantic.DeepEqual(original.Status, config.Status) {

Copy link
Contributor

@taragu taragu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we apply the same changes to ingress and sks?

@dgerd
Copy link
Author

dgerd commented Dec 9, 2019

When the error-path is taken, suppose our previous state was Ready: True and that hasn't been downgraded to reflect what should maybe be Ready: Unknown due the the error

I explored this in #5076 (comment) and hope to discuss the findings more in the next API working group. We currently have multiple places where we exit the loop early in a terminal state, but do not update status. I was hoping to fix that in a subsequent change, but thinking about this further you are probably right that these concepts are to coupled to commit separately.

Rather than holding back observedGeneration and keeping FooReady: True we should be setting FooReady to False and bumping observedGeneration.

would it be bad if we didn't update ObservedGeneration when Reconcile returns an error?

Yes I think this would be bad. My understanding of observedGeneration is to determine if the controller has seen the updated -- it is not meant to notate that that updated spec has successfully reconciled. If we update it only when reconcile succeeds we (1) will have problems with our Service's getting stuck in "Unknown" due to mismatched generations and (2) we will be missing data that helps separate failures in a reconciler seeing an update from failures to reach the desired state.

@taragu Yes, my goal is for all reconcilers in Knative to follow the same pattern. I started this change as a bit of an experiment, and I do want to break up changing this into smaller bits to reduce risk. See previous comment on the PR.

@dgerd dgerd closed this Feb 24, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/API API objects and controllers area/networking cla: yes Indicates the PR's author has signed the CLA. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. lgtm Indicates that a PR is ready to be merged. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

metadata.observedGeneration behaves differently across CRDs
8 participants