Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

internal: Add code for DAG status to emit Conditions #2931

Merged
merged 7 commits into from
Sep 29, 2020

Conversation

youngnick
Copy link
Member

@youngnick youngnick commented Sep 22, 2020

This commit adds code (unused as yet) for DAG status updates to emit
Conditions instead of updating the currentStatus and description fields
only. Instead, there is a Valid condition that indicates the same
information, but allows extra details to be added via SubConditions.

To do this, there's a new status package that handles status during
the DAG build, including a cache. This has been built with extensibility
towards ExtensionService and the service-apis types in mind, so we
should be able to add support for emitting Conditions on them all
as well reasonably easily.

After discussion on #2886, I've also locked in the behavior of the
additions to the apis/projectcontour/v1 helpers with tests. I'll
do another pass after I finish out this work and add it to the auth
helpers as well.

Signed-off-by: Nick Young ynick@vmware.com

@youngnick youngnick self-assigned this Sep 22, 2020
@codecov
Copy link

codecov bot commented Sep 22, 2020

Codecov Report

Merging #2931 into main will decrease coverage by 0.17%.
The diff coverage is 56.60%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #2931      +/-   ##
==========================================
- Coverage   75.14%   74.97%   -0.18%     
==========================================
  Files          87       89       +2     
  Lines        5678     5731      +53     
==========================================
+ Hits         4267     4297      +30     
- Misses       1317     1339      +22     
- Partials       94       95       +1     
Impacted Files Coverage Δ
internal/status/cache.go 0.00% <0.00%> (ø)
internal/status/proxystatus.go 88.23% <88.23%> (ø)

@youngnick
Copy link
Member Author

youngnick commented Sep 22, 2020

The intent with this PR is to get my approach in front of everyone as soon as possible; the actual changing of the status will be an enormous PR, because of the changes to tests everywhere. The aim is to validate this approach and give a more reviewable PR before I followup with the actual cutover.

What I'm aiming for here is a relatively small changeover. The new status.Cache is a reasonably direct replacement for the dag.ObjectStatusWriter, and is intended to be included in the DAG in the same way (for now).

I'll change it something like this:

func (b *Builder) Build() *DAG {
	dag := DAG{
-		StatusWriter: StatusWriter{
-			statuses: map[types.NamespacedName]Status{},
-		},
+		StatusCache: status.NewCache(),
	}

	for _, p := range b.Processors {
		p.Run(&dag, &b.Source)
	}
	return &dag
}

Then, the eventhandler can be changed like this:

func (e *EventHandler) rebuildDAG() {
	latestDAG := e.Builder.Build()
	e.Observer.OnChange(latestDAG)

-	select {
-	case <-e.IsLeader:
-		// We're the leader, update resource status.
-		e.setStatus(latestDAG.Statuses())
-	default:
-		e.Debug("skipping metrics and CRD status update, not leader")
-	}
+     for _, su := range dag.StatusCache.GetStatusUpdates() {
+         e.StatusClient.SendStatus(su)
+     }
}

Note that we always just send the updates to the statusclient - it now performs the leader check, so we don't need to do it here as well.

In the processors, we'll be able to change the call to something like this (this is just the first call site I found, we would obviously change the ObjectStatusWriter everywhere):

	for meta := range p.orphaned {
		proxy, ok := p.source.httpproxies[meta]
		if ok {
-			sw, commit := p.dag.WithObject(proxy)
-			sw.WithValue("status", k8s.StatusOrphaned).
-				WithValue("description", "this HTTPProxy is not part of a delegation chain from a root HTTPProxy")
-			commit()
+			pa, commit := p.dag.StatusCache.ProxyAccessor(proxy)
+			pa.ConditionFor(status.ValidCondition).AddError("Orphaned",
+				"Orphaned",
+				"this HTTPProxy is not part of a delegation chain from a root HTTPProxy")
+			commit()
		}
	}

Although reading the change, I just realised I haven't handled orphaned correctly, I'll fix that now. Updated to fix this.

Hopefully this should give enough context to be able to review, even without the actual changes to internal/dag and internal/contour.

@youngnick youngnick force-pushed the status-conditions-prep branch from 4e65cf4 to 9f422c9 Compare September 22, 2020 05:40
@skriss
Copy link
Member

skriss commented Sep 22, 2020

thanks for the context and code samples, very helpful!

@skriss
Copy link
Member

skriss commented Sep 22, 2020

I read through this and the approach looks reasonable to me - about what I expected based on the design & previous discussions. Line-by-line review will have to wait until my tomorrow though.

@youngnick youngnick force-pushed the status-conditions-prep branch from 9f422c9 to 3136500 Compare September 23, 2020 04:42
Copy link
Member

@stevesloka stevesloka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Kubebuilder validations: Should we worry about the limits on sizes of messages, etc. I ask because nothing stops a developer from creating a message that's too big for the field, to have the write fail to the api server and then nothing will ever update.
  • How does something get removed from error list or warning, etc? Right now it's added based upon the type.

apis/projectcontour/v1/helpers.go Outdated Show resolved Hide resolved
// condition like `Valid` or `Ready`, and false otherwise.
func (dc *DetailedCondition) IsPositivePolarity() bool {

switch dc.Type {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This switch contradicts the comments on how the Status can be true or false:

// `Valid`, `status: false` means that the object has had one or more fatal errors during processing into Contour.
//  The details of the errors will be present under the `errors` field.

Copy link
Member Author

@youngnick youngnick Sep 24, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, this is indicating if the Condition is a positive or negative polarity.

In this case, "Valid" is a positive-polarity condition, meaning that if it has Status: true, then there is no problem, and Status: false means there is a problem.

Other Conditions are expected to have the negative polarity, in that case Status: true means there is a problem, Status: false means that there is no problem. Negative polarity conditions are effectively error conditions, like InsufficientDiskSpace or something.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The method returns true if you pass it Valid otherwise it returns false. How then do you have a condition of Valid with status of false as described in the API comment noted above? You'll also have status true.

Is the comment incorrect?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where this method is used is in the AddError method. It's used to determine what to set the condition status to - for 'Valid', it's true, which means that the status is set to false when an error is added. If the condition is not positive polarity, then the status is set to true when an error is added.

This is not setting the status directly, but is being used when an error is added to determine what the status should be set to.

apis/projectcontour/v1/helpers.go Outdated Show resolved Hide resolved
apis/projectcontour/v1/helpers.go Outdated Show resolved Hide resolved
apis/projectcontour/v1/helpers.go Show resolved Hide resolved
apis/projectcontour/v1/helpers.go Outdated Show resolved Hide resolved
internal/status/cache.go Show resolved Hide resolved

return proxy
default:
panic(fmt.Sprintf("Unsupported object %s/%s in status Address mutator",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

panic seems a bit aggressive here to restart Contour if an unsupported type is found.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This package controls the mutator, and it's should only ever be passed HTTPProxy objects. If we get to this code, something has gone seriously wrong. That's why the panic.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems lame to panic and break someone's application. Should a non-HTTPProxy object get here, log it and move on. There are other places that we panic as well for similar conditions. To me feels like ignoring that case is fine since it would have no affect anyway.

internal/status/cache.go Show resolved Hide resolved
internal/status/cache.go Show resolved Hide resolved
const ValidCondition ConditionType = "Valid"

// ProxyUpdate holds status updates for a particular HTTPProxy object
type ProxyUpdate struct {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it'd be nice to have some tests for this type

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, I wanted to nail down if I was on the right track first. I'll make sure they're added before merging.

apis/projectcontour/v1/helpers.go Show resolved Hide resolved
// the DetailedCondition reason by the errorType because we're
// adding SubConditions; DetailedCondition Reason is a summary of the
// SubCondition reasons.
detailedReason := errorType + reason
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we put a : or some kind of separator between them?

Copy link
Member Author

@youngnick youngnick Sep 24, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Absolutely, I should have done that.
Actually, part of the Reason contract is that it's supposed to be a CamelCase, alpha-only field. The idea here is that the first error subcondition will update the containing DetailedCondition to its' TypeReason combo (so something like VhostErrorNoFQDN or something), and subsequent ones will update to MultipleReasons and tell you to look at the Errors and Warnings for more info.

This plainly needs more explanation.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For this to work well, the strings need to be careful chosen. So we should have constants for all the error types and reasons, and type alias to make it easier to track where the strings are used.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added a constant for the Valid type, the only type we are currently using. Changing the type of the DetailedCondition Type field is not available to us because we are inlining (what will be) the upstream Condition type.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still on the fence about this DetailedCondition API. I think that it's quite specific to the Valid condition and I can see how it will work OK there, but I'm not really convinced that this logic belongs in the general DetailedCondition.

We are hard up on the release date, so what if we move all this code into the status package as ValidCondition type something like this:

type ValidCondition interface {
    AddError()
    AddWarning()

    AsDetailedCondition() *projectcontour.DetailedCondition
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I thought some more and how about we keep AddWarning/AddError on the DetailedCondition, but we move the rest of the logic to the status package, where we can apply the "Valid" conditions transformation based on the known condition types:

func TransformValidCondition(c *DetailedCondition) {
 assert(c.Type == "Valid")

  switch len(c.Errors) {
  case 0:
    c.Status = ConditionTrue
  case 1:
    c.Status = ConditionFalse
    c.Reason = c.Errors[0].Reason
    c.Message = rollup(c.Errors)
  default:
    c.Status = ConditionFalse
    c.Reason = "MultipleErrors"
    c.Message = rollup(c.Errors)
  }
}

🤷

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I'll move it all around then. I feel like all that it means is that there will be zero consistency across the use of Subconditions in various places, but whatever.

apis/projectcontour/v1/helpers_test.go Outdated Show resolved Hide resolved
@youngnick youngnick force-pushed the status-conditions-prep branch from 3136500 to 259a4a2 Compare September 25, 2020 01:32
@youngnick
Copy link
Member Author

youngnick commented Sep 25, 2020

So, my todo list after this round of comments is:

  • explain the way that SubConditions affect DetailedConditions in a more clear way. This is currently confusing.
  • Put a limit on the size of the message fields everywhere, I will simply truncate the fields whenever I access them if they're too long. If we're writing 32k characters into a string, then we have other problems.
  • Add specific tests for status.ProxyStatus.
  • Add updating the DetailedCondition ObservedGeneration into the StatusMutatorFunc.

Copy link
Member

@skriss skriss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no more comments from me, LGTM

@youngnick youngnick force-pushed the status-conditions-prep branch from 666f0c3 to ccef4ea Compare September 28, 2020 01:33
Copy link
Contributor

@jpeach jpeach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have a few comment, but overall the API looks like it will work pretty well.

internal/status/doc.go Outdated Show resolved Hide resolved

// ProxyUpdate holds status updates for a particular HTTPProxy object
type ProxyUpdate struct {
Object projcontour.HTTPProxy
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is a shallow copy what we want here? Since the API allocates an update with ProxyAccessor, you could just sample the fields (namespacedname, observed generation) that you need here.

internal/status/cache.go Outdated Show resolved Hide resolved
internal/status/cache.go Outdated Show resolved Hide resolved
internal/status/cache.go Outdated Show resolved Hide resolved
apis/projectcontour/v1/helpers.go Outdated Show resolved Hide resolved
internal/k8s/statusupdater.go Outdated Show resolved Hide resolved
// Set the ObservedGeneration correctly
cond.ObservedGeneration = currentGeneration

condIndex := projectcontour.GetConditionIndex(string(condType), proxy.Status.Conditions)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could eliminate the indexing with a helper that matches the ProxyUpdate pattern:

c := proxy.Status.ConditionFor(condType)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is done now, much cleaner, thanks.

internal/status/proxystatus.go Outdated Show resolved Hide resolved
internal/status/proxystatus.go Outdated Show resolved Hide resolved
@youngnick youngnick force-pushed the status-conditions-prep branch 3 times, most recently from 69f2375 to 1cb1356 Compare September 28, 2020 11:44
if dc.Message != message {
// Only add the message if it's not already in there somewhere.
if !strings.Contains(dc.Message, message) {
dc.Message = dc.Message + ", " + message
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can end up blowing the max message length validation constraint.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is going to move to status anyway; so I'll truncate the message there.

type ConditionType string

// ValidCondition is the ConditionType for Valid.
const ValidCondition ConditionType = "Valid"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
const ValidCondition ConditionType = "Valid"
const ValidCondition ConditionType = projectcontour.ValidConditionType

Or maybe use projectcontour.ValidConditionType directly?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll change all the signatures to use the projectcontour type then

dc, ok := pu.Conditions[cond]
if !ok {
newDc := &projectcontour.DetailedCondition{}
newDc.Type = string(cond)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

newDc := &projectcontour.DetailedCondition{
        Type: string(cond),
}

pu.Conditions[cond] = newDc
return newDc
}
return dc
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we even need to switch on ValidCondition here? How bad would it be if we just returned a projectcontour.DetailedCondition for whatever condition type the caller asked for?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, we don't, but you asked for this change earlier... I'll revert.

ObjectMeta: v1.ObjectMeta{
Name: "test",
Namespace: "test",
Generation: testGeneration,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Generation field is never used by the proxy mutation.

If you drop it from here, you can use something like fixture.NewProxy("test/test").WithSpec(projectcontour.HTTPProxySpec{}).

Alternatively, maybe we should be dropping the status update if the generation is stale?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is used by the proxy mutation, that's what sets the top-level ObservedGeneration. I think it's fine to apply the update anyway, as it means that if the object has changed in the meantime, this reflects when the status was generated.

// the DetailedCondition reason by the errorType because we're
// adding SubConditions; DetailedCondition Reason is a summary of the
// SubCondition reasons.
detailedReason := errorType + reason
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still on the fence about this DetailedCondition API. I think that it's quite specific to the Valid condition and I can see how it will work OK there, but I'm not really convinced that this logic belongs in the general DetailedCondition.

We are hard up on the release date, so what if we move all this code into the status package as ValidCondition type something like this:

type ValidCondition interface {
    AddError()
    AddWarning()

    AsDetailedCondition() *projectcontour.DetailedCondition
}

message = truncateLongMessage(message)

detailedReason := warnType + reason
dc.updateReason(detailedReason, message)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does adding a warning update the top-level condition? If there are warnings, then I don't think that we would rewrite the top-level reason (at least in the case of the Valid condition).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fine.

// the DetailedCondition reason by the errorType because we're
// adding SubConditions; DetailedCondition Reason is a summary of the
// SubCondition reasons.
detailedReason := errorType + reason
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I thought some more and how about we keep AddWarning/AddError on the DetailedCondition, but we move the rest of the logic to the status package, where we can apply the "Valid" conditions transformation based on the known condition types:

func TransformValidCondition(c *DetailedCondition) {
 assert(c.Type == "Valid")

  switch len(c.Errors) {
  case 0:
    c.Status = ConditionTrue
  case 1:
    c.Status = ConditionFalse
    c.Reason = c.Errors[0].Reason
    c.Message = rollup(c.Errors)
  default:
    c.Status = ConditionFalse
    c.Reason = "MultipleErrors"
    c.Message = rollup(c.Errors)
  }
}

🤷

{
Condition: projectcontour.Condition{
Type: string(ValidCondition),
Status: projectcontour.ConditionTrue,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reason is a required field ... what should it be when the condition is true?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"valid", I'll update.

}

if sc.Reason != reason {
sc.Reason = "MultipleReasons"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How can a subcondition have multiple reasons?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the same Type is used for multiple reasons. For example, if there are conflicting TLS features enabled, with the secret referred to not existing. Those are currently both Errors of Type TLSError, I think.

I felt that the idea of having a Type field as well as a Reason field is to give you a general type of thing, and then a reason why that's the case. The Message is a more readable version of the Reason.

I agree that this situation won't currently arise, as we short-circuit. I was planning on reevaluating this as part of the refactor, but I'd like to have updated the tests before I start messing with it.

}

if !strings.Contains(sc.Message, message) {
sc.Message = sc.Message + ", " + message
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can blow the validation max length.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we keep this, I'll put a truncate afterwards then.

@youngnick youngnick force-pushed the status-conditions-prep branch from 86b1437 to 9de3dc7 Compare September 29, 2020 11:23
Copy link
Contributor

@jpeach jpeach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks pretty good. I think that both the internal and external APIs are sound. There's a few nits that could be cleaned up, but we can take care of those on follow-up PRs.

Copy link
Member

@stevesloka stevesloka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with just a question on this comment: #2931 (comment)

Copy link
Member

@skriss skriss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚢

This commit adds code for DAG status updates to emit Conditions
instead of updating the `currentStatus` and `description` fields
only. Instead, there is a `Valid` condition that indicates the same
information, but allows extra details to be added via SubConditions.

To do this, there's a new `status` package that handles status during
the DAG build, including a cache. This has been built with extensibility
towards ExtensionService and the service-apis types in mind, so we
should be able to add support for emitting Conditions on them all
as well reasonably easily.

After discussion on projectcontour#2886, I've also locked in the behavior of the
additions to the `apis/projectcontour/v1` helpers with tests. I'll
do another pass after I finish out this work and add it to the auth
helpers as well.

Signed-off-by: Nick Young <ynick@vmware.com>
Nick Young added 6 commits September 29, 2020 23:22
- Better explanation of how SubConditions affect DetailedConditions
- Truncate long messages
- Test the rest of the added functions in `apis/projectcontour/v1/helpers_test.go`.

Signed-off-by: Nick Young <ynick@vmware.com>
Signed-off-by: Nick Young <ynick@vmware.com>
Signed-off-by: Nick Young <ynick@vmware.com>
- get rid of GetConditionIndex as a public API, replaced with `GetConditionFor` on HTTPProxyStatus.
- ProxyUpdate updated to only hold the relevant fields from the HTTPProxy it's for.

Signed-off-by: Nick Young <ynick@vmware.com>
Signed-off-by: Nick Young <ynick@vmware.com>
Signed-off-by: Nick Young <ynick@vmware.com>
@youngnick youngnick force-pushed the status-conditions-prep branch from 9de3dc7 to 02b72bc Compare September 29, 2020 23:22
@youngnick youngnick merged commit 9a2890c into projectcontour:main Sep 29, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants