better error handling when working with kube and kudo. #1097

Merged
kensipe merged 1 commit into master from ken/better-connection-errors
Nov 26, 2019

Conversation

Member

@kensipe kensipe commented Nov 21, 2019

What this PR does / why we need it:
We can NOT wrap the timeout error associated with a bad kubeconfig. It is a private error that can be interrogated to determine that it was a timeout; however, that check no longer works once the error is wrapped.

When the cluster is down or not started, DNS requests from Go fail on line 142 of https://golang.org/src/net/dial.go with an error that is private (internal). It should not be wrapped, because the timeout check cannot see through the wrapper.
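
To make the problem concrete, here is a minimal sketch of the behavior described above. The fakeTimeout type is a hypothetical stand-in for the private error from the net package (it is not part of this PR); the only real API calls are os.IsTimeout and fmt.Errorf.

```go
package main

import (
	"fmt"
	"os"
)

// fakeTimeout is a hypothetical stand-in for the private timeout error
// returned by the net package; all it offers is a Timeout() method.
type fakeTimeout struct{}

func (fakeTimeout) Error() string { return "i/o timeout" }
func (fakeTimeout) Timeout() bool { return true }

// detect reports whether os.IsTimeout recognizes err as-is, and whether it
// still does after the error has been wrapped with fmt.Errorf and %w.
func detect(err error) (raw, wrapped bool) {
	raw = os.IsTimeout(err)
	wrapped = os.IsTimeout(fmt.Errorf("operators crd: %w", err))
	return raw, wrapped
}

func main() {
	raw, wrapped := detect(fakeTimeout{})
	fmt.Println("raw:", raw, "wrapped:", wrapped) // prints "raw: true wrapped: false"
}
```

os.IsTimeout only inspects the error it is handed (plus a few os-specific wrapper types), so the wrapped value no longer reports as a timeout.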

Fixes #
This is part 1 of a 2-part solution towards fixing bug #1056

Member

@gerred gerred left a comment

always a fan of getting rid of dependencies

@kensipe
Member Author

kensipe commented Nov 21, 2019

Oh... I meant to mention that I was looking to get rid of "github.com/pkg/errors" :)

return nil, errors.WithMessage(err, "operators")
// timeout is not a wrappable error, timeout is an underlying issue that is NOT CRD specific, there is no value in wrapping or converting as well.
// best to provide the actual error for proper reporting.
if os.IsTimeout(err) {
Contributor

I understand that we don't wrap the timeout error but to what end? Where do you do anything with it?

Member Author

The intention is to do something with it :). I was wrongly of the opinion that this was such a simple and not debatable change that I could land this quickly and move on to using it.

if os.IsTimeout(err) {
return nil, err
}
return nil, fmt.Errorf("operators crd: %w", err)
Contributor

Since you're at it: I'm not sure what "operators crd:" adds to the error message. Maybe something like:

Suggested change
- return nil, fmt.Errorf("operators crd: %w", err)
+ return nil, fmt.Errorf("failed to fetch operators.kudo.dev CRD: %w", err)

?

Member Author

This change request is outside the scope of the actual change. The message that you want to change was there originally. Let's focus on the change.

Contributor

@zen-dog zen-dog left a comment

It's unclear as to why this change is made

@alenkacz
Contributor

It's unclear as to why this change is made

Yeah, I struggled with that as well and I would like to understand. Could you maybe explain, @kensipe, and also how it connects to the linked issue? I don't have anything against the change per se, but I don't understand the motivation...

@kensipe
Member Author

kensipe commented Nov 25, 2019

@zen-dog This change was made because the previous code wrapped the error using github.com/pkg/errors, as in return nil, errors.WithMessage(err, "operators")
This made it impossible to know whether the error was a timeout. There is value in knowing whether the error occurred because 1) the CRD doesn't exist or 2) you aren't connected to a cluster. That distinction was lost with the previous code.
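
A rough sketch of what passing the timeout through unwrapped makes possible for a caller. Everything named here (timeoutErr, errNotFound, checkCRD, explain) is hypothetical and only mirrors the shape of the change; it is not the actual KUDO code.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// timeoutErr is a hypothetical stand-in for the private timeout error
// produced when no cluster is reachable.
type timeoutErr struct{}

func (timeoutErr) Error() string { return "dial tcp: i/o timeout" }
func (timeoutErr) Timeout() bool { return true }

// errNotFound is a hypothetical error for a CRD that is not installed.
var errNotFound = errors.New("not found")

// checkCRD mirrors the shape of the change: timeouts pass through
// untouched, every other error is wrapped with some context.
func checkCRD(lookup func() error) error {
	if err := lookup(); err != nil {
		if os.IsTimeout(err) {
			return err
		}
		return fmt.Errorf("operators crd: %w", err)
	}
	return nil
}

// explain turns the result into a user-facing hint.
func explain(err error) string {
	switch {
	case err == nil:
		return "ok"
	case os.IsTimeout(err):
		return "cannot reach the cluster"
	default:
		return "CRD problem: " + err.Error()
	}
}

func main() {
	fmt.Println(explain(checkCRD(func() error { return timeoutErr{} })))
	fmt.Println(explain(checkCRD(func() error { return errNotFound })))
}
```

Because the timeout is returned as-is, the caller can still tell "not connected" apart from "CRD missing".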

Making matters worse, Go core has 2 kinds of timeouts: one around context, another based on an internal error. The common solution for detecting a timeout is os.IsTimeout(); however, this is a lackluster approach that is limited to the set of errors defined in the core (our wrapping isn't detected). The context is lost.

A fix was proposed for 1.13. Russ Cox indicates in golang/go#31449 (comment) that it is targeted for 1.14.

The best summary is: golang/go#30322

Further reading:
golang/go#32735
golang/go#31449
golang/go#33411

@zen-dog zen-dog dismissed their stale review November 26, 2019 11:22

I guess it doesn't harm so ¯_(ツ)_/¯

@kensipe kensipe merged commit 912bfd4 into master Nov 26, 2019
@kensipe kensipe deleted the ken/better-connection-errors branch November 26, 2019 11:27