apiserver errors should include reason #50622

Open
discordianfish opened this issue Aug 14, 2017 · 19 comments
Labels
area/api Indicates an issue on api area. area/apiserver kind/feature Categorizes issue or PR as related to a new feature. lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery.

Comments

@discordianfish
Contributor

/kind feature

What happened:
I spent too much time figuring out that my apiserver didn't start because the etcd DNS name couldn't be resolved. I got the following error:

Error: error waiting for etcd connection: timed out waiting for the condition

Even --v=10 didn't provide any more context. This was very misleading and made me assume the issue was at the application level.

What you expected to happen:
I expected the apiserver error to tell me that DNS resolution wasn't working.

How to reproduce it (as minimally and precisely as possible):
I used the bootkube manifest, but simply starting the apiserver with --etcd-servers pointing to a nonexistent name should be enough.

@k8s-ci-robot k8s-ci-robot added the kind/feature Categorizes issue or PR as related to a new feature. label Aug 14, 2017
@k8s-github-robot k8s-github-robot added the needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. label Aug 14, 2017
@discordianfish
Contributor Author

@kubernetes/sig-cluster-ops

@k8s-ci-robot k8s-ci-robot added the area/api Indicates an issue on api area. label Aug 14, 2017
@zhangxiaoyu-zidif
Contributor

/area apiserver

@xiangpengzhao
Contributor

maybe
/sig api-machinery
also.

@k8s-ci-robot k8s-ci-robot added the sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. label Aug 17, 2017
@k8s-github-robot k8s-github-robot removed the needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. label Aug 17, 2017
@mml
Contributor

mml commented Aug 17, 2017

/assign @jpbetz cc @mml

@k8s-ci-robot
Contributor

@mml: GitHub didn't allow me to assign the following users: jpbetz, cc.

Note that only kubernetes members can be assigned.

In response to this:

/assign @jpbetz cc @mml

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@Cryptophobia

Cryptophobia commented Aug 28, 2017

@discordianfish: How did you ultimately fix this issue? We are getting the same message when deploying with kops. We think it's a DNS issue, but we keep getting this error in the kube-apiserver.log file.

@discordianfish
Contributor Author

@Cryptophobia It was a "special" setup, but in general you can verify whether it's a DNS issue by using nslookup or dig.
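
The same check that nslookup or dig performs can be sketched in Go: extract the host names from an --etcd-servers style list and try to resolve each one. The helper name `etcdHosts` and the server URL are placeholders made up for this example, and the sketch assumes every entry is a full URL with a scheme.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"net/url"
	"strings"
	"time"
)

// etcdHosts extracts the host names from a comma-separated list of
// URLs, as passed to --etcd-servers. (Hypothetical helper.)
func etcdHosts(servers string) ([]string, error) {
	var hosts []string
	for _, s := range strings.Split(servers, ",") {
		u, err := url.Parse(strings.TrimSpace(s))
		if err != nil {
			return nil, fmt.Errorf("parsing %q: %w", s, err)
		}
		if u.Hostname() == "" {
			return nil, fmt.Errorf("no host in %q", s)
		}
		hosts = append(hosts, u.Hostname())
	}
	return hosts, nil
}

func main() {
	// Placeholder value; substitute your own --etcd-servers setting.
	hosts, err := etcdHosts("https://etcd.example.invalid:2379")
	if err != nil {
		fmt.Println("bad server list:", err)
		return
	}

	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()

	for _, h := range hosts {
		// Resolve each host with the system resolver and report the result.
		addrs, err := net.DefaultResolver.LookupHost(ctx, h)
		if err != nil {
			fmt.Printf("%s: DNS lookup failed: %v\n", h, err)
			continue
		}
		fmt.Printf("%s resolves to %v\n", h, addrs)
	}
}
```

Run it from the same host or container as the apiserver, since DNS configuration can differ between environments.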

@Cryptophobia

Cryptophobia commented Aug 29, 2017

@discordianfish: It turns out instanceGroups are very important for configuring Route53 DNS and etcd during cluster configuration with kops. If those instanceGroups are not defined correctly (particularly when there is an even number of masters, or multiple groups of masters in one availability zone of a single region), the etcd servers will not start and the master nodes will not check in. This is not well documented in the kops documentation.

@discordianfish
Contributor Author

@Cryptophobia Ah, you should file an issue with kops then. This issue here is mostly about saving time by pointing you in the right direction.

@Cryptophobia

Okay, I'll add another issue to the 784 issues already open. 😆 👍

@ntfrnzn
Contributor

ntfrnzn commented Sep 27, 2017

I'm commenting here only to give my +1 to the underlying issue, that the error message emitted by the apiserver is not sufficient to begin debugging related problems.

I'm seeing the same error message, although my experimental configuration is quite different. It would be much nicer if the apiserver told me clearly why it cannot talk to an etcd cluster, where it thinks that etcd cluster is, and so on.

@Cryptophobia

Agreed. We need better error messages, with more error details propagated up to the API-layer errors, so we know what the Kubernetes apiserver is actually trying to do.

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

Prevent issues from auto-closing with an /lifecycle frozen comment.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or @fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 11, 2018
@discordianfish
Contributor Author

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 11, 2018
@Cryptophobia

/lifecycle frozen

@k8s-ci-robot k8s-ci-robot added the lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. label Jan 11, 2018
@Cryptophobia

This would be a really nice feature. I wish I knew more Go so I could help out.

@sttts sttts added the help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. label Mar 1, 2018
@jordy25519

I've opened a PR for this. It could use a review.

@nikhita
Member

nikhita commented Jun 13, 2018

I've opened a PR for this.

@Holygits thanks!!

I'll remove the help-wanted label on this. :)

/remove-help

@k8s-ci-robot k8s-ci-robot removed the help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. label Jun 13, 2018
@mml mml removed their assignment Nov 27, 2023
@aaronchall

Will this issue ever be addressed?
