-
Notifications
You must be signed in to change notification settings - Fork 43
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add statusz endpoint with basic diagnostic information. #131
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! PTAL.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The build is failing. Can you fix that so we can detect other errors?
// | ||
// Some fields are not part of this table; for example, metricRenames | ||
// and staticMetadata are displayed separately. | ||
func (c *mainConfig) TableForStatusz() map[string]string { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this method need to be exported? If not, start with a lowercase letter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Removed.)
// endpoint. | ||
// | ||
// Some fields are not part of this table; for example, metricRenames | ||
// and staticMetadata are displayed separately. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you document how to decide what goes here and what doesn't? If there is a public design, we should link to that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Per other comments, I removed this function and just referenced the config directly from the template.
I think all config fields should display on the /statusz page as long as there is some meaningful value (it now displays everything but manualResolver).
We don't have a public design yet.
@@ -255,6 +283,9 @@ func main() { | |||
a.Flag("web.listen-address", "Address to listen on for UI, API, and telemetry."). | |||
Default("0.0.0.0:9091").StringVar(&cfg.listenAddress) | |||
|
|||
a.Flag("web.enable-statusz", "If true, then enables a /statusz endpoint on the web server with diagnostic information."). | |||
Default("false").BoolVar(&cfg.enableStatusz) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the thinking for not leaving this on by default or always?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed to on by default. But yeah, maybe we should just omit the option entirely.
I don't have a strong feeling either way -- what do you suggest?
} | ||
} | ||
|
||
var statuszTmpl = template.Must(template.New("").Parse(` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you extract this to a file? It'll be more convenient to maintain.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
data.GKEInfo.ClusterLocation = h.cfg.kubernetesLabels.location | ||
data.GKEInfo.ClusterName = h.cfg.kubernetesLabels.clusterName | ||
|
||
data.ConfigTable = h.cfg.TableForStatusz() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have you considered using the main config struct as-is, and handling all the presentation in the template? Then we could remove the TableForStatusz method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea, done.
data.Uname = Uname() | ||
data.FdLimits = FdLimits() | ||
|
||
data.StartTime = serverStart.Format(time.RFC1123) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should use the newer RFC 3339 unless backward compatibility is needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
data.StartTime = serverStart.Format(time.RFC1123) | ||
uptime := int64(time.Since(serverStart).Seconds()) | ||
data.Uptime = fmt.Sprintf("%d hr %02d min %02d sec", | ||
uptime/3600, (uptime/60)%60, uptime%60) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you use time.Duration's default string conversion?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
kube/patch.sh
Outdated
- name: POD_NAMESPACE | ||
valueFrom: | ||
fieldRef: | ||
fieldPath: metadata.namespace |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will go straight into our Getting Started docs and I'm concerned about clutter. What do you think about adding this to the updating the configuration section instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aha. I had put this in here so that I could use deploy.sh on a test cluster. Is there a better way make that work?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about creating a copy the file for docs? Alternatively you could ask one of our tech writers to give a thumbs up on the larger snippet. I'd do it outside this PR, so you can submit the rest as-is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, removed these for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! PTAL.
data.Uname = Uname() | ||
data.FdLimits = FdLimits() | ||
|
||
data.StartTime = serverStart.Format(time.RFC1123) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
data.StartTime = serverStart.Format(time.RFC1123) | ||
uptime := int64(time.Since(serverStart).Seconds()) | ||
data.Uptime = fmt.Sprintf("%d hr %02d min %02d sec", | ||
uptime/3600, (uptime/60)%60, uptime%60) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
} | ||
} | ||
|
||
var statuszTmpl = template.Must(template.New("").Parse(` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
@@ -255,6 +283,9 @@ func main() { | |||
a.Flag("web.listen-address", "Address to listen on for UI, API, and telemetry."). | |||
Default("0.0.0.0:9091").StringVar(&cfg.listenAddress) | |||
|
|||
a.Flag("web.enable-statusz", "If true, then enables a /statusz endpoint on the web server with diagnostic information."). | |||
Default("false").BoolVar(&cfg.enableStatusz) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed to on by default. But yeah, maybe we should just omit the option entirely.
I don't have a strong feeling either way -- what do you suggest?
// endpoint. | ||
// | ||
// Some fields are not part of this table; for example, metricRenames | ||
// and staticMetadata are displayed separately. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Per other comments, I removed this function and just referenced the config directly from the template.
I think all config fields should display on the /statusz page as long as there is some meaningful value (it now displays everything but manualResolver).
We don't have a public design yet.
// | ||
// Some fields are not part of this table; for example, metricRenames | ||
// and staticMetadata are displayed separately. | ||
func (c *mainConfig) TableForStatusz() map[string]string { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Removed.)
data.GKEInfo.ClusterLocation = h.cfg.kubernetesLabels.location | ||
data.GKEInfo.ClusterName = h.cfg.kubernetesLabels.clusterName | ||
|
||
data.ConfigTable = h.cfg.TableForStatusz() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea, done.
kube/patch.sh
Outdated
- name: POD_NAMESPACE | ||
valueFrom: | ||
fieldRef: | ||
fieldPath: metadata.namespace |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aha. I had put this in here so that I could use deploy.sh on a test cluster. Is there a better way make that work?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Really nice. Thanks, Dave!
data.Uptime = fmt.Sprintf("%d hr %02d min %02d sec", | ||
uptime/3600, (uptime/60)%60, uptime%60) | ||
data.StartTime = serverStart.Format(time.RFC3339) | ||
data.Uptime = time.Since(serverStart).String() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit, I'm guessing the template system will happily accept the time and duration objects, and may even provider better formatting options. It's fine if you decide to keep this as-is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
kube/patch.sh
Outdated
- name: POD_NAMESPACE | ||
valueFrom: | ||
fieldRef: | ||
fieldPath: metadata.namespace |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about creating a copy the file for docs? Alternatively you could ask one of our tech writers to give a thumbs up on the larger snippet. I'd do it outside this PR, so you can submit the rest as-is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks.
data.Uptime = fmt.Sprintf("%d hr %02d min %02d sec", | ||
uptime/3600, (uptime/60)%60, uptime%60) | ||
data.StartTime = serverStart.Format(time.RFC3339) | ||
data.Uptime = time.Since(serverStart).String() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
kube/patch.sh
Outdated
- name: POD_NAMESPACE | ||
valueFrom: | ||
fieldRef: | ||
fieldPath: metadata.namespace |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, removed these for now.
48745ff
to
aecdfe0
Compare
No description provided.