-
Notifications
You must be signed in to change notification settings - Fork 162
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
BR: metrics rework #3176
BR: metrics rework #3176
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 25 of 25 files at r1.
Reviewable status: all files reviewed, 25 unresolved discussions (waiting on @kormat and @sgmonroy)
go/border/io.go, line 126 at r1 (raw file):
rp.Raw = rp.Raw[:msg.N] // Set the length of the slice rp.Ingress.Dst = dst rp.Ingress.IfLabel = s.Label
where do you we use that?
go/border/router.go, line 188 at r1 (raw file):
l.Result = metrics.ErrRoute metrics.Process.PktsWith(l).Inc() }
Do we need also here to increase metric with success?
go/border/ifstate/ifstate.go, line 62 at r1 (raw file):
type state struct { // info is a pointer to an Info object. // XXX(sgmonroy) why is this an unsafe pointer?
Is that left on purpose? Follow up story?
go/border/ifstate/ifstate.go, line 84 at r1 (raw file):
SRevInfo: srev, RawSRev: rawSRev, ActiveMetric: metrics.Control.IFStateWith(label),
I cannot find where we use the ActiveMetric (apart from some lines below).
go/border/ifstate/ifstate.go, line 100 at r1 (raw file):
var rawSRev common.RawBytes ifid := common.IFIDType(info.IfID) // XXX Why accept unsigned ifstate infos?
Left on purpose?
go/border/internal/metrics/ctrl.go, line 31 at r1 (raw file):
) type ControlLabels struct {
To tested as in
https://github.com/scionproto/scion/blob/99f78f029d55c277b1551052a13136ecda8367e0/go/path_srv/internal/metrics/registration_test.go
go/border/internal/metrics/ctrl.go, line 35 at r1 (raw file):
Type string Src string Dst string
Result,Type,Src,Dst string
is slightly more golang way
go/border/internal/metrics/ctrl.go, line 38 at r1 (raw file):
} func (l *ControlLabels) Labels() []string {
no pointer
go/border/internal/metrics/ctrl.go, line 63 at r1 (raw file):
} func (c *control) PktsWith(l ControlLabels) prometheus.Counter {
comments as usual
go/border/internal/metrics/input.go, line 39 at r1 (raw file):
sub := "input" intf := IntfLabels{} l := intf.Labels()
one line?
go/border/internal/metrics/input.go, line 62 at r1 (raw file):
} func (in *input) PktsWith(l IntfLabels) prometheus.Counter {
comments as usual
go/border/internal/metrics/metrics.go, line 30 at r1 (raw file):
// Result values. const (
Best practice would be to document/comment all the exported vars
go/border/internal/metrics/metrics.go, line 53 at r1 (raw file):
) type IntfLabels struct {
this can be tested as in https://github.com/scionproto/scion/blob/99f78f029d55c277b1551052a13136ecda8367e0/go/path_srv/internal/metrics/registration_test.go
go/border/internal/metrics/metrics.go, line 56 at r1 (raw file):
Intf string }
Comment is missing , see https://github.com/scionproto/scion/blob/master/doc/Metrics.md for the generic description
go/border/internal/metrics/metrics.go, line 57 at r1 (raw file):
} func (l *IntfLabels) Labels() []string {
no pointer needed
go/border/internal/metrics/metrics.go, line 61 at r1 (raw file):
} func (l *IntfLabels) Values() []string {
no pointer needed
go/border/internal/metrics/metrics.go, line 70 at r1 (raw file):
} func IntfToLabel(ifid common.IFIDType) string {
If we use IFIDType.String()
and do not have the default loc
, would that be no desirable?
What loc
means?
go/border/internal/metrics/output.go, line 38 at r1 (raw file):
sub := "output" intf := IntfLabels{} l := intf.Labels()
IntfLabels{}.Labels()
, with Labels a method on a value receiver?
go/border/internal/metrics/output.go, line 58 at r1 (raw file):
} func (o *output) PktsWith(l IntfLabels) prometheus.Counter {
Generic comments
go/border/internal/metrics/process.go, line 35 at r1 (raw file):
Out string }
Generic comments so linter will complain less,
go/border/rctrl/ctrl.go, line 90 at r1 (raw file):
cl.Result = metrics.ErrRead metrics.Control.PktsWith(cl).Inc() fatal.Fatal(common.NewBasicError("Reading packet", err))
Confuses me the behavior here, why not just break? (irrelevant with your changes)
go/border/rctrl/ctrl.go, line 93 at r1 (raw file):
} cl.Type = metrics.IFStateInfo cl.Result = metrics.Success
this can be initialized out side the for loop
go/border/rctrl/ifstate.go, line 44 at r1 (raw file):
// startup. func ifStateUpdate() { if err := genIFStateReq(); err != nil {
Do we need to run once manually? the for range is not enough?
go/border/rctx/io.go, line 71 at r1 (raw file):
Dir: dir, Ifid: ifid, Label: metrics.IntfToLabel(ifid),
if you have already the ifid, do we still need to have the Label?
We could always jhave this value?
go/border/rpkt/route.go, line 65 at r1 (raw file):
for _, epair := range rp.Egress { epair.S.Ring.Write(ringbuf.EntryList{&EgressRtrPkt{rp, epair.Dst}}, true) l.Out = epair.S.Label
Here, you could use the S.Ifid.String()
, maybe :)
97de2d4
to
1f15142
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 25 unresolved discussions (waiting on @karampok and @kormat)
go/border/io.go, line 126 at r1 (raw file):
Previously, karampok (Konstantinos) wrote…
where do you we use that?
go/border/rpkt/route.go#L59
go/border/router.go, line 188 at r1 (raw file):
Previously, karampok (Konstantinos) wrote…
Do we need also here to increase metric with success?
go/border/rpkt/route.go#L66
go/border/ifstate/ifstate.go, line 62 at r1 (raw file):
Previously, karampok (Konstantinos) wrote…
Is that left on purpose? Follow up story?
Done.
go/border/ifstate/ifstate.go, line 84 at r1 (raw file):
Previously, karampok (Konstantinos) wrote…
I cannot find where we use the ActiveMetric (apart from some lines below).
Done.
go/border/ifstate/ifstate.go, line 100 at r1 (raw file):
Previously, karampok (Konstantinos) wrote…
Left on purpose?
Done.
go/border/internal/metrics/ctrl.go, line 31 at r1 (raw file):
Previously, karampok (Konstantinos) wrote…
To tested as in
https://github.com/scionproto/scion/blob/99f78f029d55c277b1551052a13136ecda8367e0/go/path_srv/internal/metrics/registration_test.go
Done.
go/border/internal/metrics/ctrl.go, line 35 at r1 (raw file):
Previously, karampok (Konstantinos) wrote…
Result,Type,Src,Dst string
is slightly more golang way
I added comments, so left on separated lines.
go/border/internal/metrics/ctrl.go, line 38 at r1 (raw file):
Previously, karampok (Konstantinos) wrote…
no pointer
Done.
go/border/internal/metrics/ctrl.go, line 63 at r1 (raw file):
Previously, karampok (Konstantinos) wrote…
comments as usual
Done.
go/border/internal/metrics/input.go, line 39 at r1 (raw file):
Previously, karampok (Konstantinos) wrote…
one line?
Done.
go/border/internal/metrics/input.go, line 62 at r1 (raw file):
Previously, karampok (Konstantinos) wrote…
comments as usual
Done.
go/border/internal/metrics/metrics.go, line 30 at r1 (raw file):
Previously, karampok (Konstantinos) wrote…
Best practice would be to document/comment all the exported vars
Done.
go/border/internal/metrics/metrics.go, line 53 at r1 (raw file):
Previously, karampok (Konstantinos) wrote…
this can be tested as in https://github.com/scionproto/scion/blob/99f78f029d55c277b1551052a13136ecda8367e0/go/path_srv/internal/metrics/registration_test.go
Done.
go/border/internal/metrics/metrics.go, line 56 at r1 (raw file):
Previously, karampok (Konstantinos) wrote…
Comment is missing , see https://github.com/scionproto/scion/blob/master/doc/Metrics.md for the generic description
Done
go/border/internal/metrics/metrics.go, line 57 at r1 (raw file):
Previously, karampok (Konstantinos) wrote…
no pointer needed
Done.
go/border/internal/metrics/metrics.go, line 61 at r1 (raw file):
Previously, karampok (Konstantinos) wrote…
no pointer needed
Done.
go/border/internal/metrics/metrics.go, line 70 at r1 (raw file):
Previously, karampok (Konstantinos) wrote…
If we use
IFIDType.String()
and do not have the defaultloc
, would that be no desirable?
Whatloc
means?
You need to ask ops team about that. Existing code tagged that interface with the 'loc' label.
go/border/internal/metrics/output.go, line 38 at r1 (raw file):
Previously, karampok (Konstantinos) wrote…
IntfLabels{}.Labels()
, with Labels a method on a value receiver?
Done.
go/border/internal/metrics/output.go, line 58 at r1 (raw file):
Previously, karampok (Konstantinos) wrote…
Generic comments
Done.
go/border/internal/metrics/process.go, line 35 at r1 (raw file):
Previously, karampok (Konstantinos) wrote…
Generic comments so linter will complain less,
Done.
go/border/rctrl/ctrl.go, line 90 at r1 (raw file):
Previously, karampok (Konstantinos) wrote…
Confuses me the behavior here, why not just break? (irrelevant with your changes)
Logic here is something went wrong and we are likely not able to recover, to quit and restart process.
This just preserves the current behavior. I suggest opening an issue about it if you think it should be changed.
go/border/rctrl/ctrl.go, line 93 at r1 (raw file):
Previously, karampok (Konstantinos) wrote…
this can be initialized out side the for loop
Not the result, it can be set to ErrParse in the below branch.
go/border/rctrl/ifstate.go, line 44 at r1 (raw file):
Previously, karampok (Konstantinos) wrote…
Do we need to run once manually? the for range is not enough?
That was the original code, the tick doesn't trigger straight away.
go/border/rctx/io.go, line 71 at r1 (raw file):
Previously, karampok (Konstantinos) wrote…
if you have already the ifid, do we still need to have the Label?
We could always jhave this value?
you need to do fmt.Sprintf per packet just for the counter
go/border/rpkt/route.go, line 65 at r1 (raw file):
Previously, karampok (Konstantinos) wrote…
Here, you could use the
S.Ifid.String()
, maybe :)
see above
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 19 of 21 files at r2.
Reviewable status: complete! all files reviewed, all discussions resolved (waiting on @kormat)
go/border/rctrl/ctrl.go, line 93 at r1 (raw file):
Previously, sgmonroy (Sergio Gonzalez Monroy) wrote…
Not the result, it can be set to ErrParse in the below branch.
OK
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 25 files at r1, 1 of 21 files at r2.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @sgmonroy)
a discussion (no related file):
border_control_pkts_total{dst="1-ff00:0:110,[BS M (0x8000)]:0 (UDP)",result="success",src="self",type="ifstate_request"} 3
Don't put host addresses into label values
a discussion (no related file):
Total number of output bytes received.
That should probably be "sent"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @kormat)
a discussion (no related file):
Previously, kormat (Stephen Shirley) wrote…
Total number of output bytes received.
That should probably be "sent"
Done.
a discussion (no related file):
Previously, kormat (Stephen Shirley) wrote…
border_control_pkts_total{dst="1-ff00:0:110,[BS M (0x8000)]:0 (UDP)",result="success",src="self",type="ifstate_request"} 3
Don't put host addresses into label values
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 2 of 21 files at r2.
Reviewable status: 18 of 26 files reviewed, 8 unresolved discussions (waiting on @karampok, @kormat, and @sgmonroy)
a discussion (no related file):
It occurs to me that it would be really useful to add the remote IA as a label to all input/output metrics. neigh_ia
is the label name that we're using in #3167. For packets sent/received on the local socket, the label can be the empty string.
go/border/internal/metrics/metrics.go, line 48 at r2 (raw file):
) const Self = "self"
I'd group this with Namespace
above.
go/border/internal/metrics/metrics.go, line 59 at r2 (raw file):
type IntfLabels struct { // Itnf in the interface ID
"Intf is the interface ID"
go/border/internal/metrics/metrics.go, line 77 at r2 (raw file):
return "loc" } return fmt.Sprintf("%d", ifid)
You can use ifid.String()
go/border/internal/metrics/output.go, line 40 at r2 (raw file):
return output{ pkts: prom.NewCounterVec(Namespace, sub, "pkts_total", "Total number of output packets received.", l),
Looking at this again, the wording can be simplified and improved: "Total number of packets sent" (and "Total number of packets received" for the input equivelent).
go/border/internal/metrics/output.go, line 51 at r2 (raw file):
writeErrors: prom.NewCounterVec(Namespace, sub, "write_errors_total", "Total number of output socket write errors.", l), latency: prom.NewCounterVec(Namespace, sub,
I think this is a copy/paste error. Input has "latency" which measures how long packets wait in the kernel buffers before the BR reads them. Output has write duration; how long it takes to write packets back to the kernel. These are fundamentally different measurements.
(I do see that the old code calls the var OutputWriteLatency
, which was most likely a copy/paste error on my side)
So:
- Rename the var to
duration
- the metric to
output write_duration_seconds_total
- and the description to "Total time spent writing packets to the kernel, in seconds."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 8 files at r3.
Reviewable status: 19 of 26 files reviewed, 7 unresolved discussions (waiting on @karampok, @kormat, and @sgmonroy)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 8 files at r3.
Reviewable status: 20 of 26 files reviewed, 6 unresolved discussions (waiting on @karampok, @kormat, and @sgmonroy)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 21 files at r2.
Reviewable status: 20 of 26 files reviewed, 6 unresolved discussions (waiting on @karampok, @kormat, and @sgmonroy)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 6 of 8 files at r3.
Reviewable status: all files reviewed, 6 unresolved discussions (waiting on @kormat and @sgmonroy)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 21 files at r2, 1 of 8 files at r3.
Reviewable status: all files reviewed, 8 unresolved discussions (waiting on @sgmonroy)
go/border/internal/metrics/ctrl.go, line 49 at r3 (raw file):
type control struct { pkts *prometheus.CounterVec
There should be separate metrics for messages sent and received. E.g.
sent_msgs_total{type="ifstate_req", result="success"}
received_msgs_total{type="ifstate_info", result="err_invalid"}
go/border/internal/metrics/process.go, line 23 at r3 (raw file):
) const (
These 2 seem to be unused.
go/border/internal/metrics/process.go, line 60 at r3 (raw file):
"pkts_total", "Total number of processed packets.", ProcessLabels{}.Labels()), time: prom.NewCounterVec(Namespace, sub, "time_seconds_total", "Total packet processing time.", IntfLabels{}.Labels()),
I think this should be duration
instead of time
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 4 of 25 files at r1, 1 of 21 files at r2, 5 of 8 files at r3.
Reviewable status: all files reviewed, 10 unresolved discussions (waiting on @sgmonroy)
go/border/rctrl/ctrl.go, line 91 at r3 (raw file):
fatal.Fatal(common.NewBasicError("Reading packet", err)) } cl.Result = metrics.Success
I think it would be more natural to put this at the top of the loop.
go/border/rctrl/ifstate.go, line 81 at r3 (raw file):
bsAddrs, err := rctx.Get().ResolveSVCMulti(addr.SvcBS) if err != nil { metrics.Control.Pkts(cl).Inc()
I think svc resolution failures should be a separate result
type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 8 of 21 files at r2.
Reviewable status: all files reviewed, 13 unresolved discussions (waiting on @sgmonroy)
go/border/io.go, line 74 at r3 (raw file):
// Called when the packet's reference count hits 0. free := func(rp *rpkt.RtrPkt) { metrics.Process.Time(l).Add(time.Since(rp.TimeIn).Seconds())
Why did this change from a pre-computed metric?
go/border/ifstate/ifstate.go, line 59 at r3 (raw file):
type state struct { // info is a pointer to an Info object. Processing gorouting can update this value.
"goroutine"
go/border/rctrl/ifstate.go, line 55 at r3 (raw file):
// genIFStateReq generates an Interface State request packet to the local beacon service. func genIFStateReq() error {
There should be a metric that increments every time this function runs. That way it can be used to compute error ratios etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 13 unresolved discussions (waiting on @kormat)
a discussion (no related file):
Previously, kormat (Stephen Shirley) wrote…
It occurs to me that it would be really useful to add the remote IA as a label to all input/output metrics.
neigh_ia
is the label name that we're using in #3167. For packets sent/received on the local socket, the label can be the empty string.
I don't understand how that is really useful when you already know the interface ID, neither in this case nor the case you are pointing me to in #3167.
Anyway, done.
go/border/io.go, line 74 at r3 (raw file):
Previously, kormat (Stephen Shirley) wrote…
Why did this change from a pre-computed metric?
Done.
go/border/ifstate/ifstate.go, line 59 at r3 (raw file):
Previously, kormat (Stephen Shirley) wrote…
"goroutine"
Done.
go/border/internal/metrics/ctrl.go, line 49 at r3 (raw file):
Previously, kormat (Stephen Shirley) wrote…
There should be separate metrics for messages sent and received. E.g.
sent_msgs_total{type="ifstate_req", result="success"} received_msgs_total{type="ifstate_info", result="err_invalid"}
Done.
go/border/internal/metrics/metrics.go, line 59 at r2 (raw file):
Previously, kormat (Stephen Shirley) wrote…
"Intf is the interface ID"
Done.
go/border/internal/metrics/metrics.go, line 77 at r2 (raw file):
Previously, kormat (Stephen Shirley) wrote…
You can use
ifid.String()
Done.
go/border/internal/metrics/output.go, line 40 at r2 (raw file):
Previously, kormat (Stephen Shirley) wrote…
Looking at this again, the wording can be simplified and improved: "Total number of packets sent" (and "Total number of packets received" for the input equivelent).
Done.
go/border/internal/metrics/output.go, line 51 at r2 (raw file):
Previously, kormat (Stephen Shirley) wrote…
I think this is a copy/paste error. Input has "latency" which measures how long packets wait in the kernel buffers before the BR reads them. Output has write duration; how long it takes to write packets back to the kernel. These are fundamentally different measurements.
(I do see that the old code calls the var
OutputWriteLatency
, which was most likely a copy/paste error on my side)So:
- Rename the var to
duration
- the metric to
output write_duration_seconds_total
- and the description to "Total time spent writing packets to the kernel, in seconds."
Done.
go/border/internal/metrics/process.go, line 23 at r3 (raw file):
Previously, kormat (Stephen Shirley) wrote…
These 2 seem to be unused.
Removed Ctrl, and now using Drop.
go/border/internal/metrics/process.go, line 60 at r3 (raw file):
Previously, kormat (Stephen Shirley) wrote…
I think this should be
duration
instead oftime
.
Done.
go/border/rctrl/ctrl.go, line 91 at r3 (raw file):
Previously, kormat (Stephen Shirley) wrote…
I think it would be more natural to put this at the top of the loop.
Done.
go/border/rctrl/ifstate.go, line 55 at r3 (raw file):
Previously, kormat (Stephen Shirley) wrote…
There should be a metric that increments every time this function runs. That way it can be used to compute error ratios etc.
Done.
go/border/rctrl/ifstate.go, line 81 at r3 (raw file):
Previously, kormat (Stephen Shirley) wrote…
I think svc resolution failures should be a separate
result
type.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 14 of 14 files at r4.
Reviewable status: all files reviewed, 13 unresolved discussions (waiting on @kormat)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 14 of 14 files at r4.
Reviewable status: all files reviewed, 7 unresolved discussions (waiting on @sgmonroy)
a discussion (no related file):
Previously, sgmonroy (Sergio Gonzalez Monroy) wrote…
I don't understand how that is really useful when you already know the interface ID, neither in this case nor the case you are pointing me to in #3167.
Anyway, done.
If you're looking at some dashboards for a random AS, the interface IDs will be meaningless to you. Having the neighbouring AS in the metric labels will make it much easier to get an overview of what's going on.
It looks like the ctrl part isn't setting the neigh_ia label btw:
# HELP border_control_interface_active Interface is active.
# TYPE border_control_interface_active gauge
border_control_interface_active{intf="1",neigh_ia=""} 1
border_control_interface_active{intf="2",neigh_ia=""} 1
go/border/internal/metrics/ctrl.go, line 55 at r4 (raw file):
func newControl() control { sub := "control"
Suggestion: ctrl
go/border/internal/metrics/metrics.go, line 59 at r2 (raw file):
Previously, sgmonroy (Sergio Gonzalez Monroy) wrote…
Done.
"Itnf" -> "Intf"
go/border/internal/metrics/metrics.go, line 24 at r4 (raw file):
) const Namespace = "border"
Let's change to br
, to fit with the other services.
go/border/rctrl/ctrl.go, line 1 at r4 (raw file):
// Copyright 2018 ETH Zurich, Anapaya Systems
Ctrl Metrics:
- low-level:
- br_ctrl_reads_total{result="ok_success|err_read"}
- br_ctrl_process_errors{type="err_parse|err_invalid_request"} (err_invalid_request for both "Unsupported control payload" and "Unsupported PathMgmt payload")
- ifstate:
- br_ctrl_received_ifstateinfo_total{result="ok_success|err_process"}
- br_ctrl_sent_ifstatereqs_total{result="ok_success|err_process|err_resolve_svc"}
- rev forwarding:
- br_ctrl_read_revinfos_total{result="ok_success|err_parse"}
- br_ctrl_sent_revinfos_total{result="ok_success|err_process|err_resolve_svc", svc="bs|ps"}
go/border/rctrl/ifstate.go, line 47 at r4 (raw file):
logger.Error(err.Error(), nil) } metrics.Control.IFStateTick().Inc()
It would be simpler to put this at the top of genIFStateReq
go/border/rctrl/revinfo.go, line 76 at r4 (raw file):
dst.NextHop, err = ctx.ResolveSVCAny(dstHost) if err != nil { metrics.Control.SentMsgs(cl).Inc()
Missing a cl.Result = metrics.ErrResolveSVC
70dbe46
to
a5c0927
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 7 unresolved discussions (waiting on @kormat)
a discussion (no related file):
Previously, kormat (Stephen Shirley) wrote…
If you're looking at some dashboards for a random AS, the interface IDs will be meaningless to you. Having the neighbouring AS in the metric labels will make it much easier to get an overview of what's going on.
It looks like the ctrl part isn't setting the neigh_ia label btw:
# HELP border_control_interface_active Interface is active. # TYPE border_control_interface_active gauge border_control_interface_active{intf="1",neigh_ia=""} 1 border_control_interface_active{intf="2",neigh_ia=""} 1
Done.
go/border/internal/metrics/ctrl.go, line 55 at r4 (raw file):
Previously, kormat (Stephen Shirley) wrote…
Suggestion:
ctrl
Done.
go/border/internal/metrics/metrics.go, line 24 at r4 (raw file):
Previously, kormat (Stephen Shirley) wrote…
Let's change to
br
, to fit with the other services.
Done.
go/border/rctrl/ctrl.go, line 1 at r4 (raw file):
Previously, kormat (Stephen Shirley) wrote…
Ctrl Metrics:
- low-level:
- br_ctrl_reads_total{result="ok_success|err_read"}
- br_ctrl_process_errors{type="err_parse|err_invalid_request"} (err_invalid_request for both "Unsupported control payload" and "Unsupported PathMgmt payload")
- ifstate:
- br_ctrl_received_ifstateinfo_total{result="ok_success|err_process"}
- br_ctrl_sent_ifstatereqs_total{result="ok_success|err_process|err_resolve_svc"}
- rev forwarding:
- br_ctrl_read_revinfos_total{result="ok_success|err_parse"}
- br_ctrl_sent_revinfos_total{result="ok_success|err_process|err_resolve_svc", svc="bs|ps"}
Done.
go/border/rctrl/ifstate.go, line 47 at r4 (raw file):
Previously, kormat (Stephen Shirley) wrote…
It would be simpler to put this at the top of
genIFStateReq
Done.
go/border/rctrl/revinfo.go, line 76 at r4 (raw file):
Previously, kormat (Stephen Shirley) wrote…
Missing a
cl.Result = metrics.ErrResolveSVC
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 9 of 10 files at r5, 1 of 1 files at r6.
Reviewable status: all files reviewed, 4 unresolved discussions (waiting on @sgmonroy)
go/border/internal/metrics/metrics.go, line 37 at r6 (raw file):
ErrWrite = "err_write" // ErrValidate is an error validating the packet. ErrValidate = "err_validate"
Some new values got added to go/lib/prom since you created this pr. ErrValidate can be reused from there
go/border/internal/metrics/metrics.go, line 40 at r6 (raw file):
// ErrValidate is an error routing the packet. ErrRoute = "err_route" // ErrProcessLocal is an error on local processing the packet, ie. SVC resolution.
Hmm. It looks like this is only used for the result of NeedsLocalProcessing
, which can't actually return an error at the moment. I'm inclined to say drop this value, and make NeedsLocalProcessing
not return anything.
go/border/internal/metrics/metrics.go, line 47 at r6 (raw file):
ErrResolveSVC = "err_resolve_svc" // ErrInvalidRequest is an error for unsupported control message. ErrInvalidRequest = "err_invalid_request"
prom.ErrInvalidReq
go/border/rpkt/path.go, line 126 at r6 (raw file):
// If the BR does not have a revocation for the current epoch, it considers // the interface as active until it receives a new revocation. intf, ok := rp.Ctx.Conf.BR.IFs[*ifid]
I think this is already covered by the if _, ok := rp.Ctx.Conf.Topo.IFInfoMap[*ifid]; !ok {
check at the top of this method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 5 unresolved discussions (waiting on @sgmonroy)
go/border/rctrl/revinfo.go, line 55 at r6 (raw file):
cl := metrics.SentRevInfoLabels{ Result: metrics.ErrProcess, SVC: dstHost.String(),
dstHost.BaseString()
would give a better metric value (you get plain BS
, PS
, rather than the full BS A (0x0000)
)
376ae7d
to
b9d5ac1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 21 of 28 files reviewed, 5 unresolved discussions (waiting on @karampok, @kormat, and @sgmonroy)
go/border/internal/metrics/metrics.go, line 37 at r6 (raw file):
Previously, kormat (Stephen Shirley) wrote…
Some new values got added to go/lib/prom since you created this pr. ErrValidate can be reused from there
Done.
go/border/internal/metrics/metrics.go, line 40 at r6 (raw file):
Previously, kormat (Stephen Shirley) wrote…
Hmm. It looks like this is only used for the result of
NeedsLocalProcessing
, which can't actually return an error at the moment. I'm inclined to say drop this value, and makeNeedsLocalProcessing
not return anything.
Done.
go/border/internal/metrics/metrics.go, line 47 at r6 (raw file):
Previously, kormat (Stephen Shirley) wrote…
prom.ErrInvalidReq
Done.
go/border/rctrl/revinfo.go, line 55 at r6 (raw file):
Previously, kormat (Stephen Shirley) wrote…
dstHost.BaseString()
would give a better metric value (you get plainBS
,PS
, rather than the fullBS A (0x0000)
)
Done.
go/border/rpkt/path.go, line 126 at r6 (raw file):
Previously, kormat (Stephen Shirley) wrote…
I think this is already covered by the
if _, ok := rp.Ctx.Conf.Topo.IFInfoMap[*ifid]; !ok {
check at the top of this method.
Done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 7 of 7 files at r7.
Reviewable status: complete! all files reviewed, all discussions resolved
This change is