Grafana dashboards #260
Conversation
Looks good, thanks @KalmanMeth ! Just small suggestions on my side
In go.mod (outdated):

```
replace github.com/netobserv/flowlogs-pipeline v0.1.7-0.20221221173558-e6e77158b956 => github.com/KalmanMeth/flowlogs2metrics v0.0.0-20230129124539-a447dc625fd4
```
No longer relevant: I included the latest release of the FLP confgenerator, which now includes the function to create the dashboards.
@KalmanMeth I'm trying that on cluster bot using OCP 4.12.1; the dashboard is correctly created but I don't get data in the graphs. FLP is using
And here is the dashboard config:
Am I missing something?
@jpinsonneau Did you run the hack/enable_metrics.sh script? Do you have data being generated, such as from the sample workload?
Thanks! The content is now loading.
Yes, I also do not see proper labels. I might need some help to understand what is needed here. I am continuing to investigate.
You should add
```diff
@@ -21,7 +21,7 @@ visualization:
     type: grafana
     grafana:
       - expr: 'topk(5,rate(netobserv_namespace_flows_total[1m]))'
-        legendFormat: '{{SrcK8S_Namespace}} : {{DstK8S_Namespace}}'
+        legendFormat: '{{pod}} : {{DstK8S_Namespace}}'
```
For all legendFormat, what about using the same format as the console plugin? When a namespace is available, `{{namespace}}.{{name}}` (resource / owner), else only `{{name}}` (namespace / host / ip), for every peer:
https://github.com/netobserv/network-observability-console-plugin/blob/68239b88737f676152a9cf51e3b58bc6b781bd75/web/src/utils/metrics.ts#L77
Then we use `->` between source and destination, as in `{{source}} -> {{destination}}`:
https://github.com/netobserv/network-observability-console-plugin/blob/fe28d186fcd9acc1078c54db7690e86d6c83a293/web/src/components/metrics/metrics-helper.tsx#L147
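The naming convention suggested above could be sketched as a small Go helper (hypothetical names, not part of the operator or console plugin code):

```go
package main

import "fmt"

// peerLegend builds the per-peer legend following the console plugin
// convention: "{{namespace}}.{{name}}" when a namespace label is
// available, otherwise just "{{name}}".
func peerLegend(namespaceLabel, nameLabel string) string {
	if namespaceLabel != "" {
		return fmt.Sprintf("{{%s}}.{{%s}}", namespaceLabel, nameLabel)
	}
	return fmt.Sprintf("{{%s}}", nameLabel)
}

// flowLegend joins the source and destination peers with " -> ".
func flowLegend(src, dst string) string {
	return src + " -> " + dst
}

func main() {
	src := peerLegend("SrcK8S_Namespace", "SrcK8S_OwnerName")
	dst := peerLegend("DstK8S_Namespace", "DstK8S_OwnerName")
	// {{SrcK8S_Namespace}}.{{SrcK8S_OwnerName}} -> {{DstK8S_Namespace}}.{{DstK8S_OwnerName}}
	fmt.Println(flowLegend(src, dst))
}
```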
I added `->` between source and destination. Instead of `.` separating namespace and owner I used `,`, because sometimes the name of an owner or namespace already includes `.`.
I included the `pod` in the legend because we get separate statistics for each flowlogs-pipeline instance that is running: since the different FLPs report separately and have different labels, the same metric may be spread across multiple FLPs with no cumulative summed total. We have to think about how we want to handle this.
We can further change the legends in a future PR by simply changing the metrics_definitions yaml files, with no additional changes to the code.
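To illustrate the missing-total concern, here is a toy Go sketch (hypothetical data structures, not operator code) of aggregating per-FLP-instance samples by dropping the `pod` label, which is the effect a PromQL `sum by(...)` over the instances would have:

```go
package main

import "fmt"

// sample mimics one FLP instance's view of a counter: the "pod" label
// differs per instance, while the flow labels (here, a namespace pair)
// are shared across instances.
type sample struct {
	pod, srcNs, dstNs string
	value             float64
}

// sumAcrossPods drops the per-instance "pod" label and sums values,
// yielding one cumulative total per remaining label combination.
func sumAcrossPods(samples []sample) map[string]float64 {
	totals := map[string]float64{}
	for _, s := range samples {
		key := s.srcNs + " : " + s.dstNs
		totals[key] += s.value
	}
	return totals
}

func main() {
	totals := sumAcrossPods([]sample{
		{pod: "flp-a", srcNs: "default", dstNs: "netobserv", value: 3},
		{pod: "flp-b", srcNs: "default", dstNs: "netobserv", value: 2},
	})
	fmt.Println(totals["default : netobserv"]) // 5
}
```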
Yes, it's good enough to merge as is. +1 for a follow-up on totals.
Thanks @KalmanMeth !
(force-pushed from 06c079c to 0616efa)
LGTM, tested on OCP 4.12.4 and works fine 👍
Thanks @KalmanMeth !
```sh
kubectl delete servicemonitor netobserv-plugin || true
kubectl delete configmap -n openshift-config-managed flowlogs-pipeline-metrics-dashboard || true
kubectl delete configmap -n netobserv flowlogs-pipeline-config || true
kubectl delete configmap -n netobserv console-plugin-config || true
```
Doing `make undeploy-sample-cr` should be sufficient to remove everything created, without having to maintain a custom list here. Are there downsides to using this instead?
(You just need to make sure that the operator is still running while doing `make undeploy-sample-cr`, else it won't work.)
`make undeploy-operator` is very convenient to clean up all the allocated items in case we are debugging the operator running locally and something goes wrong.
```diff
@@ -40,6 +40,7 @@ func newMonolithReconciler(info *reconcilersCommonInfo) *flpMonolithReconciler {
 		promService:    &corev1.Service{},
 		serviceAccount: &corev1.ServiceAccount{},
 		configMap:      &corev1.ConfigMap{},
+		dbConfigMap:    &corev1.ConfigMap{},
```
This new config map should also be added with `AddManagedObject`, same as the existing configmap. This is necessary to manage namespace switching: if `netobserv` is moved to a different namespace, it ensures the previous configmap is correctly deleted.
(Same applies to the `flp_transfo_reconciler` file.)
Yes, that is what I thought at first. But then I ran into an issue: this new config map is not really in the `netobserv` namespace. It technically is in the `openshift-config-managed` namespace, but we have to set it up when we create all the objects in the `netobserv` namespace.
Oh, I didn't notice that, thanks for the explanation.
```go
	} else if !equality.Semantic.DeepDerivative(newCM.Data, r.owned.configMap.Data) {
		if err := r.UpdateOwned(ctx, r.owned.configMap, newCM); err != nil {
			return err
		}
		if r.reconcilersCommonInfo.availableAPIs.HasConsoleConfig() {
			if err := r.UpdateOwned(ctx, r.owned.dbConfigMap, dbConfigMap); err != nil {
```
Not sure about that... basically it is saying: "if the main configMap is changed, then update the DB config map"?
I would rather copy-paste the whole if/else block and adapt it for `dbConfigMap`.
(Same applies to the `flp_transfo_reconciler` file.)
If the main configMap is changed, it is possible that the labels in the dashboard have changed because of a new prefix to the Prometheus variables. Otherwise, there is no reason to track changes to the new dashboards configMap, and, as noted above, the dashboard configMap does not technically belong to the `netobserv` namespace.
Yeah, but tying the `dbConfigMap` lifecycle to the presence of `configMap` seems risky to me.
For instance, imagine that the process runs through the top `if !r.nobjMngr.Exists(r.owned.configMap)` at first creation, and `configMap` is created; so far so good. Then imagine there's an issue (e.g. a temporary connectivity issue) and the subsequent call to `r.CreateOwned(ctx, dbConfigMap)` fails: in that case, `configMap` exists and `dbConfigMap` doesn't. In such a scenario, the operator will never be able to recover from that state, as the reconcile doesn't create `dbConfigMap` when `configMap` exists.
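The stuck state described here can be demonstrated with a toy Go model (a deliberately simplified, hypothetical sketch, not the operator's actual reconcile logic):

```go
package main

import "fmt"

// cluster is a toy model of which objects currently exist.
type cluster struct{ configMap, dbConfigMap bool }

// coupledReconcile models the risky pattern: dbConfigMap is only
// created in the branch where the main configMap does not exist yet.
func coupledReconcile(c *cluster, dbCreateFails bool) {
	if !c.configMap {
		c.configMap = true
		if !dbCreateFails {
			c.dbConfigMap = true
		}
		return
	}
	// configMap already exists: dbConfigMap is never (re)created,
	// so a one-time creation failure is permanent.
}

func main() {
	c := &cluster{}
	coupledReconcile(c, true)  // first pass: dbConfigMap creation fails
	coupledReconcile(c, false) // later passes can no longer fix it
	fmt.Println(c.configMap, c.dbConfigMap) // true false
}
```

A reconcile that fetches and creates/updates `dbConfigMap` independently of `configMap` would recover on the second pass.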
The worst that happens is that there is no dashboard, but everything else runs properly.
To manage it independently from the original configMap, we have to track it. We cannot put it in the `netobserv` nobjMngr structure, because it is not in the `netobserv` namespace and that would cause other confusion. So we would need another place to save the status of this particular dbConfigMap. Where would you suggest?
Note: we would have to perform a separate Fetch for this object, since it would not come back in the FetchAll operation.
Yes, I agree. Something like this should work, I guess:

```go
func (r *reconcilersCommonInfo) reconcileDbConfig(ctx context.Context, dbConfigMap *corev1.ConfigMap) error {
	curr := &corev1.ConfigMap{}
	if err := r.Get(ctx, types.NamespacedName{
		Name:      dbConfigMap.Name,
		Namespace: dbConfigMap.Namespace,
	}, curr); err != nil {
		if errors.IsNotFound(err) {
			return r.CreateOwned(ctx, dbConfigMap)
		}
		return err
	}
	if !equality.Semantic.DeepDerivative(dbConfigMap.Data, curr.Data) {
		return r.UpdateOwned(ctx, curr, dbConfigMap)
	}
	return nil
}
```
(The same function can be shared between the monolith & transfo reconcilers.)
It also makes the `dbConfigMap: &corev1.ConfigMap{}` in the `owned` struct useless.
done
Thanks for the PR
Thanks, lgtm!
(force-pushed from aecf5df to fe44481)
rebased
(force-pushed from 143775a to 4cbd8f0)
reintegrated with the latest code base
Codecov Report

```
@@           Coverage Diff           @@
##             main     #260   +/-   ##
=======================================
  Coverage        ?   48.32%
=======================================
  Files           ?       43
  Lines           ?     4919
  Branches        ?        0
=======================================
  Hits            ?     2377
  Misses          ?     2335
  Partials        ?      207
```

Flags with carried forward coverage won't be shown.
I tested the PR, it looks good; there are still a few things to do, but they can be addressed in a follow-up:
/approve
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: jotak. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
Take the Grafana dashboards created by the FLP confgenerator (https://github.com/netobserv/flowlogs-pipeline/pull/369) and insert them into the OpenShift console.
It is required to first merge FLP PR 369, then update go.mod, run `go mod vendor`, etc., and then merge this PR.