Implement CRD's for Grafana 9 Alert Rules, Alert Groups, Contact Points, Notification Policies and Silences #911
Comments
This could be interesting for v5 (https://github.com/grafana-operator/grafana-operator-experimental). |
@R-Studio I've transferred your request to the new repository, I hope you don't mind! |
Any news on this? |
So I have looked through the API of Alerts. In your request, you mention:
Thoughts
So my understanding of

If we added a CRD for When it comes to But all of the above things are possible to solve and could still be "user friendly" as a CRD. So we have two big issues left.

Big issues
Let's start with the core feature of The other option is that you write the alert through the UI and then export the alerts by doing API requests to Grafana.

The second big issue is that Silence contains state. Since it's not reasonable for a developer to create a CRD to mute an alert, this state needs to be stored, or else your silence will disappear when your Grafana instance is restarted.

In short, I think this is a really hard issue to solve, and I don't have any good solutions for it, especially for the way of working around alert management.

Possible workaround
So if you are a prometheus-operator user, I would recommend using Alertmanager. Grafana supports importing external Alertmanagers as a datasource, and it supports writing silences back to the Alertmanager. That way the alert's silence is stored in the Alertmanager, which already has state. This is how I solved it at my last company.

Alert creation isn't amazing in the prometheus-operator CRDs either, but if you are used to writing PromQL it's not a big problem. The problem with this is, of course: what about those of us who use Grafana for non-Prometheus datasources like GCP? Well, to be honest, we are kind of out of luck when it comes to the perfect GitOps alerts-as-code workflow that we are all dreaming of. Sure, it might be possible to use GCP as a datasource for your alerts as well (I haven't tried), but you probably want to keep your developers in the Grafana UI and not go over to GCP just to mute an alert.

Conclusion
If you use prometheus-operator, I suggest using the "import Alertmanager as a datasource" feature in Grafana to manage your alerts.
I don't think there is any reasonable way to solve this through the grafana-operator by creating new CRDs. Please don't agree with me; I would love to hear ideas on how these issues could be solved, because I want this feature as well. What are your thoughts on this, @R-Studio and the other people who want this feature? |
Could you say more about what this means? My naive assumption (keeping in mind that although I'm a user of this project and of Grafana itself, I'm not an authority on the dev SDKs) is that there is an API to perform CRUD operations on these rules, since that API must be what the Grafana web console is interacting with. The CRD would then be a YAML-definable mapping to that same API interface.

As to the statefulness of one or more attributes: I could see value in letting a field be specified as "only_on_create" (or whatever a good name is) to indicate that the value of that field in YAML is set on original creation and then no longer managed by the CRD. This would allow as-code creation of alerts, with subsequent management of, for example, silencing done via the web console.

FWIW, in my use cases the primary value of the CRD is that I have as-code definitions of my Grafana objects for better GitOps. I want to roll out newly developed alerts and dashboards through each environment with low friction. |
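To make the "only_on_create" idea a bit more concrete, here is a minimal Go sketch of how a reconciler could treat such a field. Everything in it is an assumption for illustration: AlertRuleSpec, its fields, and Reconcile are hypothetical names, not types from the operator or from the Grafana API. The only point is the merge rule: on creation every field comes from the YAML, and on later updates the create-only field keeps whatever value is live in Grafana.

```go
package main

import "fmt"

// AlertRuleSpec is a hypothetical, heavily simplified CRD spec.
// The field names are illustrative only.
type AlertRuleSpec struct {
	Title  string
	Query  string
	Paused bool // hypothetical field treated as "only on create"
}

// Reconcile returns the state that should be written to Grafana.
// live is nil when the rule does not exist yet.
func Reconcile(desired AlertRuleSpec, live *AlertRuleSpec) AlertRuleSpec {
	if live == nil {
		// First creation: every field comes from the YAML definition.
		return desired
	}
	// Later updates: managed fields follow the YAML, while the
	// create-only field keeps the value currently set in Grafana.
	next := desired
	next.Paused = live.Paused
	return next
}

func main() {
	desired := AlertRuleSpec{Title: "disk-full", Query: "disk_used > 0.9"}

	created := Reconcile(desired, nil)
	fmt.Printf("created: %+v\n", created)

	// Someone changes the create-only field in the web console.
	live := created
	live.Paused = true

	// The YAML changes a managed field afterwards.
	desired.Query = "disk_used > 0.95"
	fmt.Printf("updated: %+v\n", Reconcile(desired, &live))
}
```

With this rule the query change from YAML is rolled out, while the value set through the web console survives the next reconcile. Whether that is acceptable for real silences, which are separate objects with their own expiry, is a different question.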
So how it would look isn't an easy question to answer, and it would take many hours to come up with a suggestion that would even start to be good enough to discuss. But I created a small app to try to get a feeling for the API. Looking at the SDK, it's very focused on folders and groups, which would make this even harder. My app lists all the alerts that you have in a specific group (named "group" in the code):

    package main

    import (
        "fmt"
        "net/url"

        gapi "github.com/grafana/grafana-api-golang-client"
    )

    func main() {
        // Create the gapi config; the credentials match the Grafana instance below.
        config := gapi.Config{
            BasicAuth: url.UserPassword("root", "secret"),
        }

        // Create a new client against the port-forwarded instance.
        client, err := gapi.New("http://localhost:3000", config)
        if err != nil {
            panic(err)
        }

        // Alert rule groups are looked up per folder, so list the folders first.
        folderList, err := client.Folders()
        if err != nil {
            panic(err)
        }

        // Loop through the folders and fetch the rule group named "group" by folder UID.
        for _, folder := range folderList {
            fmt.Println(folder.UID)
            listRules, err := client.AlertRuleGroup(folder.UID, "group")
            if err != nil {
                panic(err)
            }
            fmt.Println(listRules)
        }
    }

Basic grafana instance:

    apiVersion: grafana.integreatly.org/v1beta1
    kind: Grafana
    metadata:
      name: grafana
      labels:
        dashboards: "grafana"
    spec:
      config:
        log:
          mode: "console"
        auth:
          disable_login_form: "false"
        security:
          admin_user: root
          admin_password: secret

I port-forwarded to my instance, created an alert manually in it, and then ran my script. It provides the following output:

    {group rDzwzO8Vz 60 [{map[description:if this happens the world has ended] A [0xc000180230] Alerting rDzwzO8Vz 1 map[hello:world] NoData 1 group test xu9Qkd8Vz 2023-05-12 06:14:35 +0000 UTC 5m0s false}]}

So if you want to give the CRD definition a try, I would start by reading the specs and playing around a bit with the SDK, because I personally don't think the API documentation is great... there are no examples of the output. https://grafana.com/docs/grafana/latest/developers/http_api/alerting_provisioning/ |
Totally agree. I'm also using it with prom-operator and it works fine. Grafana alerting is a good feature in the UI, but I think the APIs are not mature enough to work with third-party software (like the operator). |
Somehow I missed this issue when I searched for relevant ones but here's my two cents.
I'd be willing to draw up a proposal/design for this, however I can't assist with the implementation in the near future sadly. |
@siegenthalerroger , creating a design proposal is a great start. How the document should look isn't set in stone, but if you take a look at, you will see what we are after. Looking forward to seeing the design :) |
@siegenthalerroger I'd be willing to help out here - I agree, that an iterative approach is good (enough) here. As for implementation - I hope to be able to help out here, but time is also an issue sadly. |
@rammelmueller please create a design document as a first step. Looking forward to a PR around it |
@rammelmueller I've created an initial proposal that's missing any/all of the specification for the new CRDs. Feel free to reach out if you want to assist. |
Hi. I just wanted to ask if there is already more than just the proposal, like a decision it's going to happen or not? |
@AlexEndris Hey! it is happening, @theSuess is planning on picking it up in the coming days I believe |
That's great. I'd be willing to support, if I can, although I have 0 go experience. But I surely could play around with test versions. |
We got an initial PR that can be viewed here: #1420 |
We would love some feedback around #1420, if you can try it out and see that it's working as you expect that would be great. |
@NissesSenap I hope to find time to test it in the next few weeks. |
There is no issue for implementing GrafanaAlertRule I could track, yet. Right? Or was this already implemented by #1420? |
I believe alert rules are sub-resources of the
|
Oh, then I must have missed that completely. Thank you! And Thx @theSuess for the implementation. Came just in time when I needed it! :) |
Is your feature request related to a problem? Please describe.
No existing problem.
Describe the solution you'd like
Grafana Alerting is a great new feature, but the current grafana-operator does not support a CRD definition for the decoupled alerts, contact points, alert groups, notification policies and silences. The desired solution would be CRDs for each of these so that the new alerting tools are equally accessible via yaml.
kind: GrafanaAlertRule
kind: GrafanaAlertGroup
kind: GrafanaContactPoint
kind: GrafanaNotificationPolicy
kind: GrafanaSilence
Describe alternatives you've considered
A workaround being considered is to manually mount the relevant config files into the running container via VolumeMount from a ConfigMap containing raw file contents. The only other alternative I am aware of is to manually enter these into the Grafana web console, which isn't the point.
This feature request is not new but has been reopened since the previous one was closed: #564