New component: Alert Manager Receiver and Exporter #18526

nicolastakashi · 2023-02-13T11:59:05Z

The purpose and use-cases of the new component

It would be nice to have an Alertmanager receiver and exporter so that we can leverage the pipeline and processors to enrich the alert content before sending it to its final destination.

Many APM vendors enrich alerts with meta information before delivering them to destination channels like Slack or PagerDuty. Having this ability in OTel, combined with Alertmanager, can help o11y platforms provide useful information in their alert notifications.
Below are a few examples of processors we could apply to an alert before sending it.

Graph screenshot

Having screenshots of the metric plot in the alert notification is very useful, and this can be implemented using tools like Promplot or Grafana Image Renderer.

Check dependency alerts firing.

This could be a little bit tricky, but if you are using spanprocessors we may be able to find all the dependencies for a given service and use the Alertmanager API to check whether any alerts are firing for those dependencies (this could be achieved using the job label).
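A minimal sketch of the dependency check described above, assuming the alerts have already been fetched from Alertmanager (for example from its GET /api/v2/alerts endpoint) and that each entry carries a labels object and a status.state field. The function name, the sample alerts, and the dependency set are hypothetical and only illustrate the matching on the job label.

```python
from typing import Iterable


def firing_dependency_alerts(alerts: Iterable[dict], dependencies: set[str]) -> list[dict]:
    """Return active alerts whose `job` label names one of the dependencies."""
    matches = []
    for alert in alerts:
        state = alert.get("status", {}).get("state")
        job = alert.get("labels", {}).get("job")
        # Only alerts that are currently active count as "firing" here.
        if state == "active" and job in dependencies:
            matches.append(alert)
    return matches


# Hypothetical sample, shaped like entries returned by the Alertmanager API.
sample_alerts = [
    {"labels": {"alertname": "HighLatency", "job": "payments"}, "status": {"state": "active"}},
    {"labels": {"alertname": "DiskFull", "job": "search"}, "status": {"state": "suppressed"}},
    {"labels": {"alertname": "Down", "job": "inventory"}, "status": {"state": "active"}},
]

# Dependencies of the alerting service; in the proposal these would come from span data.
result = firing_dependency_alerts(sample_alerts, {"payments", "search"})
print([a["labels"]["alertname"] for a in result])
```

A processor built on this idea would attach the matching alerts to the outgoing notification; how the dependency set is derived from spans is left open here.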

Example configuration for the component

receivers:
  alertmanager:
    address: 0.0.0.0:9093
exporters:
  alertmanager:
    address: http://monitoring.alertmanager.svc.cluster.local:9093

Telemetry data types supported

Logs

Is this a vendor-specific component?

This is a vendor-specific component
If this is a vendor-specific component, I am proposing to contribute this as a representative of the vendor.

Sponsor (optional)

No response

Additional context

No response


djaglowski · 2023-02-13T14:08:21Z

@nicolastakashi, I think the first step here is to clarify whether you are proposing alerts as an entirely new data type (same level as traces, metrics, logs), or if they would fit into an existing data type (most likely logs).

My opinion is that the logs data model is sufficient to represent alerts, so I'm looking at this as a proposal for a logs receiver and a logs exporter. However, it's not clear what data format you are suggesting would be sent to the receiver and sent from the exporter.

nicolastakashi · 2023-02-13T16:00:05Z

Hi @djaglowski
I was in doubt when I created the issue but after reading your comment, I understood it better.
Yeah, it makes sense: the logs data type is flexible enough to handle the alerting schema, especially bearing in mind that an alert is a JSON object 😄

BTW!
I updated the issue with the proper data type.

djaglowski · 2023-02-13T16:28:02Z

Thanks @nicolastakashi.

Based on the example config, it looks like the receiver would stand up a server and listen for alerts. Is that right? What protocol are you suggesting would be used for this? Would the tcplog or udplog receivers work for you?

nicolastakashi · 2023-02-13T16:54:44Z

@djaglowski yeah exactly,

The receiver will be listening for alerts and it should be tcplog since we need feedback about the success or failure of receiving the alert.

djaglowski · 2023-02-13T17:07:08Z

The existing tcplog receiver would be a good starting point then. It may work for you as is; otherwise, please propose the specific enhancements it would need.

nicolastakashi · 2023-02-13T17:17:28Z

Cool!
@djaglowski I'll try to prepare a POC and let you know, as soon as I have something working.

nicolastakashi · 2023-02-17T16:51:26Z

Hi @djaglowski, I tested it locally.

The tcplog receiver will not work in this case because Prometheus needs to talk to a specific Alertmanager API.
To use tcplog we would need a different flow: Prometheus sends to Alertmanager, Alertmanager sends to tcplog, and then an exporter sends back to Alertmanager. In my view that is not the best experience.
Ideally we should have an Alertmanager receiver that receives the alert payload as a log entry, which can then flow through the OpenTelemetry pipeline.

djaglowski · 2023-02-17T17:06:53Z

It sounds like the components you are proposing are specific to prometheus or at least to a protocol or API that prometheus uses. I think in order for your proposal to be evaluated, you need to explain this protocol or API in detail.

nicolastakashi · 2023-02-21T21:45:57Z

Hi @djaglowski, indeed Alertmanager is mostly used with Prometheus, but it is an alerting system and could be used in many other use cases.

Alertmanager has an OpenAPI definition for its API. From my understanding, we only need a receiver that handles the create-alert endpoint, as you can see here

The receiver will work like a TCP receiver, exposing an HTTP endpoint.

Let me know what other information you need, and if you have an example I can look at, I can provide the information following that standard.
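As a rough illustration of the receiver side, the sketch below assumes the create-alert endpoint accepts a JSON array of alert objects, each with a required labels object and optional fields such as annotations and startsAt (the shape of Alertmanager's v2 POST /api/v2/alerts). The function names and the dictionary used as a log record are hypothetical; a real receiver would build Collector log records in Go. The status code returned gives the caller the success or failure feedback discussed earlier in the thread.

```python
import json


def parse_postable_alerts(payload: bytes) -> list[dict]:
    """Turn a create-alert request body into one log-record-like dict per alert."""
    alerts = json.loads(payload)
    if not isinstance(alerts, list):
        raise ValueError("expected a JSON array of alerts")
    records = []
    for alert in alerts:
        labels = alert.get("labels") if isinstance(alert, dict) else None
        if not labels:
            raise ValueError("each alert needs a non-empty 'labels' object")
        records.append({
            "timestamp": alert.get("startsAt"),  # may be absent in the request
            "body": alert,                       # the whole alert becomes the log body
            "attributes": {"alertname": labels.get("alertname", "")},
        })
    return records


def handle_post(payload: bytes) -> tuple[int, list[dict]]:
    """Return an HTTP-style status code plus the records that were accepted."""
    try:
        return 200, parse_postable_alerts(payload)
    except ValueError:  # also covers malformed JSON (JSONDecodeError)
        return 400, []


# Hypothetical request body for illustration.
body = b'[{"labels": {"alertname": "HighLatency", "job": "api"}, "startsAt": "2023-02-17T16:51:26Z"}]'
status, records = handle_post(body)
print(status, records[0]["attributes"])
```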

djaglowski · 2023-02-22T14:13:15Z

Thanks for the link @nicolastakashi.

Based on the fact that alert manager is a prometheus repository, I think it's important to have this context, so I would suggest these components should include the prometheus name in some way.

That said, this technology and use case are a bit outside of my wheelhouse so I do not expect to sponsor the component. The best way to find a sponsor is often to attend the Collector SIG meeting to explain the value and ask if any approvers or maintainers are willing to sponsor. There's a meeting today and every Wednesday at 5PM UTC.

andrzej-stencel · 2023-03-01T16:40:25Z

This sounds to me like shoehorning an arbitrary data type into the collector for processing. The OpenTelemetry Collector was created to process specific data types - telemetry in form of logs, metrics and traces. You certainly can feed any type of unstructured or structured data into the collector, but is it what the contrib should be concerned with?

andrzej-stencel · 2023-03-01T20:12:57Z

Dan has raised a good point during today's Collector SIG meeting: it's probably easier to justify an Alertmanager receiver than an exporter, as you could argue an alert is as good a source of telemetry as any other event. Exporting to Alertmanager is not as easily justifiable in my eyes, as the Alertmanager does not meet the criteria of a "telemetry backend".

I don't think we reached a definite conclusion during the meeting; I suppose if a sponsor decides to support this proposal then it's good to go.

nicolastakashi · 2023-03-02T09:01:12Z

Hey @astencel-sumo, thanks for sharing your thoughts, and apologies for not attending the meeting. I had a last-minute setback and had no time to let you know.

Regarding the Alertmanager exporter, I agree with you that it's harder to justify.

Maybe there should be an external service, provided by the community, that accepts OTel data and converts it to Alertmanager alerts outside the collector.

That way we can receive the alerts and enrich them in the collector, but exporting them to AM would be a job done outside the collector.

I'll also try to ask the Alertmanager maintainers for their opinions.

gouthamve · 2023-03-06T06:39:52Z

So I could potentially see some usefulness here:

Alertmanager Receiver: Currently, we don't have historical persistence on the notifications sent out by Prometheus. And having an AM receiver would be useful if someone wants to export that data to Loki or Elastic for further analysis down the line.

If we had a logs2metrics connector, then we could also generate custom metrics from alerts, like how many times a particular namespace has alerted. While these metrics are available in Prometheus, they are per Prometheus and this could be a global view.

Alertmanager Exporter: Now, this is harder to justify but one usecase is if we want the Collector to generate Alertmanager payloads and trigger notifications. I am not actually sure if this is in the purview of the Collector.
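The per-namespace count mentioned above can be sketched as follows. This is only an illustration of the aggregation, assuming alerts arrive as log records whose body holds the alert's labels; the record shape, the function name, and the sample data are hypothetical, and in the Collector this would be done by a connector rather than by custom code.

```python
from collections import Counter


def count_alerts_by_label(records: list[dict], label: str = "namespace") -> Counter:
    """Count alert log records grouped by one label, across all sources."""
    counts = Counter()
    for record in records:
        labels = record.get("body", {}).get("labels", {})
        # Alerts without the label are grouped under "unknown".
        counts[labels.get(label, "unknown")] += 1
    return counts


# Hypothetical records received from two different Prometheus instances.
records = [
    {"body": {"labels": {"alertname": "HighLatency", "namespace": "shop", "prometheus": "a"}}},
    {"body": {"labels": {"alertname": "PodCrash", "namespace": "shop", "prometheus": "b"}}},
    {"body": {"labels": {"alertname": "DiskFull", "namespace": "infra", "prometheus": "a"}}},
    {"body": {"labels": {"alertname": "Watchdog"}}},
]

counts = count_alerts_by_label(records)
print(dict(counts))
```

Because the records from every Prometheus instance pass through the same pipeline, the resulting counts give the global view that the per-instance metrics lack.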

nicolastakashi · 2023-03-06T08:17:15Z

@gouthamve that's an amazing use case for the logs2metrics connector. Regarding the exporter, in the worst-case scenario we can build a service that receives OTLP and sends to AM.

gouthamve · 2023-03-06T09:50:08Z

The countconnector can do it today. Thanks for the idea @kovrus!

nicolastakashi · 2023-03-10T08:41:31Z

@atoulme and @djaglowski, is the idea to have something more generic, so that AM can post an alert using webhooks?

djaglowski · 2023-03-10T14:17:18Z

@nicolastakashi, the webhook receiver was proposed independently. Would it work for AM?

nicolastakashi · 2023-03-10T16:18:41Z

@djaglowski yeah it would also work.

Alertmanager has a feature that lets it push an alert to a webhook. Since the alert is a JSON object, we can put the entire JSON object in the log body.
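A minimal sketch of that idea, assuming the webhook body is the JSON document Alertmanager posts to a webhook receiver (with top-level fields such as status, receiver, and an alerts array). The attribute names and the dictionary standing in for a log record are hypothetical; the point is only that the full payload is kept as the log body while a few fields are lifted into attributes for filtering.

```python
import json


def webhook_to_log_record(raw: bytes) -> dict:
    """Wrap an Alertmanager webhook payload in a log-record-like dict."""
    payload = json.loads(raw)
    return {
        "body": payload,  # the entire JSON object is preserved as the log body
        "attributes": {
            # Attribute keys below are illustrative, not an established convention.
            "alertmanager.status": payload.get("status", ""),
            "alertmanager.receiver": payload.get("receiver", ""),
            "alertmanager.alert_count": len(payload.get("alerts", [])),
        },
    }


# Hypothetical webhook payload for illustration.
raw = json.dumps({
    "version": "4",
    "status": "firing",
    "receiver": "otel-collector",
    "groupLabels": {"alertname": "HighLatency"},
    "alerts": [
        {
            "status": "firing",
            "labels": {"alertname": "HighLatency", "job": "api"},
            "annotations": {"summary": "p99 latency is high"},
            "startsAt": "2023-03-10T16:18:41Z",
        }
    ],
}).encode()

record = webhook_to_log_record(raw)
print(record["attributes"])
```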

djaglowski · 2023-03-10T17:41:27Z

@nicolastakashi, that's great to hear.

Given that this issue has not found a sponsor, and that there are concerns about whether an AM exporter is appropriate, do you think we should close this issue?

nicolastakashi · 2023-03-13T08:10:36Z

@djaglowski yeah, I guess for now we can close this, thanks for all the support

nicolastakashi added the needs triage New item requiring triage label Feb 13, 2023

djaglowski added the Sponsor Needed New component seeking sponsor label Feb 22, 2023

atoulme removed the needs triage New item requiring triage label Mar 7, 2023

djaglowski mentioned this issue Mar 8, 2023

[receiver/webhookevent] First commit #19377

Merged

djaglowski closed this as completed Mar 13, 2023

jpkrohling mentioned this issue Jun 26, 2023

New component: Alert Manager Exporter #23569

Closed

2 tasks
