-
Notifications
You must be signed in to change notification settings - Fork 291
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
AlertManager v2 #2643
AlertManager v2 #2643
Conversation
- Introduce legacy_alertmanager/grafana_alerting integrations - Hanlde legacy/v2 integrations in public API – Handle alerts from legacy/v2 integrations on same urls
# Conflicts: # CHANGELOG.md
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
overall LGTM, nice work 👍 !
I would also check w/ someone who is more familiar w/ the engine/config_integrations/*.py
files and CSS.
CHANGELOG.md
Outdated
@@ -25,6 +25,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 | |||
|
|||
- Apply swap requests details to schedule events ([#2677](https://github.com/grafana/oncall/pull/2677)) | |||
|
|||
- Rework of AlertManager integration ([#2643](https://github.com/grafana/oncall/pull/2643)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
move to "Unreleased" section
> You must have the [role of Admin][user-and-team-management] to be able to create integrations in Grafana OnCall. | ||
> ⚠️ A note about **(Legacy)** integrations: | ||
> We are changing internal behaviour of AlertManager integration. | ||
> Integrations that were created before version **VERSION** are marked as **(Legacy)**. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
swap out **VERSION**
> ⚠️ A note about **(Legacy)** integrations: | ||
> We are changing internal behaviour of AlertManager integration. | ||
> Integrations that were created before version **VERSION** are marked as **(Legacy)**. | ||
> These integrations are still receiving and escalating alerts but will be automatically migrated after DEPRECATION_DATE. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
swap out DEPRECATION_DATE
> Integrations that were created before version **VERSION** are marked as **(Legacy)**. | ||
> These integrations are still receiving and escalating alerts but will be automatically migrated after DEPRECATION_DATE. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same thing here w/ **VERSION**
and DEPRECATION_DATE
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll fill them right before the release, since I'll now VERSION only then.
value = remove_legacy_prefix(value) | ||
return value |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
value = remove_legacy_prefix(value) | |
return value | |
return remove_legacy_prefix(value) |
if integration == AlertReceiveChannel.INTEGRATION_GRAFANA_ALERTING: | ||
# TODO: probably only needs to check if unified alerting is on |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this TODO still relevant?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, it still makes sense for Grafana Alerting v2
return renderContent(); | ||
|
||
function renderContent() { | ||
if (isLegacyIntegration) { | ||
return ( | ||
<HorizontalGroup spacing="xs"> | ||
<Tooltip placement="top" content={'This integration has been deprecated, consider migrating it.'}> | ||
<Icon name="info-circle" className="u-opacity" /> | ||
</Tooltip> | ||
<Text type="secondary"> | ||
<span className="u-opacity">{integration?.display_name}</span> | ||
</Text> | ||
</HorizontalGroup> | ||
); | ||
} | ||
|
||
return ( | ||
<HorizontalGroup spacing="xs"> | ||
<IntegrationLogo scale={0.08} integration={integration} /> | ||
<Text type="secondary">{integration?.display_name}</Text> | ||
</HorizontalGroup> | ||
); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is a separate renderContent
function needed here? what about just including the body of renderContent
inside of renderDatasource
instead of declaring a separate function that is then always invoked?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no, not really. silly mistake
{isLegacyIntegration && ( | ||
<WithPermissionControlTooltip userAction={UserActions.IntegrationsWrite}> | ||
<div | ||
className={cx('integration__actionItem')} | ||
onClick={() => | ||
setConfirmModal({ | ||
isOpen: true, | ||
title: 'Migrate Integration?', | ||
body: ( | ||
<VerticalGroup spacing="lg"> | ||
<Text type="primary"> | ||
Are you sure you want to migrate <Emoji text={alertReceiveChannel.verbal_name} /> ? | ||
</Text> | ||
|
||
<VerticalGroup spacing="xs"> | ||
<Text type="secondary">- Integration internal behaviour will be changed</Text> | ||
<Text type="secondary"> | ||
- Integration URL will stay the same, so no need to change {getMigrationDisplayName()}{' '} | ||
configuration | ||
</Text> | ||
<Text type="secondary"> | ||
- Integration templates will be reset to suit the new payload | ||
</Text> | ||
<Text type="secondary">- It is needed to adjust routes manually to the new payload</Text> | ||
</VerticalGroup> | ||
</VerticalGroup> | ||
), | ||
onConfirm: onIntegrationMigrate, | ||
dismissText: 'Cancel', | ||
confirmText: 'Migrate', | ||
}) | ||
} | ||
> | ||
Migrate | ||
</div> | ||
</WithPermissionControlTooltip> | ||
)} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
did you consider adding a link to the migration docs you added?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Link is added in warning banner. This modal is a final message before pressing migrate button. I'll show a demo at weekly :)
async migrateChannelFilter(id: AlertReceiveChannel['id']) { | ||
return await makeRequest(`/alert_receive_channels/${id}/migrate`, { | ||
method: 'POST', | ||
}); | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
async migrateChannelFilter(id: AlertReceiveChannel['id']) { | |
return await makeRequest(`/alert_receive_channels/${id}/migrate`, { | |
method: 'POST', | |
}); | |
} | |
migrateChannelFilter = (id: AlertReceiveChannel['id']) => | |
makeRequest(`/alert_receive_channels/${id}/migrate`, { | |
method: 'POST', | |
}); | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All store methods use async/await notation, so I think it's worth to discuss it with frontend team. What is the advantage of using arrow function here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's just a preference of shortening the outputted code, no real advantage. IMO marking it as async/await it has better readability
Co-authored-by: Joey Orlando <joey.orlando@grafana.com>
@@ -552,7 +555,12 @@ def send_demo_alert(self, payload=None): | |||
if payload is None: | |||
payload = self.config.example_payload | |||
|
|||
if self.has_alertmanager_payload_structure: | |||
# TODO: AMV2: hack to keep demo alert working for integration with legacy alertmanager behaviour. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you provide more clear instruction what should be done here and when?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually I can't. We still have GRAFANA integration which might or might not have legacy alertmanager behaviour and that's why there is no clear instruction what to do. I'll remove TODO.
# if has_legacy_prefix(integration): | ||
# raise BadRequest(detail="This integration is deprecated") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bumping this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can be removed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can be removed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please don't use abbreviations as they are reducing code readability
alerts = request.data.get("alerts", []) | ||
|
||
data = request.data | ||
if "firingAlerts" not in request.data: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
consider adding a check for max_alerts
(or limit alerts), to avoid processing alerts from misconfigured alertmanagers
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It won't help us. If we receive request which suits our size limits, iteration over array which already in memory is almost free
@@ -1,38 +1,50 @@ | |||
# Main | |||
enabled = True | |||
title = "Alertmanager" | |||
title = "AlertManager" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
keep "Alertmanager", this is how it called by the authors https://prometheus.io/docs/alerting/latest/alertmanager/
|
||
# Slack | ||
# Slack templates |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's be consistent with the comments
|
||
slack_image_url = None | ||
|
||
# SMS |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's be consistent with the comments
@@ -65,7 +65,7 @@ | |||
FEATURE_INBOUND_EMAIL_ENABLED = getenv_boolean("FEATURE_INBOUND_EMAIL_ENABLED", default=False) | |||
FEATURE_PROMETHEUS_EXPORTER_ENABLED = getenv_boolean("FEATURE_PROMETHEUS_EXPORTER_ENABLED", default=False) | |||
FEATURE_WEBHOOKS_2_ENABLED = getenv_boolean("FEATURE_WEBHOOKS_2_ENABLED", default=True) | |||
FEATURE_SHIFT_SWAPS_ENABLED = getenv_boolean("FEATURE_SHIFT_SWAPS_ENABLED", default=False) | |||
FEATURE_SHIFT_SWAPS_ENABLED = getenv_boolean("FEATURE_SHIFT_SWAPS_ENABLED", default=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mistake
Introduce AlertManager v2 integration with improved internal behaviour it's using grouping from AlertManager, not trying to re-group alerts on OnCall side. Existing AlertManager and Grafana Alerting integrations are marked as Legacy with options to migrate them manually now or be migrated automatically after DEPRECATION DATE(TBD). Integration urls and public api responses stay the same both for legacy and new integrations. --------- Co-authored-by: Rares Mardare <rares.mardare@grafana.com> Co-authored-by: Joey Orlando <joey.orlando@grafana.com>
Introduce AlertManager v2 integration with improved internal behaviour – it's using grouping from AlertManager, not trying to re-group alerts on OnCall side.
Existing AlertManager and Grafana Alerting integrations are marked as Legacy with options to migrate them manually now or be migrated automatically after DEPRECATION DATE(TBD).
Integration urls and public api responses stay the same both for legacy and new integrations.