-
Notifications
You must be signed in to change notification settings - Fork 8.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Uptime] Refactor cert alerts from batched to individual #102138
[Uptime] Refactor cert alerts from batched to individual #102138
Conversation
Pinging @elastic/uptime (Team:uptime) |
@elasticmachine merge upstream |
…b.com/dominiqueclarke/kibana into feature/101150-individual-cert-alerts
@elasticmachine merge upstream |
id: CLIENT_ALERT_TYPES.TLS_LEGACY, | ||
iconClass: 'uptimeApp', | ||
documentationUrl(docLinks) { | ||
return `${docLinks.ELASTIC_WEBSITE_URL}guide/en/uptime/${docLinks.DOC_LINK_VERSION}/uptime-alerting.html#_tls_alerts`; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please update relevant docs as well especially for message , otherwise create a follow up issue for docs team.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How should we handle the deprecation of the other rule type in the docs?
@@ -8,14 +8,31 @@ | |||
import { i18n } from '@kbn/i18n'; | |||
|
|||
export const TlsTranslations = { | |||
defaultActionMessage: i18n.translate('xpack.uptime.alerts.tls.legacy.defaultActionMessage', { | |||
defaultMessage: `Detected TLS certificate {commonName} from issuer {issuer} is expiring or becoming too old. Certificate {summary} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Message should be more accurate like we discussed offline
) | ||
.valueOf(); | ||
const alertInstance = alertInstanceFactory( | ||
`${TLS.id}-${cert.common_name}-${cert.issuer}` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i don't think we need tls.id
common_name and issue should be enough. tls.id will pollute ui where we are displaying list of instances in alert management ui
const uniqueCerts = certs.reduce<Cert[]>((uniqueCertArray, currentCert) => { | ||
if (!uniqueCertArray.find((cert) => cert.common_name === currentCert.common_name)) { | ||
uniqueCertArray.push(currentCert); | ||
} | ||
|
||
return uniqueCertArray; | ||
}, []); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i don't think we need this reduce, if you check getCerts function, es query is a collapse on tls.server.hash.sha256
which means it will only return unique certificates. common_name actually can be same, tls.server.hash.sha256
will always be different.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. I will use sha256
for the id then, as using common_name
for the individual alert instance id was causing duplicate scheduling issues.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i would say keep common name at start of instance id and sha256 after that. sha256 isn't very helpful in identification.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe append common name + index or something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As discussed offline the id will be common name + issuer + sha256
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are we keeping this reduce? i think we don't need this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@shahzad31 , you're right, my mistake.
…0-individual-cert-alerts
…b.com/dominiqueclarke/kibana into feature/101150-individual-cert-alerts
@dominiqueclarke can we change the alert flyout , notify default for this alert, i think , by default we should set it to On status change means, user will be only notified once, i think for this it make sense to remind them after interval. |
That means that you would get an action connector message each time the executor runs for the set interval, the default of which is 1 minute. That seems a bit excessive, if our users aren't updating the interval. Perhaps we should also update the interval. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't have any blocking concerns besides the extra reduce method, otherwise looks good.
Great Job !!
💚 Build SucceededMetrics [docs]Module Count
Page load bundle
History
To update your PR or re-run it, just comment with: |
) * refactor cert alerts from batched to individual * remove old translations * create new certificate alert rule type and transition old cert rule type to legacy * update translations * maintain legacy tls rule UI to support legacy rule editing * update translations * update TLS alert content, rule type id, and alert instance id schema * remove extraneous logic and format date content Co-authored-by: Kibana Machine <42973632+kibanamachine@users.noreply.github.com>
💚 Backport successful
This backport PR will be merged automatically after passing CI. |
…103035) * refactor cert alerts from batched to individual * remove old translations * create new certificate alert rule type and transition old cert rule type to legacy * update translations * maintain legacy tls rule UI to support legacy rule editing * update translations * update TLS alert content, rule type id, and alert instance id schema * remove extraneous logic and format date content Co-authored-by: Kibana Machine <42973632+kibanamachine@users.noreply.github.com> Co-authored-by: Dominique Clarke <doclarke71@gmail.com>
Fixes #101150
This PR transitions batched certificate alerts into individual certificate alerts. Previously, 1 alert was generated for all certs that listed a summary of the number of expiring and aging certs. Now there is one alert for each expired or aging cert along with context on it's status. An example of the alert content can be found below.
Here are some highlights form the implementation
Note
This PR will break the current batched cert alert experience. The new alert executor function will no longer provides the action variables (for example, expiring/aging count and batched summary strings) necessary for the batched alert message, and the new individual alert instances will not work with the old default action message, as it would create duplicate alert strings which the alerting framework rejects as duplicate alert instances. This creates a need for a migration to transition the default action message to the updated individual message. Some consideration may need to happen for users who have created custom action messages that expect the previous action variables.This PR creates a new TLS rule type
tlsIndividual
, while maintaining support for the old rule type. All of the old rule type server side code is maintained in a separate code path in order to keep compatibility with old batched certificate rules.Author Checklist
note: This feature is purely technical in nature, and does not directly touch UI elements that need accessibility considerations.
Telemetry has been added where relevant@justinkambic @shahzad31 Do we currently have telemetry in place for alerts?Docs have been added to this PR covering any new, changed, or removed featuresAny special/edge cases that need to be manually tested must be documentednote: This feature is purely technical in nature, and does not directly touch UI elements that need accessibility considerations.
Reviewer Checklist
For maintainers