New Time Slice SLO #20888

Merged
merged 42 commits into from
Dec 18, 2023

Conversation

estherk15
Contributor

@estherk15 estherk15 commented Dec 4, 2023

What does this PR do? What is the motivation?

Merge instructions

Do not merge, pending PM approval

@estherk15 estherk15 added the WORK IN PROGRESS No review needed, it's a wip ;) label Dec 4, 2023
@estherk15 estherk15 requested a review from a team as a code owner December 4, 2023 19:25
@github-actions github-actions bot added Architecture Everything related to the Doc backend Images Images are added/removed with this PR labels Dec 4, 2023
@estherk15 estherk15 added editorial review Waiting on a more in-depth review and removed WORK IN PROGRESS No review needed, it's a wip ;) labels Dec 11, 2023
@estherk15 estherk15 requested review from a team as code owners December 13, 2023 12:01
@github-actions github-actions bot added the Guide Content impacting a guide label Dec 13, 2023
@github-actions github-actions bot removed the Guide Content impacting a guide label Dec 13, 2023
@estherk15 estherk15 removed request for a team and rodrigo-roca December 13, 2023 12:02
@estherk15
Contributor Author

@roxanne-moslehi added the examples you provided, thank you!

## Overview

When creating SLOs, you can choose from the following types:
- **Metric-based SLOs**: can be used for count-based data streams, the SLI is based on the sum of good events divided by the sum of total events.

Here we say "data streams" but below we say "data sets". I would personally just say "data". It's true that the data has to be some kind of count. It is also fine to say "the SLI is calculated as the sum of the good events divided by the sum of total events" rather than "based on". It literally is that calculation.
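
To make the calculation in this comment concrete, here is a minimal Python sketch of a metric-based SLI. The function name and the sample counts are hypothetical and purely illustrative; this is not Datadog's implementation, only the arithmetic described above.

```python
def metric_based_sli(good_events, total_events):
    """Return the SLI as a percentage: sum(good) / sum(total) * 100.

    Both arguments are sequences of event counts (hypothetical inputs).
    Returns None when there are no events at all, which is an assumption
    made for this sketch rather than documented product behavior.
    """
    total = sum(total_events)
    if total == 0:
        return None
    return 100.0 * sum(good_events) / total


# Hypothetical counts for three reporting intervals.
good = [98, 100, 95]
total = [100, 100, 100]
print(metric_based_sli(good, total))
```

The point of the sketch is that the counts are summed first and divided once, rather than averaging per-interval ratios.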


When creating SLOs, you can choose from the following types:
- **Metric-based SLOs**: can be used for count-based data streams, the SLI is based on the sum of good events divided by the sum of total events.
- **Monitor-based SLOs**: can be used for time-based data sets, the SLI is based on the amount of time your system exhibits good behavior divided by the total time. Monitor-based SLOs must be based on a new or existing Datadog monitor, any adjustments must be made to the underlying monitor (cannot be done through SLO creation).

It's not really true that the data set needs to be "time based". That describes the calculation but not the data. They can use any data at all. It would be more correct to say that the SLI is based on the monitor's uptime (times when the monitor is not in the "Alert" state).
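
As a rough illustration of "uptime as time not spent in the Alert state", the following Python sketch computes an uptime percentage from a list of monitor states and durations. The input shape, the function name, and the state labels other than "Alert" are assumptions made for the example, not the monitor API.

```python
def monitor_uptime_pct(state_durations):
    """Return the percentage of time a monitor was not in the "Alert" state.

    state_durations is a hypothetical list of (state, seconds) pairs.
    Returns None if the total duration is zero (an assumption for this sketch).
    """
    total = sum(seconds for _, seconds in state_durations)
    if total == 0:
        return None
    up = sum(seconds for state, seconds in state_durations if state != "Alert")
    return 100.0 * up / total


# Hypothetical history: 90 seconds OK, then 10 seconds in Alert.
history = [("OK", 90), ("Alert", 10)]
print(monitor_uptime_pct(history))
```

Note that the underlying data never appears here, only the monitor's state over time, which is why "time-based" describes the calculation and not the data.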

When creating SLOs, you can choose from the following types:
- **Metric-based SLOs**: can be used for count-based data streams, the SLI is based on the sum of good events divided by the sum of total events.
- **Monitor-based SLOs**: can be used for time-based data sets, the SLI is based on the amount of time your system exhibits good behavior divided by the total time. Monitor-based SLOs must be based on a new or existing Datadog monitor, any adjustments must be made to the underlying monitor (cannot be done through SLO creation).
- **Time Slice SLOs**: can be used for time-based data sets, the SLI is based on the amount of time your system exhibits good behavior divided by the total time. Time Slice SLOs do not require a Datadog monitor, you can try out different metric filters and thresholds and instantly explore downtime during SLO creation.

Same comment here. "time-based" describes the calculation, but not the data. The way we describe this is that it can be used with any kind of data, and "good behavior" is defined using the condition specified by the user.
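
A small Python sketch of this framing: any series of values can be used, the user supplies the condition that defines "good behavior", and the SLI is good time divided by total time. Equal-length slices, the function name, and the latency threshold are all assumptions for illustration.

```python
def time_slice_sli(values, is_good):
    """Return the percentage of slices whose value satisfies the condition.

    values holds one measurement per equal-length time slice (hypothetical
    input), and is_good is the user-specified condition. Because the slices
    have equal length, good slices / total slices equals good time / total time.
    """
    if not values:
        return None
    good = sum(1 for value in values if is_good(value))
    return 100.0 * good / len(values)


# Hypothetical latency samples in milliseconds; "good" means under 200 ms.
latencies_ms = [120, 180, 450, 90]
print(time_slice_sli(latencies_ms, lambda ms: ms < 200))
```

Changing the lambda is the equivalent of trying a different threshold during SLO creation: the data stays the same and only the definition of good behavior changes.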

| **Handling missing data in the SLO calculation** | Missing data is ignored in SLO status and error budget calculations | Missing data is handled based on the [underlying Monitor's configuration][6] | Missing data is treated as uptime in SLO status and error budget calculations |
| **Uptime Calculations** | N/A | Uptime calculations are based on the underlying Monitor <br><br>If groups are present, overall uptime requires *all* groups to have uptime| [Uptime][7] is calculated by looking at discrete time chunks, not rolling time windows<br><br>If groups are present, overall uptime requires *all* groups to have uptime |
| **Calendar View on SLO Manage Page** | Available | Not available | Available |
| **Public [APIs][8] and Terraform Support** | Available | Available | Not available |

For time-slices can we say "Not yet available" or "Coming soon"?

Contributor Author

Same as above
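
For readers of the comparison rows above, here is a hedged Python sketch of the two Time Slice behaviors they describe: missing data is treated as uptime, and when groups are present, overall uptime requires all groups to have uptime. The data layout (a dict of per-group series with `None` marking missing data) and the function names are assumptions for illustration only.

```python
def slice_is_up(value, is_good):
    """A slice counts as uptime if data is missing (None) or the condition holds."""
    return value is None or is_good(value)


def overall_uptime_pct(groups, is_good):
    """Return the percentage of slices in which every group has uptime.

    groups maps a hypothetical group name to a list with one value per
    time slice; all lists must have the same length.
    """
    series = list(groups.values())
    if not series or not series[0]:
        return None
    n_slices = len(series[0])
    if any(len(s) != n_slices for s in series):
        raise ValueError("all groups must cover the same slices")
    up = sum(
        1
        for i in range(n_slices)
        if all(slice_is_up(s[i], is_good) for s in series)
    )
    return 100.0 * up / n_slices


# Hypothetical latency per slice for two groups; None marks missing data.
groups = {
    "us": [100, None, 300, 100],
    "eu": [100, 100, 100, 250],
}
print(overall_uptime_pct(groups, lambda ms: ms < 200))
```

In this example the second slice still counts as uptime despite the missing value, while the third and fourth slices count as downtime because one group violates the condition in each.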

| **SLO alerting ([Error Budget][1] or [Burn Rate][2] Alerts)** | Available | Available for SLOs based on Metric Monitor types only (not available for Synthetic Monitors or Service Checks) | Not available |
| [**SLO Status Corrections**][3] | Correction periods are ignored from SLO status calculation | Correction periods are ignored from SLO status calculation | Correction periods are counted as uptime in SLO status calculation |
| **[SLO Widgets][4] (up to 90 days of historical data)** | Available | Available | Available |
| [**SLO Data Source**][5] | Available (with up to 15 months of historical data) | Not available | Not available |

For time slices can we say "Coming soon"?

Contributor Author

We try to stay away from future promises in docs. We can update the docs as soon as features are available!

@estherk15 estherk15 merged commit 100b8a0 into master Dec 18, 2023
12 checks passed
@estherk15 estherk15 deleted the esther/docs-6808-time-slice-slo branch December 18, 2023 22:35
MaelNamNam pushed a commit that referenced this pull request Jan 17, 2024
* Add time slice to left nav

* Add time slice instructions and images

* Add uptime calculations page

* Add uptime calculations to left nav

* Standardize use of Time Slice SLO

* Remove duplicate file

* Merge uptime with time slice

* Add SLO comparison chart

* Apply code review suggestions

* Update content/en/service_management/service_level_objectives/_index.md

* Apply suggestions from code review

Co-authored-by: jhgilbert <jen.gilbert@datadoghq.com>

* Apply suggestions from code review, removed commented examples

* Add examples with images

* minor changes

* API info comparison chart

* update comparison chart

* update comparison chart again

* fix status correction info

* update SLO definitions

* calendar view info

---------

Co-authored-by: jhgilbert <jen.gilbert@datadoghq.com>
Co-authored-by: Roxanne Moslehi <roxanne.moslehi@datadoghq.com>
Labels
Architecture Everything related to the Doc backend editorial review Waiting on a more in-depth review Images Images are added/removed with this PR
4 participants