Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix duplicate orders for user notification policies #2278

Merged
merged 27 commits into from
Jun 21, 2023

Conversation

vadimkerr
Copy link
Member

@vadimkerr vadimkerr commented Jun 16, 2023

What this PR does

Fixes an issue when multiple user notification policies have duplicated order values, leading to the following unexpected behaviours:

  1. Not possible to rearrange notification policies that have duplicated orders.
  2. The notification system only executes the first policy from each order group. For example, if there are policies with orders [0, 0, 0, 0], only the first policy will be executed, and all others will be skipped. So the user will see four policies in the UI, while only one of them will be actually executed.

This PR fixes the issue by adding a unique index on (user_id, important, order) for UserNotificationPolicy model. However, it's not possible to add that unique index using the ordering library that we use due to it's implementation details.
I added a new abstract Django model OrderedModel that's able to work with such unique indices + under concurrent load.

Important info on this new OrderedModel abstract model:

  • Orders are unique on the DB level
  • Orders are allowed to be non-consecutive, for example order sequence [100, 150, 400] is valid
  • When deleting an instance, orders of other instances don't change. This is a notable difference from the library we use. I think it's better to only delete the instance without changing any other orders, because it reduces the number of dependencies between instances (e.g. Terraform drift will be much smaller this way if a policy is deleted via the web UI).

Which issue(s) this PR fixes

Related to https://github.com/grafana/oncall-private/issues/1680

Checklist

  • Unit, integration, and e2e (if applicable) tests updated
  • Documentation added (or pr:no public docs PR label added if not required)
  • CHANGELOG.md updated (or pr:no changelog PR label added if not required)

return _retry_with_params


class OrderedModel(models.Model):
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Main focus of the PR

def create(self, validated_data):
prev_step = validated_data.pop("prev_step", None)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

prev_step param seems to be deprecated, I couldn't find any usages

@@ -208,7 +208,7 @@ services:
container_name: mysql
labels: *oncall-labels
image: mysql:8.0.32
command: --default-authentication-plugin=mysql_native_password --character-set-server=utf8mb4 --collation-server=utf8mb4_unicode_ci
command: --default-authentication-plugin=mysql_native_password --character-set-server=utf8mb4 --collation-server=utf8mb4_unicode_ci --max_connections=1024
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Increasing max_connections for local concurrency tests that open more connections than default number (151)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what do you think about adding a ./dev/mysql.ini file?

We have four containers (two in both this file + docker-compose-mysql-rabbitmq.yml) which run the mysql image. We could de-dupe this config and just volume mount this new file into all of these containers.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All 4 mysql containers share only these two settings: --character-set-server=utf8mb4 and --collation-server=utf8mb4_unicode_ci. Not sure if deduping those is worth of the extra complexity added by volume mounting the ini file. I'll take a closer look at this, I think it should be outside of the scope of this PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that it's outside the scope of this PR 👍 just a thought that came to mind 😄

@vadimkerr vadimkerr added the pr:no public docs Added to a PR that does not require public documentation updates label Jun 19, 2023
@@ -9,6 +9,6 @@ banned-modules =
[pytest]
# https://pytest-django.readthedocs.io/en/latest/configuring_django.html#order-of-choosing-settings
# https://pytest-django.readthedocs.io/en/latest/database.html
addopts = --color=yes --showlocals
addopts = --no-migrations --color=yes --showlocals
Copy link
Member Author

@vadimkerr vadimkerr Jun 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This allows creating models on demand in tests, e.g. when testing abstract models such as TestOrderedModel.
Without --no-migrations it would be required to generate & store migrations for such test models.

@vadimkerr vadimkerr marked this pull request as ready for review June 20, 2023 15:35
@vadimkerr vadimkerr requested a review from a team June 20, 2023 15:35
Copy link
Contributor

@matiasb matiasb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@vadimkerr vadimkerr added this pull request to the merge queue Jun 21, 2023
Merged via the queue into dev with commit eada4a4 Jun 21, 2023
@vadimkerr vadimkerr deleted the vadimkerr/ordered-model branch June 21, 2023 11:17
@matiasb matiasb mentioned this pull request Jun 22, 2023
github-merge-queue bot pushed a commit that referenced this pull request Jul 18, 2023
# What this PR does

Fix duplicate `order` values for models `EscalationPolicy` and
`ChannelFilter` using the same approach as
#2278.

- Make internal API action `move_to_position` a part of
[OrderedModelViewSet](https://github.com/grafana/oncall/pull/2568/files#diff-eb62521ccbcb072d1bd2156adeadae3b5895bce6d0d54432a23db3948b0ada54R11-R34),
so all ordered model views use the same logic.
- Make public API serializers for ordered models inherit from
[OrderedModelSerializer](https://github.com/grafana/oncall/pull/2568/files#diff-d749f94af5a49adaf5074325cdfad10ddd5a52dbfd78b49582867ebb9c92fae5R6-R38),
so ordered model views are consistent with each other in public API.
- Remove `order` from plugin fronted, since it's not being used
anywhere. The frontend uses step indices & `move_to_position` action
instead.
- Make escalation snapshot use step indices instead of orders, since
orders are not guaranteed to be sequential (+fix a minor off-by-one bug)

## Which issue(s) this PR fixes

grafana/oncall-private#1680

## Checklist

- [x] Unit, integration, and e2e (if applicable) tests updated
- [x] Documentation added (or `pr:no public docs` PR label added if not
required)
- [x] `CHANGELOG.md` updated (or `pr:no changelog` PR label added if not
required)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
pr:no public docs Added to a PR that does not require public documentation updates
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants