Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Add a new third party callback check_event_allowed_v2 that is compatible with new batch persisting mechanisms. #15131

Closed
wants to merge 13 commits into from

Conversation

H-Shay
Copy link
Contributor

@H-Shay H-Shay commented Feb 21, 2023

This is a suggestion for how to rework the third party rules callback check_event_allowed so that it is compatible with the newly added batch persisting mechanism, added in #13487 and #13800. We currently only batch up events during room creation, but there is a desire to integrate this functionality elsewhere as it has the potential to improve performance (by reducing round trips to the DB).

However, check_event_allowed as it is currently implemented allows for a module writer to use this callback to send an event into a room, triggered by the event being checked by check_event_allowed. This causes a problem when the event triggering the secondary event is part of a batch, as it forks the DAG.

To alleviate this, I propose adding a second version of check_event_allowed, check_event_allowed_v2 which explicitly allows for a module writer to optionally provide an event dict that will form the basis of the new additional event they would like to send into the room. The event dict is then either turned into an event immediately after the original event, if the original event is not part of a batch, or if the original event is part of a batch the event dict is turned into an event after each of the batched events have already been created and added to the end of the batch for persisting.

In this scheme only one of check_event_allowed or check_event_allowed_v2 could be active at a time, and if both are provided check_event_allowed will be executed rather than check_event_allowed_v2. The idea would be to eventually deprecate check_event_allowed in favor of check_event_allowed_v2.

Best reviewable commit-by-commit.

Copy link
Contributor

@reivilibre reivilibre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I owe you a proper review but I think these questions are probably blockers for doing a fair review.

docs/modules/third_party_rules_callbacks.md Show resolved Hide resolved
docs/modules/third_party_rules_callbacks.md Show resolved Hide resolved
Copy link
Contributor

@reivilibre reivilibre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems sound.

I bring a few comments (mostly just pedantry) and ideas, but nothing wrong other than I don't see anywhere we let modules register these (...which could be me being bad at reading!).

I'd be tempted to support persisting multiple-events because I can see it being useful (e.g. in the frozen room use case: we might post an informational message about the room being frozen alongside freezing it. In other cases, we might want to update multiple state keys in one go) and it'd seem a shame to have to think about a v3 instead of introducing a list here and now (it doesn't seem too hard to do?).

@@ -234,15 +241,16 @@ async def check_event_allowed(
self,
event: EventBase,
context: UnpersistedEventContextBase,
) -> Tuple[bool, Optional[dict]]:
) -> Tuple[bool, Optional[dict], Optional[dict]]:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bikeshed risk: should we allow the creation of multiple events so we don't wind up having to introduce a v3 for that later? :-)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't tell if I am being a curmudgeon here but my instinct would be to say no, as honestly in a perfect world one would not be able to inject events here at all - but since it's already been done this is a way to gracefully work around it. I feel like adding multiple events creates needless complexity, but again, I might just be being a curmudgeon.

else:
for v2_callback in self._check_event_allowed_v2_callbacks:
try:
res, replacement_data, new_event = await delay_cancellation(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of introducing a v2, I wonder if we could just widen the return type to accept either doubles or triples, with doubles being interpreted the same as before and triples being interpreted the same as what you're doing for v2.
This could be done with a somewhat nasty Union for the return type.

I'm not sure how that sounds to you?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think since we don't want to mix v1 and v2 callbacks (ie we want to eventually deprecate the v1 callbacks) it makes sense to not do this. I also have a distaste for Union return types, it always seems unwieldy to deal with them if you can avoid it.

@@ -32,6 +32,10 @@
CHECK_EVENT_ALLOWED_CALLBACK = Callable[
[EventBase, StateMap[EventBase]], Awaitable[Tuple[bool, Optional[dict]]]
]
CHECK_EVENT_ALLOWED_V2_CALLBACK = Callable[
[EventBase, StateMap[EventBase]],
Awaitable[Tuple[bool, Optional[dict], Optional[dict]]],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is merely a suggestion, but I'd be tempted to replace this tuple with an attrs or dataclass class. That buys you:

  • named fields (helps disambiguate the two Optional[dict]s; also means we can document the individual pieces more easily IMO)
  • an easier time for extending the return type in the future (we can add new fields with default values)

At this stage it's not really critical, but if this tuple grew any larger I'd start to think it was getting a bit unwieldy.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm on the fence about this, here's my thinking: I agree that a class might be cleaner (although only slightly, my hope is that the tuple does not grow any larger!) but I wonder about the utility for module-writers. It seems easier to say "just give us a dict" rather than describing a class and expecting them to match their data to the class, but I may be wrong here. I'm open to being convinced otherwise.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

under this hypothetical situation, we would provide the dataclass and they would just instantiate it, e.g. they'd return something like

return EventAllowedResult(
    allowed=True,
    replace={"x": "y"},
)

return res, None, None
elif isinstance(replacement_data, dict):
return True, replacement_data, None
else:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious as to why we don't support a mix of v1 and v2 callbacks. Pedantically this could make it hard to upgrade as you'd need to upgrade all relevant modules at once, rather than doing it bit by bit. Realistically, this may not be an issue as I don't know if anyone runs with more than one such module anyway?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main reason is that we'd like to phase out the v1 callbacks - they are not compatible with the batching mechanism, which is why the v2 callback which is compatible with the batching mechanism was proposed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought it was just the sending of additional events that was not compatible when done in v1 callbacks and with batching?

I think it would be OK to support v1+v2 as a transitional period with the caveat that v1 can't send additional events.

I think this would be good practice but I don't know if it's truly warranted here given the relatively low use of modules, so I'm tempted to actually just accept use of either v1 or v2 if that prevents having to think harder about this.

synapse/events/third_party_rules.py Show resolved Hide resolved
@@ -155,6 +159,9 @@ def __init__(self, hs: "HomeServer"):
self._storage_controllers = hs.get_storage_controllers()

self._check_event_allowed_callbacks: List[CHECK_EVENT_ALLOWED_CALLBACK] = []
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(just as a note if it interests you, particularly relevant when combined with 'why not both v1 and v2?'):
In the past, for callback upgrades like these, we've sometimes made an adapter for the old callbacks and registered the adapter as a v2 callback (and then removed the list of v1 callbacks, since they're all wrapped).
This has sometimes proved easier than trying to call each callback in the way it was originally defined.

To illustrate (but I'd put proper variable names and type annotations on, just a tad fiddly to do from within GitHub without having it all fresh in mind):

def check_event_allowed_v1_v2_adapter(v1: CHECK_EVENT_ALLOWED_CALLBACK) -> CHECK_EVENT_ALLOWED_V2_CALLBACK:
    async def adapter(x, y):
        a, b = await v1(x, y)
        return a, b, None
    return adapter

Comment on lines +334 to +340
# FIXME: Being able to throw SynapseErrors is relied upon by
# some modules. PR #10386 accidentally broke this ability.
# That said, we aren't keen on exposing this implementation detail
# to modules and we should one day have a proper way to do what
# is wanted.
# This module callback needs a rework so that hacks such as
# this one are not necessary.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bikeshed risk: is this a good time to address this? If it looks like more than a tiny amount of work I'm happy to leave it, but if the only reason we haven't done this was 'it's not pressing enough to introduce a breaking change' then now might be a good opportunity.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fascinatingly you were the one to add this todo: https://github.com/matrix-org/synapse/pull/11042/files

Do you remember what the discussion was at time/why it wasn't fixed then? I don't see an issue for the regression so I am a little unclear on what the original problem was.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the vague problem is that some modules would reject events by return False, others would raise SynapseError(403, "This event is not allowed because blahblahblah") — in effect, they could give it a custom error code and error message.

We don't like the latter — it's not explicitly supported but makes use of the fact that SynapseErrors bubble up automatically and get turned into Matrix/API-level errors (in the servlet? I think).
If we care about this mechanism then we should spec support for it without relying on exceptions, it feels like.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm well it seems to me given the context that fixing this might be outside the purview of this particular PR?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well OK, it's just that this needs a new version of the callback so it would have been possibly better to do the changes together for the same version

docs/modules/third_party_rules_callbacks.md Outdated Show resolved Hide resolved
docs/modules/third_party_rules_callbacks.md Outdated Show resolved Hide resolved
with the event being checked, if this is the case the module writer must provide a dict that
will form the basis of the event that is to be added to the room and it must be returned by `check_event_allowed_v2`.
This dict will then be turned into an event at the appropriate time and it will be persisted after the event
that triggered it, and if the event that triggered it is in a batch of events for persisting, it will be added to the
end of that batch.
end of that batch. Note that the event MAY NOT be a membership event.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
end of that batch. Note that the event MAY NOT be a membership event.
end of that batch. Note that the event MUST NOT be a membership event.

if we're going for RFC conventions (the capitals make me feel like it), MUST NOT is the term to use

(
event,
unpersisted_context,
_,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so (if I can read correctly) we don't let the callbacks on third-party events themselves generate more events? This could be worth putting in a comment and in the docs for the callback?

@clokep
Copy link
Member

clokep commented Aug 24, 2023

@H-Shay Is this still useful? Should we revive it?

@H-Shay
Copy link
Contributor Author

H-Shay commented Aug 24, 2023

It's not totally useless as it unlocks a further (minimal-moderate) increase in the speed of creating rooms - but I was going to close it as I don't think the benefits are worth the time to rehab it.

@H-Shay H-Shay closed this Aug 24, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants