This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Add a new third party callback check_event_allowed_v2 that is compatible with new batch persisting mechanisms. #15131

Closed
wants to merge 13 commits into from
1 change: 1 addition & 0 deletions changelog.d/15131.misc
@@ -0,0 +1 @@
Add a new third party callback `check_event_allowed_v2` that is compatible with new batch persisting mechanisms.
69 changes: 69 additions & 0 deletions docs/modules/third_party_rules_callbacks.md
@@ -10,6 +10,75 @@ The available third party rules callbacks are:

### `check_event_allowed_v2`

_First introduced in Synapse v1.7x.x_

```python
async def check_event_allowed_v2(
event: "synapse.events.EventBase",
state_events: "synapse.types.StateMap",
) -> Tuple[bool, Optional[dict], Optional[dict]]
```

**<span style="color:red">
This callback is very experimental and can and will break without notice. Module developers
are encouraged to implement `check_event_for_spam` from the spam checker category instead.
</span>**

Returns:

- A tuple consisting of:

- a boolean representing whether or not the event is allowed
- an optional dict to form the basis of a replacement event for the event
- an optional dict to form the basis of an additional event to be sent into the
room

Called when processing any incoming event, with the event and a `StateMap`
representing the current state of the room the event is being sent into. A `StateMap` is
a dictionary that maps tuples containing an event type and a state key to the
corresponding state event. For example, retrieving the room's `m.room.create` event from
the `state_events` argument would look like this: `state_events.get(("m.room.create", ""))`.
The first element of the tuple returned by the module must be a boolean indicating
whether the event is allowed.
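
As a rough illustration of the lookup described above, the sketch below models a `StateMap` as a plain dictionary keyed by `(event type, state key)` tuples. The `StateMap` alias, the `get_room_creator` helper, and the dict-shaped events are stand-ins invented so the example runs without Synapse installed; a real callback receives event objects as values.

```python
from typing import Dict, Optional, Tuple

# Stand-in for synapse.types.StateMap: (event type, state key) -> state event.
# Assumption: plain dicts are used as values only to keep the sketch runnable.
StateMap = Dict[Tuple[str, str], dict]


def get_room_creator(state_events: StateMap) -> Optional[str]:
    """Hypothetical helper: return the sender of m.room.create, if present."""
    create_event = state_events.get(("m.room.create", ""))
    if create_event is None:
        return None
    return create_event.get("sender")


example_state: StateMap = {
    ("m.room.create", ""): {"type": "m.room.create", "sender": "@alice:example.com"},
    ("m.room.member", "@alice:example.com"): {"content": {"membership": "join"}},
}
```

Using `.get(...)` rather than indexing avoids a `KeyError` when the state map does not contain the requested entry.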

Note that this callback function processes incoming events coming via federation
traffic (on top of client traffic). This means denying an event might cause the local
copy of the room's history to diverge from that of remote servers. This may cause
federation issues in the room. It is strongly recommended to only deny events using this
callback function if the sender is a local user, or in a private federation in which all
servers are using the same module, with the same configuration.
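
One way to follow this recommendation is to gate any denial on the sender being local. The sketch below is only an assumption about how that check might look: `is_local_sender` and `may_deny` are hypothetical helpers that compare the domain part of a Matrix user ID with the homeserver name, and a real module would more likely rely on whatever locality check the module API offers.

```python
def is_local_sender(sender: str, server_name: str) -> bool:
    """Hypothetical helper: does this user ID belong to our homeserver?

    Matrix user IDs have the form "@localpart:domain". The localpart cannot
    contain a colon, so everything after the first colon is the domain.
    """
    _, _, domain = sender.partition(":")
    return domain == server_name


def may_deny(sender: str, server_name: str, violates_policy: bool) -> bool:
    """Only deny policy violations when the sender is a local user."""
    return violates_policy and is_local_sender(sender, server_name)
```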

If the boolean returned by the module is `True`, the module may also tell Synapse to
replace the event with new data by returning the new event's data as a dictionary. To do
so, it is recommended that the module call `event.get_dict()` to get the current event as
a dictionary, and modify the returned dictionary accordingly.
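
A minimal sketch of the replacement flow follows. `FakeEvent` is a hypothetical stand-in for `synapse.events.EventBase` that exposes only `get_dict()`, and the whitespace-trimming rule is invented purely for illustration. The content is copied before modification because the event handed to the callback should be treated as read-only.

```python
import asyncio
from typing import Any, Dict, Optional, Tuple


class FakeEvent:
    """Hypothetical stand-in for synapse.events.EventBase."""

    def __init__(self, data: Dict[str, Any]) -> None:
        self._data = data

    def get_dict(self) -> Dict[str, Any]:
        # Return a shallow copy so callers cannot mutate the event itself.
        return dict(self._data)


async def check_event_allowed_v2(
    event: FakeEvent, state_events: Dict[Tuple[str, str], Any]
) -> Tuple[bool, Optional[dict], Optional[dict]]:
    event_dict = event.get_dict()
    # Copy the content before editing it; never modify the original in place.
    content = dict(event_dict.get("content", {}))
    if event_dict.get("type") != "m.room.message" or "body" not in content:
        return True, None, None  # allowed, no replacement, no extra event
    content["body"] = content["body"].strip()
    event_dict["content"] = content
    return True, event_dict, None
```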

Module writers may also wish to use this check to send an additional event into the room
alongside the event being checked. To do so, `check_event_allowed_v2` must return, as the
third element of the tuple, a dict that will form the basis of the additional event.
Synapse turns this dict into an event at the appropriate time and persists it after the
event that triggered it. If the triggering event is part of a batch of events being
persisted, the additional event is added to the end of that batch.
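
The following sketch shows a callback that allows every event and asks for one additional event when a user joins. `FakeEvent` is a hypothetical stand-in for the real event class, and the set of keys in the returned dict (`type`, `room_id`, `sender`, `content`) is an assumption, since the text above does not specify which fields Synapse expects.

```python
import asyncio
from typing import Any, Dict, Optional, Tuple


class FakeEvent:
    """Hypothetical stand-in exposing a few EventBase-like attributes."""

    def __init__(self, type: str, room_id: str, sender: str, content: dict) -> None:
        self.type = type
        self.room_id = room_id
        self.sender = sender
        self.content = content


async def check_event_allowed_v2(
    event: FakeEvent, state_events: Dict[Tuple[str, str], Any]
) -> Tuple[bool, Optional[dict], Optional[dict]]:
    additional_event: Optional[dict] = None
    if event.type == "m.room.member" and event.content.get("membership") == "join":
        # Assumed shape for the basis of the extra event.
        additional_event = {
            "type": "m.room.message",
            "room_id": event.room_id,
            "sender": event.sender,
            "content": {"msgtype": "m.notice", "body": "Joined the room."},
        }
    # Allowed, no replacement, optional additional event.
    return True, None, additional_event
```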

If `check_event_allowed_v2` raises an exception, the module is assumed to have failed.
The event will not be accepted but is not treated as explicitly rejected, either.
An HTTP request causing the module check will likely result in a 500 Internal
Server Error.

When the boolean returned by the module is `False`, the event is rejected.
(Module developers should not use exceptions for rejection.)

Note that replacing the event or adding an event only works for events sent by local
users, not for events received over federation.

If multiple modules implement this callback, they will be considered in order. If a
callback returns `True`, Synapse falls through to the next one. The value of the first
callback that does not return `True` will be used. If this happens, Synapse will not call
any of the subsequent implementations of this callback.

This callback cannot be used in conjunction with `check_event_allowed`: only one of these
callbacks may be operational at a time. If both `check_event_allowed` and
`check_event_allowed_v2` are active, only `check_event_allowed` will be executed.

### `check_event_allowed`

_First introduced in Synapse v1.39.0_

```python
101 changes: 70 additions & 31 deletions synapse/events/third_party_rules.py
@@ -32,6 +32,10 @@
CHECK_EVENT_ALLOWED_CALLBACK = Callable[
[EventBase, StateMap[EventBase]], Awaitable[Tuple[bool, Optional[dict]]]
]
CHECK_EVENT_ALLOWED_V2_CALLBACK = Callable[
[EventBase, StateMap[EventBase]],
Awaitable[Tuple[bool, Optional[dict], Optional[dict]]],
Contributor:

This is merely a suggestion, but I'd be tempted to replace this tuple with an attrs or dataclass class. That buys you:

  • named fields (helps disambiguate the two Optional[dict]s; also means we can document the individual pieces more easily IMO)
  • an easier time for extending the return type in the future (we can add new fields with default values)

At this stage it's not really critical, but if this tuple grew any larger I'd start to think it was getting a bit unwieldy.

Contributor Author:

I'm on the fence about this, here's my thinking: I agree that a class might be cleaner (although only slightly, my hope is that the tuple does not grow any larger!) but I wonder about the utility for module-writers. It seems easier to say "just give us a dict" rather than describing a class and expecting them to match their data to the class, but I may be wrong here. I'm open to being convinced otherwise.

Contributor:

under this hypothetical situation, we would provide the dataclass and they would just instantiate it, e.g. they'd return something like

return EventAllowedResult(
    allowed=True,
    replace={"x": "y"},
)
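
For readers weighing this suggestion, a result class along these lines might look like the sketch below. Only `EventAllowedResult`, `allowed` and `replace` come from the comment above; the `additional_event` field name and the `as_tuple` helper are assumptions added for illustration and are not part of the PR.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class EventAllowedResult:
    """Sketch of a named alternative to the three-element tuple."""

    allowed: bool
    replace: Optional[dict] = None
    # Hypothetical field name for the extra event to send into the room.
    additional_event: Optional[dict] = None

    def as_tuple(self) -> Tuple[bool, Optional[dict], Optional[dict]]:
        """Convert to the tuple shape used by the v2 callback in this PR."""
        return self.allowed, self.replace, self.additional_event
```

Because the optional fields have defaults, new fields could be added later without breaking modules that only set `allowed`.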

]
ON_CREATE_ROOM_CALLBACK = Callable[[Requester, dict, bool], Awaitable]
CHECK_THREEPID_CAN_BE_INVITED_CALLBACK = Callable[
[str, str, StateMap[EventBase]], Awaitable[bool]
@@ -155,6 +159,9 @@ def __init__(self, hs: "HomeServer"):
self._storage_controllers = hs.get_storage_controllers()

self._check_event_allowed_callbacks: List[CHECK_EVENT_ALLOWED_CALLBACK] = []
Contributor:

(just as a note if it interests you, particularly relevant when combined with 'why not both v1 and v2?'):
In the past, for callback upgrades like these, we've sometimes made an adapter for the old callbacks and registered the adapter as a v2 callback (and then removed the list of v1 callbacks, since they're all wrapped).
This has sometimes proved easier than trying to call each callback in the way it was originally defined.

To illustrate (but I'd put proper variable names and type annotations on, just a tad fiddly to do from within GitHub without having it all fresh in mind):

def check_event_allowed_v1_v2_adapter(v1: CHECK_EVENT_ALLOWED_CALLBACK) -> CHECK_EVENT_ALLOWED_V2_CALLBACK:
    async def adapter(x, y):
        a, b = await v1(x, y)
        return a, b, None
    return adapter
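
A version of the adapter idea above with descriptive names and type annotations could look roughly like this. The `V1Callback` and `V2Callback` aliases use `Any` in place of `EventBase` and `StateMap[EventBase]` so the sketch runs without Synapse, and `adapt_v1_to_v2` is a hypothetical name.

```python
import asyncio
from typing import Any, Awaitable, Callable, Optional, Tuple

# Any stands in for EventBase and StateMap[EventBase] in this sketch.
V1Callback = Callable[[Any, Any], Awaitable[Tuple[bool, Optional[dict]]]]
V2Callback = Callable[
    [Any, Any], Awaitable[Tuple[bool, Optional[dict], Optional[dict]]]
]


def adapt_v1_to_v2(v1_callback: V1Callback) -> V2Callback:
    """Wrap a v1 callback so it can be registered and called as a v2 callback."""

    async def adapter(
        event: Any, state_events: Any
    ) -> Tuple[bool, Optional[dict], Optional[dict]]:
        allowed, replacement = await v1_callback(event, state_events)
        # v1 callbacks cannot request an additional event, so the third slot is None.
        return allowed, replacement, None

    return adapter
```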

self._check_event_allowed_v2_callbacks: List[
CHECK_EVENT_ALLOWED_V2_CALLBACK
] = []
self._on_create_room_callbacks: List[ON_CREATE_ROOM_CALLBACK] = []
self._check_threepid_can_be_invited_callbacks: List[
CHECK_THREEPID_CAN_BE_INVITED_CALLBACK
@@ -251,15 +258,16 @@ async def check_event_allowed(
self,
event: EventBase,
context: UnpersistedEventContextBase,
) -> Tuple[bool, Optional[dict]]:
) -> Tuple[bool, Optional[dict], Optional[dict]]:
Contributor:

bikeshed risk: should we allow the creation of multiple events so we don't wind up having to introduce a v3 for that later? :-)

Contributor Author:

I can't tell if I am being a curmudgeon here but my instinct would be to say no, as honestly in a perfect world one would not be able to inject events here at all - but since it's already been done this is a way to gracefully work around it. I feel like adding multiple events creates needless complexity, but again, I might just be being a curmudgeon.

"""Check if a provided event should be allowed in the given context.

The module can return:
* True: the event is allowed.
* False: the event is not allowed, and should be rejected with M_FORBIDDEN.

If the event is allowed, the module can also return a dictionary to use as a
replacement for the event.
replacement for the event, and/or return a dictionary to use as the basis for
another event to be sent into the room.

Args:
event: The event to be checked.
@@ -269,8 +277,11 @@ async def check_event_allowed(
The result from the ThirdPartyRules module, as above.
"""
# Bail out early without hitting the store if we don't have any callbacks to run.
if len(self._check_event_allowed_callbacks) == 0:
return True, None
if (
len(self._check_event_allowed_callbacks) == 0
and len(self._check_event_allowed_v2_callbacks) == 0
):
return True, None, None

prev_state_ids = await context.get_prev_state_ids()

@@ -283,35 +294,63 @@ async def check_event_allowed(
# the hashes and signatures.
event.freeze()

for callback in self._check_event_allowed_callbacks:
try:
res, replacement_data = await delay_cancellation(
callback(event, state_events)
)
except CancelledError:
raise
except SynapseError as e:
# FIXME: Being able to throw SynapseErrors is relied upon by
# some modules. PR #10386 accidentally broke this ability.
# That said, we aren't keen on exposing this implementation detail
# to modules and we should one day have a proper way to do what
# is wanted.
# This module callback needs a rework so that hacks such as
# this one are not necessary.
raise e
except Exception:
raise ModuleFailedException(
"Failed to run `check_event_allowed` module API callback"
)
if len(self._check_event_allowed_callbacks) != 0:
for callback in self._check_event_allowed_callbacks:
try:
res, replacement_data = await delay_cancellation(
callback(event, state_events)
)
except CancelledError:
raise
except SynapseError as e:
# FIXME: Being able to throw SynapseErrors is relied upon by
# some modules. PR #10386 accidentally broke this ability.
# That said, we aren't keen on exposing this implementation detail
# to modules and we should one day have a proper way to do what
# is wanted.
# This module callback needs a rework so that hacks such as
# this one are not necessary.
raise e
except Exception:
raise ModuleFailedException(
"Failed to run `check_event_allowed` module API callback"
)

# Return if the event shouldn't be allowed or if the module came up with a
# replacement dict for the event.
if res is False:
return res, None
elif isinstance(replacement_data, dict):
return True, replacement_data
# Return if the event shouldn't be allowed or if the module came up with a
# replacement dict for the event.
if res is False:
return res, None, None
elif isinstance(replacement_data, dict):
return True, replacement_data, None
else:
Contributor:

I'm curious as to why we don't support a mix of v1 and v2 callbacks. Pedantically this could make it hard to upgrade as you'd need to upgrade all relevant modules at once, rather than doing it bit by bit. Realistically, this may not be an issue as I don't know if anyone runs with more than one such module anyway?

Contributor Author:

The main reason is that we'd like to phase out the v1 callbacks - they are not compatible with the batching mechanism, which is why the v2 callback which is compatible with the batching mechanism was proposed.

Contributor:

I thought it was just the sending of additional events that was not compatible when done in v1 callbacks and with batching?

I think it would be OK to support v1+v2 as a transitional period with the caveat that v1 can't send additional events.

I think this would be good practice but I don't know if it's truly warranted here given the relatively low use of modules, so I'm tempted to actually just accept use of either v1 or v2 if that prevents having to think harder about this.

for v2_callback in self._check_event_allowed_v2_callbacks:
try:
res, replacement_data, new_event = await delay_cancellation(
Contributor:

Instead of introducing a v2, I wonder if we could just widen the return type to accept either doubles or triples, with doubles being interpreted the same as before and triples being interpreted the same as what you're doing for v2.
This could be done with a somewhat nasty Union for the return type.

I'm not sure how that sounds to you?
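
To make the trade-off concrete, the widened return type could be handled by normalising every result to a triple before use. This is only a sketch of the alternative being discussed, not something the PR implements; `CallbackResult` and `normalise_result` are hypothetical names.

```python
from typing import Optional, Tuple, Union

CallbackResult = Union[
    Tuple[bool, Optional[dict]],
    Tuple[bool, Optional[dict], Optional[dict]],
]


def normalise_result(
    result: CallbackResult,
) -> Tuple[bool, Optional[dict], Optional[dict]]:
    """Interpret a double as before and pad it to a triple; pass triples through."""
    if len(result) == 2:
        allowed, replacement = result
        return allowed, replacement, None
    if len(result) == 3:
        allowed, replacement, additional_event = result
        return allowed, replacement, additional_event
    raise ValueError(f"Expected a tuple of length 2 or 3, got {len(result)}")
```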

Contributor Author:

I think since we don't want to mix v1 and v2 callbacks (ie we want to eventually deprecate the v1 callbacks) it makes sense to not do this. I also have a distaste for Union return types, it always seems unwieldy to deal with them if you can avoid it.

v2_callback(event, state_events)
)
except CancelledError:
raise
except SynapseError as e:
# FIXME: Being able to throw SynapseErrors is relied upon by
# some modules. PR #10386 accidentally broke this ability.
# That said, we aren't keen on exposing this implementation detail
# to modules and we should one day have a proper way to do what
# is wanted.
# This module callback needs a rework so that hacks such as
# this one are not necessary.
Comment on lines +343 to +349
Contributor:

bikeshed risk: is this a good time to address this? If it looks like more than a tiny amount of work I'm happy to leave it, but if the only reason we haven't done this was 'it's not pressing enough to introduce a breaking change' then now might be a good opportunity.

Contributor Author:

Fascinatingly you were the one to add this todo: https://github.com/matrix-org/synapse/pull/11042/files

Do you remember what the discussion was at the time, and why it wasn't fixed then? I don't see an issue for the regression, so I am a little unclear on what the original problem was.

Contributor:

the vague problem is that some modules would reject events by returning False, while others would raise SynapseError(403, "This event is not allowed because blahblahblah"); in effect, they could give it a custom error code and error message.

We don't like the latter — it's not explicitly supported but makes use of the fact that SynapseErrors bubble up automatically and get turned into Matrix/API-level errors (in the servlet? I think).
If we care about this mechanism then we should spec support for it without relying on exceptions, it feels like.

Contributor Author:

Hmm well it seems to me given the context that fixing this might be outside the purview of this particular PR?

Contributor:

well OK, it's just that this needs a new version of the callback so it would have been possibly better to do the changes together for the same version

raise e
except Exception:
raise ModuleFailedException(
"Failed to run `check_event_allowed_v2` module API callback"
)

return True, None
# Return if the event shouldn't be allowed, if the module came up with a
# replacement dict for the event, or if the module wants to send a new event
if res is False:
return res, None, None
else:
return True, replacement_data, new_event
return True, None, None

async def on_create_room(
self, requester: Requester, config: dict, is_requester_admin: bool
9 changes: 7 additions & 2 deletions synapse/handlers/federation.py
@@ -1006,6 +1006,7 @@ async def on_make_join_request(
(
event,
unpersisted_context,
_,
) = await self.event_creation_handler.create_new_client_event(
builder=builder,
prev_event_ids=prev_event_ids,
@@ -1197,7 +1198,7 @@ async def on_make_leave_request(
},
)

event, _ = await self.event_creation_handler.create_new_client_event(
event, _, _ = await self.event_creation_handler.create_new_client_event(
builder=builder
)

@@ -1250,9 +1251,10 @@ async def on_make_knock_request(
(
event,
unpersisted_context,
_,
) = await self.event_creation_handler.create_new_client_event(builder=builder)

event_allowed, _ = await self.third_party_event_rules.check_event_allowed(
event_allowed, _, _ = await self.third_party_event_rules.check_event_allowed(
event, unpersisted_context
)
if not event_allowed:
@@ -1427,6 +1429,7 @@ async def exchange_third_party_invite(
(
event,
unpersisted_context,
_,
) = await self.event_creation_handler.create_new_client_event(
builder=builder
)
@@ -1509,6 +1512,7 @@ async def on_exchange_third_party_invite_request(
(
event,
unpersisted_context,
_,
) = await self.event_creation_handler.create_new_client_event(
builder=builder
)
@@ -1591,6 +1595,7 @@ async def add_display_name_to_third_party_invite(
(
event,
unpersisted_context,
_,
) = await self.event_creation_handler.create_new_client_event(builder=builder)

EventValidator().validate_new(event, self.config)
8 changes: 5 additions & 3 deletions synapse/handlers/federation_event.py
@@ -404,9 +404,11 @@ async def on_send_membership_event(
# for knock events, we run the third-party event rules. It's not entirely clear
# why we don't do this for other sorts of membership events.
if event.membership == Membership.KNOCK:
event_allowed, _ = await self._third_party_event_rules.check_event_allowed(
event, context
)
(
event_allowed,
_,
_,
) = await self._third_party_event_rules.check_event_allowed(event, context)
if not event_allowed:
logger.info("Sending of knock %s forbidden by third-party rules", event)
raise SynapseError(