This repository has been archived by the owner on Apr 26, 2024. It is now read-only.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Allow room creator to send MSC2716 related events in existing room versions #10566
Allow room creator to send MSC2716 related events in existing room versions #10566
Changes from all commits
6481502
13d0929
259303a
16a41dd
fffce99
aafa069
fedd250
8a2db20
2b177b7
ee406df
9f8e22b
759e78c
71c20f7
9b828ab
3c9b5a6
9a600ff
79b4991
25db289
41e72c2
9a3c015
9a887a4
6f9cb41
ad29e96
462ab25
a3581d3
20196c8
725cf22
e047e42
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After merging
develop
(725cf22) (also fails ondevelop
directly), one of the Complement tests started failing. I think it's related to some change in one of these recent refactors:_resolve_state_at_missing_prevs
#10624 touches_resolve_state_at_missing_prevs
which is in the errorFederationHandler
in half #10692 is a big change and splitsfederation.py
andfederation_event.py
backfill
andget_missing_events
use the same codepath #10645 because of the missing events related stuffon_receive_pdu
in half #10640 touchesprocess_pulled_event
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like our fake
prev_event
is coming back to bite us 😣 and is causing the insertion event that references the fake event to be rejected. The fake event does appear to still play nicely with the state though (maybe so far AFAICT).$hbHXgulNKlmg9mkGQiBH5SSgq2NL4GO70sIIfHCa-YY
is the insertion event for the chunk$TiICyIaUJplRPSUJfVCHnjmtOmYtRFiwoWWnYvLIABg
is the fake eventWe originally discussed the fake event at matrix-org/matrix-spec-proposals#2716 (comment) (look for the
floating
discussion) to make the state events float off on their own pretending it was in the depths of history somewhere we haven't fetched yet.And the fake event is used as a
prev_event
for the insertion event because,synapse/synapse/rest/client/room_batch.py
Lines 298 to 301 in c586d68
and was changed in #10245 (specifically a8c5311) for federation. But previously they were always connected to the prev event specified in the query param. This change was made as discussed at:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like the culprit is in #10645 which changed how
backfill
gets state and auths events.If I restore
_get_state_for_room
and the old implementation inbackfill
, the Complement tests pass again.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmmm, maaayyybe we don't need to "[cause] the HS to ask for the state at the start of the chunk later." using
fake_prev_event_id
with the insertion events. The Complement tests still pass if I revert a8c5311.And to jog the memory, here is the DAG in either scenario (pulled from the description history in #10245):
prev_event
fake_prev_event_id
@erikjohnston I could use some insight here ^ in case you have opinions on keeping the fake event or where/what we need to add to accommodate it if we need to use it still.
Otherwise, I will be digging into the past and new
backfill
implementation finding the key difference.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we don't have the fake event then the state of the chunks will be inherited from
A
, which will likely be wrong, so I think we still need the prev event.It looks like that PR changed the way we handle fetching state for backfills, before we'd just get the state at the last event, whereas now we try and get the state for each prev event (including the fake event), which obviously fails. I'll talk to @richvdh to try and see what the correct thing to do here is, but my hunch is we probably do want to be able to handle the case where no one has the state for an event.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I chatted with @richvdh briefly, and the conclusion was that a) that code is protecting against an attack where a remote server could arbitrarily replace the servers view of the current state of the room, but b) the mitigation we use is a sledgehammer and we should replace it with something more intelligent that both mitigates the attack and gracefully handles missing events.
I'm not sure what that looks like yet, but will have a think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the checking this out! Since the Complement test also fails on
develop
, I think we're good to merge this and move forward.I'll create a new separate issue to track this new unobtanium fake event hammer we're going to forge 💪
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tracked in #10764
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@MadLittleMods sorry to raise this PR from the dead, but can you remember if there is a reason we use
room_creator = create_event.content.get(EventContentFields.ROOM_CREATOR)
here, but select thecreator
column fromrooms
elsewhere in the same PR?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably an oversight. I can't think of a reason why it would matter. We just want the room creator which doesn't change.
For the
insertion
andbatch
events, we switched from them.room.create
event lookup and moved to therooms
table lookup in aafa069@richvdh Are you touching this in a different refactor to update this?