MSC2314: A Method to Backfill Room State #2314

Closed
56 changes: 56 additions & 0 deletions proposals/2314-backfill-current-state.md
@@ -0,0 +1,56 @@
# MSC2314: Backfilling Current State

If a server experiences data loss, it is difficult for it to recover its membership of federated rooms without having a user be reinvited, because backfill alone cannot easily retrieve the data required to operate in a room (namely the current state and auth chain).

This MSC introduces S2S APIs to provide a given room's auth chain and current state events, provided the requesting server is in the room specified.

Member:

How do you see this API actually being used? The "current state" isn't really a concept that is used much in s2s, as if the server wants to start participating in the room it needs a set of most recent events and their state (from which the current state can be derived).

Author:

Yes, it's asking another server for its current state, in a situation where it knows it's joined to the room but is missing state (data loss).

Member:

But how does it actually use the current state? I don't think the current state is useful in rehydrating a room for example, as you need the extremities and the state at each of them.

Member:

hrm, it seems I missed this thread previously. @erikjohnston raises a good point - it is very hard to see how this API could be usefully used.

If we can't figure out a practical use for this change, we should instead reject it, and back out the changes in Synapse and Sytest.

@erikjohnston any further thoughts here?

Member:

It does feel like it should be useful, but I can't really see how. If this is mainly useful for rescuing a server that has suffered catastrophic data loss, then I think I'd like to see a step-by-step write-up of the process of recovering from that, including any new APIs that need to be added.

Having a one-stop "bootstrap this room for me" API might be useful, which returns the extremities, auth chain and state at each event (probably with some clever de-duplication there).

Member:

right then. In that case, I'm going to close this MSC, and back out the changes in sytest and synapse.


## Existing APIs

If one knows the room ID and an event ID, `/_matrix/federation/v1/state/{roomId}?event_id={eventId}` can be used to retrieve the auth chain and the room state at that event. However, this requires knowing an event ID (which cannot be assumed), as well as the version of the room (which can be assumed to be contained within the state delivered). If an event arrives for a room that a Matrix server does not know about, the event ID could be used to backfill the state, but this is impractical for rooms which may be low-traffic yet valuable to the end-user.

## Proposal

Add a new v2 state API that returns the server's present auth chain and state PDUs if an event ID is not provided, and that also specifies the room version to ease parsing of the returned events.

```
GET /_matrix/federation/v2/state/{roomId}

{
  "room_version": "3",
  "auth_chain": [
    {
      "type": "m.room.minimal_pdu",
      "room_id": "!somewhere:example.org",
      "content": {
        "see_room_version_spec": "The event format changes depending on the room version."
      }
    }
  ],
  "pdus": [
    {
      "type": "m.room.minimal_pdu",
      "room_id": "!somewhere:example.org",
      "content": {
        "see_room_version_spec": "The event format changes depending on the room version."
      }
    }
  ]
}
```

This requires the following changes to the existing API:

- The "event_id" query parameter's definition changes to "Optional. An event ID in the room to retrieve the state at. If this is not provided, the results are the receiving server's latest current state."
- "room_version" is added to the response, defined as "Type: string. Required. The version of the room that the state was queried for."
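
As an illustration of the two changes listed above, the following is a minimal client-side sketch, not part of the proposal itself. The helper names `build_state_path` and `parse_state_response` are hypothetical, and the sketch only checks the three top-level fields shown in the example response. It does not perform request signing, signature checks, or any room-version-specific event validation, all of which a real homeserver would need.

```python
from typing import Optional
from urllib.parse import quote, urlencode


def build_state_path(room_id: str, event_id: Optional[str] = None) -> str:
    """Build the request path for the proposed v2 state endpoint.

    When event_id is omitted, the proposal says the receiving server
    answers with its latest current state.
    """
    path = "/_matrix/federation/v2/state/" + quote(room_id, safe="")
    if event_id is not None:
        path += "?" + urlencode({"event_id": event_id})
    return path


def parse_state_response(body: dict) -> tuple:
    """Check the top-level fields named in this proposal and return them.

    Hypothetical helper: only the shape of the response is checked here.
    """
    room_version = body.get("room_version")
    if not isinstance(room_version, str):
        raise ValueError("room_version is required and must be a string")
    auth_chain = body.get("auth_chain")
    pdus = body.get("pdus")
    if not isinstance(auth_chain, list) or not isinstance(pdus, list):
        raise ValueError("auth_chain and pdus must be lists")
    return room_version, auth_chain, pdus
```

Returning `room_version` first reflects the stated motivation: the caller needs it before it can decide how to parse the events in `auth_chain` and `pdus`.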

## Potential issues

Although the creating server is part of the room ID, a server using this API as a client may find that the target server does not presently know about the room (for example, it has been shut down or deleted). Finding servers that will successfully return results from this API is outside of the scope of this MSC.

Users may not know the room ID for a given room, only a room alias. Translating this alias into a room ID is outside of the scope of this MSC. Users of this API may want to use it in conjunction with `/_matrix/federation/v1/query/directory` to resolve aliases to room IDs as part of an end-user focused API.
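
A minimal sketch of that alias lookup step, assuming the directory response carries `room_id` and `servers` fields as in the existing federation specification. The helper names are hypothetical, and making the HTTP request itself (with federation request signing) is left out.

```python
from urllib.parse import urlencode


def build_directory_query_path(room_alias: str) -> str:
    """Build the request path for resolving a room alias over federation."""
    query = urlencode({"room_alias": room_alias})
    return "/_matrix/federation/v1/query/directory?" + query


def room_from_directory_response(body: dict) -> tuple:
    """Extract the room ID and candidate servers from a directory response.

    Hypothetical helper: the returned servers are only candidates, and any
    of them may still not know about the room.
    """
    room_id = body.get("room_id")
    if not isinstance(room_id, str) or not room_id.startswith("!"):
        raise ValueError("directory response did not contain a room ID")
    servers = body.get("servers", [])
    if not isinstance(servers, list):
        raise ValueError("servers must be a list")
    return room_id, servers
```

The resolved room ID can then be used with the state endpoint, trying the returned servers in turn until one responds.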

Excessively large rooms may cause performance problems for servers implementing this API (including in its v1 incarnation). Discretionary rate limiting of this API may be required.

## Security considerations

Member:

Servers might also have to worry about malicious attempts to get the server to join rooms. A server that really wants you to join a room could send the server an event over /send, thereby confusing the server into thinking it might be in the room. The victim server would then use this new API, find out some information about the room (possibly a falsified join event) and push the room through as joined. In theory a signature check would fail at some point, but another workaround could be to ask a couple of servers in the room (if possible) to verify the state returned by a server.


In the case of a domain-name hijack, this may make recovering rooms that the domain name was in easier. However, since a domain-name hijack will lead to other servers potentially sending PDUs with the required event IDs to allow backfill and state querying, this does not constitute a meaningful increase in attack surface. Proposals such as MSC1228 are expected to mitigate this.