Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] MSC3414: Encrypted state events #3414

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from
Draft
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
257 changes: 257 additions & 0 deletions proposals/3414-encrypted-state.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,257 @@
# MSC3414: Encrypted state events

Currently in Matrix, all room state is unencrypted and accessible to everyone in the room, and
occasionally people outside the room (public room directory, invite state, peekable rooms, etc).
Most events in room state could be encrypted to preserve metadata, which is what this MSC aims
to achieve. Some parts cannot be encrypted though in order to maintain a working protocol.

## Proposal
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This MSC is fundamentally incompatible with Sliding Sync (MSC #3575), which relies on the server being able to calculate room names serverside in order to paginate them. Moreover, if it lands, it will fragment the network into SS-capable traffic versus non-SS-capable traffic.

The only mitigation I can see is if there's a homomorphic way we can map the plaintext room & room membership names to ordered items which can be used to order the rooms. (Or we could give up on encrypting m.room.name and profile data, but that /really/ starts to undermine the intentions of this MSC).

Copy link
Contributor

@poljar poljar Jan 10, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only mitigation I can see is if there's a homomorphic way we can map the plaintext room & room membership names to ordered items which can be used to order the rooms.

I'm not sure if homomorphic encryption would help here. To remind ourselves what homomorphic encryption is about:

Homomorphic encryption is a form of encryption that permits users to perform computations on its encrypted data without first decrypting it. These resulting computations are left in an encrypted form which, when decrypted, result in an identical output to that produced had the operations been performed on the unencrypted data.

So let's make a thought experiment out of this, suppose we have a homomorphic encryption scheme which allows us to order a variable sized array of strings in a alphabetical order. Let's call this function $S(x)$.

Let's say that the array of strings are the display names of all the rooms the user is in, $D$ from now on. This data lives on the server and clients initially don't know the size of the array nor the contents. Each element in $D$ is encrypted and the homeserver can sort $D$ via $S(x)$.

So the client tells the server I want you to sort $D$ and give me a sub-array of those. Skip the first $n$ and give me the remaining $m$ elements, a window into the whole array $D$ if you will.

The server now goes around and does it's thing and computes $R$ via:
$$R = S\left(D\right)$$

Now by definition, each element in $R$ is encrypted and the server can't correlate the elements in $R$ to the ones in $D$.

The client receives $R$ and can decrypt it. It now contains the sub-array of display names for the rooms user is currently interested in. We can suppose that we included the room IDs in $D$ as well for this to be useful.

Finally the client can tell the server, I want to start syncing the rooms with the given room ids. Let us ignore the fact that this second request will leak the alphabetical order of the rooms to the server, and that it might figure out the room names over time while the user is sliding the window while scrolling through the room list.

Let us focus on the fact that this still isn't really what sliding sync wants. Sliding sync wants to know the order of the rooms. It wants to partially decrypt $R$, in other words we need to leak the order to the server. If we don't leak the order to the server, we need to make a second request like it was shown above.

Now if the above is not what you had in mind, and instead you always intended to leak the order to the server, then we arrive at order-preserving encryption[1] (OPE). But as already hinted above, leaking the order leaks more than just the order[2][3]. How much additional data we would be leaking is not clear. Room display names are a long way off from being a perfectly uniform plaintext. This makes OPE schemes a questionable choice.

I would like to be wrong and someone to come along and tell me that what I wrote up above is not how things should work, instead they would work in such and such a way where everything fits, and everything is nicely encrypted. As it stands I don't see how we would make the ordering of rooms by name, that sliding sync wants, and encrypted display names compatible.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, i was being lazy in referencing homomorphic encryption - in practice OPE is what'd be needed. Agreed that this would be very leaky though, to the extent that it might not be worthwhile at all.

Another approach we could take to protect state event metadata would be to go full P2P, and have the servers just be store-and-forward nodes (of encrypted blobs). Then the servers running in the clients can happily maintain the metadata in plaintext, and index it and speak SS to the client to rapidly paginate through it. (Although in that model, why would we even need E2EE...?)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and have the servers just be store-and-forward nodes (of encrypted blobs).

(Although in that model, why would we even need E2EE...?)

Perhaps I'm misreading the architecture you're proposing here, but if there's no E2EE, what are the encrypted blobs encrypted with?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that the ordering of sliding sync is already broken by the current encryption for rooms. The notifications counts for encrypted rooms are usually incorrect and get recalculated by clients. So there is precedent for having clients apply fixups to server side data. Now of course if you want alphabetical sorting, that will be a bit more annoying than just notification counts, since that will affect more rooms. However the roomname is (next to the topic) usually the event with the most sensitive information.

If this MSC is changed to be able to detect room name events without being able to decrypt those events, those could just get sent along the sync. This would not increase the database load, since the server already needs to load those events usually to sort the rooms (or should at least not make as much of a difference). It would increase bandwidth usage for clients that sort by displayname and make the sync a bit slower. However that might be a trade-off, that is worth the added privacy and security.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sliding sync aside, there's several reasons why the server would want to know the room name or at least where the room sits in an ordered set. Examples include email notifications (would need to be unencrypted or completely made of magic), abuse mitigations (can be encrypted), and directories (spaces, non-public rooms, etc - can be encrypted).

Preferably, we keep sliding sync design decisions on the sliding sync MSCs, as this conversation is now heading into that territory.

Copy link
Contributor

@poljar poljar Jan 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sliding sync aside, there's several reasons why the server would want to know the room name or at least where the room sits in an ordered set. Examples include email notifications (would need to be unencrypted or completely made of magic), abuse mitigations (can be encrypted), and directories (spaces, non-public rooms, etc - can be encrypted).

I'm not sure what you're trying to say here, does this mean that we don't want to encrypt m.room.name, if so that should be listed in the MSC besides the other events that will stay unencrypted.

On the other hand if you mean that it's OK to encrypt m.room.name then I think it's OK to discuss how this will impact sliding sync and other server functionality. We should perhaps expand the small section of broken functionality to get a comprehensive list. As of now we only have this single comment:

This effectively breaks the room directory and #2946 as they use named fields for details such as the room name.

Furthermore, I don't think it was obvious to people, it certainly wasn't obvious to me, that leaking the alphabetical order of the m.room.name events will likely leak, albeit only partially, the contents as well.

Preferably, we keep sliding sync design decisions on the sliding sync MSCs, as this conversation is now heading into that territory.

Well sure, to some extent. If I go to the sliding sync MSC, then people will tell me that the sliding sync MSC isn't concerned with encrypted state, we'll figure that out later in the encrypted state MSC.

The problem is that sliding sync and encrypted state have a deeply intertwined relationship. Fetching events via the /sync endpoint lies at the heart of the protocol, if you're going to potentially break it by encrypting some event then this should be discussed and understood.

So things that I would like to know:

  1. Do we want to encrypt m.room.name
  2. Do we want to leak the order of the m.room.name events to the server
  3. If we want to encrypt m.room.name, which services will be impacted by this
  4. If we don't want to leak the order, how does this affect sliding sync. Is the alphabetical order even crucial to the performance of sliding sync.

Out of those 4 questions, the only one that would be sensible to shove into the sliding sync MSC would be the last one.


Under this proposal, all room state events can be encrypted with the exception of events critical to
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Practically speaking, m.room.member is one of the most important things to encrypt - I wonder if this MSC should propose a way to encrypt the redactable bits of the payloads via megolm too. (It could be as easy as keeping the event type the same, but having an m.encrypted mix-in for the encrypted fields?)

(Also, worth noting that Synapse currently depends on m.room.name and m.room.topic for searching the roomdir)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(actually, we could split this off into its own MSC, i guess, but having a generic mechanism of encrypting the redactable fields of a state event without breaking its event type feel potentially useful)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It'd probably be best to split this off to its own MSC, but it's also somewhat worth considering whether we'll have profiles-as-rooms (and how that works for per-room profiles) in a reasonable timeframe. If yes, then we might not need to encrypt half of an event, but if no then it's definitely worth pursuing.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think only for m.room.member it makes sense to encrypt the redactable bits, but not the whole event (since the server needs to know membership, statekey and event type). For the other state events we can actually do all or nothing. So we could also craft an alternative just for member events, where we add an m.room.member_profile and put all the redactable bits there. Then the m.room.member would just have the membership, state_key and type fields and we would potentially also ease the load on state res, because a name change wouldn't potentially affect what events a user can send. profiles-as-rooms could solve that too, but are a much bigger task with its own downsides, if you want to encrypt as much metadata as possible.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Splitting profiles away from membership events is a bit non-trivial for this MSC, but possibly a future one.

maintain the protocol. Those critical events are:

* `m.room.create`
* `m.room.member`
* `m.room.join_rules`
* `m.room.power_levels`
* `m.room.third_party_invite`
* `m.room.history_visibility`
* `m.room.guest_access`
* `m.room.encryption` - Though not critical to the server-side protocol, it is critical to clients.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this list seems to be missing the acl event

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And tombstone events.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tombstones might be more questionable tbh. They're not exactly "critical" to the client any more than the name is, and the server just doesn't care.


Clients should ignore encrypted events of the above types.

Events which could be encrypted but aren't recommended to be encrypted under this proposal are:

* `m.space.child` and `m.space.parent` - These are used by the server in some endpoints, though could
be interpretted by clients if absolutely needed.

All other state event types are able to be encrypted, if the room itself is encrypted.

An encrypted state event looks very similar to a regular encrypted room message: the `type` becomes
`m.room.encrypted` and the `content` is the same shape as a regular `m.room.encrypted` event. The
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little scared about clobbering the state event's event type. For one thing, it means that PLs can't be enforced for that event type.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

they can't be enforced, but the (undocumented) thought on this is that we can have clients enforce the restrictions locally, as they already kinda need to for every other event type.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will be very challenging to implement, since clients usually don't have the full history cached and as such might not know, when a power level changed. This might lead to clients discarding old state events as invalid, because they think the sender was not allowed to send it. In the end I do appreciate though, that the state key and event type are not visible in plain text anymore. While that would make the implementation much easier, in a lot of cases that is the interesting data of a state event.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clients have access to /context though?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it is just a bit complicated and there will be bugs, imo. I still prefer this approach in general though. (Also I didn't know that /context returned state events)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm honestly going to need a compelling argument to not hide the type/state key, given we do the same approach for timelines.

`state_key` for encrypted state events is an arbitrary string used to differentiate different kinds
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this means that we lose the ability for the server to mandate that a state_key containing a mxid can only be written by that mxid. We'd need a replacement (e.g. a flag on the unencrypted event payload to say "only let the author edit this event")

of state while protecting the underlying event type.

**Note**: It is deliberate that state keys can ultimately cause conflicts when the room state as a
whole is decrypted. This is because a consistent hash of, say, `room_id + event_type` would mean that
an attacker can simply run the hash algorithm themselves to determine which state event to target
if they wanted to try "breaking" the encryption. Without consistent hashing, the attacker is forced
to break all/most of the room state or find another vector of attack for the information they're
after. Lack of consistent hashing does cause some problems though.

Clients are expected to decrypt all room state so they can reuse the appropriate `state_key` when
updating state events. However, conflicts are still possible within the decrypted set of state (such
as two events claiming to be `m.room.name` with empty state key): these are resolved by using the
most recent event by `origin_server_ts` then by `event_id` in lexicographic order.

An obvious challenge to all of this is that clients won't have decryption keys for the events if
they joined after the events were sent. To combat this, when a client joins a room it should send
keyshare requests to anyone with larger-than-default power level in the room, where those empowered
clients should then respond with the appropriate decryption keys regardless of device/user trust. Note
Comment on lines +50 to +51
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure that we should spec out this behaviour. "Larger-than-default" could potentially be a lot of people and it would be best to even scale up those. Maybe send a request to 2, then if they don't respond quickly start ratcheting it up. But either way the exact algorithm is not important, so I think it is best to leave it unspecified so that clients can experiment and find out what works best. For example maybe it is better to contact users who recently sent a message, or it is a good idea to add some randomness to the decision so that not all clients are asking the same admin.

I would replace this with something like "When a client joins a room it should send keyshare requests to existing room members. To avoid unnecessary work it should limit the rate at which it sends requests. It is recommended that clients send request to users of higher power level first."

And maybe you can provide "larger-than-default" as a non-normative example.

that this effectively means a client should use a dedicated megolm session when working with room state.
Comment on lines +51 to +52
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This really is the most important part imo in this MSC and needs to be more explicit and a bit more nuanced. You will probably need a different megolm session for each event_type+state_key tuple. Otherwise you risk leaking old state to new members.

For example if we use the same session for the room name and the topic and we set it like this:

event_type megolm_index
m.room.name 0
m.room.topic 1
m.room.topic 2

Then we leak the old topic to a new joiner, if we share the room name with them. That also means we will have quite a few megolm sessions per room, if we try to prevent that by having one session per newest state event. On the other hand the impact would be limited by the rotation of megolm keys after a week, if that still applies to state events.

To me this is the critical part of this MSC and deserves additional attention, so that clients can implement it correctly and securely.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What sort of leak are you thinking of? This would be no different than state events today when someone joins a room, and history visibility would more often be used to protect old state from prying eyes.

Copy link
Contributor

@deepbluev7 deepbluev7 Sep 28, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since you don't want to leak old messages, you protect that by only sharing indices from where the user was invited/joined, I'd say the same should apply to state events. A user should only be able to decrypt the state from when they joined/got invited. You don't rely on history visibility in encrypted rooms, because that would mean trusting the server. But currently it is impossible to share the decryption key for index 0 and 2 without also sharing 1.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, that nuance is supposed to be implied as part of "use matrix crypto" - I'm very keen to not duplicate the entire crypto spec into this MSC. This MSC only goes as far as describing how the existing systems don't work for this setup, but intentionally does not say how the existing rules already work. This applies to all sorts of areas like sharing at particular indices and how keys are sent to people.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not saying to duplicate the existing spec, but mentioning the danger of using the same session for all different state event types would be appreciated, because it might not be obvious. Especially since not calling out such issues explicitly was one of the reasons for the recent key sharing security issue.

This also implies that all room state is still available to all members of the room - per-event ACLs
are deliberately not covered by this MSC, though could be trivially implemented on top of a system
like this.

Keyshare requests are sent to all larger-than-default members of the room to ensure that the event sender
isn't required to stay online 24/7 for new members to decrypt the events. The theory here is that at least
one of the room moderators/admins will be online when the user joins (considering most joins to encrypted
rooms will follow an invite), so getting the decryption keys should be easier. Clients might wish to
send out a wider keyshare request if the larger-than-default members don't respond in X seconds.

For invites, servers are expected to include all `m.room.encrypted` state events in the invite/knock
state. When sending invites, clients *should* send decryption keys for the aesthetic events (room name,
topic, etc) so the invite can be contextual, though this is not required. This same approach is not
possible on 3PID invites, however that is considered an acceptable loss under this MSC.

### Worked example

To encrypt an `m.room.name` state event, the client would first determine if there's already an encrypted
room name event. If so, it would use the same `state_key`, but otherwise generate a string to become
the `state_key`. Then, the event is encrypted as normal with the following payload:
Comment on lines +70 to +72
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
To encrypt an `m.room.name` state event, the client would first determine if there's already an encrypted
room name event. If so, it would use the same `state_key`, but otherwise generate a string to become
the `state_key`. Then, the event is encrypted as normal with the following payload:
To encrypt an `m.room.name` state event, the client would first determine if there's already an encrypted
room name event. If so, it would reuse the outer `state_key` as the outer state key of the new event, otherwise the client would generate a random string to become
the outer `state_key`. Then, the event is encrypted as normal with the following payload:

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found this section very confusing as the inner and outer state keys are not given different names to help differentiate them.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"inner" and "outer" are honestly more confusing imo. They imply a parent/child relationship when there isn't one.

Will have a think about how to improve it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe "server-visible" and "encrypted"? I agree that inner and outer aren't the best. I don't think they imply parent-child but aim to say that one state-key is inside the encrypted payload and one is outside.


```json5
{
"type": "m.room.name",
"state_key": "", // actual state key for the event, not the generated one!
turt2live marked this conversation as resolved.
Show resolved Hide resolved
"content": {
"name": "Secret Chat"
},
"room_id": "!room:example.org"
}
```

Finally, the state event is sent to the room to end up as:

```json5
{
"type": "m.room.encrypted",
"state_key": "<generated_or_reused_string>",
turt2live marked this conversation as resolved.
Show resolved Hide resolved
"content": {
"algorithm": "m.megolm.v1.aes-sha2",
"sender_key": "<sender_curve25519_key>",
"device_id": "<sender_device_id>",
"session_id": "<outbound_group_session_id>",
"ciphertext": "<encrypted_payload_base_64>"
}
}
```

### Restricted rooms

Restricted rooms ([MSC3083](https://github.com/matrix-org/matrix-doc/pull/3083)) implement a join rule where
people can join a room if they are members of one or more other given rooms. While
[MSC3061](https://github.com/matrix-org/matrix-doc/pull/3061) can be leveraged for explicit invites to new
members of the room, the MSC also does not solve the problem that restricted rooms impose.

To fix the reliance on key sharing, it is proposed that keys be automatically sent whenever restricted-ness
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An interesting idea that someone (i forget who, sorry) proposed is that one could maintain serverside key backups per space (or room). So when you invite a new person into a big space tree of encrypted rooms, you literally give them the key to access the serverside backup for as long as they are a member of the space. Obviously this breaks pfs, but pfs is already gone the second you are sharing keys on invite anyway.

changes and when membership changes in a room used by a restricted room.

As a worked example, Room A and Room B exist, where Room B is restricted to members of Room A. Alice is admin
of both rooms and Bob got invited to Room A by Alice. Alice shares keys for all relevant parts of Room A (the
state and timeline under MSC3061) and Room B's state with Bob. If Alice were to add Room C as a room for members
to join Room B with, then Alice would send decryption keys for all of Room B's state with all members of Room
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is unclear why Alice is doing all of the sharing. Is it because she is an admin of room A or because she is an admin of room B? Or are we assuming that she is making these state event changes. I think this needs to be explicit.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps clarifying what privileges's Bob assumed to have (regular user, so non-admin rights, that's read/write to public rooms, but no invites..., for example?) in this scenario'd also be helpful.

C too.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like it would be very noisy. If you imagine a space for a company or similar every topic change of a space-joinable room is going to be sent to all employees. This could be a lot of traffic if you have a couple of rooms have frequently changing topics such as critical issues, current oncall or anything else.


In order to limit the exposure of state to unjoined parties, it is acceptable for a client to only share keys
for aesthetic events (name, avatar, topic, etc) until the user joins the room where they then receive the full
state through key sharing or other mechanisms. This is primarily intended to reduce exposure of a room's
metadata until the user joins an appropriate room. As an alternative, clients could share just the aesthetic
events on invite and the remainder when the user accepts the invite, pre-empting their join attempt to the
restricted room.

### Other details

Clients should ensure that keyshare requests are from current members of the room, from their view. Clients
turt2live marked this conversation as resolved.
Show resolved Hide resolved
should also attempt to request keys from their own devices in addition to empowered users in the event that
a different device accepted the invite or the keys were sent to the "wrong" device. Key backup might also be
useful here.

Keyshare requests from blocked devices/users are still expected to be ignored or declined.

Note that clients will have to track where a session was used to ensure the sessions are appropriately sent
to users. For some clients this might be a simple lookup to see if the session ID is used in a state event,
though for others it might make more sense to explicitly track it.

For media events like room avatars, encryption is the same as regular room messages. The media is uploaded
encrypted to the homeserver then detailed in a payload that is also encrypted to the final room state event.

## Potential issues

Decryption failures of any variety are always an ongoing risk with encryption, however it is believed that
the vast majority of decryption failures are related to implementation or out-of-scope bugs. This MSC has
taken steps to ensure that each user will receive decryption keys eventually, though the speed of receipt
for these keys might be delayed.

Keysharing is not great as a method of transport for these keys, however this is believed to be better over
members of the room sending keys when they see the join. Instead of spamming decryption keys at the joined
user, or worse: missing the join and not sending anything, the joiner is responsible for trying to locate
the keys themselves.

Conflicts are extremely easy to manufacture, however the feature making it easy to form conflicts is needed
in order to protect the event's metadata. The conflict resolution algorithm can additionally be easily
abused (ie: sending a timestamp for the year 4025 to force the room name to never change), however at that
point the room is making it harder on itself to change over time.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This allows all users who can send state events to bypass power level restrictions, which isn't just "making it harder on itself". The server cannot enforce PL rules effectively with this change.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not a unique problem for this MSC, and state events are typically only sent by those who have higher power in the room: they would be spamming themselves.


Protocol-critical events are still unencrypted, however this is required given it is impractical to give
servers specialized devices of their own at this time. A peer-to-peer world might solve this problem in
a different way, considering the servers are devices which can receive decryption keys.

This MSC prioritizes protection of the average room's metadata, which unfortuantely lends protection to
abusive or illegal rooms as well. Though this MSC does make it harder for trust & safety teams to identify
encrypted abusive rooms, there are other methods and approaches which teams can use to remove harmful
content from their servers. Those methods are not detailed here to keep the MSC on topic.

Clients won't be able to use helper APIs for complete protection of metadata, such as the `name` key on
[`/createRoom`](https://matrix.org/docs/spec/client_server/r0.6.1#post-matrix-client-r0-createroom). This
might lead to additional server load when the client hits
[`PUT /state`](https://matrix.org/docs/spec/client_server/r0.6.1#put-matrix-client-r0-rooms-roomid-state-eventtype-statekey)
though more intelligent processing of rate limits should make it easier for clients and servers to honour
the requests. For example, increased burst limits for the few minutes after a room is created.

This effectively breaks the room directory and [space summary API](https://github.com/matrix-org/matrix-doc/pull/2946)
as they use named fields for details such as the room name. Considering both APIs use the same shape,
as defined by [`/publicRooms`](https://matrix.org/docs/spec/client_server/r0.6.0#post-matrix-client-r0-publicrooms),
the proposal here is to include a new `encrypted_state` key which is an array of Stripped State events
for all `m.room.encrypted` state events. For example:

```json5
{
"chunk": [
{
"aliases": ["#home:example.org"],
"canonical_alias": "#home:example.org",
"guest_can_join": true,
"num_joined_members": 42,
"room_id": "!room:example.org",
"world_readable": true,
"encrypted_state": [
{"type": "m.room.encrypted", "state_key": "<string>", "content": {/*encrypted*/}},
{"type": "m.room.encrypted", "state_key": "<string>", "content": {/*encrypted*/}},
{"type": "m.room.encrypted", "state_key": "<string>", "content": {/*encrypted*/}}
]
}
]
}
```

Other approaches such as deprecating `name`, `topic`, etc in favour of a `state` array are best considered
for other MSCs.

## Alternatives
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not discussed here is the option of moving the entire room model client-side, where servers just move JSON blobs around and clients do event auth, state res, etc locally in an encrypted setting. This is obviously more involved/complicated to pull off, though.


We could send a regular room message with the state event content, then in the room state simply use the
regular event type and state key with a content of `{"encrypted": "$event_id"}`, however this leaks
state metadata still and requires the client to have visibility on a random message event. Further, it
is not easily enforced that the `$event_id` even exist. Variations on this idea such as using the event
ID in the state key and using `m.room.encrypted` for the `type` are subject to similar issues.

## Security considerations

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does a user know, if it should share keys with a member? A server could always insert a new member into a room (or even show the new member only to a specific room member). That way it could effectively attack the room to get access to the room state. Messages are protected against that, by tracking who was in the room, when an encrypted message is sent. The user then has the chance to not send the message to the members in the room and keys are only reshared to devices that should have received that message.

This does not apply to room state. You need to share the keys for room state to new members in the room. You can somewhat limit the impact by only sharing the keys with trusted members, but most clients don't do that by default, because it leads to a lot of undecryptable messages.

So the question is:

  • Should the user be prompted before sharing the room state? (This might make joining a room annoying=
  • Should it automatically treat members you invited yourself or you already verified as trusted and automatically share room state? (This still has a risk of a trusted user joining a room, that it shouldn't and getting access to state they shouldn't have.)

is that something clients have to figure out for themselves how secure they want to be or should we have some guidelines, so that we can rely on any client having some standard of security?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The MSC outlines that all members who join get the whole room state, regardless of user/device trust.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My comment was about the server having easy access to the encrypted state by just temporarily adding a device or user into the room and then removing it. This doesn't even have to be in the room state, but could be just sent to a local user via /sync. From what I can tell, this effectively makes the encryption useless, because it can't protect you from the server, just from other users. I don't see where the MSC addresses that concern, but maybe I'm just blind and you could link me to the relevant lines? Because I don't see, how this protects you from the server just requesting the keys for the room state.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, I see now. That is a bit of a problem.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will be solved by #3917, which IMO this MSC should wait on.

Decryption keys would be sent to invited parties under this proposal. This is a risk if there are badly
behaving clients which populated room state (such as re-using a megolm session across regular messages
and room state), and specifies that someone not in a room can decrypt events without being a member of
the room. The primary expected risk here is that a malicious user might be using a well-behaved client
to leak aesthetic details of the room to other parties through invites: room operators should be careful
who they allow to send invites. This is a similar risk to [MSC3061](https://github.com/matrix-org/matrix-doc/pull/3061).

Users could be excluded from seeing what the room is about because no one will provide keys to decrypt
the events. This is considered an unlikely social risk.

Sole admins could lose the decryption keys for their own state events, prompting keyshare requests to
other non-empowered users which effectively gives them authority over the room state (they could decline
to provide keys). Room admins will still be able to send further state events which leverage the conflict
resolution algorithm outlined by this MSC, though the original state events are subject to the remainder
of the room agreeing to send keys appropriately.

For DMs and similiarly sized rooms, it's somewhat likely that all parties lose decryption keys, thus
making the room state effectively lost to time. This is a pre-existing issue in Matrix and best solved
by a user remembering the event contents to re-send them.

As mentioned in the early parts of this proposal, clients could encrypt and send events which are listed
as "critical" to the protocol. Clients must take care to *not* honour encrypted versions of these events
as they will likely be mismatched from how the server is treating the room.

Users leaving the room, either voluntarily or by force, can still decrypt the room state. This is unavoidable
given keys were given to them in the first place, and is the same risk as the room events themselves.
This MSC makes no specific recommendation for when to re-encrypt/rotate sessions outside of the guidelines
already present in the specification. For example, do not use a session after a user has left the room as
that user could theoretically acquire and decrypt a state event they weren't supposed to. Clients are
welcome to re-encrypt the entire room state with a new session after a member leaves the room, however
rate limits and practicality will come into question.

While encrypted state events are, well, encrypted, it is not proposed that implementations use this to
store credentials or secrets in room state: sensitive information can still be revealed with keyshare
requests, manual key exports/imports, etc. This MSC also has a number of places where keys are shared
with invited members rather than joined members, leading to a potential of exposure.

## Unstable prefix

While this MSC is not considered stable, clients should send both encrypted and unencrypted copies of
state events to the room. The conflict resolution algorithm applies to the decrypted set of events, which
ultimately means clients will pick up the most recent encrypted or unencrypted state event for that event
type.

No requirement to use namespaced prefixes is present due to `m.room.encrypted` already being a valid and
supported event type.