-
Notifications
You must be signed in to change notification settings - Fork 382
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] MSC3414: Encrypted state events #3414
base: main
Are you sure you want to change the base?
Conversation
|
||
## Proposal | ||
|
||
Under this proposal, all room state events can be encrypted with the exception of events critical to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Practically speaking, m.room.member
is one of the most important things to encrypt - I wonder if this MSC should propose a way to encrypt the redactable bits of the payloads via megolm too. (It could be as easy as keeping the event type the same, but having an m.encrypted mix-in for the encrypted fields?)
(Also, worth noting that Synapse currently depends on m.room.name
and m.room.topic
for searching the roomdir)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(actually, we could split this off into its own MSC, i guess, but having a generic mechanism of encrypting the redactable fields of a state event without breaking its event type feel potentially useful)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It'd probably be best to split this off to its own MSC, but it's also somewhat worth considering whether we'll have profiles-as-rooms (and how that works for per-room profiles) in a reasonable timeframe. If yes, then we might not need to encrypt half of an event, but if no then it's definitely worth pursuing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think only for m.room.member
it makes sense to encrypt the redactable bits, but not the whole event (since the server needs to know membership, statekey and event type). For the other state events we can actually do all or nothing. So we could also craft an alternative just for member events, where we add an m.room.member_profile and put all the redactable bits there. Then the m.room.member would just have the membership, state_key and type fields and we would potentially also ease the load on state res, because a name change wouldn't potentially affect what events a user can send. profiles-as-rooms could solve that too, but are a much bigger task with its own downsides, if you want to encrypt as much metadata as possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Splitting profiles away from membership events is a bit non-trivial for this MSC, but possibly a future one.
All other state event types are able to be encrypted, if the room itself is encrypted. | ||
|
||
An encrypted state event looks very similar to a regular encrypted room message: the `type` becomes | ||
`m.room.encrypted` and the `content` is the same shape as a regular `m.room.encrypted` event. The |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a little scared about clobbering the state event's event type. For one thing, it means that PLs can't be enforced for that event type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
they can't be enforced, but the (undocumented) thought on this is that we can have clients enforce the restrictions locally, as they already kinda need to for every other event type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will be very challenging to implement, since clients usually don't have the full history cached and as such might not know, when a power level changed. This might lead to clients discarding old state events as invalid, because they think the sender was not allowed to send it. In the end I do appreciate though, that the state key and event type are not visible in plain text anymore. While that would make the implementation much easier, in a lot of cases that is the interesting data of a state event.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Clients have access to /context
though?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it is just a bit complicated and there will be bugs, imo. I still prefer this approach in general though. (Also I didn't know that /context returned state events)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm honestly going to need a compelling argument to not hide the type/state key, given we do the same approach for timelines.
clients should then respond with the appropriate decryption keys regardless of device/user trust. Note | ||
that this effectively means a client should use a dedicated megolm session when working with room state. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This really is the most important part imo in this MSC and needs to be more explicit and a bit more nuanced. You will probably need a different megolm session for each event_type+state_key tuple. Otherwise you risk leaking old state to new members.
For example if we use the same session for the room name and the topic and we set it like this:
event_type | megolm_index |
---|---|
m.room.name |
0 |
m.room.topic |
1 |
m.room.topic |
2 |
Then we leak the old topic to a new joiner, if we share the room name with them. That also means we will have quite a few megolm sessions per room, if we try to prevent that by having one session per newest state event. On the other hand the impact would be limited by the rotation of megolm keys after a week, if that still applies to state events.
To me this is the critical part of this MSC and deserves additional attention, so that clients can implement it correctly and securely.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What sort of leak are you thinking of? This would be no different than state events today when someone joins a room, and history visibility would more often be used to protect old state from prying eyes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since you don't want to leak old messages, you protect that by only sharing indices from where the user was invited/joined, I'd say the same should apply to state events. A user should only be able to decrypt the state from when they joined/got invited. You don't rely on history visibility in encrypted rooms, because that would mean trusting the server. But currently it is impossible to share the decryption key for index 0 and 2 without also sharing 1.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, that nuance is supposed to be implied as part of "use matrix crypto" - I'm very keen to not duplicate the entire crypto spec into this MSC. This MSC only goes as far as describing how the existing systems don't work for this setup, but intentionally does not say how the existing rules already work. This applies to all sorts of areas like sharing at particular indices and how keys are sent to people.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not saying to duplicate the existing spec, but mentioning the danger of using the same session for all different state event types would be appreciated, because it might not be obvious. Especially since not calling out such issues explicitly was one of the reasons for the recent key sharing security issue.
ID in the state key and using `m.room.encrypted` for the `type` are subject to similar issues. | ||
|
||
## Security considerations | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How does a user know, if it should share keys with a member? A server could always insert a new member into a room (or even show the new member only to a specific room member). That way it could effectively attack the room to get access to the room state. Messages are protected against that, by tracking who was in the room, when an encrypted message is sent. The user then has the chance to not send the message to the members in the room and keys are only reshared to devices that should have received that message.
This does not apply to room state. You need to share the keys for room state to new members in the room. You can somewhat limit the impact by only sharing the keys with trusted members, but most clients don't do that by default, because it leads to a lot of undecryptable messages.
So the question is:
- Should the user be prompted before sharing the room state? (This might make joining a room annoying=
- Should it automatically treat members you invited yourself or you already verified as trusted and automatically share room state? (This still has a risk of a trusted user joining a room, that it shouldn't and getting access to state they shouldn't have.)
is that something clients have to figure out for themselves how secure they want to be or should we have some guidelines, so that we can rely on any client having some standard of security?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The MSC outlines that all members who join get the whole room state, regardless of user/device trust.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My comment was about the server having easy access to the encrypted state by just temporarily adding a device or user into the room and then removing it. This doesn't even have to be in the room state, but could be just sent to a local user via /sync. From what I can tell, this effectively makes the encryption useless, because it can't protect you from the server, just from other users. I don't see where the MSC addresses that concern, but maybe I'm just blind and you could link me to the relevant lines? Because I don't see, how this protects you from the server just requesting the keys for the room state.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah, I see now. That is a bit of a problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will be solved by #3917, which IMO this MSC should wait on.
* `m.room.third_party_invite` | ||
* `m.room.history_visibility` | ||
* `m.room.guest_access` | ||
* `m.room.encryption` - Though not critical to the server-side protocol, it is critical to clients. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this list seems to be missing the acl event
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And tombstone events.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tombstones might be more questionable tbh. They're not exactly "critical" to the client any more than the name is, and the server just doesn't care.
keyshare requests to anyone with larger-than-default power level in the room, where those empowered | ||
clients should then respond with the appropriate decryption keys regardless of device/user trust. Note |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure that we should spec out this behaviour. "Larger-than-default" could potentially be a lot of people and it would be best to even scale up those. Maybe send a request to 2, then if they don't respond quickly start ratcheting it up. But either way the exact algorithm is not important, so I think it is best to leave it unspecified so that clients can experiment and find out what works best. For example maybe it is better to contact users who recently sent a message, or it is a good idea to add some randomness to the decision so that not all clients are asking the same admin.
I would replace this with something like "When a client joins a room it should send keyshare requests to existing room members. To avoid unnecessary work it should limit the rate at which it sends requests. It is recommended that clients send request to users of higher power level first."
And maybe you can provide "larger-than-default" as a non-normative example.
To encrypt an `m.room.name` state event, the client would first determine if there's already an encrypted | ||
room name event. If so, it would use the same `state_key`, but otherwise generate a string to become | ||
the `state_key`. Then, the event is encrypted as normal with the following payload: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To encrypt an `m.room.name` state event, the client would first determine if there's already an encrypted | |
room name event. If so, it would use the same `state_key`, but otherwise generate a string to become | |
the `state_key`. Then, the event is encrypted as normal with the following payload: | |
To encrypt an `m.room.name` state event, the client would first determine if there's already an encrypted | |
room name event. If so, it would reuse the outer `state_key` as the outer state key of the new event, otherwise the client would generate a random string to become | |
the outer `state_key`. Then, the event is encrypted as normal with the following payload: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found this section very confusing as the inner and outer state keys are not given different names to help differentiate them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"inner" and "outer" are honestly more confusing imo. They imply a parent/child relationship when there isn't one.
Will have a think about how to improve it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe "server-visible" and "encrypted"? I agree that inner and outer aren't the best. I don't think they imply parent-child but aim to say that one state-key is inside the encrypted payload and one is outside.
of both rooms and Bob got invited to Room A by Alice. Alice shares keys for all relevant parts of Room A (the | ||
state and timeline under MSC3061) and Room B's state with Bob. If Alice were to add Room C as a room for members | ||
to join Room B with, then Alice would send decryption keys for all of Room B's state with all members of Room | ||
C too. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like it would be very noisy. If you imagine a space for a company or similar every topic change of a space-joinable room is going to be sent to all employees. This could be a lot of traffic if you have a couple of rooms have frequently changing topics such as critical issues, current oncall or anything else.
As a worked example, Room A and Room B exist, where Room B is restricted to members of Room A. Alice is admin | ||
of both rooms and Bob got invited to Room A by Alice. Alice shares keys for all relevant parts of Room A (the | ||
state and timeline under MSC3061) and Room B's state with Bob. If Alice were to add Room C as a room for members | ||
to join Room B with, then Alice would send decryption keys for all of Room B's state with all members of Room |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is unclear why Alice is doing all of the sharing. Is it because she is an admin of room A or because she is an admin of room B? Or are we assuming that she is making these state event changes. I think this needs to be explicit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps clarifying what privileges's Bob assumed to have (regular user, so non-admin rights, that's read/write to public rooms, but no invites..., for example?) in this scenario'd also be helpful.
|
||
An encrypted state event looks very similar to a regular encrypted room message: the `type` becomes | ||
`m.room.encrypted` and the `content` is the same shape as a regular `m.room.encrypted` event. The | ||
`state_key` for encrypted state events is an arbitrary string used to differentiate different kinds |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this means that we lose the ability for the server to mandate that a state_key containing a mxid can only be written by that mxid. We'd need a replacement (e.g. a flag on the unencrypted event payload to say "only let the author edit this event")
[MSC3061](https://github.com/matrix-org/matrix-doc/pull/3061) can be leveraged for explicit invites to new | ||
members of the room, the MSC also does not solve the problem that restricted rooms impose. | ||
|
||
To fix the reliance on key sharing, it is proposed that keys be automatically sent whenever restricted-ness |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An interesting idea that someone (i forget who, sorry) proposed is that one could maintain serverside key backups per space (or room). So when you invite a new person into a big space tree of encrypted rooms, you literally give them the key to access the serverside backup for as long as they are a member of the space. Obviously this breaks pfs, but pfs is already gone the second you are sharing keys on invite anyway.
…light dependency Co-authored-by: Jae Lo Presti <me@jae.fi> Signed-off-by: Aminda Suomalainen <suomalainen+git@mikaela.info>
Most events in room state could be encrypted to preserve metadata, which is what this MSC aims | ||
to achieve. Some parts cannot be encrypted though in order to maintain a working protocol. | ||
|
||
## Proposal |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This MSC is fundamentally incompatible with Sliding Sync (MSC #3575), which relies on the server being able to calculate room names serverside in order to paginate them. Moreover, if it lands, it will fragment the network into SS-capable traffic versus non-SS-capable traffic.
The only mitigation I can see is if there's a homomorphic way we can map the plaintext room & room membership names to ordered items which can be used to order the rooms. (Or we could give up on encrypting m.room.name and profile data, but that /really/ starts to undermine the intentions of this MSC).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only mitigation I can see is if there's a homomorphic way we can map the plaintext room & room membership names to ordered items which can be used to order the rooms.
I'm not sure if homomorphic encryption would help here. To remind ourselves what homomorphic encryption is about:
Homomorphic encryption is a form of encryption that permits users to perform computations on its encrypted data without first decrypting it. These resulting computations are left in an encrypted form which, when decrypted, result in an identical output to that produced had the operations been performed on the unencrypted data.
So let's make a thought experiment out of this, suppose we have a homomorphic encryption scheme which allows us to order a variable sized array of strings in a alphabetical order. Let's call this function
Let's say that the array of strings are the display names of all the rooms the user is in,
So the client tells the server I want you to sort
The server now goes around and does it's thing and computes
Now by definition, each element in
The client receives
Finally the client can tell the server, I want to start syncing the rooms with the given room ids. Let us ignore the fact that this second request will leak the alphabetical order of the rooms to the server, and that it might figure out the room names over time while the user is sliding the window while scrolling through the room list.
Let us focus on the fact that this still isn't really what sliding sync wants. Sliding sync wants to know the order of the rooms. It wants to partially decrypt
Now if the above is not what you had in mind, and instead you always intended to leak the order to the server, then we arrive at order-preserving encryption[1] (OPE). But as already hinted above, leaking the order leaks more than just the order[2][3]. How much additional data we would be leaking is not clear. Room display names are a long way off from being a perfectly uniform plaintext. This makes OPE schemes a questionable choice.
I would like to be wrong and someone to come along and tell me that what I wrote up above is not how things should work, instead they would work in such and such a way where everything fits, and everything is nicely encrypted. As it stands I don't see how we would make the ordering of rooms by name, that sliding sync wants, and encrypted display names compatible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, i was being lazy in referencing homomorphic encryption - in practice OPE is what'd be needed. Agreed that this would be very leaky though, to the extent that it might not be worthwhile at all.
Another approach we could take to protect state event metadata would be to go full P2P, and have the servers just be store-and-forward nodes (of encrypted blobs). Then the servers running in the clients can happily maintain the metadata in plaintext, and index it and speak SS to the client to rapidly paginate through it. (Although in that model, why would we even need E2EE...?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and have the servers just be store-and-forward nodes (of encrypted blobs).
(Although in that model, why would we even need E2EE...?)
Perhaps I'm misreading the architecture you're proposing here, but if there's no E2EE, what are the encrypted blobs encrypted with?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that the ordering of sliding sync is already broken by the current encryption for rooms. The notifications counts for encrypted rooms are usually incorrect and get recalculated by clients. So there is precedent for having clients apply fixups to server side data. Now of course if you want alphabetical sorting, that will be a bit more annoying than just notification counts, since that will affect more rooms. However the roomname is (next to the topic) usually the event with the most sensitive information.
If this MSC is changed to be able to detect room name events without being able to decrypt those events, those could just get sent along the sync. This would not increase the database load, since the server already needs to load those events usually to sort the rooms (or should at least not make as much of a difference). It would increase bandwidth usage for clients that sort by displayname and make the sync a bit slower. However that might be a trade-off, that is worth the added privacy and security.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sliding sync aside, there's several reasons why the server would want to know the room name or at least where the room sits in an ordered set. Examples include email notifications (would need to be unencrypted or completely made of magic), abuse mitigations (can be encrypted), and directories (spaces, non-public rooms, etc - can be encrypted).
Preferably, we keep sliding sync design decisions on the sliding sync MSCs, as this conversation is now heading into that territory.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sliding sync aside, there's several reasons why the server would want to know the room name or at least where the room sits in an ordered set. Examples include email notifications (would need to be unencrypted or completely made of magic), abuse mitigations (can be encrypted), and directories (spaces, non-public rooms, etc - can be encrypted).
I'm not sure what you're trying to say here, does this mean that we don't want to encrypt m.room.name
, if so that should be listed in the MSC besides the other events that will stay unencrypted.
On the other hand if you mean that it's OK to encrypt m.room.name
then I think it's OK to discuss how this will impact sliding sync and other server functionality. We should perhaps expand the small section of broken functionality to get a comprehensive list. As of now we only have this single comment:
This effectively breaks the room directory and #2946 as they use named fields for details such as the room name.
Furthermore, I don't think it was obvious to people, it certainly wasn't obvious to me, that leaking the alphabetical order of the m.room.name
events will likely leak, albeit only partially, the contents as well.
Preferably, we keep sliding sync design decisions on the sliding sync MSCs, as this conversation is now heading into that territory.
Well sure, to some extent. If I go to the sliding sync MSC, then people will tell me that the sliding sync MSC isn't concerned with encrypted state, we'll figure that out later in the encrypted state MSC.
The problem is that sliding sync and encrypted state have a deeply intertwined relationship. Fetching events via the /sync
endpoint lies at the heart of the protocol, if you're going to potentially break it by encrypting some event then this should be discussed and understood.
So things that I would like to know:
- Do we want to encrypt
m.room.name
- Do we want to leak the order of the
m.room.name
events to the server - If we want to encrypt
m.room.name
, which services will be impacted by this - If we don't want to leak the order, how does this affect sliding sync. Is the alphabetical order even crucial to the performance of sliding sync.
Out of those 4 questions, the only one that would be sensible to shove into the sliding sync MSC would be the last one.
Conflicts are extremely easy to manufacture, however the feature making it easy to form conflicts is needed | ||
in order to protect the event's metadata. The conflict resolution algorithm can additionally be easily | ||
abused (ie: sending a timestamp for the year 4025 to force the room name to never change), however at that | ||
point the room is making it harder on itself to change over time. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This allows all users who can send state events to bypass power level restrictions, which isn't just "making it harder on itself". The server cannot enforce PL rules effectively with this change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not a unique problem for this MSC, and state events are typically only sent by those who have higher power in the room: they would be spamming themselves.
* `m.room.third_party_invite` | ||
* `m.room.history_visibility` | ||
* `m.room.guest_access` | ||
* `m.room.encryption` - Though not critical to the server-side protocol, it is critical to clients. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And tombstone events.
Other approaches such as deprecating `name`, `topic`, etc in favour of a `state` array are best considered | ||
for other MSCs. | ||
|
||
## Alternatives |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not discussed here is the option of moving the entire room model client-side, where servers just move JSON blobs around and clients do event auth, state res, etc locally in an encrypted setting. This is obviously more involved/complicated to pull off, though.
Rendered
See also: