Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MSC3700: Deprecate plaintext sender key #3700

Merged
merged 7 commits into from
Apr 10, 2022
Merged

Conversation

erikjohnston
Copy link
Member

@erikjohnston erikjohnston commented Feb 3, 2022

This deprecates the superfluous sender_key and device_id in a megolm message, e.g. an encrypted room event.

Rendered

Implementation: matrix-org/matrix-rust-sdk#481

Preview: https://pr3700--matrix-org-previews.netlify.app

@erikjohnston erikjohnston force-pushed the erikj/deprecate_sender_key branch from efdfed7 to 48e87de Compare February 3, 2022 14:23
@erikjohnston erikjohnston changed the title MSC: Deprecate plaintext sender key MSC3700: Deprecate plaintext sender key Feb 3, 2022
@erikjohnston erikjohnston force-pushed the erikj/deprecate_sender_key branch from 48e87de to 3908f07 Compare February 3, 2022 14:23
When updating an existing session key, clients must ensure:
1. that the updated session data comes from a trusted source, e.g. either the
session data has a) a valid signature, or b) comes from the user’s session
key backup; and
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should spell out what we mean by "trusted source" here

@turt2live turt2live added proposal A matrix spec change proposal proposal-in-review kind:maintenance MSC which clarifies/updates existing spec e2e needs-implementation This MSC does not have a qualifying implementation for the SCT to review. The MSC cannot enter FCP. labels Feb 3, 2022
@erikjohnston
Copy link
Member Author

I have done an implementation in the matrix-rust-sdk here: matrix-org/matrix-rust-sdk#481

@anoadragon453
Copy link
Member

The positive review of matrix-org/matrix-rust-sdk#481 makes me cautiously optimistic about the implementation side of this. Though I would still like to see a review here from someone more intimately familiar with megolm before calling FCP.

@erikjohnston erikjohnston removed the needs-implementation This MSC does not have a qualifying implementation for the SCT to review. The MSC cannot enter FCP. label Mar 22, 2022
@erikjohnston
Copy link
Member Author

I think this is ready for FCP

proposals/3700-deprecate-sender-key.md Outdated Show resolved Hide resolved
globally unique and so no disambiguation using `sender_key` or `device_id` is
needed.

Session IDs are encoded ed25519 public keys; when new sessions are shared they
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Session IDs are encoded ed25519 public keys

I think we should consider strengthening the wording in the spec to make this clear. Currently, https://spec.matrix.org/v1.2/client-server-api/#mmegolmv1aes-sha2 just says "outbound_group_session_id", without clarity on what exactly it is.

Buried within https://gitlab.matrix.org/matrix-org/olm/blob/master/docs/megolm.md we have "... the public part of K (acting as a session identifier) ...", but it's a bit hard to join the dots as things stand.

(Background: when I designed the megolm message format, the intention was that receiving clients would treat session_id as opaque, and they had no guarantee that it would be unique to a given device or even sender, hence the presence of the other fields.)

proposals/3700-deprecate-sender-key.md Outdated Show resolved Hide resolved
@richvdh
Copy link
Member

richvdh commented Mar 22, 2022

seems good to me.

@mscbot fcp merge

@mscbot
Copy link
Collaborator

mscbot commented Mar 22, 2022

Team member @richvdh has proposed to merge this. The next step is review by the rest of the tagged people:

Once at least 75% of reviewers approve (and there are no outstanding concerns), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for information about what commands tagged team members can give me.

@mscbot mscbot added disposition-merge proposed-final-comment-period Currently awaiting signoff of a majority of team members in order to enter the final comment period. and removed proposal-in-review labels Mar 22, 2022
erikjohnston and others added 3 commits March 23, 2022 10:10
Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>
Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>
should still include it when generating *if* there is a `sender_key` field in
the event we're requesting keys for.

Clients must store and lookup sessions based purely on the session ID.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes it sound like clients are not allowed to store sessions based on room ID or sender user ID, which I don't think is what is intended.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, that is an interesting question actually. I think we don't want clients to ever lookup inbound sessions based on room ID or sender user ID, as they can be faked in exactly the same way as sender_key and device_id. Though I guess clients need to explicitly check that the sender user and room matches the session? I can make that more explicit.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At least for room ID, it's hard to fake since the event itself is tied to the room. (Also, your rust SDK PR stores by room ID and session ID.) Maybe it could say something like:

Clients should store and lookup sessions based purely on the session ID or based on the room ID and session ID. Clients must not store sessions based on the sender key or device ID.

?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At least for room ID, it's hard to fake since the event itself is tied to the room.

The server could easily lie about the room ID for an event though. If we don't check the room ID matches the session room ID then it allows servers to forward encrypted events into other rooms.


I've updated the language to:

Clients must not store or lookup sessions using the sender key or device ID.

Client must continue to ensure that the event's sender and room ID fields match
those of the looked up session, e.g. by storing and looking up session using the
room ID and sender as well as the session ID.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The server could easily lie about the room ID for an event though. If we don't check the room ID matches the session room ID then it allows servers to forward encrypted events into other rooms.

It can't, the room ID is part of the encrypted payload as well[1].

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, so there are three places where we have a room ID? The event, the session and the encrypted payload? Presumably we still want to check that they all match?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, the event room id is a hint to find the right bucket, the session room id is just a hint into which bucket to put the session. The room id on the session also can't be replaced by the server.

The encrypted payload room id ensures that the server doesn't lie about the room id.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, the event room id is a hint to find the right bucket, the session room id is just a hint into which bucket to put the session. The room id on the session also can't be replaced by the server.

Sure, but that relies on the client actually verifying the room IDs match. This can be done as you say by storing the sessions under the room ID they correspond to and then looking the session up via the room ID.

However, there is a risk here that clients will do the following:

  1. Look up sessions solely via session ID.
  2. Not check the room ID in the encrypted payload
  3. Display the message in the room corresponding to the event's room ID.

At that point the client is vulnerable to the server lying about the event's room ID and making the client display the verified encrypted content into the wrong room.

A similar attack can happen with the sender.

The key issue here is that clients need to handle both encrypted and unecrypted events, so they're very likely to use the event's room ID and sender when rendering things in the UI, rather than any data in the encrypted payload or session. This makes it very important for clients to actually check that the user ID/senders and room IDs match.

Copy link
Member

@dkasak dkasak Apr 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes it very important for clients to actually check that the user ID/senders and room IDs match.

Agreed. But what can we do to make this more prominent? The spec has some vague wording on this for room IDs:

We include the room ID in the payload, because otherwise the homeserver would be able to change the room a message was sent in.

But this is IMO neither clear nor noticeable enough. There is also this bogus paragraph:

As with Olm events, clients must confirm that the sender_key belongs to the user who sent the message. The same reasoning applies, but the sender ed25519 key has to be inferred from the keys.ed25519 property of the event which established the Megolm session.

This seems like a description of exactly what not to do. Clients should not look at sender_key as the server can easily spoof that (and is what's now being removed!). Instead, clients should record the (Curve25519) identity key and (Ed25519) fingerprint keys of the Olm device that initially shared a particular Megolm session.


I guess we need clear, actionable steps of what to do upon an encrypted room message receipt. Something like:

Upon receipt of an encrypted room message, do the following:

  1. Record (Curve25519) identity key and (Ed25519) fingerprint key of the Olm device that initially sends the m.room_key for a particular message.
  2. Check that the m.room.encrypted event's room ID matches the room_id field in the encrypted plaintext of the message. If not, fail.
    ...

If any of the above checks fail, the client must not display the message as a normal message. The client may display the message with a clear indication that it violates a required security property.

Ideally these rules would be in a differently coloured box to make it stand out from the rest of the text.

@turt2live turt2live added spec-pr-missing Proposal has been implemented and is being used in the wild but hasn't yet been added to the spec and removed finished-final-comment-period labels Apr 10, 2022

When clients receive an encrypted event with an unknown session they will need
to send a key request to all clients, rather than the device specified by
`sender_key` and `device_id`. This is the current behaviour used by Element
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current behavior is to send to all your own devices with wildcard, but to the specific device_id of the sender. So this will need to change as it's not the current behavior. Notice that it will produce new to_device traffic

@turt2live turt2live mentioned this pull request May 16, 2022
15 tasks
@turt2live turt2live self-assigned this May 30, 2022
turt2live added a commit to matrix-org/matrix-spec that referenced this pull request May 31, 2022
MSC: matrix-org/matrix-spec-proposals#3700 ([Markdown](https://github.com/matrix-org/matrix-spec-proposals/blob/main/proposals/3700-deprecate-sender-key.md))

The language around `m.room.encrypted` is a bit awkward because *technically* you can use the event to represent non-Megolm events, however that's considered an edge case at this time.
@turt2live
Copy link
Member

Spec PR: matrix-org/matrix-spec#1101

@turt2live turt2live added spec-pr-in-review A proposal which has been PR'd against the spec and is in review and removed spec-pr-missing Proposal has been implemented and is being used in the wild but hasn't yet been added to the spec labels May 31, 2022
turt2live added a commit to matrix-org/matrix-spec that referenced this pull request Jun 9, 2022
* Deprecate the `sender_key` and `device_id` on Megolm events

MSC: matrix-org/matrix-spec-proposals#3700 ([Markdown](https://github.com/matrix-org/matrix-spec-proposals/blob/main/proposals/3700-deprecate-sender-key.md))

The language around `m.room.encrypted` is a bit awkward because *technically* you can use the event to represent non-Megolm events, however that's considered an edge case at this time.

* changelog

* Apply wording changes

* Remove incorrect example

* Add missing sentence
@turt2live turt2live added merged A proposal whose PR has merged into the spec! and removed spec-pr-in-review A proposal which has been PR'd against the spec and is in review labels Jun 9, 2022
@turt2live
Copy link
Member

Merged 🎉

@MRAAGH
Copy link

MRAAGH commented Sep 21, 2022

This sounds like it might make decryption issues harder to debug; if there's no plaintext ID in the event, I won't even be able to check which client messages were sent from

@anoadragon453
Copy link
Member

@MRAAGH Note that reduced ease of debugging is a known downside - it's listed at the bottom of the MSC. It was determined to be worth it for the increased privacy benefits (I can no longer tell you're sending messages from your phone at the moment).

I agree that this makes debugging old, undecryptable messages more difficult. But hopefully for recent conversations the involved parties will know which client sent what message.

@BillCarsonFr
Copy link
Member

A late comment to note that this MSC is impacting MSC2399, as no_olm withheld codes are sent without session_id or room_id. This will make it impossible to relate a no_olm code to an unable to decrypt message.
We might need to modify MSC2399 so that no_olm codes are sent in the same way as other codes, removing the optimisation for sending them only once.

cc @poljar @uhoreg

jcgruenhage pushed a commit to famedly/matrix-dart-sdk that referenced this pull request Jul 7, 2023
This is mostly to see if we can deal with MSC3700 already.

I need to add back sending the sender key in requests and such, but
ignore it when processing events.

See matrix-org/matrix-spec-proposals#3700
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
e2e kind:maintenance MSC which clarifies/updates existing spec merged A proposal whose PR has merged into the spec! proposal A matrix spec change proposal
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.