Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Restart broken Olm sessions #780

Merged
merged 18 commits into from
Nov 15, 2018
Merged

Restart broken Olm sessions #780

merged 18 commits into from
Nov 15, 2018

Conversation

dbkr
Copy link
Member

@dbkr dbkr commented Nov 8, 2018

unwedge

  • Start a new Olm sessions with a device when we get an undecryptable
    message on it.
  • Send a dummy message on that sessions such that the other end knows
    about it.
  • Re-send any outstanding keyshare requests for that device.

Also includes a unit test for megolm that isn't very related but came
out as a result anyway.

Includes #776
Fixes element-hq/element-web#3822

 * Start a new Olm sessions with a device when we get an undecryptable
   message on it.
 * Send a dummy message on that sessions such that the other end knows
   about it.
 * Re-send any outstanding keyshare requests for that device.

Also includes a unit test for megolm that isn't very related but came
out as a result anyway.

Includes #776
Fixes element-hq/element-web#3822
@dbkr dbkr requested a review from uhoreg November 8, 2018 19:14
@dbkr
Copy link
Member Author

dbkr commented Nov 8, 2018

Tests failing because of https://github.com/matrix-org/olm-backup/pull/77 requirement

Copy link
Member

@uhoreg uhoreg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll need to look over this again, since it's pretty big, but here are some comments to start with.

spec/unit/crypto.spec.js Show resolved Hide resolved
src/crypto/index.js Outdated Show resolved Hide resolved
src/crypto/algorithms/megolm.js Show resolved Hide resolved
* @private
* @param {module:models/event.MatrixEvent} event undecryptable event
*/
Crypto.prototype._onToDeviceBadEncrypted = async function(event) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like, for example, if Alice sends Bob a message in 5 rooms, creating inbound group sessions in all 5, and then Bob comes online later on, receives all 5 messages, can't decrypt the olm session used to encrypt those group sessions, then Bob will create 5 new olm sessions, which is overkill. It would be better to remember that we already created a new session. This could be done by remembering that an olm session has already been "replaced", and if it encounters an already-replaced olm session, then it just needs to re-send the key request. It probably doesn't need to be persisted, since this situation would only happen in the short term, so it could just be stored as a member of the crypto object.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right, so the problem with this is that only prekey messages contain information on what session ID / device key they're destined for: normal messages sent once the session has been established do not, so if we have multiple sessions, we won't know which has been replaced (when we get an incoming message, we just try decrypting it with every session we have).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't think of anything better to do than a dumb heuristic like only creating one new session every hour.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's unfortunate. It seems like using a dumb heuristic might be the only thing to do, though every hour might be too long. Maybe something more like 10 minutes would be better, as hopefully by then, the sender would have received the m.dummy message.

@uhoreg
Copy link
Member

uhoreg commented Nov 13, 2018

We'll need to make an MSC to document this. So far, it looks like we'll need to document:

  • when an olm session is created and has not yet received a message, use the creation time of the session as the time that it last received a message when deciding what session to use for encrypting outgoing messages
  • for last received time of messages, only consider messages that were successfully decrypted
  • when you receive an undecryptable message using olm, create a new session and send a dummy event using it so that the other side knows about it (except that you don't have to create a new session if you already did)
  • ...

(I can start writing the MSC)

"New session already forced with device " + sender + ":" + deviceKey +
" at " + lastNewSessionForced + ": not forcing another",
);
return;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the keyshare request should still be resent in this situation?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so - we'll have resent all outstanding keyshare requests when the session was re-established, so any responses coming back on the old session will already have had their re-requests sent after establishment of the new session.

@uhoreg
Copy link
Member

uhoreg commented Nov 15, 2018 via email

turt2live added a commit to matrix-org/matrix-spec-proposals that referenced this pull request May 30, 2019
As per [MSC1719](#1719)

No known alterations have been made to the proposal.

Implementation proof: matrix-org/matrix-js-sdk#780
@t3chguy t3chguy deleted the dbkr/olm_session_unwedge branch May 10, 2022 14:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants