You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Whilst working on matrix-org/complement-crypto#50 I took a tcpdump of a failing test and found convincing evidence that SS is not correctly telling clients of the later device list update for the same user. This is problematic because it means clients won't hit /keys/query at the right time (after the client has uploaded device keys) and hence won't encrypt for that device, causing UTDs.
alice rust = authorization: Bearer syt_dXNlci0xLWFsaWNl_pWOuJBbfdUlWnqLLVRgm_1rPA9l\r\n
bob JS = Authorization: Bearer syt_dXNlci0yLWJvYg_BCEPDhFaFLDZGIrkSuVX_1q3cQp\r\n
bob NEW_DEVICE String value: syt_dXNlci0yLWJvYg_RiemTZKJheBtUhAIxhWu_3qdJOR
767, 1.609s JS (bob) sees it.
775, 1.61s SS sees it.
797, 1.619 bob (js) /keys/query for bob
823, 1.753 alice (rust) SS delivers changed entry for bob
831, 1.781 alice (rust) /keys/query for bob
875, 2.16s bob NEW_DEVICE /login response String value: syt_dXNlci0yLWJvYg_sXHbnXvELCuGoDdoQjAz_0GMdjf
908, 2.183s bob /keys/upload response syt_dXNlci0yLWJvYg_sXHbnXvELCuGoDdoQjAz_0GMdjf
917, 2.184s SS sees bob changed
928, 2.184s bob (JS) sees bob changed
953, 2.193s bob NEW_DEVICE syt_dXNlci0yLWJvYg_sXHbnXvELCuGoDdoQjAz_0GMdjf /keys/query for bob
969, 2.195s bob syt_dXNlci0yLWJvYg_BCEPDhFaFLDZGIrkSuVX_1q3cQp /keys/query for bob
Upon close inspection, this seems to be caused because device list updates don't wake up connections. This means it takes 30s for the device list update to make it to Alice, and in the meantime messages sent aren't encrypted for Bob's new device. Adding a sleep of at least 30s between Bob's new device login and Alice sending the message fixes the test.
Separately, the code for func (t *DeviceDataTable) Select(userID, deviceID string, swap bool) is not atomic, because the transaction lacks ... FOR UPDATE in the SELECT, meaning it is possible for the following to happen:
Reader: BEGIN TXN
Reader: SELECT device data
Poller gets a device list change entry
Writer: BEGIN TXN
Writer: SELECT device data
Writer: INSERT .. ON CONFLICT DO UPDATE .. [row level lock acquired]
Reader: UPDATE ... [blocks due to Writer txn]
Writer: COMMIT
Reader: UPDATE unblocks
Reader: COMMIT
Device list change entry was dropped.
With SELECT .. FOR UPDATE, the lock is acquired at SELECT time meaning this cannot happen and instead:
Reader: BEGIN TXN
Reader: SELECT device data [row level lock acquired]
Poller gets a device list change entry
Writer: BEGIN TXN
Writer: SELECT device data [blocks due to Reader txn]
Reader: UPDATE
Reader: COMMIT
Writer: SELECT device data unblocks
Writer: INSERT .. ON CONFLICT DO UPDATE .. [row level lock acquired]
Writer: COMMIT
If this happens, senders may not have up-to-date device lists for users, causing UTDs for the newly logged in device. For this to happen, the device must login at the same time as the client is sending SS requests.
The text was updated successfully, but these errors were encountered:
Whilst working on matrix-org/complement-crypto#50 I took a tcpdump of a failing test and found convincing evidence that SS is not correctly telling clients of the later device list update for the same user. This is problematic because it means clients won't hit
/keys/query
at the right time (after the client has uploaded device keys) and hence won't encrypt for that device, causing UTDs.Upon close inspection, this seems to be caused because device list updates don't wake up connections. This means it takes 30s for the device list update to make it to Alice, and in the meantime messages sent aren't encrypted for Bob's new device. Adding a sleep of at least 30s between Bob's new device login and Alice sending the message fixes the test.
Separately, the code for
func (t *DeviceDataTable) Select(userID, deviceID string, swap bool)
is not atomic, because the transaction lacks... FOR UPDATE
in theSELECT
, meaning it is possible for the following to happen:With SELECT .. FOR UPDATE, the lock is acquired at SELECT time meaning this cannot happen and instead:
If this happens, senders may not have up-to-date device lists for users, causing UTDs for the newly logged in device. For this to happen, the device must login at the same time as the client is sending SS requests.
The text was updated successfully, but these errors were encountered: