Ensure device list updates are robust to race conditions and network failures #432
Fixes #430
This does two things:
`.. FOR UPDATE`)

This adds a regression test and refactors the unit tests so that they test the correct semantics.
History on this is: `result` previously had the `New` items put into `Sent` (the original intent), but we were now copying `result` to `writeBack` and then swapping `New` to `Sent`. As the caller only looks in `Sent`, it would not see any device list updates. Confusingly, this would fix itself on the next request because we then return `Sent`. However, the semantics are like this to guard against network failures, and since this refactor, specific network failures would drop the update.

I found this out when trying to fix this test in complement crypto, which is sensitive to how quickly the device list update is delivered to the EX client. Hence me finding the bug and fixing it.
This is effectively a UTD cause for the purposes of element-hq/element-meta#245 because it meant EX clients may miss a device list update, causing EX clients to fail to encrypt for newly logged in devices (just like that failing complement crypto test).