-
Notifications
You must be signed in to change notification settings - Fork 3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: ensure pusher events are processed in the same order they were received #31144
Conversation
@aimane-chnaif Please copy/paste the Reviewer Checklist from here into a new comment on this PR and complete it. If you have the K2 extension, you can simply click: [this button] |
console.debug('[OnyxUpdateManager] Done applying Pusher update'); | ||
}); | ||
|
||
return pusherEventsPromise; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
async / await is better reading and easy to understand but unfortunately that's not allowed in our codebase
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. Originally I went this route but then lint yelled at me. I also had a for loop initially that might be more readable, but ended up going with a reduce. I can switch back to it if that makes things easier to understand.
Reviewer Checklist
Screenshots/VideosAndroid: Nativeother.platforms.movAndroid: mWeb Chromeother.platforms.moviOS: Nativeother.platforms.moviOS: mWeb Safariother.platforms.movMacOS: Chrome / Safariweb.movMacOS: Desktopother.platforms.mov |
@barros001 I still have the case when issue still happens. Please try to follow step in my video exactly (around 00:52) bug.mov |
Ok, will take a look tomorrow. |
@aimane-chnaif I tried to reproduce the issue several times but I could not reproduce it. Is this something you can reliably reproduce? I noticed you had a 3rd instance of the app running on a what appeared to be an iOS simulator. Can you give me more details about your test setup? There clearly is something going on as shown on your video, but so far I have not been able to reproduce. |
@aimane-chnaif I believe we still have a race condition here and my solution solved only part of it. See code below: function apply({lastUpdateID, type, request, response, updates}: OnyxUpdatesFromServer): Promise<void | Response> | undefined {
console.debug(`[OnyxUpdateManager] Applying update type: ${type} with lastUpdateID: ${lastUpdateID}`, {request, response, updates});
if (lastUpdateID && lastUpdateIDAppliedToClient && Number(lastUpdateID) < lastUpdateIDAppliedToClient) {
console.debug('[OnyxUpdateManager] Update received was older than current state, returning without applying the updates');
return Promise.resolve();
}
if (lastUpdateID && (lastUpdateIDAppliedToClient === null || Number(lastUpdateID) > lastUpdateIDAppliedToClient)) {
Onyx.merge(ONYXKEYS.ONYX_UPDATES_LAST_UPDATE_ID_APPLIED_TO_CLIENT, Number(lastUpdateID));
}
if (type === CONST.ONYX_UPDATE_TYPES.HTTPS && request && response) {
return applyHTTPSOnyxUpdates(request, response);
}
if (type === CONST.ONYX_UPDATE_TYPES.PUSHER && updates) {
return applyPusherOnyxUpdates(updates);
}
} This gets called every time a push event is received, but it also gets called when http events are received. While debugging I noticed that sometimes a push even start processing and before it ends, http events come in and process first. While I could not reproduce the issue you showed above, there's a good chance that this could lead to inconsistent state. This makes me think that, although we still need to serialize the push events inside Also, The resulting code would look something like this: let applyPromise = Promise.resolve();
function apply({lastUpdateID, type, request, response, updates}: OnyxUpdatesFromServer): Promise<void | Response> | undefined {
// ensure paralell calls to this method will execute sequentially
applyPromise = applyPromise.then(() => {
console.debug(`[OnyxUpdateManager] Applying update type: ${type} with lastUpdateID: ${lastUpdateID}`, {request, response, updates});
if (lastUpdateID && lastUpdateIDAppliedToClient && Number(lastUpdateID) < lastUpdateIDAppliedToClient) {
console.debug('[OnyxUpdateManager] Update received was older than current state, returning without applying the updates');
return Promise.resolve();
}
let onyxMergePromise = Promise.resolve();
if (lastUpdateID && (lastUpdateIDAppliedToClient === null || Number(lastUpdateID) > lastUpdateIDAppliedToClient)) {
onyxMergePromise = Onyx.merge(ONYXKEYS.ONYX_UPDATES_LAST_UPDATE_ID_APPLIED_TO_CLIENT, Number(lastUpdateID));
}
return onyxMergePromise.then(() => {
if (type === CONST.ONYX_UPDATE_TYPES.HTTPS && request && response) {
return applyHTTPSOnyxUpdates(request, response);
}
if (type === CONST.ONYX_UPDATE_TYPES.PUSHER && updates) {
return applyPusherOnyxUpdates(updates);
}
});
});
return applyPromise; Code still needs a little cleanup and more tests but before I go to the work of implementing this I wanted to get some feedback. The code above combined with what's already in the PR will ensure every event received is processed in the correct order. |
Can you explain with async / await approach? Though final code will be promise. |
Let me step back a little first, I'll try to better explain what's going on here. If I refresh the app, or go back online after being offline for a while, the app will do a bunch of things such as pull data from the BE and apply to the FE. Here's some sequence of logs I see sometimes (not always, but usually going from offline to online). Do you see anything funny?
An HTTP update came in while we were applying a pusher update and executed all the way through before the pusher event was fully processed. I don't think this is how this was intended as this processes things out of sequence.
App/src/libs/Middleware/SaveResponseInOnyx.ts Lines 39 to 41 in 183cce0
And also from here: Lines 514 to 517 in ca13914
These two places are unaware of each other causing them to call apply in parallel and events (http/pusher) being mixed up and potentially executed out of order. The PR so far fixes the initial side effect but there seems to be a more fundamental issue going on here and apply needs to be serialized as a whole to avoid potential issues. I'll dig a little deeper to see if I can find a good solution for this. I tried applying the concept I described in the previous comment above but I think there's some sort of deadlock happening and FE gets stuck waiting to finish processing a push event, which never happens. Answering your question about using async/await, I don't think I can really convert it to use async/await due to the nature of the problem: single function, multiple entrypoints where both entry points must wait any pending promise before executing the next one. I'll dig a little further and see if I can figure a clean solution. As far as the issue you pointed out, are you still able to reproduce it? Does it happen all the time? I could not reproduce it yet on my end. |
I can still reproduce. Seems like reliably reproducible to me Screen.Recording.2023-11-14.at.6.58.22.PM.mov |
Interesting.. If I'm on main, I can reliably reproduce it using the exact same steps are you, but if I'm on the branch where the fix is applied, I can no longer replicate, the issue seems to be resolved. I'll try a few more time with different accounts. Silly question, you don't happen to be on |
As this is one file change, I just copied code on this branch into local |
@aimane-chnaif there seems to be something else going on on your video. Slow it down and watch it closely.
You then go back to and make the first window (that sent a black TU) online, once you switch back to the other (that sent the yellow TU), which is still offline, I see the black TU there, shaded as if that window tried to send it while still offline, And it all happens before you go back online. See below: Note the Offline message below composer, top right corner still says offline, and black TU is there. I'm wondering where that came from if that account was still offline. |
Good catch! I think that's because pusher still works sometimes even when chrome network is offline. |
I just pushed a new commit addressing my previous comment. Can you give it a try and see if you still reproduce it? |
Your last change breaks everything. Apis are not called |
…plyPusherOnyxUpdates" This reverts commit 82fb44b.
sorry about that, I did notice the same, there seem to be some sort of deadlock somewhere.. Reverted and we're back to the original solution. I really wish I could reproduce what you're seeing. |
I tried again and now not able to reproduce. Tested both disabling chrome network and disabling device network |
@tgolen can you please generate QR build? Once we confirm again on release builds, I think we're good to go |
Sure, I've triggered one for you. FYI, any internal engineer can trigger those, so no need to ping me directly for them. |
Hah sorry, but I pinged you because you're involved on linked issue. If not you, whom should I tag? 😄 |
🧪🧪 Use the links below to test this adhoc build on Android, iOS, Desktop, and Web. Happy testing! 🧪🧪 |
I tested release builds on real offline mode and worked well. |
Reviewer Checklist
Screenshots/VideosAndroid: Nativeandroid-mweb.movAndroid: mWeb Chromeandroid-mweb.moviOS: Nativedesktop-ios.moviOS: mWeb Safariandroid-mweb.movMacOS: Chrome / Safariweb-real.movMacOS: Desktopdesktop-ios.mov |
✋ This PR was not deployed to staging yet because QA is ongoing. It will be automatically deployed to staging after the next production release. |
🚀 Deployed to production by https://github.com/luacmartins in version: 1.4.1-13 🚀
|
Details
This PR will ensure that all pusher events are processed in the exact same order as they were received.
Fixed Issues
$ #23679
PROPOSAL: #23679 (comment)
Tests
Offline tests
QA Steps
PR Author Checklist
### Fixed Issues
section aboveTests
sectionOffline steps
sectionQA steps
sectiontoggleReport
and notonIconClick
)myBool && <MyComponent />
.src/languages/*
files and using the translation methodWaiting for Copy
label for a copy review on the original GH to get the correct copy.STYLE.md
) were followedAvatar
, I verified the components usingAvatar
are working as expected)/** comment above it */
this
are necessary to be bound (i.e. avoidthis.submit = this.submit.bind(this);
ifthis.submit
is never passed to a component event handler likeonClick
)StyleUtils.getBackgroundAndBorderStyle(themeColors.componentBG)
)Avatar
is modified, I verified thatAvatar
is working as expected in all cases)ScrollView
component to make it scrollable when more elements are added to the page.main
branch was merged into this PR after a review, I tested again and verified the outcome was still expected according to theTest
steps.Screenshots/Videos
Android: Native
Screen.Recording.2023-11-09.at.12.34.09.PM.mov
Android: mWeb Chrome
Screen.Recording.2023-11-09.at.11.52.45.AM.mov
iOS: Native
Screen.Recording.2023-11-09.at.12.02.57.PM.mov
iOS: mWeb Safari
Screen.Recording.2023-11-09.at.11.43.58.AM.mov
MacOS: Chrome / Safari
Screen.Recording.2023-11-09.at.11.39.41.AM.mov
MacOS: Desktop
Screen.Recording.2023-11-09.at.11.47.56.AM.mov