[HOLD for payment 2024-05-06] [$500] [Wave Collect] [Ideal Nav] Chat List and Workspace List sometimes loads slowly & App sometimes hangs #35704

hayata-suenaga · 2024-02-02T19:23:33Z

Action Performed:

Reported by David here, here, and here

This issue has two aspects:

The Chat List and Workspace List loads sometimes loads too slow

Make sure you have an account with a lot of chats and workspaces
Click the Chats button on the bottom tab
Confirm that the list of chats load very slow
Click the 🔧 Settings button in Expensify/All workspace > Workspaces
Confirm that the list of workspaces load very slow

Another performance issue was reported by David
-> internal Slack thread where David posted the issue

Make sure your account has several workspaces with a lot of chats/reports
Open the Chat Switcher (Search Page) by pressing CMD + K
Switch to a different chat/person (possibly one in a different workspace
Observe that the LHN that displays the list of chats loads very slowly

The App sometimes hangs

This issue doesn't have any defined reproduction step. David reported that he sometimes get the browser popup error saying that the page is unresponsive. Please see the screenshot below. This issue was initially reported in this GH issue.

Workaround:

N/A

Platforms:

Which of our officially supported platforms is this issue occurring on?

Screenshots/Videos

Screenshot -> https://expensify.slack.com/archives/C036QM0SLJK/p1706901348693069?thread_ts=1706887969.154719&cid=C036QM0SLJK

View all open jobs on GitHub

Upwork Automation - Do Not Edit

Upwork Job URL: https://www.upwork.com/jobs/~01e122c7e0ae0d76b5
Upwork Job ID: 1753562635025690624
Last Price Increase: 2024-04-23
Automatic offers:

cubuspl42 | Reviewer | 28142256

melvin-bot · 2024-02-02T19:23:45Z

Triggered auto assignment to @lschurr (Bug), see https://stackoverflow.com/c/expensify/questions/14418 for more details.

lschurr · 2024-02-02T20:42:07Z

Hey @hayata-suenaga - should this be External for contributors to work on or are we keeping this Internal?

melvin-bot · 2024-02-02T23:34:48Z

Job added to Upwork: https://www.upwork.com/jobs/~01e122c7e0ae0d76b5

melvin-bot · 2024-02-02T23:34:53Z

Triggered auto assignment to Contributor-plus team member for initial proposal review - @cubuspl42 (External)

hayata-suenaga · 2024-02-02T23:35:25Z

Hey @hayata-suenaga - should this be External for contributors to work on or are we keeping this Internal?

I'll keep this external for now. Thank you for reminding me of it 😄

hurali97 · 2024-02-05T10:06:14Z

@mountiny can you assign this to me ? 👋

melvin-bot · 2024-02-05T10:06:46Z

📣 @cubuspl42 🎉 An offer has been automatically sent to your Upwork account for the Reviewer role 🎉 Thanks for contributing to the Expensify app!

Offer link
Upwork job

hayata-suenaga · 2024-02-05T23:02:32Z

@hurali97 I changed the issue description of the original post of this issue to include another related issue reported about the App becoming unresponsive. Please check the issue description again and let me know if you can handle the added issue. 🙇

hurali97 · 2024-02-06T16:38:43Z

Hey @hayata-suenaga !

I took a look at the issue description and it's clear 👍 I tried to reproduce and locate the root cause but this one's tricky to reproduce. However, I found that switching between chat list and workspace is sometimes really slow. After locating the stack traces for these I found that most of the points are re-usable from the App Startup Audit on-going here. After adding the optimization code from these points, the switching between chat list and workspace is better than before by margins.

However, with the fixes I left the Expensify tab open for a few minutes switched to other tab and when I came back to Expensify and reloaded the page then switched to Workspace list, I got the Unresponsiveness Alert, even with the fixes applied from the above step. It's clear that the fixes doesn't solve the root cause which is apparently related to memory consumption.

I observed the memory usage whilst switching between chat list and workspaces a bunch of time the memory consumption gets to more than 2000 MB❗Upon analysing the memory allocation trace most of the allocation is consumed by getOrderedReportIDs.

I will analyse this further tomorrow and keep y'all posted 👍

hayata-suenaga · 2024-02-06T19:26:10Z

Upon analysing the memory allocation trace most of the allocation is consumed by getOrderedReportIDs.

I don't fully remember what the getOrderedReportIDs function does. But I remember that the app identifies which workspace items should include RBR/GBR when the Workspace Switcher displays the list of workspaces (screenshot below). It does so by reviewing all reports for each workspace to determine if any reports or chats require the user's attention, signified by RBR/GBR. This process might consume a significant amount of memory.

hurali97 · 2024-02-07T07:28:54Z

@hayata-suenaga The getOrderedReportIDs is the one which prepares the data for LHN. And as you explained about GBR there's some checks for GBR as well in the getOrderedReportIDs. Below is a piece of code from that function:

...
  reportsToDisplay.forEach((report) => {
        report.displayName = ReportUtils.getReportName(report);
        report.iouReportAmount = ReportUtils.getMoneyRequestSpendBreakdown(report, allReports).totalDisplaySpend;

        const isPinned = report.isPinned ?? false;
        const reportAction = ReportActionsUtils.getReportAction(report.parentReportID ?? '', report.parentReportActionID ?? '');

        // Checking for Require Attention
        if (isPinned || ReportUtils.requiresAttentionFromCurrentUser(report, reportAction)) {
            pinnedAndGBRReports.push(report);
        } else if (report.hasDraft) {
            draftReports.push(report);
        } else if (ReportUtils.isArchivedRoom(report)) {
            archivedReports.push(report);
        } else {
            nonArchivedReports.push(report);
        }
    });

    pinnedAndGBRReports.sort((a, b) => (a?.displayName && b?.displayName ? a.displayName.toLowerCase().localeCompare(b.displayName.toLowerCase()) : 0));

...

I will further investigate the cause of high memory consumption and then update with my findings here, thanks for the pointers 👍

quinthar · 2024-02-07T16:25:06Z

Happened again; what can I do to help diagnose this?

hurali97 · 2024-02-07T16:31:25Z

@quinthar I have some break through today as I was able to locate the root cause. I also tested the fix and things are looking way better now. Switching between chat list and workspaces is snappy and I didn't encounter any sort of lag OR unresponsiveness. I switched a lot of times between them.

I will post my findings here tomorrow 🤞

hurali97 · 2024-02-08T09:19:34Z

Hey guys 👋

I want to share my findings on this issue with you. This issue was relatively hard to reproduce and even if I reproduced it, it was practically impossible for me to profile the memory leaks and garbage collector not able to free the memory. The reason for that is when we are profiling for memory allocations and the web page is unresponsive, everything stops at the moment, even the profiler. So there's no way forward in that area.

So I tried to profile this in chunks. To explain it, the flow this issue happens is when we are switching between chat list and workspaces. To take a chunk, we can put dummy views in chat list UI and use original view in workspaces. This gives us a chance to look for the memory issues whilst we are switching back and forth.

Following this approach I started the profile process and was able to locate a bunch of very important memory issues. Let's see them in detail:

The reports object:

Analysing this image from memory allocation instrument, we see that the report object still associated with detached HTML views and Garbage Collector is not able to retrieve that allocated memory. This happens in a couple of places in our profile. Exploring this further, we can see that it's bound to WorkSpacesListPage.tsx. The report object has around 15k reports in it. Take a look at the Retained Size section which means the amount of memory that will be freed once the object is removed alongside it's dependent objects. The detached HTML views takes about 2 mb of memory each and the list is relatively long.

The question is the web app became kind of usable when we placed dummy UI views in the chat list area, does it mean that the report object in other areas is also causing the issue?

The report object is supplied via withOnyx to the WorkSpacesListPage.tsx. We have report object supplied via withOnyx in 3 other places, namely ReportScreenIDSetter, SidebarLinksData and LHNOptionsList. Which means that we have more of the same problem in memory whenever we switch back and forth, hence our web app becomes slow and ultimately hangs. To explain it further let's see another issue:

The internals of withOnyx:

tempState variable:

state variable:

These two variables are part of withOnyx implementation and as far as I could understand, they both hold data for the supplied key. So this means for each key we have two variables holding the same data for some implementation details reasons. We have reports object used via withOnyx in 4 prominent places, which means 8 variables holding the reports object. The point to emphasise here is that the reports object is really huge and allowing it to have multiple instances in the React Tree is not ideal. Doing so slows down our web app when we are switching between chat list and workspaces.

The ideal solution in such cases is to share the single source of truth for such huge data. To explain it, we can leverage React.Context and store the reports object there using Onyx.connect. And later where we are using withOnyx for reports we will refactor that to now use a hook for the context, say useReports(). This fixes the bottleneck we have but it also gives us the advantage of immediately mounting the component instead of having to wait to parse it through withOnyx. This is achieved because React.Context already holds the data for us and whenever we ask, it will give us immediately and no extra variables are holding the data like tempState and state.

The FlashList in LHNOptionsList:

The underlying principle of FlashList is Recycling and it doesn't play very well if our renderItems are heavy. When we are switching back and forth with all the UI in place and with above report optimisation. The web app still sometimes hangs and the main reason for that is the usage of FlashList in LHNOptionsList.

I tried to profile the memory allocations by FlashList but since web page keeps getting stuck, I couldn't profile. I also tried with chunk approach by setting dummy UI in workspaces because LHNOptionsList is in chat list area but still the web page was hanging up.

The above image is from the docs of FlashList and as we can see that the list will perform poorly if it's slow to render. If we want to use FlashList, we will need to make the renderItem fast to render but it's hard since it's associated with lots of important data. Otherwise, we can switch to FlatList on web and still keep using FlashList on mobile platforms. Alternatively, we can use FlatList now and in a separate effort we first make renderItems fast to render and then bring back FlashList.

Only case where the web page did not hang up was when I replaced FlashList with FlatList and things were back to normal. I could also profile with original UI and no hang up in that as well.

Since we can't proceed with profiling, what I will deduce from this observation is point 1. Remember the HTML detached elements were not removed by GC and reports was still part of the memory? Since our LHN is based on rendering the reports, it seems to link back to the point 1. Also, since support for FlashList is in beta on web, we can think of replacing it with FlatList.

Just a note, if we are to replace FlashList with FlatList and we know that now it works very well, we still should dedicate some effort in making the renderItems light weight.

The above three step makes our web app a lot smoother, let's see that in action below:

memory.mp4

The above fixes the issue and makes our web app snappy but there are some areas that were observed during the profiling and are worth discussing over:

The report object being used to access data via Onyx.connect

We have Onyx connections in our util files more than 10 times. Most of them are only used for accessing the report data. In the profiling session, we see that more than 10 times the report object is holding the significant amount of memory. Above screenshot is just one example of it.

The ideal solution we can have is either Onyx.get() which helps us retrieve the updated value wherever we need it. If we can't do it for some reasons, we can create a single file with onyx connection and then read the data from it using a method, say Report.getAllReports().

I profiled the memory after closing all the connections and kept alive only a couple and the memory allocation for objects went down by margins.

The JSON.Stringify in getOrderedReportIDs:

We have cache mechanism in place for getOrderedReportIDs and to achieve that we generate a key through the passed arguments to the function. In the screenshot, we can see that the string key takes 25 mb at max and around ~130 mb in total . The reason is that the arguments which are used to generate the stringified key are really huge.

We should also discuss that the cache mechanism is not useful in getOrderedReportIDs. The reason is that due to such a diverse data which is prone to be non-unique because of it having a lots of keys. Which means that even if we place the cache mechanism in place, we have a limit of 5 and once that hits, we pop the oldest cache item to make room for the latest one. However, the cache is not really used because the data is different almost each time and we end up generating the key for it which is not used.

This is not only observed in this flow But in other flows as well. For example, as part of Callstack's audit, we analysed that in the App Startup, Send a message and Report screen load time, cache for getOrderedReportIDs is not used at all. In fact, it adds up to the execution time of the function by ~350 ms.

cc: @hayata-suenaga @quinthar

melvin-bot · 2024-04-23T23:54:34Z

Triggered auto assignment to Contributor Plus for review of internal employee PR - @ntdiary (Internal)

mallenexpensify · 2024-04-23T23:54:51Z

@ntdiary reassigning, please take over as C+. If you don't have bandwidth, unassign yourself. Thanks

melvin-bot · 2024-04-23T23:55:04Z

Upwork job price has been updated to $500

ntdiary · 2024-04-24T05:03:41Z

@ntdiary reassigning, please take over as C+. If you don't have bandwidth, unassign yourself. Thanks

Unassign myself due to lack of bandwidth. :)

melvin-bot · 2024-04-29T23:04:45Z

The solution for this issue has been 🚀 deployed to production 🚀 in version 1.4.67-7 and is now subject to a 7-day regression period 📆. Here is the list of pull requests that resolve this issue:

perf: make switch between chat list and workspaces smoother #36420

If no regressions arise, payment will be issued on 2024-05-06. 🎊

For reference, here are some details about the assignees on this issue:

@hurali97 does not require payment (Contractor)
@situchan requires payment (Needs manual offer from BZ)

melvin-bot · 2024-04-29T23:04:46Z

BugZero Checklist: The PR fixing this issue has been merged! The following checklist (instructions) will need to be completed before the issue can be closed:

[@hayata-suenaga] The PR that introduced the bug has been identified. Link to the PR:
[@hayata-suenaga] The offending PR has been commented on, pointing out the bug it caused and why, so the author and reviewers can learn from the mistake. Link to comment:
[@hayata-suenaga] A discussion in #expensify-bugs has been started about whether any other steps should be taken (e.g. updating the PR review checklist) in order to catch this type of bug sooner. Link to discussion:
[@hurali97 / @situchan] Determine if we should create a regression test for this bug.
[@hurali97 / @situchan] If we decide to create a regression test for the bug, please propose the regression test steps to ensure the same bug will not reach production again.
[@lschurr] Link the GH issue for creating/updating the regression test once above steps have been agreed upon:

hayata-suenaga · 2024-04-30T17:04:57Z

[@hayata-suenaga] The PR that introduced the bug has been identified. Link to the PR:

The performance regression happens with different changes and due to past architectural decisions. One change that affected this was the introduction of a cache for the sorted report IDs that required the string cache key to be compared on each render. However, because the cache key changes almost all the time, this cache wasn't used at all.

[@hayata-suenaga] The offending PR has been commented on, pointing out the bug it caused and why, so the author and reviewers can learn from the mistake. Link to comment:

No specific PR.

[@hayata-suenaga] A discussion in #expensify-bugs has been started about whether any other steps should be taken (e.g. updating the PR review checklist) in order to catch this type of bug sooner. Link to discussion:

https://expensify.slack.com/archives/C049HHMV9SM/p1714496660088649

lschurr · 2024-05-06T16:36:48Z

Payment summary:

Contributor: @hurali97 (no payment needed)
Contributor+: @situchan $500 (paid in Upwork)

lschurr · 2024-05-07T21:11:54Z

Bump on that offer @situchan :)

lschurr · 2024-05-10T21:25:53Z

Waiting on @situchan to accept.

lschurr · 2024-05-16T15:19:24Z

All paid!

hayata-suenaga added Daily KSv2 Bug Something is broken. Auto assigns a BugZero manager. labels Feb 2, 2024

hayata-suenaga self-assigned this Feb 2, 2024

hayata-suenaga added this to [#whatsnext] Wave 08 - Collect Plan Admins Feb 2, 2024

hayata-suenaga moved this to Release 1: Ideal Nav & Collect Simplfied Profile, Members, Categories, Workflows (approvals) in [#whatsnext] Wave 08 - Collect Plan Admins Feb 2, 2024

melvin-bot bot assigned lschurr Feb 2, 2024

hayata-suenaga added the External Added to denote the issue can be worked on by a contributor label Feb 2, 2024

melvin-bot bot changed the title ~~[Wave 8] [Ideal Nav] Loading is too slow for Chat List and Workspace List~~ [$500] [Wave 8] [Ideal Nav] Loading is too slow for Chat List and Workspace List Feb 2, 2024

melvin-bot bot added the Help Wanted Apply this label when an issue is open to proposals by contributors label Feb 2, 2024

melvin-bot bot assigned cubuspl42 Feb 2, 2024

melvin-bot bot added the Overdue label Feb 5, 2024

mountiny assigned hurali97 Feb 5, 2024

melvin-bot bot removed the Help Wanted Apply this label when an issue is open to proposals by contributors label Feb 5, 2024

melvin-bot bot removed the Overdue label Feb 5, 2024

trjExpensify mentioned this issue Feb 5, 2024

NewDot randomly hangs #35741

Closed

6 tasks

hayata-suenaga changed the title ~~[$500] [Wave 8] [Ideal Nav] Loading is too slow for Chat List and Workspace List~~ [$500] [Wave 8] [Ideal Nav] Chat List and Workspace List sometimes loads slowly & App sometimes hangs Feb 5, 2024

mallenexpensify unassigned cubuspl42 Apr 23, 2024

mallenexpensify added Internal Requires API changes or must be handled by Expensify staff and removed Internal Requires API changes or must be handled by Expensify staff labels Apr 23, 2024

melvin-bot bot changed the title ~~[$500] [Wave Collect] [Ideal Nav] Chat List and Workspace List sometimes loads slowly & App sometimes hangs~~ [$250] [Wave Collect] [Ideal Nav] Chat List and Workspace List sometimes loads slowly & App sometimes hangs Apr 23, 2024

melvin-bot bot assigned ntdiary Apr 23, 2024

mallenexpensify changed the title ~~[$250] [Wave Collect] [Ideal Nav] Chat List and Workspace List sometimes loads slowly & App sometimes hangs~~ [$500] [Wave Collect] [Ideal Nav] Chat List and Workspace List sometimes loads slowly & App sometimes hangs Apr 23, 2024

ntdiary removed their assignment Apr 24, 2024

melvin-bot bot added Weekly KSv2 Awaiting Payment Auto-added when associated PR is deployed to production and removed Weekly KSv2 labels Apr 29, 2024

melvin-bot bot changed the title ~~[$500] [Wave Collect] [Ideal Nav] Chat List and Workspace List sometimes loads slowly & App sometimes hangs~~ [HOLD for payment 2024-05-06] [$500] [Wave Collect] [Ideal Nav] Chat List and Workspace List sometimes loads slowly & App sometimes hangs Apr 29, 2024

lschurr added Daily KSv2 and removed Weekly KSv2 labels May 7, 2024

melvin-bot bot added the Overdue label May 10, 2024

melvin-bot bot removed the Overdue label May 10, 2024

lschurr added Weekly KSv2 and removed Daily KSv2 labels May 10, 2024

lschurr closed this as completed May 16, 2024

github-project-automation bot moved this from Release 1: Spring 2024 (May) to Done in [#whatsnext] #wave-collect May 16, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[HOLD for payment 2024-05-06] [$500] [Wave Collect] [Ideal Nav] Chat List and Workspace List sometimes loads slowly & App sometimes hangs #35704

[HOLD for payment 2024-05-06] [$500] [Wave Collect] [Ideal Nav] Chat List and Workspace List sometimes loads slowly & App sometimes hangs #35704

hayata-suenaga commented Feb 2, 2024 •

edited by melvin-bot bot

Loading

melvin-bot bot commented Feb 2, 2024

lschurr commented Feb 2, 2024

melvin-bot bot commented Feb 2, 2024

melvin-bot bot commented Feb 2, 2024

hayata-suenaga commented Feb 2, 2024

hurali97 commented Feb 5, 2024

melvin-bot bot commented Feb 5, 2024

hayata-suenaga commented Feb 5, 2024 •

edited

Loading

hurali97 commented Feb 6, 2024 •

edited

Loading

hayata-suenaga commented Feb 6, 2024

hurali97 commented Feb 7, 2024

quinthar commented Feb 7, 2024

hurali97 commented Feb 7, 2024

hurali97 commented Feb 8, 2024

melvin-bot bot commented Apr 23, 2024

mallenexpensify commented Apr 23, 2024

melvin-bot bot commented Apr 23, 2024

ntdiary commented Apr 24, 2024

melvin-bot bot commented Apr 29, 2024

melvin-bot bot commented Apr 29, 2024 •

edited by hayata-suenaga

Loading

hayata-suenaga commented Apr 30, 2024 •

edited

Loading

lschurr commented May 6, 2024 •

edited

Loading

lschurr commented May 7, 2024

lschurr commented May 10, 2024

lschurr commented May 16, 2024

[HOLD for payment 2024-05-06] [$500] [Wave Collect] [Ideal Nav] Chat List and Workspace List sometimes loads slowly & App sometimes hangs #35704

[HOLD for payment 2024-05-06] [$500] [Wave Collect] [Ideal Nav] Chat List and Workspace List sometimes loads slowly & App sometimes hangs #35704

Comments

hayata-suenaga commented Feb 2, 2024 • edited by melvin-bot bot Loading

Action Performed:

The Chat List and Workspace List loads sometimes loads too slow

The App sometimes hangs

Workaround:

Platforms:

Screenshots/Videos

melvin-bot bot commented Feb 2, 2024

lschurr commented Feb 2, 2024

melvin-bot bot commented Feb 2, 2024

melvin-bot bot commented Feb 2, 2024

hayata-suenaga commented Feb 2, 2024

hurali97 commented Feb 5, 2024

melvin-bot bot commented Feb 5, 2024

hayata-suenaga commented Feb 5, 2024 • edited Loading

hurali97 commented Feb 6, 2024 • edited Loading

hayata-suenaga commented Feb 6, 2024

hurali97 commented Feb 7, 2024

quinthar commented Feb 7, 2024

hurali97 commented Feb 7, 2024

hurali97 commented Feb 8, 2024

melvin-bot bot commented Apr 23, 2024

mallenexpensify commented Apr 23, 2024

melvin-bot bot commented Apr 23, 2024

ntdiary commented Apr 24, 2024

melvin-bot bot commented Apr 29, 2024

melvin-bot bot commented Apr 29, 2024 • edited by hayata-suenaga Loading

hayata-suenaga commented Apr 30, 2024 • edited Loading

lschurr commented May 6, 2024 • edited Loading

lschurr commented May 7, 2024

lschurr commented May 10, 2024

lschurr commented May 16, 2024

hayata-suenaga commented Feb 2, 2024 •

edited by melvin-bot bot

Loading

hayata-suenaga commented Feb 5, 2024 •

edited

Loading

hurali97 commented Feb 6, 2024 •

edited

Loading

melvin-bot bot commented Apr 29, 2024 •

edited by hayata-suenaga

Loading

hayata-suenaga commented Apr 30, 2024 •

edited

Loading

lschurr commented May 6, 2024 •

edited

Loading