Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The roomlist requires an extra message to reorder & highlight in encrypted rooms since 1.10.x #20859

Closed
benbz opened this issue Feb 2, 2022 · 17 comments · Fixed by matrix-org/matrix-react-sdk#7813
Assignees
Labels
A-Room-List O-Occasional Affects or can be seen by some users regularly or most users rarely S-Critical Prevents work, causes data loss and/or has no workaround T-Defect X-Regression

Comments

@benbz
Copy link
Member

benbz commented Feb 2, 2022

Steps to reproduce

  1. Receive a single message in an encrypted room (DMs appear to be unaffected)
  2. No highlighting of the room in the room list and the room isn't reordered. If the room is in a space, the space does not gain a dot
  3. Even if the message included a mention the room list is not updated despite an alert being fired
    1. This makes it very difficult to find the room where you've been pinged if you miss the alert
  4. Send another message. The room list is reordered and the room highlighted as if the 1st message had just been processed. If the room is in a space, the space now gains a dot
  5. This only started happening since upgrading to 1.10.x.. Clearing of the cache has been attempted

Outcome

What did you expect?

Unread messages in encrypted rooms cause the room to be reordered & highlighted as appropriate in the roomlist immediately

What happened instead?

Message previews for the room update, but the room is not reordered and not highlighted in the roomlist until a further message is sent.

Operating system

Ubuntu Linux

Application version

1.10.1

How did you install the app?

Debian

Homeserver

banzan.uk

Will you send logs?

Yes

@benbz benbz added the T-Defect label Feb 2, 2022
@dbkr dbkr added O-Occasional Affects or can be seen by some users regularly or most users rarely S-Major Severely degrades major functionality or product features, with no satisfactory workaround X-Regression A-Room-List labels Feb 2, 2022
@turt2live
Copy link
Member

I've been seeing this too. I'm unsure what semantics cause a room to behave this way, and visiting the room does not appear to fix it. Below is what I've managed to find so far.

I've had it happen in unencrypted rooms too, though at less frequency. It can affect multiple rooms at a time.

The room list does reorder to accept the message when using importance ordering, but in the idle rooms section.

Read markers and/or receipts are not sent in the background either (thus doesn't appear to be a store losing state).

If the message is a mention, a notification noise is played despite not appearing as such. Unsure on desktop notifications - those are disabled on my machine.

It can happen in web or desktop.

This started happening this week on nightly.

Neither us or synapse have seemingly made changes in the area. I'm the only user on my homeserver.

Things yet to check are:

  • Whether the counts in /sync are affected
  • Whether the notification state classes are stuck
  • Whether push rule evaluation is no longer reliable in web
  • Whether reloading the app helps or not
  • Whether mobile is affected

Unfortunately I haven't seen any logging on web or synapse to indicate a problem.

@turt2live
Copy link
Member

Just for more added context:

  • Sometimes a second message fixes it, often it doesn't (in my case)
  • Notifications do seemingly appear eventually. After switching spaces and playing with notification levels on unrelated rooms one of my rooms from the starting space upgraded to a mention. The mention was otherwise delayed by about 8.5 hours.

@turt2live turt2live added S-Critical Prevents work, causes data loss and/or has no workaround and removed S-Major Severely degrades major functionality or product features, with no satisfactory workaround labels Feb 2, 2022
@turt2live
Copy link
Member

I can't get subsequent messages to be reliable in the rooms, but I have found that space hopping appears to kick things into working for some reason. It can take a few hops before rooms start to appear as unread.

@germain-gg
Copy link
Contributor

I don't think I can reproduce this issue either. However I was chatting about the problem with James earlier today and I believe some of the changes in https://github.com/matrix-org/matrix-js-sdk/pull/2132/files could have cause some disturbance here.
I've created some async operation in the SyncAPI which could have an effect here? @turt2live it would be good to test that theory together and I can come up with a plan if that theory is proven correct

@turt2live
Copy link
Member

@gsouquet that looks quite plausible. Is the way to test it to disable threading in labs and see if it helps?

@germain-gg
Copy link
Contributor

You definitely can do that. But I'm afraid that because I made some of those functions async and have thrown an await in there it will still hit the same issue.
Maybe if you can run the version right before this PR that would help pinpoint whether that's the root of the issue or not.

Regardless I will be playing around with that area and attempt to make it all synchronous as it used to be

@turt2live
Copy link
Member

I don't think it's feasible to downgrade web or nightly like that, unfortunately.

@germain-gg germain-gg self-assigned this Feb 3, 2022
@turt2live turt2live self-assigned this Feb 10, 2022
@turt2live
Copy link
Member

I'm taking Cannot Reproduce off of this because it's being encountered daily by people (many of which don't realize it until it's too late)

@turt2live
Copy link
Member

from my debugging a week ago:

If you have importance ordering (room list ordered by activity and "unread messages first" ticked), then the following can be used to identify problematic rooms:

(function () {
    // Assuming you're using Importance ordering on the tag
    const alg = mxRoomListStore.algorithm.algorithms["im.vector.fake.recent"];
    //const categorized = alg.categorizeRooms(mxMatrixClientPeg.get().getVisibleRooms());
    const categorized = alg.categorizeRooms(mxRoomListStore.orderedLists["im.vector.fake.recent"]);

    const unsent = categorized[4];
    const red = categorized[3];
    const grey = categorized[2];
    const bold = categorized[1];
    const none = categorized[0];

    for (const room of unsent) {
        if (room.getPendingEvents() <= 0) {
            console.error(`@@ Failed: ${room.roomId} is UNSENT with no pending events`);
        }
    }

    for (const room of red) {
        if (room.getUnreadNotificationCount('highlight') <= 0) {
            console.error(`@@ Failed: ${room.roomId} is RED with no highlted events`);
        }
    }

    for (const room of grey) {
        if (room.getUnreadNotificationCount('total') <= 0) {
            console.error(`@@ Failed: ${room.roomId} is GREY with no events`);
        }
    }

    const myUserId = mxMatrixClientPeg.get().getUserId();
    for (const room of bold) {
        const readUpTo = room.getEventReadUpTo(myUserId);
        const lastEvent = room.timeline[room.timeline.length - 1];
        if (lastEvent.getSender() == myUserId) continue;

        // flip through the last events in the room
        let found = false;
        for (let i = room.timeline.length - 1; i >= 0; --i) {
            const ev = room.timeline[i];
            if (ev.getId() == readUpTo) {
                found = true;
                break;
            } else if (ev.getType() === 'm.room.message' && !ev.isRelation('m.replace')) {
                break;
            }
        }
        if (found) {
            console.error(`@@ Failed: ${room.roomId} is BOLD with no events`);
        }
    }

    for (const room of none) {
        const readUpTo = room.getEventReadUpTo(myUserId);
        const lastEvent = room.timeline[room.timeline.length - 1];
        if (lastEvent.getSender() == myUserId) continue;

        // flip through the last events in the room
        let found = false;
        for (let i = room.timeline.length - 1; i >= 0; --i) {
            const ev = room.timeline[i];
            if (ev.getId() == readUpTo) {
                found = true;
                break;
            } else if (ev.getType() === 'm.room.message' && !ev.isRelation('m.replace')) {
                break;
            }
        }
        if (!found) {
            console.error(`@@ Failed: ${room.roomId} is IDLE with unread events`);
        }
    }
})();

It's not perfect and doesn't handle a ton of edge cases, but it's good enough to inspect the room IDs it spat out and see if it was encountering issues. One notable edge case is that membership/state events sent to the room which would not normally cause unread badges can cause the script to complain about that room (these rooms are irrelevant to the issue at hand - look for message events which cause problems).

This doesn't really point to a cause yet, but it at least indicates that the notification state classes are holding the wrong value for the room, for some reason.

@novocaine
Copy link
Contributor

novocaine commented Feb 14, 2022

I'm taking Cannot Reproduce off of this because it's being encountered daily by people (many of which don't realize it until it's too late)

I marked it as X-Cannot-Reproduce because we don't know the reproduction steps, even if it's occurring regularly. This means the issue isn't yet solvable and we need to work to establish those steps.

It is marked as O-Occasional which designates frequency.

@novocaine novocaine added X-Cannot-Reproduce O-Occasional Affects or can be seen by some users regularly or most users rarely and removed O-Occasional Affects or can be seen by some users regularly or most users rarely labels Feb 14, 2022
@jaywink
Copy link
Member

jaywink commented Feb 15, 2022

I've also been seeing similar symptoms, though immediately reading this I can't say if it is only encrypted or also unencrypted rooms - though reading above that doesn't seem to be a defining factor for everyone. To me, it feels like rooms are ordered correctly as I've set them (ie based on activity). But the room just shows as "read" when I haven't actually read the room (which is also visible by going to the room and seeing my read counter above the latest message which should have marked the room unread for me. Whether a second message fixes the room status I can't say from my experience.

I did a rageshake earlier for this: https://github.com/matrix-org/element-web-rageshakes/issues/10774

@dbkr
Copy link
Member

dbkr commented Feb 15, 2022

I can now reproduce this for threaded replies, but not for non-threaded:

  1. Have clients A and B who share a room (room 1) with nobody else. Have another room on A too (room 2)
  2. Send enough messages from B into room 1 to overflow the initial sync amount (30 ought to do it)
  3. Switch to room 2 to client A
  4. Clear cache & reload on A
  5. Send a threaded reply from B to the earliest of the ~30 messages you sent
  6. Observe that A makes the notification sound but no badge appears.

This was trying to exercise this code path: https://github.com/matrix-org/matrix-js-sdk/pull/2132/files#diff-1b0cb46cd1fcdaacf73c771bf7eed6e00a12ee030da1dbcca4ae12842e1ecd81R1310 so would suggest there might a bug (race?) there. No idea how this could cause the issue for non-threaded replies though.

@turt2live
Copy link
Member

@dbkr would that path be exercised on clients with threads disabled too? Something that's been troubling is that there's a lot of red flags in the threads paths, but with threads disabled they shouldn't be called (yet the issue persists)

@dbkr
Copy link
Member

dbkr commented Feb 15, 2022

No, that shouldn't be called at all if threads are disabled

@dbkr
Copy link
Member

dbkr commented Feb 15, 2022

I've just found a much simpler repro case for this bug, or at least some variant of it:

[steps 1-3 as previous, except room 1 must be encrypted]
4. Mention client A from client B
5. Observe grey non-highlight notification on room instead of red
6. Send another message from B
7. Badge goes red on client A

@dbkr
Copy link
Member

dbkr commented Feb 15, 2022

Bingo. I think this was caused by matrix-org/matrix-react-sdk#7635 - specifically https://github.com/matrix-org/matrix-react-sdk/pull/7635/files#diff-9253e0a966fcccd855eb6ad3a2e0d1257b6d8e479f34fefbadaf4778c4b972aeR74 which adds a second 'room' argument to the handler, but the handler is also used for Event.decrypted which doesn't have a room parameter. Since the room is optional-ed, it effectively disables this handler entirely.

@dbkr
Copy link
Member

dbkr commented Feb 15, 2022

I'm removing the cannot-repro label now since we have a concrete repro case. However, this is only for encrypted rooms and there are reports of this happening for unencrypted room too. Since the original report was specific to encrypted rooms, I'm going to consider this to be that bug and we can open other issues as necessary.

dbkr added a commit to matrix-org/matrix-react-sdk that referenced this issue Feb 15, 2022
dbkr added a commit to matrix-org/matrix-react-sdk that referenced this issue Feb 15, 2022
* Fix delayed badge update for mentions in encrypted rooms

Fixes element-hq/element-web#20859

More detail on the issue

* Remove unused import

* Fix listener removal
dbkr added a commit to matrix-org/matrix-react-sdk that referenced this issue Feb 16, 2022
dbkr added a commit to matrix-org/matrix-react-sdk that referenced this issue Feb 17, 2022
dbkr added a commit to matrix-org/matrix-react-sdk that referenced this issue Feb 17, 2022
* Fix delayed badge update for mentions in encrypted rooms

Fixes element-hq/element-web#20859

More detail on the issue

* Remove unused import

* Fix listener removal
dbkr added a commit to matrix-org/matrix-react-sdk that referenced this issue Feb 17, 2022
* Fix delayed badge update for mentions in encrypted rooms

Fixes element-hq/element-web#20859

More detail on the issue

* Remove unused import

* Fix listener removal
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-Room-List O-Occasional Affects or can be seen by some users regularly or most users rarely S-Critical Prevents work, causes data loss and/or has no workaround T-Defect X-Regression
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants