Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rageshake only reports a tiny number of logs these days #26532

Closed
ara4n opened this issue Nov 9, 2023 · 5 comments · Fixed by matrix-org/matrix-react-sdk#12003
Closed

Rageshake only reports a tiny number of logs these days #26532

ara4n opened this issue Nov 9, 2023 · 5 comments · Fixed by matrix-org/matrix-react-sdk#12003
Assignees
Labels
T-Task Tasks for the team like planning

Comments

@ara4n
Copy link
Member

ara4n commented Nov 9, 2023

Steps to reproduce

  1. Submit a rageshake
  2. Discover it's only 5MB large, and contains 5 minutes of logs on a busy account which is having encryption problems
  3. Feel frustrated

Outcome

What did you expect?

24h of logs, capped at 50MB or 100MB or some other high limit.

What happened instead?

5 minutes of logs, capped at 5MB.

Operating system

No response

Application version

Element Nightly version: 2023110701

How did you install the app?

No response

Homeserver

No response

Will you send logs?

No

@ara4n ara4n added the T-Defect label Nov 9, 2023
@t3chguy
Copy link
Member

t3chguy commented Nov 9, 2023

https://github.com/matrix-org/matrix-react-sdk/blob/985bde70c5b442c5fbafaf70d107159398699d4c/src/rageshake/rageshake.ts#L49 should be trivial to change, though wonder if it may cause performance degradation given we already see perf issues around IDB

@t3chguy t3chguy added T-Enhancement T-Task Tasks for the team like planning and removed T-Defect T-Enhancement labels Nov 9, 2023
@richvdh
Copy link
Member

richvdh commented Nov 9, 2023

This is odd. It's been 5MB for years afaict. I'd like to understand what changed.

@ara4n
Copy link
Member Author

ara4n commented Nov 9, 2023

i think the thing that has changed is:

@richvdh
Copy link
Member

richvdh commented Nov 9, 2023

There certainly exist rageshakes in the past which contain significantly more than 5MB (up to 10MB in some cases), but they do seem to be in the tiny minority, so I guess this really is as simple as "we're logging more stuff now".

Looking at a sample rageshake from Matthew (5 492 255 bytes altogether):

  • 1 229 302 bytes of: Error decrypting event (id=<event_id> type=m.room.encrypted sender=<sender> room=<room_id> ts=<ts>): DecryptionError[msg: <cause>, session: <sender key>|<session id>]
  • 760 673 bytes of:
    WARN matrix_sdk_crypto::machine: Failed to decrypt a room event: Can't find the room key to decrypt the event, withheld code: None
        at /home/runner/.cargo/git/checkouts/matrix-rust-sdk-1f4927f82a3d27bb/7440ce0/crates/matrix-sdk-crypto/src/machine.rs:1513
        in matrix_sdk_crypto::machine::decrypt_room_event with room_id="<room_id>" sender="<sender>" event_id="<event_id>" algorithm="m.megolm.v1.aes-sha2" sender_key="<key>" session_id="<session>"
    
  • 642 873 bytes due to the warning in remove unhelpful logging about events without age params. matrix-org/matrix-js-sdk#3861
  • 548 422 bytes of Attempting decryption of event <event_id> in <room_id> from <sender>
  • 486 876 bytes of REST requests (logging for which was recently added in Improve logging of http requests to aid debugging matrix-org/matrix-js-sdk#3485)
  • 453 600 bytes of:
    DEBUG matrix_sdk_crypto::machine: Received a to-device event
        at /home/runner/.cargo/git/checkouts/matrix-rust-sdk-1f4927f82a3d27bb/7440ce0/crates/matrix-sdk-crypto/src/machine.rs:1045
        in matrix_sdk_crypto::machine::receive_to_device_event with sender="<sender>" event_type="<type>" 
        in matrix_sdk_crypto::machine::receive_sync_changes
    
  • 396 952 bytes of:
    WARN matrix_sdk_crypto::identities::device: Trying to encrypt an event for a device, but no Olm session is found.
        at /home/runner/.cargo/git/checkouts/matrix-rust-sdk-1f4927f82a3d27bb/7440ce0/crates/matrix-sdk-crypto/src/identities/device.rs:789
        in matrix_sdk_crypto::identities::device::encrypt with recipient=<recipient> recipient_device=<device> recipient_key=Some("<key>") message_id="<uuid>"
    
  • 163 694 bytes of Not checking key backup for session <session_id>
  • 809 863 bytes of everything else

Basically, I think this is a particular problem because you have a non-functional backup, and Seshat frantically trying to decrypt undecryptable things: as in, this is a symptom of other problems rather than a new problem.

That said, changing the rageshake cap to have an age factor seems like a good plan. I'm hesitant to impose an absolute cap at 24 hours, since we often find useful logs from less frantic accounts from a few days ago. Maybe we could say something like:

  • First we expire logs until either there are less than 24 hours' worth, or the total volume is less than 5MB
  • Then, if there are still more than 100MB of logs, we expire logs until there are less than 100MB.

@ara4n
Copy link
Member Author

ara4n commented Nov 10, 2023

that sounds like a great plan - thanks for digging into it.

@richvdh richvdh self-assigned this Dec 5, 2023
richvdh added a commit to matrix-org/matrix-react-sdk that referenced this issue Dec 6, 2023
Currently, the rageshake store keeps up to 5M of logs. In a busy session for a power-user, that can be quite a short time.

This PR changes the store so that we will keep up to 100M, if they are less than 24H old. (Any logs older than 24H, of more than 5M, will continue to be dropped.)

Fixes element-hq/element-web#26532.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
T-Task Tasks for the team like planning
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants