Cannot store or restore security key occasionally for some users #27524

Closed
telmich opened this issue Jun 5, 2024 · 28 comments
Labels
A-E2EE A-OIDC A-Session-Mgmt Session / device names, management UI, etc. S-Major Severely degrades major functionality or product features, with no satisfactory workaround T-Defect

Comments

@telmich

telmich commented Jun 5, 2024

Steps to reproduce

This is a follow up from #17886

The following problems / flows exist:

  1. On login you get the infamous Upgrade encryption message
  2. In the desktop it is impossible to logout, because the session key is not stored on the server
  3. Element-web goes into an infinite loop on the sync endpoint after we try to submit the security key for restoring the backup

The affected setups have the following in common:

  • Reverse proxy in front of nginx in front of synapse
  • SSO enabled using OIDC (using ADFS and keycloak)

The strange things:

  • It does not affect every user
  • In one setup even after a synapse database reset the issue is reappearing instantly

Outcome

What did you expect?

The security key should be accepted and usable.

What happened instead?

Sessions are lost / the security key cannot be stored.

Operating system

Any

Browser information

Any

URL for webapp

At least 2 private ones

Application version

v1.11.39 and 1.11.65 have both been confirmed

Homeserver

Multiple, affected are at least: v1.89.0 and 1.105.1

Will you send logs?

Yes

@telmich
Author

telmich commented Jun 5, 2024

Also affected platforms are:

  • Google Pixel (element android) 1.6.14
  • Apple iOS (current / latest version): 1.11.10

@dosubot dosubot bot added A-E2EE A-OIDC A-Session-Mgmt Session / device names, management UI, etc. S-Major Severely degrades major functionality or product features, with no satisfactory workaround labels Jun 5, 2024
@telmich
Author

telmich commented Jun 5, 2024

pinging @dbkr and @richvdh from the last bug - I am not entirely sure whether this is a synapse or element bug, but we don't see any errors in the synapse logs when this happens

@richvdh
Member

richvdh commented Jun 5, 2024

In the desktop it is impossible to logout, because the session key is not stored on the server

I don't understand what this means, at all. What exactly happens when you try to log out?

Please send a rageshake demonstrating the problem.

@t3chguy
Member

t3chguy commented Jun 5, 2024

Will you send logs?
Yes

Not seeing any logs from you

@t3chguy t3chguy added the X-Needs-Info This issue is blocked awaiting information from the reporter label Jun 5, 2024
@telmich
Author

telmich commented Jun 6, 2024

good morning @t3chguy - I am currently trying to figure out how to send the logs from the desktop app or from the web app, as this is not phone specific. I'll try to upload logs today.

@t3chguy
Member

t3chguy commented Jun 6, 2024

The issue template instructs you

image

@telmich
Author

telmich commented Jun 6, 2024

Ok, I just did the following:

  • Tried to login to the home server using element-desktop 1.11.67
  • On login, after the OIDC login, I get redirected back to element-desktop (good!)
  • Then I am prompted for the security key
  • Every time I enter the correct key, the popup comes back - as a user it looks like the input was not accepted
  • Then, after trying twice, I skipped entering the security key and tried to enter /rageshake, which resulted in an unknown command error

2024-06-06-112421_768x832_scrot

@t3chguy
Member

t3chguy commented Jun 6, 2024

Sounds like your desktop build doesn't have rageshaking enabled. Where did you install it from?

@telmich
Author

telmich commented Jun 6, 2024

It's from the alpine linux repository, I installed it using apk add element-desktop

@telmich
Author

telmich commented Jun 6, 2024

I now tried using element-web 1.11.65 and I get exactly the same error. element-web is loaded from the docker image into firefox.

2024-06-06-113156_1015x934_scrot

@telmich
Author

telmich commented Jun 6, 2024

In both cases the behaviour is identical:

  • on login the security key is not accepted even though it is correct and there is no error message, the input field just re-appears

If it is helpful I can use the developer console beforehand in firefox/chromium.

@t3chguy
Member

t3chguy commented Jun 6, 2024

It's from the alpine linux repository, I installed it using apk add element-desktop

This package isn't maintained by us, nor do we track issues for it

@t3chguy
Member

t3chguy commented Jun 6, 2024

I now tried using element-web 1.11.65 and I get exactly the same error. element-web is loaded from the docker image into firefox.

Your config.json lacks the bug reporting config then, it'd help if you can reproduce on app.element.io which bears it, or add it to your own.
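
The /rageshake command and the bug report dialog depend on a bug report endpoint being configured. A minimal sketch of adding it to an existing config.json, assuming the relevant key is `bug_report_endpoint_url` (as described in element-web's config documentation) and using the submit URL that shows up in the browser in this thread. `enable_bug_reports` is a hypothetical helper for illustration, not part of element-web:

```python
import json

# URL as seen in this thread; a self-hosted rageshake server could be used instead.
BUG_REPORT_URL = "https://element.io/bugreports/submit"


def enable_bug_reports(config_text: str, url: str = BUG_REPORT_URL) -> str:
    """Return config.json text with bug_report_endpoint_url set.

    Hypothetical helper: all other keys are kept as they are.
    """
    config = json.loads(config_text)
    config["bug_report_endpoint_url"] = url
    return json.dumps(config, indent=4) + "\n"


if __name__ == "__main__":
    # matrix.example.org is a placeholder homeserver, not one from the thread.
    sample = (
        '{"default_server_config": '
        '{"m.homeserver": {"base_url": "https://matrix.example.org"}}}'
    )
    print(enable_bug_reports(sample))
```

With a docker deployment the patched file would then be mounted over /app/config.json, and the browser tab reloaded so the new config is fetched.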

@telmich
Author

telmich commented Jun 6, 2024

@t3chguy I'll modify the included config.json on it and come back with the report asap.

So far I have encountered the bug in 2 installations, both using Docker-based images, one running on Docker, one running in k8s. Both are inside private networks, and both are behind TLS-terminating proxies.

In any case, I'll try to get rageshake from element-web in the next hours.

@t3chguy
Member

t3chguy commented Jun 6, 2024

The way you host element-web should have no impact given it is an SPA and runs entirely in the browser, the docker image is just an nginx + element-web tarball

@telmich
Author

telmich commented Jun 6, 2024

Just changed the element-web config, tried to upload two logs to this bug report now, but don't see them appearing here.

From the browser it seems that a post to https://element.io/bugreports/submit was successful.

@t3chguy
Member

t3chguy commented Jun 6, 2024

They are uploaded to a private repo, I can see the logs, thanks.

@t3chguy t3chguy removed the X-Needs-Info This issue is blocked awaiting information from the reporter label Jun 6, 2024
@telmich
Author

telmich commented Jun 6, 2024

Thanks for the feedback!

@t3chguy
Member

t3chguy commented Jun 6, 2024

You seem to be using a version from a month and a half ago, any chance of re-testing on latest? Given you are using the Rust crypto stack and it has frequent updates, it has the potential to make a difference.

@telmich
Author

telmich commented Jun 6, 2024

@t3chguy In those environments I can easily deploy docker images and upgrade, but I cannot easily deploy from source. If there is an image I can swap in, that would be a no-brainer.

Do you mean just upgrading to v1.11.68 ?

This can be done within a few hours today.

@telmich
Author

telmich commented Jun 6, 2024

I just tried upgrading to v1.11.68 and it fails with:

ERROR: for elementweb  'ContainerConfig'
Traceback (most recent call last):
  File "docker-compose", line 3, in <module>
  File "compose/cli/main.py", line 81, in main
  File "compose/cli/main.py", line 203, in perform_command
  File "compose/metrics/decorator.py", line 18, in wrapper
  File "compose/cli/main.py", line 1189, in up
  File "compose/cli/main.py", line 1185, in up
  File "compose/project.py", line 702, in up
  File "compose/parallel.py", line 108, in parallel_execute
  File "compose/parallel.py", line 206, in producer
  File "compose/project.py", line 688, in do
  File "compose/service.py", line 581, in execute_convergence_plan
  File "compose/service.py", line 503, in _execute_convergence_recreate
  File "compose/parallel.py", line 108, in parallel_execute
  File "compose/parallel.py", line 206, in producer
  File "compose/service.py", line 496, in recreate
  File "compose/service.py", line 615, in recreate_container
  File "compose/service.py", line 334, in create_container
  File "compose/service.py", line 922, in _get_container_create_options
  File "compose/service.py", line 962, in _build_container_volume_options
  File "compose/service.py", line 1549, in merge_volume_bindings
  File "compose/service.py", line 1579, in get_container_data_volumes
KeyError: 'ContainerConfig'
[4092] Failed to execute script docker-compose

relevant docker-compose config:

services:
  elementweb:
    image: vectorim/element-web:${ELEMENT_VERSION}
    volumes:
      - ./config.json:/app/config.json
    ports:
      - "8008:80/tcp"
    restart: unless-stopped

where .env contains:

ELEMENT_VERSION=v1.11.68
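
The traceback ends in a KeyError for 'ContainerConfig' while docker-compose is recreating an existing container and merging its volume bindings. A rough sketch of that failure mode, under the assumption that the failing frame reads image metadata as a plain dict and indexes a key that is absent for the newer image. Both function names are hypothetical and this is not docker-compose's actual code:

```python
def data_volumes_strict(image_inspect: dict) -> dict:
    """Index the key directly; raises KeyError when the image metadata lacks it."""
    return image_inspect["ContainerConfig"].get("Volumes") or {}


def data_volumes_tolerant(image_inspect: dict) -> dict:
    """Fall back to the image's 'Config' section, then to no volumes at all."""
    section = image_inspect.get("ContainerConfig") or image_inspect.get("Config") or {}
    return section.get("Volumes") or {}


if __name__ == "__main__":
    # Assumed shape of image metadata without a 'ContainerConfig' entry.
    newer_image = {"Config": {"Volumes": None}}
    print(data_volumes_tolerant(newer_image))
    try:
        data_volumes_strict(newer_image)
    except KeyError as exc:
        print("KeyError:", exc)
```

Since the lookup sits on the recreate path in the traceback, removing the old container before starting the new version may sidestep it, though that is untested here.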

@telmich
Author

telmich commented Jun 6, 2024

Same issue as above with 1.11.66, 1.11.67. 1.11.65 is the last element-web version that starts.

@t3chguy
Member

t3chguy commented Jun 6, 2024

image

It works fine here so seems the issue is in your env.

@telmich
Author

telmich commented Jun 7, 2024

Verified on another system, 1.11.68 runs there. Now trying to get the problematic deployment up for testing.

@crjo

crjo commented Jun 7, 2024

hi seems we have the same issue, for all our users, have you tried without oidc?

@telmich
Author

telmich commented Jun 7, 2024

I have just been able to reproduce the same error on 1.11.68. I have uploaded the logs via rageshake.

The flow is as follows:

  • encryption upgrade available popup appears right after loading

2024-06-07-163826_356x155_scrot

  • Selecting upgrade, getting the security key prompt

2024-06-07-163832_778x381_scrot

  • entering the security key -> UI says looks good, pressing continue

2024-06-07-163839_781x389_scrot

  • Back on the same screen as before

2024-06-07-163847_803x403_scrot

This can be repeated any number of times.

@telmich
Author

telmich commented Jun 7, 2024

hi seems we have the same issue, for all our users, have you tried without oidc?

No, as the authentication is only available with OIDC.

We do however have 2 test cases:

  • OIDC with ADFS (Microsoft)
  • OIDC with keycloak

In total we actually have 4 systems that are similar, however not all of them show the same problem (just yet):

a)

  • OIDC with ADFS (Microsoft) - staging: not affected
  • OIDC with ADFS (Microsoft) - production: affected

b)

  • OIDC with keycloak - staging: affected
  • OIDC with keycloak - production: not affected

The two (a) systems and (b) systems are configured almost identically, just different authentication / authorization endpoints, but same software, same network, same proxies, etc.

For the (b) case we even tried to reset the database of synapse (it's a staging system, so not a problem) and the issue re-appears instantly.

For the (a) case we initially noticed the issue only sporadically, with a single user. One way of temporarily fixing it is to remove all session keys, reset the secure backup and start fresh for the user. However, this does not fix the problem permanently.

I suspect that this is actually 2 bugs and not one:

  • Some bug in synapse potentially handling the secure key backup incorrectly or a race condition
  • UI/UX issue in handling the race condition falling into an endless loop

@richvdh
Member

richvdh commented Jun 11, 2024

The "upgrade your encryption" dialog is completely broken; it needs removing, and that work is tracked at #27455.

There is a question about why it is being shown at all; it's likely because key backup has been set up but the key was not correctly uploaded to 4S during a previous session. #27253 is possibly related. Unfortunately we don't seem to have any logs demonstrating that.
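
One way to check the "key not uploaded to 4S" theory for an affected user is to look at their account data. A minimal sketch, assuming the layout from the Matrix spec's secret storage section: the default 4S key ID lives in `m.secret_storage.default_key`, and a stored backup key appears as account data of type `m.megolm_backup.v1` with an `encrypted` entry per key ID. `backup_key_in_4s` is a hypothetical helper, and the input is assumed to be a mapping of event type to content fetched beforehand:

```python
BACKUP_SECRET = "m.megolm_backup.v1"
DEFAULT_KEY_EVENT = "m.secret_storage.default_key"


def backup_key_in_4s(account_data: dict) -> bool:
    """Return True if the backup key is stored under the default 4S key.

    account_data maps account data event types to their content dicts.
    """
    default_key_id = account_data.get(DEFAULT_KEY_EVENT, {}).get("key")
    if not default_key_id:
        # Secret storage has no default key, so nothing can be stored under it.
        return False
    encrypted = account_data.get(BACKUP_SECRET, {}).get("encrypted", {})
    return default_key_id in encrypted


if __name__ == "__main__":
    # Placeholder data: 4S is set up, but the backup key was never uploaded.
    example = {DEFAULT_KEY_EVENT: {"key": "abc123"}}
    print(backup_key_in_4s(example))
```

A False result for a user who does have a key backup would be consistent with the explanation above, but it does not show how the account got into that state.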

I'm going to go ahead and close this in favour of #27455, because it's quite unclear what the actual repro steps and symptoms are, and I think most of it is covered by #27455.

@richvdh richvdh closed this as completed Jun 11, 2024