This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

federation sender doesn't work when invoked as synapse.app.generic_worker #8015

Closed
maranda opened this issue Aug 1, 2020 · 12 comments
Labels
A-Workers Problems related to running Synapse in Worker Mode (or replication)

Comments

@maranda

maranda commented Aug 1, 2020

Description

As in the title: I followed the documentation closely to set up the worker, but when a message is propagated over federation, remote servers answer with 403s.

If I leave federation sending to the master instance, there are no issues.

Steps to reproduce

  • Send message in a room
  • Message appears locally but is not replicated over the network
  • Remote servers answer 403s to federation transactions

Messages appear locally but not remotely, and the following warnings appear in the logs:
2020-08-01 11:29:46,696 - synapse.http.matrixfederationclient - 536 - WARNING - federation_transaction_transmission_loop-4108 - {PUT-O-3160} [***] Request failed: PUT matrix://***/_matrix/federation/v1/send/1596280923582: HttpResponseException("403: b'Forbidden'")
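
For anyone triaging a similar report, warnings like the one above can be tallied by HTTP status with a short script. This is a minimal sketch, assuming the log format matches the single line quoted in this issue; `count_failed_statuses` and the regular expression are illustrative names, not part of Synapse.

```python
import re
from collections import Counter
from typing import Iterable

# Matches the "Request failed" warning emitted by
# synapse.http.matrixfederationclient, as quoted in this issue.
# Assumption: other Synapse versions may format this line differently.
FAILED_REQUEST = re.compile(
    r'Request failed: (?P<method>[A-Z]+) (?P<uri>\S+): '
    r'HttpResponseException\("(?P<status>\d{3}):'
)


def count_failed_statuses(lines: Iterable[str]) -> Counter:
    """Count failed outbound federation requests per HTTP status code."""
    counts: Counter = Counter()
    for line in lines:
        match = FAILED_REQUEST.search(line)
        if match:
            counts[match.group("status")] += 1
    return counts
```

Feeding it the lines of the homeserver log (for example, `count_failed_statuses(open(path))` with `path` pointing at wherever the log is written) gives a quick view of whether the failures are all 403s or a mix of statuses.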

Version information

  • Homeserver: aria-net.org

If not matrix.org:

  • Version: 1.18

  • Install method: Ubuntu packages

  • Platform:
    Hyper-V machine with 16GB of RAM and eight virtual processors
    Synapse running with 1+3 workers (2 general, 1 federation sender) with PostgreSQL and Redis, managed via systemd
    OS: Ubuntu server 20.04

@maranda
Author

maranda commented Aug 1, 2020

I found the issue.

Basically, https://github.com/matrix-org/synapse/blob/develop/docs/systemd-with-workers/system/matrix-synapse-worker%40.service will only start up the generic worker. I didn't really look inside the script, so I didn't notice it.

I think the documentation should state clearly that the script has to be modified to use other worker types, e.g. synapse.app.federation_sender.

@erikjohnston erikjohnston added the A-Docs things relating to the documentation label Aug 3, 2020
@erikjohnston
Member

Thanks for the bug report! Part of me feels like we should just make everything use the same app to avoid this problem, but until then, yeah, let's just fix the documentation 👍

@richvdh
Member

richvdh commented Aug 3, 2020

I'd so much rather fix the app than the docs...

@richvdh
Member

richvdh commented Oct 14, 2020

I'm confused by this. Why does the service file need changing? Why doesn't it work when it is started with -m synapse.app.generic_worker?

@maranda
Author

maranda commented Oct 14, 2020

@richvdh it's not that the worker didn't start; it's that the generic worker didn't work at all in the federation sender role. I had to start synapse.app.federation_sender directly via the systemd service configuration files to rectify that.

@richvdh
Member

richvdh commented Oct 14, 2020

yes; I'm asking why that is the case. AFAICT the code that gets run is identical.

@richvdh
Member

richvdh commented Oct 14, 2020

(that's mostly a question for the dev team!)

@clokep
Member

clokep commented Oct 14, 2020

There's still some special-cased code that needs to get cleaned up:

    if config.worker_app == "synapse.app.federation_sender":
        if config.worker.send_federation:
            sys.stderr.write(
                "\nThe send_federation must be disabled in the main synapse process"
                "\nbefore they can be run in a separate worker."
                "\nPlease add ``send_federation: false`` to the main config"
                "\n"
            )
            sys.exit(1)

        # Force the pushers to start since they will be disabled in the main config
        config.worker.send_federation = True
    else:
        # For other worker types we force this to off.
        config.worker.send_federation = False
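
Read as a pure function, the branch above amounts to the following. This is only a sketch for illustration: `resolve_send_federation` is a hypothetical helper, not a Synapse function, and it raises an exception where the quoted code writes to stderr and exits.

```python
from typing import Optional

FEDERATION_SENDER_APP = "synapse.app.federation_sender"


def resolve_send_federation(worker_app: Optional[str], send_federation: bool) -> bool:
    """Return the effective send_federation value for a worker process.

    Hypothetical helper mirroring the special case quoted above:
    only the value of worker_app from the config is consulted.
    """
    if worker_app == FEDERATION_SENDER_APP:
        if send_federation:
            # The quoted code prints a message and calls sys.exit(1) here.
            raise ValueError(
                "send_federation must be disabled in the main config "
                "before it can run in a separate worker"
            )
        # The dedicated sender app forces federation sending on.
        return True
    # Every other worker_app value forces federation sending off.
    return False
```

Under this reading, a process whose configured `worker_app` is anything other than `synapse.app.federation_sender` ends up with `send_federation` forced off. Whether that accounts for the 403s reported here is not established in the thread.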

@erikjohnston
Member

Hmm, actually starting it with -m synapse.app.generic_worker should work? It needs worker_app: synapse.app.federation_sender set in the config though: https://github.com/matrix-org/synapse/blob/develop/synapse/app/generic_worker.py#L952-L963 . Maybe we're doing something cunning somewhere that ensures we're setting worker_app to the same as the started app?

@maranda
Author

maranda commented Oct 14, 2020

@erikjohnston worker_app: synapse.app.federation_sender was and is set in the config, but something breaks when the generic worker loads the federation sender app. That's the problem. I had to resort to modifying the service files to load the app directly to work around it.

@richvdh
Member

richvdh commented Oct 14, 2020

basically, I don't think this is a documentation bug. If it exists at all, it's an implementation bug.

@richvdh richvdh removed the A-Docs things relating to the documentation label Oct 14, 2020
@richvdh richvdh changed the title from "Offloading federation sender to a worker doesn't work" to "federation sender doesn't work when invoked as synapse.app.generic_worker" Oct 14, 2020
@clokep clokep added the A-Workers Problems related to running Synapse in Worker Mode (or replication) label Oct 15, 2020
@clokep
Member

clokep commented Feb 24, 2021

We think that #9466 will fix this, which will be part of Synapse v1.29.0.

@clokep clokep closed this as completed Feb 24, 2021