Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Outgoing federation is mostly broken after restoring from backup #16025

Closed
verymilan opened this issue Jul 28, 2023 · 19 comments
Closed

Outgoing federation is mostly broken after restoring from backup #16025

verymilan opened this issue Jul 28, 2023 · 19 comments

Comments

@verymilan
Copy link

verymilan commented Jul 28, 2023

Hi there, I have already tried to ask in the Synapse Matrix room without success and start to wonder if its a priority issue of some sort.
I was forced to restore Synapse after both drives failed upon drive replacement. I admit that I also was forced to recreate configs and to put a new signing key (with a new name of course) in place.

The server is now running without high load for over a day.

Sympoms on the matrix.org homeserver:

  • Synapse admin room is okey
  • #tchncs:tchncs.de (!PQOOuZmmjzGoXWzKuZ:tchncs.de) every few hours one or two messages arrive from tchncs.de
  • !gcAYQIzQPMgGUSaAua:tchncs.de – this was a test dm room with my accounts: initial messages couldn't be decrypted, after that, one way SCR-20230728-ugqn
  • Mostfederated rooms don't work at all (only matrix.org -> tchncs.de) – there are exceptions for some homeservers

Client symptoms:

If I receive an encrypted remote message, Element Desktop throws a decryption error even after clearing the cache and logging out and in again – however – element-x-ios still shows that message just fine.

Logs:

  • testroom from the screenshot (didn't spot any other weirdness with loglevel INFO)
    • synapse.handlers.sync - 741 - INFO - GET-2953895 - Failed to find any events in room !gcAYQIzQPMgGUSaAua:tchncs.de at RoomStreamToken(topological=None, stream=337273433, instance_map=immutabledict({}))
  • Ricardo in the Synapse Admin rooms spotted the following, tho I could not confirm on a different host:
    • matrix-matrix-synapse-1 | 2023-07-28 20:30:04,453 - synapse.federation.federation_client - 752 - WARNING - sync_partial_state_room-1-$rvTcqV4DvQssCHO0ZELo8zvcr2Lyw1NR8u5J5GNti0U-$46nRykeEPJVC2K1roiwCcgYLXpxTceTI9blNIQ6_Y70 - Signature on retrieved event $e4xQAons8TGPgR4iy4RhGRX_0_dfCZmRTrhdL9MoypM was invalid (unable to verify signature for sender domain tchncs.de: 401: Failed to find any key to satisfy: _FetchKeyRequest(server_name='tchncs.de', minimum_valid_until_ts=1690533075296, key_ids=['ed25519:a_hyvD'])). Checking local store/origin server
  • I found a random Unable to get hierarchy of !jXJzclMjZYfMiParuQ:tchncs.de via federation: 404: Unknown room: !jXJzclMjZYfMiParuQ:tchncs.de on a different server i host
  • Warning of unknown meaning WARNING - POST-5204440 - Inserting into a wheel timer that hasn't been read from recently.

Messing with configs

Ricardo in the Synapse Admins room was able to query the old key it seems from his server (I didn't find out how to do it properly/unhashed) – since Synapse was running for a while at this point, instead of trying to put it back in place, I have set it to expired in the Synapse settings.

Possibly related issues (that sadly didn't help me yet)

@JokerGermany
Copy link

JokerGermany commented Jul 29, 2023

Some more infos:
We discovered that the server states the same id for signing key and old key. (We don't know why, the signing key was another one...)

{
  "old_verify_keys": {
    "ed25519:a_hyvD": {
      "expired_ts": 1690533456,
      "key": "a/sCi6MsjnCZlhHY3aHEkSO+/0PGLhAj2jwfKusfRXk"
    }
  },
  "server_name": "tchncs.de",
  "signatures": {
    "tchncs.de": {
      "ed25519:a_hyvD": "x8iyRTGHGHIwA13VfShQvbeP1uq/OFFyVhVRgloAdVfNZqgJwIgMnQaXb+TfXg6lvBPZq1ZLtLOLq2JmTF6rCw"
    }
  },
  "valid_until_ts": 1690703863760,
  "verify_keys": {
    "ed25519:a_hyvD": {
      "key": "a/sCi6MsjnCZlhHY3aHEkSO+/0PGLhAj2jwfKusfRXk"
    }
  }
}

A couple hours ago we deleted the signing key and let synapse generate a new one.

Now we get messages from matrix.org but matrix.org only receives old messages some sort of sometimes.

We got from an Admin of another server this logs:
(same symptons like matrix.org. Messages to tchncs.de go through but can't received.)
(grep tchncs.de)

Jul 28 11:17:27 v2202204174554188327 matrix-synapse[2663894]: 2023-07-28 09:17:27,707 - synapse.handlers.message - 1410 - WARNING - PUT-142788 - Denying new event <FrozenEventV3 event_id=$AJAmPMdmxKhHZU9mhn3UgNDBmbgIqFsA3hqepI2OXRA, type=m.room.member, state_key=@plocki:tchncs.de, outlier=False> because 403: @plocki:tchncs.de is already in the room.
Jul 28 12:52:14 v2202204174554188327 matrix-synapse[2663894]: 2023-07-28 10:52:14,653 - synapse.handlers.federation_event - 1612 - WARNING - GET-145946-$8yWE5mMDMHo40C18Kulu-nX-ZwaQk9zfCrvzil2i3s8-$5f8y7jdl8CcQK7D_v1st3TBts29GxnsfWvYVtCB5lhM-$zZttxvb7EfFQzETx0rY3VxgeMVvWlrWOfg09Y-_NaLs - Server tchncs.de didn't return event $zZttxvb7EfFQzETx0rY3VxgeMVvWlrWOfg09Y-_NaLs
Jul 28 12:52:14 v2202204174554188327 matrix-synapse[2663894]: 2023-07-28 10:52:14,727 - synapse.handlers.federation_event - 1612 - WARNING - GET-145946-$8yWE5mMDMHo40C18Kulu-nX-ZwaQk9zfCrvzil2i3s8-$5f8y7jdl8CcQK7D_v1st3TBts29GxnsfWvYVtCB5lhM-$Fr3pZo0-Jc87L-23QK2KyoZHcG3t1esP6fGJsLb66kQ - Server tchncs.de didn't return event $Fr3pZo0-Jc87L-23QK2KyoZHcG3t1esP6fGJsLb66kQ
Jul 28 12:52:16 v2202204174554188327 matrix-synapse[2663894]: 2023-07-28 10:52:16,401 - synapse.handlers.federation_event - 1612 - WARNING - GET-145946-$zi-gXl_zH5nqKCtDfI-VXXNzjiWXfFqJtzKi6gnfRt4-$L4-xtjilwwgWhJ4Sv2mseu_ZwAJxUzgqtQdjKiG35mQ-$zZttxvb7EfFQzETx0rY3VxgeMVvWlrWOfg09Y-_NaLs - Server tchncs.de didn't return event $zZttxvb7EfFQzETx0rY3VxgeMVvWlrWOfg09Y-_NaLs
Jul 28 12:52:16 v2202204174554188327 matrix-synapse[2663894]: 2023-07-28 10:52:16,458 - synapse.handlers.federation_event - 1612 - WARNING - GET-145946-$zi-gXl_zH5nqKCtDfI-VXXNzjiWXfFqJtzKi6gnfRt4-$L4-xtjilwwgWhJ4Sv2mseu_ZwAJxUzgqtQdjKiG35mQ-$Fr3pZo0-Jc87L-23QK2KyoZHcG3t1esP6fGJsLb66kQ - Server tchncs.de didn't return event $Fr3pZo0-Jc87L-23QK2KyoZHcG3t1esP6fGJsLb66kQ
Jul 28 12:52:18 v2202204174554188327 matrix-synapse[2663894]: 2023-07-28 10:52:18,428 - synapse.handlers.federation_event - 1612 - WARNING - GET-145946-$BtPyzsM0IQ4BWMx6Oku7xRgHbH1UmvYNHjmcPtBrAb4-$OXP-u28bCsOUlkX23vWC6xopDbEUDTJAbqmFewVkyB8-$zZttxvb7EfFQzETx0rY3VxgeMVvWlrWOfg09Y-_NaLs - Server tchncs.de didn't return event $zZttxvb7EfFQzETx0rY3VxgeMVvWlrWOfg09Y-_NaLs
Jul 28 12:52:18 v2202204174554188327 matrix-synapse[2663894]: 2023-07-28 10:52:18,489 - synapse.handlers.federation_event - 1612 - WARNING - GET-145946-$BtPyzsM0IQ4BWMx6Oku7xRgHbH1UmvYNHjmcPtBrAb4-$OXP-u28bCsOUlkX23vWC6xopDbEUDTJAbqmFewVkyB8-$Fr3pZo0-Jc87L-23QK2KyoZHcG3t1esP6fGJsLb66kQ - Server tchncs.de didn't return event $Fr3pZo0-Jc87L-23QK2KyoZHcG3t1esP6fGJsLb66kQ
Jul 28 12:52:20 v2202204174554188327 matrix-synapse[2663894]: 2023-07-28 10:52:20,574 - synapse.handlers.federation_event - 1612 - WARNING - GET-145946-$mgX-S9eoMeU4rPLlR40RVFePkwh1DVm3pQUDLWLPlpw-$OXP-u28bCsOUlkX23vWC6xopDbEUDTJAbqmFewVkyB8-$Fr3pZo0-Jc87L-23QK2KyoZHcG3t1esP6fGJsLb66kQ - Server tchncs.de didn't return event $Fr3pZo0-Jc87L-23QK2KyoZHcG3t1esP6fGJsLb66kQ
Jul 28 12:52:20 v2202204174554188327 matrix-synapse[2663894]: 2023-07-28 10:52:20,641 - synapse.handlers.federation_event - 1612 - WARNING - GET-145946-$mgX-S9eoMeU4rPLlR40RVFePkwh1DVm3pQUDLWLPlpw-$OXP-u28bCsOUlkX23vWC6xopDbEUDTJAbqmFewVkyB8-$zZttxvb7EfFQzETx0rY3VxgeMVvWlrWOfg09Y-_NaLs - Server tchncs.de didn't return event $zZttxvb7EfFQzETx0rY3VxgeMVvWlrWOfg09Y-_NaLs
Jul 28 12:52:21 v2202204174554188327 matrix-synapse[2663894]: 2023-07-28 10:52:21,992 - synapse.http.server - 767 - WARNING - GET-145946 - Not sending response to request <XForwardedForRequest at 0x7f12acb95910 method='GET' uri='/_matrix/client/r0/rooms/!PQOOuZmmjzGoXWzKuZ%3Atchncs.de/messages?limit=20&dir=b&from=s909881_7656227_738_173837_17645_12_43_172956_0_17&filter=%7B%22lazy_load_members%22%3Atrue%7D' clientproto='HTTP/1.0' site='8008'>, already disconnected.
Jul 28 12:52:24 v2202204174554188327 matrix-synapse[2663894]: 2023-07-28 10:52:24,575 - synapse.handlers.federation_event - 1612 - WARNING - maybe_backfill_in_the_background-34-$30NgygxDigDQxDCE9a85iKgTR1bB1M7cfG6ztNAN6iM-$G5MjZKT3nLFP_o0XG4j1VGK4tjUM0STrpxvABP6JpZo-$Fr3pZo0-Jc87L-23QK2KyoZHcG3t1esP6fGJsLb66kQ - Server tchncs.de didn't return event $Fr3pZo0-Jc87L-23QK2KyoZHcG3t1esP6fGJsLb66kQ
Jul 28 12:52:24 v2202204174554188327 matrix-synapse[2663894]: 2023-07-28 10:52:24,630 - synapse.handlers.federation_event - 1612 - WARNING - maybe_backfill_in_the_background-34-$30NgygxDigDQxDCE9a85iKgTR1bB1M7cfG6ztNAN6iM-$G5MjZKT3nLFP_o0XG4j1VGK4tjUM0STrpxvABP6JpZo-$zZttxvb7EfFQzETx0rY3VxgeMVvWlrWOfg09Y-_NaLs - Server tchncs.de didn't return event $zZttxvb7EfFQzETx0rY3VxgeMVvWlrWOfg09Y-_NaLs
Jul 28 12:52:26 v2202204174554188327 matrix-synapse[2663894]: 2023-07-28 10:52:26,243 - synapse.handlers.federation_event - 1612 - WARNING - maybe_backfill_in_the_background-34-$XgXw3VXo8eRMu_KvUZyc5iqJ5lREMBgKbRr8yd9n1G8-$SGeZ-JMgxwmrR2TEtpV0nyRy3sCQUqNqPAiwS_eO1eo-$Fr3pZo0-Jc87L-23QK2KyoZHcG3t1esP6fGJsLb66kQ - Server tchncs.de didn't return event $Fr3pZo0-Jc87L-23QK2KyoZHcG3t1esP6fGJsLb66kQ
Jul 28 12:52:26 v2202204174554188327 matrix-synapse[2663894]: 2023-07-28 10:52:26,314 - synapse.handlers.federation_event - 1612 - WARNING - maybe_backfill_in_the_background-34-$XgXw3VXo8eRMu_KvUZyc5iqJ5lREMBgKbRr8yd9n1G8-$SGeZ-JMgxwmrR2TEtpV0nyRy3sCQUqNqPAiwS_eO1eo-$zZttxvb7EfFQzETx0rY3VxgeMVvWlrWOfg09Y-_NaLs - Server tchncs.de didn't return event $zZttxvb7EfFQzETx0rY3VxgeMVvWlrWOfg09Y-_NaLs
Jul 28 12:52:28 v2202204174554188327 matrix-synapse[2663894]: 2023-07-28 10:52:28,199 - synapse.handlers.federation_event - 1612 - WARNING - maybe_backfill_in_the_background-34-$58CNizqZgyhlp9kz4g83pbOnjXuXW4sANyvrzRgWbjs-$ndA_xngdW0RZUeBaxD2y3VAz6oZ6Uvvu2SMFqGzEs44-$Fr3pZo0-Jc87L-23QK2KyoZHcG3t1esP6fGJsLb66kQ - Server tchncs.de didn't return event $Fr3pZo0-Jc87L-23QK2KyoZHcG3t1esP6fGJsLb66kQ
Jul 28 12:52:28 v2202204174554188327 matrix-synapse[2663894]: 2023-07-28 10:52:28,282 - synapse.handlers.federation_event - 1612 - WARNING - maybe_backfill_in_the_background-34-$58CNizqZgyhlp9kz4g83pbOnjXuXW4sANyvrzRgWbjs-$ndA_xngdW0RZUeBaxD2y3VAz6oZ6Uvvu2SMFqGzEs44-$zZttxvb7EfFQzETx0rY3VxgeMVvWlrWOfg09Y-_NaLs - Server tchncs.de didn't return event $zZttxvb7EfFQzETx0rY3VxgeMVvWlrWOfg09Y-_NaLs
Jul 28 14:09:51 v2202204174554188327 matrix-synapse[2663894]: 2023-07-28 12:09:51,488 - synapse.federation.sender.transaction_manager - 178 - WARNING - federation_transaction_transmission_loop-76279 - TX [zemos.net] {1690292246632} Remote returned error for $uvV6diFBEx_MRA7RJ7Fh5NgGg9c-8B6C79EHI15qruQ: {'errcode': '#BOUNCE', 'error': 'execute $uvV6diFBEx_MRA7RJ7Fh5NgGg9c-8B6C79EHI15qruQ !PQOOuZmmjzGoXWzKuZ:tchncs.de :No users require events of type=m.reaction in !PQOOuZmmjzGoXWzKuZ:tchncs.de on this server.'}
Jul 28 14:51:11 v2202204174554188327 matrix-synapse[2663894]: 2023-07-28 12:51:11,638 - synapse.federation.sender.transaction_manager - 178 - WARNING - federation_transaction_transmission_loop-78173 - TX [zemos.net] {1690292248122} Remote returned error for $gI60kszXAQ6Kk5E3DShpBB7bzsLzQj1v89GGGJLIq7c: {'errcode': '#BOUNCE', 'error': 'execute $gI60kszXAQ6Kk5E3DShpBB7bzsLzQj1v89GGGJLIq7c !PQOOuZmmjzGoXWzKuZ:tchncs.de :No users require events of type=m.room.message in !PQOOuZmmjzGoXWzKuZ:tchncs.de on this server.'}
Jul 28 21:31:59 v2202204174554188327 matrix-synapse[2663894]: 2023-07-28 19:31:59,007 - synapse.federation.sender.transaction_manager - 178 - WARNING - federation_transaction_transmission_loop-92302 - TX [zemos.net] {1690292259186} Remote returned error for $ADLgMcjnkk1-bAe4eV-GTRELRlOYmOs54mlLnsUabvo: {'errcode': '#BOUNCE', 'error': 'execute $ADLgMcjnkk1-bAe4eV-GTRELRlOYmOs54mlLnsUabvo !PQOOuZmmjzGoXWzKuZ:tchncs.de :No users require events of type=m.room.message in !PQOOuZmmjzGoXWzKuZ:tchncs.de on this server.'}
Jul 28 21:32:04 v2202204174554188327 matrix-synapse[2663894]: 2023-07-28 19:32:04,080 - synapse.federation.transport.server._base - 312 - WARNING - GET-167280 - authenticate_request failed: 401: Failed to find any key to satisfy: _FetchKeyRequest(server_name='tchncs.de', minimum_valid_until_ts=1690572723941, key_ids=['ed25519:a_hyvD'])
Jul 28 21:32:42 v2202204174554188327 matrix-synapse[2663894]: 2023-07-28 19:32:42,426 - synapse.federation.sender.transaction_manager - 178 - WARNING - federation_transaction_transmission_loop-92469 - TX [zemos.net] {1690292259293} Remote returned error for $XNwES6ZqdNwhRkCX0ndV3ajMl15lm-sNs-0f-UhTOmg: {'errcode': '#BOUNCE', 'error': 'execute $XNwES6ZqdNwhRkCX0ndV3ajMl15lm-sNs-0f-UhTOmg !PQOOuZmmjzGoXWzKuZ:tchncs.de :No users require events of type=m.room.message in !PQOOuZmmjzGoXWzKuZ:tchncs.de on this server.'}
Jul 28 21:33:14 v2202204174554188327 matrix-synapse[2663894]: 2023-07-28 19:33:14,169 - synapse.federation.sender.transaction_manager - 178 - WARNING - federation_transaction_transmission_loop-92633 - TX [zemos.net] {1690292259397} Remote returned error for $RxEhvKDtDk_lygbYzClwiQsytTfpLlYuRbQMe46aMqQ: {'errcode': '#BOUNCE', 'error': 'execute $RxEhvKDtDk_lygbYzClwiQsytTfpLlYuRbQMe46aMqQ !PQOOuZmmjzGoXWzKuZ:tchncs.de :No users require events of type=m.room.message in !PQOOuZmmjzGoXWzKuZ:tchncs.de on this server.'}
Jul 28 23:35:53 v2202204174554188327 matrix-synapse[2663894]: 2023-07-28 21:35:53,392 - synapse.http.matrixfederationclient - 780 - WARNING - _process_incoming_pdus_in_room_inner-951-$lx5wSrhWSa5kssOftAywddSRw_GxoEOMbyK-Idc3eG4-$k0Ozuk-2vFnQZ8dYYtCGqpJPsm0rLtJMjzRymoDjpVc-$ClszLKJqWHKIhgcGd-ex4cIKE_Gu2Z5jfe-AGmt3pwI - {GET-O-130473} [pussthecat.org] Request failed: GET matrix-federation://pussthecat.org/_matrix/federation/v1/state_ids/%21PQOOuZmmjzGoXWzKuZ%3Atchncs.de?event_id=%24ClszLKJqWHKIhgcGd-ex4cIKE_Gu2Z5jfe-AGmt3pwI: HttpResponseException('404: Not Found')
Jul 28 23:36:06 v2202204174554188327 matrix-synapse[2663894]: 2023-07-28 21:36:06,759 - synapse.http.matrixfederationclient - 780 - WARNING - _process_new_pulled_events_with_failed_pull_attempts-65-$k0Ozuk-2vFnQZ8dYYtCGqpJPsm0rLtJMjzRymoDjpVc-$ClszLKJqWHKIhgcGd-ex4cIKE_Gu2Z5jfe-AGmt3pwI - {GET-O-130476} [pussthecat.org] Request failed: GET matrix-federation://pussthecat.org/_matrix/federation/v1/state_ids/%21PQOOuZmmjzGoXWzKuZ%3Atchncs.de?event_id=%24ClszLKJqWHKIhgcGd-ex4cIKE_Gu2Z5jfe-AGmt3pwI: HttpResponseException('404: Not Found')
Jul 29 12:18:03 v2202204174554188327 matrix-synapse[2663894]: 2023-07-29 10:18:03,610 - synapse.handlers.federation_event - 1612 - WARNING - maybe_backfill_in_the_background-69-$WmsaJekcakXLHG-NDK4THbS3Ta_6WIDaaGM2w_hZwXw-$FNaj0lPVzfiRkStNeDF6gcgWtqOXRNKhHUWE13OUinw-$zZttxvb7EfFQzETx0rY3VxgeMVvWlrWOfg09Y-_NaLs - Server tchncs.de didn't return event $zZttxvb7EfFQzETx0rY3VxgeMVvWlrWOfg09Y-_NaLs
Jul 29 12:18:04 v2202204174554188327 matrix-synapse[2663894]: 2023-07-29 10:18:04,417 - synapse.handlers.federation_event - 1612 - WARNING - maybe_backfill_in_the_background-69-$WmsaJekcakXLHG-NDK4THbS3Ta_6WIDaaGM2w_hZwXw-$FNaj0lPVzfiRkStNeDF6gcgWtqOXRNKhHUWE13OUinw-$Fr3pZo0-Jc87L-23QK2KyoZHcG3t1esP6fGJsLb66kQ - Server tchncs.de didn't return event $Fr3pZo0-Jc87L-23QK2KyoZHcG3t1esP6fGJsLb66kQ
Jul 29 13:09:59 v2202204174554188327 matrix-synapse[2663894]: 2023-07-29 11:09:59,699 - synapse.federation.sender.transaction_manager - 178 - WARNING - federation_transaction_transmission_loop-116572 - TX [zemos.net] {1690292278714} Remote returned error for $aweARhjwbNKkU6BvmEg2RUZBy75N-9It-wGR82BPIOA: {'errcode': '#BOUNCE', 'error': 'execute $aweARhjwbNKkU6BvmEg2RUZBy75N-9It-wGR82BPIOA !PQOOuZmmjzGoXWzKuZ:tchncs.de :No users require events of type=m.room.message in !PQOOuZmmjzGoXWzKuZ:tchncs.de on this server.'}
Jul 29 13:10:46 v2202204174554188327 matrix-synapse[2663894]: 2023-07-29 11:10:46,108 - synapse.federation.sender.transaction_manager - 178 - WARNING - federation_transaction_transmission_loop-116831 - TX [zemos.net] {1690292278894} Remote returned error for $_YOMDkzKC5vY0EJ_1CUbdza5K__1oqjRrleAhI7fxNQ: {'errcode': '#BOUNCE', 'error': 'execute $_YOMDkzKC5vY0EJ_1CUbdza5K__1oqjRrleAhI7fxNQ !PQOOuZmmjzGoXWzKuZ:tchncs.de :No users require events of type=m.room.message in !PQOOuZmmjzGoXWzKuZ:tchncs.de on this server.'}
Jul 29 13:31:15 v2202204174554188327 matrix-synapse[2663894]: 2023-07-29 11:31:15,664 - synapse.media.url_previewer - 342 - WARNING - GET-202049 - Failed to find any OG data in https://matrix.org/_matrix/key/v2/query/tchncs.de
Jul 29 13:31:15 v2202204174554188327 matrix-synapse[2663894]: 2023-07-29 11:31:15,819 - synapse.media.url_previewer - 342 - WARNING - GET-202048 - Failed to find any OG data in https://matrix.tchncs.de/_matrix/key/v2/server
Jul 29 14:28:41 v2202204174554188327 matrix-synapse[2663894]: 2023-07-29 12:28:41,032 - synapse.federation.sender.transaction_manager - 178 - WARNING - federation_transaction_transmission_loop-121561 - TX [zemos.net] {1690292282639} Remote returned error for $66RHzM4swQwxyFDXo0g1DZJ7hz_TdiZLaPIJpI07n4U: {'errcode': '#BOUNCE', 'error': 'execute $66RHzM4swQwxyFDXo0g1DZJ7hz_TdiZLaPIJpI07n4U !PQOOuZmmjzGoXWzKuZ:tchncs.de :No users require events of type=m.room.message in !PQOOuZmmjzGoXWzKuZ:tchncs.de on this server.'}
Jul 29 15:50:24 v2202204174554188327 matrix-synapse[692988]: 2023-07-29 13:50:24,722 - synapse.federation.sender.transaction_manager - 178 - WARNING - federation_transaction_transmission_loop-2406 - TX [zemos.net] {1690634226260} Remote returned error for $qwcei5FNwrAjztrU-3oC6zTqkccfdJj23Y7z7CwAh94: {'errcode': '#BOUNCE', 'error': 'execute $qwcei5FNwrAjztrU-3oC6zTqkccfdJj23Y7z7CwAh94 !PQOOuZmmjzGoXWzKuZ:tchncs.de :No users require events of type=m.room.message in !PQOOuZmmjzGoXWzKuZ:tchncs.de on this server.'}
Jul 29 16:06:47 v2202204174554188327 matrix-synapse[692988]: 2023-07-29 14:06:47,687 - synapse.http.matrixfederationclient - 780 - WARNING - _process_incoming_pdus_in_room_inner-15-$1mAVmTd4ZoI4nCmiT_MPPvJYGjkDKj0EcR4JmIr_u6Y-$F8RiYoWfqB8IAQI1UjKu6TAgAyWoCu55fYngtSCCjSE-$ZVR-w2Sg0-bQ6uNpRg605b_AT1QIm6R56FfE-Lih81Y - {GET-O-3272} [pragma-messenger.ch] Request failed: GET matrix-federation://pragma-messenger.ch/_matrix/federation/v1/state_ids/%21PQOOuZmmjzGoXWzKuZ%3Atchncs.de?event_id=%24ZVR-w2Sg0-bQ6uNpRg605b_AT1QIm6R56FfE-Lih81Y: HttpResponseException('404: Not Found')

Example Room:
https://matrix.to/#/#tchncs:tchncs.de

@verymilan
Copy link
Author

verymilan commented Jul 30, 2023

log from a different server and room where my messages dont arrive (two messages to other server)

grep 'tchncs.de' homeserver.log|grep -v 'presence\|receipt\|notif\|typing\|device'
2023-07-30 10:41:22,591 - synapse.federation.sender.transaction_manager - 121 - INFO - federation_transaction_transmission_loop-299 - TX [tchncs.de] {1690706313381} Sending transaction [1690706313381], (PDUs: 0, EDUs: 1)
2023-07-30 10:41:22,894 - synapse.http.matrixfederationclient - 346 - INFO - federation_transaction_transmission_loop-299 - {PUT-O-231} [tchncs.de] Completed request: 200 OK in 0.30 secs, got 11 bytes - PUT matrix-federation://tchncs.de/_matrix/federation/v1/send/1690706313381
2023-07-30 10:41:22,894 - synapse.federation.sender.transaction_manager - 174 - INFO - federation_transaction_transmission_loop-299 - TX [tchncs.de] {1690706313381} got 200 response
2023-07-30 10:42:38,074 - synapse.federation.sender.transaction_manager - 121 - INFO - federation_transaction_transmission_loop-375 - TX [tchncs.de] {1690706313438} Sending transaction [1690706313438], (PDUs: 0, EDUs: 1)
2023-07-30 10:42:38,451 - synapse.http.matrixfederationclient - 346 - INFO - federation_transaction_transmission_loop-375 - {PUT-O-289} [tchncs.de] Completed request: 200 OK in 0.38 secs, got 11 bytes - PUT matrix-federation://tchncs.de/_matrix/federation/v1/send/1690706313438
2023-07-30 10:42:38,452 - synapse.federation.sender.transaction_manager - 174 - INFO - federation_transaction_transmission_loop-375 - TX [tchncs.de] {1690706313438} got 200 response

also some log from right before on the other server

2023-07-30 10:38:39,010 - synapse.handlers.federation - 2003 - INFO - sync_partial_state_room-4 - Syncing state for room !sNDSYdslSEBKkLBGRf:matrix.org via tchncs.de instead
2023-07-30 10:38:39,018 - synapse.http.federation.well_known_resolver - 253 - INFO - sync_partial_state_room-4-$c-mEWIoZb59OWIgimIz7EJTiVxmHEhlwMx1EX2_Ze2s-$Ey_19Rzx73NeRkrj_1yyWQvCQAp0gc-088L0vnz1Zh4 - Fetching https://tchncs.de/.well-known/matrix/server
2023-07-30 10:38:39,140 - synapse.http.federation.well_known_resolver - 190 - INFO - sync_partial_state_room-4-$c-mEWIoZb59OWIgimIz7EJTiVxmHEhlwMx1EX2_Ze2s-$Ey_19Rzx73NeRkrj_1yyWQvCQAp0gc-088L0vnz1Zh4 - Response from .well-known: {'m.server': 'matrix.tchncs.de:443'}
2023-07-30 10:38:39,469 - synapse.http.matrixfederationclient - 690 - INFO - sync_partial_state_room-4-$c-mEWIoZb59OWIgimIz7EJTiVxmHEhlwMx1EX2_Ze2s-$Ey_19Rzx73NeRkrj_1yyWQvCQAp0gc-088L0vnz1Zh4 - {GET-O-39} [tchncs.de] Got response headers: 403 Forbidden
2023-07-30 10:38:39,470 - synapse.http.matrixfederationclient - 780 - WARNING - sync_partial_state_room-4-$c-mEWIoZb59OWIgimIz7EJTiVxmHEhlwMx1EX2_Ze2s-$Ey_19Rzx73NeRkrj_1yyWQvCQAp0gc-088L0vnz1Zh4 - {GET-O-39} [tchncs.de] Request failed: GET matrix-federation://tchncs.de/_matrix/federation/v1/state_ids/%21sNDSYdslSEBKkLBGRf%3Amatrix.org?event_id=%24Ey_19Rzx73NeRkrj_1yyWQvCQAp0gc-088L0vnz1Zh4: HttpResponseException('403: Forbidden')
2023-07-30 10:38:39,470 - synapse.handlers.federation - 1995 - INFO - sync_partial_state_room-4 - Failed to get state for !sNDSYdslSEBKkLBGRf:matrix.org at <FrozenEventV3 event_id=$c-mEWIoZb59OWIgimIz7EJTiVxmHEhlwMx1EX2_Ze2s, type=m.room.member, state_key=@top47:other-serv.erma-messenger.ch, outlier=False> from tchncs.de because ERROR 403: We can't get valid state history.
2023-07-30 10:38:40,603 - synapse.federation.sender.per_destination_queue - 568 - INFO - federation_transaction_transmission_loop-0 - Catching up rooms to matrix.im: '!PQOOuZmmjzGoXWzKuZ:tchncs.de'
2023-07-30 10:39:10,105 - synapse.federation.sender.transaction_manager - 121 - INFO - federation_transaction_transmission_loop-26 - TX [tchncs.de] {1690706313171} Sending transaction [1690706313171], (PDUs: 0, EDUs: 1)
2023-07-30 10:39:10,106 - synapse.http.federation.well_known_resolver - 253 - INFO - federation_transaction_transmission_loop-26 - Fetching https://tchncs.de/.well-known/matrix/server
2023-07-30 10:39:10,214 - synapse.http.federation.well_known_resolver - 190 - INFO - federation_transaction_transmission_loop-26 - Response from .well-known: {'m.server': 'matrix.tchncs.de:443'}
2023-07-30 10:39:11,098 - synapse.http.matrixfederationclient - 346 - INFO - federation_transaction_transmission_loop-26 - {PUT-O-21} [tchncs.de] Completed request: 200 OK in 0.99 secs, got 11 bytes - PUT matrix-federation://tchncs.de/_matrix/federation/v1/send/1690706313171
2023-07-30 10:39:11,098 - synapse.federation.sender.transaction_manager - 174 - INFO - federation_transaction_transmission_loop-26 - TX [tchncs.de] {1690706313171} got 200 response
2023-07-30 10:39:21,358 - synapse.access.http.8008 - 461 - INFO - POST-306 - 2a01:4f9:3b:2d5f::2 - 8008 - {tchncs.de} Processed request: 0.170sec/0.001sec (0.012sec, 0.003sec) (0.010sec/0.125sec/10) 2570B 200 "POST /_matrix/federation/v1/user/keys/claim HTTP/1.0" "Synapse/1.88.0" [0 dbevts]

meanwhile the tchncs server, grep for the other server

2023-07-30 10:39:17,723 - synapse.access.http.8008 - 461 - INFO - GET-14923 - 145.224.73.29 - 8008 - {@Milan:tchncs.de} Processed request: 0.000sec/0.075sec (0.000sec, 0.000sec) (0.000sec/0.000sec/0) 22B 200 "GET /_matrix/client/r0/thirdparty/user/im.vector.protocol.sip_virtual?native_mxid=%40milan%3Aother-serv.er HTTP/1.0" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Element/1.11.36 Chrome/114.0.5735.134 Electron/25.2.0 Safari/537.36" [0 dbevts]
2023-07-30 10:39:20,576 - synapse.http.federation.well_known_resolver - 253 - INFO - POST-15163 - Fetching https://other-serv.er/.well-known/matrix/server
2023-07-30 10:39:20,985 - synapse.http.federation.well_known_resolver - 197 - INFO - POST-15163 - Error parsing well-known for b'other-serv.er': Non-200 response 404
2023-07-30 10:39:21,381 - synapse.http.matrixfederationclient - 346 - INFO - POST-15163 - {POST-O-11441} [other-serv.er] Completed request: 200 OK in 0.84 secs, got 2570 bytes - POST matrix-federation://other-serv.er/_matrix/federation/v1/user/keys/claim
2023-07-30 10:39:21,381 - synapse.handlers.e2e_keys - 729 - INFO - POST-15163 - Claimed one-time-keys: signed_curve25519:AAAAIQ for @milan:other-serv.er:CIQQRLBPKE,signed_curve25519:AAAACQ for @milan:other-serv.er:FPJKKDDWGE,signed_curve25519:AAAACA for @milan:other-serv.er:IWQKEJHCGQ,signed_curve25519:AAAAJw for @milan:other-serv.er:JWIIZNZRLZ,signed_curve25519:AAAAEA for @milan:other-serv.er:QLTBHQMNMF,signed_curve25519:AAAAAw for @milan:other-serv.er:SWMYIIVVPA,signed_curve25519:AAAAHw for @milan:other-serv.er:THYENOLWAL,signed_curve25519:AAAAIg for @milan:other-serv.er:UMEZSMYYTJ,signed_curve25519:AAAAJw for @milan:other-serv.er:WBQSZVLRYF,signed_curve25519:AAAAFg for @milan:other-serv.er:ZHUEDPPRJX
2023-07-30 10:42:04,394 - synapse.access.http.8008 - 461 - INFO - GET-25668 - 145.224.73.29 - 8008 - {@Milan:tchncs.de} Processed request: 0.000sec/0.001sec (0.001sec, 0.000sec) (0.000sec/0.000sec/0) 22B 200 "GET /_matrix/client/r0/thirdparty/user/im.vector.protocol.sip_virtual?native_mxid=%40milan%3Aother-serv.er HTTP/1.0" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Element/1.11.36 Chrome/114.0.5735.134 Electron/25.2.0 Safari/537.36" [0 dbevts]
2023-07-30 10:51:41,059 - synapse.access.http.8008 - 461 - INFO - GET-70071 - 145.224.73.29 - 8008 - {@Milan:tchncs.de} Processed request: 0.000sec/0.008sec (0.000sec, 0.000sec) (0.000sec/0.000sec/0) 22B 200 "GET /_matrix/client/r0/thirdparty/user/im.vector.protocol.sip_virtual?native_mxid=%40milan%3Aother-serv.er HTTP/1.0" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Element/1.11.36 Chrome/114.0.5735.134 Electron/25.2.0 Safari/537.36" [0 dbevts]

working test message other server -> tchncs – grep on the tchncs side (odd? am i missing logs?)

2023-07-30 10:58:11,636 - synapse.crypto.keyring - 687 - INFO - PerspectivesKeyFetcher-3 - Requesting keys [_FetchKeyRequest(server_name='other-serv.er-messenger.ch', minimum_valid_until_ts=1690707491580, key_ids=['ed25519:a_KOtB'])] from notary server matrix.org
2023-07-30 10:58:11,813 - synapse.access.http.8008 - 461 - INFO - POST-102718 - 2001:1680:101:83a::91 - 8008 - {other-serv.er-messenger.ch} Processed request: 0.227sec/0.006sec (0.003sec, 0.000sec) (0.005sec/0.029sec/2) 20B 200 "POST /_matrix/federation/v1/user/keys/claim HTTP/1.0" "Synapse/1.88.0" [0 dbevts]

@verymilan
Copy link
Author

verymilan commented Jul 30, 2023

Update: at least in the PQO.. tchncs community room, now suddenly federation got interrupted from matrix.org -> tchncs.de as well ... still nothing obvious to spot in the logs, tho i am trying to filter in the dark and don't have clear knowledge on what to search for

Update on update: this was incorrect – those messages only mysteriously vanished in element-desktop

@verymilan verymilan changed the title Remote servers not properly backfillig after restoring from backup Outgoing federation is mostly broken after restoring from backup Jul 30, 2023
@Moini
Copy link

Moini commented Aug 1, 2023

@verymilan Is there someone among the contributors whom we could tag directly for this issue? Our internal Kielux event organization communication is currently partially interrupted, which affects us quite a bit.

@JokerGermany
Copy link

At the moment we experience, that some servers has no problem federating with matrix.org and tchncs.de
When someone of this server sends a message, the messages from tchncs.de appears on matrix.org.
It looks like their server are sending the messages from tchncs.de to matrix.org

@verymilan
Copy link
Author

@Moini hi! Did you mean to mention me? I personally do not knwo :)

@Moini
Copy link

Moini commented Aug 1, 2023

Just generally anyone who reads this :) Looks like there has been some progress, one of us reports that people now read his old messages. Only in our unencrpyted chat room, the tchncs accounts can post, but their posts cannot be read by others with accounts on different servers.

@0xpr03
Copy link

0xpr03 commented Aug 2, 2023

Can confirm that I can't read DMs by tchncs.de people (decryption error), though they can apparently read my messages from matrix.org.

recreate configs and to put a new signing key

Looks like you will have to ask people about somehow exchanging / accepting that new key. I can definitely see this as a required mechanism to prevent taking over servers via DNS hacks. (Even though they can still read my messages..)

@timaeos
Copy link

timaeos commented Aug 2, 2023

@verymilan
Don't mean to impose but could you do me a favor and let me know what the retry interval from tchncs.de to my server is from the
https://matrix.tchncs.de/_synapse/admin/v1/federation/destinations/matrix.nexaeos.io endpoint
or https://matrix.tchncs.de/_synapse/admin/v1/federation/destinations/matrix.org

information on the endpoint is here


Trying to see if there is exponential back-off playing a role since I added the tchncs.de to my trusted key server and killed the keys and as far as I can tell am able to send from my server -> tchncs.de. From my logs, it doesn't even look like tchncs.de is attempting to send stuff to me I believe

@verymilan
Copy link
Author

verymilan commented Aug 2, 2023

@timaeos will try to reset :o
edit The retry timing does not need to be reset for this destination. – oh read doc wrong here

{
  "destination": "matrix.nexaeos.io",
  "last_successful_stream_ordering": 336679047,
  "failure_ts": null,
  "retry_last_ts": 0,
  "retry_interval": 0 🤨 
}
{
  "destination": "matrix.org",
  "last_successful_stream_ordering": 336680426,
  "failure_ts": null,
  "retry_last_ts": 0,
  "retry_interval": 0
}

@timaeos
Copy link

timaeos commented Aug 2, 2023

Drat, was hoping it would show a failure. I'll keep looking through my logs to see if there is an A-ha moment

@tcpipuk
Copy link

tcpipuk commented Aug 2, 2023

The situation seems to have changed over the last 24 hours, but it's unclear whether this is because a number of homeservers (like mine) are in the room for tchncs.de to bounce messages off to reach the homeservers that it can't talk to directly.

The main issue users are currently experiencing is many "unable to decrypt" errors, which would also be occurring if the messages are bouncing via another homeserver, so federation definitely looks like the issue.

The keys look absolutely fine on matrix.org, so we're running out of ideas why it's failing!
https://matrix.org/_matrix/key/v2/query/tchncs.de

@JokerGermany
Copy link

@tcpipuk Only the array number 1 is correct, 0 and 2 is false

@timaeos
Copy link

timaeos commented Aug 2, 2023

@tcpipuk Only the array number 1 is correct, 0 and 2 is false

I don't think that endpoint array order is static. Which keys are you saying are incorrect?

Based on https://matrix.tchncs.de/_matrix/key/v2/server I'm guessing that you're saying that only ed25519:a_liSe is correct?

@0xpr03
Copy link

0xpr03 commented Aug 2, 2023

Something changed, was able to receive some messages on matrix.org which - according to their timestamps - were sent at 11:00 and 15:00 yesterday.

@tcpipuk
Copy link

tcpipuk commented Aug 2, 2023

We believe this is now resolved, the issue appears to have been a faulty federation_sender on the homeserver, so hopefully everything's working fine again now! 🙂

@timaeos
Copy link

timaeos commented Aug 2, 2023

Something changed, was able to receive some messages on matrix.org which - according to their timestamps - were send at 15:00 and 15:00 yesterday.

@verymilan was able to find an error in one of his federation senders with the help of the people in the synapse admin room. That was resolved and now the server is federating correctly. It's going to take a while for it to churn through the backlog.

They'll probably close this issue tomorrow after they get some much needed sleep

@clokep clokep closed this as not planned Won't fix, can't repro, duplicate, stale Aug 2, 2023
@Moini
Copy link

Moini commented Aug 2, 2023

Yay, thanks a lot everyone!!!

@verymilan
Copy link
Author

Thank you everybody who has patiently tried to investigate this problem here and in various Matrix rooms. Means alot 🤗

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants