This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

e2e: return newly fetched cross-signing keys #10912

Closed
1 change: 1 addition & 0 deletions changelog.d/10912.bugfix
@@ -0,0 +1 @@
return E2E cross signing keys fetched over federation when to a client.
Member

This doesn't really make sense? Please can you expand it into something that will be meaningful to other Synapse admins.

Also, do we know when this bug was introduced? Please include that info.

51 changes: 36 additions & 15 deletions synapse/handlers/e2e_keys.py
@@ -168,32 +168,36 @@ async def query_devices(
) = await self.store.get_user_devices_from_cache(query_list)
for user_id, devices in remote_results.items():
user_devices = results.setdefault(user_id, {})
for device_id, device in devices.items():
keys = device.get("keys", None)
device_display_name = device.get("device_display_name", None)
if keys:
result = dict(keys)
unsigned = result.setdefault("unsigned", {})
if device_display_name:
unsigned["device_display_name"] = device_display_name
user_devices[device_id] = result
Comment on lines -171 to -179
Member

It looks like this code is moving further down. Can you explain why we need to do that?


# check for missing cross-signing keys.
for user_id in remote_queries.keys():
Member

so we're now doing the following code just on the results of the query, rather than the devices we asked for. That may well be valid, but can you explain why we need to make this change?

 cached_cross_master = user_id in cross_signing_keys["master_keys"]
-cached_cross_selfsigning = (
-    user_id in cross_signing_keys["self_signing_keys"]
+self_sig_key = cross_signing_keys["self_signing_keys"].get(
+    user_id, {}
 )
+cached_cross_selfsigning = bool(self_sig_key)
Comment on lines +173 to +176
Member

I'm not sure if I'm missing something, but it looks like this logic is identical to the previous version?

Member

+1. I found the original easier to understand (and it has the advantage of not introducing yet another local variable to this massive function).

Contributor

The logic is not quite identical: the proposed code would behave differently if user_id was present in the self-signing keys dict but set to a falsey value, like an empty dict.

I wonder if this might more clearly express intent:

Suggested change
-self_sig_key = cross_signing_keys["self_signing_keys"].get(
-    user_id, {}
-)
-cached_cross_selfsigning = bool(self_sig_key)
+cached_cross_selfsigning = bool(cross_signing_keys["self_signing_keys"].get(user_id))

The question then becomes: are there scenarios where this could happen, and would we be better off preventing those instead? (E.g., stripping falsey values from the dicts before testing for membership)
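
To make the difference concrete, here is a minimal sketch that uses a plain dict in place of the real `cross_signing_keys["self_signing_keys"]`. The user IDs, key contents, and helper names are invented for illustration; only the two expressions being compared come from the diff.

```python
from typing import Dict, Optional

# Hypothetical stand-in for cross_signing_keys["self_signing_keys"].
self_signing_keys: Dict[str, Optional[dict]] = {
    "@alice:example.org": {"keys": {"ed25519:abc": "abc"}},  # a real key
    "@bob:example.org": {},  # present, but empty
    "@carol:example.org": None,  # present, but None
}


def cached_by_membership(user_id: str) -> bool:
    # Original check: only asks whether the user ID is a key of the dict.
    return user_id in self_signing_keys


def cached_by_truthiness(user_id: str) -> bool:
    # Proposed check: additionally requires the stored value to be truthy.
    return bool(self_signing_keys.get(user_id))


for user_id in (
    "@alice:example.org",
    "@bob:example.org",
    "@carol:example.org",
    "@dave:example.org",  # not in the dict at all
):
    print(user_id, cached_by_membership(user_id), cached_by_truthiness(user_id))
```

The two checks agree for a user with a real key and for a user who is absent. They disagree only when the user ID is present with an empty or `None` value.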

Contributor

From looking at get_cross_signing_keys_from_cache(), it may be possible for a falsey value to end up in cross_signing_keys["self_signing_keys"].

There's a call tree that's basically:

  • get_cross_signing_keys_from_cache()
    • store.get_e2e_cross_signing_keys_bulk() (Docstring: "If a user's cross-signing keys were not found, either their user ID will not be in the dict, or their user ID will map to None.")
      • store._get_bare_e2e_cross_signing_keys_bulk_txn() (Docstring: "If a user's cross-signing keys were not found, their user ID will not be in the dict.")

So worth figuring out what's actually true. Or being defensive here.

(Also worth seeing if this code survived Erik's refactor in the related PR)
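
One way to be defensive, sketched under the assumption that the bulk lookup may map a user ID to `None` as its docstring says: normalise the dict once so that a membership test and a truthiness test agree afterwards. The helper name `drop_missing_keys` and the sample data are hypothetical and not part of Synapse.

```python
from typing import Dict, Optional


def drop_missing_keys(keys_by_user: Dict[str, Optional[dict]]) -> Dict[str, dict]:
    """Return a copy without users whose value is None or empty."""
    return {user_id: key for user_id, key in keys_by_user.items() if key}


# Hypothetical result of a bulk lookup: one found key, one "not found" as None.
raw = {
    "@alice:example.org": {"keys": {"ed25519:abc": "abc"}},
    "@bob:example.org": None,
}
cleaned = drop_missing_keys(raw)

# After normalising, "user_id in cleaned" implies a usable key is stored.
print(sorted(cleaned))
```

Whether to normalise at the cache boundary or to test truthiness at each call site is a design choice; this sketch only shows the first option.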

Contributor

The same code is present in the current version of e2e_keys.py, so still worth digging into.


 # check if we are missing only one of cross-signing master or
 # self-signing key, but the other one is cached.
 # as we need both, this will issue a federation request.
 # if we don't have any of the keys, either the user doesn't have
-# cross-signing set up, or the cached device list
-# is not (yet) updated.
+# cross-signing set up, or we did not fetch the
+# cross-signing keys yet since the device list is not (yet) updated.
Member

I'm not really following this. Why would the device list not be updated, and why does it cause us to not fetch the cross-signing keys?

Also, is it relevant to the bug being fixed in this PR, or is it a separate correction you're doing at the same time? (It doesn't matter either way - I'm just trying to understand what this PR is fixing!)

if cached_cross_master ^ cached_cross_selfsigning:
user_ids_not_in_cache.add(user_id)
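
The `^` test above is an exclusive-or on two booleans: it is true only when exactly one of the two keys is cached. A minimal sketch of the four cases, using a hypothetical helper name that does not exist in Synapse:

```python
def needs_federation_fetch(cached_master: bool, cached_self_signing: bool) -> bool:
    # True only when exactly one of the two cross-signing keys is cached.
    return cached_master ^ cached_self_signing


for master in (False, True):
    for self_signing in (False, True):
        print(master, self_signing, needs_federation_fetch(master, self_signing))
```

When neither key is cached the test is false, so, as the code comment notes, a user without cross-signing set up is not re-queried on these grounds alone.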

for device_id, device in devices.items():
keys = device.get("keys", None)
device_display_name = device.get("device_display_name", None)
if keys:
result = dict(keys)
unsigned = result.setdefault("unsigned", {})
if device_display_name:
unsigned["device_display_name"] = device_display_name
user_devices[device_id] = result

# TODO: invalidate device cache if we have cached
# device keys but not cross-signing keys,
# although the user should have cross-signing keys.

# add those users to the list to fetch over federation.
for user_id in user_ids_not_in_cache:
domain = get_domain_from_id(user_id)
@@ -234,6 +238,9 @@ async def do_remote_query(destination: str) -> None:
# probably be tracking their device lists. However, we haven't
# done an initial sync on the device list so we do it now.
try:
# here, we fetch the user's device list
# and cross signing keys,
# and update our cache with them.
Comment on lines +241 to +243
Member

please
use
longer
lines

if self._is_master:
user_devices = await self.device_handler.device_list_updater.user_device_resync(
user_id
@@ -247,6 +254,15 @@ async def do_remote_query(destination: str) -> None:
user_results = results.setdefault(user_id, {})
for device in user_devices:
user_results[device["device_id"]] = device["keys"]

# update the result with the devicelist's cross-signing keys
Member

can you expand this comment a bit to say why this is necessary?

cross_signing_keys["master_keys"][user_id] = user_devices.get(
"master_key"
)
cross_signing_keys["self_signing_keys"][
user_id
] = user_devices.get("self_signing_key")
Comment on lines +259 to +264
Member

this will overwrite any existing values in cross_signing_keys. Is that correct behaviour?
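
To illustrate the concern, the sketch below contrasts an unconditional assignment (the shape used in the diff) with a guarded one that keeps the cached value when the resync result has no key. The dict contents and the helper `set_if_present` are assumptions made for the example. It does not claim which behaviour is correct for Synapse.

```python
from typing import Dict, Optional


def set_if_present(
    cache: Dict[str, Optional[dict]], user_id: str, fetched: Optional[dict]
) -> None:
    """Replace the cached entry only when the fetch returned a key."""
    if fetched is not None:
        cache[user_id] = fetched


user_id = "@alice:example.org"
cached = {user_id: {"keys": {"ed25519:old": "old"}}}
resync_result: Dict[str, dict] = {}  # hypothetical resync result without "master_key"

# Unconditional assignment: the cached key is replaced by None.
overwritten: Dict[str, Optional[dict]] = dict(cached)
overwritten[user_id] = resync_result.get("master_key")

# Guarded assignment: the cached key survives a missing value.
guarded: Dict[str, Optional[dict]] = dict(cached)
set_if_present(guarded, user_id, resync_result.get("master_key"))

print(overwritten[user_id], guarded[user_id])
```

Note that the unconditional form also leaves the user ID in the dict with a `None` value, which is exactly the case where a membership test and a truthiness test disagree.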


user_ids_updated.append(user_id)
except Exception as e:
failures[destination] = _exception_to_failure(e)
@@ -260,6 +276,11 @@ async def do_remote_query(destination: str) -> None:
for user_id in user_ids_updated:
destination_query.pop(user_id)

# There's still some leftover destination queries:
# either we don't share a room with the user,
# or explicit devices were requested.
# Now fetch the device keys via /user/keys/query,
# and don't cache these results.
try:
remote_result = await self.federation.query_client_keys(
destination, {"device_keys": destination_query}, timeout=timeout