Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Remote dynamic thumbnails are stored at the wrong location #5913

Closed
rkfg opened this issue Aug 27, 2019 · 17 comments
Closed

Remote dynamic thumbnails are stored at the wrong location #5913

rkfg opened this issue Aug 27, 2019 · 17 comments
Labels
z-bug (Deprecated Label)

Comments

@rkfg
Copy link
Contributor

rkfg commented Aug 27, 2019

Description

The remote thumbnails path and filename are derived from media_id when they're generated and stored with dynamic_thumbnails set to true. The said thumbnails are looked up by filesystem_id which fails every time. Synapse uses a lot of CPU regenerating the thumbnails all the time. When dynamic_thumbnails option is then turned off Synapse can't find the thumbnails generated when the said option was turned on and just returns 404.

Steps to reproduce

  • set dynamic_thumbnails to true
  • try to load a thumbnail of a remote media
  • now set dynamic_thumbnails to false
  • try to load the same thumbnail of the remote media
  • it returns 404

Local media thumbnails are generated and retrieved just fine. For example, I'm trying to retrieve a media with id zOUYnsGQKLQypwBoQkRpnkXh and corresponding filesystem id WFzEWtlvXlCaZShcaIthgzrG. The media itself is stored at /var/lib/matrix-synapse/media/bc/remote_content/matrix.org/WF/zE/WtlvXlCaZShcaIthgzrG and I can retrieve it from my server. The thumbnail should be located at /var/lib/matrix-synapse/media/bc/remote_thumbnail/matrix.org/WF/zE/WtlvXlCaZShcaIthgzrG/30-30-image-png but it's actually stored at /var/lib/matrix-synapse/media/bc/remote_thumbnail/matrix.org/zO/UY/nsGQKLQypwBoQkRpnkXh/30-30-image-png. Why do we even need two identifiers? It confuses things pretty much (it's hard to find an offending file in the storage if you know its id without looking it up in the database) and doesn't add any security.

Version information

  • Homeserver: own server
  • Version: 1.3.1+buster1

  • Install method: apt repository

  • Platform: Debian stable amd64
@rkfg
Copy link
Contributor Author

rkfg commented Aug 27, 2019

This seems to fix it:

diff --git a/synapse/rest/media/v1/media_repository.py b/synapse/rest/media/v1/media_repository.py
index cf5759e9a6..28073dec4c 100644
--- a/synapse/rest/media/v1/media_repository.py
+++ b/synapse/rest/media/v1/media_repository.py
@@ -526,7 +526,7 @@ def generate_remote_exact_thumbnail(
             try:
                 file_info = FileInfo(
                     server_name=server_name,
-                    file_id=media_id,
+                    file_id=file_id,
                     thumbnail=True,
                     thumbnail_width=t_width,
                     thumbnail_height=t_height,

@richvdh
Copy link
Member

richvdh commented Aug 27, 2019

thanks @rkfg. Do you know if this is a recent regression, or if it has always always been a problem?

Would you be able to make a PR for your patch? (A regression test would be very helpful too, though I appreciate that may need more time than you have.)

@rkfg
Copy link
Contributor Author

rkfg commented Aug 27, 2019

Eh, no idea if it's recent but I noticed that thumbnails didn't load after I cleaned up the cache and database. Didn't have time to dig deeper back then and enabling dynamic thumbnails "fixed" it. But recently I noticed that avatars take too much time to load like 1-1.5 minutes (if they were not cached) so I decided to take a look. I suspected S3 being slow at first but then noticed a big CPU usage and in the end solved it.

I'd like to make a PR but it needs too much hassle for a one-liner so I'm not really excited to do that. Feel free to make that patch yourself, I don't even need to be mentioned. The fix is too trivial. Hope it will help you reduce the load on matrix.org.

@richvdh
Copy link
Member

richvdh commented Aug 27, 2019

I'd like to make a PR but it needs too much hassle for a one-liner so I'm not really excited to do that.

understood, but that means it has to wait for someone else to have time to do it :(

@richvdh richvdh added z-bug (Deprecated Label) good first issue labels Aug 27, 2019
@richvdh
Copy link
Member

richvdh commented Aug 27, 2019

Why do we even need two identifiers? It confuses things pretty much (it's hard to find an offending file in the storage if you know its id without looking it up in the database) and doesn't add any security.

I think basically because the media_id is made up by the remote server, and it's not necessarily a given that it contains only characters which are valid on the local filesystem. @erikjohnston: this seems to have come from #2767: do you remember the logic?

@richvdh
Copy link
Member

richvdh commented Aug 27, 2019

I think basically because the media_id is made up by the remote server, and it's not necessarily a given that it contains only characters which are valid on the local filesystem. @erikjohnston: this seems to have come from #2767: do you remember the logic?

Oh, no. It's because each piece of media can have several thumbnails, and each one needs its own filesystem_id.

(edit: this is bogus)

@erikjohnston
Copy link
Member

Why do we even need two identifiers? It confuses things pretty much (it's hard to find an offending file in the storage if you know its id without looking it up in the database) and doesn't add any security.

I think basically because the media_id is made up by the remote server, and it's not necessarily a given that it contains only characters which are valid on the local filesystem. @erikjohnston: this seems to have come from #2767: do you remember the logic?

I think it was mostly that, possibly also so that we could decide where we wanted it on disk (e.g. dedupe the media, or something)

@richvdh
Copy link
Member

richvdh commented Aug 27, 2019

@rkfg I'm very confused here. I agree that something looks bogus, but I cannot reproduce the issue at all. Any hints?

@richvdh
Copy link
Member

richvdh commented Aug 27, 2019

(I have dynamic_thumbnails set to false, but my server seems quite happy to serve thumbnails of remote media)

@richvdh
Copy link
Member

richvdh commented Aug 27, 2019

As far as I can tell, the file that you suggest patching (generate_remote_exact_thumbnail) is only called when dynamic_thumbnails is turned on, so the fix doesn't seem to match the described symptoms?

@rkfg
Copy link
Contributor Author

rkfg commented Aug 27, 2019

@rkfg I'm very confused here. I agree that something looks bogus, but I cannot reproduce the issue at all. Any hints?

Do the thumbnail files appear at the correct location? Maybe there were changes between my version and current develop, I can't really test. Is this happening on 1.3.1?

@richvdh
Copy link
Member

richvdh commented Aug 27, 2019

yes, they appear at the location given by the filesystem_id.

I'm testing on current develop, but I don't think this code has been touched in months.

@rkfg
Copy link
Contributor Author

rkfg commented Aug 27, 2019

As far as I can tell, the file that you suggest patching (generate_remote_exact_thumbnail) is only called when dynamic_thumbnails is turned on, so the fix doesn't seem to match the described symptoms?

It's called when that option is off for me, I cleared the remote media tables and removed the files after applying the fix and it worked. The thumbnails were stored under the wrong names before, I checked the timestamps to make sure they're fresh. Does this fix break them for you then? Something has to change after applying it!

@richvdh
Copy link
Member

richvdh commented Aug 27, 2019

I'm just reluctant to change code without understanding what we're changing.

generate_remote_exact_thumbnail is only called by ThumbnailResource._select_or_generate_remote_thumbnail (https://github.com/matrix-org/synapse/blob/develop/synapse/rest/media/v1/thumbnail_resource.py#L189), which is only called by _async_render_GET (https://github.com/matrix-org/synapse/blob/develop/synapse/rest/media/v1/thumbnail_resource.py#L73), and only when dynamic_thumbnails is on.

Is it possible that the thumbnails were generated in the wrong place when dynamic_thumbnails was enabled?

@rkfg
Copy link
Contributor Author

rkfg commented Aug 27, 2019

It's possible, yes. Now that I think about it I really didn't test that case. When there's a database record already for a certain file it won't be regenerated even if the actual file is missing, correct? Then I already had those incorrectly named thumbnails and nuking the tables and data hid the issue.

@rkfg
Copy link
Contributor Author

rkfg commented Aug 28, 2019

I fixed the issue description and steps to reproduce.

@rkfg rkfg changed the title Remote thumbnails are stored to the wrong location Remote dynamic thumbnails are stored to the wrong location Aug 28, 2019
@rkfg rkfg changed the title Remote dynamic thumbnails are stored to the wrong location Remote dynamic thumbnails are stored at the wrong location Aug 28, 2019
@richvdh
Copy link
Member

richvdh commented Sep 2, 2019

hopefully fixed by #5915 then. Thanks all.

@richvdh richvdh closed this as completed Sep 2, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
z-bug (Deprecated Label)
Projects
None yet
Development

No branches or pull requests

3 participants