Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prevent hitting the directory limit when creating image previews (#13091) #16747

Closed
wants to merge 4 commits into from
Closed

Conversation

gbmaster
Copy link

@gbmaster gbmaster commented Aug 15, 2019

Fix #13091 #14169

When creating image previews, the previous code was creating a folder in the "preview" folder using the file ID as a name. When the number of folders exceeds 32767, it is impossible to create further preview on some filesystems because of the "Too many links" error.
I changed the code in order to split the file ID into digits and using them to create a tree of folders, preventing the hit of the directory limit. This fixes the issue experienced by @giraudeo on FreeBSD and by me on OpenBSD.

Please consider this pull request.

)

Signed-off-by: Aldo Mazzeo <aldo.mazzeo@gmail.com>
@kesselb kesselb added this to the Nextcloud 18 milestone Aug 15, 2019
@kesselb kesselb added 3. to review Waiting for reviews enhancement labels Aug 15, 2019
@kesselb kesselb requested a review from rullzer August 15, 2019 10:36
Signed-off-by: Aldo Mazzeo <aldo.mazzeo@gmail.com>
Signed-off-by: Aldo Mazzeo <aldo.mazzeo@gmail.com>
@gbmaster
Copy link
Author

@kesselb looks like the CI failed, but I don't think that it's related to my changes: what do you think about it?

@kesselb
Copy link
Contributor

kesselb commented Sep 30, 2019

Index: tests/lib/Preview/GeneratorTest.php
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
--- tests/lib/Preview/GeneratorTest.php	(revision cf37455b0c67f099182414e146343f0e323dd7f4)
+++ tests/lib/Preview/GeneratorTest.php	(date 1569879289787)
@@ -90,7 +90,7 @@
 
 		$previewFolder = $this->createMock(ISimpleFolder::class);
 		$this->appData->method('getFolder')
-			->with($this->equalTo(42))
+			->with($this->equalTo('a/1/d/0/c/6/e'))
 			->willReturn($previewFolder);
 
 		$maxPreview = $this->createMock(ISimpleFile::class);
@@ -138,11 +138,14 @@
 
 		$previewFolder = $this->createMock(ISimpleFolder::class);
 		$this->appData->method('getFolder')
-			->with($this->equalTo(42))
+			->withConsecutive(
+				[$this->equalTo('a/1/d/0/c/6/e')],
+				[$this->equalTo(42)]
+			)
 			->willThrowException(new NotFoundException());
 
 		$this->appData->method('newFolder')
-			->with($this->equalTo(42))
+			->with($this->equalTo('a/1/d/0/c/6/e'))
 			->willReturn($previewFolder);
 
 		$this->config->method('getSystemValue')
@@ -301,7 +304,7 @@
 
 		$previewFolder = $this->createMock(ISimpleFolder::class);
 		$this->appData->method('getFolder')
-			->with($this->equalTo(42))
+			->with($this->equalTo('a/1/d/0/c/6/e'))
 			->willReturn($previewFolder);
 
 		$previewFolder->method('getDirectoryListing')
@@ -388,7 +391,7 @@
 
 		$previewFolder = $this->createMock(ISimpleFolder::class);
 		$this->appData->method('getFolder')
-			->with($this->equalTo(42))
+			->with($this->equalTo('a/1/d/0/c/6/e'))
 			->willReturn($previewFolder);
 
 		$maxPreview = $this->createMock(ISimpleFile::class);

@gbmaster the tests need to be adjusted for the new folder ids.

@gbmaster
Copy link
Author

gbmaster commented Oct 1, 2019

@kesselb the remaining tests are failing mostly because the amount of files in the preview folder is different from before. It is failing at

/drone/src/tests/lib/Preview/BackgroundCleanupJobTest.php:121
/drone/src/tests/lib/Preview/BackgroundCleanupJobTest.php:143

is there any elegant way on how to fix these? I am failing to find one, at the moment.

@kesselb
Copy link
Contributor

kesselb commented Oct 3, 2019

I am failing to find one, at the moment.

Me too 🙈

Signed-off-by: Aldo Mazzeo <aldo.mazzeo@gmail.com>
@gbmaster
Copy link
Author

@kesselb I managed to write a function that returns the amount of files a preview has been generated for. However, BackgroundCleanupJob is still failing in deleting the previews when its file is removed. I tried to fix the getFolder statement, but it seems to me that the query should be adjusted as well.
I am not confident with the NC database, so I'm kinda flying blind here: do you have any suggestion on how this code should be adjusted?

@pousterlus
Copy link

Hello,

I'm facing the same issue there : Too many links (FreeBSD with "freebsd-ufs" filesystem).
I don't understand if you need someone to test your fix out, if so I would be happy to help you help us :)

Regards,

--
Leo.

@gbmaster
Copy link
Author

gbmaster commented Nov 4, 2019

Hello,

I'm facing the same issue there : Too many links (FreeBSD with "freebsd-ufs" filesystem).
I don't understand if you need someone to test your fix out, if so I would be happy to help you help us :)

Regards,

--
Leo.

Hi @pousterlus
I'm trying to enhance my fix, as it's still failing to correctly handle the case when files are deleted.

@pousterlus
Copy link

Understood !
I'm surprised there's not so much people complaining about this "bug" : It seems to me that for a cloud solution, 32k previews isn't so big.

--
Léo.

@kesselb
Copy link
Contributor

kesselb commented Nov 4, 2019

I'm surprised there's not so much people complaining about this "bug" : It seems to me that for a cloud solution, 32k previews isn't so big.

https://stackoverflow.com/a/466596

@pousterlus
Copy link

pousterlus commented Nov 4, 2019

I'm surprised there's not so much people complaining about this "bug" : It seems to me that for a cloud solution, 32k previews isn't so big.

https://stackoverflow.com/a/466596

Yeah, obviously, I wasn't talking about the filesystems limitations, but I meant Nextcloud's implementation preview system, as a cloud solution is likely to store years of unnecessary pictures and backups.

--
Léo.

@kesselb
Copy link
Contributor

kesselb commented Nov 4, 2019

Yeah, obviously, I wasn't talking about the filesystems limitations, but I meant Nextcloud's implementation preview system, as a cloud solution is likely to store years of unnecessary pictures and backups.

How is that related to this pull request? This pull requests puts previews into subfolders to workaround the 32k preview limit on ufs. Does not happen on ext3/ext4/etc...

@pousterlus
Copy link

How is that related to this pull request? This pull requests puts previews into subfolders to workaround the 32k preview limit on ufs. Does not happen on ext3/ext4/etc...

Now I understand it's not a Nextcloud bug but a filesystem limitation I wasn't aware of.
That saif, I'm not the only one having ufs as a filesystem, and if the this portion of code is precisely designed to "Prevent hitting the directory limit when creating image previews" on ufs filesystems, I think it's worthwhile to take a look at it.

--
Léo.

@kesselb
Copy link
Contributor

kesselb commented Nov 7, 2019

as it's still failing to correctly handle the case when files are deleted.

Unfortunately 😞 It's somehow related to the way how we detect previews of deleted files.


sqlite> SELECT "a"."name", "a".* FROM "oc_filecache" "a" LEFT JOIN "oc_filecache" "b" ON "a"."name" = "b"."fileid" WHERE ("b"."fileid" IS NULL) AND ("a"."parent" = 1680);
0|1681|36|appdata_ocpkufxtc5ts/preview/0|dde18ba4e6977ea7a38467e4eb9c4ab0|1680|0|2|1|-1|1573153930|1573153930|0|0|5dc46c8ad465e|31|
2|1733|36|appdata_ocpkufxtc5ts/preview/2|83deedd67bf687c12e3af53db566d9fd|1680|2|2|1|-1|1573153932|1573153932|0|0|5dc46c8c4a17e|31|
6|1751|36|appdata_ocpkufxtc5ts/preview/6|2a3607f616e3e9bef4865395e4fa8a29|1680|6|2|1|-1|1573153932|1573153932|0|0|5dc46c8cbf9b3|31|
7|1690|36|appdata_ocpkufxtc5ts/preview/7|1c5cbb7057a0c2fabf04726b9f6468ef|1680|7|2|1|-1|1573153931|1573153931|0|0|5dc46c8b1fbf6|31|
8|1724|36|appdata_ocpkufxtc5ts/preview/8|e46321a7abf6c7f4b428ee9534917f88|1680|8|2|1|-1|1573153932|1573153932|0|0|5dc46c8c10da6|31|
9|1699|36|appdata_ocpkufxtc5ts/preview/9|7790a9dc8f30a61b0e9f26d66f61ac3a|1680|9|2|1|-1|1573153931|1573153931|0|0|5dc46c8b581a2|31|
b|1742|36|appdata_ocpkufxtc5ts/preview/b|b2b754d42a5ea7d9a7ef17a6457d183b|1680|b|2|1|-1|1573153932|1573153932|0|0|5dc46c8c851ee|31|
e|1768|36|appdata_ocpkufxtc5ts/preview/e|9b13c7364895895c755082102d6a5504|1680|e|2|1|-1|1573153933|1573153933|0|0|5dc46c8d40071|31|

Somehow it tries to delete the first folder instead of the matching preview. I have no idea right now how to traverse from the first folder to the matching preview file. It's a "convention" that the folder name is the id of the source file.

@gbmaster
Copy link
Author

gbmaster commented Nov 7, 2019

Yeah, @kesselb . I think we're very close... the query needs to be readjusted somehow... I'm still struggling to find a decent solution :(

@kesselb
Copy link
Contributor

kesselb commented Nov 7, 2019

I think we need to include the original fileid again:

implode('/', str_split(substr(md5($file->getId()), 0, 6))) . '/' . $file->getId();

Then change the above query to select every row that matches appdata_(\w*)/preview(?>/.){6}/(?<fileid>.*)/ (e.g appdata_ocpkufxtc5ts/preview/5/4/f/f/9/e/9534/4096-4096-max.png). Take the fileid to join oc_filecache as b again and return the rows where b.fileid is null.

I don't know if this is possible with sql (and works for sqlite, oracle, postgres and mysql) and is somehow performant.

This was referenced Dec 11, 2019
@rullzer rullzer mentioned this pull request Dec 19, 2019
18 tasks
@rullzer rullzer modified the milestones: Nextcloud 18, Nextcloud 19 Dec 19, 2019
@rullzer
Copy link
Member

rullzer commented Dec 19, 2019

I have had time to think about this. But we should abstract more. I will tackel this once we branch off. If it is not to intrusive we can backport.

@pousterlus

This comment has been minimized.

@SpamReceiver
Copy link
Contributor

SpamReceiver commented Jan 8, 2020

How is that related to this pull request? This pull requests puts previews into subfolders to workaround the 32k preview limit on ufs. Does not happen on ext3/ext4/etc...

@kesselb
In that thread, there's someone who's facing this issue with ext'4 filesystem...
rullzer/previewgenerator#121

--
Léo.

This comment is not off-topic because ext3/ext4 may have a limit on the number of sub-directories, too.
In effect, the same problem is caused as reported in the original post.

@kesselb
Copy link
Contributor

kesselb commented Jan 8, 2020

This comment is not off-topic because ext3/ext4 may have a limit on the number of sub-directories

Oh dear. We are already beyond this point. All previews in on folder is bad. That's probably also happening for FAT16, FAT32, NTFS, ReiserFS, ZFS, add your file system here. The solution you (and others) proposed is accepted. We are discussing a potential patch already (this pull request). Turns out putting previews in sub folders will break the current way how previews are removed if a file is deleted.

I will tackel this once we branch off. If it is not to intrusive we can backport.

He will look into if the development window for Nextcloud 19 starts. That's sometime after Nextcloud 18 has been released.

If you have the same issue please use 👍 (on the first post). The solution will put previews in sub folders. That should work for most file systems.

Contributing to Nextcloud is appreciated. You can contribute in many ways like triaging issues, submitting patches, helping to specify features, suggest new features or develop apps. Check https://nextcloud.com/contribute/ for more details 😃

@SpamReceiver
Copy link
Contributor

Somehow it tries to delete the first folder instead of the matching preview. I have no idea right now how to traverse from the first folder to the matching preview file. It's a "convention" that the folder name is the id of the source file.

Just as an idea:
What about a new database table with two columns: fileid | preview-folder-id
This table can be checked in the backgound cleanup job when searching for unlinked previews.

Other idea:
React on the delete event immediately, don't wait for a cleanup job.

This was referenced Apr 4, 2020
@gbmaster
Copy link
Author

gbmaster commented Apr 9, 2020

Abandoning in favor of #19214

@gbmaster gbmaster closed this Apr 9, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3. to review Waiting for reviews enhancement
Projects
None yet
Development

Successfully merging this pull request may close these issues.

FreeBSD UFS2 file per directory limit
6 participants