More efficient file storage #84

Open
btrask opened this issue Aug 9, 2015 · 2 comments

btrask commented Aug 9, 2015

Surprisingly, our submission bottleneck is the file system, not the database.

Process for adding a file (see the sketch after the list):

  1. Write it to a temporary location
  2. fsync the file
  3. Atomic rename (actually link(2)) to final location
  4. Open the parent directory
  5. fsync the directory
  6. Close the directory
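
For concreteness, a minimal sketch of those six steps in C, assuming POSIX; `add_file` and its path arguments are hypothetical names, not the actual implementation:

```c
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

int add_file(const char *tmppath, const char *dstpath, const char *dstdir,
             const void *buf, size_t len)
{
	/* 1. Write the data to a temporary location. */
	int fd = open(tmppath, O_WRONLY | O_CREAT | O_EXCL, 0644);
	if (fd < 0) return -1;
	if (write(fd, buf, len) != (ssize_t)len) { close(fd); return -1; }

	/* 2. fsync the file so its contents are durable before it
	   becomes visible at the final path. */
	if (fsync(fd) < 0) { close(fd); return -1; }
	close(fd);

	/* 3. "Atomic rename": link(2) the temp name to the final
	   location, then drop the temp name. Unlike rename(2), link(2)
	   fails if dstpath already exists, which suits
	   content-addressed storage. */
	if (link(tmppath, dstpath) < 0) return -1;
	unlink(tmppath);

	/* 4-6. Open the parent directory, fsync it so the new directory
	   entry itself is durable, then close it. */
	int dirfd = open(dstdir, O_RDONLY | O_DIRECTORY);
	if (dirfd < 0) return -1;
	if (fsync(dirfd) < 0) { close(dirfd); return -1; }
	close(dirfd);
	return 0;
}
```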

We don't do any batching, and can't really, because the directories accessed are effectively random (they're chosen by the first byte of the content hash).

Surprisingly again, there's no reason for our on-disk file representation to actually use content addresses. Once we look up the file info in the database, which we always have to do anyway, we might as well use the file ID or some other sequential ID to access the file.

I'm thinking we could write files to a spread of ~100 directories in batches of 1000. So files 1-1000 go in directory 1, 1001-2000 in directory 2, etc. Then it wraps back around, so files 100,001-101,000 land in directory 1 again.
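
A sketch of that mapping, assuming 1-based IDs and the batch size and directory count above (`dir_for_id` is a hypothetical name):

```c
#define BATCH_SIZE 1000 /* files per directory batch */
#define NUM_DIRS   100  /* directories in the spread */

/* Map a sequential file ID to its directory: IDs 1-1000 -> dir 1,
   1001-2000 -> dir 2, ..., wrapping after NUM_DIRS * BATCH_SIZE files,
   so ID 100001 maps back to dir 1. */
static unsigned dir_for_id(unsigned long long id)
{
	return (unsigned)(((id - 1) / BATCH_SIZE) % NUM_DIRS) + 1;
}
```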

For batch submissions (hopefully most submissions, depending on #1), this should cut the per-file syscall overhead from roughly five down to two (the file fsync and the link), plus three for the whole batch (directory open, fsync, and close), since every file in a batch lands in the same directory. And the number of fsyncs would be cut from two per file to one.
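
A sketch of the batched tail end, assuming the temp files were already written and their descriptors kept open; `add_batch` and its arguments are hypothetical:

```c
#include <fcntl.h>
#include <stddef.h>
#include <stdio.h>
#include <unistd.h>

/* Finish a batch of already-written temp files that all map to the same
   destination directory. Per file: fsync + link (plus close/unlink
   housekeeping). Per batch: one directory open/fsync/close. */
int add_batch(const char *dir, int fds[], const char *tmppaths[],
              unsigned long long first_id, size_t count)
{
	char dst[4096];

	for (size_t i = 0; i < count; i++) {
		if (fsync(fds[i]) < 0) return -1; /* the one per-file fsync */
		close(fds[i]);
		snprintf(dst, sizeof(dst), "%s/%llu", dir, first_id + i);
		if (link(tmppaths[i], dst) < 0) return -1;
		unlink(tmppaths[i]);
	}

	/* One directory fsync covers every new entry in the batch. */
	int dirfd = open(dir, O_RDONLY | O_DIRECTORY);
	if (dirfd < 0) return -1;
	if (fsync(dirfd) < 0) { close(dirfd); return -1; }
	close(dirfd);
	return 0;
}
```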


btrask commented Aug 23, 2015

A problem with the above idea of using sequential IDs instead of hashes for internal storage is that we can end up with junk in the file system when a transaction rolls back. Coordinating transactions with the file system so they both get rolled back atomically is hard. (With content addresses a leftover file is mostly harmless, since a retried submission of the same content lands at the same path and can reuse it; an orphaned sequential-ID file is pure junk.)


btrask commented Aug 23, 2015

BTW there is also the idea of storing small files directly in the database. I don't know the exact tipping point where it becomes worth it, but most file systems use 4 KB blocks, and LSM-trees suffer with large blobs during compaction. A reasonable threshold might be anywhere between 128 B and 2 KB.
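
As a sketch, the choice could be a simple size cutoff; the 1 KB value here is just a guess inside that range, and `choose_storage` is a hypothetical name:

```c
#include <stddef.h>

#define INLINE_THRESHOLD 1024 /* hypothetical cutoff in the 128 B - 2 KB range */

enum blob_storage { BLOB_INLINE, BLOB_EXTERNAL };

/* Small blobs go straight into the database; anything above the
   threshold is written to the file system as described above. */
static enum blob_storage choose_storage(size_t len)
{
	return len <= INLINE_THRESHOLD ? BLOB_INLINE : BLOB_EXTERNAL;
}
```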
