Fix lfs_file_rawseek performance issue #632

robekras · 2022-01-16T19:52:31Z

This should fix the performance issue if a new seek position belongs to currently cached data.
This avoids unnecessary rereads of file data.

See issue #631

geky · 2022-04-09T05:49:30Z

Hi @robekras, thanks for putting this optimization together. I will bring this in on the next minor release.

lfs.c

geky · 2022-04-09T07:09:07Z

Back to performance. As I wrote in issue #631: This change will only fix the case where the new position is after the old/previous position.

Ah sorry, I missed this, and taking a second look it looks like there are some issues.

I had assumed CI had passed but it looks like GitHub Actions broke again :/

I'm a bit nervous that this solution may not work in the general case, since we are skipping that file_flush, but CI should quickly determine if that's a problem. I've gone ahead and touched your commit to retrigger CI, sorry if that causes any issues.

I think the first bytes of the cached block can contain some other data (block link data?).
This can only be fixed if we have an additional variable within the administrative structure which holds the start offset of the real file data.

Hmm, this would be a better way to enable this. In theory you can compute the current block's offset from the position in the file with lfs_ctz_index:

lfs_off_t off = npos;
int index = lfs_ctz_index(lfs, &off);
// off is now the offset in the current block

// index is the block number in the list, but that's not important here
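To make the offset math concrete, here is a standalone sketch of that computation. It is modeled on the CTZ skip-list layout (block n stores ctz(n)+1 four-byte pointers ahead of its data, block 0 stores none) and is not a copy of the library function: `ctz_index`, `popcount32`, and passing `block_size` directly instead of an `lfs_t` are assumptions made so the snippet compiles on its own.

```c
#include <assert.h>
#include <stdint.h>

/* Portable population count (number of set bits). */
static unsigned popcount32(uint32_t x) {
    unsigned n = 0;
    while (x) {
        x &= x - 1; /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* Hypothetical standalone helper, not the littlefs API.
 * In:  *off is a position in the file.
 * Out: *off is the offset inside the block holding that position,
 *      counted from the start of the block, so it includes the
 *      pointer words stored at the front of the block.
 * Returns the index of that block in the skip-list. */
static uint32_t ctz_index(uint32_t block_size, uint32_t *off) {
    assert(block_size > 8);
    uint32_t size = *off;
    uint32_t b = block_size - 2*4; /* estimate: two pointers per block */
    uint32_t i = size / b;
    if (i == 0) {
        return 0; /* block 0 has no pointers, offset is unchanged */
    }

    /* correct the estimate for the pointers actually stored so far */
    i = (size - 4*(popcount32(i-1)+2)) / b;
    *off = size - b*i - 4*popcount32(i);
    return i;
}
```

With 512-byte blocks, file position 600 lands in block 1 at block offset 92: 88 bytes past the end of block 0, plus the single 4-byte pointer at the front of block 1. This is the "other data" at the start of a cached block mentioned above.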


geky · 2022-04-09T07:13:26Z

Ok apparently GitHub Actions don't run if the PR is not up to date with master. That's... new and annoying...

robekras · 2022-04-09T09:43:58Z

I think I'm a little lost now with all the messages here ...

I'm not an expert in working with GitHub, and it would be no problem for me if I'm not mentioned as a contributor.

BTW, the mentioned performance issue is here: #667. I hope it is understandable.

geky · 2022-04-10T06:42:54Z

Hi @robekras, I went ahead and added a commit that implements the full idea outlined in #631 (comment) in 4484165, that is, lfs_file_seek should now avoid flushes when reading and the offset is in the cache.

This mostly involved lfs_ctz_index and some undocumented subtleties of the file structure/cache in order to pass the tests (though I haven't run the full CI on it yet).

It's worth noting that this optimization only works when reading. When writing a file we always need to flush the cache, since we need to make sure we write out any pending data to disk. Any "holes" created by not flushing the cache would corrupt the underlying data that we should be merging with our writes.

Would you be able to check that this provides the same performance benefit in your application with lvgl?

I know this is functionally correct, thanks to testing, but don't have good performance measurements yet.

The basic idea is simple: if we seek to a position in the currently loaded cache, don't flush the cache. Notably, this ensures that seek is always as fast or faster than just reading the data. This is a bit tricky since we need to check that our new block and offset match the cache. Fortunately, we can skip the block check by reevaluating the block index for both the current and new positions. Note this only works when reading. For writing we always need to flush the cache, or else we will lose the pending write data.
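A rough sketch of that check, for illustration only. `struct file_model`, `seek_in_cache`, `ctz_index`, and `popcount32` are hypothetical stand-ins so the snippet compiles by itself; the real file structure and the merged commit differ in detail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified file state; not littlefs's lfs_file_t. */
struct file_model {
    uint32_t pos;        /* current position in the file */
    uint32_t block_off;  /* offset of pos inside its block */
    uint32_t cache_off;  /* block offset of the first cached byte */
    uint32_t cache_size; /* number of valid bytes in the cache */
    bool writing;        /* true if the cache holds pending writes */
};

static unsigned popcount32(uint32_t x) {
    unsigned n = 0;
    while (x) { x &= x - 1; n++; }
    return n;
}

/* File position -> (block index, offset inside that block),
 * assuming block n stores ctz(n)+1 four-byte pointers up front. */
static uint32_t ctz_index(uint32_t block_size, uint32_t *off) {
    assert(block_size > 8);
    uint32_t size = *off;
    uint32_t b = block_size - 2*4;
    uint32_t i = size / b;
    if (i == 0) {
        return 0;
    }
    i = (size - 4*(popcount32(i-1)+2)) / b;
    *off = size - b*i - 4*popcount32(i);
    return i;
}

/* Returns true and moves the position if the seek can be served from
 * the cache. Returns false, leaving the file untouched, when the
 * caller has to flush and take the slow path. */
static bool seek_in_cache(struct file_model *f, uint32_t block_size,
        uint32_t npos) {
    if (f->writing) {
        return false; /* pending writes must always be flushed */
    }

    uint32_t ooff = f->pos;
    uint32_t oindex = ctz_index(block_size, &ooff);
    uint32_t noff = npos;
    uint32_t nindex = ctz_index(block_size, &noff);

    /* same block, and the new block offset falls inside the cache */
    if (oindex == nindex
            && noff >= f->cache_off
            && noff < f->cache_off + f->cache_size) {
        f->pos = npos;
        f->block_off = noff;
        return true;
    }
    return false;
}
```

Comparing block indices for the old and new positions stands in for comparing block addresses: the cache holds data from the block containing the current position, so equal indices mean the same block. Because the range test is two-sided, a backward seek by one byte also hits the cache, not only forward seeks.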

geky · 2022-04-11T02:52:34Z

Thanks for the PR!

robekras changed the title ~~Update lfs.c~~ Fix lfs_file_rawseek performance issue Jan 17, 2022

geky added performance next minor labels Mar 21, 2022

geky added this to the v2.5 milestone Apr 9, 2022

geky reviewed Apr 9, 2022


geky force-pushed the patch-1 branch from 54d2893 to 48f111a Compare April 9, 2022 07:04

Update lfs.c

a6f01b7

This should fix the performance issue if a new seek position belongs to currently cached data. This avoids unnecessary rereads of file data.

geky force-pushed the patch-1 branch from 48f111a to a6f01b7 Compare April 9, 2022 07:12

geky changed the base branch from master to devel April 9, 2022 17:41

geky force-pushed the patch-1 branch from 4484165 to 425dc81 Compare April 10, 2022 17:47

geky added the v2.5 label Apr 10, 2022

geky merged commit a94fbda into littlefs-project:devel Apr 11, 2022

geky mentioned this pull request Apr 11, 2022

Minor release: v2.5 #669

Merged

robekras mentioned this pull request May 13, 2023

lfs_file_seek() poor performance when seeking backward by one byte with no apparent reason #810

Closed
