Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix lfs_file_rawseek performance issue #632

Merged
merged 2 commits into from
Apr 11, 2022
Merged

Conversation

robekras
Copy link
Contributor

This should fix the performance issue if a new seek position belongs to currently cached data.
This avoids unnecessary rereads of file data.

See issue #631

@robekras robekras changed the title Update lfs.c Fix lfs_file_rawseek performance issue Jan 17, 2022
@geky geky added this to the v2.5 milestone Apr 9, 2022
@geky
Copy link
Member

geky commented Apr 9, 2022

Hi @robekras, thanks for putting this optimization together, will bring this in next minor release.

lfs.c Outdated Show resolved Hide resolved
@geky
Copy link
Member

geky commented Apr 9, 2022

Back to performance. As I wrote in issue #631: This change will only fix the case where the new position is after the old/previous position.

Ah sorry, I missed this and taking a second look it looks like there's some issues.

I had assumed CI had passed but it looks like GitHub Actions broke again :/

I'm a bit nervous that this solution make not work in the general case, since we are skipping that file_flush, but CI should quickly determine if that's a problem. I've gone ahead and touched your commit to retrigger CI, sorry if that causes any issues.


I think the first bytes of the cached block can contain some other date (block link data?).
This can only be fixed, if we would have an additional variable within the administrative structure which hold the start offset of the real file data?

Hmm, this would be a better way to enable this. In theory you can compute the current block's offset from the position in the file with lfs_ctz_index:

lfs_off_t off = npos;
int index = lfs_ctz_index(lfs, &off);
// off is now the offset in the current block

// index is the block number in the list, but that's not important here

This should fix the performance issue if a new seek position belongs to currently cached data.
This avoids unnecessary rereads of file data.
@geky
Copy link
Member

geky commented Apr 9, 2022

Ok apparently GitHub Actions don't run if the PR is not up to date with master. That's... new and annoying...

@robekras
Copy link
Contributor Author

robekras commented Apr 9, 2022

I think I'm a little been lost now with all the messages here ...

I'm not an expert in working with github, and it would be no problem for me, if I'm not mentioned as contributor.

BTW, the mentioned performance issue is here: #667. I hope it is understandable

@geky geky changed the base branch from master to devel April 9, 2022 17:41
@geky
Copy link
Member

geky commented Apr 10, 2022

Hi @robekras, I went ahead and added a commit that implements the full idea outlined in #631 (comment) in 4484165, that is, lfs_file_seek should now avoid flushes when reading and the offset is in the cache.

This mostly involved lfs_ctz_index and some undocumented subtleties of the file structure/cache in order to pass the tests (though I haven't ran the full CI on it yet).

It's worth noting that this optimization only works when reading. When writing a file we always need to flush the cache since we need to make sure we writing out any pending data to disk. Any "holes" created by not-flushing the cache would corrupt the underlying data that we should be merging with our writes.


Would you be able to check that this provides the same performance benefit in your application with lvgl?

I know this is functionally correct, thanks to testing, but don't have good performance measurements yet.

The basic idea is simple, if we seek to a position in the currently
loaded cache, don't flush the cache. Notably this ensures that seek is
always as fast or faster than just reading the data.

This is a bit tricky since we need to check that our new block and
offset match the cache, fortunately we can skip the block check by
reevaluating the block index for both the current and new positions.

Note this only works whene reading, for writing we need to always flush
the cache, or else we will lose the pending write data.
@geky geky added the v2.5 label Apr 10, 2022
@geky geky merged commit a94fbda into littlefs-project:devel Apr 11, 2022
@geky
Copy link
Member

geky commented Apr 11, 2022

Thanks for the PR!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants