-
Notifications
You must be signed in to change notification settings - Fork 794
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix lfs_file_rawseek performance issue #632
Conversation
Hi @robekras, thanks for putting this optimization together, will bring this in next minor release. |
Ah sorry, I missed this and taking a second look it looks like there's some issues. I had assumed CI had passed but it looks like GitHub Actions broke again :/ I'm a bit nervous that this solution make not work in the general case, since we are skipping that file_flush, but CI should quickly determine if that's a problem. I've gone ahead and touched your commit to retrigger CI, sorry if that causes any issues.
Hmm, this would be a better way to enable this. In theory you can compute the current block's offset from the position in the file with lfs_off_t off = npos;
int index = lfs_ctz_index(lfs, &off);
// off is now the offset in the current block
// index is the block number in the list, but that's not important here |
This should fix the performance issue if a new seek position belongs to currently cached data. This avoids unnecessary rereads of file data.
Ok apparently GitHub Actions don't run if the PR is not up to date with master. That's... new and annoying... |
I think I'm a little been lost now with all the messages here ... I'm not an expert in working with github, and it would be no problem for me, if I'm not mentioned as contributor. BTW, the mentioned performance issue is here: #667. I hope it is understandable |
Hi @robekras, I went ahead and added a commit that implements the full idea outlined in #631 (comment) in 4484165, that is, This mostly involved It's worth noting that this optimization only works when reading. When writing a file we always need to flush the cache since we need to make sure we writing out any pending data to disk. Any "holes" created by not-flushing the cache would corrupt the underlying data that we should be merging with our writes. Would you be able to check that this provides the same performance benefit in your application with lvgl? I know this is functionally correct, thanks to testing, but don't have good performance measurements yet. |
The basic idea is simple, if we seek to a position in the currently loaded cache, don't flush the cache. Notably this ensures that seek is always as fast or faster than just reading the data. This is a bit tricky since we need to check that our new block and offset match the cache, fortunately we can skip the block check by reevaluating the block index for both the current and new positions. Note this only works whene reading, for writing we need to always flush the cache, or else we will lose the pending write data.
Thanks for the PR! |
This should fix the performance issue if a new seek position belongs to currently cached data.
This avoids unnecessary rereads of file data.
See issue #631