Skip to content

Commit

Permalink
Optimization of the rename case.
Browse files Browse the repository at this point in the history
Rename can be VERY time consuming. One of the reasons is the 4 recursion
level depth of lfs_dir_traverse() seen if a compaction happened during the
rename.

lfs_dir_compact()
  size computation
    [1] lfs_dir_traverse(cb=lfs_dir_commit_size)
         - do 'duplicates and tag update'
       [2] lfs_dir_traverse(cb=lfs_dir_traverse_filter, data=tag[1])
           - Reaching a LFS_FROM_MOVE tag (here)
         [3] lfs_dir_traverse(cb=lfs_dir_traverse_filter, data=tag[1]) <= on 'from' dir
             - do 'duplicates and tag update'
           [4] lfs_dir_traverse(cb=lfs_dir_traverse_filter, data=tag[3])
  followed by the compaction itself:
    [1] lfs_dir_traverse(cb=lfs_dir_commit_commit)
         - do 'duplicates and tag update'
       [2] lfs_dir_traverse(cb=lfs_dir_traverse_filter, data=tag[1])
           - Reaching a LFS_FROM_MOVE tag (here)
         [3] lfs_dir_traverse(cb=lfs_dir_traverse_filter, data=tag[1]) <= on 'from' dir
             - do 'duplicates and tag update'
           [4] lfs_dir_traverse(cb=lfs_dir_traverse_filter, data=tag[3])

Yet, analyse shows that levels [3] and [4] don't perform anything
if the callback is lfs_dir_traverse_filter...

A practical example:

- format and mount a 4KB block FS
- create 100 files of 256 Bytes named "/dummy_%d"
- create a 1024 Byte file "/test"
- rename "/test" "/test_rename"
- create a 1024 Byte file "/test"
- rename "/test" "/test_rename"
This triggers a compaction where lfs_dir_traverse was called 148393 times,
generating 25e6+ lfs_bd_read calls (~100 MB+ of data)

With the optimization, lfs_dir_traverse is now called 3248 times
(589e3 lfs_bds_calls (~2.3MB of data)

=> x 43 improvement...
  • Loading branch information
invoxiaamo authored and geky committed Apr 10, 2022
1 parent 9c7e232 commit c2fa1bb
Showing 1 changed file with 33 additions and 0 deletions.
33 changes: 33 additions & 0 deletions lfs.c
Original file line number Diff line number Diff line change
Expand Up @@ -818,6 +818,39 @@ static int lfs_dir_traverse(lfs_t *lfs,
} else if (lfs_tag_type3(tag) == LFS_FROM_MOVE) {
uint16_t fromid = lfs_tag_size(tag);
uint16_t toid = lfs_tag_id(tag);
// There is a huge room for simple optimization for the rename case
// where we can see up to 4 levels of lfs_dir_traverse recursions
// when compaction happened (for example):
//
// >lfs_dir_compact
// [1] lfs_dir_traverse(cb=lfs_dir_commit_size)
// - do 'duplicates and tag update'
// [2] lfs_dir_traverse(cb=lfs_dir_traverse_filter, data=tag[1])
// - Reaching a LFS_FROM_MOVE tag (here)
// [3] lfs_dir_traverse(cb=lfs_dir_traverse_filter,
// data=tag[1]) <= on 'from' dir
// - do 'duplicates and tag update'
// [4] lfs_dir_traverse(cb=lfs_dir_traverse_filter,
// data=tag[3])
//
// Yet, for LFS_FROM_MOVE when cb == lfs_dir_traverse_filter
// traverse [3] and [4] don't do anything:
// - if [2] is supposed to match 'toid' for duplication, a preceding
// ERASE or CREATE with the same tag id will already have stopped
// the search.
// - if [2] is here to update tag value of CREATE/DELETE attr found
// during the scan, since [3] is looking for LFS_TYPE_STRUCT only
// and call lfs_dir_traverse_filter with LFS_TYPE_STRUCT attr
// wheras lfs_dir_traverse_filter only modify tag on CREATE or
// DELETE. Consequently, cb called from [4] will never stop the
// search from [2].
// - [4] may call lfs_dir_traverse_filter, but with action on a
// tag[3] pointer completely different from tag[1]
if (cb == lfs_dir_traverse_filter) {
continue;
}

// note: buffer = oldcwd dir
int err = lfs_dir_traverse(lfs,
buffer, 0, 0xffffffff, NULL, 0,
LFS_MKTAG(0x600, 0x3ff, 0),
Expand Down

0 comments on commit c2fa1bb

Please sign in to comment.