PR to run functional + perf tests on unpack_trees() optimization #16

Closed
wants to merge 4 commits into from

Commits on Aug 6, 2018

  1. unpack-trees: add performance tracing

    We're going to optimize unpack_trees() a bit in the following
    patches. Let's add some tracing to measure how long it takes before
    and after. This is the baseline ("git checkout -" on gcc.git, 80k
    files in the worktree):
    
        0.018239226 s: read cache .git/index
        0.052541655 s: preload index
        0.001537598 s: refresh index
        0.168167768 s: unpack trees
        0.002897186 s: update worktree after a merge
        0.131661745 s: repair cache-tree
        0.075389117 s: write index, changed mask = 2a
        0.111702023 s: unpack trees
        0.000023245 s: update worktree after a merge
        0.111793866 s: diff-index
        0.587933288 s: git command: /home/pclouds/w/git/git checkout -
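
The rows above pair an elapsed time in seconds, printed with nanosecond precision, with a label for the phase. A minimal standalone sketch of that kind of measurement follows. It is not git's tracing code: `now_ns()`, `format_elapsed()` and `trace_phase()` are hypothetical names, and a POSIX `clock_gettime()` is assumed.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper: monotonic clock reading in nanoseconds. */
uint64_t now_ns(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}

/*
 * Hypothetical helper: format an elapsed time as
 * "<seconds>.<9 digits> s: <label>", the layout used in the table above.
 * Returns what snprintf() returns.
 */
int format_elapsed(char *buf, size_t len, uint64_t elapsed_ns,
                   const char *label)
{
    assert(buf != NULL && label != NULL);
    return snprintf(buf, len, "%" PRIu64 ".%09" PRIu64 " s: %s",
                    elapsed_ns / 1000000000ULL,
                    elapsed_ns % 1000000000ULL, label);
}

/* Run one phase and write its timing line into buf. */
void trace_phase(char *buf, size_t len, const char *label,
                 void (*phase)(void))
{
    uint64_t start = now_ns();

    phase();
    format_elapsed(buf, len, now_ns() - start, label);
}
```

Wrapping each phase in a call like `trace_phase()` before and after a change gives directly comparable rows, which is how the tables in the following commits are laid out.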
    
    Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
    pclouds authored and benpeart committed Aug 6, 2018
    f64cebe
  2. unpack-trees: optimize walking same trees with cache-tree

    In order to merge one or many trees with the index, unpack-trees code
    walks multiple trees in parallel with the index and performs an n-way
    merge. If we find out at the start of a directory that all trees are the
    same (by comparing OID) and cache-tree happens to be available for
    that directory as well, we could avoid walking the trees because we
    already know what these trees contain: it's flattened in what's called
    "the index".
    
    The upside is of course a lot less I/O since we can potentially skip
    lots of trees (think subtrees). We also save CPU because we don't have
    to inflate objects and then apply deltas. The downside is of course more
    fragile code, since the logic in some functions is now duplicated
    elsewhere.
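
The shortcut described above can be sketched with toy types. This is an illustration only, not the patch itself: `toy_oid`, `toy_cache_tree`, `all_trees_match_cache_tree()` and `copy_from_index()` are hypothetical names, and the convention that a negative `entry_count` marks an invalid cache-tree node is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_OID_LEN 20

/* Toy object id: a fixed-size hash. */
struct toy_oid {
    unsigned char hash[TOY_OID_LEN];
};

/* Toy cache-tree node for one directory. */
struct toy_cache_tree {
    int entry_count;     /* assumption: negative means "invalid" */
    struct toy_oid oid;  /* tree id the index entries below it add up to */
};

static int toy_oid_eq(const struct toy_oid *a, const struct toy_oid *b)
{
    return memcmp(a->hash, b->hash, TOY_OID_LEN) == 0;
}

/*
 * Return 1 when every tree being merged has the same id as a valid
 * cache-tree node for this directory, so the directory's content is
 * already available, flattened, in the index.
 */
int all_trees_match_cache_tree(const struct toy_oid *trees, size_t n,
                               const struct toy_cache_tree *ct)
{
    size_t i;

    if (!ct || ct->entry_count < 0 || n == 0)
        return 0;
    for (i = 0; i < n; i++)
        if (!toy_oid_eq(&trees[i], &ct->oid))
            return 0;
    return 1;
}

/*
 * Copy `count` entries starting at `pos` from the source index into the
 * result instead of walking tree objects. Returns the new result size.
 */
size_t copy_from_index(const char *const *src, size_t pos, size_t count,
                       const char **dst, size_t dst_nr)
{
    size_t i;

    for (i = 0; i < count; i++)
        dst[dst_nr++] = src[pos + i];
    return dst_nr;
}
```

When the check fails for a directory, the caller would fall back to the normal parallel tree walk for that directory.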
    
    "checkout -" with this patch on gcc.git:
    
        baseline      new
      --------------------------------------------------------------------
        0.018239226   0.019365414 s: read cache .git/index
        0.052541655   0.049605548 s: preload index
        0.001537598   0.001571695 s: refresh index
        0.168167768   0.049677212 s: unpack trees
        0.002897186   0.002845256 s: update worktree after a merge
        0.131661745   0.136597522 s: repair cache-tree
        0.075389117   0.075422517 s: write index, changed mask = 2a
        0.111702023   0.032813253 s: unpack trees
        0.000023245   0.000022002 s: update worktree after a merge
        0.111793866   0.032933140 s: diff-index
        0.587933288   0.398924370 s: git command: /home/pclouds/w/git/git
    
    Another measurement, from Ben running "git checkout" with over 500k
    trees (on the whole series):
    
        baseline        new
      ----------------------------------------------------------------------
        0.535510167     0.556558733     s: read cache .git/index
        0.3057373       0.3147105       s: initialize name hash
        0.0184082       0.023558433     s: preload index
        0.086910967     0.089085967     s: refresh index
        7.889590767     2.191554433     s: unpack trees
        0.120760833     0.131941267     s: update worktree after a merge
        2.2583504       2.572663167     s: repair cache-tree
        0.8916137       0.959495233     s: write index, changed mask = 28
        3.405199233     0.2710663       s: unpack trees
        0.000999667     0.0021554       s: update worktree after a merge
        3.4063306       0.273318333     s: diff-index
        16.9524923      9.462943133     s: git command: git.exe checkout
    
    This command calls unpack_trees() twice, first for a 2-way merge and
    then for a 1-way merge. In both calls, "unpack trees" time is
    reduced to about one third or less. The overall time reduction is not
    that impressive, of course, because index operations take a big chunk.
    And there's that repair cache-tree line.
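
As a quick check of the ratios, the remaining fraction of "unpack trees" time can be computed from the two tables. This sketch uses a hypothetical helper, `remaining_fraction()`; the numbers passed to it are copied from the rows above.

```c
#include <assert.h>
#include <stdio.h>

/* Fraction of the baseline time that is still spent after the change. */
double remaining_fraction(double baseline_s, double new_s)
{
    assert(baseline_s > 0.0);
    return new_s / baseline_s;
}

/* Print the fractions for the "unpack trees" and total rows. */
void print_unpack_trees_fractions(void)
{
    printf("gcc.git 2-way: %.2f\n",
           remaining_fraction(0.168167768, 0.049677212));
    printf("gcc.git 1-way: %.2f\n",
           remaining_fraction(0.111702023, 0.032813253));
    printf("500k    2-way: %.2f\n",
           remaining_fraction(7.889590767, 2.191554433));
    printf("500k    1-way: %.2f\n",
           remaining_fraction(3.405199233, 0.2710663));
    printf("gcc.git total: %.2f\n",
           remaining_fraction(0.587933288, 0.398924370));
    printf("500k    total: %.2f\n",
           remaining_fraction(16.9524923, 9.462943133));
}
```

On gcc.git both calls keep a little under 0.30 of their baseline time. On the 500k-tree measurement the second call drops much further, while the whole command still keeps more than half of its original time.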
    
    Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
    pclouds authored and benpeart committed Aug 6, 2018
    f9c2ff9
  3. unpack-trees: reduce malloc in cache-tree walk

    This is a micro optimization that probably only shines on repos with
    deep directory structure. Instead of allocating and freeing a new
    cache_entry in every iteration, we reuse the last one and only update
    the parts that are new each iteration.
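
The reuse pattern can be sketched as follows. This is a toy stand-in rather than the real cache_entry handling: `toy_entry`, `toy_entry_set()` and `toy_entry_release()` are hypothetical names, and only the path is modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Toy entry whose name buffer is kept across loop iterations. */
struct toy_entry {
    size_t alloc;    /* bytes available in name */
    size_t namelen;  /* length of the current path */
    char *name;
};

/*
 * Point the reused entry at a new path. The buffer grows only when the
 * new path does not fit, so most iterations do no allocation at all.
 * Returns 0 on success, -1 if the allocation fails.
 */
int toy_entry_set(struct toy_entry *e, const char *path)
{
    size_t len;

    assert(e != NULL && path != NULL);
    len = strlen(path);
    if (len + 1 > e->alloc) {
        size_t new_alloc = (len + 1) * 2;
        char *p = realloc(e->name, new_alloc);

        if (!p)
            return -1;
        e->name = p;
        e->alloc = new_alloc;
    }
    memcpy(e->name, path, len + 1);
    e->namelen = len;
    return 0;
}

/* Free the buffer once, after the whole walk is done. */
void toy_entry_release(struct toy_entry *e)
{
    free(e->name);
    e->name = NULL;
    e->alloc = 0;
    e->namelen = 0;
}
```

A walk would call `toy_entry_set()` once per path and `toy_entry_release()` once at the end, in place of one allocation and one free per iteration.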
    
    Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
    pclouds authored and benpeart committed Aug 6, 2018
    e601365
  4. unpack-trees: cheaper index update when walking by cache-tree

    With the new cache-tree walk, we mostly avoid I/O (due to odb access),
    so the code mostly becomes a loop of "check this, check that, add the
    entry to the index". We could skip a couple of checks in this giant loop
    to go faster:
    
    - We know here that we're copying entries from the source index to the
      result one. All paths in the source index must have been validated
      at load time already (and we're not taking strange paths from tree
      objects) which means we can skip verify_path() without compromise.
    
    - We also know that D/F conflicts can't happen for all these entries
      (since cache-tree and all the trees are the same) so we can skip
      that as well.
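
The two skipped checks can be pictured as opt-out flags on an "add entry" routine. The sketch below uses a toy index and hypothetical names (`TOY_SKIP_VERIFY_PATH`, `TOY_SKIP_DFCHECK`, `toy_add_entry()`); its path and D/F checks are deliberately simplistic and are not git's implementations.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_SKIP_VERIFY_PATH (1u << 0)  /* caller vouches for the path */
#define TOY_SKIP_DFCHECK     (1u << 1)  /* caller rules out D/F conflicts */
#define TOY_INDEX_MAX 16

struct toy_index {
    const char *paths[TOY_INDEX_MAX];
    size_t nr;
    unsigned verify_calls;   /* counters, to show what was skipped */
    unsigned dfcheck_calls;
};

/* Toy path check: reject empty paths, absolute paths and "..". */
static int toy_verify_path(struct toy_index *idx, const char *path)
{
    idx->verify_calls++;
    return path[0] != '\0' && path[0] != '/' && !strstr(path, "..");
}

/* Toy D/F check: does "a" collide with an existing "a/b", or vice versa? */
static int toy_has_df_conflict(struct toy_index *idx, const char *path)
{
    size_t i, len = strlen(path);

    idx->dfcheck_calls++;
    for (i = 0; i < idx->nr; i++) {
        const char *p = idx->paths[i];
        size_t plen = strlen(p);

        if (plen > len && !strncmp(p, path, len) && p[len] == '/')
            return 1;
        if (len > plen && !strncmp(p, path, plen) && path[plen] == '/')
            return 1;
    }
    return 0;
}

/* Append a path, running each check unless its skip flag is set. */
int toy_add_entry(struct toy_index *idx, const char *path, unsigned flags)
{
    assert(idx != NULL && path != NULL);
    if (!(flags & TOY_SKIP_VERIFY_PATH) && !toy_verify_path(idx, path))
        return -1;
    if (!(flags & TOY_SKIP_DFCHECK) && toy_has_df_conflict(idx, path))
        return -1;
    if (idx->nr >= TOY_INDEX_MAX)
        return -1;
    idx->paths[idx->nr++] = path;
    return 0;
}
```

Passing both flags is only safe under the conditions given in the list above: the entries come from an index that was validated at load time, and the cache-tree matches every tree being merged.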
    
    This gives rather nice speedups for the "unpack trees" rows, where
    "unpack trees" time is now cut in half compared to when
    traverse_by_cache_tree() was added, or 1/7 of the original "unpack
    trees" time.
    
       baseline      cache-tree    this patch
     --------------------------------------------------------------------
       0.018239226   0.019365414   0.020519621 s: read cache .git/index
       0.052541655   0.049605548   0.048814384 s: preload index
       0.001537598   0.001571695   0.001575382 s: refresh index
       0.168167768   0.049677212   0.024719308 s: unpack trees
       0.002897186   0.002845256   0.002805555 s: update worktree after a merge
       0.131661745   0.136597522   0.134891617 s: repair cache-tree
       0.075389117   0.075422517   0.074832291 s: write index, changed mask = 2a
       0.111702023   0.032813253   0.008616479 s: unpack trees
       0.000023245   0.000022002   0.000026630 s: update worktree after a merge
       0.111793866   0.032933140   0.008714071 s: diff-index
       0.587933288   0.398924370   0.380452871 s: git command: /home/pclouds/w/git/git
    
    The total saving from this new patch looks even less impressive, now
    that time spent in unpacking trees is so small, which is why the next
    attempt should be on that "repair cache-tree" line.
    
    Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
    pclouds authored and benpeart committed Aug 6, 2018
    5141476