Pipelined Implementation of ZSTD_fast (~+5% Speed) #2749

Merged (16 commits) on Sep 9, 2021

@felixhandte felixhandte commented Aug 17, 2021

This PR introduces a new implementation of the ZSTD_fast parser for single-segment compressions. This new match-finder achieves up to 5% speed improvements, and improves compression slightly on average.

Description

If you squint hard enough (and ignore repcodes), the search operation at any given position is broken into 4 stages:

  1. Hash (map position to hash value via input read)
  2. Lookup (map hash value to index via hashtable read)
  3. Load (map index to value at that position via input read)
  4. Compare (determine whether a match exists)

Each of these steps involves a memory read at an address which is computed from the previous step. This means that for each position, these steps must be sequenced and their latencies are cumulative.
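To make the dependency chain concrete, here is a minimal sketch of the four stages for a single position. It is not the code in this PR: `read32`, `hash4`, `search_one`, the `HASH_LOG` table size, and the hash constant are hypothetical names and values chosen for illustration. Each read uses an address produced by the line before it, which is why the latencies add up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HASH_LOG 12u /* hypothetical table size (4096 entries) */

/* Unaligned 4-byte read from the input. */
static uint32_t read32(const uint8_t *p) {
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Multiplicative hash of 4 bytes down to HASH_LOG bits (hypothetical constant). */
static uint32_t hash4(uint32_t v) {
    return (v * 2654435761u) >> (32u - HASH_LOG);
}

/* Runs the four dependent stages at `pos`, then records `pos` in the table.
 * Assumes the table holds only 0 or previously searched positions.
 * Returns the index of an earlier position with the same 4 bytes, or -1. */
static long search_one(const uint8_t *src, size_t len, uint32_t *table, uint32_t pos) {
    uint32_t cur, h, idx, val;
    assert((size_t)pos + 4 <= len);  /* caller guarantees 4 readable bytes */
    cur = read32(src + pos);
    h   = hash4(cur);                /* 1. Hash:   input read            */
    idx = table[h];                  /* 2. Lookup: hashtable read        */
    val = read32(src + idx);         /* 3. Load:   input read at idx     */
    table[h] = pos;
    return (idx < pos && val == cur) ? (long)idx : -1;  /* 4. Compare */
}
```

Calling `search_one` in a plain loop over positions corresponds to the purely sequential schedule.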

Originally, ZSTD_fast simply did each step sequentially:

Pos | Time -->
----|-----------------
N   | 1234
N+1 |     1234
N+2 |         1234
N+3 |             1234

In #1562, @terrelln changed the implementation to work on two positions at a time:

Pos | Time -->
----|-----------------
N   | 1 2 3 4
N+1 |  1 2 3 4
N+2 |         1 2 3 4
N+3 |          1 2 3 4

This PR changes to a different strategy of parallelizing work, and does approximately the following:

R = Repcode Read & Compare
H = Hash Position
T = HashTable Lookup
M = Match Read & Compare

Pos | Time -->
----+-------------------
N   | ... M
N+1 | ...   TM
N+2 |    R H   T M
N+3 |         H    TM
N+4 |           R H   T M
N+5 |                H   ...
N+6 |                  R ...

This is very much analogous to the pipelining of execution in a CPU, and has the same benefits and drawbacks. This approach appears to more successfully parallelize read latencies.
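The reordering can be sketched in a much simplified form as follows. This is an illustration under assumptions, not the loop in this PR: it ignores repcodes, unrolling, and step sizes, and only counts match candidates. All names (`count_sequential`, `count_pipelined`, `hash4`, `HASH_LOG`) are hypothetical. The point is that the hash for position i+1 depends only on the input, so it can be issued before the match read for position i completes, without changing the result.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HASH_LOG 12u /* hypothetical table size */

static uint32_t read32(const uint8_t *p) { uint32_t v; memcpy(&v, p, sizeof v); return v; }
static uint32_t hash4(uint32_t v) { return (v * 2654435761u) >> (32u - HASH_LOG); }

/* Baseline: hash, lookup, and compare one position before touching the next. */
static size_t count_sequential(const uint8_t *src, size_t len, uint32_t *table) {
    size_t found = 0, i;
    assert(src != NULL && table != NULL);
    for (i = 0; i + 4 <= len; i++) {
        uint32_t h = hash4(read32(src + i));
        uint32_t idx = table[h];
        table[h] = (uint32_t)i;
        if (idx < i && read32(src + idx) == read32(src + i)) found++;
    }
    return found;
}

/* Pipelined: the hash for position i+1 is issued before the match read for
 * position i, so the two independent loads can overlap. */
static size_t count_pipelined(const uint8_t *src, size_t len, uint32_t *table) {
    size_t found = 0, i;
    uint32_t h_cur;
    assert(src != NULL && table != NULL);
    if (len < 4) return 0;
    h_cur = hash4(read32(src));                  /* prologue: H for position 0 */
    for (i = 0; i + 4 <= len; i++) {
        uint32_t idx = table[h_cur];             /* T for position i */
        uint32_t h_next = 0;
        table[h_cur] = (uint32_t)i;
        if (i + 5 <= len) h_next = hash4(read32(src + i + 1));        /* H for i+1 */
        if (idx < i && read32(src + idx) == read32(src + i)) found++; /* M for i */
        h_cur = h_next;
    }
    return found;
}
```

Both functions perform the same table operations in the same order, so they return the same count and leave the table in the same state; only the ordering of the loads differs.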

However, just like a CPU, we have to dump the pipeline when we find a match (take a branch). When this happens, we throw away our current state, record the match, and then do the following prep to re-enter the loop:

Pos | Time -->
----+-------------------
N   | H T
N+1 |  H

This is also the work we do at the beginning to enter the loop initially.
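A rough sketch of the flush-and-re-prime idea, again under simplifying assumptions: a one-stage lookahead instead of the schedule shown above, no repcodes, and hypothetical names throughout (`parse_with_flush`, `parse_stats`, `hash4`, `HASH_LOG`). When a match is taken, the hash already computed for the next position is stale, so it is discarded and the state is rebuilt at the position just past the match, using the same priming step as the initial entry.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HASH_LOG 12u /* hypothetical table size */

static uint32_t read32(const uint8_t *p) { uint32_t v; memcpy(&v, p, sizeof v); return v; }
static uint32_t hash4(uint32_t v) { return (v * 2654435761u) >> (32u - HASH_LOG); }

typedef struct { size_t matches; size_t matched_bytes; } parse_stats;

static parse_stats parse_with_flush(const uint8_t *src, size_t len, uint32_t *table) {
    parse_stats st = {0, 0};
    size_t i = 0;
    uint32_t h_cur;
    assert(src != NULL && table != NULL);
    if (len < 4) return st;
    h_cur = hash4(read32(src));                  /* initial entry: prime the pipeline */
    while (i + 4 <= len) {
        uint32_t idx = table[h_cur];
        uint32_t h_next = 0;
        table[h_cur] = (uint32_t)i;
        if (i + 5 <= len) h_next = hash4(read32(src + i + 1)); /* speculative work */
        if (idx < i && read32(src + idx) == read32(src + i)) {
            size_t mlen = 4;                     /* extend the match forward */
            while (i + mlen < len && src[idx + mlen] == src[i + mlen]) mlen++;
            st.matches++;
            st.matched_bytes += mlen;
            i += mlen;                           /* jump past the match */
            if (i + 4 > len) break;
            h_cur = hash4(read32(src + i));      /* flush: h_next is stale, re-prime */
            continue;
        }
        h_cur = h_next;                          /* no match: pipeline keeps flowing */
        i++;
    }
    return st;
}
```

On an input of 16 identical bytes this finds one match of length 15 starting at position 1, then exits because fewer than 4 bytes remain.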

In addition to this broad rearchitecture, various implementation details are tweaked to coax the best performance possible. This includes:

  • Rather than recalculate the next step size every time we need it, we keep track of it and simply increment it when we fail to find a match within a certain number of searches. This is equivalent to the previous approach, but slightly less work.
  • This also gives us a good infrequent checkpoint outside the core loop which ends up being a good opportunity to do some prefetching of the input.
  • The loop is manually unrolled so we do two positions per iteration. This lets us check our loop condition less frequently and lets us do a repcode search at every other position, like Nick's loop.
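The step bookkeeping in the first two bullets can be sketched roughly as below. The names (`step_recomputed`, `step_tracker`, `tracker_reset`, `tracker_step`) and the `STEP_INTERVAL` value are hypothetical, and measuring the interval in input bytes since the last match is an assumption of this sketch rather than something stated above.

```c
#include <assert.h>
#include <stddef.h>

#define STEP_INTERVAL 256u /* hypothetical: bytes without a match per step bump */

/* Previous style: derive the step from the distance since the last match
 * every time it is needed. */
static size_t step_recomputed(size_t pos, size_t anchor, size_t base_step) {
    assert(pos >= anchor);
    return (pos - anchor) / STEP_INTERVAL + base_step;
}

/* New style: carry the step along and bump it at infrequent checkpoints. */
typedef struct { size_t step; size_t next_bump; } step_tracker;

/* Called when a match is found (or on entry): back to the base step. */
static void tracker_reset(step_tracker *t, size_t anchor, size_t base_step) {
    t->step = base_step;
    t->next_bump = anchor + STEP_INTERVAL;
}

/* Called as the position advances; the loop body runs rarely. */
static size_t tracker_step(step_tracker *t, size_t pos) {
    while (pos >= t->next_bump) {  /* checkpoint: a natural spot for an input prefetch */
        t->step++;
        t->next_bump += STEP_INTERVAL;
    }
    return t->step;
}
```

For any sequence of non-decreasing positions after a reset, `tracker_step` returns the same value as `step_recomputed`, but the common case is a single compare instead of a subtract, shift or divide, and add.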

Parsing Differences

This PR parses slightly differently than the current strategy.

A big change is that the parse is now much more sensitive to the acceleration factor derived from negative compression levels. In Nick's implementation, the step was applied only on every other advance (each pair of searches in a loop iteration remained 1 byte apart), whereas here we return to the pre-#1562 behavior of applying the step between every search.
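As a rough illustration of why this matters, the sketch below counts how many positions get searched over a fixed span under each schedule. It models only the description above (advances alternating between 1 byte and `step` bytes, versus `step` bytes every time) and is not taken from either implementation; `searches_paired` and `searches_uniform` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Paired stepping, as described for the earlier loop: advances alternate
 * between 1 byte (within a pair) and `step` bytes (between pairs). */
static size_t searches_paired(size_t len, size_t step) {
    size_t n = 0, pos = 0;
    int second_of_pair = 0;
    assert(step >= 1);
    while (pos < len) {
        n++;
        pos += second_of_pair ? step : 1;
        second_of_pair = !second_of_pair;
    }
    return n;
}

/* Uniform stepping: `step` bytes between every two consecutive searches. */
static size_t searches_uniform(size_t len, size_t step) {
    size_t n = 0, pos;
    assert(step >= 1);
    for (pos = 0; pos < len; pos += step) n++;
    return n;
}
```

Over 100 bytes with a step of 4, the paired schedule performs 40 searches and the uniform one 25. With a step of 1 the two are identical, so the difference only shows up once acceleration kicks in.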

Benchmarks

As expected, given the above discussion, this is comparatively faster on less compressible inputs, but roughly neutral on more compressible inputs, especially those with short matches. Here are some benchmarks:

As of 69b8ee9 ("Initial Pipelined Implementation for ZSTD_fast"):
Corpus      | gcc-10                    
            | ratio         | speed     
            |  dev  |  exp  | dev | exp 
------------+-------+-------+-----+-----
        -P5 | 1.250 | 1.246 | 789 | 903
       -P25 | 1.863 | 1.841 | 411 | 454
       -P50 | 3.170 | 3.152 | 452 | 464
       -P75 | 5.595 | 5.579 | 550 | 542
       -P95 | 9.662 | 9.660 | 649 | 633
silesia.tar | 2.884 | 2.888 | 355 | 352
------------+-------+-------+-----+-----
Corpus      | clang-11                  
            | ratio         | speed     
            |  dev  |  exp  | dev | exp 
------------+-------+-------+-----+-----
        -P5 | 1.250 | 1.246 | 786 | 882
       -P25 | 1.863 | 1.841 | 425 | 428
       -P50 | 3.170 | 3.152 | 465 | 436
       -P75 | 5.595 | 5.579 | 561 | 526
       -P95 | 9.662 | 9.660 | 650 | 627
silesia.tar | 2.884 | 2.888 | 363 | 336

(Benchmarked on an Intel Xeon E5-1650 v3 @ 3.50GHz.)

As of e2afc28 ("Nit: Only Store 2 Hash Variables"):
Corpus      | gcc-10                    |
            | ratio         | speed     |
            |  dev  |  exp  | dev | exp |
------------+-------+-------+-----+-----+
        -P5 | 1.250 | 1.246 | 789 | 905 |
       -P25 | 1.863 | 1.842 | 411 | 488 |
       -P50 | 3.170 | 3.152 | 452 | 487 |
       -P75 | 5.595 | 5.579 | 550 | 583 |
       -P95 | 9.662 | 9.660 | 649 | 667 |
silesia.tar | 2.884 | 2.887 | 355 | 369 |
     enwik9 | 2.798 | 2.804 | 301 | 301 |
------------+-------+-------+-----+-----+
Corpus      | clang-11                  |
            | ratio         | speed     |
            |  dev  |  exp  | dev | exp |
------------+-------+-------+-----+-----+
        -P5 | 1.250 | 1.246 | 786 | 888 |
       -P25 | 1.863 | 1.842 | 425 | 467 |
       -P50 | 3.170 | 3.152 | 465 | 481 |
       -P75 | 5.595 | 5.579 | 561 | 567 |
       -P95 | 9.662 | 9.660 | 650 | 650 |
silesia.tar | 2.884 | 2.887 | 363 | 356 |
     enwik9 | 2.798 | 2.804 | 299 | 289 |

(Benchmarked on an Intel Xeon E5-1650 v3 @ 3.50GHz.)

Corpus      | gcc-10                    |
            | ratio         | speed     |
            |  dev  |  exp  | dev | exp |
------------+-------+-------+-----+-----+
        -P5 | 1.250 | 1.246 | 222 | 233 |
       -P25 | 1.863 | 1.842 | 117 | 131 |
       -P50 | 3.170 | 3.152 | 130 | 136 |
       -P75 | 5.595 | 5.579 | 163 | 163 |
       -P95 | 9.662 | 9.660 | 201 | 196 |
silesia.tar | 2.884 | 2.887 | 113 | 109 |
     enwik8 | 2.455 | 2.459 |  83 |  80 |
------------+-------+-------+-----+-----+
Corpus      | clang-11                  |
            | ratio         | speed     |
            |  dev  |  exp  | dev | exp |
------------+-------+-------+-----+-----+
        -P5 | 1.250 | 1.246 | 213 | 224 |
       -P25 | 1.863 | 1.842 | 109 | 117 |
       -P50 | 3.170 | 3.152 | 121 | 123 |
       -P75 | 5.595 | 5.579 | 154 | 152 |
       -P95 | 9.662 | 9.660 | 197 | 189 |
silesia.tar | 2.884 | 2.887 | 107 | 102 |
     enwik8 | 2.455 | 2.459 |  78 |  74 |

(Benchmarked on a Raspberry Pi 4 aka "Broadcom BCM2711 SoC with a 1.5 GHz 64-bit quad-core ARM Cortex-A72 processor".)

As of 687c591 ("Tweak Step"):
Corpus      | gcc-10                           |
            | ratio         | speed            |
            |  dev  |  exp  | dev | exp | diff |
------------+-------+-------+-----+-----+------+
        -P5 | 1.250 | 1.246 | 789 | 916 | +16% |
       -P25 | 1.863 | 1.841 | 411 | 503 | +22% |
       -P50 | 3.170 | 3.152 | 452 | 516 | +14% |
       -P75 | 5.595 | 5.579 | 550 | 607 | +10% |
       -P95 | 9.662 | 9.662 | 649 | 685 | + 6% |
silesia.tar | 2.884 | 2.888 | 355 | 377 | + 6% |
     enwik9 | 2.798 | 2.802 | 301 | 315 | + 5% |
------------+-------+-------+-----+-----+------+
Corpus      | clang-11                         |
            | ratio         | speed            |
            |  dev  |  exp  | dev | exp | diff |
------------+-------+-------+-----+-----+------+
        -P5 | 1.250 | 1.246 | 786 | 900 | +15% |
       -P25 | 1.863 | 1.841 | 425 | 495 | +16% |
       -P50 | 3.170 | 3.152 | 465 | 510 | +10% |
       -P75 | 5.595 | 5.579 | 561 | 595 | + 6% |
       -P95 | 9.662 | 9.662 | 650 | 678 | + 4% |
silesia.tar | 2.884 | 2.888 | 363 | 372 | + 2% |
     enwik9 | 2.798 | 2.802 | 299 | 306 | + 2% |
------------+-------+-------+-----+-----+------+

(Benchmarked on an Intel Xeon E5-1650 v3 @ 3.50GHz.)

Corpus      | gcc-10                           |
            | ratio         | speed            |
            |  dev  |  exp  | dev | exp | diff |
------------+-------+-------+-----+-----+------+
        -P5 | 1.250 | 1.246 | 222 | 233 | + 5% |
       -P25 | 1.863 | 1.841 | 117 | 135 | +15% |
       -P50 | 3.170 | 3.152 | 130 | 139 | + 7% |
       -P75 | 5.595 | 5.579 | 163 | 168 | + 3% |
       -P95 | 9.662 | 9.662 | 201 | 199 | - 1% |
silesia.tar | 2.884 | 2.888 | 112 | 109 | - 3% |
     enwik8 | 2.455 | 2.459 |  83 |  80 | - 4% |
------------+-------+-------+-----+-----+------+
Corpus      | clang-11                         |
            | ratio         | speed            |
            |  dev  |  exp  | dev | exp | diff |
------------+-------+-------+-----+-----+------+
        -P5 | 1.250 | 1.246 | 213 | 224 | + 5% |
       -P25 | 1.863 | 1.841 | 109 | 129 | +18% |
       -P50 | 3.170 | 3.152 | 121 | 133 | +10% |
       -P75 | 5.595 | 5.579 | 154 | 161 | + 5% |
       -P95 | 9.662 | 9.662 | 197 | 193 | - 2% |
silesia.tar | 2.884 | 2.888 | 107 | 105 | - 2% |
     enwik8 | 2.455 | 2.459 |  78 |  77 | - 1% |
------------+-------+-------+-----+-----+------+

(Benchmarked on a Raspberry Pi 4 aka "Broadcom BCM2711 SoC with a 1.5 GHz 64-bit quad-core ARM Cortex-A72 processor".)
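The diff columns in these tables appear to be the relative change of exp over dev, rounded to a whole percent. A small sketch of that arithmetic, with `pct_diff` as a hypothetical helper name:

```c
#include <assert.h>

/* Relative change of the experimental build over dev, in percent. */
static double pct_diff(double dev, double exp_val) {
    assert(dev > 0.0);
    return (exp_val - dev) / dev * 100.0;
}
```

For example, the gcc-10 -P5 row on the Xeon (789 to 916) gives roughly +16.1%, matching the +16% shown.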

Extended Benchmark on Multiple Compilers, Levels, and Corpuses:
                         | Compression Speed        | Compression Ratio
Corpus      Compiler Lvl |   dev    exp     diff    |   dev    exp     diff
-------------------------+--------------------------+-------------------------
dickens     gcc-4.8   -1 |  207.8  209.3 ( +0.722%) |  1.763  1.765 ( +0.113%)
dickens     gcc-5     -1 |  201.2  210.3 ( +4.523%) |  1.763  1.765 ( +0.113%)
dickens     gcc-6     -1 |  199.9  213.1 ( +6.603%) |  1.763  1.765 ( +0.113%)
dickens     gcc-7     -1 |  207.6  204.4 ( -1.541%) |  1.763  1.765 ( +0.113%)
dickens     gcc-8     -1 |  206.9  211.2 ( +2.078%) |  1.763  1.765 ( +0.113%)
dickens     gcc-10    -1 |  201.1  208.1 ( +3.481%) |  1.763  1.765 ( +0.113%)
dickens     clang-6.0 -1 |  203.1  210.4 ( +3.594%) |  1.763  1.765 ( +0.113%)
dickens     clang-7   -1 |  206.2  207.0 ( +0.388%) |  1.763  1.765 ( +0.113%)
dickens     clang-8   -1 |  204.7  209.7 ( +2.443%) |  1.763  1.765 ( +0.113%)
dickens     clang-9   -1 |  208.9  214.8 ( +2.824%) |  1.763  1.765 ( +0.113%)
dickens     clang-11  -1 |  204.8  198.3 ( -3.174%) |  1.763  1.765 ( +0.113%)
dickens     clang-12  -1 |  204.6  210.7 ( +2.981%) |  1.763  1.765 ( +0.113%)
dickens     gcc-4.8    1 |  179.9  198.4 (+10.283%) |  2.388  2.392 ( +0.168%)
dickens     gcc-5      1 |  186.1  196.6 ( +5.642%) |  2.388  2.392 ( +0.168%)
dickens     gcc-6      1 |  194.3  182.7 ( -5.970%) |  2.388  2.392 ( +0.168%)
dickens     gcc-7      1 |  189.0  193.5 ( +2.381%) |  2.388  2.392 ( +0.168%)
dickens     gcc-8      1 |  193.4  189.9 ( -1.810%) |  2.388  2.392 ( +0.168%)
dickens     gcc-10     1 |  189.8  195.4 ( +2.950%) |  2.388  2.392 ( +0.168%)
dickens     clang-6.0  1 |  192.1  193.3 ( +0.625%) |  2.388  2.392 ( +0.168%)
dickens     clang-7    1 |  198.7  196.9 ( -0.906%) |  2.388  2.392 ( +0.168%)
dickens     clang-8    1 |  189.9  194.3 ( +2.317%) |  2.388  2.392 ( +0.168%)
dickens     clang-9    1 |  197.4  201.3 ( +1.976%) |  2.388  2.392 ( +0.168%)
dickens     clang-11   1 |  190.6  192.5 ( +0.997%) |  2.388  2.392 ( +0.168%)
dickens     clang-12   1 |  189.7  196.8 ( +3.743%) |  2.388  2.392 ( +0.168%)
dickens     gcc-4.8    2 |  118.7  124.1 ( +4.549%) |  2.630  2.637 ( +0.266%)
dickens     gcc-5      2 |  122.1  123.1 ( +0.819%) |  2.630  2.637 ( +0.266%)
dickens     gcc-6      2 |  122.4  117.0 ( -4.412%) |  2.630  2.637 ( +0.266%)
dickens     gcc-7      2 |  120.2  124.6 ( +3.661%) |  2.630  2.637 ( +0.266%)
dickens     gcc-8      2 |  122.5  118.5 ( -3.265%) |  2.630  2.637 ( +0.266%)
dickens     gcc-10     2 |  121.1  123.0 ( +1.569%) |  2.630  2.637 ( +0.266%)
dickens     clang-6.0  2 |  121.4  119.4 ( -1.647%) |  2.630  2.637 ( +0.266%)
dickens     clang-7    2 |  124.4  121.3 ( -2.492%) |  2.630  2.637 ( +0.266%)
dickens     clang-8    2 |  125.9  122.8 ( -2.462%) |  2.630  2.637 ( +0.266%)
dickens     clang-9    2 |  120.8  122.6 ( +1.490%) |  2.630  2.637 ( +0.266%)
dickens     clang-11   2 |  124.6  125.0 ( +0.321%) |  2.630  2.637 ( +0.266%)
dickens     clang-12   2 |  127.3  123.4 ( -3.064%) |  2.630  2.637 ( +0.266%)
mozilla     gcc-4.8   -1 |  328.9  329.5 ( +0.182%) |  2.325  2.324 ( -0.043%)
mozilla     gcc-5     -1 |  300.4  311.3 ( +3.628%) |  2.325  2.324 ( -0.043%)
mozilla     gcc-6     -1 |  317.3  348.6 ( +9.864%) |  2.325  2.324 ( -0.043%)
mozilla     gcc-7     -1 |  324.9  330.1 ( +1.600%) |  2.325  2.324 ( -0.043%)
mozilla     gcc-8     -1 |  334.4  329.6 ( -1.435%) |  2.325  2.324 ( -0.043%)
mozilla     gcc-10    -1 |  320.9  329.4 ( +2.649%) |  2.325  2.324 ( -0.043%)
mozilla     clang-6.0 -1 |  336.6  336.8 ( +0.059%) |  2.325  2.324 ( -0.043%)
mozilla     clang-7   -1 |  320.9  333.5 ( +3.926%) |  2.325  2.324 ( -0.043%)
mozilla     clang-8   -1 |  328.3  324.0 ( -1.310%) |  2.325  2.324 ( -0.043%)
mozilla     clang-9   -1 |  339.5  333.2 ( -1.856%) |  2.325  2.324 ( -0.043%)
mozilla     clang-11  -1 |  333.1  328.3 ( -1.441%) |  2.325  2.324 ( -0.043%)
mozilla     clang-12  -1 |  335.7  327.3 ( -2.502%) |  2.325  2.324 ( -0.043%)
mozilla     gcc-4.8    1 |  276.9  289.9 ( +4.695%) |  2.546  2.546 ( +0.000%)
mozilla     gcc-5      1 |  272.1  276.6 ( +1.654%) |  2.546  2.546 ( +0.000%)
mozilla     gcc-6      1 |  271.2  273.8 ( +0.959%) |  2.546  2.546 ( +0.000%)
mozilla     gcc-7      1 |  274.5  277.1 ( +0.947%) |  2.546  2.546 ( +0.000%)
mozilla     gcc-8      1 |  281.1  288.3 ( +2.561%) |  2.546  2.546 ( +0.000%)
mozilla     gcc-10     1 |  283.1  283.0 ( -0.035%) |  2.546  2.546 ( +0.000%)
mozilla     clang-6.0  1 |  291.9  286.1 ( -1.987%) |  2.546  2.546 ( +0.000%)
mozilla     clang-7    1 |  271.4  285.8 ( +5.306%) |  2.546  2.546 ( +0.000%)
mozilla     clang-8    1 |  283.2  278.7 ( -1.589%) |  2.546  2.546 ( +0.000%)
mozilla     clang-9    1 |  286.9  272.2 ( -5.124%) |  2.546  2.546 ( +0.000%)
mozilla     clang-11   1 |  286.3  292.3 ( +2.096%) |  2.546  2.546 ( +0.000%)
mozilla     clang-12   1 |  287.1  286.5 ( -0.209%) |  2.546  2.546 ( +0.000%)
mozilla     gcc-4.8    2 |  222.7  225.6 ( +1.302%) |  2.686  2.688 ( +0.074%)
mozilla     gcc-5      2 |  209.6  216.5 ( +3.292%) |  2.686  2.688 ( +0.074%)
mozilla     gcc-6      2 |  214.9  214.9 ( +0.000%) |  2.686  2.688 ( +0.074%)
mozilla     gcc-7      2 |  214.3  221.0 ( +3.126%) |  2.686  2.688 ( +0.074%)
mozilla     gcc-8      2 |  226.4  225.7 ( -0.309%) |  2.686  2.688 ( +0.074%)
mozilla     gcc-10     2 |  221.5  224.2 ( +1.219%) |  2.686  2.688 ( +0.074%)
mozilla     clang-6.0  2 |  230.5  230.2 ( -0.130%) |  2.686  2.688 ( +0.074%)
mozilla     clang-7    2 |  216.4  225.8 ( +4.344%) |  2.686  2.688 ( +0.074%)
mozilla     clang-8    2 |  213.8  218.3 ( +2.105%) |  2.686  2.688 ( +0.074%)
mozilla     clang-9    2 |  227.0  211.8 ( -6.696%) |  2.686  2.688 ( +0.074%)
mozilla     clang-11   2 |  224.4  224.7 ( +0.134%) |  2.686  2.688 ( +0.074%)
mozilla     clang-12   2 |  223.6  223.5 ( -0.045%) |  2.686  2.688 ( +0.074%)
mr          gcc-4.8   -1 |  299.1  320.0 ( +6.988%) |  1.958  1.961 ( +0.153%)
mr          gcc-5     -1 |  292.0  301.4 ( +3.219%) |  1.958  1.961 ( +0.153%)
mr          gcc-6     -1 |  293.4  327.6 (+11.656%) |  1.958  1.961 ( +0.153%)
mr          gcc-7     -1 |  299.1  321.3 ( +7.422%) |  1.958  1.961 ( +0.153%)
mr          gcc-8     -1 |  300.6  315.5 ( +4.957%) |  1.958  1.961 ( +0.153%)
mr          gcc-10    -1 |  285.4  319.0 (+11.773%) |  1.958  1.961 ( +0.153%)
mr          clang-6.0 -1 |  309.5  315.5 ( +1.939%) |  1.958  1.961 ( +0.153%)
mr          clang-7   -1 |  321.6  314.8 ( -2.114%) |  1.958  1.961 ( +0.153%)
mr          clang-8   -1 |  326.5  327.1 ( +0.184%) |  1.958  1.961 ( +0.153%)
mr          clang-9   -1 |  327.4  320.8 ( -2.016%) |  1.958  1.961 ( +0.153%)
mr          clang-11  -1 |  327.9  302.2 ( -7.838%) |  1.958  1.961 ( +0.153%)
mr          clang-12  -1 |  296.6  328.0 (+10.587%) |  1.958  1.961 ( +0.153%)
mr          gcc-4.8    1 |  246.8  264.3 ( +7.091%) |  2.596  2.611 ( +0.578%)
mr          gcc-5      1 |  241.1  252.8 ( +4.853%) |  2.596  2.611 ( +0.578%)
mr          gcc-6      1 |  250.9  266.7 ( +6.297%) |  2.596  2.611 ( +0.578%)
mr          gcc-7      1 |  254.7  258.3 ( +1.413%) |  2.596  2.611 ( +0.578%)
mr          gcc-8      1 |  237.1  259.6 ( +9.490%) |  2.596  2.611 ( +0.578%)
mr          gcc-10     1 |  239.1  265.2 (+10.916%) |  2.596  2.611 ( +0.578%)
mr          clang-6.0  1 |  255.9  255.2 ( -0.274%) |  2.596  2.611 ( +0.578%)
mr          clang-7    1 |  256.1  263.8 ( +3.007%) |  2.596  2.611 ( +0.578%)
mr          clang-8    1 |  257.4  255.1 ( -0.894%) |  2.596  2.611 ( +0.578%)
mr          clang-9    1 |  264.9  259.9 ( -1.888%) |  2.596  2.611 ( +0.578%)
mr          clang-11   1 |  261.8  261.4 ( -0.153%) |  2.596  2.611 ( +0.578%)
mr          clang-12   1 |  251.4  276.8 (+10.103%) |  2.596  2.611 ( +0.578%)
mr          gcc-4.8    2 |  174.2  182.7 ( +4.879%) |  2.769  2.771 ( +0.072%)
mr          gcc-5      2 |  170.2  165.7 ( -2.644%) |  2.769  2.771 ( +0.072%)
mr          gcc-6      2 |  171.5  173.4 ( +1.108%) |  2.769  2.771 ( +0.072%)
mr          gcc-7      2 |  173.8  168.3 ( -3.165%) |  2.769  2.771 ( +0.072%)
mr          gcc-8      2 |  166.8  166.8 ( +0.000%) |  2.769  2.771 ( +0.072%)
mr          gcc-10     2 |  171.7  177.7 ( +3.494%) |  2.769  2.771 ( +0.072%)
mr          clang-6.0  2 |  178.0  177.9 ( -0.056%) |  2.769  2.771 ( +0.072%)
mr          clang-7    2 |  174.3  178.1 ( +2.180%) |  2.769  2.771 ( +0.072%)
mr          clang-8    2 |  180.6  169.5 ( -6.146%) |  2.769  2.771 ( +0.072%)
mr          clang-9    2 |  184.8  177.3 ( -4.058%) |  2.769  2.771 ( +0.072%)
mr          clang-11   2 |  176.2  176.1 ( -0.057%) |  2.769  2.771 ( +0.072%)
mr          clang-12   2 |  176.5  187.2 ( +6.062%) |  2.769  2.771 ( +0.072%)
nci         gcc-4.8   -1 |  564.8  587.0 ( +3.931%) |  9.427  9.435 ( +0.085%)
nci         gcc-5     -1 |  581.3  600.4 ( +3.286%) |  9.427  9.435 ( +0.085%)
nci         gcc-6     -1 |  557.2  599.7 ( +7.627%) |  9.427  9.435 ( +0.085%)
nci         gcc-7     -1 |  583.8  593.5 ( +1.662%) |  9.427  9.435 ( +0.085%)
nci         gcc-8     -1 |  590.5  584.2 ( -1.067%) |  9.427  9.435 ( +0.085%)
nci         gcc-10    -1 |  578.8  584.3 ( +0.950%) |  9.427  9.435 ( +0.085%)
nci         clang-6.0 -1 |  603.4  582.4 ( -3.480%) |  9.427  9.435 ( +0.085%)
nci         clang-7   -1 |  603.1  598.3 ( -0.796%) |  9.427  9.435 ( +0.085%)
nci         clang-8   -1 |  580.0  591.2 ( +1.931%) |  9.427  9.435 ( +0.085%)
nci         clang-9   -1 |  597.8  586.3 ( -1.924%) |  9.427  9.435 ( +0.085%)
nci         clang-11  -1 |  598.7  588.6 ( -1.687%) |  9.427  9.435 ( +0.085%)
nci         clang-12  -1 |  602.8  581.9 ( -3.467%) |  9.427  9.435 ( +0.085%)
nci         gcc-4.8    1 |  513.6  545.2 ( +6.153%) | 11.780 11.780 ( +0.000%)
nci         gcc-5      1 |  508.2  556.9 ( +9.583%) | 11.780 11.780 ( +0.000%)
nci         gcc-6      1 |  549.3  552.2 ( +0.528%) | 11.780 11.780 ( +0.000%)
nci         gcc-7      1 |  551.5  539.6 ( -2.158%) | 11.780 11.780 ( +0.000%)
nci         gcc-8      1 |  544.4  551.1 ( +1.231%) | 11.780 11.780 ( +0.000%)
nci         gcc-10     1 |  551.3  549.0 ( -0.417%) | 11.780 11.780 ( +0.000%)
nci         clang-6.0  1 |  579.4  537.1 ( -7.301%) | 11.780 11.780 ( +0.000%)
nci         clang-7    1 |  538.7  550.1 ( +2.116%) | 11.780 11.780 ( +0.000%)
nci         clang-8    1 |  564.3  538.5 ( -4.572%) | 11.780 11.780 ( +0.000%)
nci         clang-9    1 |  562.0  545.3 ( -2.972%) | 11.780 11.780 ( +0.000%)
nci         clang-11   1 |  568.8  547.6 ( -3.727%) | 11.780 11.780 ( +0.000%)
nci         clang-12   1 |  550.0  536.0 ( -2.545%) | 11.780 11.780 ( +0.000%)
nci         gcc-4.8    2 |  474.5  460.0 ( -3.056%) | 11.610 11.620 ( +0.086%)
nci         gcc-5      2 |  474.8  480.1 ( +1.116%) | 11.610 11.620 ( +0.086%)
nci         gcc-6      2 |  486.0  495.7 ( +1.996%) | 11.610 11.620 ( +0.086%)
nci         gcc-7      2 |  494.4  470.3 ( -4.875%) | 11.610 11.620 ( +0.086%)
nci         gcc-8      2 |  495.7  482.4 ( -2.683%) | 11.610 11.620 ( +0.086%)
nci         gcc-10     2 |  487.6  492.0 ( +0.902%) | 11.610 11.620 ( +0.086%)
nci         clang-6.0  2 |  492.4  470.1 ( -4.529%) | 11.610 11.620 ( +0.086%)
nci         clang-7    2 |  486.6  481.6 ( -1.028%) | 11.610 11.620 ( +0.086%)
nci         clang-8    2 |  481.6  451.9 ( -6.167%) | 11.610 11.620 ( +0.086%)
nci         clang-9    2 |  508.1  486.7 ( -4.212%) | 11.610 11.620 ( +0.086%)
nci         clang-11   2 |  490.6  480.3 ( -2.099%) | 11.610 11.620 ( +0.086%)
nci         clang-12   2 |  493.0  480.9 ( -2.454%) | 11.610 11.620 ( +0.086%)
ooffice     gcc-4.8   -1 |  274.1  275.1 ( +0.365%) |  1.515  1.513 ( -0.132%)
ooffice     gcc-5     -1 |  264.1  285.7 ( +8.179%) |  1.515  1.513 ( -0.132%)
ooffice     gcc-6     -1 |  254.8  284.8 (+11.774%) |  1.515  1.513 ( -0.132%)
ooffice     gcc-7     -1 |  268.1  287.6 ( +7.273%) |  1.515  1.513 ( -0.132%)
ooffice     gcc-8     -1 |  261.9  283.3 ( +8.171%) |  1.515  1.513 ( -0.132%)
ooffice     gcc-10    -1 |  272.7  285.0 ( +4.510%) |  1.515  1.513 ( -0.132%)
ooffice     clang-6.0 -1 |  280.2  283.4 ( +1.142%) |  1.515  1.513 ( -0.132%)
ooffice     clang-7   -1 |  274.2  284.0 ( +3.574%) |  1.515  1.513 ( -0.132%)
ooffice     clang-8   -1 |  276.1  285.7 ( +3.477%) |  1.515  1.513 ( -0.132%)
ooffice     clang-9   -1 |  285.9  283.8 ( -0.735%) |  1.515  1.513 ( -0.132%)
ooffice     clang-11  -1 |  273.9  296.6 ( +8.288%) |  1.515  1.513 ( -0.132%)
ooffice     clang-12  -1 |  280.5  288.7 ( +2.923%) |  1.515  1.513 ( -0.132%)
ooffice     gcc-4.8    1 |  217.9  228.7 ( +4.956%) |  1.713  1.711 ( -0.117%)
ooffice     gcc-5      1 |  211.4  233.6 (+10.501%) |  1.713  1.711 ( -0.117%)
ooffice     gcc-6      1 |  220.8  240.2 ( +8.786%) |  1.713  1.711 ( -0.117%)
ooffice     gcc-7      1 |  220.7  239.6 ( +8.564%) |  1.713  1.711 ( -0.117%)
ooffice     gcc-8      1 |  221.8  230.1 ( +3.742%) |  1.713  1.711 ( -0.117%)
ooffice     gcc-10     1 |  222.8  228.5 ( +2.558%) |  1.713  1.711 ( -0.117%)
ooffice     clang-6.0  1 |  233.2  233.2 ( +0.000%) |  1.713  1.711 ( -0.117%)
ooffice     clang-7    1 |  226.0  230.7 ( +2.080%) |  1.713  1.711 ( -0.117%)
ooffice     clang-8    1 |  218.4  233.6 ( +6.960%) |  1.713  1.711 ( -0.117%)
ooffice     clang-9    1 |  235.6  232.4 ( -1.358%) |  1.713  1.711 ( -0.117%)
ooffice     clang-11   1 |  231.4  221.6 ( -4.235%) |  1.713  1.711 ( -0.117%)
ooffice     clang-12   1 |  224.4  238.7 ( +6.373%) |  1.713  1.711 ( -0.117%)
ooffice     gcc-4.8    2 |  160.0  159.3 ( -0.437%) |  1.851  1.850 ( -0.054%)
ooffice     gcc-5      2 |  159.7  163.2 ( +2.192%) |  1.851  1.850 ( -0.054%)
ooffice     gcc-6      2 |  152.5  166.3 ( +9.049%) |  1.851  1.850 ( -0.054%)
ooffice     gcc-7      2 |  152.7  167.3 ( +9.561%) |  1.851  1.850 ( -0.054%)
ooffice     gcc-8      2 |  163.5  162.8 ( -0.428%) |  1.851  1.850 ( -0.054%)
ooffice     gcc-10     2 |  158.9  155.5 ( -2.140%) |  1.851  1.850 ( -0.054%)
ooffice     clang-6.0  2 |  167.0  155.3 ( -7.006%) |  1.851  1.850 ( -0.054%)
ooffice     clang-7    2 |  158.5  160.2 ( +1.073%) |  1.851  1.850 ( -0.054%)
ooffice     clang-8    2 |  156.6  165.6 ( +5.747%) |  1.851  1.850 ( -0.054%)
ooffice     clang-9    2 |  164.4  165.8 ( +0.852%) |  1.851  1.850 ( -0.054%)
ooffice     clang-11   2 |  157.8  160.9 ( +1.965%) |  1.851  1.850 ( -0.054%)
ooffice     clang-12   2 |  151.6  165.2 ( +8.971%) |  1.851  1.850 ( -0.054%)
osdb        gcc-4.8   -1 |  301.4  298.1 ( -1.095%) |  2.383  2.389 ( +0.252%)
osdb        gcc-5     -1 |  289.1  295.3 ( +2.145%) |  2.383  2.389 ( +0.252%)
osdb        gcc-6     -1 |  282.3  311.3 (+10.273%) |  2.383  2.389 ( +0.252%)
osdb        gcc-7     -1 |  298.4  293.9 ( -1.508%) |  2.383  2.389 ( +0.252%)
osdb        gcc-8     -1 |  304.6  295.6 ( -2.955%) |  2.383  2.389 ( +0.252%)
osdb        gcc-10    -1 |  293.6  295.9 ( +0.783%) |  2.383  2.389 ( +0.252%)
osdb        clang-6.0 -1 |  295.6  291.4 ( -1.421%) |  2.383  2.389 ( +0.252%)
osdb        clang-7   -1 |  300.6  292.6 ( -2.661%) |  2.383  2.389 ( +0.252%)
osdb        clang-8   -1 |  298.3  300.8 ( +0.838%) |  2.383  2.389 ( +0.252%)
osdb        clang-9   -1 |  312.5  303.9 ( -2.752%) |  2.383  2.389 ( +0.252%)
osdb        clang-11  -1 |  303.2  295.5 ( -2.540%) |  2.383  2.389 ( +0.252%)
osdb        clang-12  -1 |  296.1  301.5 ( +1.824%) |  2.383  2.389 ( +0.252%)
osdb        gcc-4.8    1 |  259.6  266.8 ( +2.773%) |  2.697  2.704 ( +0.260%)
osdb        gcc-5      1 |  255.7  264.8 ( +3.559%) |  2.697  2.704 ( +0.260%)
osdb        gcc-6      1 |  256.4  277.4 ( +8.190%) |  2.697  2.704 ( +0.260%)
osdb        gcc-7      1 |  262.4  264.8 ( +0.915%) |  2.697  2.704 ( +0.260%)
osdb        gcc-8      1 |  269.8  268.3 ( -0.556%) |  2.697  2.704 ( +0.260%)
osdb        gcc-10     1 |  268.9  267.9 ( -0.372%) |  2.697  2.704 ( +0.260%)
osdb        clang-6.0  1 |  262.2  262.9 ( +0.267%) |  2.697  2.704 ( +0.260%)
osdb        clang-7    1 |  253.5  259.9 ( +2.525%) |  2.697  2.704 ( +0.260%)
osdb        clang-8    1 |  272.8  252.4 ( -7.478%) |  2.697  2.704 ( +0.260%)
osdb        clang-9    1 |  248.4  266.3 ( +7.206%) |  2.697  2.704 ( +0.260%)
osdb        clang-11   1 |  267.0  268.9 ( +0.712%) |  2.697  2.704 ( +0.260%)
osdb        clang-12   1 |  268.5  270.8 ( +0.857%) |  2.697  2.704 ( +0.260%)
osdb        gcc-4.8    2 |  208.0  205.0 ( -1.442%) |  2.887  2.888 ( +0.035%)
osdb        gcc-5      2 |  201.5  207.9 ( +3.176%) |  2.887  2.888 ( +0.035%)
osdb        gcc-6      2 |  202.4  210.2 ( +3.854%) |  2.887  2.888 ( +0.035%)
osdb        gcc-7      2 |  206.4  207.3 ( +0.436%) |  2.887  2.888 ( +0.035%)
osdb        gcc-8      2 |  208.1  205.2 ( -1.394%) |  2.887  2.888 ( +0.035%)
osdb        gcc-10     2 |  202.7  208.6 ( +2.911%) |  2.887  2.888 ( +0.035%)
osdb        clang-6.0  2 |  205.6  205.7 ( +0.049%) |  2.887  2.888 ( +0.035%)
osdb        clang-7    2 |  194.6  204.9 ( +5.293%) |  2.887  2.888 ( +0.035%)
osdb        clang-8    2 |  205.7  197.2 ( -4.132%) |  2.887  2.888 ( +0.035%)
osdb        clang-9    2 |  193.5  196.1 ( +1.344%) |  2.887  2.888 ( +0.035%)
osdb        clang-11   2 |  210.8  209.5 ( -0.617%) |  2.887  2.888 ( +0.035%)
osdb        clang-12   2 |  203.6  212.5 ( +4.371%) |  2.887  2.888 ( +0.035%)
reymont     gcc-4.8   -1 |  203.4  206.1 ( +1.327%) |  2.702  2.709 ( +0.259%)
reymont     gcc-5     -1 |  195.9  204.1 ( +4.186%) |  2.702  2.709 ( +0.259%)
reymont     gcc-6     -1 |  194.0  205.0 ( +5.670%) |  2.702  2.709 ( +0.259%)
reymont     gcc-7     -1 |  199.1  202.7 ( +1.808%) |  2.702  2.709 ( +0.259%)
reymont     gcc-8     -1 |  196.4  191.9 ( -2.291%) |  2.702  2.709 ( +0.259%)
reymont     gcc-10    -1 |  197.9  198.8 ( +0.455%) |  2.702  2.709 ( +0.259%)
reymont     clang-6.0 -1 |  199.4  201.9 ( +1.254%) |  2.702  2.709 ( +0.259%)
reymont     clang-7   -1 |  203.6  200.7 ( -1.424%) |  2.702  2.709 ( +0.259%)
reymont     clang-8   -1 |  206.6  215.0 ( +4.066%) |  2.702  2.709 ( +0.259%)
reymont     clang-9   -1 |  207.7  201.5 ( -2.985%) |  2.702  2.709 ( +0.259%)
reymont     clang-11  -1 |  214.3  205.5 ( -4.106%) |  2.702  2.709 ( +0.259%)
reymont     clang-12  -1 |  201.6  212.3 ( +5.308%) |  2.702  2.709 ( +0.259%)
reymont     gcc-4.8    1 |  198.8  205.0 ( +3.119%) |  3.078  3.085 ( +0.227%)
reymont     gcc-5      1 |  194.1  201.1 ( +3.606%) |  3.078  3.085 ( +0.227%)
reymont     gcc-6      1 |  202.9  203.9 ( +0.493%) |  3.078  3.085 ( +0.227%)
reymont     gcc-7      1 |  196.6  187.2 ( -4.781%) |  3.078  3.085 ( +0.227%)
reymont     gcc-8      1 |  199.6  193.9 ( -2.856%) |  3.078  3.085 ( +0.227%)
reymont     gcc-10     1 |  200.6  200.1 ( -0.249%) |  3.078  3.085 ( +0.227%)
reymont     clang-6.0  1 |  197.0  202.7 ( +2.893%) |  3.078  3.085 ( +0.227%)
reymont     clang-7    1 |  203.1  193.7 ( -4.628%) |  3.078  3.085 ( +0.227%)
reymont     clang-8    1 |  209.4  211.3 ( +0.907%) |  3.078  3.085 ( +0.227%)
reymont     clang-9    1 |  190.7  203.9 ( +6.922%) |  3.078  3.085 ( +0.227%)
reymont     clang-11   1 |  205.8  206.3 ( +0.243%) |  3.078  3.085 ( +0.227%)
reymont     clang-12   1 |  190.1  207.5 ( +9.153%) |  3.078  3.085 ( +0.227%)
reymont     gcc-4.8    2 |  161.0  158.8 ( -1.366%) |  3.200  3.208 ( +0.250%)
reymont     gcc-5      2 |  158.4  161.0 ( +1.641%) |  3.200  3.208 ( +0.250%)
reymont     gcc-6      2 |  162.2  160.8 ( -0.863%) |  3.200  3.208 ( +0.250%)
reymont     gcc-7      2 |  166.5  154.5 ( -7.207%) |  3.200  3.208 ( +0.250%)
reymont     gcc-8      2 |  158.6  151.3 ( -4.603%) |  3.200  3.208 ( +0.250%)
reymont     gcc-10     2 |  161.1  159.6 ( -0.931%) |  3.200  3.208 ( +0.250%)
reymont     clang-6.0  2 |  164.7  161.4 ( -2.004%) |  3.200  3.208 ( +0.250%)
reymont     clang-7    2 |  163.8  150.9 ( -7.875%) |  3.200  3.208 ( +0.250%)
reymont     clang-8    2 |  165.2  162.3 ( -1.755%) |  3.200  3.208 ( +0.250%)
reymont     clang-9    2 |  153.6  163.0 ( +6.120%) |  3.200  3.208 ( +0.250%)
reymont     clang-11   2 |  166.3  161.8 ( -2.706%) |  3.200  3.208 ( +0.250%)
reymont     clang-12   2 |  156.4  162.5 ( +3.900%) |  3.200  3.208 ( +0.250%)
samba       gcc-4.8   -1 |  370.4  365.1 ( -1.431%) |  3.433  3.438 ( +0.146%)
samba       gcc-5     -1 |  354.2  350.4 ( -1.073%) |  3.433  3.438 ( +0.146%)
samba       gcc-6     -1 |  361.3  366.8 ( +1.522%) |  3.433  3.438 ( +0.146%)
samba       gcc-7     -1 |  353.4  367.9 ( +4.103%) |  3.433  3.438 ( +0.146%)
samba       gcc-8     -1 |  368.7  365.7 ( -0.814%) |  3.433  3.438 ( +0.146%)
samba       gcc-10    -1 |  360.1  360.9 ( +0.222%) |  3.433  3.438 ( +0.146%)
samba       clang-6.0 -1 |  361.5  363.7 ( +0.609%) |  3.433  3.438 ( +0.146%)
samba       clang-7   -1 |  364.2  360.3 ( -1.071%) |  3.433  3.438 ( +0.146%)
samba       clang-8   -1 |  364.0  368.6 ( +1.264%) |  3.433  3.438 ( +0.146%)
samba       clang-9   -1 |  361.7  369.7 ( +2.212%) |  3.433  3.438 ( +0.146%)
samba       clang-11  -1 |  369.4  370.4 ( +0.271%) |  3.433  3.438 ( +0.146%)
samba       clang-12  -1 |  371.5  361.3 ( -2.746%) |  3.433  3.438 ( +0.146%)
samba       gcc-4.8    1 |  329.2  337.8 ( +2.612%) |  3.921  3.927 ( +0.153%)
samba       gcc-5      1 |  317.9  323.0 ( +1.604%) |  3.921  3.927 ( +0.153%)
samba       gcc-6      1 |  322.0  338.0 ( +4.969%) |  3.921  3.927 ( +0.153%)
samba       gcc-7      1 |  319.7  331.5 ( +3.691%) |  3.921  3.927 ( +0.153%)
samba       gcc-8      1 |  330.9  326.9 ( -1.209%) |  3.921  3.927 ( +0.153%)
samba       gcc-10     1 |  352.4  328.2 ( -6.867%) |  3.921  3.927 ( +0.153%)
samba       clang-6.0  1 |  330.3  337.1 ( +2.059%) |  3.921  3.927 ( +0.153%)
samba       clang-7    1 |  351.0  326.3 ( -7.037%) |  3.921  3.927 ( +0.153%)
samba       clang-8    1 |  323.5  335.5 ( +3.709%) |  3.921  3.927 ( +0.153%)
samba       clang-9    1 |  344.5  341.2 ( -0.958%) |  3.921  3.927 ( +0.153%)
samba       clang-11   1 |  326.5  337.1 ( +3.247%) |  3.921  3.927 ( +0.153%)
samba       clang-12   1 |  331.9  342.9 ( +3.314%) |  3.921  3.927 ( +0.153%)
samba       gcc-4.8    2 |  264.6  256.6 ( -3.023%) |  4.123  4.131 ( +0.194%)
samba       gcc-5      2 |  247.4  255.5 ( +3.274%) |  4.123  4.131 ( +0.194%)
samba       gcc-6      2 |  253.8  260.0 ( +2.443%) |  4.123  4.131 ( +0.194%)
samba       gcc-7      2 |  261.2  261.2 ( +0.000%) |  4.123  4.131 ( +0.194%)
samba       gcc-8      2 |  258.3  257.0 ( -0.503%) |  4.123  4.131 ( +0.194%)
samba       gcc-10     2 |  257.8  259.5 ( +0.659%) |  4.123  4.131 ( +0.194%)
samba       clang-6.0  2 |  254.2  253.5 ( -0.275%) |  4.123  4.131 ( +0.194%)
samba       clang-7    2 |  261.1  263.0 ( +0.728%) |  4.123  4.131 ( +0.194%)
samba       clang-8    2 |  258.1  249.0 ( -3.526%) |  4.123  4.131 ( +0.194%)
samba       clang-9    2 |  265.4  262.7 ( -1.017%) |  4.123  4.131 ( +0.194%)
samba       clang-11   2 |  264.6  267.3 ( +1.020%) |  4.123  4.131 ( +0.194%)
samba       clang-12   2 |  250.4  258.5 ( +3.235%) |  4.123  4.131 ( +0.194%)
sao         gcc-4.8   -1 |  261.1  299.5 (+14.707%) |  1.107  1.107 ( +0.000%)
sao         gcc-5     -1 |  250.8  289.7 (+15.510%) |  1.107  1.107 ( +0.000%)
sao         gcc-6     -1 |  249.3  292.1 (+17.168%) |  1.107  1.107 ( +0.000%)
sao         gcc-7     -1 |  259.6  293.3 (+12.982%) |  1.107  1.107 ( +0.000%)
sao         gcc-8     -1 |  268.7  269.8 ( +0.409%) |  1.107  1.107 ( +0.000%)
sao         gcc-10    -1 |  267.8  281.6 ( +5.153%) |  1.107  1.107 ( +0.000%)
sao         clang-6.0 -1 |  283.7  284.9 ( +0.423%) |  1.107  1.107 ( +0.000%)
sao         clang-7   -1 |  264.9  281.5 ( +6.267%) |  1.107  1.107 ( +0.000%)
sao         clang-8   -1 |  268.3  283.8 ( +5.777%) |  1.107  1.107 ( +0.000%)
sao         clang-9   -1 |  275.0  284.1 ( +3.309%) |  1.107  1.107 ( +0.000%)
sao         clang-11  -1 |  279.8  280.0 ( +0.071%) |  1.107  1.107 ( +0.000%)
sao         clang-12  -1 |  265.9  282.0 ( +6.055%) |  1.107  1.107 ( +0.000%)
sao         gcc-4.8    1 |  192.4  213.0 (+10.707%) |  1.159  1.159 ( +0.000%)
sao         gcc-5      1 |  185.1  219.9 (+18.801%) |  1.159  1.159 ( +0.000%)
sao         gcc-6      1 |  189.9  205.7 ( +8.320%) |  1.159  1.159 ( +0.000%)
sao         gcc-7      1 |  191.3  218.6 (+14.271%) |  1.159  1.159 ( +0.000%)
sao         gcc-8      1 |  197.1  212.1 ( +7.610%) |  1.159  1.159 ( +0.000%)
sao         gcc-10     1 |  197.5  206.4 ( +4.506%) |  1.159  1.159 ( +0.000%)
sao         clang-6.0  1 |  219.3  211.7 ( -3.466%) |  1.159  1.159 ( +0.000%)
sao         clang-7    1 |  197.8  212.3 ( +7.331%) |  1.159  1.159 ( +0.000%)
sao         clang-8    1 |  209.3  210.5 ( +0.573%) |  1.159  1.159 ( +0.000%)
sao         clang-9    1 |  208.4  214.5 ( +2.927%) |  1.159  1.159 ( +0.000%)
sao         clang-11   1 |  208.6  219.5 ( +5.225%) |  1.159  1.159 ( +0.000%)
sao         clang-12   1 |  203.1  207.0 ( +1.920%) |  1.159  1.159 ( +0.000%)
sao         gcc-4.8    2 |  123.3  129.5 ( +5.028%) |  1.248  1.249 ( +0.080%)
sao         gcc-5      2 |  122.6  136.0 (+10.930%) |  1.248  1.249 ( +0.080%)
sao         gcc-6      2 |  123.5  127.2 ( +2.996%) |  1.248  1.249 ( +0.080%)
sao         gcc-7      2 |  127.1  135.9 ( +6.924%) |  1.248  1.249 ( +0.080%)
sao         gcc-8      2 |  124.5  135.2 ( +8.594%) |  1.248  1.249 ( +0.080%)
sao         gcc-10     2 |  128.5  134.3 ( +4.514%) |  1.248  1.249 ( +0.080%)
sao         clang-6.0  2 |  140.6  131.7 ( -6.330%) |  1.248  1.249 ( +0.080%)
sao         clang-7    2 |  130.4  125.1 ( -4.064%) |  1.248  1.249 ( +0.080%)
sao         clang-8    2 |  132.3  136.6 ( +3.250%) |  1.248  1.249 ( +0.080%)
sao         clang-9    2 |  134.2  133.9 ( -0.224%) |  1.248  1.249 ( +0.080%)
sao         clang-11   2 |  135.9  133.8 ( -1.545%) |  1.248  1.249 ( +0.080%)
sao         clang-12   2 |  130.0  131.7 ( +1.308%) |  1.248  1.249 ( +0.080%)
webster     gcc-4.8   -1 |  239.2  244.3 ( +2.132%) |  2.346  2.351 ( +0.213%)
webster     gcc-5     -1 |  223.2  235.9 ( +5.690%) |  2.346  2.351 ( +0.213%)
webster     gcc-6     -1 |  225.6  249.5 (+10.594%) |  2.346  2.351 ( +0.213%)
webster     gcc-7     -1 |  234.4  242.7 ( +3.541%) |  2.346  2.351 ( +0.213%)
webster     gcc-8     -1 |  224.5  243.3 ( +8.374%) |  2.346  2.351 ( +0.213%)
webster     gcc-10    -1 |  232.3  236.0 ( +1.593%) |  2.346  2.351 ( +0.213%)
webster     clang-6.0 -1 |  236.8  240.2 ( +1.436%) |  2.346  2.351 ( +0.213%)
webster     clang-7   -1 |  237.9  245.3 ( +3.111%) |  2.346  2.351 ( +0.213%)
webster     clang-8   -1 |  232.8  245.7 ( +5.541%) |  2.346  2.351 ( +0.213%)
webster     clang-9   -1 |  243.9  244.5 ( +0.246%) |  2.346  2.351 ( +0.213%)
webster     clang-11  -1 |  241.8  243.0 ( +0.496%) |  2.346  2.351 ( +0.213%)
webster     clang-12  -1 |  230.7  242.5 ( +5.115%) |  2.346  2.351 ( +0.213%)
webster     gcc-4.8    1 |  220.1  224.8 ( +2.135%) |  3.028  3.035 ( +0.231%)
webster     gcc-5      1 |  211.5  228.3 ( +7.943%) |  3.028  3.035 ( +0.231%)
webster     gcc-6      1 |  216.7  224.8 ( +3.738%) |  3.028  3.035 ( +0.231%)
webster     gcc-7      1 |  219.5  225.7 ( +2.825%) |  3.028  3.035 ( +0.231%)
webster     gcc-8      1 |  218.5  220.5 ( +0.915%) |  3.028  3.035 ( +0.231%)
webster     gcc-10     1 |  221.0  217.1 ( -1.765%) |  3.028  3.035 ( +0.231%)
webster     clang-6.0  1 |  225.8  224.4 ( -0.620%) |  3.028  3.035 ( +0.231%)
webster     clang-7    1 |  227.8  225.1 ( -1.185%) |  3.028  3.035 ( +0.231%)
webster     clang-8    1 |  221.7  223.1 ( +0.631%) |  3.028  3.035 ( +0.231%)
webster     clang-9    1 |  226.3  229.8 ( +1.547%) |  3.028  3.035 ( +0.231%)
webster     clang-11   1 |  228.8  225.8 ( -1.311%) |  3.028  3.035 ( +0.231%)
webster     clang-12   1 |  218.4  226.3 ( +3.617%) |  3.028  3.035 ( +0.231%)
webster     gcc-4.8    2 |  152.1  153.3 ( +0.789%) |  3.228  3.237 ( +0.279%)
webster     gcc-5      2 |  147.9  156.5 ( +5.815%) |  3.228  3.237 ( +0.279%)
webster     gcc-6      2 |  149.3  155.6 ( +4.220%) |  3.228  3.237 ( +0.279%)
webster     gcc-7      2 |  149.8  154.2 ( +2.937%) |  3.228  3.237 ( +0.279%)
webster     gcc-8      2 |  151.4  154.6 ( +2.114%) |  3.228  3.237 ( +0.279%)
webster     gcc-10     2 |  157.0  151.5 ( -3.503%) |  3.228  3.237 ( +0.279%)
webster     clang-6.0  2 |  149.0  150.9 ( +1.275%) |  3.228  3.237 ( +0.279%)
webster     clang-7    2 |  152.7  158.1 ( +3.536%) |  3.228  3.237 ( +0.279%)
webster     clang-8    2 |  149.7  155.6 ( +3.941%) |  3.228  3.237 ( +0.279%)
webster     clang-9    2 |  159.0  160.5 ( +0.943%) |  3.228  3.237 ( +0.279%)
webster     clang-11   2 |  155.4  159.6 ( +2.703%) |  3.228  3.237 ( +0.279%)
webster     clang-12   2 |  154.7  154.1 ( -0.388%) |  3.228  3.237 ( +0.279%)
xml         gcc-4.8   -1 |  472.6  483.4 ( +2.285%) |  6.055  6.048 ( -0.116%)
xml         gcc-5     -1 |  462.5  478.6 ( +3.481%) |  6.055  6.048 ( -0.116%)
xml         gcc-6     -1 |  443.3  481.0 ( +8.504%) |  6.055  6.048 ( -0.116%)
xml         gcc-7     -1 |  482.6  505.4 ( +4.724%) |  6.055  6.048 ( -0.116%)
xml         gcc-8     -1 |  488.7  485.5 ( -0.655%) |  6.055  6.048 ( -0.116%)
xml         gcc-10    -1 |  480.4  484.2 ( +0.791%) |  6.055  6.048 ( -0.116%)
xml         clang-6.0 -1 |  488.2  477.5 ( -2.192%) |  6.055  6.048 ( -0.116%)
xml         clang-7   -1 |  466.0  484.7 ( +4.013%) |  6.055  6.048 ( -0.116%)
xml         clang-8   -1 |  503.4  480.0 ( -4.648%) |  6.055  6.048 ( -0.116%)
xml         clang-9   -1 |  457.3  476.1 ( +4.111%) |  6.055  6.048 ( -0.116%)
xml         clang-11  -1 |  479.6  489.3 ( +2.023%) |  6.055  6.048 ( -0.116%)
xml         clang-12  -1 |  484.4  476.1 ( -1.713%) |  6.055  6.048 ( -0.116%)
xml         gcc-4.8    1 |  463.3  466.2 ( +0.626%) |  7.673  7.668 ( -0.065%)
xml         gcc-5      1 |  450.2  467.9 ( +3.932%) |  7.673  7.668 ( -0.065%)
xml         gcc-6      1 |  452.8  439.0 ( -3.048%) |  7.673  7.668 ( -0.065%)
xml         gcc-7      1 |  470.0  477.6 ( +1.617%) |  7.673  7.668 ( -0.065%)
xml         gcc-8      1 |  484.8  470.0 ( -3.053%) |  7.673  7.668 ( -0.065%)
xml         gcc-10     1 |  467.3  473.4 ( +1.305%) |  7.673  7.668 ( -0.065%)
xml         clang-6.0  1 |  467.9  481.0 ( +2.800%) |  7.673  7.668 ( -0.065%)
xml         clang-7    1 |  465.3  472.8 ( +1.612%) |  7.673  7.668 ( -0.065%)
xml         clang-8    1 |  470.6  470.8 ( +0.042%) |  7.673  7.668 ( -0.065%)
xml         clang-9    1 |  464.8  457.5 ( -1.571%) |  7.673  7.668 ( -0.065%)
xml         clang-11   1 |  468.0  473.0 ( +1.068%) |  7.673  7.668 ( -0.065%)
xml         clang-12   1 |  474.8  463.9 ( -2.296%) |  7.673  7.668 ( -0.065%)
xml         gcc-4.8    2 |  379.5  375.9 ( -0.949%) |  7.843  7.861 ( +0.230%)
xml         gcc-5      2 |  357.5  383.1 ( +7.161%) |  7.843  7.861 ( +0.230%)
xml         gcc-6      2 |  369.2  362.4 ( -1.842%) |  7.843  7.861 ( +0.230%)
xml         gcc-7      2 |  387.1  388.7 ( +0.413%) |  7.843  7.861 ( +0.230%)
xml         gcc-8      2 |  387.7  389.8 ( +0.542%) |  7.843  7.861 ( +0.230%)
xml         gcc-10     2 |  382.4  385.7 ( +0.863%) |  7.843  7.861 ( +0.230%)
xml         clang-6.0  2 |  384.1  389.6 ( +1.432%) |  7.843  7.861 ( +0.230%)
xml         clang-7    2 |  381.7  384.3 ( +0.681%) |  7.843  7.861 ( +0.230%)
xml         clang-8    2 |  397.6  393.6 ( -1.006%) |  7.843  7.861 ( +0.230%)
xml         clang-9    2 |  392.9  389.4 ( -0.891%) |  7.843  7.861 ( +0.230%)
xml         clang-11   2 |  387.1  384.0 ( -0.801%) |  7.843  7.861 ( +0.230%)
xml         clang-12   2 |  385.0  380.6 ( -1.143%) |  7.843  7.861 ( +0.230%)
x-ray       gcc-4.8   -1 | 1882.3 2375.9 (+26.223%) |  1.001  1.000 ( -0.100%)
x-ray       gcc-5     -1 | 1904.6 2396.6 (+25.832%) |  1.001  1.000 ( -0.100%)
x-ray       gcc-6     -1 | 1952.3 2457.6 (+25.882%) |  1.001  1.000 ( -0.100%)
x-ray       gcc-7     -1 | 1937.5 2398.3 (+23.783%) |  1.001  1.000 ( -0.100%)
x-ray       gcc-8     -1 | 2043.5 2402.6 (+17.573%) |  1.001  1.000 ( -0.100%)
x-ray       gcc-10    -1 | 1977.7 2405.1 (+21.611%) |  1.001  1.000 ( -0.100%)
x-ray       clang-6.0 -1 | 2021.1 2281.0 (+12.859%) |  1.001  1.000 ( -0.100%)
x-ray       clang-7   -1 | 1871.8 2459.9 (+31.419%) |  1.001  1.000 ( -0.100%)
x-ray       clang-8   -1 | 2068.2 2424.7 (+17.237%) |  1.001  1.000 ( -0.100%)
x-ray       clang-9   -1 | 2038.7 2481.4 (+21.715%) |  1.001  1.000 ( -0.100%)
x-ray       clang-11  -1 | 2076.8 2417.0 (+16.381%) |  1.001  1.000 ( -0.100%)
x-ray       clang-12  -1 | 1944.6 2382.8 (+22.534%) |  1.001  1.000 ( -0.100%)
x-ray       gcc-4.8    1 |  526.3  537.8 ( +2.185%) |  1.251  1.252 ( +0.080%)
x-ray       gcc-5      1 |  550.0  561.3 ( +2.055%) |  1.251  1.252 ( +0.080%)
x-ray       gcc-6      1 |  563.5  576.4 ( +2.289%) |  1.251  1.252 ( +0.080%)
x-ray       gcc-7      1 |  565.4  562.0 ( -0.601%) |  1.251  1.252 ( +0.080%)
x-ray       gcc-8      1 |  562.1  563.6 ( +0.267%) |  1.251  1.252 ( +0.080%)
x-ray       gcc-10     1 |  528.1  548.2 ( +3.806%) |  1.251  1.252 ( +0.080%)
x-ray       clang-6.0  1 |  550.6  564.6 ( +2.543%) |  1.251  1.252 ( +0.080%)
x-ray       clang-7    1 |  574.3  575.3 ( +0.174%) |  1.251  1.252 ( +0.080%)
x-ray       clang-8    1 |  556.0  593.5 ( +6.745%) |  1.251  1.252 ( +0.080%)
x-ray       clang-9    1 |  538.5  585.4 ( +8.709%) |  1.251  1.252 ( +0.080%)
x-ray       clang-11   1 |  544.1  568.7 ( +4.521%) |  1.251  1.252 ( +0.080%)
x-ray       clang-12   1 |  529.7  529.2 ( -0.094%) |  1.251  1.252 ( +0.080%)
x-ray       gcc-4.8    2 |  229.5  286.4 (+24.793%) |  1.268  1.264 ( -0.315%)
x-ray       gcc-5      2 |  228.0  297.9 (+30.658%) |  1.268  1.264 ( -0.315%)
x-ray       gcc-6      2 |  230.2  301.2 (+30.843%) |  1.268  1.264 ( -0.315%)
x-ray       gcc-7      2 |  224.7  286.9 (+27.681%) |  1.268  1.264 ( -0.315%)
x-ray       gcc-8      2 |  234.7  300.6 (+28.078%) |  1.268  1.264 ( -0.315%)
x-ray       gcc-10     2 |  225.7  302.2 (+33.895%) |  1.268  1.264 ( -0.315%)
x-ray       clang-6.0  2 |  242.7  300.7 (+23.898%) |  1.268  1.264 ( -0.315%)
x-ray       clang-7    2 |  218.0  304.1 (+39.495%) |  1.268  1.264 ( -0.315%)
x-ray       clang-8    2 |  238.3  301.9 (+26.689%) |  1.268  1.264 ( -0.315%)
x-ray       clang-9    2 |  238.3  320.3 (+34.410%) |  1.268  1.264 ( -0.315%)
x-ray       clang-11   2 |  238.2  300.1 (+25.987%) |  1.268  1.264 ( -0.315%)
x-ray       clang-12   2 |  231.3  286.7 (+23.952%) |  1.268  1.264 ( -0.315%)
silesia.tar gcc-4.8   -1 |  307.2  325.6 ( +5.990%) |  2.434  2.436 ( +0.082%)
silesia.tar gcc-5     -1 |  307.0  325.8 ( +6.124%) |  2.434  2.436 ( +0.082%)
silesia.tar gcc-6     -1 |  308.5  331.0 ( +7.293%) |  2.434  2.436 ( +0.082%)
silesia.tar gcc-7     -1 |  318.4  320.9 ( +0.785%) |  2.434  2.436 ( +0.082%)
silesia.tar gcc-8     -1 |  300.2  325.1 ( +8.294%) |  2.434  2.436 ( +0.082%)
silesia.tar gcc-10    -1 |  316.8  321.8 ( +1.578%) |  2.434  2.436 ( +0.082%)
silesia.tar clang-6.0 -1 |  324.9  326.6 ( +0.523%) |  2.434  2.436 ( +0.082%)
silesia.tar clang-7   -1 |  327.8  315.9 ( -3.630%) |  2.434  2.436 ( +0.082%)
silesia.tar clang-8   -1 |  332.7  321.7 ( -3.306%) |  2.434  2.436 ( +0.082%)
silesia.tar clang-9   -1 |  326.1  323.5 ( -0.797%) |  2.434  2.436 ( +0.082%)
silesia.tar clang-11  -1 |  318.7  323.8 ( +1.600%) |  2.434  2.436 ( +0.082%)
silesia.tar clang-12  -1 |  326.5  325.6 ( -0.276%) |  2.434  2.436 ( +0.082%)
silesia.tar gcc-4.8    1 |  273.5  286.8 ( +4.863%) |  2.884  2.888 ( +0.139%)
silesia.tar gcc-5      1 |  277.6  290.2 ( +4.539%) |  2.884  2.888 ( +0.139%)
silesia.tar gcc-6      1 |  281.8  285.5 ( +1.313%) |  2.884  2.888 ( +0.139%)
silesia.tar gcc-7      1 |  278.9  290.6 ( +4.195%) |  2.884  2.888 ( +0.139%)
silesia.tar gcc-8      1 |  273.6  287.5 ( +5.080%) |  2.884  2.888 ( +0.139%)
silesia.tar gcc-10     1 |  275.8  289.6 ( +5.004%) |  2.884  2.888 ( +0.139%)
silesia.tar clang-6.0  1 |  283.6  284.8 ( +0.423%) |  2.884  2.888 ( +0.139%)
silesia.tar clang-7    1 |  285.8  284.7 ( -0.385%) |  2.884  2.888 ( +0.139%)
silesia.tar clang-8    1 |  278.0  280.6 ( +0.935%) |  2.884  2.888 ( +0.139%)
silesia.tar clang-9    1 |  296.6  289.7 ( -2.326%) |  2.884  2.888 ( +0.139%)
silesia.tar clang-11   1 |  270.9  302.6 (+11.702%) |  2.884  2.888 ( +0.139%)
silesia.tar clang-12   1 |  288.6  291.6 ( +1.040%) |  2.884  2.888 ( +0.139%)
silesia.tar gcc-4.8    2 |  200.0  211.4 ( +5.700%) |  3.046  3.048 ( +0.066%)
silesia.tar gcc-5      2 |  197.8  211.1 ( +6.724%) |  3.046  3.048 ( +0.066%)
silesia.tar gcc-6      2 |  205.6  205.5 ( -0.049%) |  3.046  3.048 ( +0.066%)
silesia.tar gcc-7      2 |  197.3  209.4 ( +6.133%) |  3.046  3.048 ( +0.066%)
silesia.tar gcc-8      2 |  194.1  217.4 (+12.004%) |  3.046  3.048 ( +0.066%)
silesia.tar gcc-10     2 |  204.9  211.9 ( +3.416%) |  3.046  3.048 ( +0.066%)
silesia.tar clang-6.0  2 |  211.2  209.5 ( -0.805%) |  3.046  3.048 ( +0.066%)
silesia.tar clang-7    2 |  212.4  203.5 ( -4.190%) |  3.046  3.048 ( +0.066%)
silesia.tar clang-8    2 |  200.6  208.0 ( +3.689%) |  3.046  3.048 ( +0.066%)
silesia.tar clang-9    2 |  216.2  206.7 ( -4.394%) |  3.046  3.048 ( +0.066%)
silesia.tar clang-11   2 |  202.8  215.1 ( +6.065%) |  3.046  3.048 ( +0.066%)
silesia.tar clang-12   2 |  207.3  210.6 ( +1.592%) |  3.046  3.048 ( +0.066%)
enwik8      gcc-4.8   -1 |  225.6  234.6 ( +3.989%) |  1.934  1.937 ( +0.155%)
enwik8      gcc-5     -1 |  222.8  231.8 ( +4.039%) |  1.934  1.937 ( +0.155%)
enwik8      gcc-6     -1 |  225.7  235.7 ( +4.431%) |  1.934  1.937 ( +0.155%)
enwik8      gcc-7     -1 |  225.5  235.2 ( +4.302%) |  1.934  1.937 ( +0.155%)
enwik8      gcc-8     -1 |  245.0  235.4 ( -3.918%) |  1.934  1.937 ( +0.155%)
enwik8      gcc-10    -1 |  231.4  237.4 ( +2.593%) |  1.934  1.937 ( +0.155%)
enwik8      clang-6.0 -1 |  232.0  234.5 ( +1.078%) |  1.934  1.937 ( +0.155%)
enwik8      clang-7   -1 |  235.4  233.4 ( -0.850%) |  1.934  1.937 ( +0.155%)
enwik8      clang-8   -1 |  229.4  225.3 ( -1.787%) |  1.934  1.937 ( +0.155%)
enwik8      clang-9   -1 |  229.1  238.6 ( +4.147%) |  1.934  1.937 ( +0.155%)
enwik8      clang-11  -1 |  221.0  231.4 ( +4.706%) |  1.934  1.937 ( +0.155%)
enwik8      clang-12  -1 |  227.1  232.6 ( +2.422%) |  1.934  1.937 ( +0.155%)
enwik8      gcc-4.8    1 |  200.0  216.6 ( +8.300%) |  2.455  2.459 ( +0.163%)
enwik8      gcc-5      1 |  201.6  211.1 ( +4.712%) |  2.455  2.459 ( +0.163%)
enwik8      gcc-6      1 |  210.1  212.8 ( +1.285%) |  2.455  2.459 ( +0.163%)
enwik8      gcc-7      1 |  206.2  215.5 ( +4.510%) |  2.455  2.459 ( +0.163%)
enwik8      gcc-8      1 |  210.5  211.8 ( +0.618%) |  2.455  2.459 ( +0.163%)
enwik8      gcc-10     1 |  211.3  207.9 ( -1.609%) |  2.455  2.459 ( +0.163%)
enwik8      clang-6.0  1 |  206.1  218.3 ( +5.919%) |  2.455  2.459 ( +0.163%)
enwik8      clang-7    1 |  207.6  210.5 ( +1.397%) |  2.455  2.459 ( +0.163%)
enwik8      clang-8    1 |  197.9  214.9 ( +8.590%) |  2.455  2.459 ( +0.163%)
enwik8      clang-9    1 |  215.4  211.6 ( -1.764%) |  2.455  2.459 ( +0.163%)
enwik8      clang-11   1 |  208.0  214.5 ( +3.125%) |  2.455  2.459 ( +0.163%)
enwik8      clang-12   1 |  204.3  213.6 ( +4.552%) |  2.455  2.459 ( +0.163%)
enwik8      gcc-4.8    2 |  135.1  148.1 ( +9.623%) |  2.671  2.679 ( +0.300%)
enwik8      gcc-5      2 |  140.7  144.8 ( +2.914%) |  2.671  2.679 ( +0.300%)
enwik8      gcc-6      2 |  143.5  147.0 ( +2.439%) |  2.671  2.679 ( +0.300%)
enwik8      gcc-7      2 |  138.8  148.4 ( +6.916%) |  2.671  2.679 ( +0.300%)
enwik8      gcc-8      2 |  142.3  143.3 ( +0.703%) |  2.671  2.679 ( +0.300%)
enwik8      gcc-10     2 |  135.9  141.4 ( +4.047%) |  2.671  2.679 ( +0.300%)
enwik8      clang-6.0  2 |  142.7  147.2 ( +3.153%) |  2.671  2.679 ( +0.300%)
enwik8      clang-7    2 |  139.5  140.9 ( +1.004%) |  2.671  2.679 ( +0.300%)
enwik8      clang-8    2 |  133.3  145.2 ( +8.927%) |  2.671  2.679 ( +0.300%)
enwik8      clang-9    2 |  148.1  144.9 ( -2.161%) |  2.671  2.679 ( +0.300%)
enwik8      clang-11   2 |  144.0  148.5 ( +3.125%) |  2.671  2.679 ( +0.300%)
enwik8      clang-12   2 |  140.8  145.5 ( +3.338%) |  2.671  2.679 ( +0.300%)
enwik9      gcc-4.8   -1 |  261.1  253.2 ( -3.026%) |  2.204  2.207 ( +0.136%)
enwik9      gcc-5     -1 |  247.3  251.1 ( +1.537%) |  2.204  2.207 ( +0.136%)
enwik9      gcc-6     -1 |  239.2  261.2 ( +9.197%) |  2.204  2.207 ( +0.136%)
enwik9      gcc-7     -1 |  249.9  265.7 ( +6.323%) |  2.204  2.207 ( +0.136%)
enwik9      gcc-8     -1 |  257.5  267.5 ( +3.883%) |  2.204  2.207 ( +0.136%)
enwik9      gcc-10    -1 |  259.1  265.0 ( +2.277%) |  2.204  2.207 ( +0.136%)
enwik9      clang-6.0 -1 |  252.5  253.5 ( +0.396%) |  2.204  2.207 ( +0.136%)
enwik9      clang-7   -1 |  253.6  262.3 ( +3.431%) |  2.204  2.207 ( +0.136%)
enwik9      clang-8   -1 |  269.3  265.3 ( -1.485%) |  2.204  2.207 ( +0.136%)
enwik9      clang-9   -1 |  247.1  265.2 ( +7.325%) |  2.204  2.207 ( +0.136%)
enwik9      clang-11  -1 |  252.0  264.7 ( +5.040%) |  2.204  2.207 ( +0.136%)
enwik9      clang-12  -1 |  252.8  260.2 ( +2.927%) |  2.204  2.207 ( +0.136%)
enwik9      gcc-4.8    1 |  233.1  235.9 ( +1.201%) |  2.798  2.802 ( +0.143%)
enwik9      gcc-5      1 |  226.2  233.2 ( +3.095%) |  2.798  2.802 ( +0.143%)
enwik9      gcc-6      1 |  222.0  236.1 ( +6.351%) |  2.798  2.802 ( +0.143%)
enwik9      gcc-7      1 |  228.4  233.6 ( +2.277%) |  2.798  2.802 ( +0.143%)
enwik9      gcc-8      1 |  235.2  226.8 ( -3.571%) |  2.798  2.802 ( +0.143%)
enwik9      gcc-10     1 |  218.8  241.2 (+10.238%) |  2.798  2.802 ( +0.143%)
enwik9      clang-6.0  1 |  227.1  234.2 ( +3.126%) |  2.798  2.802 ( +0.143%)
enwik9      clang-7    1 |  223.4  243.0 ( +8.774%) |  2.798  2.802 ( +0.143%)
enwik9      clang-8    1 |  238.7  227.7 ( -4.608%) |  2.798  2.802 ( +0.143%)
enwik9      clang-9    1 |  231.4  238.8 ( +3.198%) |  2.798  2.802 ( +0.143%)
enwik9      clang-11   1 |  235.5  234.8 ( -0.297%) |  2.798  2.802 ( +0.143%)
enwik9      clang-12   1 |  234.8  237.9 ( +1.320%) |  2.798  2.802 ( +0.143%)
enwik9      gcc-4.8    2 |  161.0  161.2 ( +0.124%) |  3.039  3.047 ( +0.263%)
enwik9      gcc-5      2 |  159.0  161.6 ( +1.635%) |  3.039  3.047 ( +0.263%)
enwik9      gcc-6      2 |  160.1  163.5 ( +2.124%) |  3.039  3.047 ( +0.263%)
enwik9      gcc-7      2 |  160.8  157.0 ( -2.363%) |  3.039  3.047 ( +0.263%)
enwik9      gcc-8      2 |  162.8  161.0 ( -1.106%) |  3.039  3.047 ( +0.263%)
enwik9      gcc-10     2 |  156.2  165.6 ( +6.018%) |  3.039  3.047 ( +0.263%)
enwik9      clang-6.0  2 |  162.9  165.7 ( +1.719%) |  3.039  3.047 ( +0.263%)
enwik9      clang-7    2 |  163.0  165.2 ( +1.350%) |  3.039  3.047 ( +0.263%)
enwik9      clang-8    2 |  158.3  158.0 ( -0.190%) |  3.039  3.047 ( +0.263%)
enwik9      clang-9    2 |  163.6  162.9 ( -0.428%) |  3.039  3.047 ( +0.263%)
enwik9      clang-11   2 |  164.7  160.6 ( -2.489%) |  3.039  3.047 ( +0.263%)
enwik9      clang-12   2 |  161.3  163.7 ( +1.488%) |  3.039  3.047 ( +0.263%)

(Benchmarked on an Intel Xeon E5-2680 v4 @ 2.40GHz.)
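
The parenthesized percentages in the table are consistent with a plain relative change between the two values to their left. A minimal sketch of that arithmetic, assuming the first value of each pair is the baseline and the second is this PR (the helper name `pct_change` is hypothetical and not part of zstd):

```c
#include <assert.h>

/* Hypothetical helper, not part of zstd: relative change of `after`
 * versus `before`, expressed in percent. */
static double pct_change(double before, double after)
{
    assert(before != 0.0);   /* a zero baseline has no relative change */
    return (after - before) / before * 100.0;
}
```

For the first samba row shown above, `pct_change(354.2, 350.4)` comes out at about -1.073 and `pct_change(3.433, 3.438)` at about +0.146, which agrees with the table.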

Status

I am satisfied with this PR and feel that it is ready to merge.

To-Do:

  • Achieve correctness.
  • Achieve ratio parity (approximately).
  • Achieve speed parity (approximately).
  • Search at the end of the block. (Decided not to.)
  • Consistently improve speed over existing implementation.
  • Benchmark on other CPUs and architectures.

@felixhandte felixhandte changed the title [WIP] Pipelined Implementation of ZSTD_fast Pipelined Implementation of ZSTD_fast (~+5% Speed) Aug 20, 2021
Amusingly, it seems to be a non-trivial performance hit to add in final
searches or even hash table insertions during cleanup. So let's not. It seems
to not make any meaningful difference in compression ratio.
… Speed)

Unrolling the loop to handle 2 positions in each iteration allows us to reduce
the frequency of some operations that don't need to happen at every position.
One such operation is the step calculation, which is a very rough heuristic
anyways. It's fine if we do this a position later. The other operation is the
repcode check. But since the repcode check already tries expanding back one
position, we're really not missing much of importance by only trying it every
other position.

This commit also slightly reorders some operations.
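
The unrolling described above can be illustrated with a toy scan loop. This is a sketch under stated assumptions, not the code in `lib/compress/zstd_fast.c`: every name (`toy_scan`, `toy_probe`, `toy_stats`), the hash function, and the skip formula are hypothetical, and unlike a real compressor it does not extend matches or emit sequences. It only shows the shape of the idea: two positions are probed per iteration, while the repcode check and the step update each run once per iteration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_HASH_LOG 10
#define TOY_MIN_MATCH 4

typedef struct {
    size_t probes;      /* hash-table probes (one per examined position) */
    size_t rep_checks;  /* repcode checks (one per loop iteration) */
    size_t hits;        /* repcode or table candidates equal on 4 bytes */
} toy_stats;

static uint32_t toy_read32(const uint8_t *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);   /* alignment-safe 4-byte load */
    return v;
}

static uint32_t toy_hash(const uint8_t *p)
{
    return (toy_read32(p) * 2654435761u) >> (32 - TOY_HASH_LOG);
}

/* Probe one position: look up a candidate, record this position, compare. */
static void toy_probe(const uint8_t *src, size_t pos,
                      uint32_t *table, toy_stats *st)
{
    uint32_t h = toy_hash(src + pos);
    uint32_t cand = table[h];        /* stored as position + 1; 0 = empty */
    table[h] = (uint32_t)pos + 1;
    st->probes++;
    if (cand != 0 && toy_read32(src + cand - 1) == toy_read32(src + pos))
        st->hits++;
}

static toy_stats toy_scan(const uint8_t *src, size_t len, size_t rep_offset)
{
    uint32_t table[1u << TOY_HASH_LOG] = {0};
    toy_stats st = {0, 0, 0};
    size_t misses = 0;
    size_t pos = 0;

    /* Each iteration needs 4 readable bytes at both pos and pos + 1. */
    while (pos + 1 + TOY_MIN_MATCH <= len) {
        size_t before = st.hits;
        size_t step;

        /* Once per iteration: repcode check at the first position only. */
        st.rep_checks++;
        if (rep_offset != 0 && pos >= rep_offset &&
            toy_read32(src + pos) == toy_read32(src + pos - rep_offset))
            st.hits++;

        /* Two positions per iteration. */
        toy_probe(src, pos, table, &st);
        toy_probe(src, pos + 1, table, &st);

        /* Once per iteration: rough skip heuristic (hypothetical formula). */
        misses = (st.hits == before) ? misses + 1 : 0;
        step = 1 + (misses >> 6);
        pos += 1 + step;
    }
    return st;
}
```

Calling `toy_scan(buf, len, rep_offset)` on a buffer returns counters in which `probes` is always twice `rep_checks`, which is the halving of per-position work that the commit message describes.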

This removes the old `ZSTD_compressBlock_fast_generic()` and renames the new
`ZSTD_compressBlock_fast_generic_pipelined()` to replace it. This is
functionally a no-op.

It's a bit strange, because this is hitting the dictionary special case where
the dictionary is contiguous with the input and still runs in the single-
segment path.

We should probably change that to hit the `extDict` path instead?
@Cyan4973 (Contributor) commented Sep 7, 2021

The maintenance complexity of this PR is pretty good, at a tractable level,
which is a good sign.

For additional control, I've been benchmarking this PR on a stable desktop system, using a variety of compilers.

With gcc-9.3 (default compiler), I see a generally consistent speed win at level 1,
by variable amounts, sometimes as high as +8%, sometimes at just +1%.
In one specific instance, calgary/geo, the new variant features an impressive speed gain (+30%) albeit for a slightly worse compression ratio (1.450 -> 1.435).
There is only a single instance of speed regression (silesia/nci) and even then it's tiny.

However, clang-10.0 was much less friendly, with the performance of this new PR being mostly slower,
by a small yet consistent and measurable amount, around -2% (with variations depending on the exact file), calgary/geo being the only remaining case with a clear speed win.

clang-10.0 is the default version on my Linux distro, but I can test other versions, to compensate for a bias specific to a version (or a specific build). So I tested with clang-11.0. And the good news is: it's better with this version. Though it's not better enough to make it clearly positive. It's more in the -0% territory, meaning both dev and this PR get almost the same performance, and differences are insignificant (except calgary/geo, as usual).
clang-12.0 shows a slightly better picture, with changes generally positive, although mostly by very little. But overall rather positive.

So that makes clang relatively "neutral" to this change across 3 versions.
What about gcc? v9.3.0 is clearly positive. Let's control with a different version, like gcc-10.3.
Yep, it's still clearly positive. Maybe a bit less positive, but still clearly, with everything being faster by some amount (except silesia/nci, as previously).

So, these results make this PR a clear gain for gcc, and somewhat neutral for clang.
Not sure if something can be done to improve clang, but in the meantime, we can probably live with some amount of variation (sometimes negative, sometimes positive) for this compiler.

@felixhandte felixhandte merged commit d68aa19 into facebook:dev Sep 9, 2021