Calculate relative, not absolute, scores in SabreSwap #9012

Merged: 4 commits, Nov 9, 2022

Commits on Oct 27, 2022

  1. Calculate relative, not absolute, scores in SabreSwap

    This is a significant performance improvement for very wide (100q+)
    circuits.
    
    The previous `SabreSwap` algorithm scored each candidate swap by
    applying the swap to the current layout, iterating through every element
    of the front layer and extended set to sum the total distances of the
    2q gates, and then undoing the swap.  However, in the front layer, a
    given swap can affect at most two 2q operations.  The new form instead
    scores each swap by the change in total distance relative to not making
    the swap.  This means that the basic score is computed from only one or
    two gates, no matter how many are in the front layer.
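
To make the idea concrete, here is a minimal Python sketch (not the Rust code in this commit). It assumes a toy line-shaped coupling map, a `layout` list mapping virtual to physical qubits, and swaps expressed on virtual qubits; the names `line_distance`, `absolute_score`, `relative_score` and `partner` are all hypothetical.

```python
def line_distance(num_qubits):
    """Distance matrix for a toy line-shaped coupling map (an assumption)."""
    return [[abs(i - j) for j in range(num_qubits)] for i in range(num_qubits)]


def absolute_score(gates, layout, dist):
    """Old style: sum the distance of every 2q gate in the front layer."""
    return sum(dist[layout[a]][layout[b]] for a, b in gates)


def relative_score(swap, partner, layout, dist):
    """New style: change in total distance, touching at most two gates."""
    u, v = swap
    delta = 0
    for moved, other in ((u, v), (v, u)):
        w = partner.get(moved)  # the other operand of the gate on `moved`
        if w is None or w == other:
            # No gate on this qubit, or both operands of one gate are
            # swapped, which leaves that gate's distance unchanged.
            continue
        # `moved` ends up where `other` was; `w` stays where it is.
        delta += dist[layout[other]][layout[w]] - dist[layout[moved]][layout[w]]
    return delta


dist = line_distance(6)
layout = [0, 3, 1, 5, 2, 4]  # virtual qubit -> physical qubit
gates = [(0, 1), (2, 3)]     # front layer; each qubit is in at most one gate
partner = {}
for a, b in gates:
    partner[a], partner[b] = b, a
```

For any swap, `relative_score` equals the difference between `absolute_score` after and before the swap, so ranking swaps by either quantity gives the same order for a fixed front layer.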
    
    This is an algorithmic complexity improvement in the scoring for
    volumetric circuits with respect to the number of qubits.  Such a
    circuit with `n` qubits has `n / 2` gates in its front layer at all
    times, and so (assuming a coupling map that expands by a constant
    connectivity factor per qubit, like heavy hex) `k * n` swaps to score.
    This means that choosing the best swap has quadratic complexity with the
    original Sabre scoring algorithm.  With this new algorithm, the score
    for a given swap is calculated in constant time, so choosing the best is
    instead linear.  In practice, I did not see the full asymptotic
    improvement at the scales I tested, but I did see significant
    improvements: a 1081q heavy-hex quantum-volume circuit at depth 5 was
    swap-mapped on my machine in 25s with this commit, compared to 100s
    before it.
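
As a rough illustration of the scaling argument only, the following Python sketch counts how many front-layer gates must be examined to choose one swap under each scheme. The connectivity factor `k` is a hypothetical constant, and the functions count gate evaluations rather than measuring time.

```python
def old_gate_evaluations(n, k=1):
    """Old scoring: every candidate swap rescans the whole front layer."""
    front_layer_gates = n // 2
    candidate_swaps = k * n
    return candidate_swaps * front_layer_gates


def new_gate_evaluations(n, k=1):
    """New scoring: every candidate swap looks at no more than two gates."""
    candidate_swaps = k * n
    return candidate_swaps * 2
```

Doubling `n` quadruples the old count but only doubles the new one, which is the quadratic-versus-linear difference described above.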
    
    The principal change is the structs `FrontLayer` and `ExtendedSet`,
    which combine constant-time hash-set insertions, lookups and removals
    with vectors to enable constant-time lookup of the affected qubits.
    `FrontLayer` now only ever holds currently unroutable 2q operations;
    routable operations are immediately placed into the output structures
    as soon as a new 2q gate is routed.  This routing is also done by
    exploiting that a swap can only affect two previously unroutable gates
    in the front layer; we just walk forwards along the outbound edges from
    those two nodes, adding any operations that become routable.  This
    avoids another linear scan through the whole front layer after each
    swap, although in practice this has less of a speed-up effect, because
    it already wasn't quadratic.
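
The shape of such a structure can be sketched in Python as follows. This is an analogue under stated assumptions, not the Rust `FrontLayer` itself: the class body, the method names and the use of an insertion-ordered `dict` are all illustrative choices.

```python
class FrontLayer:
    """Toy front layer: hash map keyed by node plus a per-qubit vector."""

    def __init__(self, num_qubits):
        # node id -> (qubit_a, qubit_b); Python dicts keep insertion order,
        # including after removals.
        self.nodes = {}
        # qubit -> (node id, other qubit), or None if the qubit is idle.
        self.qubits = [None] * num_qubits

    def insert(self, node, a, b):
        self.nodes[node] = (a, b)
        self.qubits[a] = (node, b)
        self.qubits[b] = (node, a)

    def remove(self, node):
        a, b = self.nodes.pop(node)
        self.qubits[a] = None
        self.qubits[b] = None

    def affected_by(self, swap):
        """Nodes touching either qubit of the swap: never more than two."""
        found = []
        for qubit in swap:
            entry = self.qubits[qubit]
            if entry is not None and entry[0] not in found:
                found.append(entry[0])
        return found


layer = FrontLayer(5)
layer.insert(10, 0, 3)
layer.insert(11, 1, 4)
print(layer.affected_by((3, 1)))
```

After a swap, only the nodes returned by `affected_by` need to be rechecked for routability, rather than the whole front layer.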
    
    In theory, this change does not affect how any swap is scored relative
    to any other with the same front layer and extended set (though the
    scores used in the comparison do change).  In order to precisely match
    the current implementation (and ensure reproducibility from a given
    seed), this tracks the insertion order of nodes into the front layer,
    including after removals.
    
    This commit completely modifies the internals of the Rust components of
    Sabre, although the actual algorithm is largely unchanged, aside from
    the scoring difference.  Various parts of the implementation do change
    for efficiency, though.
    
    This commit maintains RNG compatibility with the previous Rust
    implementation in most cases.  It is possible in some circuits for
    floating-point differences to cause different output, when several swaps
    are at the minimum score, but plus/minus 1ULP.  This happens in both the
    old and new forms of the implementation, but _which_ of the minimal
    swaps get the minus-1ULP score varies between them, and consequently
    affects the swap choice.  In fairly extensive testing, this appears to
    be the only mechanism for differences; I've verified that the
    release-valve mechanism and predecessor-requirement tracking function
    identically to before.  The resultant scores - relative for "basic" and
    "lookahead", absolute for "decay" - are in practice within 2ULP of the
    old algorithm's.
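
The mechanism can be reproduced in isolation with a small Python example. It is only an illustration of floating-point non-associativity: the swap names and the numbers are made up, and the two dictionaries stand in for two implementations that sum the same terms in different orders.

```python
import math

# The same three terms summed in two different orders.
left_first = (0.1 + 0.2) + 0.3
right_first = 0.1 + (0.2 + 0.3)

# Two swaps that are mathematically tied, scored by two implementations
# that happen to accumulate in opposite orders for each swap.
old_scores = {"swap_a": left_first, "swap_b": right_first}
new_scores = {"swap_a": right_first, "swap_b": left_first}


def minimal_swaps(scores):
    """Swaps whose score is exactly equal to the minimum."""
    best = min(scores.values())
    return [swap for swap, score in scores.items() if score == best]


print(minimal_swaps(old_scores), minimal_swaps(new_scores))
print(abs(left_first - right_first) <= 2 * math.ulp(right_first))
```

Both implementations agree to within a couple of ULP, yet each one sees a different swap as the unique minimum, so a seeded random choice among the minimal swaps can diverge.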
    
    In maintaining RNG compatibility, this commit leaves several further
    speed-ups on the table.  There is additional memory usage and tracking
    to maintain some required iteration orders, and some reordering checks
    that are not strictly necessary any more.  Further, the sorting stages
    at the ends of the swap-choosing functions (to maintain repeatability)
    can be made redundant now, since some hash-set iteration (which is
    effectively an uncontrolled randomisation per run) is no longer
    required.  These will be addressed in follow-ups.
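
The role of the sorting stage can be sketched in Python as below. This is a hedged analogue rather than the Rust code: `choose_best_swap` is a hypothetical helper, and Python's `random.Random` stands in for whatever seeded RNG the implementation uses.

```python
import random


def choose_best_swap(scores, seed):
    """Pick one of the minimal-score swaps reproducibly for a given seed."""
    best = min(scores.values())
    # Sorting removes any dependence on the order in which the candidates
    # were iterated, so the seeded choice is repeatable.
    candidates = sorted(swap for swap, score in scores.items() if score == best)
    return random.Random(seed).choice(candidates)


scores_one_order = {(0, 1): 2.0, (2, 3): 2.0, (1, 2): 3.0, (3, 4): 2.0}
scores_other_order = {(3, 4): 2.0, (1, 2): 3.0, (2, 3): 2.0, (0, 1): 2.0}
print(choose_best_swap(scores_one_order, seed=42))
```

If the candidates are already produced in a deterministic order, the `sorted` call becomes redundant, which is the follow-up saving mentioned above.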
    jakelishman committed Oct 27, 2022
    fcbc9e6
  2. Fix lint

    jakelishman committed Oct 27, 2022
    9016e7c

Commits on Nov 9, 2022

  1. 28722d2
  2. 99e497a