-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
make SingleQubitCliffordGate immutable singletons and use it in qubit_characterizations for a 37% speedup #6392
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM with a few minor suggestions.
@@ -794,10 +787,14 @@ def _single_qubit_gates( | |||
|
|||
|
|||
def _single_qubit_cliffords() -> Cliffords: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NIT: Should we cache _single_qubit_cliffords
as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
num_x = len([gate for gate in gates if gate == cirq.ops.SingleQubitCliffordGate.X]) | ||
num_z = len([gate for gate in gates if gate == cirq.ops.SingleQubitCliffordGate.Z]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should restore the assertion on len(gates)
. How about
num_x = len([gate for gate in gates if gate == cirq.ops.SingleQubitCliffordGate.X]) | |
num_z = len([gate for gate in gates if gate == cirq.ops.SingleQubitCliffordGate.Z]) | |
num_i = sum(1 for gate in gates if gate == cirq.ops.SingleQubitCliffordGate.I) | |
num_x = sum(1 for gate in gates if gate == cirq.ops.SingleQubitCliffordGate.X) | |
num_z = sum(1 for gate in gates if gate == cirq.ops.SingleQubitCliffordGate.Z) | |
assert num_i + num_x + num_z == len(gates) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the group has X^{0.5, -0.5, 0} and Z^{1, 0.5, -0.5} so the check has to have sqrtX, and adjointXsqrt. added
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. Let us then leave it as it was with a fancier sum-counting of the X and Z gates.
@@ -356,6 +358,8 @@ def _get_sqrt_map( | |||
class CliffordGate(raw_types.Gate, CommonCliffordGates): | |||
"""Clifford rotation for N-qubit.""" | |||
|
|||
_clifford_tableau: qis.CliffordTableau |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: We can move this to line 382 as
self._clifford_tableau: qis.CliffordTableau
cirq-core/cirq/ops/clifford_gate.py
Outdated
@@ -761,6 +777,7 @@ def commutes_with_pauli(self, pauli: Pauli) -> bool: | |||
to, flip = self.pauli_tuple(pauli) | |||
return to == pauli and not flip | |||
|
|||
@functools.cache |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We usually avoid using functools.cache
on instance methods because it creates a single cache at the class level; instead can use _compat.cached_method
which creates separate caches on each instance. Should take about the same space since the since the class-level cache would store all (self, second)
tuples, whereas the separate instance caches would just store second
, but there would be one cache for each self
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for the suggestion, changed to cached_method
@@ -457,6 +467,7 @@ def _act_on_( | |||
return NotImplemented | |||
|
|||
|
|||
@dataclass(frozen=True, init=False, eq=False, repr=False) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does making this a dataclass actually do here with all these options set to False?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
makes the class frozen while not overriding any method. the default parameters will create an __init__
, __eq__
and __repr__
but we already implement these.
@@ -652,3 +652,6 @@ def measure( | |||
self, axes: Sequence[int], seed: 'cirq.RANDOM_STATE_OR_SEED_LIKE' = None | |||
) -> List[int]: | |||
return [self._measure(axis, random_state.parse_random_state(seed)) for axis in axes] | |||
|
|||
def __hash__(self) -> int: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
consider making this a cached_method as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #6392 +/- ##
=======================================
Coverage 97.81% 97.81%
=======================================
Files 1111 1111
Lines 97054 97084 +30
=======================================
+ Hits 94931 94961 +30
Misses 2123 2123 ☔ View full report in Codecov by Sentry. |
#6401) This is a followup to #6392 and reduces circuit creation time (for 1000 cliffords and 50 qubits) to 0.213s from the original 3.886s (94.5% speedup) and the 2.148s (90% speedup) following #6392 ``` In [3]: import cirq.experiments.qubit_characterizations as ceq ...: import cirq ...: import time ...: import numpy as np ...: ...: num_cliffords = 1000 ...: qubits = cirq.LineQubit.range(50) ...: ...: cliffords = ceq._single_qubit_cliffords() ...: c1 = cliffords.c1_in_xy ...: t1 = time.time() ...: seq = ceq._create_parallel_rb_circuit(qubits, num_cliffords, c1) ...: t2 = time.time() ...: print(f'{t2-t1} s') 0.2055680751800537 s ```
…_characterizations for a 37% speedup (quantumlib#6392) SingleQubitCliffordGates represents the 24 gates in the single qubit clifford group. Some of the operations this class implements are expensive computation but at the same time are properties of each of the 24 operators. turning those expensive computations into cached properties is benificial for performance but for that the objects need to be immutable. one of the places that heavily relies on single qubit cliffords is `qubit_characterizations.py` which implemented the single qubit clifford algebra from scratch by falling on to the matrix representation and doing matrix multiplication and inversion (for computing the adjoint) this led to a bottleneck while creating circuits (e.g. for example _create_parallel_rb_circuit for 50 qubits and a 1000 gates takes $3.886s$). using SingleQubitCliffordGates instead and using `merged_with` operation which maps two cliffords onto the clifford equaivalent to their composition leads to $2.148s$, with most of those 2s spent in `Moment.__init__` which will be the target of the next PR. I also made the 24 cliffords singleton, since there is no point in creating new object which won't have any of the cached properties. part of quantumlib#6389
quantumlib#6401) This is a followup to quantumlib#6392 and reduces circuit creation time (for 1000 cliffords and 50 qubits) to 0.213s from the original 3.886s (94.5% speedup) and the 2.148s (90% speedup) following quantumlib#6392 ``` In [3]: import cirq.experiments.qubit_characterizations as ceq ...: import cirq ...: import time ...: import numpy as np ...: ...: num_cliffords = 1000 ...: qubits = cirq.LineQubit.range(50) ...: ...: cliffords = ceq._single_qubit_cliffords() ...: c1 = cliffords.c1_in_xy ...: t1 = time.time() ...: seq = ceq._create_parallel_rb_circuit(qubits, num_cliffords, c1) ...: t2 = time.time() ...: print(f'{t2-t1} s') 0.2055680751800537 s ```
SingleQubitCliffordGates represents the 24 gates in the single qubit clifford group. Some of the operations this class implements are expensive computation but at the same time are properties of each of the 24 operators.
turning those expensive computations into cached properties is benificial for performance but for that the objects need to be immutable.
one of the places that heavily relies on single qubit cliffords is$3.886s$ ). using SingleQubitCliffordGates instead and using $2.148s$ , with most of those 2s spent in
qubit_characterizations.py
which implemented the single qubit clifford algebra from scratch by falling on to the matrix representation and doing matrix multiplication and inversion (for computing the adjoint) this led to a bottleneck while creating circuits (e.g. for example _create_parallel_rb_circuit for 50 qubits and a 1000 gates takesmerged_with
operation which maps two cliffords onto the clifford equaivalent to their composition leads toMoment.__init__
which will be the target of the next PR.I also made the 24 cliffords singleton, since there is no point in creating new object which won't have any of the cached properties.
part of #6389