make SingleQubitCliffordGate immutable singletons and use it in qubit_characterizations for a 37% speedup #6392

NoureldinYosri · 2024-01-03T00:26:02Z

SingleQubitCliffordGates represents the 24 gates in the single qubit clifford group. Some of the operations this class implements are expensive computation but at the same time are properties of each of the 24 operators.

turning those expensive computations into cached properties is benificial for performance but for that the objects need to be immutable.

one of the places that heavily relies on single qubit cliffords is qubit_characterizations.py which implemented the single qubit clifford algebra from scratch by falling on to the matrix representation and doing matrix multiplication and inversion (for computing the adjoint) this led to a bottleneck while creating circuits (e.g. for example _create_parallel_rb_circuit for 50 qubits and a 1000 gates takes $3.886s$). using SingleQubitCliffordGates instead and using merged_with operation which maps two cliffords onto the clifford equaivalent to their composition leads to $2.148s$, with most of those 2s spent in Moment.__init__ which will be the target of the next PR.

I also made the 24 cliffords singleton, since there is no point in creating new object which won't have any of the cached properties.

part of #6389

…marking

pavoljuhas

LGTM with a few minor suggestions.

pavoljuhas · 2024-01-06T01:52:32Z

cirq-core/cirq/experiments/qubit_characterizations.py

@@ -794,10 +787,14 @@ def _single_qubit_gates(


 def _single_qubit_cliffords() -> Cliffords:


NIT: Should we cache _single_qubit_cliffords as well?

pavoljuhas · 2024-01-06T01:54:43Z

cirq-core/cirq/experiments/qubit_characterizations_test.py

+        num_x = len([gate for gate in gates if gate == cirq.ops.SingleQubitCliffordGate.X])
+        num_z = len([gate for gate in gates if gate == cirq.ops.SingleQubitCliffordGate.Z])


I think we should restore the assertion on len(gates). How about

Suggested change

num_x = len([gate for gate in gates if gate == cirq.ops.SingleQubitCliffordGate.X])

num_z = len([gate for gate in gates if gate == cirq.ops.SingleQubitCliffordGate.Z])

num_i = sum(1 for gate in gates if gate == cirq.ops.SingleQubitCliffordGate.I)

num_x = sum(1 for gate in gates if gate == cirq.ops.SingleQubitCliffordGate.X)

num_z = sum(1 for gate in gates if gate == cirq.ops.SingleQubitCliffordGate.Z)

assert num_i + num_x + num_z == len(gates)

the group has X^{0.5, -0.5, 0} and Z^{1, 0.5, -0.5} so the check has to have sqrtX, and adjointXsqrt. added

I see. Let us then leave it as it was with a fancier sum-counting of the X and Z gates.

pavoljuhas · 2024-01-06T02:05:36Z

cirq-core/cirq/ops/clifford_gate.py

@@ -356,6 +358,8 @@ def _get_sqrt_map(
 class CliffordGate(raw_types.Gate, CommonCliffordGates):
    """Clifford rotation for N-qubit."""

+    _clifford_tableau: qis.CliffordTableau


Nit: We can move this to line 382 as

self._clifford_tableau: qis.CliffordTableau

maffoo · 2024-01-08T04:56:46Z

cirq-core/cirq/ops/clifford_gate.py

@@ -761,6 +777,7 @@ def commutes_with_pauli(self, pauli: Pauli) -> bool:
        to, flip = self.pauli_tuple(pauli)
        return to == pauli and not flip

+    @functools.cache


We usually avoid using functools.cache on instance methods because it creates a single cache at the class level; instead can use _compat.cached_method which creates separate caches on each instance. Should take about the same space since the since the class-level cache would store all (self, second) tuples, whereas the separate instance caches would just store second, but there would be one cache for each self.

thanks for the suggestion, changed to cached_method

maffoo · 2024-01-08T05:00:39Z

cirq-core/cirq/ops/clifford_gate.py

@@ -457,6 +467,7 @@ def _act_on_(
        return NotImplemented


+@dataclass(frozen=True, init=False, eq=False, repr=False)


What does making this a dataclass actually do here with all these options set to False?

makes the class frozen while not overriding any method. the default parameters will create an __init__, __eq__ and __repr__ but we already implement these.

maffoo · 2024-01-08T05:02:20Z

cirq-core/cirq/qis/clifford_tableau.py

@@ -652,3 +652,6 @@ def measure(
        self, axes: Sequence[int], seed: 'cirq.RANDOM_STATE_OR_SEED_LIKE' = None
    ) -> List[int]:
        return [self._measure(axis, random_state.parse_random_state(seed)) for axis in axes]
+
+    def __hash__(self) -> int:


consider making this a cached_method as well.

codecov · 2024-01-08T19:24:43Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (d33b1a7) 97.81% compared to head (8fc489f) 97.81%.
Report is 1 commits behind head on main.

Additional details and impacted files

@@           Coverage Diff           @@
##             main    #6392   +/-   ##
=======================================
  Coverage   97.81%   97.81%           
=======================================
  Files        1111     1111           
  Lines       97054    97084   +30     
=======================================
+ Hits        94931    94961   +30     
  Misses       2123     2123

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

#6401) This is a followup to #6392 and reduces circuit creation time (for 1000 cliffords and 50 qubits) to 0.213s from the original 3.886s (94.5% speedup) and the 2.148s (90% speedup) following #6392 ``` In [3]: import cirq.experiments.qubit_characterizations as ceq ...: import cirq ...: import time ...: import numpy as np ...: ...: num_cliffords = 1000 ...: qubits = cirq.LineQubit.range(50) ...: ...: cliffords = ceq._single_qubit_cliffords() ...: c1 = cliffords.c1_in_xy ...: t1 = time.time() ...: seq = ceq._create_parallel_rb_circuit(qubits, num_cliffords, c1) ...: t2 = time.time() ...: print(f'{t2-t1} s') 0.2055680751800537 s ```

…_characterizations for a 37% speedup (quantumlib#6392) SingleQubitCliffordGates represents the 24 gates in the single qubit clifford group. Some of the operations this class implements are expensive computation but at the same time are properties of each of the 24 operators. turning those expensive computations into cached properties is benificial for performance but for that the objects need to be immutable. one of the places that heavily relies on single qubit cliffords is `qubit_characterizations.py` which implemented the single qubit clifford algebra from scratch by falling on to the matrix representation and doing matrix multiplication and inversion (for computing the adjoint) this led to a bottleneck while creating circuits (e.g. for example _create_parallel_rb_circuit for 50 qubits and a 1000 gates takes $3.886s$). using SingleQubitCliffordGates instead and using `merged_with` operation which maps two cliffords onto the clifford equaivalent to their composition leads to $2.148s$, with most of those 2s spent in `Moment.__init__` which will be the target of the next PR. I also made the 24 cliffords singleton, since there is no point in creating new object which won't have any of the cached properties. part of quantumlib#6389

quantumlib#6401) This is a followup to quantumlib#6392 and reduces circuit creation time (for 1000 cliffords and 50 qubits) to 0.213s from the original 3.886s (94.5% speedup) and the 2.148s (90% speedup) following quantumlib#6392 ``` In [3]: import cirq.experiments.qubit_characterizations as ceq ...: import cirq ...: import time ...: import numpy as np ...: ...: num_cliffords = 1000 ...: qubits = cirq.LineQubit.range(50) ...: ...: cliffords = ceq._single_qubit_cliffords() ...: c1 = cliffords.c1_in_xy ...: t1 = time.time() ...: seq = ceq._create_parallel_rb_circuit(qubits, num_cliffords, c1) ...: t2 = time.time() ...: print(f'{t2-t1} s') 0.2055680751800537 s ```

make SingleQubitCliffordGate immutable singletons and use it in bench…

a19846a

…marking

NoureldinYosri requested review from mrwojtek, vtomole, cduck and a team as code owners January 3, 2024 00:26

NoureldinYosri requested a review from verult January 3, 2024 00:26

NoureldinYosri changed the title ~~make SingleQubitCliffordGate immutable singletons and use it in benchmarking for a 37% speedup~~ make SingleQubitCliffordGate immutable singletons and use it in qubit_characterizations for a 37% speedup Jan 3, 2024

pavoljuhas self-assigned this Jan 3, 2024

pavoljuhas self-requested a review January 5, 2024 02:03

pavoljuhas removed their assignment Jan 5, 2024

pavoljuhas approved these changes Jan 6, 2024

View reviewed changes

maffoo reviewed Jan 8, 2024

View reviewed changes

NoureldinYosri added 2 commits January 8, 2024 11:02

address comments

7495f3a

rerun doctest

3d95892

Merge branch 'main' into cliffords

8fc489f

NoureldinYosri enabled auto-merge (squash) January 8, 2024 19:35

NoureldinYosri merged commit b3e0eeb into main Jan 8, 2024
38 checks passed

NoureldinYosri mentioned this pull request Jan 8, 2024

optimize clifford benchmark circuit creation for a total 94.5% speedup #6401

Merged

pavoljuhas mentioned this pull request Jan 10, 2024

Remove cached_property backport #6398

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

make SingleQubitCliffordGate immutable singletons and use it in qubit_characterizations for a 37% speedup #6392

make SingleQubitCliffordGate immutable singletons and use it in qubit_characterizations for a 37% speedup #6392

NoureldinYosri commented Jan 3, 2024 •

edited

Loading

pavoljuhas left a comment

pavoljuhas Jan 6, 2024

NoureldinYosri Jan 8, 2024

pavoljuhas Jan 6, 2024

NoureldinYosri Jan 8, 2024

pavoljuhas Jan 8, 2024

pavoljuhas Jan 6, 2024

maffoo Jan 8, 2024

NoureldinYosri Jan 8, 2024

maffoo Jan 8, 2024

NoureldinYosri Jan 8, 2024

maffoo Jan 8, 2024

NoureldinYosri Jan 8, 2024

codecov bot commented Jan 8, 2024 •

edited

Loading

		@@ -794,10 +787,14 @@ def _single_qubit_gates(


		def _single_qubit_cliffords() -> Cliffords:

		num_x = len([gate for gate in gates if gate == cirq.ops.SingleQubitCliffordGate.X])
		num_z = len([gate for gate in gates if gate == cirq.ops.SingleQubitCliffordGate.Z])

-        num_x = len([gate for gate in gates if gate == cirq.ops.SingleQubitCliffordGate.X])
-        num_z = len([gate for gate in gates if gate == cirq.ops.SingleQubitCliffordGate.Z])
+        num_i = sum(1 for gate in gates if gate == cirq.ops.SingleQubitCliffordGate.I)
+        num_x = sum(1 for gate in gates if gate == cirq.ops.SingleQubitCliffordGate.X)
+        num_z = sum(1 for gate in gates if gate == cirq.ops.SingleQubitCliffordGate.Z)
+        assert num_i + num_x + num_z == len(gates)

		@@ -457,6 +467,7 @@ def _act_on_(
		return NotImplemented


		@dataclass(frozen=True, init=False, eq=False, repr=False)

make SingleQubitCliffordGate immutable singletons and use it in qubit_characterizations for a 37% speedup #6392

make SingleQubitCliffordGate immutable singletons and use it in qubit_characterizations for a 37% speedup #6392

Conversation

NoureldinYosri commented Jan 3, 2024 • edited Loading

pavoljuhas left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

codecov bot commented Jan 8, 2024 • edited Loading

Codecov Report

NoureldinYosri commented Jan 3, 2024 •

edited

Loading

codecov bot commented Jan 8, 2024 •

edited

Loading