Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

make SingleQubitCliffordGate immutable singletons and use it in qubit_characterizations for a 37% speedup #6392

Merged
merged 4 commits into from
Jan 8, 2024

Conversation

NoureldinYosri
Copy link
Collaborator

@NoureldinYosri NoureldinYosri commented Jan 3, 2024

SingleQubitCliffordGates represents the 24 gates in the single qubit clifford group. Some of the operations this class implements are expensive computation but at the same time are properties of each of the 24 operators.

turning those expensive computations into cached properties is benificial for performance but for that the objects need to be immutable.

one of the places that heavily relies on single qubit cliffords is qubit_characterizations.py which implemented the single qubit clifford algebra from scratch by falling on to the matrix representation and doing matrix multiplication and inversion (for computing the adjoint) this led to a bottleneck while creating circuits (e.g. for example _create_parallel_rb_circuit for 50 qubits and a 1000 gates takes $3.886s$). using SingleQubitCliffordGates instead and using merged_with operation which maps two cliffords onto the clifford equaivalent to their composition leads to $2.148s$, with most of those 2s spent in Moment.__init__ which will be the target of the next PR.

I also made the 24 cliffords singleton, since there is no point in creating new object which won't have any of the cached properties.

part of #6389

@NoureldinYosri NoureldinYosri requested review from mrwojtek, vtomole, cduck and a team as code owners January 3, 2024 00:26
@NoureldinYosri NoureldinYosri requested a review from verult January 3, 2024 00:26
@NoureldinYosri NoureldinYosri changed the title make SingleQubitCliffordGate immutable singletons and use it in benchmarking for a 37% speedup make SingleQubitCliffordGate immutable singletons and use it in qubit_characterizations for a 37% speedup Jan 3, 2024
@pavoljuhas pavoljuhas self-assigned this Jan 3, 2024
@pavoljuhas pavoljuhas self-requested a review January 5, 2024 02:03
@pavoljuhas pavoljuhas removed their assignment Jan 5, 2024
Copy link
Collaborator

@pavoljuhas pavoljuhas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with a few minor suggestions.

@@ -794,10 +787,14 @@ def _single_qubit_gates(


def _single_qubit_cliffords() -> Cliffords:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NIT: Should we cache _single_qubit_cliffords as well?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Comment on lines 78 to 79
num_x = len([gate for gate in gates if gate == cirq.ops.SingleQubitCliffordGate.X])
num_z = len([gate for gate in gates if gate == cirq.ops.SingleQubitCliffordGate.Z])
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should restore the assertion on len(gates). How about

Suggested change
num_x = len([gate for gate in gates if gate == cirq.ops.SingleQubitCliffordGate.X])
num_z = len([gate for gate in gates if gate == cirq.ops.SingleQubitCliffordGate.Z])
num_i = sum(1 for gate in gates if gate == cirq.ops.SingleQubitCliffordGate.I)
num_x = sum(1 for gate in gates if gate == cirq.ops.SingleQubitCliffordGate.X)
num_z = sum(1 for gate in gates if gate == cirq.ops.SingleQubitCliffordGate.Z)
assert num_i + num_x + num_z == len(gates)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the group has X^{0.5, -0.5, 0} and Z^{1, 0.5, -0.5} so the check has to have sqrtX, and adjointXsqrt. added

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. Let us then leave it as it was with a fancier sum-counting of the X and Z gates.

@@ -356,6 +358,8 @@ def _get_sqrt_map(
class CliffordGate(raw_types.Gate, CommonCliffordGates):
"""Clifford rotation for N-qubit."""

_clifford_tableau: qis.CliffordTableau
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: We can move this to line 382 as

        self._clifford_tableau: qis.CliffordTableau

@@ -761,6 +777,7 @@ def commutes_with_pauli(self, pauli: Pauli) -> bool:
to, flip = self.pauli_tuple(pauli)
return to == pauli and not flip

@functools.cache
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We usually avoid using functools.cache on instance methods because it creates a single cache at the class level; instead can use _compat.cached_method which creates separate caches on each instance. Should take about the same space since the since the class-level cache would store all (self, second) tuples, whereas the separate instance caches would just store second, but there would be one cache for each self.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for the suggestion, changed to cached_method

@@ -457,6 +467,7 @@ def _act_on_(
return NotImplemented


@dataclass(frozen=True, init=False, eq=False, repr=False)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does making this a dataclass actually do here with all these options set to False?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

makes the class frozen while not overriding any method. the default parameters will create an __init__, __eq__ and __repr__ but we already implement these.

@@ -652,3 +652,6 @@ def measure(
self, axes: Sequence[int], seed: 'cirq.RANDOM_STATE_OR_SEED_LIKE' = None
) -> List[int]:
return [self._measure(axis, random_state.parse_random_state(seed)) for axis in axes]

def __hash__(self) -> int:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

consider making this a cached_method as well.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Copy link

codecov bot commented Jan 8, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (d33b1a7) 97.81% compared to head (8fc489f) 97.81%.
Report is 1 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #6392   +/-   ##
=======================================
  Coverage   97.81%   97.81%           
=======================================
  Files        1111     1111           
  Lines       97054    97084   +30     
=======================================
+ Hits        94931    94961   +30     
  Misses       2123     2123           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@NoureldinYosri NoureldinYosri enabled auto-merge (squash) January 8, 2024 19:35
@NoureldinYosri NoureldinYosri merged commit b3e0eeb into main Jan 8, 2024
38 checks passed
NoureldinYosri added a commit that referenced this pull request Jan 8, 2024
#6401)

This is a followup to #6392 and reduces circuit creation time (for 1000 cliffords and 50 qubits) to 0.213s from the original 3.886s (94.5% speedup) and the 2.148s (90% speedup) following #6392

```
In [3]: import cirq.experiments.qubit_characterizations as ceq
   ...: import cirq
   ...: import time
   ...: import numpy as np
   ...: 
   ...: num_cliffords = 1000
   ...: qubits = cirq.LineQubit.range(50)
   ...: 
   ...: cliffords = ceq._single_qubit_cliffords()
   ...: c1 = cliffords.c1_in_xy
   ...: t1 = time.time()
   ...: seq = ceq._create_parallel_rb_circuit(qubits, num_cliffords, c1)
   ...: t2 = time.time()
   ...: print(f'{t2-t1} s')
0.2055680751800537 s
```
harry-phasecraft pushed a commit to PhaseCraft/Cirq that referenced this pull request Oct 31, 2024
…_characterizations for a 37% speedup (quantumlib#6392)

SingleQubitCliffordGates represents the 24 gates in the single qubit clifford group. Some of the operations this class implements are expensive computation but at the same time are properties of each of the 24 operators.

turning those expensive computations into cached properties is benificial for performance but for that the objects need to be immutable.

one of the places that heavily relies on single qubit cliffords is `qubit_characterizations.py` which implemented the single qubit clifford algebra from scratch by falling on to the matrix representation and doing matrix multiplication and inversion (for computing the adjoint) this led to a bottleneck while creating circuits (e.g. for example _create_parallel_rb_circuit for 50 qubits and a 1000 gates takes $3.886s$). using SingleQubitCliffordGates instead and using `merged_with` operation which maps two cliffords onto the clifford equaivalent to their composition leads to $2.148s$, with most of those 2s spent in `Moment.__init__` which will be the target of the next PR.

I also made the 24 cliffords singleton, since there is no point in creating new object which won't have any of the cached properties.
 
part of quantumlib#6389
harry-phasecraft pushed a commit to PhaseCraft/Cirq that referenced this pull request Oct 31, 2024
quantumlib#6401)

This is a followup to quantumlib#6392 and reduces circuit creation time (for 1000 cliffords and 50 qubits) to 0.213s from the original 3.886s (94.5% speedup) and the 2.148s (90% speedup) following quantumlib#6392

```
In [3]: import cirq.experiments.qubit_characterizations as ceq
   ...: import cirq
   ...: import time
   ...: import numpy as np
   ...: 
   ...: num_cliffords = 1000
   ...: qubits = cirq.LineQubit.range(50)
   ...: 
   ...: cliffords = ceq._single_qubit_cliffords()
   ...: c1 = cliffords.c1_in_xy
   ...: t1 = time.time()
   ...: seq = ceq._create_parallel_rb_circuit(qubits, num_cliffords, c1)
   ...: t2 = time.time()
   ...: print(f'{t2-t1} s')
0.2055680751800537 s
```
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants