[RFC] Unitary Synthesis #1475

Open
10 of 12 tasks
khalatepradnya opened this issue Apr 4, 2024 · 7 comments · Fixed by #1708, #1794, #2217 or #2154
Labels
language (Anything related to the CUDA Quantum language specification), RFC-approved

Comments

@khalatepradnya
Collaborator

khalatepradnya commented Apr 4, 2024

Describe the feature

Problem

Given an arbitrary user-provided unitary matrix, synthesize it into a sequence of quantum gates.

Expectations

  • User provides an arbitrary unitary matrix as a custom quantum operation.
  • The custom operation can be used as a regular CUDA-Q supported quantum operation.
    • Q: Broadcast (same operation on multiple qubits): Out of scope
  • The allowed set of quantum gates for synthesis depends on the backend target.
    • Q: Allow user to specify set of allowed gates: Out of scope
  • CUDA-Q throws an error if a unitary cannot be reasonably synthesized.
    • "Reasonably" accounts for a time limit (timeout), a gate count limit (upper threshold), and how closely the synthesized circuit matches the input unitary (tolerance).
  • Parameterized custom operations will be covered in a follow-up RFC.
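
The tolerance criterion above can be made concrete with a distance measure between the input unitary and the matrix of the synthesized circuit. The following is a minimal NumPy sketch, not the metric CUDA-Q uses: the function names and the default tolerance are assumptions chosen for illustration. It compares two unitaries up to a global phase, since a global phase has no observable effect.

```python
import numpy as np


def phase_invariant_distance(target, candidate):
    """Return 1 - |Tr(target^dagger @ candidate)| / dim.

    The value is 0 when the two unitaries agree up to a global phase
    and grows toward 1 as they diverge. (Hypothetical helper.)
    """
    target = np.asarray(target, dtype=complex)
    candidate = np.asarray(candidate, dtype=complex)
    dim = target.shape[0]
    overlap = np.trace(target.conj().T @ candidate)
    return 1.0 - abs(overlap) / dim


def within_tolerance(target, candidate, tolerance=1e-8):
    """Check a synthesized matrix against the input (assumed default tolerance)."""
    return phase_invariant_distance(target, candidate) <= tolerance


hadamard = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
pauli_x = np.array([[0, 1], [1, 0]])

# The same gate with an extra global phase is accepted.
print(within_tolerance(hadamard, np.exp(0.3j) * hadamard))
# A different gate is rejected.
print(within_tolerance(hadamard, pauli_x))
```

A timeout and a gate count threshold would be enforced separately by whatever synthesis routine produces the candidate circuit.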

User API

  • Python
import cudaq
import numpy as np

cudaq.register_operation("custom_h", 1. / np.sqrt(2.) * np.array([[1, 1], [1, -1]]))
cudaq.register_operation("custom_x", np.array([[0, 1], [1, 0]]))

@cudaq.kernel 
def bell(): 
  qubits = cudaq.qvector(2) 
  custom_h(qubits[0]) 
  custom_x.ctrl(qubits[0], qubits[1]) 

counts = cudaq.sample(bell) 
counts.dump()

  • C++
#include <cudaq.h>
#include <cmath>
#include <complex>
#include <iostream>
#include <vector>

// Macro to specify the custom unitary operation
cudaq_register_operation(custom_h, 1, 0,
                         (std::vector<std::vector<std::complex<double>>>{
                             {M_SQRT1_2, M_SQRT1_2}, {M_SQRT1_2, -M_SQRT1_2}}));
cudaq_register_operation(
    custom_x, 1, 0, (std::vector<std::vector<std::complex<double>>>{{0, 1}, {1, 0}}));

void custom_operation() __qpu__ {
  cudaq::qvector qubits(2);
  custom_h(qubits[0]);
  custom_x.ctrl(qubits[0], qubits[1]);
}

int main() {
  auto result = cudaq::sample(custom_operation);
  std::cout << result.most_probable() << '\n';
  return 0;
}
  • The user must provide a valid unitary matrix (CUDA-Q will not check or enforce this requirement).
  • Ordering: The user-provided matrix must be in row-major format.
  • Endianness: The user-provided matrix is interpreted as big-endian (the convention most physics textbooks follow).
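
Because CUDA-Q will not validate the matrix, a caller may want to check unitarity before registering an operation. The sketch below uses only NumPy; `is_unitary` is a hypothetical helper, not part of the CUDA-Q API. The second half illustrates the big-endian convention under the assumption that the first qubit argument corresponds to the most significant bit of the basis-state index.

```python
import numpy as np


def is_unitary(matrix, atol=1e-10):
    """Return True if matrix is square and satisfies U^dagger U = I."""
    m = np.asarray(matrix, dtype=complex)
    if m.ndim != 2 or m.shape[0] != m.shape[1]:
        return False
    identity = np.eye(m.shape[0])
    return bool(np.allclose(m.conj().T @ m, identity, rtol=0.0, atol=atol))


custom_h = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
not_unitary = np.array([[1, 1], [0, 1]])
print(is_unitary(custom_h), is_unitary(not_unitary))

# Big-endian controlled-X, written row by row (row-major).
# Basis order is |00>, |01>, |10>, |11>, with the first (control)
# qubit as the leftmost, most significant bit (assumed convention).
cx_big_endian = np.array([
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
])

state_10 = np.zeros(4)
state_10[0b10] = 1.0  # control = 1, target = 0
result = cx_big_endian @ state_10
print(int(np.argmax(result)) == 0b11)  # the target qubit is flipped
```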

Constraints

  • Size of the unitary matrix: limited to 8 qubits, i.e. a 256 x 256 matrix (2^8 = 256).
  • The custom operation must be defined outside of a quantum kernel (e.g., a call to register_operation cannot be inside a function decorated with @cudaq.kernel).
  • The tolerance for the synthesized circuit and the gate count limit will be default values determined by CUDA-Q.
  • The custom operation definition is restricted to qubits (cudaq::qudit<2>).
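
The size constraint implies that a matrix acting on n qubits has dimension 2^n with n at most 8. A possible pre-check is sketched here; `qubit_count` and `MAX_QUBITS` are illustrative names and do not reflect how CUDA-Q enforces the limit.

```python
import numpy as np

MAX_QUBITS = 8  # limit stated in the constraints above


def qubit_count(matrix):
    """Infer the number of qubits from a square 2^n x 2^n matrix.

    Raises ValueError if the shape is not a power-of-two square or
    exceeds the 8-qubit limit. (Hypothetical helper.)
    """
    m = np.asarray(matrix)
    if m.ndim != 2 or m.shape[0] != m.shape[1]:
        raise ValueError("matrix must be square")
    dim = int(m.shape[0])
    n = dim.bit_length() - 1
    if dim < 2 or (1 << n) != dim:
        raise ValueError("matrix dimension must be a power of two")
    if n > MAX_QUBITS:
        raise ValueError(f"at most {MAX_QUBITS} qubits are supported")
    return n


print(qubit_count(np.eye(2)))    # single-qubit operation
print(qubit_count(np.eye(256)))  # largest allowed size
```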

Workflow

image
  • In simulation, no synthesis will happen.
  • The compiler will automatically synthesize the matrix when targeting hardware.
  • An explicit synthesis mechanism (API or command-line argument) is out of scope for the first iteration.
  • The NVQC target behaves the same as when running locally.

Work items / TO-DOs

  • Support in simulation for Python
    • Kernel mode
    • Builder mode
    • State vector simulators
    • Tensornet simulators
  • Support in simulation for C++
    • Library mode
    • MLIR mode
  • Add generic synthesis for emulation
  • Error handling: Gracefully handle user errors, feature constraints and runtime errors
  • Support synthesis per hardware backend
  • Comprehensive documentation and useful example(s): Covered in issue [docs] Add user facing documentation for custom operations and unitary synthesis #2002
@khalatepradnya khalatepradnya added the enhancement New feature or request label Apr 4, 2024
@bettinaheim bettinaheim added RFC Request for Comments language Anything related to the CUDA Quantum language specification and removed enhancement New feature or request labels Apr 5, 2024
@bettinaheim bettinaheim added RFC-approved and removed RFC Request for Comments labels Apr 30, 2024
@schweitzpgi
Collaborator

Specifically, I'm not entirely sure what the intended semantics of the following code are.

cudaq_register_op("custom_h",
                  {{M_SQRT1_2, M_SQRT1_2}, {M_SQRT1_2, -M_SQRT1_2}});
cudaq_register_op("custom_x", {{0, 1}, {1, 0}});

These are calls? Macros? What exactly is being registered? And with what?

These aren't marked as __qpu__ code so will be entirely opaque to the compiler at first blush. Hence, the compiler cannot generate quake code for them.

@schweitzpgi
Collaborator

Second order question: it may be possible for the compiler to take a constant matrix here and generate a gate list (approximation) from those values. Or perhaps this should be generated entirely in the control hardware at QIR time? And what about the synthesis case? If the compiler is going to generate the gate list, it stands to reason that it will need to do so at synthesis-time. And that affects the IR, which would need to support dynamic matrix specifications that can be instantiated by the synthesizer.
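
To illustrate what generating a gate list from a constant matrix can look like in the simplest case, here is a sketch of the textbook ZYZ Euler decomposition of a single-qubit unitary, U = e^{i*alpha} Rz(beta) Ry(gamma) Rz(delta). This is not the synthesis method CUDA-Q uses and it only covers one qubit; the function names are hypothetical and the code relies on NumPy alone.

```python
import numpy as np


def rz(theta):
    return np.array([[np.exp(-0.5j * theta), 0], [0, np.exp(0.5j * theta)]])


def ry(theta):
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]], dtype=complex)


def zyz_decompose(unitary):
    """Return (alpha, beta, gamma, delta) such that
    unitary = exp(i*alpha) * Rz(beta) @ Ry(gamma) @ Rz(delta)."""
    u = np.asarray(unitary, dtype=complex)
    # Remove the global phase so the remainder has determinant 1.
    alpha = 0.5 * np.angle(np.linalg.det(u))
    v = u * np.exp(-1j * alpha)
    a, b = v[0, 0], v[1, 0]  # v = [[a, -conj(b)], [b, conj(a)]]
    gamma = 2.0 * np.arctan2(abs(b), abs(a))
    beta = np.angle(b) - np.angle(a)
    delta = -np.angle(a) - np.angle(b)
    return alpha, beta, gamma, delta


def zyz_rebuild(alpha, beta, gamma, delta):
    """Multiply the rotations back together to verify the angles."""
    return np.exp(1j * alpha) * rz(beta) @ ry(gamma) @ rz(delta)


custom_h = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
angles = zyz_decompose(custom_h)
print(np.allclose(zyz_rebuild(*angles), custom_h))
```

For multi-qubit matrices a recursive scheme is needed on top of a single-qubit step like this one, which is where the timeout, gate count, and tolerance limits from the RFC come into play.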

@khalatepradnya
Collaborator Author

> These are calls? Macros? What exactly is being registered? And with what?

Macros. Updated the code snippet in description.

@ACE07-Sev

Will this PR provide unitary decomposition like what qiskit's transpile does?
#1781

@khalatepradnya
Collaborator Author

> Will this PR provide unitary decomposition like what qiskit's transpile does? #1781

Conceptually, yes.
However, the synthesis mechanism and target gateset will be implicit in this iteration.

@ACE07-Sev

Question. I am trying to compare the depth of cuda-quantum's implementation of QSD (I assume it's QSD) vs Qiskit's implementation. May I ask how I can see the depth of the circuit in terms of U3 and CX gates?

@khalatepradnya
Collaborator Author

> Question. I am trying to compare the depth of cuda-quantum's implementation of QSD (I assume it's QSD) vs Qiskit's implementation. May I ask how I can see the depth of the circuit in terms of U3 and CX gates?

Thank you for the question. This feature is not yet available in CUDA-Q. I will update this issue when it becomes available.
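
Until such a feature exists, depth can be computed by hand from any gate list exported by other means. The sketch below is generic Python and not a CUDA-Q API: the gate list format, a sequence of (name, qubits) tuples, is an assumption made for illustration.

```python
from collections import Counter


def circuit_depth(gates):
    """Depth of a circuit given as (name, qubits) tuples in program order.

    Each gate is placed one layer after the latest layer already
    occupied on any of its qubits.
    """
    last_layer = {}
    depth = 0
    for _name, qubits in gates:
        layer = 1 + max((last_layer.get(q, 0) for q in qubits), default=0)
        for q in qubits:
            last_layer[q] = layer
        depth = max(depth, layer)
    return depth


def gate_counts(gates):
    """Count how many times each gate name appears."""
    return Counter(name for name, _qubits in gates)


# Hypothetical two-qubit circuit expressed in U3 and CX gates.
example = [
    ("u3", (0,)),
    ("u3", (1,)),
    ("cx", (0, 1)),
    ("u3", (1,)),
]
print(circuit_depth(example), dict(gate_counts(example)))
```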
