Update docstrings and regenerate ipynb files
tanujkhattar committed May 28, 2024
1 parent 41de200 commit 71fe8e8
Showing 6 changed files with 425 additions and 334 deletions.
11 changes: 9 additions & 2 deletions dev_tools/autogenerate-bloqs-notebooks-v2.py
@@ -75,6 +75,7 @@
import qualtran.bloqs.chemistry.trotter.ising.unitaries
import qualtran.bloqs.chemistry.trotter.trotterized_unitary
import qualtran.bloqs.data_loading.qrom
import qualtran.bloqs.data_loading.qrom_base
import qualtran.bloqs.data_loading.select_swap_qrom
import qualtran.bloqs.factoring.ecc
import qualtran.bloqs.factoring.mod_exp
@@ -497,12 +498,18 @@
NotebookSpecV2(
title='QROM',
module=qualtran.bloqs.data_loading.qrom,
bloq_specs=[qualtran.bloqs.data_loading.qrom._QROM_DOC],
bloq_specs=[
qualtran.bloqs.data_loading.qrom_base._QROM_BASE_DOC,
qualtran.bloqs.data_loading.qrom._QROM_DOC,
],
),
NotebookSpecV2(
title='SelectSwapQROM',
module=qualtran.bloqs.data_loading.select_swap_qrom,
bloq_specs=[qualtran.bloqs.data_loading.select_swap_qrom._SELECT_SWAP_QROM_DOC],
bloq_specs=[
qualtran.bloqs.data_loading.qrom_base._QROM_BASE_DOC,
qualtran.bloqs.data_loading.select_swap_qrom._SELECT_SWAP_QROM_DOC,
],
),
NotebookSpecV2(
title='Block Encoding',
214 changes: 126 additions & 88 deletions qualtran/bloqs/data_loading/qrom.ipynb
@@ -2,7 +2,7 @@
"cells": [
{
"cell_type": "markdown",
"id": "ffe91e97",
"id": "6a30c548",
"metadata": {
"cq.autogen": "title_cell"
},
@@ -15,7 +15,7 @@
{
"cell_type": "code",
"execution_count": null,
"id": "2737c79f",
"id": "20c4cf8d",
"metadata": {
"cq.autogen": "top_imports"
},
@@ -32,13 +32,13 @@
},
{
"cell_type": "markdown",
"id": "4ea4344b",
"id": "ee1b29d3",
"metadata": {
"cq.autogen": "QROM.bloq_doc.md"
"cq.autogen": "QROMBase.bloq_doc.md"
},
"source": [
"## `QROM`\n",
"Bloq to load `data[l]` in the target register when the selection stores an index `l`.\n",
"## `QROMBase`\n",
"Interface for Bloqs to load `data[l]` when the selection register stores index `l`.\n",
"\n",
"## Overview\n",
"The action of a QROM can be described as\n",
@@ -53,65 +53,127 @@
" |d_L[s_1, s_2, \\dots, s_k]\\rangle\n",
"$$\n",
"\n",
"There two high level parameters that control the behavior of a QROM are -\n",
"A behavior of a QROM can be understood in terms of its classical analogue, where a for-loop\n",
"over one or more (selection) indices can be used to load one or more classical datasets, where\n",
"each of the classical dataset can be multidimensional.\n",
"\n",
"```\n",
">>> # N, M, P, Q, R, S, T are pre-initialized integer parameters.\n",
">>> output = [np.zeros((P, Q)), np.zeros((R, S, T))]\n",
">>> # Load two different classical datasets; each of different shape.\n",
">>> data = [np.random.rand(N, M, P, Q), np.random.rand(N, M, R, S, T)]\n",
">>> for i in range(N): # For loop over two selection indices i and j.\n",
">>> for j in range(M):\n",
">>> # Load two multidimensional classical datasets data[0] and data[1] s.t.\n",
">>> # |i, j⟩|0⟩ -> |i, j⟩|data[0][i, j, :]⟩|data[1][i, j, :]⟩\n",
">>> output[0] = data[0][i, j, :]\n",
">>> output[1] = data[1][i, j, :]\n",
"```\n",
"\n",
"The parameters that control the behavior and costs of a QROM are -\n",
"\n",
"1. Number of selection registers (eg: $i$, $j$) and their iteration lengths (eg: $N$, $M$).\n",
"2. Number of target registers, their quantum datatype and shape.\n",
" - Number of target registers: One for each classical dataset to load (eg: $\\text{data}[0]$\n",
" and $\\text{data}[1]$)\n",
" - QDType of target registers: Depends on `dtype` of the $i$'th classical dataset\n",
" - Shape of target registers: Depends on shape of classical data (eg: $(P, Q)$ and\n",
" $(R, S, T)$ above)\n",
"\n",
"### Specification of classical data via `data_or_shape`\n",
"Users can specify the classical data to load via QROM by passing in an appropriate value\n",
"for `data_or_shape` attribute. This is a list of numpy arrays or `Shaped` objects, where\n",
"each item of the list corresponds to a classical dataset to load.\n",
"\n",
"Each classical dataset to load can be specified as a numpy array (or a `Shaped` object for\n",
"symbolic bloqs). The shape of the dataset is a union of the selection shape and target shape,\n",
"s.t.\n",
"$$\n",
" \\text{data[i].shape} = \\text{selection\\_shape} + \\text{target\\_shape[i]}\n",
"$$\n",
"\n",
"Note that the $\\text{selection\\_shape}$ should be same across all classical datasets to be\n",
"loaded and correspond to a tuple of iteration lengths of selection indices (i.e. $(N, M)$\n",
"in the example above).\n",
"\n",
"The target shape of each classical dataset can be different and parameterizes the size of\n",
"the desired output that should be loaded in a target register.\n",
"\n",
"### Number of selection registers and their iteration lengths\n",
"As describe in the previous section, the number of selection registers and their iteration\n",
"lengths can be inferred from the shape of the classical dataset. All classical datasets\n",
"to be loaded must have the same $\\text{selection\\_shape}$, which is a tuple of iteration\n",
"lengths over each dimension of the dataset (i.e. the range for each nested for-loop).\n",
"\n",
"1. Shape of the classical dataset to be loaded ($\\text{data.shape} = (S_1, S_2, ..., S_K)$).\n",
"2. Number of distinct datasets to be loaded ($\\text{data.bitsizes} = (b_1, b_2, ..., b_L)$).\n",
"In order to load a data set with $\\text{selection\\_shape} == (P, Q, R, S)$ the QROM bloq\n",
"needs four selection registers with bitsizes $(p, q, r, s)$ where each of\n",
"$p,q,r,s \\geq \\log_2{P}, \\log_2{Q}, \\log_2{R}, \\log_2{S}$.\n",
"\n",
"Each of these have an effect on the cost of the QROM. The `data_or_shape` parameter stores\n",
"either\n",
"1. A numpy array of shape $(L, S_1, S_2, ..., S_K)$ when $L$ classical datasets, each of\n",
" shape $(S_1, S_2, ..., S_K)$ and bitsizes $(b_1, b_2, ..., b_L)$ are to be loaded and\n",
" the classical data is available to instantiate the QROM bloq. In this case, the helper\n",
" builder `QROM.build_from_data(data_1, data_2, ..., data_L)` can be used to build the QROM.\n",
"In general, to load $K$ dimensional data, we use $K$ named selection registers\n",
"$(\\text{selection}_0, \\text{selection}_1, ..., \\text{selection}_k)$ to index and\n",
"load the data. For the $i$'th selection register, its size is configured using\n",
"attribute $\\text{selection\\_bitsizes[i]}$ and the iteration range is configued\n",
"using $\\text{data\\_or\\_shape[0].shape[i]}$.\n",
"\n",
"2. A `Shaped` object that stores a (potentially symbolic) tuple $(L, S_1, S_2, ..., S_K)$\n",
" that represents the number of classical datasets `L=data_or_shape.shape[0]` and\n",
" their shape `data_shape=data_or_shape.shape[1:]` to be loaded by this QROM. This is used\n",
" to instantiate QROM bloqs for symbolic cost analysis where the exact data to be loaded\n",
" is not known. In this case, the helper builder `QROM.build_from_bitsize` can be used\n",
" to build the QROM.\n",
"### Number of target registers, their quantum datatype and shape\n",
"QROM bloq uses one target register for each entry corresponding to classical dataset in the\n",
"tuple `data_or_shape`. Thus, to load $L$ classical datsets, we use $L$ names target registers\n",
"$(\\text{target}_0, \\text{target}_1, ..., \\text{target}_L)$\n",
"\n",
"### Shape of the classical dataset to be loaded.\n",
"QROM bloq supports loading multidimensional classical datasets. In order to load a data\n",
"set of shape $\\mathrm{data.shape} == (P, Q, R, S)$ the QROM bloq needs four selection\n",
"registers with bitsizes $(p, q, r, s)$ where\n",
"$p,q,r,s=\\log_2{P}, \\log_2{Q}, \\log_2{R}, \\log_2{S}$.\n",
"Each named target register has a bitsize $b_{i}=\\text{target\\_bitsizes[i]}$ that represents\n",
"the size of the register and depends upon the maximum value of individual elements in the\n",
"$i$'th classical dataset.\n",
"\n",
"Each named target register has a shape that can be configured using attribute\n",
"$\\text{target\\_shape[i]}$ that represents the number of target registers if the output to load\n",
"is multidimensional.\n",
"\n",
"#### Parameters\n",
" - `data_or_shape`: List of numpy ndarrays specifying the data to load. If the length of this list ($L$) is greater than one then we use the same selection indices to load each dataset. The shape of a classical dataset is a concatenation of selection_shape and target_shape[i]; i.e. `data_or_shape[i].shape = selection_shape + target_shape[i]`. Thus, each data set is required to have the same selection shape $(S_1, S_2, ..., S_K)$ and can have a different target shape given by `target_shapes[i]`. For symbolic QROMs, pass a list of `Shaped` objects instead with shape $(S_1, S_2, ..., S_K) + target_shape[i]$.\n",
" - `selection_bitsizes`: The number of bits used to represent each selection register corresponding to the size of each dimension of the selection_shape $(S_1, S_2, ..., S_K)$. Should be the same length as the selection shape of each of the datasets and $2**\\text{selection\\_bitsizes[i]} >= S_i$\n",
" - `target_shapes`: Shape of target registers for each classical dataset to be loaded. Must be consistent with `data_or_shape` s.t. `len(data_or_shape) == len(target_shapes)` and `data_or_shape[-len(target_shapes[i]):] == target_shapes[i]`.\n",
" - `target_bitsizes`: Bitsize (or qdtype) of the target registers for each classical dataset to be loaded. This can be deduced from the maximum element of each of the datasets. Must be consistent with `data_or_shape` s.t. `len(data_or_shape) == len(target_bitsizes)` and `target_bitsizes[i] >= max(data[i]).bitsize`.\n",
" - `num_controls`: The number of controls to instanstiate a controlled version of this bloq.\n"
]
},
{
"cell_type": "markdown",
"id": "9d6450ad",
"metadata": {
"cq.autogen": "QROM.bloq_doc.md"
},
"source": [
"## `QROM`\n",
"Bloq to load `data[l]` in the target register when the selection stores an index `l`.\n",
"\n",
"In general, to load K dimensional data, we use K named selection registers `(selection0,\n",
"selection1, ..., selection{k})` to index and load the data.\n",
"See docstrings of `QROMBase` for an overview of the QROM primitive and the various attributes.\n",
"\n",
"The T/Toffoli cost of the QROM scales linearly with the number of elements in the dataset\n",
"(i.e. $\\mathcal{O}(\\mathrm{np.prod(data.shape)}$).\n",
"This bloq is an implementation of the `QROMBase` interface that uses the unary iteration based\n",
"approach described in Ref [1].\n",
"\n",
"### Number of distinct datasets to be loaded, and their corresponding target bitsize.\n",
"To load a classical dataset into a target register of bitsize $b$, the clifford cost of a QROM\n",
"scales as $\\mathcal{O}(b \\mathrm{np.prod}(\\mathrm{data.shape}))$. This is because we need\n",
"$\\mathcal{O}(b)$ CNOT gates to load the ith data element in the target register when the\n",
"selection register stores index $i$.\n",
"## Cost of this (unary iteration based) QROM\n",
"\n",
"If you have multiple classical datasets `(data_1, data_2, data_3, ..., data_L)` to be loaded\n",
"and each of them has the same shape `(data_1.shape == data_2.shape == ... == data_L.shape)`\n",
"and different target bitsizes `(b_1, b_2, ..., b_L)`, then one construct a single classical\n",
"dataset `data = merge(data_1, data_2, ..., data_L)` where\n",
"### T / Toffoli cost\n",
"The T/Toffoli cost of this QROM scales linearly with the product of iteration lengths over\n",
"all dimensions (i.e. $\\mathcal{O}(\\mathrm{np.prod(\\text{selection\\_shape})}$).\n",
"\n",
"- `data.shape == data_1.shape == data_2.shape == ... == data_L` and\n",
"- `data[idx] = f'{data_1[idx]!0{b_1}b}' + f'{data_2[idx]!0{b_2}b}' + ... + f'{data_L[idx]!0{b_L}b}'`\n",
"### Clifford Cost\n",
"To load a classical dataset into a target register of bitsize $b$ and shape\n",
"$\\text{target\\_shape}$, the clifford cost of this QROM scales as\n",
"$\\mathcal{O}(b \\cdot \\text{np.prod(selection\\_shape+target\\_shape)})\n",
"=\\mathcal{O}(b \\cdot \\text{np.prod(data.shape)})$. This is because we need $\\mathcal{O}(b)$\n",
"CNOT gates to load 1 classical data element in the target register and for each of the\n",
"$\\text{np.prod(selection\\_shape)}$ selection indices, we have $\\text{np.prod(target\\_shape)}$\n",
"such data elements to load.\n",
"\n",
"Thus, the target bitsize of the merged dataset is $b = b_1 + b_2 + \\dots + b_L$ and clifford\n",
"cost of loading merged dataset scales as\n",
"$\\mathcal{O}((b_1 + b_2 + \\dots + b_L) \\mathrm{np.prod}(\\mathrm{data.shape}))$.\n",
"### Ancilla cost\n",
"The number of clean ancilla required by this QROM scales linearly with the size of the\n",
"selection registers + number of controls.\n",
"\n",
"## Variable spaced QROM\n",
"When the input classical data contains consecutive entries of identical data elements to\n",
"load, the QROM also implements the \"variable-spaced\" QROM optimization described in Ref [2].\n",
"\n",
"#### Parameters\n",
" - `data_or_shape`: List of numpy ndarrays specifying the data to load. If the length of this list ($L$) is greater than one then we use the same selection indices to load each dataset. Each data set is required to have the same shape $(S_1, S_2, ..., S_K)$ and to be of integer type. For symbolic QROMs, pass a `Shaped` object instead with shape $(L, S_1, S_2, ..., S_K)$.\n",
" - `selection_bitsizes`: The number of bits used to represent each selection register corresponding to the size of each dimension of the array $(S_1, S_2, ..., S_K)$. Should be the same length as the shape of each of the datasets.\n",
" - `target_bitsizes`: The number of bits used to represent the data signature. This can be deduced from the maximum element of each of the datasets. Should be a tuple $(b_1, b_2, ..., b_L)$ of length `L = len(data)`, i.e. the number of datasets to be loaded.\n",
" - `num_controls`: The number of controls. \n",
"\n",
"#### References\n",
" - [Encoding Electronic Spectra in Quantum Circuits with Linear T Complexity](https://arxiv.org/abs/1805.03662). Babbush et. al. (2018). Figure 1.\n",
" - [Compilation of Fault-Tolerant Quantum Heuristics for Combinatorial Optimization](https://arxiv.org/abs/2007.07391). Babbush et. al. (2020). Figure 3.\n"
@@ -120,7 +182,7 @@
{
"cell_type": "code",
"execution_count": null,
"id": "af317b30",
"id": "2c3d471e",
"metadata": {
"cq.autogen": "QROM.bloq_doc.py"
},
@@ -131,7 +193,7 @@
},
{
"cell_type": "markdown",
"id": "d5d3a19d",
"id": "0b256e78",
"metadata": {
"cq.autogen": "QROM.example_instances.md"
},
@@ -142,7 +204,7 @@
{
"cell_type": "code",
"execution_count": null,
"id": "5f02d641",
"id": "c457f9f9",
"metadata": {
"cq.autogen": "QROM.qrom_small"
},
@@ -155,7 +217,7 @@
{
"cell_type": "code",
"execution_count": null,
"id": "c2f8b350",
"id": "2b97c379",
"metadata": {
"cq.autogen": "QROM.qrom_multi_data"
},
@@ -169,7 +231,7 @@
{
"cell_type": "code",
"execution_count": null,
"id": "036cf220",
"id": "fc65e4e0",
"metadata": {
"cq.autogen": "QROM.qrom_multi_dim"
},
@@ -183,18 +245,19 @@
{
"cell_type": "code",
"execution_count": null,
"id": "a084ca8c-0c89-4439-86d9-51cf91e972c4",
"metadata": {},
"id": "01c89177",
"metadata": {
"cq.autogen": "QROM.qrom_symb"
},
"outputs": [],
"source": [
"N, M, b1, b2, c = sympy.symbols('N M b1 b2 c')\n",
"qrom_symb = QROM.build_from_bitsize((N, M), (b1, b2), num_controls=c)\n",
"qrom_symb"
"qrom_symb = QROM.build_from_bitsize((N, M), (b1, b2), num_controls=c)"
]
},
{
"cell_type": "markdown",
"id": "b92d1c8e",
"id": "d063ce85",
"metadata": {
"cq.autogen": "QROM.graphical_signature.md"
},
@@ -205,7 +268,7 @@
{
"cell_type": "code",
"execution_count": null,
"id": "9681cfed",
"id": "8a3ea28a",
"metadata": {
"cq.autogen": "QROM.graphical_signature.py"
},
@@ -218,7 +281,7 @@
},
{
"cell_type": "markdown",
"id": "8eb02cb9",
"id": "b63db87f",
"metadata": {
"cq.autogen": "QROM.call_graph.md"
},
@@ -229,7 +292,7 @@
{
"cell_type": "code",
"execution_count": null,
"id": "7f29f104",
"id": "65639e51",
"metadata": {
"cq.autogen": "QROM.call_graph.py"
},
@@ -240,31 +303,6 @@
"show_call_graph(qrom_small_g)\n",
"show_counts_sigma(qrom_small_sigma)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5392752e-276e-434c-9d60-fadcf6478077",
"metadata": {},
"outputs": [],
"source": [
"qrom_symb_g, qrom_symb_sigma = qrom_symb.call_graph(generalizer=ignore_split_join)\n",
"show_call_graph(qrom_symb_g)\n",
"show_counts_sigma(qrom_symb_sigma)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "52a98d72",
"metadata": {
"cq.autogen": "QROM.qrom_symb"
},
"outputs": [],
"source": [
"N, M, b1, b2, c = sympy.symbols('N M b1 b2 c')\n",
"qrom_symb = QROM.build_from_bitsize((N, M), (b1, b2), num_controls=c)"
]
}
],
"metadata": {