Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add initial representation of classical expressions #10332

Merged
merged 3 commits into from
Jul 14, 2023

Conversation

jakelishman
Copy link
Member

Summary

This adds a method of representing a classical type system and expression tree in Qiskit. This initial version contains only simple bitwise, logical and relational operations on Booleans and sized unsigned integers. This is largely in line with

https://github.com/Qiskit/RFCs/blob/0dd13344/0010-simple-classical-representations.md

Some of the details surrounding construction and representation have been changed slightly from that design, but the type system described is the same.

In particular, this commit slightly modifies the representation of the tree; instead of individual tree items for each unary and binary operation, they are part of the Unary and Binary nodes with a respective enumeration discriminator in the node. This was done to make visitors to the expression tree require less boilerplate and be easier to maintain by exploiting the natural similarities between all these operations of the same type. One can still discriminate on the particular operations by using a jump table based on the enumeration values, if desired.

Secondly, a series of constructor functions was added that is separate to the representation form. This is for both efficiency and user convenience. The tree form is purely a data format; it does no type-checking on creation so that manipulations of the tree do not need to pay runtime costs when the manipulators are the ones who will be ensuring that the type system is fulfilled anyway. The more user friendly constructors do resolve all this type information as part of their construction. They also infer typing information for Qiskit scalar values, and insert casts as required.

Last, the tree has a Cast node added that is additional to the description in the RFC. This was done to make life easier for consumers of the tree: they do not need to be aware of all the possible implicit promotions that are required to make various unary and binary operations valid. This will reduce complexity of consumer code and reduce the chance of bugs caused by different paths handling the casts differently.

Details and comments

Close #10223.

Release note is left til the epic closes, but I've tagged "new feature" for the short-form changelog.

@jakelishman jakelishman added the Changelog: New Feature Include in the "Added" section of the changelog label Jun 23, 2023
@jakelishman jakelishman added this to the 0.25.0 milestone Jun 23, 2023
@jakelishman jakelishman requested a review from a team as a code owner June 23, 2023 23:05
@qiskit-bot
Copy link
Collaborator

One or more of the the following people are requested to review this:

@coveralls
Copy link

coveralls commented Jun 23, 2023

Pull Request Test Coverage Report for Build 5543952498

  • 351 of 381 (92.13%) changed or added relevant lines in 8 files are covered.
  • 333 unchanged lines in 19 files lost coverage.
  • Overall coverage increased (+0.1%) to 86.045%

Changes Missing Coverage Covered Lines Changed/Added Lines %
qiskit/circuit/classical/types/ordering.py 36 37 97.3%
qiskit/circuit/classical/types/types.py 50 52 96.15%
qiskit/circuit/classical/expr/constructors.py 142 155 91.61%
qiskit/circuit/classical/expr/expr.py 97 111 87.39%
Files with Coverage Reduction New Missed Lines %
crates/qasm2/src/lex.rs 1 91.65%
qiskit/pulse/library/symbolic_pulses.py 1 95.57%
qiskit/qasm/init.py 1 80.0%
qiskit/transpiler/passes/synthesis/unitary_synthesis.py 1 90.39%
qiskit/circuit/quantumcircuitdata.py 2 86.36%
qiskit/qasm/pygments/init.py 3 0%
crates/accelerate/src/sabre_swap/layer.rs 4 97.32%
qiskit/circuit/bit.py 5 90.0%
qiskit/algorithms/gradients/utils.py 6 95.11%
qiskit/execute_function.py 6 87.5%
Totals Coverage Status
Change from base Build 5446526815: 0.1%
Covered Lines: 72252
Relevant Lines: 83970

💛 - Coveralls

return self.visit_generic(expr)


class Expr(abc.ABC):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At some point we likely want to explore uniting the full Qiskit program representation within an object model (ie., unite Expr, Circuit, Operations, Pulses, etc) within a common object model that has common properties. Each component being different makes it difficult to operate/extend in a unified fashion as the complexity grows.

I understand this is outside the scope of this PR, but it has to be tackled at some point.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a limited initial form of Expr so it probably looks pretty ad-hoc right now, but the intention is for it to be more unified. In the future, we're looking to have programs represented more as a sequence of abstract operations, where we have a proper classical type system attached and everywhere where current classical-like values appear (the parameters of gates, IfElseOp.condition, WhileLoopOp.condition, SwitchCaseOp.target, etc) will be typed operand slots for classical r-values. We'll also add something like the classical operation Store that writes the result of an Expr to a memory location. I'm hoping to starting getting those bits and pieces in place for the Terra after this one, and while it won't go all the way to a full unification in the next release, I think it'll look a bit more like what you've got in mind.

As far as I'm aware, we've got no plans for Terra to reason about classical program representations at the level of registers on the target hardware, so I think that's as far into the model as we'll get - we'll not be lowering to a three-address code or anything like that.

Pulses can be linked into the mathematical circuit-model machinery by providing the calibrations of gates, so to some degree, they're already there. A full pulse scheduler at the Qiskit level is lower than transpile typically deals with though (there's little-to-no concept of the device Hamiltonian baked into the compiler, so no matrix representation and no optimisation), so it feels like unifying ScheduleBlock/whatever as a "statement" type is not a priori a complexity we want to take on; it's (to me) somewhat akin to C programs with inline assembly. I can see the argument in favour, but it's something we'd have to discuss in our prioritisation meetings. My understanding is that the current thinking is that the pure-pulse compiler will deliberately be a separate entity.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know that say Clang has its Expr class inherit from ValueStatement, so they unify those more strongly than our initial plans. With how Terra is likely to reason (or not) about classical operations, I'm not certain that that's going to be the best strategy for us.

I think our representation of classical processing is likely to (certainly at the start) be more along the lines of just "store this value" and "use this value", and leave the details of lowering/optimising the classical components to lower parts of the stack. We can define the evaluation order of arguments, but then if Terra's not actually going to be doing any reasoning about how classical computations are lowered to hardware, then I'm not certain how much benefit there is in having every Expr be required to be linked one-to-one with a "statement".

This adds a method of representing a classical type system and
expression tree in Qiskit.  This initial version contains only simple
bitwise, logical and relational operations on Booleans and sized
unsigned integers.  This is largely in line with

    https://github.com/Qiskit/RFCs/blob/0dd13344/0010-simple-classical-representations.md

Some of the details surrounding construction and representation have
been changed slightly from that design, but the type system described is
the same.

Perhaps the largest forwards-facing change is that there is a split
between `Var` and `Value`.  The intention here is that `Var` will become
the type for a general variable with defined storage, while `Value` is
for (currently) literal-like values.

This commit slightly modifies the representation of the
tree; instead of individual tree items for each unary and binary
operation, they are part of the `Unary` and `Binary` nodes with a
respective enumeration discriminator in the node.  This was done to make
visitors to the expression tree require less boilerplate and be easier
to maintain by exploiting the natural similarities between all these
operations of the same type.  One can still discriminate on the
particular operations by using a jump table based on the enumeration
values, if desired.

Secondly, a series of constructor functions was added that is separate
to the representation form.  This is for both efficiency and user
convenience.  The tree form is _purely_ a data format; it does no
type-checking on creation so that manipulations of the tree do not need
to pay runtime costs when the manipulators are the ones who will be
ensuring that the type system is fulfilled anyway.  The more user
friendly constructors _do_ resolve all this type information as part of
their construction.  They also infer typing information for Qiskit
scalar values, and insert casts as required.

Last, the tree has a `Cast` node added that is additional to the
description in the RFC.  This was done to make life easier for consumers
of the tree: they do not need to be aware of all the possible implicit
promotions that are required to make various unary and binary operations
valid.  This will reduce complexity of consumer code and reduce the
chance of bugs caused by different paths handling the casts differently.
Copy link
Contributor

@kevinhartman kevinhartman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking great. I've left one comment about the need for separate operations for high-level boolean logic versus bitwise boolean logic. Perhaps this is something that will be more obvious once we define a concrete visitor.

qiskit/circuit/classical/expr/__init__.py Outdated Show resolved Hide resolved
return Binary(op, left, right, bool_)


def logic_and(left: typing.Any, right: typing.Any, /) -> Expr:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea of having separate operators for high level boolean logic versus bitwise operations that are also logical seems a bit odd to me. Isn't the only difference between these the type of the operands? What makes Bool operands need different handling from Uint(1) for bitwise logic?

Separately, the term "logical" might get overloaded in the future if we're not careful. For bit-shift operations, it's typical to have a "logical" and an "arithmetic" shift, the latter preserving the sign of negative integers and being sensitive to overflow.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was most thinking about the difference between Bool and Uint(1) from a slightly more abstract typing perspective. In this system here I consider Uint as a type family, so it feels weird to me to give Uint(1) special meaning and potentially different semantics, though I suppose that's how it is in LLVM right? For the difference between pure bitwise and Boolean operation: I was thinking in the future that the short-circuit evaluation nature of the "logical" - there's no observable impact right now, but in the future with extern calls there may well be.

When it comes to bitshifts, I think we'll only ever want to support logical shifts, so we'll just call them "left_shift" and "right_shift" most likely. If we did want to support it, I think we can just overload that particular binary opcode - left_shift(Uint(n), ...) would be logical while left_shift(Int(n), ...) would be arithmetic; there's no distinction for unsigneds, and "logical shift" never makes sense for signeds. That's consistent with how C, Rust and Python all handle the operator already.

I don't mind altering the bit_ vs logic_ names. Those ones seemed natural to me - it's what I've always known their C equivalents to be called.

Copy link
Contributor

@kevinhartman kevinhartman Jul 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking in the future that the short-circuit evaluation nature of the "logical" - there's no observable impact right now, but in the future with extern calls there may well be.

Hmm, yeah I'm curious how this would materialize. Is it typical to have a separate operator node for every combination of operand types? At least for the bit_ ops, there's a sort of guarantee (assuming you use the constructor interface) that the visitor won't get called with Uints of different sizes, right? But technically, the semantics of an operation on a UInt of some size are different from the semantics of a UInt of a different size. So visitor implementors would be checking the size anyway, which feels similar to requiring them to check the type family too (and possibly choosing different semantics). Which makes it seem reasonable to just have and, or, xor etc. expressions instead of bit_* and logic_*. I guess maybe some of this is due to the fact that we're lifting the types relatively early because we don't have precedence between the left and right sides of binary ops.

EDIT: I like the current approach as is, and you've already pretty much answered my question 😄.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can understand that point. I do still think there's worth, even in this stage in the distinction between a Uint type and a Bool type at this level of abstraction. Bool is a two-valued non-numeric type, whereas Uint(n) represents the infinite series of non-negative integers truncated on the additive ring to [0, 2 ** n). It so happens that Uint(1) has the same number of values as Bool, but equally so does Option<()> or Result<(), ()> in Rust, yet they're all semantically different.

I think perhaps a better way to illustrate my unwillingness to merge the types is not on the allowed operations, but on the disallowed operations for Booleans. Imo, Bool should not be allowed to be used in a non-equality relation (<, <=, etc), but Uint(n) all should be. Looking to the future, Bool + Bool should be an error (or do we go to the extreme of removing duplication and disallowing one of bitwise xor and addition for Uint(1)?), but Uint(1) + Uint(1) is well defined addition on the ring.

Copy link
Member Author

@jakelishman jakelishman Jul 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I should say as well: I'm super on board with this thinking about the ease of consuming the expression tree.

I personally thought that the logic that handles the Uint(n) * Uint(n) -> Uint(n) family of binary operations would almost certainly be parametrised in any real visitor over the generic n, so I wasn't really considering "look at the width of the Uint types" to be much by the way of operation disambiguation. I wasn't planning on defining any operations for Uint types that can't be described with the n parameter constant-generic, so in that sense, treating Uint(1) specially in some places (e.g. having IfElseOp.condition accept a Uint(1) expression but not a Uint(2)) would be a violation of that principle to me.

I also don't like the implication too much that the future subscripting operation Uint(n) * <some int type> would need to return Uint(1). The memory-model semantics of QC hardware ends up with single bits being the smallest addressable type for stores, so I like having a specific type to represent that. It also matches with Terra right now - we have a distinction between ClassicalRegister(1) and Clbit.

@jakelishman
Copy link
Member Author

The other thing I could consider here is to improve the integer-literal type inference for all binary operations, rather than just the bitwise ones. Technically it's not needed in the Terra-side data model, but I think it'll be easier for interop with other places.

If so, I'll make a follow-up PR rather than dragging this bottom-of-the-stack one on longer.

Copy link
Contributor

@kevinhartman kevinhartman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I'm fine with the current implementation, but I'll let you hit the merge button in case my inline reply influences any additional changes. Nice work! 😄

@jakelishman
Copy link
Member Author

I'm happy for this to merge at this point, but I'll throw it back to you again since I replied to your comments.

@kevinhartman kevinhartman added this pull request to the merge queue Jul 13, 2023
Merged via the queue into Qiskit:main with commit 9350014 Jul 14, 2023
@jakelishman jakelishman deleted the expr/repr branch July 14, 2023 08:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Changelog: New Feature Include in the "Added" section of the changelog
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add representation of types and expressions for initial classical expressions
5 participants