-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add initial representation of classical expressions #10332
Conversation
One or more of the the following people are requested to review this:
|
Pull Request Test Coverage Report for Build 5543952498
💛 - Coveralls |
return self.visit_generic(expr) | ||
|
||
|
||
class Expr(abc.ABC): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At some point we likely want to explore uniting the full Qiskit program representation within an object model (ie., unite Expr, Circuit, Operations, Pulses, etc) within a common object model that has common properties. Each component being different makes it difficult to operate/extend in a unified fashion as the complexity grows.
I understand this is outside the scope of this PR, but it has to be tackled at some point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a limited initial form of Expr
so it probably looks pretty ad-hoc right now, but the intention is for it to be more unified. In the future, we're looking to have programs represented more as a sequence of abstract operations, where we have a proper classical type system attached and everywhere where current classical-like values appear (the parameters of gates, IfElseOp.condition
, WhileLoopOp.condition
, SwitchCaseOp.target
, etc) will be typed operand slots for classical r-values. We'll also add something like the classical operation Store
that writes the result of an Expr
to a memory location. I'm hoping to starting getting those bits and pieces in place for the Terra after this one, and while it won't go all the way to a full unification in the next release, I think it'll look a bit more like what you've got in mind.
As far as I'm aware, we've got no plans for Terra to reason about classical program representations at the level of registers on the target hardware, so I think that's as far into the model as we'll get - we'll not be lowering to a three-address code or anything like that.
Pulses can be linked into the mathematical circuit-model machinery by providing the calibrations of gates, so to some degree, they're already there. A full pulse scheduler at the Qiskit level is lower than transpile
typically deals with though (there's little-to-no concept of the device Hamiltonian baked into the compiler, so no matrix representation and no optimisation), so it feels like unifying ScheduleBlock
/whatever as a "statement" type is not a priori a complexity we want to take on; it's (to me) somewhat akin to C programs with inline assembly. I can see the argument in favour, but it's something we'd have to discuss in our prioritisation meetings. My understanding is that the current thinking is that the pure-pulse compiler will deliberately be a separate entity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know that say Clang has its Expr
class inherit from ValueStatement
, so they unify those more strongly than our initial plans. With how Terra is likely to reason (or not) about classical operations, I'm not certain that that's going to be the best strategy for us.
I think our representation of classical processing is likely to (certainly at the start) be more along the lines of just "store this value" and "use this value", and leave the details of lowering/optimising the classical components to lower parts of the stack. We can define the evaluation order of arguments, but then if Terra's not actually going to be doing any reasoning about how classical computations are lowered to hardware, then I'm not certain how much benefit there is in having every Expr
be required to be linked one-to-one with a "statement".
This adds a method of representing a classical type system and expression tree in Qiskit. This initial version contains only simple bitwise, logical and relational operations on Booleans and sized unsigned integers. This is largely in line with https://github.com/Qiskit/RFCs/blob/0dd13344/0010-simple-classical-representations.md Some of the details surrounding construction and representation have been changed slightly from that design, but the type system described is the same. Perhaps the largest forwards-facing change is that there is a split between `Var` and `Value`. The intention here is that `Var` will become the type for a general variable with defined storage, while `Value` is for (currently) literal-like values. This commit slightly modifies the representation of the tree; instead of individual tree items for each unary and binary operation, they are part of the `Unary` and `Binary` nodes with a respective enumeration discriminator in the node. This was done to make visitors to the expression tree require less boilerplate and be easier to maintain by exploiting the natural similarities between all these operations of the same type. One can still discriminate on the particular operations by using a jump table based on the enumeration values, if desired. Secondly, a series of constructor functions was added that is separate to the representation form. This is for both efficiency and user convenience. The tree form is _purely_ a data format; it does no type-checking on creation so that manipulations of the tree do not need to pay runtime costs when the manipulators are the ones who will be ensuring that the type system is fulfilled anyway. The more user friendly constructors _do_ resolve all this type information as part of their construction. They also infer typing information for Qiskit scalar values, and insert casts as required. Last, the tree has a `Cast` node added that is additional to the description in the RFC. This was done to make life easier for consumers of the tree: they do not need to be aware of all the possible implicit promotions that are required to make various unary and binary operations valid. This will reduce complexity of consumer code and reduce the chance of bugs caused by different paths handling the casts differently.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is looking great. I've left one comment about the need for separate operations for high-level boolean logic versus bitwise boolean logic. Perhaps this is something that will be more obvious once we define a concrete visitor.
return Binary(op, left, right, bool_) | ||
|
||
|
||
def logic_and(left: typing.Any, right: typing.Any, /) -> Expr: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea of having separate operators for high level boolean logic versus bitwise operations that are also logical seems a bit odd to me. Isn't the only difference between these the type of the operands? What makes Bool
operands need different handling from Uint(1)
for bitwise logic?
Separately, the term "logical" might get overloaded in the future if we're not careful. For bit-shift operations, it's typical to have a "logical" and an "arithmetic" shift, the latter preserving the sign of negative integers and being sensitive to overflow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was most thinking about the difference between Bool
and Uint(1)
from a slightly more abstract typing perspective. In this system here I consider Uint
as a type family, so it feels weird to me to give Uint(1)
special meaning and potentially different semantics, though I suppose that's how it is in LLVM right? For the difference between pure bitwise and Boolean operation: I was thinking in the future that the short-circuit evaluation nature of the "logical" - there's no observable impact right now, but in the future with extern
calls there may well be.
When it comes to bitshifts, I think we'll only ever want to support logical shifts, so we'll just call them "left_shift" and "right_shift" most likely. If we did want to support it, I think we can just overload that particular binary opcode - left_shift(Uint(n), ...)
would be logical while left_shift(Int(n), ...)
would be arithmetic; there's no distinction for unsigneds, and "logical shift" never makes sense for signeds. That's consistent with how C, Rust and Python all handle the operator already.
I don't mind altering the bit_
vs logic_
names. Those ones seemed natural to me - it's what I've always known their C equivalents to be called.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking in the future that the short-circuit evaluation nature of the "logical" - there's no observable impact right now, but in the future with extern calls there may well be.
Hmm, yeah I'm curious how this would materialize. Is it typical to have a separate operator node for every combination of operand types? At least for the bit_
ops, there's a sort of guarantee (assuming you use the constructor interface) that the visitor won't get called with Uint
s of different sizes, right? But technically, the semantics of an operation on a UInt
of some size are different from the semantics of a UInt
of a different size. So visitor implementors would be checking the size anyway, which feels similar to requiring them to check the type family too (and possibly choosing different semantics). Which makes it seem reasonable to just have and
, or
, xor
etc. expressions instead of bit_*
and logic_*
. I guess maybe some of this is due to the fact that we're lifting the types relatively early because we don't have precedence between the left and right sides of binary ops.
EDIT: I like the current approach as is, and you've already pretty much answered my question 😄.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can understand that point. I do still think there's worth, even in this stage in the distinction between a Uint type and a Bool type at this level of abstraction. Bool
is a two-valued non-numeric type, whereas Uint(n)
represents the infinite series of non-negative integers truncated on the additive ring to [0, 2 ** n)
. It so happens that Uint(1)
has the same number of values as Bool
, but equally so does Option<()>
or Result<(), ()>
in Rust, yet they're all semantically different.
I think perhaps a better way to illustrate my unwillingness to merge the types is not on the allowed operations, but on the disallowed operations for Booleans. Imo, Bool
should not be allowed to be used in a non-equality relation (<
, <=
, etc), but Uint(n)
all should be. Looking to the future, Bool + Bool
should be an error (or do we go to the extreme of removing duplication and disallowing one of bitwise xor and addition for Uint(1)
?), but Uint(1) + Uint(1)
is well defined addition on the ring.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I should say as well: I'm super on board with this thinking about the ease of consuming the expression tree.
I personally thought that the logic that handles the Uint(n) * Uint(n) -> Uint(n)
family of binary operations would almost certainly be parametrised in any real visitor over the generic n
, so I wasn't really considering "look at the width of the Uint
types" to be much by the way of operation disambiguation. I wasn't planning on defining any operations for Uint
types that can't be described with the n
parameter constant-generic, so in that sense, treating Uint(1)
specially in some places (e.g. having IfElseOp.condition
accept a Uint(1)
expression but not a Uint(2)
) would be a violation of that principle to me.
I also don't like the implication too much that the future subscripting operation Uint(n) * <some int type>
would need to return Uint(1)
. The memory-model semantics of QC hardware ends up with single bits being the smallest addressable type for stores, so I like having a specific type to represent that. It also matches with Terra right now - we have a distinction between ClassicalRegister(1)
and Clbit
.
The other thing I could consider here is to improve the integer-literal type inference for all binary operations, rather than just the bitwise ones. Technically it's not needed in the Terra-side data model, but I think it'll be easier for interop with other places. If so, I'll make a follow-up PR rather than dragging this bottom-of-the-stack one on longer. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. I'm fine with the current implementation, but I'll let you hit the merge button in case my inline reply influences any additional changes. Nice work! 😄
I'm happy for this to merge at this point, but I'll throw it back to you again since I replied to your comments. |
Summary
This adds a method of representing a classical type system and expression tree in Qiskit. This initial version contains only simple bitwise, logical and relational operations on Booleans and sized unsigned integers. This is largely in line with
Some of the details surrounding construction and representation have been changed slightly from that design, but the type system described is the same.
In particular, this commit slightly modifies the representation of the tree; instead of individual tree items for each unary and binary operation, they are part of the
Unary
andBinary
nodes with a respective enumeration discriminator in the node. This was done to make visitors to the expression tree require less boilerplate and be easier to maintain by exploiting the natural similarities between all these operations of the same type. One can still discriminate on the particular operations by using a jump table based on the enumeration values, if desired.Secondly, a series of constructor functions was added that is separate to the representation form. This is for both efficiency and user convenience. The tree form is purely a data format; it does no type-checking on creation so that manipulations of the tree do not need to pay runtime costs when the manipulators are the ones who will be ensuring that the type system is fulfilled anyway. The more user friendly constructors do resolve all this type information as part of their construction. They also infer typing information for Qiskit scalar values, and insert casts as required.
Last, the tree has a
Cast
node added that is additional to the description in the RFC. This was done to make life easier for consumers of the tree: they do not need to be aware of all the possible implicit promotions that are required to make various unary and binary operations valid. This will reduce complexity of consumer code and reduce the chance of bugs caused by different paths handling the casts differently.Details and comments
Close #10223.
Release note is left til the epic closes, but I've tagged "new feature" for the short-form changelog.