cranelift: Fix ireduce rules #8005
Conversation
We had two optimization rules which started off like this:

```
(rule (simplify (ireduce smallty val@(binary_op _ op x y)))
      (if-let _ (reducible_modular_op val))
      ...)
```

This was intended to check that `x` and `y` came from an instruction which not only was a binary op but also matched `reducible_modular_op`. Unfortunately, both `binary_op` and `reducible_modular_op` were multi-terms:

- `binary_op` would search the eclass rooted at `val` to find each instruction that uses a binary operator.
- `reducible_modular_op` would then search the entire eclass again to find an instruction which matched its criteria.

Nothing ensured that both searches would find the same instruction.

These rules were written this way because they had additional guards (`will_simplify_with_ireduce`) which made them fairly complex, and it seemed desirable not to have to copy those guards for every operator where we wanted to apply this optimization.

However, we've decided that checking whether the rule is actually an improvement is not desirable. In general, that should be the job of the cost function. Blindly adding equivalent expressions gives us more opportunities for other rules to fire, and we have global recursion and growth limits to keep the process from going too wild.

As a result, we can just delete those guards, which lets us write the rules in a more straightforward way.

Fixes bytecodealliance#7999.

Co-authored-by: Trevor Elliott <telliott@fastly.com>
Co-authored-by: L Pereira <l.pereira@fastly.com>
Co-authored-by: Chris Fallin <chris@cfallin.org>
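To illustrate the "more straightforward way", a fixed rule can name the operator directly in the left-hand side, so the pattern and the rewrite are guaranteed to refer to the same enode. This is only a sketch using a hypothetical `iadd` case, not the exact rule text from the PR; the real rules spell out each reducible operator:

```isle
;; Sketch only: match the concrete operator in the pattern itself,
;; instead of combining two independent multi-term searches
;; (`binary_op` plus `reducible_modular_op`) over the same eclass.
;; One rule per operator; iadd shown here, imul/isub/etc. would
;; follow the same shape.
(rule (simplify (ireduce smallty (iadd _ x y)))
      (iadd smallty (ireduce smallty x) (ireduce smallty y)))
```

Because the operator appears in the pattern, there is no second search that could land on a different enode of the eclass.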
Force-pushed from 990ddd0 to 4672280.
…dance after #7999. (#8015)

While debugging #7999, we learned of a possible, extremely subtle interaction between the design of our ISLE mid-end prelude and matching rules with a certain structure. Because a `Value` represents an entire eclass, separate left-hand sides (as in helper rules or if-let clauses) that match on the value may match against different enodes returned by the multi-match extractor's iterator. We then may infer some property of one enode, another property of another enode, and perform a rewrite that is only valid if both of those matches are on the same enode.

The precise distinction is whether it is a property of the *value* -- e.g., nonzero, or even, or within a range -- or a property of the *operation* and matched subpieces. The former is fine; the latter runs into trouble.

We found that #7719 added a helper that determined whether a value "was a certain operator" -- actually, had any enode of a certain operator -- and separately, matched "the operator" and extracted its opcode and parameters (actually, *any* binary operator). The first half can match an opcode we support simplifying, and the second half can get the arguments and `op` and blindly use them in the rewrite.

This PR adds new guidance to avoid complex helpers and be aware of multi-matching behavior, preferring to write patterns directly (as the fix in #8005 does) instead. Longer-term, we also have other ideas, e.g. @jameysharp's suggestion to disallow at-patterns on multi-extractors in left-hand sides to reduce the chance of hitting this footgun.
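The value-vs-operation distinction can be sketched in ISLE terms. These are illustrative fragments, not text from the actual prelude or guidance (`value_nonzero` is a hypothetical helper name):

```isle
;; Fine: a property of the *value*. Nonzero-ness holds for every
;; enode in the eclass, so it doesn't matter which enode the
;; extractor happened to return.
(rule (simplify (udiv ty x y))
      (if-let _ (value_nonzero y))
      ...)

;; Risky: two independent multi-term matches on the same value.
;; `binary_op` and `reducible_modular_op` each iterate the eclass
;; separately and may bind *different* enodes, so `op`, `x`, and `y`
;; may not come from the instruction the guard approved.
(rule (simplify (ireduce smallty val@(binary_op _ op x y)))
      (if-let _ (reducible_modular_op val))
      ...)
```

The second shape is exactly what #8005 removed in favor of matching the operator directly in the pattern.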
Oops! Sorry for the mess here.
Good to see the new guidance for this, since it had been uncertain back when I first started doing these rules in #7693 (comment).

Yeah, we're all still figuring out how best to use this egraph optimizer. The best we can do is keep trying things out, and I appreciate your efforts to do that!
cc: @elliottt @cfallin @lpereira