cranelift: Fix ireduce rules #8005

jameysharp · 2024-02-28T01:25:11Z

We had two optimization rules which started off like this:

(rule (simplify (ireduce smallty val@(binary_op _ op x y)))
(if-let _ (reducible_modular_op val))
...)

This was intended to check that x and y came from an instruction which not only was a binary op but also matched reducible_modular_op.

Unfortunately, both binary_op and reducible_modular_op were multi-terms.

So binary_op would search the eclass rooted at val to find each instruction that uses a binary operator.
Then reducible_modular_op would search the entire eclass again to find an instruction which matched its criteria.

Nothing ensured that both searches would find the same instruction.

The reason these rules were written this way was because they had additional guards (will_simplify_with_ireduce) which made them fairly complex, and it seemed desirable to not have to copy those guards for every operator where we wanted to apply this optimization.

However, we've decided that checking whether the rule is actually an improvement is not desirable. In general, that should be the job of the cost function. Blindly adding equivalent expressions gives us more opportunities for other rules to fire, and we have global recursion and growth limits to keep the process from going too wild.

As a result, we can just delete those guards. That allows us to write the rules in a more straightforward way.

Fixes #7999.

cc: @elliottt @cfallin @lpereira

We had two optimization rules which started off like this: (rule (simplify (ireduce smallty val@(binary_op _ op x y))) (if-let _ (reducible_modular_op val)) ...) This was intended to check that `x` and `y` came from an instruction which not only was a binary op but also matched `reducible_modular_op`. Unfortunately, both `binary_op` and `reducible_modular_op` were multi-terms. - So `binary_op` would search the eclass rooted at `val` to find each instruction that uses a binary operator. - Then `reducible_modular_op` would search the entire eclass again to find an instruction which matched its criteria. Nothing ensured that both searches would find the same instruction. The reason these rules were written this way was because they had additional guards (`will_simplify_with_ireduce`) which made them fairly complex, and it seemed desirable to not have to copy those guards for every operator where we wanted to apply this optimization. However, we've decided that checking whether the rule is actually an improvement is not desirable. In general, that should be the job of the cost function. Blindly adding equivalent expressions gives us more opportunities for other rules to fire, and we have global recursion and growth limits to keep the process from going too wild. As a result, we can just delete those guards. That allows us to write the rules in a more straightforward way. Fixes bytecodealliance#7999. Co-authored-by: Trevor Elliott <telliott@fastly.com> Co-authored-by: L Pereira <l.pereira@fastly.com> Co-authored-by: Chris Fallin <chris@cfallin.org>

* cranelift: Fix ireduce rules (#8005) We had two optimization rules which started off like this: (rule (simplify (ireduce smallty val@(binary_op _ op x y))) (if-let _ (reducible_modular_op val)) ...) This was intended to check that `x` and `y` came from an instruction which not only was a binary op but also matched `reducible_modular_op`. Unfortunately, both `binary_op` and `reducible_modular_op` were multi-terms. - So `binary_op` would search the eclass rooted at `val` to find each instruction that uses a binary operator. - Then `reducible_modular_op` would search the entire eclass again to find an instruction which matched its criteria. Nothing ensured that both searches would find the same instruction. The reason these rules were written this way was because they had additional guards (`will_simplify_with_ireduce`) which made them fairly complex, and it seemed desirable to not have to copy those guards for every operator where we wanted to apply this optimization. However, we've decided that checking whether the rule is actually an improvement is not desirable. In general, that should be the job of the cost function. Blindly adding equivalent expressions gives us more opportunities for other rules to fire, and we have global recursion and growth limits to keep the process from going too wild. As a result, we can just delete those guards. That allows us to write the rules in a more straightforward way. Fixes #7999. Co-authored-by: Trevor Elliott <telliott@fastly.com> Co-authored-by: L Pereira <l.pereira@fastly.com> Co-authored-by: Chris Fallin <chris@cfallin.org> * Update RELEASES.md --------- Co-authored-by: Jamey Sharp <jsharp@fastly.com> Co-authored-by: L Pereira <l.pereira@fastly.com> Co-authored-by: Chris Fallin <chris@cfallin.org>

* cranelift: Fix ireduce rules (#8005) We had two optimization rules which started off like this: (rule (simplify (ireduce smallty val@(binary_op _ op x y))) (if-let _ (reducible_modular_op val)) ...) This was intended to check that `x` and `y` came from an instruction which not only was a binary op but also matched `reducible_modular_op`. Unfortunately, both `binary_op` and `reducible_modular_op` were multi-terms. - So `binary_op` would search the eclass rooted at `val` to find each instruction that uses a binary operator. - Then `reducible_modular_op` would search the entire eclass again to find an instruction which matched its criteria. Nothing ensured that both searches would find the same instruction. The reason these rules were written this way was because they had additional guards (`will_simplify_with_ireduce`) which made them fairly complex, and it seemed desirable to not have to copy those guards for every operator where we wanted to apply this optimization. However, we've decided that checking whether the rule is actually an improvement is not desirable. In general, that should be the job of the cost function. Blindly adding equivalent expressions gives us more opportunities for other rules to fire, and we have global recursion and growth limits to keep the process from going too wild. As a result, we can just delete those guards. That allows us to write the rules in a more straightforward way. Fixes #7999. Co-authored-by: Trevor Elliott <telliott@fastly.com> Co-authored-by: L Pereira <l.pereira@fastly.com> Co-authored-by: Chris Fallin <chris@cfallin.org> * Update RELEASES.md * Update RELEASES.md --------- Co-authored-by: Jamey Sharp <jsharp@fastly.com> Co-authored-by: L Pereira <l.pereira@fastly.com> Co-authored-by: Chris Fallin <chris@cfallin.org>

@jameysharp

…dance after bytecodealliance#7999. While debugging bytecodealliance#7999, we learned of a possible, extremely subtle, interaction between the design of our ISLE mid-end prelude and matching rules with a certain structure. Because a `Value` represents an entire eclass, separate left-hand sides (as in helper rules or if-let clauses) that match on the value may match against different enodes returned by the multi-match extractor's iterator. We then may infer some property of one enode, another property of another enode, and perform a rewrite that is only valid if both of those matches are on the same enode. The precise distinction is whether it is a property of the *value* -- e.g., nonzero, or even, or within a range -- or a property of the *operation* and matched subpieces. The former is fine, the latter runs into trouble. We found that bytecodealliance#7719 added a helper that determined whether a value "was a certain operator" -- actually, had any enode of a certain operator -- and separately, matched "the operator" and extracted its opcode and parameters (actually, *any* binary operator). The first half can match an opcode we support simplifying, and the second half can get the arguments and `op` and blindly use them in the rewrite. This PR adds new guidance to avoid complex helpers and be aware of multi-matching behavior, preferring to write patterns directly (as the fix in bytecodealliance#8005 does) instead. Longer-term, we also have other ideas, e.g. @jameysharp's suggestion to disallow at-patterns on multi-extractors in left hand sides to reduce the chance of hitting this footgun.

@jameysharp

…dance after #7999. (#8015) While debugging #7999, we learned of a possible, extremely subtle, interaction between the design of our ISLE mid-end prelude and matching rules with a certain structure. Because a `Value` represents an entire eclass, separate left-hand sides (as in helper rules or if-let clauses) that match on the value may match against different enodes returned by the multi-match extractor's iterator. We then may infer some property of one enode, another property of another enode, and perform a rewrite that is only valid if both of those matches are on the same enode. The precise distinction is whether it is a property of the *value* -- e.g., nonzero, or even, or within a range -- or a property of the *operation* and matched subpieces. The former is fine, the latter runs into trouble. We found that #7719 added a helper that determined whether a value "was a certain operator" -- actually, had any enode of a certain operator -- and separately, matched "the operator" and extracted its opcode and parameters (actually, *any* binary operator). The first half can match an opcode we support simplifying, and the second half can get the arguments and `op` and blindly use them in the rewrite. This PR adds new guidance to avoid complex helpers and be aware of multi-matching behavior, preferring to write patterns directly (as the fix in #8005 does) instead. Longer-term, we also have other ideas, e.g. @jameysharp's suggestion to disallow at-patterns on multi-extractors in left hand sides to reduce the chance of hitting this footgun.

scottmcm · 2024-03-28T06:42:53Z

Oops! Sorry for the mess here.

However, we've decided that checking whether the rule is actually an improvement is not desirable. In general, that should be the job of the cost function. Blindly adding equivalent expressions gives us more opportunities for other rules to fire, and we have global recursion and growth limits to keep the process from going too wild.

Good to see the new guidance for this, since it had been uncertain back when I first started doing these rules in #7693 (comment)

jameysharp · 2024-03-28T15:30:55Z

Yeah, we're all still figuring out how best to use this egraph optimizer. The best we can do is keep trying things out, and I appreciate your efforts to do that!

jameysharp requested a review from a team as a code owner February 28, 2024 01:25

jameysharp requested review from cfallin and removed request for a team February 28, 2024 01:25

jameysharp force-pushed the fix-ireduce-opts branch from 990ddd0 to 4672280 Compare February 28, 2024 01:26

jameysharp changed the title ~~cranelift: Fix ireduce rules (fixes #7999)~~ cranelift: Fix ireduce rules Feb 28, 2024

elliottt approved these changes Feb 28, 2024

View reviewed changes

jameysharp added this pull request to the merge queue Feb 28, 2024

Merged via the queue into bytecodealliance:main with commit ead6c7c Feb 28, 2024
19 checks passed

jameysharp deleted the fix-ireduce-opts branch February 28, 2024 02:28

elliottt mentioned this pull request Feb 28, 2024

Backport #8005 to release-17.0.0 #8007

Merged

elliottt mentioned this pull request Feb 28, 2024

Backport #8005 to release-18.0.0 #8008

Merged

fitzgen mentioned this pull request Feb 28, 2024

Add regression test for issue #7999 #8012

Merged

cfallin mentioned this pull request Feb 28, 2024

Cranelift: add note to opts/README.md describing new mid-end rule guidance after #7999. #8015

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

cranelift: Fix ireduce rules #8005

cranelift: Fix ireduce rules #8005

jameysharp commented Feb 28, 2024 •

edited

Loading

scottmcm commented Mar 28, 2024

jameysharp commented Mar 28, 2024

cranelift: Fix ireduce rules #8005

cranelift: Fix ireduce rules #8005

Conversation

jameysharp commented Feb 28, 2024 • edited Loading

scottmcm commented Mar 28, 2024

jameysharp commented Mar 28, 2024

jameysharp commented Feb 28, 2024 •

edited

Loading