cranelift: Fix ireduce rules #8005

Merged
merged 1 commit into from
Feb 28, 2024

@jameysharp jameysharp commented Feb 28, 2024

We had two optimization rules which started off like this:

(rule (simplify (ireduce smallty val@(binary_op _ op x y)))
      (if-let _ (reducible_modular_op val))
      ...)

This was intended to check that x and y came from an instruction which not only was a binary op but also matched reducible_modular_op.
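As background on why only some operators qualify, here is a minimal Rust sketch of the arithmetic. It assumes `ireduce` behaves like integer truncation; the function names are hypothetical and are not part of Cranelift. A multiply commutes with truncation because the low bits of the product depend only on the low bits of the operands, whereas a right shift does not, because high input bits move into the low result bits.

```rust
/// Truncate a 64-bit value to 32 bits, standing in for `ireduce.i32`.
/// (Hypothetical helper for illustration only.)
fn ireduce32(v: u64) -> u32 {
    v as u32
}

/// For a modular op such as multiply, reducing the result gives the same
/// answer as operating on the reduced inputs.
fn mul_commutes_with_reduce(x: u64, y: u64) -> bool {
    ireduce32(x.wrapping_mul(y)) == ireduce32(x).wrapping_mul(ireduce32(y))
}

/// A logical right shift is not modular: bits above bit 31 of `x` are
/// shifted down into the low 32 bits of the result.
fn shr_commutes_with_reduce(x: u64, amt: u32) -> bool {
    ireduce32(x >> amt) == ireduce32(x) >> amt
}

fn main() {
    let (x, y) = (0x1_0000_0003_u64, 0xFFFF_FFFF_0000_0007_u64);
    println!("imul commutes with reduce: {}", mul_commutes_with_reduce(x, y));
    println!("ushr commutes with reduce: {}", shr_commutes_with_reduce(x, 1));
}
```

Pushing the reduction through the operands is therefore only sound when the operands really come from a modular operator.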

Unfortunately, both binary_op and reducible_modular_op were multi-terms.

  • So binary_op would search the eclass rooted at val to find each instruction that uses a binary operator.
  • Then reducible_modular_op would search the entire eclass again to find an instruction which matched its criteria.

Nothing ensured that both searches would find the same instruction.
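The mismatch can be sketched with a toy model in Rust. This is not Cranelift's data structure: `Enode`, `buggy_matches`, and `direct_matches` are hypothetical names, and the example eclass assumes that `v = ushr w, 1` was unioned with `imul v, 1` by an earlier rewrite.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum Opcode {
    Imul,
    Ushr,
}

/// One instruction in an eclass; operands are named symbolically.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Enode {
    op: Opcode,
    x: &'static str,
    y: &'static str,
}

/// In this toy model only multiply counts as a reducible modular op.
fn is_reducible_modular(op: Opcode) -> bool {
    matches!(op, Opcode::Imul)
}

/// Models the old rule shape: the first search yields every binary enode,
/// and the second, independent search only asks whether *some* enode in
/// the eclass is reducible.
fn buggy_matches(eclass: &[Enode]) -> Vec<Enode> {
    let any_reducible = eclass.iter().any(|n| is_reducible_modular(n.op));
    eclass.iter().copied().filter(|_| any_reducible).collect()
}

/// Models matching the operator directly: both conditions are checked on
/// the same enode.
fn direct_matches(eclass: &[Enode]) -> Vec<Enode> {
    eclass
        .iter()
        .copied()
        .filter(|n| is_reducible_modular(n.op))
        .collect()
}

/// Assumed example: two equivalent enodes in one eclass.
fn example_eclass() -> Vec<Enode> {
    vec![
        Enode { op: Opcode::Ushr, x: "w", y: "1" },
        Enode { op: Opcode::Imul, x: "v", y: "1" },
    ]
}

fn main() {
    let eclass = example_eclass();
    println!("buggy:  {:?}", buggy_matches(&eclass));
    println!("direct: {:?}", direct_matches(&eclass));
}
```

In this model the buggy shape also hands back the `ushr` operands, because the `imul` enode satisfied the second search on its behalf.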

These rules were written this way because they had additional guards (will_simplify_with_ireduce) which made them fairly complex, and it seemed desirable not to copy those guards for every operator where we wanted to apply this optimization.

However, we've decided that checking whether the rule is actually an improvement is not desirable. In general, that should be the job of the cost function. Blindly adding equivalent expressions gives us more opportunities for other rules to fire, and we have global recursion and growth limits to keep the process from going too wild.

As a result, we can just delete those guards. That allows us to write the rules in a more straightforward way.

Fixes #7999.

cc: @elliottt @cfallin @lpereira

@jameysharp jameysharp requested a review from a team as a code owner February 28, 2024 01:25
@jameysharp jameysharp requested review from cfallin and removed request for a team February 28, 2024 01:25
Co-authored-by: Trevor Elliott <telliott@fastly.com>
Co-authored-by: L Pereira <l.pereira@fastly.com>
Co-authored-by: Chris Fallin <chris@cfallin.org>
@jameysharp jameysharp changed the title from "cranelift: Fix ireduce rules (fixes #7999)" to "cranelift: Fix ireduce rules" Feb 28, 2024
@jameysharp jameysharp added this pull request to the merge queue Feb 28, 2024
Merged via the queue into bytecodealliance:main with commit ead6c7c Feb 28, 2024
19 checks passed
@jameysharp jameysharp deleted the fix-ireduce-opts branch February 28, 2024 02:28
elliottt added a commit to elliottt/wasmtime that referenced this pull request Feb 28, 2024
elliottt added a commit to elliottt/wasmtime that referenced this pull request Feb 28, 2024
alexcrichton pushed a commit that referenced this pull request Feb 28, 2024
* cranelift: Fix ireduce rules (#8005)

* Update RELEASES.md

---------

Co-authored-by: Jamey Sharp <jsharp@fastly.com>
Co-authored-by: L Pereira <l.pereira@fastly.com>
Co-authored-by: Chris Fallin <chris@cfallin.org>
alexcrichton pushed a commit that referenced this pull request Feb 28, 2024
* cranelift: Fix ireduce rules (#8005)

* Update RELEASES.md

* Update RELEASES.md

---------

Co-authored-by: Jamey Sharp <jsharp@fastly.com>
Co-authored-by: L Pereira <l.pereira@fastly.com>
Co-authored-by: Chris Fallin <chris@cfallin.org>
cfallin added a commit to cfallin/wasmtime that referenced this pull request Feb 28, 2024
…dance after bytecodealliance#7999.

While debugging bytecodealliance#7999, we learned of a possible, extremely subtle
interaction between the design of our ISLE mid-end prelude and matching
rules with a certain structure. Because a `Value` represents an entire
eclass, separate left-hand sides (as in helper rules or if-let clauses)
that match on the value may match against different enodes returned by
the multi-match extractor's iterator. We then may infer some property of
one enode, another property of another enode, and perform a rewrite that
is only valid if both of those matches are on the same enode.

The precise distinction is whether it is a property of the *value* -- e.g.,
nonzero, or even, or within a range -- or a property of the *operation*
and matched subpieces. The former is fine, the latter runs into
trouble. We found that bytecodealliance#7719 added a helper that determined whether a
value "was a certain operator" -- actually, had any enode of a certain
operator -- and separately, matched "the operator" and extracted its
opcode and parameters (actually, *any* binary operator). The first half
can match an opcode we support simplifying, and the second half can get
the arguments and `op` and blindly use them in the rewrite.

This PR adds new guidance to avoid complex helpers and be aware of
multi-matching behavior, preferring to write patterns directly (as the
fix in bytecodealliance#8005 does) instead. Longer-term, we also have other ideas, e.g.
@jameysharp's suggestion to disallow at-patterns on multi-extractors in
left-hand sides to reduce the chance of hitting this footgun.
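
The value-versus-operation distinction described above can be illustrated with a small Rust sketch. It is a toy model under stated assumptions, not Cranelift code: `Enode`, `eval`, and `all_agree` are hypothetical, and the eclass is assumed to hold two enodes that compute the same value.

```rust
/// Toy enodes with concrete operands so they can be evaluated.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Enode {
    Ushr(u64, u32),
    Imul(u64, u64),
}

/// Compute the value an enode produces.
fn eval(n: Enode) -> u64 {
    match n {
        Enode::Ushr(x, amt) => x >> amt,
        Enode::Imul(x, y) => x.wrapping_mul(y),
    }
}

/// A property of the *operation*: it depends on which enode was matched.
fn is_imul(n: Enode) -> bool {
    matches!(n, Enode::Imul(..))
}

/// True if `prop` gives the same answer for every enode in the eclass.
fn all_agree<T: PartialEq>(eclass: &[Enode], prop: impl Fn(Enode) -> T) -> bool {
    let mut answers = eclass.iter().map(|&n| prop(n));
    match answers.next() {
        Some(first) => answers.all(|a| a == first),
        None => true,
    }
}

/// Assumed example: both enodes evaluate to 6.
fn example_eclass() -> Vec<Enode> {
    vec![Enode::Ushr(12, 1), Enode::Imul(6, 1)]
}

fn main() {
    let eclass = example_eclass();
    // A value property (evenness) is the same whichever enode is inspected.
    println!("evenness agrees: {}", all_agree(&eclass, |n| eval(n) % 2 == 0));
    // An operation property (the opcode) is not.
    println!("opcode agrees:   {}", all_agree(&eclass, is_imul));
}
```

A helper that answers a value-level question is safe to call separately, since any enode gives the same answer; a helper that answers an operation-level question must be evaluated on the same enode whose operands the rewrite uses.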
github-merge-queue bot pushed a commit that referenced this pull request Feb 28, 2024
…dance after #7999. (#8015)

@scottmcm
Contributor

Oops! Sorry for the mess here.

> However, we've decided that checking whether the rule is actually an improvement is not desirable. In general, that should be the job of the cost function. Blindly adding equivalent expressions gives us more opportunities for other rules to fire, and we have global recursion and growth limits to keep the process from going too wild.

Good to see the new guidance on this, since it was still uncertain back when I first started writing these rules in #7693 (comment).

@jameysharp
Contributor Author

Yeah, we're all still figuring out how best to use this egraph optimizer. The best we can do is keep trying things out, and I appreciate your efforts to do that!

Successfully merging this pull request may close these issues.

Cranelift: Misoptimization of imul + ireduce