Clobbers: use a more efficient bitmask representation in API. #58

Merged (2 commits) on Jun 27, 2022

@cfallin cfallin commented Jun 26, 2022

Currently, the `Function` trait requires a `&[PReg]` for the
clobber list of a given instruction. In most cases where clobbers are
used, the list is long: e.g., an ABI specifies a fixed set of clobbered
registers, and roughly half of all registers may be in that set.
What's more, the list can't be shared across, e.g., all calls of a given
ABI, because actual return values (defs) can't also be clobbers. So we
need to allocate space for long, sometimes slightly different lists; this
is inefficient for the embedder and for us.

It's much more efficient to use a bitmask to represent a set of physical
registers. With current data structure bitpacking limitations, we can
support at most 128 physical registers; this means we can use a `u128`
bitmask. This also allows e.g. an embedder to start with a constant for
a given ABI, and mask out bits for actual return-value registers on call
instructions.
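
To illustrate the idea, here is a minimal sketch of a `u128`-backed register set. It is not the code from this PR: `PhysReg`, `ClobberSet`, `IMAGINARY_ABI_CLOBBERS`, and `call_clobbers` are hypothetical names, and the "ABI" (registers 0 through 7 clobbered) is made up for the example. It only assumes that register indices are below 128.

```rust
/// Hypothetical physical-register index; assumed to be < 128.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct PhysReg(pub u8);

/// Hypothetical register set: bit `i` is set iff register `i` is in the set.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct ClobberSet(u128);

impl ClobberSet {
    /// The set containing no registers.
    pub const fn empty() -> Self {
        ClobberSet(0)
    }

    /// Returns a copy of the set with `reg` added.
    pub const fn with(self, reg: PhysReg) -> Self {
        ClobberSet(self.0 | (1u128 << (reg.0 as u32)))
    }

    /// Returns a copy of the set with `reg` removed.
    pub const fn without(self, reg: PhysReg) -> Self {
        ClobberSet(self.0 & !(1u128 << (reg.0 as u32)))
    }

    /// Tests whether `reg` is in the set.
    pub const fn contains(self, reg: PhysReg) -> bool {
        self.0 & (1u128 << (reg.0 as u32)) != 0
    }

    /// Number of registers in the set.
    pub const fn count(self) -> u32 {
        self.0.count_ones()
    }
}

/// Made-up clobber set for an imaginary ABI: registers 0 through 7.
/// Being a constant, it can be shared by every call site.
pub const IMAGINARY_ABI_CLOBBERS: ClobberSet = ClobberSet(0xff);

/// Clobbers for one call: start from the ABI constant and mask out the
/// registers that actually carry return values (defs) for this call.
pub fn call_clobbers(ret_regs: &[PhysReg]) -> ClobberSet {
    ret_regs
        .iter()
        .fold(IMAGINARY_ABI_CLOBBERS, |set, &reg| set.without(reg))
}
```

With a representation along these lines, `call_clobbers(&[PhysReg(0)])` yields a 16-byte `Copy` value per call site with no heap allocation, whereas a slice-based list needs separately stored storage for every distinct combination of return registers.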

This PR makes that change, with a minor but positive performance impact.
@cfallin cfallin requested a review from fitzgen June 26, 2022 17:57
cfallin added a commit to cfallin/wasmtime that referenced this pull request Jun 26, 2022
- Handle call instructions' clobbers with the clobbers API, using RA2's
  clobbers bitmask (bytecodealliance/regalloc2#58) rather than clobbers
  list;

- Pull in changes from bytecodealliance/regalloc2#59 for much more sane
  edge-case behavior w.r.t. liverange splitting.

Currently refers to local path for RA2 so will fail CI; I will update
this once RA2 PRs 58 and 59 are merged, version-bumped and released.

Fixes bytecodealliance#4291.

@fitzgen fitzgen left a comment

Nice! Small suggestions below.

@cfallin cfallin merged commit 9733cb2 into bytecodealliance:main Jun 27, 2022
@cfallin cfallin deleted the bitmask-clobbers branch June 27, 2022 19:27
cfallin added a commit to cfallin/regalloc2 that referenced this pull request Jun 27, 2022
Includes improvements in splitting performance (bytecodealliance#59) and more efficient
handling of clobber lists (bytecodealliance#58).

Semver break because of API change in bytecodealliance#58.
@cfallin cfallin mentioned this pull request Jun 27, 2022
cfallin added a commit that referenced this pull request Jun 27, 2022
Includes improvements in splitting performance (#59) and more efficient
handling of clobber lists (#58).

Semver break because of API change in #58.
cfallin added a commit to cfallin/wasmtime that referenced this pull request Jun 27, 2022
cfallin added a commit to bytecodealliance/wasmtime that referenced this pull request Jun 28, 2022
afonso360 pushed a commit to afonso360/wasmtime that referenced this pull request Jun 30, 2022