Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merge with upstream #136

Merged
merged 21 commits into from
Sep 25, 2024
Merged

Merge with upstream #136

merged 21 commits into from
Sep 25, 2024

Conversation

dhil
Copy link
Member

@dhil dhil commented Sep 25, 2024

No description provided.

keithw and others added 21 commits September 16, 2024 19:14
…e#1788)

* wasmprinter: print function bodies with code section

* Re-bless more tests
This commit updates the `roundtrip_wit` fuzzer to test out merging of
worlds, not just merging of `Resolve`. This then fixes various bugs that
cropped up where `ensure_can_add_world_exports` wasn't ensuring enough.
…dealliance#1792)

This commit fixes two issues, both preexisting, which represented
violated invariants of a `Resolve` after worlds were merged. World
merging is used during component generation and might be used for
bindings generation so it's important to maintain the various dynamic
invariants of `Resolve` to ensure that surrounding tooling works. An
example of this is that printing a merged world doesn't produce
parseable WIT today because some of the internal invariants aren't
respected.

This commit beefs up the `Resolve::assert_valid` function and then
applies necessary fixes to resolve these assertion failures on
preexisting test cases. The main difference is that anonymous
interfaces/types are now "cloned" when they move from one world to
another. This is required to ensure that all the various links between
these items are consistent and point to the right place. This cloning
infrastructure may end up being more generalized in the future if the
need arises, but for now it's quite limited.
* Update some fuzzing defaults and infrastructure

* Update `wasm_smith::Config` to default-enable some stage4+ proposals:
  `exceptions`, `gc`, `reference_types`, `relaxed_simd`, `simd`,
  `tail_call`, `threads`. These can still all be disabled via
  configuration and CLI flags.
* All stage4+ proposals are now swarm-enabled through
  `Arbitrary for Config`
* Default generation of modules in wasm-tools's own fuzzing no longer
  special-cases these proposals since they're all already handled.
* The `WasmFeatures` used for validating fuzz-generated modules now
  starts with a minimal baseline set of features to ensure that all
  proposals are disabled in the validator if the corresponding
  wasm-smith configuration flag is disabled.
* The `wasm-mutate` crate was updated to return errors instead of
  panicking for unsupported wasm proposals. All wasm proposals are now
  enabled when passing to `wasm-mutate`.

The primary motivation for this commit was this last point where I'm
seeing panics on OSS-Fuzz for Wasmtime using `wasm-mutate` as a mutation
hook because `wasm-mutate` is panicking on some GC types. When fixing
that I noticed other fuzz-related things I wanted to clean up while I
was here.

* Fix some feature handling in more places
Now that `wasm_encoder::reencode` exists that's the better option to use
for reencoding modules. This commit removes one of the major users of
the equivalent `Translator` trait in `wasm_mutate` to start down the
path of removing that trait entirely and instead relying on the support
in `wasm_encoder`.
…e#1795)

Same as bytecodealliance#1794 except for another use of the `Translator` trait.
…e#1796)

* Treat inputs uniformly in `mutate` fuzzer

Don't use cargo features to reinterpret inputs to ensure that test cases
can reproduce more easily when toggling features.

* Remove a panic in wasm-mutate

Let this fall through to the "unimplemented opcodes" section below
instead of panicking.

* Remove usage of `Translator` from peephole pass

Only used in a minor capacity, easy to transfer over.

* Share the `ReencodeResult` type in wasm-mutate

* Refactor `Elements` in `wasm-encoder`

* Add another hook in `Reencode` to process just elements
* Change the internal slices to `Cow` instead of `&[T]` to be a bit more
  flexible in terms of ownership

* Migrate const expression mutator to wasm-mutate

Mostly just updating various bits and pieces here and there.

* Remove translation framework of wasm-mutate

Now no-longer-needed.

* Reject `TryTable` in `wasm-mutate`

Returns a first-class error instead of creating an invalid module.

* Fix another panic in wasm-mutate

Fall back on empty type info instead of panicking to get the error to
crop up elsewhere for unsupported instructions.

* Fix test on historical rust

* More test fixes for older rust
Return a first-class error instead of generating an invalid module
during loop unrolling in wasm-mutate when unsupported opcodes are
encountered.
…ce#1799)

Remove a few `Encode` impls and start laying some groundwork for other
sections to get migrated as well with a new trait/method to use.
* threads: add component model canonical functions

The shared-everything-threads [proposal] adds two new component model
canonical functions, `thread.spawn` and `thread.hw_concurrency`. This
change adds initial support for these new functions, which should match
what is specified in the [canonical ABI]. All of this support should be
gated behind checks that ensure the shared-everything-threads proposal
has been enabled.

[proposal]: https://github.com/WebAssembly/shared-everything-threads
[canonical ABI]: https://github.com/WebAssembly/component-model/blob/main/design/mvp/CanonicalABI.md#-canon-threadspawn

* Rework using core types

* refactor: add `SubType::func` for brevity

* Create concrete ref from core type ID

* review: use core type index in reencoder

* review: add `intern_sub_type` helper

* review: move proposal check, add missing-features tests

* refactor: rename CM builtins test

* refactor: pass down `WasmFeatures`

* fix: re-bless

* review: make `thread.*` canonicals `shared`
…#1800)

* Extract "world elaboration" to a separate function

This commit extracts the process of elaborating a worlds imports/exports
to a dedicated function. This function ensures the transitive closure of
all imports/exports are listed in the world in proper topograhical
order. This additionally validates that the world has a coherent
definition given WIT today, notably that imports don't accidentally try
to use exported types.

This functionality was all present previously during the process of
taking a WIT document and creating `Resolve`. This moves the logic
around and refactors it slightly given its new surroundings. This
reorders a few imports/exports in worlds from what was previously
present but the underlying meaning of worlds should be the same.

The goal of this commit is to enable `Resolve`-mutating transformations
to not need to preserve all these invariants each time a world is
modified. Instead a world can be modified and then this function can be
called to "clean up" the world afterwards.

* Simplify the `Resolve::importize` operation

This commit updates the `importize` operation to be a much simpler "just
move the exports to the imports" operation where care is only taken to
preserve imported types and to additionally error on overlap of imported
types and exported functions/interfaces. The previous `elaborate_world`
method helps make this method much simpler than before.
…#1806)

Signed-off-by: karthik2804 <karthik.ganeshram@fermyon.com>
This commit fixes a small regression in reencoding from bytecodealliance#1794 where
element segments subtly changed encoding by accident. This commit
additionally ensures that there's a text format for all element
encodings and updates printing to respect the same defaults. It should
now be possible to represent all formats for element segments in the
text format and have them round-trippable.
…nce#1803)

* wast: Use wasm-encoder for import sections

* wast: Use wasm-encoder for the table section

* wast: Use wasm-encoder for the memory section

* wast: Use wasm-encoder for the tag section

* wast: Use wasm-encoder for the global section

* wast: Use wasm-encoder for the element section

* wast: Use wasm-encoder for the data section

* wast: Use wasm-encoder for the func section

* wast: Use wasm-encoder for the name section

* wast: Use wasm-encoder for more custom sections

* wast: Use wasm-encoder for encoding branch hints

This additionally entailed adding support to wasm-encoder for emitting
branch hints.

* Add example for `wasm_encoder::BranchHints`

* Fix rebase conflict
* Update to Wasmtime 25 for testing/fuzzing

* Downgrade serde again
…nce#1813)

This commit builds on the work of bytecodealliance#1764 to support the text format for
GC types in components more than before. This notably bring support for:

* `(sub ...)` types in components and module types
* `(rec ...)` groups in components and module types
* types can refer to themselves like with core wasm

The main consequence of this work is that unlike most other `$foo`
identifiers in the component model the identifiers found in types will
not automatically inject outer aliases to refer to outer types. For
example this will not parse:

    (component $C
      (type $t (struct))
      (component
        (type (array (ref $t)))
      )
    )

The reason for this is that automatic injection of an outer alias
requires that types are resolved and then their names are registered.
The resolution process queues up aliases to inject which are accounted
for during registration when indices are assigned. Here though because
types can refer to themselves (or future types in `rec` groups) the
registration process has to happen first before resolution. This means
that if resolution were to inject more type indices then that would mess
up the indexes already assigned.

This is hopefully relatively minor in terms of how often this'll bite
someone. For now various changes have been made to the name resolution
pass of components to handle this and some tests have been added too for
both positive and negative situations.
…nce#1810)

* Overhaul and refactor some internals of `wit-component`

This commit is a lead-up to the changes proposed in bytecodealliance#1774. The
`wit-component` crate is quite old and has gone through many iterations
of the component model and it's very much showing its age in a few
places. Namely the correlation between component model names and core
wasm names is open-coded in many places throughout validation and
encoding of a component. This makes the changes in bytecodealliance#1774 where the names
may be different (e.g. core wasm 0.2.0 might import component 0.2.1).

Making this change was inevitably going to require quite a lot of
refactoring of `wit-component`, so that's what this commit does. It's a
pretty large rewrite of the internals of validation of a core wasm
module and adapter prior to creating a component. The metadata produced
by this pass is now represented in a completely different format. The
metadata is extensively used throughout the encoding process so encoding
has been heavily refactored as well.

The overall idea is that the previous "grab bag" of various items here
and there produced from validation are now all unified into a single
`ImportMap` and `ExportMap` for a module. These maps track all the
various kinds of imports and exports and how they map back to core wasm
names. Notably this means that the logic to correlate core wasm names
with component model names is now happening in just one location (in
theory) as opposed to implicitly all throughout encoding. I've
additionally taken this opportunity to subjectively simplify much of the
encoding process around managing instantiations of core wasm modules and
adapters.

One of the main changes in this commit is that it does away with code
structure such as "do the thing for WIT" then "do the thing for
resources" then "do the thing for other resources" and finally "do the
thing for adapters". This was difficult to understand every time I came
back to it and I can't imagine was easy for anyone else to understand
either. All imports are now handled in a single location and it's
intended to be much better separated who's responsible for what. For
example the code satisfying an import is decoupled from what the import
is going to be named and how it's provided to the main core wasm module.

Overall the intention is that this does not either enhance the
functionality of wit-component nor regress it. Lots of tests have
changed but I've tried to verify it's just things moving around as
opposed to anything that has a different semantic meaning. A future PR
for bytecodealliance#1774 will enhance the logic of connecting core wasm imports to WIT
imports but that's deferred for a future PR.

* Update link-related tests

* Bless some more tests
…liance#1815)

* Implement merging semver-compatible interfaces in imports

This commit is the first half and the meat of the implementation
of bytecodealliance#1774 where, by default, WIT processing tools will now merge world
imports when possible based on semver versions. The precise shape of how
this is done has gone through a few iterations and this is the end
result I've ended up settling on. This should handle all the various
cases we're interested in and continue to produce valid worlds within a
`Resolve` (with the help of `elaborate_world` added in bytecodealliance#1800). CLI
tooling has been updated with flags to configure this behavior but the
behavior is now enabled by default.

* Add broken test case for wit-component

* Match wasm/component import names based on versions

This commit implements the final bit of bytecodealliance#1774 where the `wit-component`
crate will now match imports based on semver version in addition to the
previous exact-name matching that was performed previously.

* Don't panic on duplicate shims in wit-component

This is now possible with import version matching so refactor the
internal data structures to better support insertion of possibly
duplicate shims. Most of wit-component was already ready for this
refactoring, it was just the initial generation of shims that needed to
be reorganized slightly.

* wip

* Elaborate all worlds after semver merging

Any interface could be modified, so elaborate all of them to fixup anything
that needs new imports.

* Only merge once at the end, not continuously

This commit updates how the semver-merging bits of `Resolve` work by
moving it towards the end of the encoding process rather than
once-per-world-merged. That helps both deduplicate work and fix some
asserts I'm seeing that are tripping if a `Resolve` is import-merged and
then merged again elsewhere.

This additionally simplifies some APIs because the boolean of what to do
isn't threaded to quite so many locations.

* Fix test

* Update interface deps of exports too

Dependencies might depend on replaced interfaces so the dependency edges
there need to be updated in the same manner as imports.

* Fix fuzzer build

Also ensure to at least try to test the new merging function.
@dhil dhil merged commit 98efaab into wasmfx:main Sep 25, 2024
30 checks passed
@dhil dhil deleted the wasmfx-merge branch September 25, 2024 13:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants