Add new_file #3

Open
wants to merge 245 commits into
base: main-test-workflow
from

Conversation

jlb6740
Owner

@jlb6740 jlb6740 commented Jan 30, 2022

No description provided.

uweigand and others added 3 commits January 25, 2022 18:15
In order to migrate branches to ISLE, we define a second entry
point `lower_branch` which gets the list of branch targets as
additional argument.

This requires a small change to `lower_common`: the `isle_lower`
callback argument is changed from a function pointer to a closure.
This allows passing the extra argument via a closure.

Traps make use of the recently added facility to emit safepoints
from ISLE, but are otherwise straightforward.
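The closure change described above can be sketched roughly as follows. This is a minimal illustration, not the actual Cranelift code: `Inst`, `Block`, the string results, and the bodies of `lower` and `lower_branch` are all made-up stand-ins, and only the shape of the `isle_lower` callback argument follows the commit message.

```rust
/// Hypothetical stand-ins for the real instruction and block types.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Inst(u32);
#[derive(Debug, Clone, Copy, PartialEq)]
struct Block(u32);

/// Shared driver. Accepting `F: Fn(Inst) -> Option<String>` rather than a
/// plain `fn(Inst) -> Option<String>` pointer lets callers pass a closure
/// that captures extra state.
fn lower_common<F>(inst: Inst, isle_lower: F) -> Result<String, String>
where
    F: Fn(Inst) -> Option<String>,
{
    isle_lower(inst).ok_or_else(|| format!("no rule matched {:?}", inst))
}

/// Ordinary entry point: nothing extra needs to be captured.
fn lower(inst: Inst) -> Result<String, String> {
    lower_common(inst, |i| Some(format!("lower inst{}", i.0)))
}

/// Branch entry point: the list of branch targets reaches the callback by
/// being captured in the closure, without changing `lower_common`.
fn lower_branch(inst: Inst, targets: &[Block]) -> Result<String, String> {
    lower_common(inst, |i| {
        let first = targets.first()?;
        Some(format!("branch inst{} -> block{}", i.0, first.0))
    })
}

fn main() {
    println!("{:?}", lower(Inst(1)));
    println!("{:?}", lower_branch(Inst(2), &[Block(7)]));
}
```

A function pointer cannot capture `targets`, which is why the callback type has to become a closure once a second entry point needs an additional argument.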
…alliance#3739)

This commit updates the allocation of a `VMExternRefActivationsTable`
structure to perform zero malloc memory allocations. Previously it would
allocate a page-sized `chunk` plus some space in hash sets for future
insertions. The main trick implemented here is that the fast chunk is
only allocated and configured after the first gc, during the slow path.

The motivation for this PR is that given our recent work to further
refine and optimize the instantiation process this allocation started to
show up in a nontrivial fashion. Most modules today never touch this
table anyway as almost none of them use reference types, so the time
spent allocating and deallocating the table per-store was largely wasted
time.

Concretely on a microbenchmark this PR speeds up instantiation of a
module with one function by 30%, decreasing the instantiation cost from
1.8us to 1.2us. Overall a pretty minor win but when the instantiation
times we're measuring start being in the single-digit microseconds this
win ends up getting magnified!
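As a rough sketch of the lazy-allocation idea described above (not the real `VMExternRefActivationsTable`): the `ActivationsTable` type, its fields, `CHUNK_CAPACITY`, and the fake `gc` below are assumptions made for illustration. The only facts relied on are that `Vec::new` and `HashSet::new` create empty collections with capacity 0 and do not allocate.

```rust
use std::collections::HashSet;

/// Assumed chunk size, for illustration only.
const CHUNK_CAPACITY: usize = 512;

/// Hypothetical, simplified table of references.
struct ActivationsTable {
    chunk: Vec<usize>,        // fast-path storage
    overflow: HashSet<usize>, // slow-path storage
}

impl ActivationsTable {
    /// Construction performs no heap allocation: both collections start
    /// with capacity 0.
    fn new() -> Self {
        ActivationsTable { chunk: Vec::new(), overflow: HashSet::new() }
    }

    /// Fast path: succeeds only while the chunk has spare capacity, so it
    /// always fails before the chunk has been allocated.
    fn try_insert_fast(&mut self, r: usize) -> bool {
        if self.chunk.len() < self.chunk.capacity() {
            self.chunk.push(r);
            true
        } else {
            false
        }
    }

    /// Stand-in for a gc: empties the chunk and, the first time it runs,
    /// allocates the chunk's backing storage.
    fn gc(&mut self) {
        self.chunk.clear();
        if self.chunk.capacity() == 0 {
            self.chunk.reserve_exact(CHUNK_CAPACITY);
        }
    }

    /// Try the fast path; otherwise park the reference in the overflow set
    /// and run the gc on the slow path.
    fn insert(&mut self, r: usize) {
        if !self.try_insert_fast(r) {
            self.overflow.insert(r);
            self.gc();
        }
    }
}

fn main() {
    let mut table = ActivationsTable::new();
    println!("capacity before first insert: {}", table.chunk.capacity());
    table.insert(1);
    println!("chunk allocated after first gc: {}", table.chunk.capacity() >= CHUNK_CAPACITY);
}
```

A store whose module never touches reference types never calls `insert`, so in this sketch it never pays for the chunk at all.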
…lliance#3741)

* Don't copy `VMBuiltinFunctionsArray` into each `VMContext`

This is another PR along the lines of "let's squeeze all possible
performance we can out of instantiation". Before this PR we would copy,
by value, the contents of `VMBuiltinFunctionsArray` into each
`VMContext` allocated. This array of function pointers is modestly-sized
but growing over time as we add various intrinsics. Additionally it's
the exact same for all `VMContext` allocations.

This PR attempts to speed up instantiation slightly by instead storing
an indirection to the function array. This means that calling a builtin
intrinsic is a tad bit slower since it requires two loads instead of one
(one to get the base pointer, another to get the actual address).
Otherwise though `VMContext` initialization is now simply setting one
pointer instead of doing a `memcpy` from one location to another.

With some macro-magic this commit also replaces the previous
implementation with one that's more `const`-friendly which also gets us
compile-time type-checks of libcalls as well as compile-time
verification that all libcalls are defined.

Overall, as with bytecodealliance#3739, the win is very modest here. Locally I measured
a speedup from 1.9us to 1.7us taken to instantiate an empty module with
one function. While small at these scales it's still a 10% improvement!

* Review comments
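To make the trade-off in the first bullet concrete, here is a small sketch comparing a context that copies the builtin array by value with one that stores a single pointer to a shared array. The types `ContextByValue` and `ContextByPointer` and the three builtin functions are invented for this illustration; the real `VMContext` layout is not shown in this PR text.

```rust
use std::mem::size_of;

type Builtin = fn(u64) -> u64;

// Made-up builtins standing in for the real intrinsics.
fn memory_grow(pages: u64) -> u64 { pages + 1 }
fn table_grow(elems: u64) -> u64 { elems * 2 }
fn raise_trap(code: u64) -> u64 { code }

#[derive(Clone, Copy)]
struct BuiltinFunctionsArray {
    ptrs: [Builtin; 3],
}

/// One shared, immutable array for every context.
static BUILTINS: BuiltinFunctionsArray =
    BuiltinFunctionsArray { ptrs: [memory_grow, table_grow, raise_trap] };

/// Before: every context embeds its own copy of the array.
struct ContextByValue {
    builtins: BuiltinFunctionsArray,
}

/// After: every context stores one pointer to the shared array.
struct ContextByPointer {
    builtins: &'static BuiltinFunctionsArray,
}

impl ContextByValue {
    fn new() -> Self { ContextByValue { builtins: BUILTINS } } // copies all entries
    fn call(&self, index: usize, arg: u64) -> u64 { (self.builtins.ptrs[index])(arg) }
}

impl ContextByPointer {
    fn new() -> Self { ContextByPointer { builtins: &BUILTINS } } // sets one pointer
    // Calling goes through the base pointer first, then the entry.
    fn call(&self, index: usize, arg: u64) -> u64 { (self.builtins.ptrs[index])(arg) }
}

fn main() {
    println!("by value:   {} bytes", size_of::<ContextByValue>());
    println!("by pointer: {} bytes", size_of::<ContextByPointer>());
    println!("{}", ContextByPointer::new().call(1, 21));
}
```

The by-value context grows with every added intrinsic, while the by-pointer context stays one word wide, at the cost of the extra load on each call that the commit message mentions.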
@jlb6740
Owner Author

jlb6740 commented Jan 30, 2022

/bench_x64

@jlb6740 jlb6740 force-pushed the main-test-workflow branch from b91d0dc to 8ff5de0 Compare January 30, 2022 22:05
@jlb6740
Owner Author

jlb6740 commented Jan 30, 2022

bench

@jlb6740
Owner Author

jlb6740 commented Jan 30, 2022

/binch_x64

@jlb6740
Owner Author

jlb6740 commented Jan 30, 2022

/bench_x64

@jlb6740 jlb6740 force-pushed the main-test-workflow branch from 8ff5de0 to 2cc4b39 Compare January 30, 2022 22:23
@jlb6740
Owner Author

jlb6740 commented Jan 30, 2022

X86_blah

@jlb6740
Owner Author

jlb6740 commented Jan 30, 2022

/x86_64

@jlb6740
Owner Author

jlb6740 commented Jan 30, 2022

/Bench_x64

@jlb6740 jlb6740 force-pushed the main-test-workflow branch from 2cc4b39 to 3090ec0 Compare January 30, 2022 22:37
@jlb6740
Owner Author

jlb6740 commented Jan 30, 2022

/bench_x64
Performance results based on clockticks comparison with main HEAD (higher %change shows improvement):

arch engine phase %Change_(1-Head/Patch)
x86_64 patch Compilation 0.042198
x86_64 patch Instantiation -0.003262
x86_64 patch Execution 0.213896

@github-actions

Performance results based on clockticks comparison with main HEAD (higher %change shows improvement):

arch engine phase %Change_(1-Head/Patch)
x86_64 patch Compilation 0.047175
x86_64 patch Instantiation -0.041985
x86_64 patch Execution -0.004026

@jlb6740 jlb6740 force-pushed the main-test-workflow branch from 3090ec0 to 9ea1089 Compare January 30, 2022 22:44
@jlb6740
Owner Author

jlb6740 commented Jan 30, 2022

/Bench_x64

@jlb6740
Owner Author

jlb6740 commented Jan 30, 2022

/bench_x64
Performance results based on clockticks comparison with main HEAD (higher %change shows improvement):

arch engine phase %Change_(1-Head/Patch)
x86_64 patch Compilation 0.059375
x86_64 patch Instantiation -0.024623
x86_64 patch Execution -0.002329

@jlb6740
Owner Author

jlb6740 commented Jan 31, 2022

/bench_x64

Performance results based on clockticks comparison with main HEAD (higher %change shows improvement):

arch engine phase %Change_(1-Head/Patch)
x86_64 patch Compilation 0.041162
x86_64 patch Instantiation -0.043220
x86_64 patch Execution 0.012652

Performance results based on clockticks comparison with main HEAD (higher %change shows improvement):

arch engine phase %Change_(1-Head/Patch)
x86_64 patch Compilation 0.029994
x86_64 patch Instantiation -0.043939
x86_64 patch Execution -0.012034

Performance results based on clockticks comparison with main HEAD (higher %change shows improvement):

arch engine phase %Change_(1-Head/Patch)
x86_64 patch Compilation 0.052418
x86_64 patch Instantiation 0.009085
x86_64 patch Execution 0.002242

cfallin and others added 11 commits January 31, 2022 12:53
As first suggested by Jan on the Zulip here [1], a cheap and effective
way to obtain copy-on-write semantics of a "backing image" for a Wasm
memory is to mmap a file with `MAP_PRIVATE`. The `memfd` mechanism
provided by the Linux kernel allows us to create anonymous,
in-memory-only files that we can use for this mapping, so we can
construct the image contents on-the-fly then effectively create a CoW
overlay. Furthermore, and importantly, `madvise(MADV_DONTNEED, ...)`
will discard the CoW overlay, returning the mapping to its original
state.

By itself this is almost enough for a very fast
instantiation-termination loop of the same image over and over,
without changing the address space mapping at all (which is
expensive). The only missing bit is how to implement
heap *growth*. But here memfds can help us again: if we create another
anonymous file and map it where the extended parts of the heap would
go, we can take advantage of the fact that a `mmap()` mapping can
be *larger than the file itself*, with accesses beyond the end
generating a `SIGBUS`, and the fact that we can cheaply resize the
file with `ftruncate`, even after a mapping exists. So we can map the
"heap extension" file once with the maximum memory-slot size and grow
the memfd itself as `memory.grow` operations occur.

The above CoW technique and heap-growth technique together allow us a
fastpath of `madvise()` and `ftruncate()` only when we re-instantiate
the same module over and over, as long as we can reuse the same
slot. This fastpath avoids all whole-process address-space locks in
the Linux kernel, which should mean it is highly scalable. It also
avoids the cost of copying data on read, as the `uffd` heap backend
does when servicing pagefaults; the kernel's own optimized CoW
logic (same as used by all file mmaps) is used instead.

[1] https://bytecodealliance.zulipchat.com/#narrow/stream/206238-general/topic/Copy.20on.20write.20based.20instance.20reuse/near/266657772
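The copy-on-write reset described in the first paragraph of this commit message can be tried out with a small sketch like the one below. It is not Wasmtime's implementation. It assumes 64-bit Linux with glibc, declares the libc functions by hand so that no crate is needed, and hard-codes the usual Linux values for the `PROT_*`, `MAP_PRIVATE`, and `MADV_DONTNEED` constants; the helper name `cow_roundtrip` and the 4096-byte image are made up.

```rust
use std::fs::File;
use std::io::{self, Write};
use std::os::raw::{c_char, c_int, c_uint, c_void};
use std::os::unix::io::{AsRawFd, FromRawFd};

// Hand-written declarations of libc functions (assumes 64-bit Linux).
unsafe extern "C" {
    fn memfd_create(name: *const c_char, flags: c_uint) -> c_int;
    fn mmap(addr: *mut c_void, len: usize, prot: c_int, flags: c_int, fd: c_int, off: i64)
        -> *mut c_void;
    fn madvise(addr: *mut c_void, len: usize, advice: c_int) -> c_int;
    fn munmap(addr: *mut c_void, len: usize) -> c_int;
}

// Assumed Linux constant values.
const PROT_READ: c_int = 1;
const PROT_WRITE: c_int = 2;
const MAP_PRIVATE: c_int = 2;
const MADV_DONTNEED: c_int = 4;
const LEN: usize = 4096; // size of the toy "backing image"

/// Returns (byte seen after a write to the mapping, byte seen after the reset).
fn cow_roundtrip() -> io::Result<(u8, u8)> {
    unsafe {
        // Anonymous in-memory file holding the image contents.
        let fd = memfd_create(b"image\0".as_ptr() as *const c_char, 0);
        if fd < 0 {
            return Err(io::Error::last_os_error());
        }
        let mut image = File::from_raw_fd(fd); // closes the fd when dropped
        image.write_all(&[0xAAu8; LEN])?;

        // Private mapping: writes go to a CoW overlay, never to the file.
        let map = mmap(std::ptr::null_mut(), LEN, PROT_READ | PROT_WRITE, MAP_PRIVATE,
                       image.as_raw_fd(), 0);
        if map as isize == -1 {
            return Err(io::Error::last_os_error());
        }
        let first = map as *mut u8;
        first.write_volatile(0x11);
        let dirty = first.read_volatile();

        // Discard the overlay; the mapping itself stays in place.
        if madvise(map, LEN, MADV_DONTNEED) != 0 {
            let err = io::Error::last_os_error();
            munmap(map, LEN);
            return Err(err);
        }
        let reset = first.read_volatile();
        munmap(map, LEN);
        Ok((dirty, reset))
    }
}

fn main() {
    println!("{:?}", cow_roundtrip());
}
```

The second element of the returned pair shows the original image byte again after `madvise`, without any `munmap`/`mmap` cycle in between, which is the property the instantiation fast path relies on.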
Testing so far with recent Wasmtime has not been able to show the need
for avoiding the process-wide mmap lock in real-world use-cases. As
such, the technique of using an anonymous file and ftruncate() to extend
it seems unnecessary; instead, memfd can always use anonymous zeroed
memory for heap backing where the CoW image is not present, and
mprotect() to extend the heap limit by changing page protections.
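The simpler growth scheme from this commit message (anonymous zeroed memory plus `mprotect()` to move the heap limit) might look roughly like this. It is a sketch under assumptions: 64-bit Linux on an architecture where `MAP_ANONYMOUS` is `0x20`, hand-declared libc functions, and a made-up `Heap` type. It is not the Wasmtime code.

```rust
use std::io;
use std::os::raw::{c_int, c_void};

// Hand-written declarations of libc functions (assumes 64-bit Linux).
unsafe extern "C" {
    fn mmap(addr: *mut c_void, len: usize, prot: c_int, flags: c_int, fd: c_int, off: i64)
        -> *mut c_void;
    fn mprotect(addr: *mut c_void, len: usize, prot: c_int) -> c_int;
    fn munmap(addr: *mut c_void, len: usize) -> c_int;
}

// Assumed Linux constant values (x86_64 / aarch64).
const PROT_NONE: c_int = 0;
const PROT_READ: c_int = 1;
const PROT_WRITE: c_int = 2;
const MAP_PRIVATE: c_int = 0x02;
const MAP_ANONYMOUS: c_int = 0x20;
const WASM_PAGE: usize = 65536;

/// Hypothetical heap slot: a fixed reservation with a movable limit.
struct Heap {
    base: *mut u8,
    accessible: usize,
    reserved: usize,
}

impl Heap {
    /// Reserve address space for `max_pages` Wasm pages, all inaccessible.
    fn reserve(max_pages: usize) -> io::Result<Heap> {
        let reserved = max_pages * WASM_PAGE;
        let p = unsafe {
            mmap(std::ptr::null_mut(), reserved, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0)
        };
        if p as isize == -1 {
            return Err(io::Error::last_os_error());
        }
        Ok(Heap { base: p as *mut u8, accessible: 0, reserved })
    }

    /// Extend the heap limit by changing page protections only.
    fn grow_to(&mut self, pages: usize) -> io::Result<()> {
        let new_len = pages * WASM_PAGE;
        if new_len < self.accessible || new_len > self.reserved {
            return Err(io::Error::new(io::ErrorKind::InvalidInput, "bad heap size"));
        }
        let rc = unsafe { mprotect(self.base as *mut c_void, new_len, PROT_READ | PROT_WRITE) };
        if rc != 0 {
            return Err(io::Error::last_os_error());
        }
        self.accessible = new_len;
        Ok(())
    }

    /// Bounds-checked load; `None` beyond the current limit.
    fn load(&self, offset: usize) -> Option<u8> {
        if offset < self.accessible {
            Some(unsafe { self.base.add(offset).read() })
        } else {
            None
        }
    }

    /// Bounds-checked store; returns false beyond the current limit.
    fn store(&mut self, offset: usize, value: u8) -> bool {
        if offset < self.accessible {
            unsafe { self.base.add(offset).write(value) };
            true
        } else {
            false
        }
    }
}

impl Drop for Heap {
    fn drop(&mut self) {
        unsafe { munmap(self.base as *mut c_void, self.reserved) };
    }
}

fn main() {
    let mut heap = Heap::reserve(4).expect("reserve");
    heap.grow_to(1).expect("grow");
    println!("{:?}", heap.load(0));
}
```

Newly exposed pages read as zero because they come from an anonymous mapping, so no separate file or `ftruncate()` call is involved in growing.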
…nchtrap

s390x: Migrate branches and traps to ISLE
Even though the implementation of emit and emit_safepoint may
be platform-specific, the interface ought to be common so that
other code in prelude.isle may safely call these constructors.

This patch moves the definition of emit (from all platforms)
and emit_safepoint (s390x only) to prelude.isle.  This required
adding an emit_safepoint implementation to aarch64 and x64 as
well - the latter is still a stub as special move mitosis
handling will be required.
Move emit and emit_safepoint to prelude.isle
With the addition of `sock_accept()` in `wasi-0.11.0`, wasmtime can now
implement basic networking for pre-opened sockets.

For Windows `AsHandle` was replaced with `AsRawHandleOrSocket` to cope
with the duality of Handles and Sockets.

For Unix a `wasi_cap_std_sync::net::Socket` enum was created to handle
the {Tcp,Unix}{Listener,Stream} more efficiently in
`WasiCtxBuilder::preopened_socket()`.

The addition of that many `WasiFile` implementors was necessary mainly
because of the difference in the `num_ready_bytes()` function.

A known issue is that Windows now busy-polls on sockets because, except
for `stdin`, nothing queries the status of Windows handles/sockets.

Another known issue on Windows is that there is no crate providing
support for `fcntl(fd, F_GETFL, 0)` on a socket.

Signed-off-by: Harald Hoyer <harald@profian.com>
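A minimal, Unix-only sketch of the enum approach mentioned above follows. The `Socket` enum here only mirrors the `{Tcp,Unix}{Listener,Stream}` shape named in the message; the `can_accept` method and the free function `preopened_socket` are hypothetical and do not reproduce the actual `wasi_cap_std_sync::net::Socket` or `WasiCtxBuilder::preopened_socket()` signatures.

```rust
use std::net::{TcpListener, TcpStream};
use std::os::unix::net::{UnixListener, UnixStream};

/// One type covering the four socket kinds that can be pre-opened.
enum Socket {
    TcpListener(TcpListener),
    TcpStream(TcpStream),
    UnixListener(UnixListener),
    UnixStream(UnixStream),
}

impl From<TcpListener> for Socket {
    fn from(s: TcpListener) -> Self { Socket::TcpListener(s) }
}
impl From<TcpStream> for Socket {
    fn from(s: TcpStream) -> Self { Socket::TcpStream(s) }
}
impl From<UnixListener> for Socket {
    fn from(s: UnixListener) -> Self { Socket::UnixListener(s) }
}
impl From<UnixStream> for Socket {
    fn from(s: UnixStream) -> Self { Socket::UnixStream(s) }
}

impl Socket {
    /// Only listeners can serve an accept-style call.
    fn can_accept(&self) -> bool {
        match self {
            Socket::TcpListener(_) | Socket::UnixListener(_) => true,
            Socket::TcpStream(_) | Socket::UnixStream(_) => false,
        }
    }
}

/// Hypothetical single entry point accepting any of the four kinds.
fn preopened_socket(socket: impl Into<Socket>) -> Socket {
    socket.into()
}

fn main() {
    let (a, _b) = UnixStream::pair().expect("socketpair");
    let socket = preopened_socket(a);
    println!("can accept: {}", socket.can_accept());
}
```

With the `From` impls in place, a builder needs one method rather than four, which is the convenience the commit message attributes to the enum.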
(This was not a correctness bug, but is an obvious performance bug...)
@jlb6740 jlb6740 force-pushed the main-test-workflow branch 2 times, most recently from afb0d6e to 4e5c126 Compare April 15, 2022 20:18
@jlb6740
Owner Author

jlb6740 commented Jul 7, 2022

/bench_x64

16 similar comments
@jlb6740
Owner Author

jlb6740 commented Jul 8, 2022

Requested from pull request comment.

Shows clockticks reduced. 1-Patch/Main (positive pct is better)

wasm arch phase pct_change
blake3-scalar x86_64 Compilation 0.252038
blake3-scalar x86_64 Execution 0.514687
blake3-scalar x86_64 Instantiation -0.188046
blake3-simd x86_64 Compilation 0.061630
blake3-simd x86_64 Execution 0.488527
blake3-simd x86_64 Instantiation -0.225931
bz2 x86_64 Compilation 0.205812
bz2 x86_64 Execution 0.003150
bz2 x86_64 Instantiation -0.178587
meshoptimizer x86_64 Compilation 0.354439
meshoptimizer x86_64 Execution 0.126972
meshoptimizer x86_64 Instantiation 0.104328
noop x86_64 Compilation 0.345874
noop x86_64 Execution 0.260000
noop x86_64 Instantiation -0.436084
pulldown-cmark x86_64 Compilation 0.283439
pulldown-cmark x86_64 Execution 0.027934
pulldown-cmark x86_64 Instantiation -0.073733
shootout-ackermann x86_64 Compilation 0.180137
shootout-ackermann x86_64 Execution 0.169526
shootout-ackermann x86_64 Instantiation -0.176180
shootout-base64 x86_64 Compilation 0.190269
shootout-base64 x86_64 Execution 0.031772
shootout-base64 x86_64 Instantiation -0.244714
shootout-ctype x86_64 Compilation 0.167044
shootout-ctype x86_64 Execution 0.116045
shootout-ctype x86_64 Instantiation -0.225372
shootout-ed25519 x86_64 Compilation -0.257056
shootout-ed25519 x86_64 Execution 0.162103
shootout-ed25519 x86_64 Instantiation -0.251569
shootout-fib2 x86_64 Compilation 0.225430
shootout-fib2 x86_64 Execution 0.232701
shootout-fib2 x86_64 Instantiation -0.218610
shootout-gimli x86_64 Compilation -0.419485
shootout-gimli x86_64 Execution -0.325183
shootout-gimli x86_64 Instantiation -0.577899
shootout-heapsort x86_64 Compilation 0.137374
shootout-heapsort x86_64 Execution 0.096731
shootout-heapsort x86_64 Instantiation -0.090041
shootout-keccak x86_64 Compilation -0.529278
shootout-keccak x86_64 Execution 0.296737
shootout-keccak x86_64 Instantiation -0.262778
shootout-matrix x86_64 Compilation 0.240290
shootout-matrix x86_64 Execution -0.146350
shootout-matrix x86_64 Instantiation -0.168759
shootout-memmove x86_64 Compilation 0.021640
shootout-memmove x86_64 Execution 0.028090
shootout-memmove x86_64 Instantiation -0.267921
shootout-minicsv x86_64 Compilation 0.083099
shootout-minicsv x86_64 Execution -0.008419
shootout-minicsv x86_64 Instantiation -0.279105
shootout-nestedloop x86_64 Compilation 0.234397
shootout-nestedloop x86_64 Execution -0.394263
shootout-nestedloop x86_64 Instantiation -0.232816
shootout-random x86_64 Compilation 0.214803
shootout-random x86_64 Execution 0.013657
shootout-random x86_64 Instantiation -0.253377
shootout-ratelimit x86_64 Compilation 0.261537
shootout-ratelimit x86_64 Execution 0.037091
shootout-ratelimit x86_64 Instantiation -0.298557
shootout-seqhash x86_64 Compilation 0.151300
shootout-seqhash x86_64 Execution 0.329786
shootout-seqhash x86_64 Instantiation -0.257654
shootout-sieve x86_64 Compilation 0.221652
shootout-sieve x86_64 Execution -0.034621
shootout-sieve x86_64 Instantiation -0.224541
shootout-switch x86_64 Compilation -0.541247
shootout-switch x86_64 Execution -0.046208
shootout-switch x86_64 Instantiation -0.278171
shootout-xblabla20 x86_64 Compilation 0.015567
shootout-xblabla20 x86_64 Execution 0.364604
shootout-xblabla20 x86_64 Instantiation -0.258469
shootout-xchacha20 x86_64 Compilation 0.137297
shootout-xchacha20 x86_64 Execution 0.223475
shootout-xchacha20 x86_64 Instantiation -0.298439
spidermonkey x86_64 Compilation 0.319904
spidermonkey x86_64 Execution 0.101459
spidermonkey x86_64 Instantiation 0.839427
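
For reading the `pct_change` column, the header formula `1-Patch/Main` can be written out as below. The function name and the clocktick counts are invented for illustration; the raw counts behind the table are not part of this comment. Read literally, the formula yields a fraction, so a value such as 0.25 would correspond to 25% fewer clockticks on the patch.

```rust
/// Fraction of clockticks saved by the patch relative to main,
/// following the header formula `1 - Patch/Main`.
fn pct_change(patch_ticks: f64, main_ticks: f64) -> f64 {
    1.0 - patch_ticks / main_ticks
}

fn main() {
    // Made-up counts: the patch needs 75 ticks where main needs 100.
    println!("{}", pct_change(75.0, 100.0));
    // Made-up counts: the patch is slower, so the value is negative.
    println!("{}", pct_change(120.0, 100.0));
}
```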

@jlb6740
Owner Author

jlb6740 commented Jul 8, 2022

/bench_x64

1 similar comment
@jlb6740
Owner Author

jlb6740 commented Jul 8, 2022

Requested from pull request comment.

Shows clockticks reduced. 1-Patch/Main (positive pct is better)

wasm arch phase pct_change
blake3-scalar x86_64 Compilation 0.267785
blake3-scalar x86_64 Execution 0.523186
blake3-scalar x86_64 Instantiation -0.087342
blake3-simd x86_64 Compilation 0.152520
blake3-simd x86_64 Execution 0.517645
blake3-simd x86_64 Instantiation -0.204900
bz2 x86_64 Compilation 0.140617
bz2 x86_64 Execution 0.029239
bz2 x86_64 Instantiation -0.185546
meshoptimizer x86_64 Compilation 0.355487
meshoptimizer x86_64 Execution 0.126032
meshoptimizer x86_64 Instantiation 0.106589
noop x86_64 Compilation 0.370164
noop x86_64 Execution -0.225434
noop x86_64 Instantiation -0.141821
pulldown-cmark x86_64 Compilation 0.221078
pulldown-cmark x86_64 Execution 0.043281
pulldown-cmark x86_64 Instantiation 0.096677
shootout-ackermann x86_64 Compilation 0.261293
shootout-ackermann x86_64 Execution 0.476967
shootout-ackermann x86_64 Instantiation -0.110514
shootout-base64 x86_64 Compilation 0.192387
shootout-base64 x86_64 Execution 0.023694
shootout-base64 x86_64 Instantiation -0.180700
shootout-ctype x86_64 Compilation 0.211759
shootout-ctype x86_64 Execution 0.118828
shootout-ctype x86_64 Instantiation -0.171770
shootout-ed25519 x86_64 Compilation -0.251544
shootout-ed25519 x86_64 Execution 0.159518
shootout-ed25519 x86_64 Instantiation -0.212709
shootout-fib2 x86_64 Compilation 0.232027
shootout-fib2 x86_64 Execution 0.231405
shootout-fib2 x86_64 Instantiation -0.227172
shootout-gimli x86_64 Compilation 0.013869
shootout-gimli x86_64 Execution 0.117642
shootout-gimli x86_64 Instantiation -0.155875
shootout-heapsort x86_64 Compilation 0.144255
shootout-heapsort x86_64 Execution 0.098402
shootout-heapsort x86_64 Instantiation -0.218630
shootout-keccak x86_64 Compilation -0.517763
shootout-keccak x86_64 Execution 0.298092
shootout-keccak x86_64 Instantiation -0.242030
shootout-matrix x86_64 Compilation 0.200977
shootout-matrix x86_64 Execution -0.147316
shootout-matrix x86_64 Instantiation -0.190787
shootout-memmove x86_64 Compilation 0.146174
shootout-memmove x86_64 Execution 0.029049
shootout-memmove x86_64 Instantiation -0.224595
shootout-minicsv x86_64 Compilation 0.224297
shootout-minicsv x86_64 Execution 0.000593
shootout-minicsv x86_64 Instantiation -0.560320
shootout-nestedloop x86_64 Compilation 0.243502
shootout-nestedloop x86_64 Execution -0.390304
shootout-nestedloop x86_64 Instantiation -0.205522
shootout-random x86_64 Compilation 0.218406
shootout-random x86_64 Execution 0.015519
shootout-random x86_64 Instantiation -0.070179
shootout-ratelimit x86_64 Compilation 0.216435
shootout-ratelimit x86_64 Execution 0.029863
shootout-ratelimit x86_64 Instantiation 0.026571
shootout-seqhash x86_64 Compilation 0.112194
shootout-seqhash x86_64 Execution 0.245023
shootout-seqhash x86_64 Instantiation -0.261993
shootout-sieve x86_64 Compilation 0.229918
shootout-sieve x86_64 Execution -0.056225
shootout-sieve x86_64 Instantiation -0.194323
shootout-switch x86_64 Compilation -0.467942
shootout-switch x86_64 Execution -0.042356
shootout-switch x86_64 Instantiation -0.226528
shootout-xblabla20 x86_64 Compilation 0.120386
shootout-xblabla20 x86_64 Execution 0.368616
shootout-xblabla20 x86_64 Instantiation -0.235619
shootout-xchacha20 x86_64 Compilation 0.086441
shootout-xchacha20 x86_64 Execution 0.226574
shootout-xchacha20 x86_64 Instantiation 0.121736
spidermonkey x86_64 Compilation 0.334647
spidermonkey x86_64 Execution 0.096342
spidermonkey x86_64 Instantiation 0.762032

@jlb6740
Owner Author

jlb6740 commented Jul 8, 2022

Requested from pull request comment.

Shows clockticks reduced. 1-Patch/Main (positive pct is better)

wasm arch phase pct_change
blake3-scalar x86_64 Compilation 0.267654
blake3-scalar x86_64 Execution 0.502793
blake3-scalar x86_64 Instantiation -0.167469
blake3-simd x86_64 Compilation 0.035898
blake3-simd x86_64 Execution 0.723880
blake3-simd x86_64 Instantiation -0.188790
bz2 x86_64 Compilation 0.169353
bz2 x86_64 Execution 0.029607
bz2 x86_64 Instantiation -0.167022
meshoptimizer x86_64 Compilation 0.378991
meshoptimizer x86_64 Execution 0.122253
meshoptimizer x86_64 Instantiation 0.427078
noop x86_64 Compilation 0.356464
noop x86_64 Execution 0.100243
noop x86_64 Instantiation -0.205280
pulldown-cmark x86_64 Compilation 0.207149
pulldown-cmark x86_64 Execution 0.036915
pulldown-cmark x86_64 Instantiation 0.043888
shootout-ackermann x86_64 Compilation 0.247948
shootout-ackermann x86_64 Execution -0.118251
shootout-ackermann x86_64 Instantiation -0.156018
shootout-base64 x86_64 Compilation 0.235143
shootout-base64 x86_64 Execution 0.032966
shootout-base64 x86_64 Instantiation -0.201081
shootout-ctype x86_64 Compilation 0.206758
shootout-ctype x86_64 Execution 0.117529
shootout-ctype x86_64 Instantiation -0.204755
shootout-ed25519 x86_64 Compilation -0.281983
shootout-ed25519 x86_64 Execution 0.161328
shootout-ed25519 x86_64 Instantiation -0.208071
shootout-fib2 x86_64 Compilation 0.220997
shootout-fib2 x86_64 Execution 0.233999
shootout-fib2 x86_64 Instantiation -0.025330
shootout-gimli x86_64 Compilation -0.052543
shootout-gimli x86_64 Execution 0.128204
shootout-gimli x86_64 Instantiation -0.373405
shootout-heapsort x86_64 Compilation 0.169238
shootout-heapsort x86_64 Execution 0.099530
shootout-heapsort x86_64 Instantiation -0.267675
shootout-keccak x86_64 Compilation -0.522172
shootout-keccak x86_64 Execution 0.307606
shootout-keccak x86_64 Instantiation -0.265023
shootout-matrix x86_64 Compilation 0.225777
shootout-matrix x86_64 Execution -0.153518
shootout-matrix x86_64 Instantiation -0.231578
shootout-memmove x86_64 Compilation 0.128170
shootout-memmove x86_64 Execution 0.028786
shootout-memmove x86_64 Instantiation -0.282509
shootout-minicsv x86_64 Compilation 0.128989
shootout-minicsv x86_64 Execution -0.008059
shootout-minicsv x86_64 Instantiation -0.244154
shootout-nestedloop x86_64 Compilation 0.225583
shootout-nestedloop x86_64 Execution -0.391409
shootout-nestedloop x86_64 Instantiation -0.318621
shootout-random x86_64 Compilation 0.232833
shootout-random x86_64 Execution 0.014782
shootout-random x86_64 Instantiation -0.232782
shootout-ratelimit x86_64 Compilation 0.237936
shootout-ratelimit x86_64 Execution 0.026535
shootout-ratelimit x86_64 Instantiation -0.233666
shootout-seqhash x86_64 Compilation 0.062589
shootout-seqhash x86_64 Execution 0.205809
shootout-seqhash x86_64 Instantiation -0.331052
shootout-sieve x86_64 Compilation 0.233635
shootout-sieve x86_64 Execution -0.034231
shootout-sieve x86_64 Instantiation -0.207444
shootout-switch x86_64 Compilation -0.517783
shootout-switch x86_64 Execution -0.043518
shootout-switch x86_64 Instantiation -0.250956
shootout-xblabla20 x86_64 Compilation 0.122987
shootout-xblabla20 x86_64 Execution 0.376581
shootout-xblabla20 x86_64 Instantiation -0.265191
shootout-xchacha20 x86_64 Compilation 0.008317
shootout-xchacha20 x86_64 Execution 0.204805
shootout-xchacha20 x86_64 Instantiation -0.330818
spidermonkey x86_64 Compilation 0.342308
spidermonkey x86_64 Execution 0.102966
spidermonkey x86_64 Instantiation 0.717683

@jlb6740
Owner Author

jlb6740 commented Jul 8, 2022

/bench_x64

@jlb6740
Owner Author

jlb6740 commented Jul 8, 2022

Requested from pull request comment.

Shows clockticks reduced. 1-Patch/Main (positive pct is better)

wasm arch phase pct_change
blake3-scalar x86_64 Compilation 0.280760
blake3-scalar x86_64 Execution 0.508734
blake3-scalar x86_64 Instantiation -0.084096
blake3-simd x86_64 Compilation 0.065832
blake3-simd x86_64 Execution 0.499878
blake3-simd x86_64 Instantiation -0.280597
bz2 x86_64 Compilation 0.130422
bz2 x86_64 Execution 0.027760
bz2 x86_64 Instantiation -0.221698
meshoptimizer x86_64 Compilation 0.375760
meshoptimizer x86_64 Execution 0.125533
meshoptimizer x86_64 Instantiation 0.069344
noop x86_64 Compilation 0.365302
noop x86_64 Execution 0.089593
noop x86_64 Instantiation -0.179320
pulldown-cmark x86_64 Compilation 0.258382
pulldown-cmark x86_64 Execution 0.022203
pulldown-cmark x86_64 Instantiation 0.000310
shootout-ackermann x86_64 Compilation 0.230511
shootout-ackermann x86_64 Execution -0.002342
shootout-ackermann x86_64 Instantiation -0.091477
shootout-base64 x86_64 Compilation 0.132505
shootout-base64 x86_64 Execution 0.029405
shootout-base64 x86_64 Instantiation -0.194545
shootout-ctype x86_64 Compilation 0.207134
shootout-ctype x86_64 Execution 0.115266
shootout-ctype x86_64 Instantiation -0.208857
shootout-ed25519 x86_64 Compilation -0.232659
shootout-ed25519 x86_64 Execution 0.159042
shootout-ed25519 x86_64 Instantiation -0.233938
shootout-fib2 x86_64 Compilation 0.237104
shootout-fib2 x86_64 Execution 0.233725
shootout-fib2 x86_64 Instantiation -0.186542
shootout-gimli x86_64 Compilation -0.004619
shootout-gimli x86_64 Execution 0.140096
shootout-gimli x86_64 Instantiation -0.183178
shootout-heapsort x86_64 Compilation 0.148813
shootout-heapsort x86_64 Execution 0.099344
shootout-heapsort x86_64 Instantiation -0.271239
shootout-keccak x86_64 Compilation -0.533123
shootout-keccak x86_64 Execution 0.305799
shootout-keccak x86_64 Instantiation -0.279859
shootout-matrix x86_64 Compilation 0.183476
shootout-matrix x86_64 Execution -0.147280
shootout-matrix x86_64 Instantiation -0.235017
shootout-memmove x86_64 Compilation 0.170192
shootout-memmove x86_64 Execution 0.026924
shootout-memmove x86_64 Instantiation -0.230222
shootout-minicsv x86_64 Compilation 0.188954
shootout-minicsv x86_64 Execution -0.011269
shootout-minicsv x86_64 Instantiation -0.227523
shootout-nestedloop x86_64 Compilation 0.231949
shootout-nestedloop x86_64 Execution -0.392175
shootout-nestedloop x86_64 Instantiation -0.218772
shootout-random x86_64 Compilation 0.228663
shootout-random x86_64 Execution 0.011367
shootout-random x86_64 Instantiation -0.197998
shootout-ratelimit x86_64 Compilation 0.207005
shootout-ratelimit x86_64 Execution 0.027591
shootout-ratelimit x86_64 Instantiation -0.182847
shootout-seqhash x86_64 Compilation 0.155373
shootout-seqhash x86_64 Execution 0.205752
shootout-seqhash x86_64 Instantiation -0.330183
shootout-sieve x86_64 Compilation 0.214609
shootout-sieve x86_64 Execution -0.035219
shootout-sieve x86_64 Instantiation -0.159932
shootout-switch x86_64 Compilation -0.509938
shootout-switch x86_64 Execution -0.039629
shootout-switch x86_64 Instantiation -0.181343
shootout-xblabla20 x86_64 Compilation 0.134346
shootout-xblabla20 x86_64 Execution 0.399435
shootout-xblabla20 x86_64 Instantiation -0.239339
shootout-xchacha20 x86_64 Compilation 0.112990
shootout-xchacha20 x86_64 Execution 0.260290
shootout-xchacha20 x86_64 Instantiation -0.229198
spidermonkey x86_64 Compilation 0.353140
spidermonkey x86_64 Execution 0.102301
spidermonkey x86_64 Instantiation 0.782431

@jlb6740 jlb6740 self-assigned this Jul 11, 2022
jlb6740 pushed a commit that referenced this pull request Jul 24, 2023
* Integrate experimental HTTP into wasmtime.

* Reset Cargo.lock

* Switch to bail!, plumb options partially.

* Implement timeouts.

* Remove generated files & wasm, add Makefile

* Remove generated code textfile

* Update crates/wasi-http/Cargo.toml

Co-authored-by: Eduardo de Moura Rodrigues <16357187+eduardomourar@users.noreply.github.com>

* Update crates/wasi-http/Cargo.toml

Co-authored-by: Eduardo de Moura Rodrigues <16357187+eduardomourar@users.noreply.github.com>

* Extract streams from request/response.

* Fix read for len < buffer length.

* Formatting.

* types impl: swap todos for traps

* streams_impl: idioms, and swap todos for traps

* component impl: idioms, swap all unwraps for traps, swap all todos for traps

* http impl: idiom

* Remove an unnecessary mut.

* Remove an unsupported function.

* Switch to the tokio runtime for the HTTP request.

* Add a rust example.

* Update to latest wit definition

* Remove example code.

* wip: start writing a http test...

* finish writing the outbound request example

haven't executed it yet

* better debug output

* wasi-http: some stubs required for rust rewrite of the example

* add wasi_http tests to test-programs

* CI: run the http tests

* Fix some warnings.

* bump new deps to latest releases (#3)

* Add tests for wasi-http to test-programs (#2)

* wip: start writing a http test...

* finish writing the outbound request example

haven't executed it yet

* better debug output

* wasi-http: some stubs required for rust rewrite of the example

* add wasi_http tests to test-programs

* CI: run the http tests

* bump new deps to latest releases

h2 0.3.16
http 0.2.9
mio 0.8.6
openssl 0.10.48
openssl-sys 0.9.83
tokio 1.26.0

---------

Co-authored-by: Brendan Burns <bburns@microsoft.com>

* Update crates/test-programs/tests/http_tests/runtime/wasi_http_tests.rs

* Update crates/test-programs/tests/http_tests/runtime/wasi_http_tests.rs

* Update crates/test-programs/tests/http_tests/runtime/wasi_http_tests.rs

* wasi-http: fix cargo.toml file and publish script to work together (#4)

unfortunately, the publish script doesn't use a proper toml parser (in
order to not have any dependencies), so the whitespace has to be the
trivial expected case.

then, add wasi-http to the list of crates to publish.

* Update crates/test-programs/build.rs

* Switch to rustls

* Cleanups.

* Merge switch to rustls.

* Formatting

* Remove libssl install

* Fix tests.

* Rename wasi-http -> wasmtime-wasi-http

* prtest:full

Conditionalize TLS on riscv64gc.

* prtest:full

Fix formatting, also disable tls on s390x

* prtest:full

Add a path parameter to wit-bindgen, remove symlink.

* prtest:full

Fix tests for places where SSL isn't supported.

* Update crates/wasi-http/Cargo.toml

---------

Co-authored-by: Eduardo de Moura Rodrigues <16357187+eduardomourar@users.noreply.github.com>
Co-authored-by: Pat Hickey <phickey@fastly.com>
Co-authored-by: Pat Hickey <pat@moreproductive.org>
jlb6740 pushed a commit that referenced this pull request Dec 10, 2023
…dealliance#7029)

* Rename `Host*` things to avoid name conflicts with bindings.

* Update to the latest resource-enabled wit files.

* Adapting the code to the new bindings.

* Update wasi-http to the resource-enabled wit deps.

* Start adapting the wasi-http code to the new bindings.

* Make `get_directories` always return new owned handles.

* Simplify the `poll_one` implementation.

* Update the wasi-preview1-component-adapter.

FIXME: temporarily disable wasi-http tests.

Add logging to the cli world, since stderr is now a resource that
can only be claimed once.

* Work around a bug hit by poll-list, fix a bug in poll-one.

* Comment out `test_fd_readwrite_invalid_fd`, which panics now.

* Fix a few FIXMEs.

* Use `.as_ref().trapping_unwrap()` instead of `TrappingUnwrapRef`.

* Use `drop_in_place`.

* Remove `State::with_mut`.

* Remove the `RefCell` around the `State`.

* Update to wit-bindgen 0.12.

* Update wasi-http to use resources for poll and I/O.

This required making incoming-body and outgoing-body resources too, to
work with `push_input_stream_child` and `push_output_stream_child`.

* Re-enable disabled tests, remove logging from the worlds.

* Remove the `poll_list` workarounds that are no longer needed.

* Remove logging from the adapter.

That said, there is no replacement yet, so add a FIXME comment.

* Reenable a test that now passes.

* Remove `.descriptors_mut` and use `with_descriptors_mut` instead.

Replace `.descriptors()` and `.descriptors_mut()` with functions
that take closures, which limits their scope, to prevent them from
invalid aliasing.

* Implement dynamic borrow checking for descriptors.

* Add a cargo-vet audit for wasmtime-wmemcheck.

* Update cargo vet for wit-bindgen 0.12.

* Cut down on duplicate sync/async resource types (#1)

* Allow calling `get-directories` more than once (#2)

For now `Clone` the directories into new descriptor slots as needed.

* Start to lift restriction of stdio only once  (#3)

* Start to lift restriction of stdio only once

This commit adds new `{Stdin,Stdout}Stream` traits which take over the
job of the stdio streams in `WasiCtxBuilder` and `WasiCtx`. These traits
bake in the ability to create a stream at any time to satisfy the API
of `wasi:cli`. The TTY functionality is folded into them as well while I
was at it.

The implementation for stdin is relatively trivial since the stdin
implementation already handles multiple streams reading it. Built-in
impls of the `StdinStream` trait are also provided for helper types in
`preview2::pipe` which resulted in the implementation of
`MemoryInputPipe` being updated to support `Clone` where all clones read
the same original data.

* Get tests building

* Un-ignore now-passing test

* Remove unneeded argument from `WasiCtxBuilder::build`

* Fix tests

* Remove some workarounds

Stdio functions can now be called multiple times.

* If `poll_oneoff` fails part-way through, clean up properly.

Fix the `Drop` implementation for pollables to only drop the pollables
that have been successfully added to the list.

This fixes the poll_oneoff_files failure and removes a FIXME.
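
The `Drop` fix in the last bullet can be illustrated with a small sketch: a fixed-capacity list that tracks how many slots were actually initialized and drops only those. The `Pollable` and `Pollables` types, the capacity of 4, and the drop counter are all made up here; the adapter's real data structures are not shown in this message.

```rust
use std::cell::Cell;
use std::mem::MaybeUninit;
use std::rc::Rc;

/// Hypothetical handle that counts how many times it has been dropped.
struct Pollable {
    drops: Rc<Cell<usize>>,
}

impl Drop for Pollable {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

/// Fixed-capacity list; only the first `len` slots are initialized.
struct Pollables {
    slots: [MaybeUninit<Pollable>; 4],
    len: usize,
}

impl Pollables {
    fn new() -> Self {
        Pollables {
            slots: [
                MaybeUninit::uninit(),
                MaybeUninit::uninit(),
                MaybeUninit::uninit(),
                MaybeUninit::uninit(),
            ],
            len: 0,
        }
    }

    /// Adds a pollable, or hands it back if the list is full.
    fn push(&mut self, p: Pollable) -> Result<(), Pollable> {
        if self.len == self.slots.len() {
            return Err(p);
        }
        self.slots[self.len].write(p);
        self.len += 1;
        Ok(())
    }
}

impl Drop for Pollables {
    fn drop(&mut self) {
        // Drop only the slots that were successfully added; touching the
        // remaining uninitialized slots would be undefined behavior.
        for slot in &mut self.slots[..self.len] {
            unsafe { slot.assume_init_drop() };
        }
    }
}

/// Simulates a poll that stops after `added` pollables were registered
/// and returns how many pollables were dropped during cleanup.
fn simulate_partial_poll(added: usize) -> usize {
    let drops = Rc::new(Cell::new(0));
    {
        let mut list = Pollables::new();
        for _ in 0..added {
            let _ = list.push(Pollable { drops: drops.clone() });
        }
    } // `list` is dropped here
    drops.get()
}

fn main() {
    println!("{}", simulate_partial_poll(2));
}
```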

---------

Co-authored-by: Alex Crichton <alex@alexcrichton.com>
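
The closure-scoped accessors and the dynamic borrow checking mentioned in the list above (`with_descriptors_mut` and friends) could be sketched along these lines. This version uses `RefCell` as the run-time borrow checker, which is an assumption made for brevity and not necessarily how the adapter does it. The `Descriptors` fields and the `try_with_descriptors_mut` helper are likewise hypothetical.

```rust
use std::cell::RefCell;

/// Hypothetical descriptor table.
struct Descriptors {
    fds: Vec<u32>,
}

struct State {
    descriptors: RefCell<Descriptors>,
}

impl State {
    fn new() -> Self {
        State { descriptors: RefCell::new(Descriptors { fds: Vec::new() }) }
    }

    /// Shared access, limited to the duration of the closure.
    fn with_descriptors<R>(&self, f: impl FnOnce(&Descriptors) -> R) -> R {
        let guard = self.descriptors.borrow();
        f(&*guard)
    }

    /// Exclusive access, limited to the duration of the closure.
    /// Panics if the table is already borrowed.
    fn with_descriptors_mut<R>(&self, f: impl FnOnce(&mut Descriptors) -> R) -> R {
        let mut guard = self.descriptors.borrow_mut();
        f(&mut *guard)
    }

    /// Like `with_descriptors_mut`, but reports a conflicting borrow as
    /// `None` instead of panicking.
    fn try_with_descriptors_mut<R>(&self, f: impl FnOnce(&mut Descriptors) -> R) -> Option<R> {
        let mut guard = self.descriptors.try_borrow_mut().ok()?;
        Some(f(&mut *guard))
    }
}

fn main() {
    let state = State::new();
    state.with_descriptors_mut(|d| d.fds.push(3));
    // A mutable borrow nested inside a shared one is detected at run time.
    let nested_ok =
        state.with_descriptors(|_| state.try_with_descriptors_mut(|d| d.fds.push(4)).is_some());
    println!("nested mutable borrow allowed: {}", nested_ok);
}
```

Because the reference only exists inside the closure, callers cannot hold on to it across another call that needs the table, which is the aliasing hazard the accessor functions are meant to rule out.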