Commit
* New zig build asm command to generate assembly for perf analysis.
* Gate the debug trace and trap features behind a compile-time option. Most users probably don't care about these, and they have a non-trivial impact on perf.
* Fix instruction immediates regressing to 32 bytes. When they were changed to be extern structs, the field ordering started to matter. Now there's a comptime check to make sure they're always 16 bytes.
* Embed param/return count in the function instance to avoid an extra pointer hop.
* Minor optimization to use @memset to initialize locals, since the default values are always 0 anyway.
* Some optimization notes for the future.
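The field-ordering problem behind the 16-byte fix can be illustrated with a small sketch. It is written in C rather than Zig because a Zig extern struct follows the C ABI layout, so declaration order determines padding in the same way. The struct names and fields below are hypothetical and are not the project's actual immediate types; the sizes in the comments assume a common 64-bit ABI where `uint64_t` is 8-byte aligned.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layouts for illustration only, not the project's real types. */

/* Padding-heavy order on a typical 64-bit ABI:
 * 1 (kind) + 7 (pad) + 8 (value) + 4 (index) + 4 (tail pad) = 24 bytes. */
struct ImmediateBadOrder {
    uint8_t  kind;
    uint64_t value;
    uint32_t index;
};

/* Largest field first on the same ABI:
 * 8 (value) + 4 (index) + 1 (kind) + 3 (tail pad) = 16 bytes. */
struct ImmediateGoodOrder {
    uint64_t value;
    uint32_t index;
    uint8_t  kind;
};

/* Compile-time guard, playing the role of the comptime check in the commit:
 * the build fails if a later field change grows the struct. */
static_assert(sizeof(struct ImmediateGoodOrder) == 16,
              "immediate must stay 16 bytes");
```

With a guard like this in place, a size regression surfaces as a build error instead of a silent perf loss.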
1 parent 2ff4f4b, commit 5931f4f
Showing 9 changed files with 89 additions and 94 deletions.
@@ -0,0 +1,13 @@
== Failed Optimizations ==

* Giving locals their own stack space separate from values. The idea here was to save
some perf on push/pop of call frames so that we wouldn't have to copy the return values
back to the appropriate place. But since the wasm calling convention is to pass params
via the stack, you'd have to copy them elsewhere anyway, which defeats the point of
the optimization: avoiding copying values around.

* Instruction stream. Instead of having an array of structs that contain opcode + immediates,
have a byte stream of opcodes and immediates where you don't have to pay for the extra memory
of the immediates if you don't need them. But it turns out that a lot of instructions
use immediates anyway, and the overhead of fetching them out of the stream is more
expensive than just paying for the extra cache misses. Overall memory is
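The trade-off in the instruction stream note can be sketched with a toy interpreter, shown here in C. Everything in it is an assumption made for illustration: the two opcodes, the `Instr` layout, and the function names are hypothetical and unrelated to the project's real instruction set. The point is only to show where the extra decode work comes from in the byte-stream form.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy opcodes, hypothetical. */
enum { OP_CONST = 0, OP_ADD = 1 };

/* Layout A: one fixed-size struct per instruction.
 * The immediate slot is paid for even when the opcode ignores it. */
typedef struct {
    uint8_t opcode;
    int32_t imm; /* unused by OP_ADD */
} Instr;

/* Array of structs: fetching an immediate is a plain aligned field load.
 * Toy only: no stack bounds checks, and the program must be well-formed. */
static int32_t run_structs(const Instr *code, size_t count) {
    int32_t stack[16];
    size_t sp = 0;
    for (size_t pc = 0; pc < count; pc++) {
        switch (code[pc].opcode) {
        case OP_CONST: stack[sp++] = code[pc].imm; break;
        case OP_ADD:   sp--; stack[sp - 1] += stack[sp]; break;
        }
    }
    return stack[sp - 1];
}

/* Layout B: packed byte stream, opcode byte followed by an optional immediate.
 * Smaller in memory, but each immediate needs a possibly unaligned copy
 * and a variable-length advance of the program counter. */
static int32_t run_stream(const uint8_t *bytes, size_t len) {
    int32_t stack[16];
    size_t sp = 0;
    size_t pc = 0;
    while (pc < len) {
        uint8_t opcode = bytes[pc++];
        switch (opcode) {
        case OP_CONST: {
            int32_t imm;
            memcpy(&imm, bytes + pc, sizeof imm); /* decode the immediate */
            pc += sizeof imm;
            stack[sp++] = imm;
            break;
        }
        case OP_ADD: sp--; stack[sp - 1] += stack[sp]; break;
        }
    }
    return stack[sp - 1];
}
```

For a program that pushes two constants and adds them, the packed stream takes 11 bytes, while three `Instr` structs take 24 bytes on a typical ABI where `sizeof(Instr)` is 8. The saving shrinks as more instructions carry immediates, and every immediate fetch in the stream costs the extra decode step.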