Wrap pushing and popping of locals into a loop. #1486
I think this is a good PR. The overhead of each loop iteration is about 15 gas (eyeballing), so I think it would be best to unroll the loop so that the amortized overhead per word is only 1-3 gas. That suggests a loop unroll size of about 8 words. EDIT: for context, see https://en.wikipedia.org/wiki/Loop_unrolling#Simple_manual_example_in_C
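As a rough check on that estimate, the amortized per-word overhead can be tabulated for a few unroll sizes. This is only a back-of-the-envelope sketch: the 15 gas figure is the eyeballed number from the comment above, not a measurement, and the function name is made up for illustration.

```python
# Eyeballed per-iteration loop overhead from the comment above (an estimate,
# not a measured value).
LOOP_OVERHEAD_GAS = 15


def amortized_overhead(unroll_size: int) -> float:
    """Loop overhead per word when each iteration handles `unroll_size` words."""
    return LOOP_OVERHEAD_GAS / unroll_size


if __name__ == "__main__":
    for size in (1, 2, 4, 8, 16):
        print(f"unroll={size:2d}  overhead per word ~ {amortized_overhead(size):.2f} gas")
```

At an unroll size of 8 this gives 15 / 8 = 1.875 gas per word, which sits inside the 1-3 gas target.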
Co-Authored-By: Charles Cooper <cooper.charles.m@gmail.com>
@siraben I got the loop unrolling to work (https://github.com/siraben/vyper/pull/1/files) but I'm not sure it's worth the extra complexity. It does simplify to your loop in the case that UNROLL_LOOP_SIZE == 1, and to the fully unrolled code (like the current code) in the case that UNROLL_LOOP_SIZE is much larger than the number of items.
I also looked into a few other optimizations, but they may require some architectural changes, so maybe we can explore them later. I am recording them here for future reference. The main things I looked into were a faster if statement and putting the loop index in the stack instead of in memory. This results in fewer instructions, but requires some working around how LLL interprets
(seq
  (mstore 0 137)
  (mstore 32 138)
  (0) ; set mload_pos 0
  (label save_locals_start_20_11)
  (mload (dup1 pass)) ; load item from memory into stack
  (swap1 pass pass) ; push loaded item further into stack past index
  (add 32 pass) ; mload_pos += 32
  ; if mload_pos != 64: goto label
  (dup1 pass) ; dup mload_pos so next iteration has access
  (goto_if (ne 64 pass) save_locals_start_20_11)
  (pop pass) ; pop mload_pos
)
and here is a manual loop for
Even though it is quite a bit more efficient (save_locals is roughly 9n amortized additional overhead per-item, and restore_locals is 18n(?) amortized additional overhead per-item), and it would be good to have this technique available across the codebase (putting loop variables in the stack instead of memory), it breaks some of the abstraction of LLL so I am hesitant to continue going in that direction. The other technique I looked into was a faster if-statement,
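For readers who want to follow the stack shuffling in the save_locals listing above, here is a small Python model of it. It is a sketch under assumptions, not the compiler's behavior: a list stands in for the EVM stack, a dict for memory, and restore_locals is a hypothetical inverse written for this example, since the restore listing is not shown in the comment.

```python
WORD = 32  # EVM word size in bytes


def save_locals(memory, start, end):
    """Copy the words in memory[start:end] onto a stack, loop index kept on the stack.

    Assumes end > start and that both are multiples of WORD.
    """
    stack = [start]                                  # (0): initial mload_pos
    while True:
        stack.append(memory[stack[-1]])              # mload (dup1): load word at mload_pos
        stack[-1], stack[-2] = stack[-2], stack[-1]  # swap1: bring mload_pos back on top
        stack[-1] += WORD                            # add 32: mload_pos += 32
        if stack[-1] == end:                         # goto_if (ne end mload_pos) falls through
            break
    stack.pop()                                      # pop mload_pos
    return stack


def restore_locals(stack, memory, start, end):
    """Hypothetical inverse: pop words back into memory from the top address down."""
    pos = end
    while pos != start:
        pos -= WORD
        memory[pos] = stack.pop()


memory = {0: 137, 32: 138}
saved = save_locals(memory, 0, 64)
```

With the two words from the listing, `saved` ends up holding 137 and 138 in that order, with the loop index already popped.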
Per a Gitter conversation with @jacqueswww, we should merge this (with a couple of minor requested changes) and explore further optimizations in a later iteration.
What I did
Reduced contract code size, which was inflated by excessive pushing/popping of locals.
How I did it
Changed push_local_vars and pop_local_vars in self_call.py to LLL code that implements a semantically equivalent loop. All the tests pass.
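To illustrate why replacing the unrolled code with a loop shrinks the output, here is a toy size model. It is a sketch under assumptions and not the code in self_call.py: nested Python lists stand in for LLL, the function names are made up, and the loop body borrows the shape of the stack-based listing from the review discussion, which may differ from the merged loop. Atom counts are only a proxy for bytecode size.

```python
def unrolled_save(start, end):
    """One (mload addr) node per 32-byte word: size grows with the frame."""
    return ["seq"] + [["mload", pos] for pos in range(start, end, 32)]


def looped_save(start, end, label="save_locals_start"):
    """Fixed-size loop: only the start and end constants depend on the frame."""
    return [
        "seq",
        start,                                    # initial mload_pos
        ["label", label],
        ["mload", ["dup1", "pass"]],              # load word at mload_pos
        ["swap1", "pass", "pass"],                # move mload_pos back on top
        ["add", 32, "pass"],                      # mload_pos += 32
        ["dup1", "pass"],
        ["goto_if", ["ne", end, "pass"], label],  # loop until mload_pos == end
        ["pop", "pass"],                          # drop mload_pos
    ]


def count_nodes(tree):
    """Count atoms in a nested-list tree as a rough proxy for emitted code size."""
    if not isinstance(tree, list):
        return 1
    return sum(count_nodes(child) for child in tree)
```

In this model the unrolled form costs two atoms per saved word plus one for the enclosing seq, while the looped form stays the same size no matter how many locals are saved.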
How to verify it
Compile the following contract.
In the generated LLL code, there will no longer be large runs of mload and mstore calls when saving/restoring locals.
Old code size (bytes): 55340
New code size (bytes): 39510