VIP: Cost and usability Improvements to internal function calls #901

maurelian · 2018-06-16T19:45:22Z

Preamble

VIP: <to be assigned>
Title: reduce the cost and difficulty of code reuse in common situations
Author: maurelian
Status: Draft
Created: 2018-06-16
Requires (*optional): <VIP number(s)>
Replaces (*optional): <VIP number(s)>

Simple Summary

I shouldn't need to copy and paste the same code snippets into 5 functions to valid authorization.
I shouldn't feel like copying and pasting code is more efficient or cheaper.

Abstract

In order to call a within the same contract function, Vyper uses the CALL opcode, which send a new message to call functions within the same contract. This has some nice safety benefits.

Unfortunately it will result in developers using anti-patterns to save on gas, or access msg.sender in a function call.

Motivation

The benefit of using CALL to access code in the same contract is to create a new execution context, with no risk of side effects from memory access. I admit that I find this to be quite an elegant use of the EVM for safety.

Problems with using `CALL`

Unfortunately, there are two major issues with the use of the CALL opcode for calling functions within the contract:

1. Environment values change unexpectedly

At least two important environment opcodes will return different values: CALLER (ie. msg.sender) and CALLVALUE (ie. msg.value). This complicates the use of functions for permission checking. The following is a naive vyper translation of the common "ownable" pattern for a simple storage solidity contract.

owner: address
value: public(int128)

@public
def __init__():
    self.owner = msg.sender

@private
def checkOwner():
  assert msg.sender == self.owner # this will REVERT every time

@public
def writeValue(_value: int128):
    self.checkOwner()
    self.value = _value

2. Gas costs disincentivizes code reuse

The CALL opcode costs 700 gas (+ input data costs + function dispatching again). By contrast Solidity uses JUMP (2 gas) to call a function.

This imposes a large penalty on code reuse. Regardless Vyper's design philosophy favoring safety over gas efficiency, developers will respond to this incentive by simply copying and pasting boilerplate code. This is dangerous, and difficult to audit.

Specification:

I'm sorry, I don't have one specification. Below are some options I see for mitigating these issues.

From the Motivation section above, we have 3 parameters along which to analyze approaches to calling a function in the EVM.

Gas cost
Preservation of env variables
Safety benefits

1. Use `DELEGATECALL` instead of `CALL`

Gas cost: NEUTRAL (same as using CALL)
Env variables: GOOD (gives us back CALLER and CALLVALUE)
Memory Safety: GOOD. It does indeed reset memory.

Seems like a decent solution!

~~3. Memory Safety: BAD (executes the code in the same memory context, nullifying reasons for using CALL in the first place)~~ ~I proposed this earlier, but I don't think this is a good solution. ~

2. Just use `JUMP` like Solidity

Gas cost: GOOD (Much lower gas cost)
Env variables: GOOD (Preserves CALLER and CALLVALUE)
Memory Safety: BAD (Greater risk of compiler bugs)

This is worth considering.
See also #330.

3. Implement a safer analog to Solidity's modifiers

These could have significant restrictions on them, like no access to MSTORE or SSTORE, and not accepting arguments. I assume this would be done by with JUMP, or just inlining the same code each time it's needed.

Gas cost: GOOD (Lower execution cost, but might increase size of contract if inlining is used)
Env variables: GOOD (Preserves CALLER and CALLVALUE)
Memory Safety: Depends on implementation

4. Create special variables which persist across message calls

Built-in vars like caller and callvalue can be created, which are magically passed to a function in a message call.

My code above would thus be:

@private
def checkOwner():
  assert _sender == self.owner # <- replaced msg.sender with a magical global variable `sender`

@public
def writeValue(_value: int128):
    self.checkOwner)
    self.value = _value

This feels a bit funky to me. Probably would be hard to implement safely, and obscures the true functioning of the EVM.

5. Status Quo

My problem in the code sample above can be addressed by passing the values I need as arguments:

@private
def checkOwner(_sender: address):
  assert _sender == self.owner # <- replaced msg.sender with _sender

@public
def writeValue(_value: int128):
    self.checkOwner(msg.sender) # <- pass msg.sender as an argument 
    self.value = _value

Maybe that's OK? But it requires a deeper understanding of how both the EVM and Vyper work.

Backwards Compatibility

Dependent on approach.

Copyright

Copyright and related rights waived via CC0

The text was updated successfully, but these errors were encountered:

haydenadams · 2018-06-16T21:36:51Z

I do find myself avoiding internal calls in favor of reusing the same code in multiple places to save gas, which is probably not ideal. I don't personally know the tradeoffs well but If this can be done safely I'm all for it.

ben-kaufman · 2018-06-17T06:18:12Z

I get your point but I don't fully agree with that. Unlike other software, smart contracts' code must be explicit and easily readable. In most other cases, it's more important to not repeat yourself in the code but in smart contracts, I think readability of the function should be more important.
In the example about you are trying to mitigate the behavior of modifiers which Vyper intentionally excluded.

Modifiers - e.g. in Solidity you can do function foo() mod1 { ... }, where mod1 can be defined
elsewhere in the code to include a check that is done before execution, a check that is done after
execution, some state changes, or possibly other things. Vyper does not have this, because it makes > it too easy to write misleading code. mod1 just looks too innocuous for something that could add
arbitrary pre-conditions, post-conditions or state changes. Also, it encourages people to write code > where the execution jumps around the file, harming auditability. The usual use case for a modifier is > something that performs a single check before execution of a program; our recommendation is to
simply inline these checks as asserts.
This is the reason Vyper don't have modifiers like Solidity in the first place.

I am in favor of your fourth proposal of creating the caller and callvalue.

I would like to hear more opinions on that as code repetition is an insecure thing as well so I'm not completely sure what's more important yet, but I believe there's a room for improvement.

haydenadams · 2018-06-17T06:46:24Z

Idk if I agree that internal function calls have the same readability issues as Solidity modifiers. Also code repetition can feel very verbose and reduce readability. If you need to do a square root in five different functions, its probably bad to have the sqrt() implemented five times. But people will be tempted to do so if they save 1000 gas per function call. Discouragement through high gas cost makes sense on a protocol level (ie making sstore exepensive) but feels wrong for smart contract languages.

To be clear I don't think

Environment values change unexpectedly
At least two important environment opcodes will return different values: CALLER (ie. msg.sender) and CALLVALUE (ie. msg.value).

is a big problem. I do however think

Gas costs disincentivizes code reuse
The CALL opcode costs 700 gas (+ input data costs + function dispatching again). By contrast Solidity uses JUMP (2 gas) to call a function.

is bad. If using JUMP for internal calls can be done as safely as CALL, it should be used. I don't know the security tradeoffs at all, hopefully someone who does can chime in here!

maurelian · 2018-06-17T13:41:32Z

It's not only about readability, but maintainability as well. More mistakes will be made if 5 pieces of code have to change.

Take a look at these @private utilities I wrote for my ERC721 translation:

https://github.com/maurelian/ethereum-erc721/blob/vyperTranslation/contracts/tokens/NFToken.v.py#L102-L120

There are a total of 12 lines of code in the bodies, which can be written once, then reused 3 times in the standard. If they were written 3 times over, it would be very hard (for my brain at least) to read carefully and catch any differences between them.

fubuloubu · 2018-06-17T17:12:49Z

6th option?: an optimization function that will perform a JUMP instead of a CALL if it is able to be validated that the code jump cannot access anything outside of the called function. This would limit functions that make external calls from having advanced memory access while allowing safe internal calls to reduced their gas count. This will also incentivize isolation/reduction of external calls to improve gas costs.

maurelian · 2018-06-17T21:52:06Z

6th option?: an optimization function that will perform a JUMP instead of a CALL if it is able to be validated that the code jump cannot access anything outside of the called function.

This optimization means that source code using msg.sender would behave differently based on a decision by the compiler. It might work if the developer could specify a flag (@memsafe, @pure ?), indicating they would like to use JUMP and the compiler makes a check to ensure it's safe.

maurelian · 2018-06-17T22:00:19Z

To be clear I don't think "Environment values change unexpectedly" is a big problem.

@haydenadams No? IMO, the current behaviour of this code is highly non-intuitive, and will make a lot of common operations much more difficult.

@public
def getCaller() -> address:
    return msg.sender
    
@public
def alsoGetCaller() -> address:
    return self.getCaller() # always returns the contract's own address!

haydenadams · 2018-06-18T01:12:08Z

@maurelian

IMO, the current behaviour of this code is highly non-intuitive, and will make a lot of common operations much more difficult.

Whatever address calls a function is msg.sender within the scope of that function. That is the only rule right now. Its not a massive leap to realize that extends to self for internal calls. When a public function is called I don't think it should react differently if its being called by itself, another smart contract, or an external account.

"on the internet nobody knows you're a dog" -> "on Ethereum nobody knows you're ̶a̶ ̶s̶m̶a̶r̶t̶ ̶c̶o̶n̶t̶r̶a̶c̶t̶ yourself"

With account abstraction the divide between contracts and accounts will go from negligible to nonexistent. Contracts will get more complicated but also more versatile. Its pretty important that smart contract devs understand Ethereum account structure. I would prioritize simplicity of logic over what feels immediately intuitive here. Totally understand where you're coming from though!

This seems reasonable, and personally I like it:

@private
def checkOwner(_sender: address):
  assert _sender == self.owner # <- replaced msg.sender with _sender

@public
def writeValue(_value: int128):
    self.checkOwner(msg.sender) # <- pass msg.sender as an argument 
    self.value = _value

jacqueswww · 2018-06-19T11:27:29Z

We briefly discussed this in the bi-weekly meeting yesterday, but left it as unresolved. As it's quite a significant change, we have to study all the options carefully ;)

maurelian · 2018-06-19T13:40:00Z

As it's quite a significant change, we have to study all the options carefully ;)

@jacqueswww I certainly agree.

maurelian · 2018-06-19T13:54:06Z

@haydenadams it's straightforward to make small changes to my little example to make it work, but it's always going to be confusing.

Here's an other summation:

IF: a contract can call its own @public functions;
THEN: msg.sender is pretty much useless as a builtin variable, because it means something entirely different when called externally vs. internally. It's only useful when passed as an argument.

If the Vyper team decides to move towards the external and internal visibilities as in solidity, this would be fine. (added to OP)

Finally, from the Auditability principle:

Simplicity for the reader is more important than simplicity for the writer, and simplicity for readers with low prior experience with Vyper (and low prior experience with programming in general) is particularly important.

Outside of project contributors, requiring this level of nuance is far from simple for anyone reading code.

jacqueswww · 2018-06-19T14:00:09Z

I agree that it's confusing that msg.sender changes, and will be a major annoyance going forward.
Short/little fixes I will investigate:
1.) disable accessing msg.sender or msg.value in private functions.
2.) Investigate using DELETEGATECALL -> I was under the impression that this also resets the VM.

maurelian · 2018-06-19T18:02:59Z

2.) Investigate using DELETEGATECALL -> I was under the impression that this also resets the VM.

Actually, I think you are right! I tested with this code in remix:

library Lib {
	function foo(uint a) returns(uint) { }
}

contract B {
    using Lib for uint;	
    function bar() returns(uint){
	    uint a = 1;
	    return a.foo();
	}
}

If you call B.bar(), and look at the debugger, after the DELEGATECALL instruction memory is reset.
My bad. I'll update the VIP.

haydenadams · 2018-06-20T04:24:22Z

@jacqueswww

1.) disable accessing msg.sender or msg.value in private functions.

I like this. It gets rid of most of the confusion, and they're both useless in private functions anyway.

fubuloubu · 2018-06-20T04:31:35Z

@haydenadams I think "useless" might be an oversimplification, but it forces a specific design behavior that ultimately is very practical in reducing your attack surface. Basically, private functions should do very specific internal behaviors, preferably dealing only with internal state.

maurelian · 2018-06-20T20:53:47Z

1.) disable accessing msg.sender or msg.value in private functions.
2.) Investigate using DELETEGATECALL -> I was under the impression that this also resets the VM.

To be clear, I'm in favor of either but not both. Preference for #2.

jacqueswww · 2018-06-20T21:08:50Z

I am also leaning towards delegate call.

But definitely feel we might need a more gas efficient dispatcher+memory stack post beta.

fubuloubu · 2018-06-20T21:11:10Z

Would be a user expectation breaking change
So is 2.

More gas efficient dispatcher can be done without breaking expectations if we make the right move now.

Which one makes it easier to implement gas optimization on the dispatcher?

haydenadams · 2018-06-21T04:48:07Z

If switching to delegatecall mean msg.sender in internal calls to public functions will no longer return the contracts own address I'm strongly against it.

fulldecent · 2018-06-21T14:17:08Z

To better understand and make a decision here, it will be helpful if the VIP will include discussion of which stage of the compilation process we are changing. Also a discussion of use cases at that stage of compilation would help me.

Also, if you can give a brief overview of the compilation strategy for newcomers this would help them join the discussion.

maurelian · 2018-06-21T15:20:47Z

@fulldecent and everyone: Looking again, I think it makes sense to reduce the scope of this VIP. A better title might be "Cost and usability Improvements to internal function calls". 👍/ 👎?

Code reuse in general, (ie. enabling imports and something like inheritance/composability) is probably another much needed VIP, or perhaps a few.

fulldecent · 2018-06-21T15:26:45Z

Or "New feature: call a function without using EVM CALL"

fubuloubu · 2018-06-21T15:41:25Z

I think this is modification to existing behavior, not a new feature proposal.

We already have a composibility VIP I believe, as well as one for importing interfaces

maurelian · 2018-07-01T23:38:47Z

In the mid to longer term, I can imagine an EIP to significantly reducing the cost of a contract to CALL or DELEGATECALL its own address might be well received.

fulldecent · 2018-07-01T23:44:03Z

I would vote against such an EIP. Call-to-self has extremely limited utility.

haydenadams · 2018-07-01T23:58:01Z

I believe the issue is that even calls to @private functions use CALL, which is very gas heavy compared to Solidity using JUMP. I believe this is for safety reasons but unfortunately discourages the use of @private functions.

@fubuloubu's response earlier in the thread:

6th option?: an optimization function that will perform a JUMP instead of a CALL if it is able to be validated that the code jump cannot access anything outside of the called function. This would limit functions that make external calls from having advanced memory access while allowing safe internal calls to reduced their gas count. This will also incentivize isolation/reduction of external calls to improve gas costs.

seems like something worth pursuing.

fubuloubu · 2018-07-24T19:58:08Z

is hard, but if we do 2) now then we can drop it in in a minor release without breaking user expectations, which is a bigger problem IMO

haydenadams · 2018-07-24T22:37:10Z

could also be done by adding a new variable msg.caller which is always equal to the calling contract value, and leaving msg.sender as is.

msg.caller could be best practice for most uses, but msg.sender would be available if its needed to determine calling context. Also this would allow msg.sender to function the same between Solidity and Vyper which I think would eliminate a lot of confusion for devs migrating over from Solidity.

fulldecent · 2018-07-28T08:43:51Z

Will's Plan 2 -- The Expensive Reusability Manifesto

This plan is extremely expensive in gas. It is useful only for academic purposes. However it allows code reusability and very easy formal auditing.

At every function call do SSTORE(ffff, msg.sender) if it is not already set, and unset before the function ends
Create a read-only variable to access this information
Continue to use external message calls

To be clear, this is joke plan. It illustrates why we should proceed with Will's Plan 1.

jacqueswww · 2018-07-28T10:47:27Z

@fulldecent there is no need to do it like the above. One can just use the first 20 bytes of the internal calldata to send through the msg.caller, this would be a very minimal cost.

Also we can make it even more effecient, by only passing the data through if msg.caller is actually referenced in the function.

jacqueswww · 2018-07-28T10:51:08Z

I think this whole discussion also highlights the fact that we should start doing some example gas cost comparisons to solidity.
I know for instance that vyper might seem very ineffecient, and this a good place we can improve. But some things are not as bad as they seem - good example our arithmetic has overflow checking built in, and solidity uses internal calls to do the same to SafeMath.

@ben-kaufman @fubuloubu @haydenadams @haydenadams What example contracts would you recommend? ERC20 & ERC721 seem like good options?

fulldecent · 2018-07-28T12:14:34Z

If you can provide me with an all-best-practices-followed example Vyper project repository (including test cases) then I can get my team to implement 721 for this experiment!

jacqueswww · 2018-08-14T15:00:54Z

Starting with a proof of concept of using a jump table, just so you guys know if you don't see me. 👅

fubuloubu · 2018-08-14T20:32:54Z

"Where'd Jacques go?"

"Oh yeah, he's jumping tables now"

haydenadams · 2018-08-21T02:21:49Z

@jacqueswww What was the decision here? You mentioned msg.caller in one of your replies. Was that a typo or are you doing the msg.sender and msg.caller plan?

jacqueswww · 2018-08-21T14:00:08Z

@haydenadams currently I am only working on 1.) Using jumps to improve gas efficiency of internal calls. I decided this was a worthy task, as unfortunately smart contract development is very gas-efficiency dependant at this stage, at the benefits would be quite significant.

{tl;dr: 2.) has not been fully decided yet.} With regards to msg.sender if I just implement the jumps, msg.sender will match solidity in it's behaviour (of internal functions ) ie. msg.sender will be the top public function's calling account address. However if there is consensus that we should use msg.sender == the contract address and msg.caller == original caller (Basically sender address of the public function), this would be simple enough to change.

FWIW I do think following solidity here might be the safest option, it is really confusing for developers coming from solidity to have msg.sender be something different. Likewise a first time user coming from vyper, and going to solidity will be confused.

fulldecent · 2018-08-26T22:35:49Z

I can support copying Solidity behavior.

I can also support using words from Yellow Paper -- ORIGIN, CALLER.

fubuloubu · 2018-08-27T04:08:40Z

msg.origin?

maurelian · 2018-09-05T23:57:59Z

FYI: https://ethereum-magicians.org/t/eip-1380-reduced-gas-cost-for-call-to-self/1242

jacqueswww · 2018-09-06T00:04:15Z

Ha @maurelian! You are such a sniper, I was waiting until the EIP was merged as a draft.

So basically this EIP resulted directly from discussing #901, having the EIP accepted will both benefit solidity and vyper :)

maurelian · 2018-09-06T00:15:03Z

@fulldecent you expressed opposition to a hypothetical gas cost reductions on calls to self. The community would benefit from hearing your concerns here.

fulldecent · 2018-09-06T00:46:36Z

@maurelian Thank you for the ping. I have reviewed and added all my comments to https://ethereum-magicians.org/t/eip-1380-reduced-gas-cost-for-call-to-self/1242/3

jacqueswww mentioned this issue Jun 18, 2018

Meeting 18th June 2018 #907

Closed

8 tasks

jacqueswww added the VIP: Discussion Used to denote VIPs and more complex issues that are waiting discussion in a meeting label Jun 19, 2018

jacqueswww mentioned this issue Jun 19, 2018

Fake return values from stack. #330

Closed

maurelian changed the title ~~VIP: reduce the cost and difficulty of code reuse in common situations~~ VIP: Cost and usability Improvements to internal function calls Jul 1, 2018

jacqueswww mentioned this issue Jul 30, 2018

Meeting 30th July 2018 #963

Closed

jacqueswww mentioned this issue Aug 9, 2018

Allow default parameters #903

Closed

jacqueswww added the VIP: Approved VIP Approved label Aug 14, 2018

haydenadams mentioned this issue Aug 20, 2018

First phase reason string support #889

Closed

jacqueswww mentioned this issue Sep 6, 2018

Less gas for private functions #1012

Merged

haydenadams mentioned this issue Sep 24, 2018

Bug: Unit types not always enforced in private calls from default function #1031

Closed

jacqueswww closed this as completed in #1012 Sep 27, 2018

jacqueswww mentioned this issue Oct 1, 2018

Meeting Monday 1 October 2018 #1038

Closed

haydenadams mentioned this issue Nov 21, 2018

"Invalid Jump Destination" #1091

Closed

charles-cooper mentioned this issue Jan 15, 2019

VIP: Restrict usage of msg fields in public functions #1199

Closed

charles-cooper mentioned this issue Feb 6, 2022

VIP: re-enable msg.* environment variables in internal functions #2631

Closed

VIP: Cost and usability Improvements to internal function calls #901

VIP: Cost and usability Improvements to internal function calls #901

Comments

maurelian commented Jun 16, 2018 • edited Loading

Preamble

Simple Summary

Abstract

Motivation

Problems with using CALL

1. Environment values change unexpectedly

2. Gas costs disincentivizes code reuse

Specification:

1. Use DELEGATECALL instead of CALL

2. Just use JUMP like Solidity

3. Implement a safer analog to Solidity's modifiers

4. Create special variables which persist across message calls

5. Status Quo

Backwards Compatibility

Copyright

haydenadams commented Jun 16, 2018

ben-kaufman commented Jun 17, 2018 • edited Loading

haydenadams commented Jun 17, 2018 • edited Loading

maurelian commented Jun 17, 2018

fubuloubu commented Jun 17, 2018

maurelian commented Jun 17, 2018

maurelian commented Jun 17, 2018 • edited Loading

haydenadams commented Jun 18, 2018

jacqueswww commented Jun 19, 2018

maurelian commented Jun 19, 2018

maurelian commented Jun 19, 2018 • edited Loading

jacqueswww commented Jun 19, 2018 • edited Loading

maurelian commented Jun 19, 2018 • edited Loading

haydenadams commented Jun 20, 2018

fubuloubu commented Jun 20, 2018

maurelian commented Jun 20, 2018

jacqueswww commented Jun 20, 2018

fubuloubu commented Jun 20, 2018

haydenadams commented Jun 21, 2018

fulldecent commented Jun 21, 2018

maurelian commented Jun 21, 2018

fulldecent commented Jun 21, 2018 • edited Loading

fubuloubu commented Jun 21, 2018

maurelian commented Jul 1, 2018

fulldecent commented Jul 1, 2018

haydenadams commented Jul 1, 2018 • edited Loading

fubuloubu commented Jul 24, 2018

haydenadams commented Jul 24, 2018

fulldecent commented Jul 28, 2018 • edited Loading

Will's Plan 2 -- The Expensive Reusability Manifesto

jacqueswww commented Jul 28, 2018 • edited Loading

jacqueswww commented Jul 28, 2018 • edited Loading

fulldecent commented Jul 28, 2018

jacqueswww commented Aug 14, 2018 • edited Loading

fubuloubu commented Aug 14, 2018

haydenadams commented Aug 21, 2018

jacqueswww commented Aug 21, 2018 • edited Loading

fulldecent commented Aug 26, 2018

fubuloubu commented Aug 27, 2018

maurelian commented Sep 5, 2018

jacqueswww commented Sep 6, 2018 • edited Loading

maurelian commented Sep 6, 2018

fulldecent commented Sep 6, 2018

maurelian commented Jun 16, 2018 •

edited

Loading

Problems with using `CALL`

1. Use `DELEGATECALL` instead of `CALL`

2. Just use `JUMP` like Solidity

ben-kaufman commented Jun 17, 2018 •

edited

Loading

haydenadams commented Jun 17, 2018 •

edited

Loading

maurelian commented Jun 17, 2018 •

edited

Loading

maurelian commented Jun 19, 2018 •

edited

Loading

jacqueswww commented Jun 19, 2018 •

edited

Loading

maurelian commented Jun 19, 2018 •

edited

Loading

fulldecent commented Jun 21, 2018 •

edited

Loading

haydenadams commented Jul 1, 2018 •

edited

Loading

fulldecent commented Jul 28, 2018 •

edited

Loading

jacqueswww commented Jul 28, 2018 •

edited

Loading

jacqueswww commented Jul 28, 2018 •

edited

Loading

jacqueswww commented Aug 14, 2018 •

edited

Loading

jacqueswww commented Aug 21, 2018 •

edited

Loading

jacqueswww commented Sep 6, 2018 •

edited

Loading