Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

VIP: Cost and usability Improvements to internal function calls #901

Closed
maurelian opened this issue Jun 16, 2018 · 52 comments
Closed

VIP: Cost and usability Improvements to internal function calls #901

maurelian opened this issue Jun 16, 2018 · 52 comments
Labels
VIP: Approved VIP Approved VIP: Discussion Used to denote VIPs and more complex issues that are waiting discussion in a meeting

Comments

@maurelian
Copy link
Contributor

maurelian commented Jun 16, 2018

Preamble

VIP: <to be assigned>
Title: reduce the cost and difficulty of code reuse in common situations
Author: maurelian
Status: Draft
Created: 2018-06-16
Requires (*optional): <VIP number(s)>
Replaces (*optional): <VIP number(s)>

Simple Summary

  1. I shouldn't need to copy and paste the same code snippets into 5 functions to valid authorization.
  2. I shouldn't feel like copying and pasting code is more efficient or cheaper.

Abstract

In order to call a within the same contract function, Vyper uses the CALL opcode, which send a new message to call functions within the same contract. This has some nice safety benefits.

Unfortunately it will result in developers using anti-patterns to save on gas, or access msg.sender in a function call.

Motivation

The benefit of using CALL to access code in the same contract is to create a new execution context, with no risk of side effects from memory access. I admit that I find this to be quite an elegant use of the EVM for safety.

Problems with using CALL

Unfortunately, there are two major issues with the use of the CALL opcode for calling functions within the contract:

1. Environment values change unexpectedly

At least two important environment opcodes will return different values: CALLER (ie. msg.sender) and CALLVALUE (ie. msg.value). This complicates the use of functions for permission checking. The following is a naive vyper translation of the common "ownable" pattern for a simple storage solidity contract.

owner: address
value: public(int128)

@public
def __init__():
    self.owner = msg.sender

@private
def checkOwner():
  assert msg.sender == self.owner # this will REVERT every time

@public
def writeValue(_value: int128):
    self.checkOwner()
    self.value = _value

2. Gas costs disincentivizes code reuse

The CALL opcode costs 700 gas (+ input data costs + function dispatching again). By contrast Solidity uses JUMP (2 gas) to call a function.

This imposes a large penalty on code reuse. Regardless Vyper's design philosophy favoring safety over gas efficiency, developers will respond to this incentive by simply copying and pasting boilerplate code. This is dangerous, and difficult to audit.

Specification:

I'm sorry, I don't have one specification. Below are some options I see for mitigating these issues.

From the Motivation section above, we have 3 parameters along which to analyze approaches to calling a function in the EVM.

  1. Gas cost
  2. Preservation of env variables
  3. Safety benefits

1. Use DELEGATECALL instead of CALL

  1. Gas cost: NEUTRAL (same as using CALL)
  2. Env variables: GOOD (gives us back CALLER and CALLVALUE)
  3. Memory Safety: GOOD. It does indeed reset memory.

Seems like a decent solution!

3. Memory Safety: BAD (executes the code in the same memory context, nullifying reasons for using CALL in the first place) ~I proposed this earlier, but I don't think this is a good solution. ~

2. Just use JUMP like Solidity

  1. Gas cost: GOOD (Much lower gas cost)
  2. Env variables: GOOD (Preserves CALLER and CALLVALUE)
  3. Memory Safety: BAD (Greater risk of compiler bugs)

This is worth considering.
See also #330.

3. Implement a safer analog to Solidity's modifiers

These could have significant restrictions on them, like no access to MSTORE or SSTORE, and not accepting arguments. I assume this would be done by with JUMP, or just inlining the same code each time it's needed.

  1. Gas cost: GOOD (Lower execution cost, but might increase size of contract if inlining is used)
  2. Env variables: GOOD (Preserves CALLER and CALLVALUE)
  3. Memory Safety: Depends on implementation

4. Create special variables which persist across message calls

Built-in vars like caller and callvalue can be created, which are magically passed to a function in a message call.

My code above would thus be:

@private
def checkOwner():
  assert _sender == self.owner # <- replaced msg.sender with a magical global variable `sender`

@public
def writeValue(_value: int128):
    self.checkOwner)
    self.value = _value

This feels a bit funky to me. Probably would be hard to implement safely, and obscures the true functioning of the EVM.

5. Status Quo

My problem in the code sample above can be addressed by passing the values I need as arguments:

@private
def checkOwner(_sender: address):
  assert _sender == self.owner # <- replaced msg.sender with _sender

@public
def writeValue(_value: int128):
    self.checkOwner(msg.sender) # <- pass msg.sender as an argument 
    self.value = _value

Maybe that's OK? But it requires a deeper understanding of how both the EVM and Vyper work.

Backwards Compatibility

Dependent on approach.

Copyright

Copyright and related rights waived via CC0

@haydenadams
Copy link

I do find myself avoiding internal calls in favor of reusing the same code in multiple places to save gas, which is probably not ideal. I don't personally know the tradeoffs well but If this can be done safely I'm all for it.

@ben-kaufman
Copy link
Contributor

ben-kaufman commented Jun 17, 2018

I get your point but I don't fully agree with that. Unlike other software, smart contracts' code must be explicit and easily readable. In most other cases, it's more important to not repeat yourself in the code but in smart contracts, I think readability of the function should be more important.
In the example about you are trying to mitigate the behavior of modifiers which Vyper intentionally excluded.

Modifiers - e.g. in Solidity you can do function foo() mod1 { ... }, where mod1 can be defined
elsewhere in the code to include a check that is done before execution, a check that is done after
execution, some state changes, or possibly other things. Vyper does not have this, because it makes > it too easy to write misleading code. mod1 just looks too innocuous for something that could add
arbitrary pre-conditions, post-conditions or state changes. Also, it encourages people to write code > where the execution jumps around the file, harming auditability. The usual use case for a modifier is > something that performs a single check before execution of a program; our recommendation is to
simply inline these checks as asserts.
This is the reason Vyper don't have modifiers like Solidity in the first place.

I am in favor of your fourth proposal of creating the caller and callvalue.

I would like to hear more opinions on that as code repetition is an insecure thing as well so I'm not completely sure what's more important yet, but I believe there's a room for improvement.

@haydenadams
Copy link

haydenadams commented Jun 17, 2018

Idk if I agree that internal function calls have the same readability issues as Solidity modifiers. Also code repetition can feel very verbose and reduce readability. If you need to do a square root in five different functions, its probably bad to have the sqrt() implemented five times. But people will be tempted to do so if they save 1000 gas per function call. Discouragement through high gas cost makes sense on a protocol level (ie making sstore exepensive) but feels wrong for smart contract languages.

To be clear I don't think

  1. Environment values change unexpectedly
    At least two important environment opcodes will return different values: CALLER (ie. msg.sender) and CALLVALUE (ie. msg.value).

is a big problem. I do however think

  1. Gas costs disincentivizes code reuse
    The CALL opcode costs 700 gas (+ input data costs + function dispatching again). By contrast Solidity uses JUMP (2 gas) to call a function.

is bad. If using JUMP for internal calls can be done as safely as CALL, it should be used. I don't know the security tradeoffs at all, hopefully someone who does can chime in here!

@maurelian
Copy link
Contributor Author

It's not only about readability, but maintainability as well. More mistakes will be made if 5 pieces of code have to change.

Take a look at these @private utilities I wrote for my ERC721 translation:

https://github.com/maurelian/ethereum-erc721/blob/vyperTranslation/contracts/tokens/NFToken.v.py#L102-L120

There are a total of 12 lines of code in the bodies, which can be written once, then reused 3 times in the standard. If they were written 3 times over, it would be very hard (for my brain at least) to read carefully and catch any differences between them.

@fubuloubu
Copy link
Member

6th option?: an optimization function that will perform a JUMP instead of a CALL if it is able to be validated that the code jump cannot access anything outside of the called function. This would limit functions that make external calls from having advanced memory access while allowing safe internal calls to reduced their gas count. This will also incentivize isolation/reduction of external calls to improve gas costs.

@maurelian
Copy link
Contributor Author

6th option?: an optimization function that will perform a JUMP instead of a CALL if it is able to be validated that the code jump cannot access anything outside of the called function.

This optimization means that source code using msg.sender would behave differently based on a decision by the compiler. It might work if the developer could specify a flag (@memsafe, @pure ?), indicating they would like to use JUMP and the compiler makes a check to ensure it's safe.

@maurelian
Copy link
Contributor Author

maurelian commented Jun 17, 2018

To be clear I don't think "Environment values change unexpectedly" is a big problem.

@haydenadams No? IMO, the current behaviour of this code is highly non-intuitive, and will make a lot of common operations much more difficult.

@public
def getCaller() -> address:
    return msg.sender
    
@public
def alsoGetCaller() -> address:
    return self.getCaller() # always returns the contract's own address!

@haydenadams
Copy link

@maurelian

IMO, the current behaviour of this code is highly non-intuitive, and will make a lot of common operations much more difficult.

Whatever address calls a function is msg.sender within the scope of that function. That is the only rule right now. Its not a massive leap to realize that extends to self for internal calls. When a public function is called I don't think it should react differently if its being called by itself, another smart contract, or an external account.

"on the internet nobody knows you're a dog" -> "on Ethereum nobody knows you're ̶a̶ ̶s̶m̶a̶r̶t̶ ̶c̶o̶n̶t̶r̶a̶c̶t̶ yourself"

With account abstraction the divide between contracts and accounts will go from negligible to nonexistent. Contracts will get more complicated but also more versatile. Its pretty important that smart contract devs understand Ethereum account structure. I would prioritize simplicity of logic over what feels immediately intuitive here. Totally understand where you're coming from though!

This seems reasonable, and personally I like it:

@private
def checkOwner(_sender: address):
  assert _sender == self.owner # <- replaced msg.sender with _sender

@public
def writeValue(_value: int128):
    self.checkOwner(msg.sender) # <- pass msg.sender as an argument 
    self.value = _value

@jacqueswww jacqueswww mentioned this issue Jun 18, 2018
8 tasks
@jacqueswww jacqueswww added the VIP: Discussion Used to denote VIPs and more complex issues that are waiting discussion in a meeting label Jun 19, 2018
@jacqueswww
Copy link
Contributor

We briefly discussed this in the bi-weekly meeting yesterday, but left it as unresolved. As it's quite a significant change, we have to study all the options carefully ;)

@maurelian
Copy link
Contributor Author

As it's quite a significant change, we have to study all the options carefully ;)

@jacqueswww I certainly agree.

@maurelian
Copy link
Contributor Author

maurelian commented Jun 19, 2018

@haydenadams it's straightforward to make small changes to my little example to make it work, but it's always going to be confusing.

Here's an other summation:

IF: a contract can call its own @public functions;
THEN: msg.sender is pretty much useless as a builtin variable, because it means something entirely different when called externally vs. internally. It's only useful when passed as an argument.

If the Vyper team decides to move towards the external and internal visibilities as in solidity, this would be fine. (added to OP)

Finally, from the Auditability principle:

Simplicity for the reader is more important than simplicity for the writer, and simplicity for readers with low prior experience with Vyper (and low prior experience with programming in general) is particularly important.

Outside of project contributors, requiring this level of nuance is far from simple for anyone reading code.

@jacqueswww
Copy link
Contributor

jacqueswww commented Jun 19, 2018

I agree that it's confusing that msg.sender changes, and will be a major annoyance going forward.
Short/little fixes I will investigate:
1.) disable accessing msg.sender or msg.value in private functions.
2.) Investigate using DELETEGATECALL -> I was under the impression that this also resets the VM.

@maurelian
Copy link
Contributor Author

maurelian commented Jun 19, 2018

2.) Investigate using DELETEGATECALL -> I was under the impression that this also resets the VM.

Actually, I think you are right! I tested with this code in remix:

library Lib {
	function foo(uint a) returns(uint) { }
}

contract B {
    using Lib for uint;	
    function bar() returns(uint){
	    uint a = 1;
	    return a.foo();
	}
}

If you call B.bar(), and look at the debugger, after the DELEGATECALL instruction memory is reset.
My bad. I'll update the VIP.

@haydenadams
Copy link

@jacqueswww

1.) disable accessing msg.sender or msg.value in private functions.

I like this. It gets rid of most of the confusion, and they're both useless in private functions anyway.

@fubuloubu
Copy link
Member

@haydenadams I think "useless" might be an oversimplification, but it forces a specific design behavior that ultimately is very practical in reducing your attack surface. Basically, private functions should do very specific internal behaviors, preferably dealing only with internal state.

@maurelian
Copy link
Contributor Author

1.) disable accessing msg.sender or msg.value in private functions.
2.) Investigate using DELETEGATECALL -> I was under the impression that this also resets the VM.

To be clear, I'm in favor of either but not both. Preference for #2.

@jacqueswww
Copy link
Contributor

I am also leaning towards delegate call.

But definitely feel we might need a more gas efficient dispatcher+memory stack post beta.

@fubuloubu
Copy link
Member

  1. Would be a user expectation breaking change
    So is 2.

More gas efficient dispatcher can be done without breaking expectations if we make the right move now.

Which one makes it easier to implement gas optimization on the dispatcher?

@haydenadams
Copy link

If switching to delegatecall mean msg.sender in internal calls to public functions will no longer return the contracts own address I'm strongly against it.

@fulldecent
Copy link
Contributor

To better understand and make a decision here, it will be helpful if the VIP will include discussion of which stage of the compilation process we are changing. Also a discussion of use cases at that stage of compilation would help me.

Also, if you can give a brief overview of the compilation strategy for newcomers this would help them join the discussion.

@maurelian
Copy link
Contributor Author

@fulldecent and everyone: Looking again, I think it makes sense to reduce the scope of this VIP. A better title might be "Cost and usability Improvements to internal function calls". 👍/ 👎?

Code reuse in general, (ie. enabling imports and something like inheritance/composability) is probably another much needed VIP, or perhaps a few.

@fulldecent
Copy link
Contributor

fulldecent commented Jun 21, 2018

Or "New feature: call a function without using EVM CALL"

@fubuloubu
Copy link
Member

I think this is modification to existing behavior, not a new feature proposal.

We already have a composibility VIP I believe, as well as one for importing interfaces

@maurelian maurelian changed the title VIP: reduce the cost and difficulty of code reuse in common situations VIP: Cost and usability Improvements to internal function calls Jul 1, 2018
@maurelian
Copy link
Contributor Author

In the mid to longer term, I can imagine an EIP to significantly reducing the cost of a contract to CALL or DELEGATECALL its own address might be well received.

@fulldecent
Copy link
Contributor

I would vote against such an EIP. Call-to-self has extremely limited utility.

@haydenadams
Copy link

haydenadams commented Jul 1, 2018

I believe the issue is that even calls to @private functions use CALL, which is very gas heavy compared to Solidity using JUMP. I believe this is for safety reasons but unfortunately discourages the use of @private functions.

@fubuloubu's response earlier in the thread:

6th option?: an optimization function that will perform a JUMP instead of a CALL if it is able to be validated that the code jump cannot access anything outside of the called function. This would limit functions that make external calls from having advanced memory access while allowing safe internal calls to reduced their gas count. This will also incentivize isolation/reduction of external calls to improve gas costs.

seems like something worth pursuing.

@fubuloubu
Copy link
Member

  1. is hard, but if we do 2) now then we can drop it in in a minor release without breaking user expectations, which is a bigger problem IMO

@haydenadams
Copy link

  1. could also be done by adding a new variable msg.caller which is always equal to the calling contract value, and leaving msg.sender as is.

msg.caller could be best practice for most uses, but msg.sender would be available if its needed to determine calling context. Also this would allow msg.sender to function the same between Solidity and Vyper which I think would eliminate a lot of confusion for devs migrating over from Solidity.

@fulldecent
Copy link
Contributor

fulldecent commented Jul 28, 2018

Will's Plan 2 -- The Expensive Reusability Manifesto

This plan is extremely expensive in gas. It is useful only for academic purposes. However it allows code reusability and very easy formal auditing.

  1. At every function call do SSTORE(ffff, msg.sender) if it is not already set, and unset before the function ends
  2. Create a read-only variable to access this information
  3. Continue to use external message calls

To be clear, this is joke plan. It illustrates why we should proceed with Will's Plan 1.

@jacqueswww
Copy link
Contributor

jacqueswww commented Jul 28, 2018

@fulldecent there is no need to do it like the above. One can just use the first 20 bytes of the internal calldata to send through the msg.caller, this would be a very minimal cost.

Also we can make it even more effecient, by only passing the data through if msg.caller is actually referenced in the function.

@jacqueswww
Copy link
Contributor

jacqueswww commented Jul 28, 2018

I think this whole discussion also highlights the fact that we should start doing some example gas cost comparisons to solidity.
I know for instance that vyper might seem very ineffecient, and this a good place we can improve. But some things are not as bad as they seem - good example our arithmetic has overflow checking built in, and solidity uses internal calls to do the same to SafeMath.

@ben-kaufman @fubuloubu @haydenadams @haydenadams What example contracts would you recommend? ERC20 & ERC721 seem like good options?

@fulldecent
Copy link
Contributor

If you can provide me with an all-best-practices-followed example Vyper project repository (including test cases) then I can get my team to implement 721 for this experiment!

@jacqueswww
Copy link
Contributor

jacqueswww commented Aug 14, 2018

Starting with a proof of concept of using a jump table, just so you guys know if you don't see me. 👅

@fubuloubu
Copy link
Member

"Where'd Jacques go?"

"Oh yeah, he's jumping tables now"

@haydenadams
Copy link

@jacqueswww What was the decision here? You mentioned msg.caller in one of your replies. Was that a typo or are you doing the msg.sender and msg.caller plan?

@jacqueswww
Copy link
Contributor

jacqueswww commented Aug 21, 2018

@haydenadams currently I am only working on 1.) Using jumps to improve gas efficiency of internal calls. I decided this was a worthy task, as unfortunately smart contract development is very gas-efficiency dependant at this stage, at the benefits would be quite significant.

{tl;dr: 2.) has not been fully decided yet.} With regards to msg.sender if I just implement the jumps, msg.sender will match solidity in it's behaviour (of internal functions ) ie. msg.sender will be the top public function's calling account address. However if there is consensus that we should use msg.sender == the contract address and msg.caller == original caller (Basically sender address of the public function), this would be simple enough to change.

FWIW I do think following solidity here might be the safest option, it is really confusing for developers coming from solidity to have msg.sender be something different. Likewise a first time user coming from vyper, and going to solidity will be confused.

@fulldecent
Copy link
Contributor

I can support copying Solidity behavior.

I can also support using words from Yellow Paper -- ORIGIN, CALLER.

@fubuloubu
Copy link
Member

msg.origin?

@maurelian
Copy link
Contributor Author

@jacqueswww
Copy link
Contributor

jacqueswww commented Sep 6, 2018

Ha @maurelian! You are such a sniper, I was waiting until the EIP was merged as a draft.

So basically this EIP resulted directly from discussing #901, having the EIP accepted will both benefit solidity and vyper :)

@maurelian
Copy link
Contributor Author

@fulldecent you expressed opposition to a hypothetical gas cost reductions on calls to self. The community would benefit from hearing your concerns here.

@fulldecent
Copy link
Contributor

@maurelian Thank you for the ping. I have reviewed and added all my comments to https://ethereum-magicians.org/t/eip-1380-reduced-gas-cost-for-call-to-self/1242/3

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
VIP: Approved VIP Approved VIP: Discussion Used to denote VIPs and more complex issues that are waiting discussion in a meeting
Projects
None yet
Development

No branches or pull requests

6 participants