Skinny CREATE2 #1014

Merged
merged 9 commits into from
Jun 18, 2018

@vbuterin vbuterin commented Apr 20, 2018

Adds a new opcode at 0xf5, which takes 4 stack arguments: endowment, memory_start, memory_length, salt. Behaves identically to CREATE, except using sha3(msg.sender ++ salt ++ init_code)[12:] instead of the usual sender-and-nonce-hash as the address at which the contract is created.

Update 2018.09.02: Version 2: use sha3(msg.sender ++ salt ++ sha3(init_code))[12:]

@emansipater

I am extremely in support of this. We NEED deterministic, immutable addresses for so many applications (state channels plus offline multisig plus a ton more).

@vbuterin changed the title from "Create Skinny_CREATE2.md" to "Skinny CREATE2" on Apr 20, 2018

@snario

snario commented Apr 20, 2018

This EIP allows for a significant performance increase in state channels by removing the need for an additional contract to allow for counterfactual addressing. I'm highly in favour of accepting it as soon as possible. :)

@SilentCicero

I am in complete support of this; there are countless governance use-cases which need this to be efficient and successful. Hurray!


### Specification

Adds a new opcode at 0xf5, which takes 4 stack arguments: endowment, memory_start, memory_length, salt. Behaves identically to CREATE, except using `sha3(msg.sender ++ salt ++ init_code)[12:]` instead of the usual sender-and-nonce-hash as the address where the contract is initialized at.

@LefterisJP LefterisJP Apr 20, 2018

@vbuterin The salt can have any arbitrary length?

@holiman holiman Apr 20, 2018

Existing scheme looks like this for low nonces (nonce 1 below):

sha3(0xd6 ++ 0x94 ++ sender ++ 0x01)[12:]

So if you mine an address starting with d694, it seems possible to create destination collisions. Using larger nonces gives more room for collisions.

This means that you could say "look, this contract can only be create2:ed with the initcode x", but in fact you can create arbitrary contracts there using old-style create.

I may be wrong, thinking while writing here... A trivial way to get around this would be to prefix the entire thing with something that is invalid rlp.

And @LefterisJP @vbuterin I assume the salt is fixed-size, and the size of that would naturally affect the ability to do the attack described here.

@vbuterin

> @vbuterin The salt can have any arbitrary length?

It's a stack argument, hence 32 bytes.

> A trivial way to get around this would be to prefix the entire thing with something that is invalid rlp

If we want, we can prefix with 0xff; the only valid RLP that starts with 0xff would be petabytes long.
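
To make the prefix argument concrete, here is a minimal sketch that builds the preimage hashed by the existing CREATE scheme (the RLP encoding of `[sender, nonce]`) and compares it with a `0xff`-prefixed CREATE2 preimage. The helper names (`rlp_bytes`, `rlp_short_list`, `create_preimage`, `create2_preimage`) are made up for illustration, the RLP encoder only covers the short cases needed here, and the 32-byte `code_hash` argument is a placeholder rather than a real hash.

```python
def rlp_bytes(b: bytes) -> bytes:
    """RLP-encode a byte string of at most 55 bytes."""
    if len(b) == 1 and b[0] < 0x80:
        return b                      # a single small byte encodes as itself
    if len(b) > 55:
        raise ValueError("long strings are out of scope for this sketch")
    return bytes([0x80 + len(b)]) + b


def rlp_short_list(items) -> bytes:
    """RLP-encode a list of byte strings whose total payload is at most 55 bytes."""
    payload = b"".join(rlp_bytes(i) for i in items)
    if len(payload) > 55:
        raise ValueError("long lists are out of scope for this sketch")
    return bytes([0xC0 + len(payload)]) + payload


def create_preimage(sender: bytes, nonce: int) -> bytes:
    """Preimage hashed by the existing CREATE scheme: rlp([sender, nonce])."""
    nonce_bytes = nonce.to_bytes((nonce.bit_length() + 7) // 8, "big")  # 0 -> b""
    return rlp_short_list([sender, nonce_bytes])


def create2_preimage(sender: bytes, salt: bytes, code_hash: bytes) -> bytes:
    """Preimage with the 0xff prefix discussed in this thread (hypothetical helper)."""
    assert len(sender) == 20 and len(salt) == 32 and len(code_hash) == 32
    return b"\xff" + sender + salt + code_hash


sender = bytes.fromhex("11" * 20)     # hypothetical 20-byte address
old = create_preimage(sender, 1)
new = create2_preimage(sender, bytes(32), bytes(32))

print(old[:2].hex(), len(old))        # d694 23
print(new[:1].hex(), len(new))        # ff 85
```

An RLP item can only begin with 0xff if it is a list whose payload length needs 8 bytes to express, which is why a realistic CREATE preimage never starts with that byte.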

    @@ -0,0 +1,16 @@
    ```

Please update this to the new format: frontmatter should start and end with --- on its own line, and keys are lower-case.

    @@ -0,0 +1,16 @@
    ```
    EIP: <to be assigned>

Please number this 1014 and rename the file to eip-1014.md.

0xff would work, but 0x01 could not collide even theoretically (on the preimage side).

    EIP: <to be assigned>
    Title: Skinny CREATE2
    Author: Vitalik Buterin
    Category: Core

This should specify Type: Standards Track, as well.

    ```
    EIP: <to be assigned>
    Title: Skinny CREATE2
    Author: Vitalik Buterin

Please include a username or email address (parentheses for Github username) or .

lol. @vbuterin please follow protocol around here... just because you created Ethereum doesn't mean you get to break EIP specification.

EIPS/eip-1014.md Outdated

    ---
    eip: 1014
    title: Skinny CREATE2
    author: Vitalik Buterin (vbuterin)

GitHub username is recognized when adding the @ in front of it. Like '@vbuterin'

@CoinHodl

Make state channels great again👍🏽

@Arachnid

Needs a discussions-to URL, but otherwise good to go.

@SilentCicero

SilentCicero commented Apr 22, 2018 via email

@chriseth

chriseth commented Jun 15, 2018

What happens on address collisions? How exactly is an address collision defined (existing code, existing balance, pre-existing code, pre-existing balance, etc.)?

Also, in which way is this different from the earlier proposal about create2?

@AlexeyAkhunov

@karalabe has a good point here: ethereum/pm#44

@emansipater @snario Will state channel mechanisms be able to detect if the code of a counterfactual contract will behave differently depending on the environment? Because currently contracts can do anything in their constructor, which is the code that runs during the instantiation.

(Copying his comment below).

> Regarding CREATE2, do we have any restriction on the execution context, or does it remain the same? The reason I'm asking is because the EIP states:
>
> > Allows interactions to be made with addresses that do not exist yet on-chain but can be relied on to only possibly eventually contain code that has been created by a particular piece of init code.
>
> Even though this statement is true, the final deployed contract can behave arbitrarily differently depending on who deploys it and when (since the execution environment changes). This is in itself fine, but I think it's an important limitation of the opcode to explicitly highlight, otherwise we'll see many many abuses around this.
>
> Alternatively we could enforce CREATE2 to not have access to environmentals, but that might be an ugly complication.
>
> Just food for thought.

@AlexeyAkhunov

@chriseth As far as I understand, "msg.sender" is now more or less in full control of the contract address. And I assume that CREATE2 will only create a contract if none exists at that address yet (so as not to break the invariant that contract code does not change after initial deployment). In the state channel setting, the possibility of doing CREATE2 is leverage that participants use against each other, so they will not compromise that leverage by producing an address collision.

@ameensol

ameensol commented Aug 20, 2018

Chiming in because @lrettig let me know this discussion was taking place. I'm not sure how L4 plan on doing "counterfactual" in practice, but we've found that deploying a new contract for each individual dispute would be pretty expensive, especially if the contracts are complex (e.g. @funfair-tech casino games). Instead, we think it makes more sense to reuse contracts already deployed onchain for disputes.

Maybe I'm missing something, but this seems like a bit of a distraction. @emansipater @snario @SilentCicero could you please expand on why this is valuable—are you assuming contract deployment on every dispute? What other use cases does this enable/optimize?

cc @ConnextProject @nginnever

@ArjunBhuptani

Interested to learn more about tradeoffs here too!

FYI In our construction, we're working towards a dispute registry with predeployed dispute contracts. When opening a "thread" (virtual channel), participants can sign a hash associated with the specific dispute they want to reference.

At the very least this means that participants around a specific use-case don't need to deploy duplicate contracts for byzantine cases and ideally this also means that we can have community-sourced standard disputes for various use-cases. This also removes the need for deterministic addressing I think?

@ldct

ldct commented Aug 22, 2018

A few points to talk about here:

  1. There is no advantage (AFAICT) of the current CREATE address computation scheme, in which the address depends on account nonce, over schemes where the address does not depend on account nonce.

  2. Using commitments to deploy contracts ("deployment commitment") does not imply no opportunity for code reuse - the deployed contract can still share code with other instances and marginally be quite small.

  3. In the case that one wants to enter ad-hoc/custom/private contracts in a channel, the only efficient way to do so is deployment commitment.

  4. Our specific metachannel constructions rely on deployment commitments; I haven't seen any constructions that achieve time-based lockup as well as constant locktime without deployment commitment, although this might well be just from people not trying.

  5. To be clear, we're not blocked by this, since we can emulate deterministic addresses at an application level.

  6. A system of multiple contracts which share code but have separate storage can be restructured into a single contract with an "indexed" storage scheme, and under the current gas schedule this is often more efficient, however I think this is a non-permanent accidental feature of how storage is charged (e.g. the different costs charged for SSTORE vs contract data, and the 32k upfront cost of creating a contract). Certainly research discussions like https://ethresear.ch/t/cross-shard-contract-yanking/1450 and the various rent discussions suggest to me that we eventually want to not have a penalty for per-user contracts over giant contracts shared by many people.

  7. There are applications that will benefit from deterministic addressing, but haven't realised it yet. An example is airgapped cold storage and hardware wallets; say you're on a cold storage machine and want to deploy a new policy. You have literally no idea what state the blockchain is in, because you're airgapped. You need deterministic contracts to know what you are authorising sending money to, etc.

  8. Jeff Coleman thinks existing teams like Gnosis and Plasma implementers will find this useful. Personally I think we should get their feedback in order to work out the details and make sure we didn't miss anything.

@kaibakker

Am I correct in comparing this to p2sh (Pay to script hash) like functionality? Where value can be allocated to a specific script hash instead of the full script?

@SergioDemianLerner

Yes, I think it's a good comparison.

@Arachnid

> From the point of view of a hardware wallet, a factory contract is not better than a contract with unknown code because the problem is only moved one level deeper but now is worse: how can it validate that the code of the factory contract is correct? It would need to read the codehash of the accountstate of the Factory in a certain given blockchain, but it can't decide if the blockchain given is the correct one.

Either way the code of the deployed contract has to be verified, which is out of scope for the hardware wallet. I don't see how using a factory contract makes this any worse.

@SergioDemianLerner

@Arachnid Let's say the hardware wallet is built to work with certain known wallets, such as Gnosis (the same works if you register the code hash into the hardware wallet later, and the HW shows this hash in the display for you to check).
When a hardware wallet needs to deploy a new Gnosis-wallet contract, it can use CREATE2 and pass the sha3 of the EVM code. It can have this hash pre-stored. The user can also validate with other off-chain tools that the created wallet is in fact a Gnosis wallet (by recomputing the created contract address).
Now you say the same would work if the hardware knows the factory contract address, let's say it's hard-coded. So the hardware wallet can create a message that requests the factory contract to create the final Gnosis wallet. But how does the hardware wallet know the address? It must receive it from the outside world. So all verification rests on humans to get information onchain. The hardware wallet cannot help. This is very bad for the creation of institutional multi-sig wallets, where you want the whole process to be auditable and repeatable, and not require a step where you must download and sync a full node, or where you must peek into etherscan.

@SergioDemianLerner

SergioDemianLerner commented Sep 1, 2018

As @vbuterin still hasn't reviewed my proposed change to this proposal, I will try to emphasize more the benefits of using sha3(init_code) instead of just init_code.

In the future Ethereum might want to do parallel transaction processing. Check for example https://github.com/rsksmart/RSKIPs/blob/master/IPs/RSKIP04.md
Even if you don't like this RSKIP04 proposal, it's clear that if you implement parallel transaction processing, many contracts will need to create child contracts to distribute the load without becoming a bottleneck in transaction parallelization. In those cases contracts would need to dynamically compute a child-contract address without accessing a local mapping (which would make them a parallelization bottleneck). CREATE2 seems ideal to solve this problem IF we make the amount of data to dynamically hash fixed-length.

For example, RSK's RSKIP59 proposal won't be needed anymore:
https://github.com/rsksmart/RSKIPs/blob/master/IPs/RSKIP59.md

Think of an ERC20 contract that, depending on the source and destination address of a transfer(), calls two per-user child contracts (where the EIP-1014 salt is the src/dst address) that decrement and increment the child contract storage cells where per-user balances are stored.

Even without taking parallelization into account, I'm sure there are many more examples where dynamically computing the destination address is far more efficient than getting it from a mapping.

(It could be even better to hash sha3(sha3(msg.sender ++ sha3(init_code)) ++ salt), so that it requires even less gas to dynamically compute the child contract address. But this is obviously too complicated for so little benefit. And there is also the problem of size collision with other types of addresses.)
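
The fixed-length point can be illustrated with a little arithmetic. The sketch below compares how many bytes a contract would have to hash to derive a child address under the two variants discussed in this thread. The function and constant names are hypothetical, and the 0xff prefix byte mentioned elsewhere in the thread is counted in both variants.

```python
ADDRESS_LEN = 20   # msg.sender
SALT_LEN = 32      # one stack word
HASH_LEN = 32      # output size of sha3
PREFIX_LEN = 1     # the 0xff byte


def preimage_len_raw(init_code_len: int) -> int:
    """Variant 1: sha3(0xff ++ msg.sender ++ salt ++ init_code)."""
    return PREFIX_LEN + ADDRESS_LEN + SALT_LEN + init_code_len


def preimage_len_hashed(init_code_len: int) -> int:
    """Variant 2: sha3(0xff ++ msg.sender ++ salt ++ sha3(init_code))."""
    return PREFIX_LEN + ADDRESS_LEN + SALT_LEN + HASH_LEN


for size in (100, 1_000, 24_000):
    # Columns: init code size, bytes hashed (variant 1), bytes hashed (variant 2)
    print(size, preimage_len_raw(size), preimage_len_hashed(size))
```

With the hashed variant, a parent contract can keep sha3(init_code) as a constant and hash 85 bytes per lookup, however large the child's init code is.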

@holiman

holiman commented Sep 1, 2018

@SergioDemianLerner oh we've already decided to use the sha3 of the initcode, the EIP hasn't been updated yet. (decided on a coredev-call a few weeks ago, and I posted a comment here to that effect)

@holiman

holiman commented Sep 1, 2018

I'll submit a PR next week to update this accordingly

@vbuterin

vbuterin commented Sep 2, 2018

I'm ok with sha3(msg.sender ++ sha3(init_code) ++ salt); can add to the EIP.

@holiman

holiman commented Sep 2, 2018

@vbuterin please see #1014 (comment). Thanks for updating, but you forgot the 0xFF.

@ghost

ghost commented Sep 14, 2018

@vbuterin @holiman

Since msg.sender, init_code, salt are all stack items, what happens if I try to deploy a contract twice? What if I selfdestruct a contract and then recreate it? Will it be possible to use the same address, since it can be regenerated?

EDIT: What happens to the constructor if I recreate an already deployed contract? Does it re-run twice? (If it actually gets deployed I assume the constructor indeed runs twice)

@SergioDemianLerner

@MoonMissionControl I suppose that without any other change in the EVM, a SELFDESTRUCT will take precedence and the contract will be destructed at the end of the processing of the transaction.
E.g. After CREATE2 SELFDESTRUCT CREATE2 the contract will be destructed.
It's a bit weird, but it's ok as long as the semantic is clear.
A better semantic would be that you can't CREATE2 a contract that has been destructed in the same transaction.

Regarding multiple creations, I think that is a real problem and is similar to the problem where there is a previous balance on an account which then turns into a contract.
But the case of double-creation is worse because the constructor may assume some fields are zero when they are not, and then the newly created contract would be in an invalid state.
I would suggest that double-creation be avoided. It should make CREATE2 fail and return 0.

If this is not enforced by the EVM, then a way to prevent this at the application level would be to set a storage cell "initialized" to true just after the constructor finishes and check for this in the first line of the constructor { if (initialized) revert(); }.

@holiman

holiman commented Sep 23, 2018

A collision causes deployment to fail. A collision occurs if nonce or code is nonzero.

Since nonce is set to 1 at creation (nowadays), even empty create can't be overwritten later.

Since selfdestruct takes effect post-tx, there can be no double-create during one tx.
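
The three rules above can be sketched as a toy state model. This is not client code: `Account`, `ToyState` and their methods are hypothetical names, and the way `end_transaction` simply removes self-destructed accounts is an assumption of the model rather than something specified in this thread.

```python
from dataclasses import dataclass, field


@dataclass
class Account:
    nonce: int = 0
    code: bytes = b""
    balance: int = 0


@dataclass
class ToyState:
    accounts: dict = field(default_factory=dict)
    pending_selfdestructs: set = field(default_factory=set)

    def create_at(self, address: bytes, code: bytes) -> bool:
        """Try to deploy `code` at `address`; return False on collision."""
        acct = self.accounts.get(address, Account())
        if acct.nonce != 0 or acct.code != b"":
            return False                 # collision: deployment fails
        acct.nonce = 1                   # set at creation, even for empty code
        acct.code = code
        self.accounts[address] = acct
        return True

    def selfdestruct(self, address: bytes) -> None:
        """Only records the request; nothing is removed mid-transaction."""
        self.pending_selfdestructs.add(address)

    def end_transaction(self) -> None:
        """Apply queued self-destructs once the transaction has finished (model assumption)."""
        for address in self.pending_selfdestructs:
            self.accounts.pop(address, None)
        self.pending_selfdestructs.clear()


state = ToyState()
addr = bytes.fromhex("22" * 20)          # hypothetical CREATE2 target address

first = state.create_at(addr, b"\x00")   # no account yet: succeeds
state.selfdestruct(addr)                 # queued, not applied yet
second = state.create_at(addr, b"\x00")  # same transaction: account still exists
print(first, second)                     # True False
```

Note that a pre-existing balance alone does not count as a collision in this model, since only nonce and code are checked.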

@holiman

holiman commented Sep 23, 2018

> A better semantic would be that you can't CREATE2 a contract that has been destructed in the same transaction.

So basically, that is already the case

@SergioDemianLerner

It would be very informative if the EIP stated it.

@holiman

holiman commented Sep 23, 2018

Yes, it should have a reference to eip 684 (iirc) which defines the collision behavior.

@holgerd77

Could we have 4-5 (or at least one) example cases for the hash creation in the EIP?


#### Option 2

Use `sha3(0xff ++ msg.sender ++ salt ++ init_code)[12:]`

Reading from the comments it's:

sha3(0xff ++ msg.sender ++ salt ++ sha3(init_code))[12:]

instead of

sha3(0xff ++ msg.sender ++ salt ++ init_code)[12:]

so with the hash value sha3(init_code) and not the code itself. Am I correct on this?

Please see #1375. But yes, the init_code, not the code itself

Ah, thanks!
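
As a minimal sketch of the formula settled on in this thread, `sha3(0xff ++ msg.sender ++ salt ++ sha3(init_code))[12:]`, the address can be computed offline as follows. Python's standard library has no Keccak-256 (`hashlib.sha3_256` uses the later NIST padding and gives different digests), so the sketch carries its own compact Keccak-f[1600] permutation. The names `keccak256` and `create2_address` are illustrative only, and the inputs in the usage lines are arbitrary.

```python
def _rol64(a: int, n: int) -> int:
    n %= 64
    return ((a << n) | (a >> (64 - n))) & 0xFFFFFFFFFFFFFFFF


def _keccak_f1600(lanes):
    """Keccak-f[1600] permutation on a 5x5 array of 64-bit lanes."""
    r = 1
    for _ in range(24):
        # theta
        c = [lanes[x][0] ^ lanes[x][1] ^ lanes[x][2] ^ lanes[x][3] ^ lanes[x][4] for x in range(5)]
        d = [c[(x + 4) % 5] ^ _rol64(c[(x + 1) % 5], 1) for x in range(5)]
        lanes = [[lanes[x][y] ^ d[x] for y in range(5)] for x in range(5)]
        # rho and pi
        x, y = 1, 0
        current = lanes[x][y]
        for t in range(24):
            x, y = y, (2 * x + 3 * y) % 5
            current, lanes[x][y] = lanes[x][y], _rol64(current, (t + 1) * (t + 2) // 2)
        # chi
        for y in range(5):
            row = [lanes[x][y] for x in range(5)]
            for x in range(5):
                lanes[x][y] = row[x] ^ (~row[(x + 1) % 5] & row[(x + 2) % 5])
        # iota: round constant bits generated by an 8-bit LFSR
        for j in range(7):
            r = ((r << 1) ^ ((r >> 7) * 0x71)) % 256
            if r & 2:
                lanes[0][0] ^= 1 << ((1 << j) - 1)
    return lanes


def keccak256(data: bytes) -> bytes:
    """Keccak-256 with the original 0x01 padding (what the thread calls sha3)."""
    rate = 136                                   # bytes absorbed per block
    padded = bytearray(data) + bytes(rate - (len(data) % rate))
    padded[len(data)] ^= 0x01                    # first padding byte
    padded[-1] ^= 0x80                           # last padding byte
    lanes = [[0] * 5 for _ in range(5)]
    for off in range(0, len(padded), rate):
        block = padded[off:off + rate]
        for i in range(rate // 8):
            lanes[i % 5][i // 5] ^= int.from_bytes(block[8 * i:8 * i + 8], "little")
        lanes = _keccak_f1600(lanes)
    return b"".join(lanes[i % 5][i // 5].to_bytes(8, "little") for i in range(4))


def create2_address(sender: bytes, salt: bytes, init_code: bytes) -> bytes:
    """Last 20 bytes of sha3(0xff ++ sender ++ salt ++ sha3(init_code))."""
    assert len(sender) == 20 and len(salt) == 32
    preimage = b"\xff" + sender + salt + keccak256(init_code)
    return keccak256(preimage)[12:]


# Arbitrary inputs: zero address, zero salt, one-byte init code.
addr = create2_address(bytes(20), bytes(32), b"\x00")
print("0x" + addr.hex())
```

Because the result depends only on the sender, the salt and the init code, the same call can be made on an air-gapped machine with no view of chain state.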

@tersec

tersec commented Oct 5, 2018

This is looking good for Nimbus use cases, with reasonable design tradeoffs.

@bmann

bmann commented Dec 4, 2018

This is still listed as Draft, and should move through Last Call and Accepted if it's supposed to be in Constantinople.

@jochem-brouwer

jochem-brouwer commented Jan 3, 2019

Is there any reason why endowment is not hashed when the contract address is computed?

@vbuterin @holiman
