Inconsistent bytecode created for identical contract input using standard-json compiler input #9573

epheph · 2020-08-05T00:57:28Z

Description

During deployment of Augur, one of the contracts would not verify on Etherscan properly: the deployment process created one bytecode for "ZeroXTrade", while giving the Standard Json to Etherscan and having them compile created bytecode with VERY slightly different bytecode (of different lengths!).

Due to past issues with contract verification, Augur flattens all contracts prior to deployment. The Standard Json that is provided to solidity lists many contracts to be compiled, but all of them are completely flattened and self-contained (with significant code-duplication between them).

The issue is that the bytecode created for a self-contained contract changes depending on what OTHER self-contained contracts are compiled at the same time. In the example below, I have linked two different inputs that are identical except for an additional self-contained contract being compiled at the same time which demonstrates this issue.

Environment

Compiler version: 0.5.17
Target EVM version (as per compiler settings): 0.5.17
Framework/IDE (e.g. Truffle or Remix): n/a
EVM execution environment / backend / blockchain client:
Operating system: Linux

Steps to Reproduce

standard-json examples that show the issue:

https://gist.github.com/epheph/20428a1ff7163bce1d47be7d7b623344

The only difference between these two files is the "Augur.sol" section (which jq is discarding the output of).

If you run each of these through solc 0.5.17:

$ cat 1-contract.json  | solc-0.5.17 --standard-json \
      | jq -r '.contracts["trading/ZeroXTrade.sol"].ZeroXTrade.evm.bytecode.object' | egrep -o '^.{64}'
60806040526000805461ffff1916905534801561001b57600080fd5b506147c2

$ cat 2-contracts.json  | solc-0.5.17 --standard-json \
     | jq -r '.contracts["trading/ZeroXTrade.sol"].ZeroXTrade.evm.bytecode.object' | egrep -o '^.{64}'
60806040526000805461ffff1916905534801561001b57600080fd5b506147d1

Look at the last characters, c2 vs d1. There are many other slight differences, but this is the first.

ZeroXTrade.sol makes NO reference to Augur.sol, so the fact that Augur.sol is compiled in the same solidity run should not change the ZeroXTrade's bytecode.

/cc @nuevoalex

The text was updated successfully, but these errors were encountered:

chriseth · 2020-08-05T07:15:07Z

Thanks for reporting this!

Looking at the assembly output, two things are interesting:

the source references to internal routines are different, which means that the internal routines themselves are probably different
the differences in generated code look like different decisions being made in the optimizer

chriseth · 2020-08-05T07:19:11Z

To fix this properly, we have to reset the yul string repository between contracts, which would invalidate the inline assembly part of the AST which is stored in parsed form and thus uses yul strings.

chriseth · 2020-08-05T07:59:36Z

Actually the difference is only in the internal routines, so resetting the yul string repository might be all that is needed.

chriseth · 2020-08-05T10:21:24Z

It turns out that the problem is that AST IDs are encoded into function names which leads to the optimizing handling them in a different order resulting in a different way functions are inlined.

chriseth · 2020-08-05T14:37:23Z

Current plan of attack:

rename all identifiers to look nicer (we wanted to do that anyway) and remove AST IDs in the process
modify the function inliner such that it inlines functions in an order that does not depend (very much) on the names of the functions: Start with leaves of the call graph ( Handle "leaf functions" first in full inliner. #9314 ) and use the block hasher to make it deterministic

alcuadrado · 2020-08-20T13:27:29Z

This bug has really big implications for Buidler's new compilation pipeline. We are now forced to disable some optimizations we've been working on.

Can someone point to the first version of solc with this problem? That would be incredibly valuable.

chriseth · 2020-08-20T15:46:12Z

The problem should be present whenever you have some Yul code and the yul optimizer is activated. It got activated together with the regular optimizer starting from 0.6.0. You have Yul code whenever you use ABIEncoderV2. I think there are also some other situations here yul code is generated, but it might not have that problem.

Actually I think this only happens if you have multiple files in one compilation unit that contain structs or contracts with the same name. So if you don't just flatten everything ( ;) ), you should be rather safe. The reason is that the names of the Yul functions always have the data type name and its AST ID. If the data type name is already unique, a change in the AST ID should not matter much.

alcuadrado · 2020-08-23T22:30:59Z

So if you don't just flatten everything ( ;) ), you should be rather safe.

I think that's not always true. I believe our new compilation pipeline can easily trigger this bug.

Here's an example of what we were planning to do, and why it would fail:

Suppose you have files A.sol, B.sol and C.sol. B.sol imports A.sol.
Starting with a clean project, you'd compile A.sol, B.sol and C.sol together.
Then, you modify B.sol and it gets compiled alongside A.sol.
The user then deploys each a contract from B.sol, and forgets to verify it, nor is willing to commit the compilation output to git. Note that this is extremely common. I can't stress enough how common this is.
For some reason, like migrating to another pc, the project's cache/artifacts get cleared.
The user now wants to verify the contract from B.sol, so they compile the project. i.e. A.sol, B.sol and C.sol together.
B.sol's output may be different, and if so, the verification would fail.

chriseth · 2020-08-24T08:02:36Z

@alcuadrado maybe I didn't understand, but are you saying that a user deploys a contract from a source that is not stored permanently and then cannot verify the contract? If this is so extremely common, then deployment tools should make sure they only work from a clean git repo that is synced to origin or even better, store the sources and the metadata somewhere.

Or did I misunderstand you - I don't see how this is related to importing or flattening, for example.

fvictorio · 2020-08-24T12:47:51Z

@chriseth the sources (meaning, the Solidity files) are stored permanently. What is not stored permanently is the JSON input used by solc. In the first scenario, this JSON had the contracts [A, B, C], but in the second scenario it only had [B, C].

The point is that, because of this bug, this optimization (ignoring A in a re-compilation because it doesn't affect B and C) cannot be done. Does that make sense?

chriseth · 2020-08-24T14:01:35Z

@fvictorio as I said, my assumption is that this bug does not manifest itself as long as the files do not contain structs or contracts of the same name, but I will try to verify this assumption.

In general, the optimizer settings, file names, remappings and so on are crucial for verification and thus have to be stored. It would be great if deployment tools could help store the metadata file. If the metadata file is not requested as one of the output artefacts, then it is very much more difficult to source-verify a contract.

fvictorio · 2020-08-24T14:16:30Z

Oh, yes, I forgot to add that detail: in that example, files A and B have contracts with the same name (and optimization is enabled, etc.) I was able to reproduce this bug that way.

So the thing we want to do is to be able to make the optimization of only compiling B and C if they were modified but A wasn't. With this bug, we won't be able to do that: we'll always need to include A even if it's not necessary. I think the point @alcuadrado was making is that this is an example of this bug manifesting itself while there's no flattening involved and limiting a potentially useful feature.

I agree that this, in theory, this bug could be worked around by the deployment tool, but that doesn't change our situation with respect to the compilation pipeline.

chriseth · 2020-08-24T20:31:54Z

@fvictorio ok, you are right, I confused the flattening in the initial example with the main problem of two files sharing identifiers of the same name. I assume that if you don't flatten then this will be a very rare case, but I guess we'll just fix this issue and then we are all happy - as long as everyone uses the latest version of Solidity :)

chriseth · 2020-08-25T12:43:55Z

Disabling the FullInliner results in the same bytecode, so it might be that we only have to fix that component.

@ekpyron I think using the blockHasher does not really help here because it will consider two functions to be different if they call two different but equivalent functions.

chriseth · 2020-08-25T12:58:31Z

It turns out the actual fix is much easier: The function inliner processes functions according to their YulString sorting order. If they are processed based on their name (or position in the source, which is equivalent) instead, the bug is gone, and even the metadata is fully equal.

fvictorio · 2020-08-25T14:05:28Z

@chriseth can we assume that this bug will never happen if the optimizer is disabled?

See issue ethereum/solidity#9573 When the solc optimizer is enabled, there might be inconsistencies in the generated bytecodes. To avoid this we need to make sure that the files involved are always the same for a given group. We don't need to do this if the solc optimizer is disabled.

chriseth · 2020-08-25T17:18:37Z

@fvictorio this specific bug needs the yul optimizer to be enabled.

chriseth · 2020-08-25T18:59:21Z

It turns out that if you update the example code to Solidity 0.7.0, the bug is not present. I compared the yul code generated and it seems that there are some functions that are duplicated it 0.5.17: One for "memory" and one for "memory pointer", while the distinction between "pointer" or "not pointer" is only relevant for storage and thus it was removed for memory at some point.

I implemented and verified a fix, but we cannot test it due to lack of example input.

chriseth · 2020-08-25T18:59:43Z

If someone encounters an example that fails for 0.7.0, please report it!

gitpusha · 2020-08-31T12:44:10Z

Hi. This bug caused me a lot of trouble recently when I tried to verify some contracts that a colleague had compiled using solc-js 0.6.10 and deployed. I understand that this problem should be fixed with solidity 0.7.x. @chriseth - concerning my previously deployed contracts, should I have any other (security?) concerns with regards to this bug other than that verification might be difficult (if we havent stored the metadata)?

chriseth · 2020-08-31T13:03:54Z

The cause of the bug was just that optimizer steps were applied in a different order to the individual functions, so no reason for concern.

due to ethereum/solidity#9573

elenadimitrova · 2020-11-04T09:57:11Z

Leaving a note here that this affects Argent wallet contracts that are on solc 0.6.12 and using ABIEncoderV2 pragma. When compiling these as part of the overall set of contracts they generate different bytecode than when compiled on their own. Example bytecode differences for ApprovedTransfer contract:

Compiled as part of larger contracts set
https://gist.github.com/elenadimitrova/9073bc073ee72a7264cafe2982823f6b#file-approvedtransfer_bytecode_compiledwithall

Compiled alone
https://gist.github.com/elenadimitrova/4f58d3795d009a05be2c2f56ca77d97c#file-approvedtransfer_bytecode_compiledalone

Affected contracts are: ApprovedTransfer, TokenExchanger, TransferManager, LimitStorage and RelayerManager.
By compiling ApprovedTransfer and RelayerManager independently we've managed to produce the same bytecode as Etherscan for verification however this wasn't successful for TransferManager and TokenExchanger contracts which makes me think there is a larger issue here. I'd like this issue to be reopened and addressed as part of an update to 0.6 (as it is not an issue with 0.7).

Building contracts individually produces more reproducible bytecode. Due to a bug in the 0.6.10 compiler (ethereum/solidity/issues/9573)

vittominacori · 2022-12-20T15:40:07Z

Hi @fvictorio @alcuadrado is this issue solved?
I always had that issue and it means that after developing with hardhat I must compile contracts with truffle before deploy otherwise I won't be able to verify them on etherscan.

For instance if I compile SampleContract in this repository I obtain the following (different) bytecode:

Is there any hardhat configuration I'm missing?

scnale · 2022-12-20T17:22:41Z

I don't think you can reasonably expect hardhat and truffle to produce the same bytecode since they both implement their own build pipeline. I.e. they define the standard input JSON that is passed to the compiler, which is not necessarily the same.

fvictorio · 2022-12-21T11:31:09Z

@vittominacori from what I can tell, the difference is just in the metadata (the last bytes) generated by Hardhat and Truffle. I don't quite understand why they are different though. (Btw, I obtained this by actually compiling your project, because the gist files you shared are super different in length, that can't be right.)

This issue is about a difference in the executable part of the bytecode, so I don't think you are running into a regression.

vittominacori · 2022-12-21T14:50:53Z

@vittominacori from what I can tell, the difference is just in the metadata (the last bytes) generated by Hardhat and Truffle. I don't quite understand why they are different though. (Btw, I obtained this by actually compiling your project, because the gist files you shared are super different in length, that can't be right.)

This issue is about a difference in the executable part of the bytecode, so I don't think you are running into a regression.

Yes, sorry. Maybe I had old node modules installed.
I've just executed again with a fresh install and I updated the gist files.
They are still different but now I'm able to verify code on etherscan. Thanks

Here the env

Hardhat 2.12.4
Truffle v5.7.0 (core: 5.7.0)
Ganache v7.5.0
Solidity - 0.8.17 (solc-js)
Node v16.18.1
npm 8.19.2

chriseth added the bug 🐛 label Aug 5, 2020

chriseth mentioned this issue Aug 21, 2020

Support easy multifile verification on block explorers #9650

Closed

alcuadrado mentioned this issue Aug 23, 2020

[buidler-etherscan] verify-contract by default is pushing all the project source code NomicFoundation/hardhat#670

Closed

chriseth mentioned this issue Aug 25, 2020

Fix inlining order to correspond to source order. #9672

Merged

chriseth closed this as completed in #9672 Aug 26, 2020

alcuadrado mentioned this issue Aug 28, 2020

Workarounds for Solidity's bug #9573 ethereum/sourcify#246

Closed

alcuadrado mentioned this issue Sep 23, 2020

[builder-etherscan]: compiled contracts included in verification NomicFoundation/hardhat#804

Closed

elenadimitrova added a commit to argentlabs/argent-contracts that referenced this issue Nov 4, 2020

Independently compile contracts with ABIEncoderV2

5534a67

due to ethereum/solidity#9573

elenadimitrova mentioned this issue Nov 4, 2020

Independently compile contracts that fail verification argentlabs/argent-contracts#170

Merged

mds1 mentioned this issue Nov 18, 2020

Contract Updates raid-guild/gitcoinco-ptoken-contracts#4

Merged

nventuro mentioned this issue May 10, 2021

solc 0.7.1 - inconsistent bytecode created for identical contract input using standard-json compiler input #11375

Closed

rayanoncyber mentioned this issue Sep 12, 2021

Verifying diamonds with hardhat etherscan-verify results in error wighawag/hardhat-deploy#188

Closed

gtaschuk mentioned this issue Dec 4, 2021

added UNN token balancer/assets#388

Closed

alcuadrado mentioned this issue Dec 17, 2021

Hardhat and Truffle with same compiler options compile different bytecode NomicFoundation/hardhat#1147

Closed

snake-poison added a commit to IndexCoop/index-protocol that referenced this issue Oct 17, 2022

build(contracts): build contracts individually.

f72a56b

Building contracts individually produces more reproducible bytecode. Due to a bug in the 0.6.10 compiler (ethereum/solidity/issues/9573)

snake-poison mentioned this issue Oct 17, 2022

build(contracts): build contracts individually via index:compile:all hh task. IndexCoop/index-protocol#5

Merged

cameel mentioned this issue Aug 16, 2023

Bytecode comparison run with extra input files #14495

Closed

ben46 mentioned this issue Mar 29, 2024

fails to verify on testnet.ftmscan.com wighawag/hardhat-deploy#527

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Inconsistent bytecode created for identical contract input using standard-json compiler input #9573

Inconsistent bytecode created for identical contract input using standard-json compiler input #9573

epheph commented Aug 5, 2020

chriseth commented Aug 5, 2020

chriseth commented Aug 5, 2020

chriseth commented Aug 5, 2020

chriseth commented Aug 5, 2020

chriseth commented Aug 5, 2020

alcuadrado commented Aug 20, 2020

chriseth commented Aug 20, 2020

alcuadrado commented Aug 23, 2020 •

edited

Loading

chriseth commented Aug 24, 2020

fvictorio commented Aug 24, 2020

chriseth commented Aug 24, 2020

fvictorio commented Aug 24, 2020

chriseth commented Aug 24, 2020

chriseth commented Aug 25, 2020

chriseth commented Aug 25, 2020

fvictorio commented Aug 25, 2020

chriseth commented Aug 25, 2020

chriseth commented Aug 25, 2020

chriseth commented Aug 25, 2020

gitpusha commented Aug 31, 2020

chriseth commented Aug 31, 2020

elenadimitrova commented Nov 4, 2020

vittominacori commented Dec 20, 2022

scnale commented Dec 20, 2022

fvictorio commented Dec 21, 2022

vittominacori commented Dec 21, 2022

Inconsistent bytecode created for identical contract input using standard-json compiler input #9573

Inconsistent bytecode created for identical contract input using standard-json compiler input #9573

Comments

epheph commented Aug 5, 2020

Description

Environment

Steps to Reproduce

chriseth commented Aug 5, 2020

chriseth commented Aug 5, 2020

chriseth commented Aug 5, 2020

chriseth commented Aug 5, 2020

chriseth commented Aug 5, 2020

alcuadrado commented Aug 20, 2020

chriseth commented Aug 20, 2020

alcuadrado commented Aug 23, 2020 • edited Loading

chriseth commented Aug 24, 2020

fvictorio commented Aug 24, 2020

chriseth commented Aug 24, 2020

fvictorio commented Aug 24, 2020

chriseth commented Aug 24, 2020

chriseth commented Aug 25, 2020

chriseth commented Aug 25, 2020

fvictorio commented Aug 25, 2020

chriseth commented Aug 25, 2020

chriseth commented Aug 25, 2020

chriseth commented Aug 25, 2020

gitpusha commented Aug 31, 2020

chriseth commented Aug 31, 2020

elenadimitrova commented Nov 4, 2020

vittominacori commented Dec 20, 2022

scnale commented Dec 20, 2022

fvictorio commented Dec 21, 2022

vittominacori commented Dec 21, 2022

alcuadrado commented Aug 23, 2020 •

edited

Loading