Solve nondeterminism about AST IDs #11907
Comments
We generate code per contract anyway, so can't we just have a mapping "global ID -> contract-local ID" that maps our current IDs to a new per-contract counter, incremented in codegen visit order (or rather, on first use in codegen)? You considered that yourself earlier, not sure what made you drop the idea... |
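A minimal sketch of the "global ID -> contract-local ID" mapping suggested here, written in Python for brevity rather than the compiler's C++. The class name `LocalIdMap` and its method are hypothetical and not part of the codebase. Each contract gets its own map, and a local ID is handed out the first time a global ID is looked up during code generation.

```python
class LocalIdMap:
    """Hypothetical per-contract map from run-wide AST IDs to small local IDs."""

    def __init__(self):
        self._local = {}

    def local_id(self, global_id):
        # Hand out the next local number on first use; reuse it afterwards.
        if global_id not in self._local:
            self._local[global_id] = len(self._local) + 1
        return self._local[global_id]


# Same contract, same visit order, but different global IDs because
# other files were compiled in the same run (IDs shifted by 100).
alone = LocalIdMap()
together = LocalIdMap()
ids_alone = [alone.local_id(g) for g in [7, 3, 7, 5]]
ids_together = [together.local_id(g) for g in [107, 103, 107, 105]]
```

Both lists come out identical, so names built from the local IDs no longer depend on what else was compiled. The result is only as stable as the visit order itself, which is the concern this suggestion hinges on.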
Yes, that is the other way to do it. I'm a bit uneasy about the codegen visit order, and it would require |
Ah sorry, it's |
In
This might lead to |
It's not exactly beautiful to pass a mapping to |
Yeah, that indeed makes it extremely error-prone... |
It turns out that it might be better to (a) make the Solidity -> Yul code generator output code that only depends on Solidity source order and (b) check that the Yul (util) function generation is deterministic. This is done in #11910 |
Why not take a truncated hash as the ID of the two-level ID? |
Hopefully solved by not making anything depend on names at all. |
The nondeterminism about AST IDs will always hurt us, and we should solve it once and for all.
The problem is that when you compile multiple unrelated files, the AST IDs get assigned uniquely for the whole compilation run.
This can result in different code depending on whether you compile a file in isolation or together with other files for the following reason:
These AST IDs are used in the code generator to create unique function names (for example for struct types in the ABI routines).
We currently try to strip the AST IDs from function names as far as possible while keeping the names unique. When that fails, functions can end up sorted differently, so the optimizer processes them in a different order, or they are simply laid out differently in the final code.
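To illustrate the effect, here is a small Python model (not compiler code). It assumes run-wide sequential IDs and helper names of the form `abi_encode_struct_<name>_<id>`; both the naming scheme and the file names are made up for the example. Two structs that share a name keep their ID suffix, and sorting the names as strings then depends on how many declarations were numbered before them.

```python
from itertools import count


def assign_ids(files):
    """Give every declaration a run-wide unique ID, in the order files are listed."""
    counter = count(1)
    ids = {}
    for file_name, decls in files:
        for position, _ in enumerate(decls):
            ids[(file_name, position)] = next(counter)
    return ids


def helper_order(files, target):
    """Return the declaration positions of target's helpers, sorted by helper name."""
    ids = assign_ids(files)
    decls = dict(files)[target]
    named = [
        (f"abi_encode_struct_{decl}_{ids[(target, pos)]}", pos)
        for pos, decl in enumerate(decls)
    ]
    return [pos for _, pos in sorted(named)]


a = ("a.sol", ["S", "S"])    # two structs named S in different scopes
lib = ("lib.sol", ["T"] * 8)  # unrelated file with eight declarations

order_alone = helper_order([a], "a.sol")          # IDs 1 and 2
order_together = helper_order([lib, a], "a.sol")  # IDs 9 and 10
```

Compiled alone, the helpers sort in declaration order. Compiled after `lib.sol`, the suffixes become `_9` and `_10`, and `"..._10"` sorts before `"..._9"`, so the order flips.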
Proposals to overcome this problem:
Make AST IDs two-dimensional where the first dimension depends on the file. This can be the file name or the hash of the file content.
The problem here is that these IDs get very long, and it is also a breaking change. We can keep the current AST IDs for anything that is not related to code generation. The length could be dealt with by specially marking those AST IDs in function names and shortening them in the NameSimplifier (we already do something like that there). The NameSimplifier could start with one pass that just collects all prefixes and replaces them with unique, shorter identifiers (either shorter prefixes or newly created numbers starting from 1).
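A rough Python sketch of this proposal, under stated assumptions: the first dimension is a truncated SHA-256 of the file content, marked IDs are wrapped as `$<prefix>_<local>$`, and the shortening pass numbers prefixes from 1 in order of first appearance. The marker syntax and the function names are hypothetical and do not reflect what the NameSimplifier actually does.

```python
import hashlib
import re

# Hypothetical marker syntax for two-level IDs inside function names.
MARKED_ID = re.compile(r"\$([0-9a-f]+)_(\d+)\$")


def make_id(file_content, local_index):
    """Two-level ID: content-hash prefix plus a per-file counter value."""
    prefix = hashlib.sha256(file_content.encode()).hexdigest()[:16]
    return f"${prefix}_{local_index}$"


def shorten(names):
    """Replace each distinct prefix with a small number, in order of first appearance."""
    short = {}

    def replace(match):
        prefix = match.group(1)
        short.setdefault(prefix, len(short) + 1)
        return f"{short[prefix]}_{match.group(2)}"

    return [MARKED_ID.sub(replace, name) for name in names]


a_src = "struct S { uint x; }"
b_src = "struct T { bool y; }"
names = [
    f"abi_encode_{make_id(a_src, 1)}",
    f"abi_encode_{make_id(b_src, 1)}",
    f"abi_encode_{make_id(a_src, 2)}",
]
short_names = shorten(names)
```

Because the prefix is derived from the file content alone, `make_id(a_src, 1)` is the same whether or not `b_src` is part of the compilation run, and the shortening pass brings the names back to a manageable length.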