-
Notifications
You must be signed in to change notification settings - Fork 1.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] fix(fuzz): strip metadata when collecting push bytes #8139
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not an expert on bytecode layout, but this explainer made sense to me.
if this is something that holds true, we could upstream this to revm perhaps
pending @DaniPopes
return; | ||
} | ||
|
||
// Remove metadata by looking up the last two bytes of original bytecode. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this always true?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For initcode, no. Sometimes e.g. long strings in a constructor are after the metadata hash.
For runtime code, yes, IF the metadata hash is present. When bytecode_hash = none
there is still a metadata hash containing the solidity version, so the last two bytes are still the length. When cbor_metadata = false
then there is no bytecode hash at all and the last two bytes are code
To know for certain if what you have is a metadata hash, without knowing the compilation settings, you'd assume the last two bytes are the length of the metadata hash, then try CBOR decoding that hash. If it decodes successfully, it's a metadata hash, otherwise it's code.
Here is some old, probably bad, rust code I had that did this using ciborium: https://github.com/ScopeLift/cove-backend/blob/d8c875c5a64bf89a874892bc8ed57751c47b1233/src/bytecode.rs#L53-L97
if code.len() > 2 { | ||
let metadata_len = &code[code.len() - 2..]; | ||
let metadata_len = u16::from_be_bytes([metadata_len[0], metadata_len[1]]).into(); | ||
if code.len() > metadata_len { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this could still fail due to the -2, but not sure if possible
perhaps metadata_offset = metadata_len + 2
&code[..code.len() - metadata_offset];
Motivation
Closes #8115
When we collect values from push bytes we don't strip metadata from contract bytecode. This results in irrelevant values added to fuzz dictionary and it also affects
forge snapshot
(each time an comment is added / removed the snapshot values are changed)Would be nice to have such method for getting bytecode without metadata in
Bytecode
revm primitives (and use in verify code as well)Solution
original_byte_slice
(prev we were usingbytes_slice
which is padded with 32 zero bytes) and then remove metadataTBD: should we ignore these addresses? check should be preserved in order to avoid panics from other edge cases
SnapshotTest
contract (test is too big to be included as unit test and I wasn't able to simplify it and reproduce issue)