Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: simplify symbol parsing #334

Merged
merged 9 commits into from
Aug 1, 2024
Merged

fix: simplify symbol parsing #334

merged 9 commits into from
Aug 1, 2024

Conversation

0xkarmacoma
Copy link
Collaborator

  • simplify method names - get_contract_mapping_info_by_bytecode -> get_by_bytecode - get_contract_mapping_info_by_name -> get_by_name - add_contract_mapping_info -> add
  • introduce an add_or_create method (like setdefault, but also returns the value)
  • factor into a separate parse_symbols method to avoid nested try..except blocks
  • avoid the except Exception: pass pattern, it does hide exceptions
  • in parse_ast, make contract_name required (it could lead to exceptions when the default was used)

- simplify method names
    - get_contract_mapping_info_by_bytecode -> get_by_bytecode
    - get_contract_mapping_info_by_name -> get_by_name
    - add_contract_mapping_info -> add
- introduce an add_or_create method (like setdefault, but also returns the value)
- factor into a separate parse_symbols method to avoid nested try..except blocks
- avoid the `except Exception: pass` pattern, it does hide exceptions
- in parse_ast, make contract_name required (it could lead to exceptions when the default was used)
@0xkarmacoma 0xkarmacoma requested a review from daejunpark July 29, 2024 21:59
@0xkarmacoma
Copy link
Collaborator Author

cc @pillip wdyt?

Copy link
Collaborator

@daejunpark daejunpark left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it would be great if we could get this reviewed by @pillip, as there is a non-trivial change.


for child_node in node.get("nodes", []):
self.parse_ast(child_node, current_contract)
self.parse_ast(child_node, contract_name)
Copy link
Collaborator

@daejunpark daejunpark Jul 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this doesn't seem semantic-preserving. why not treating differently the case of child_node["nodeType"] == "ContractDefinition"? in that case, child_node.get("name", "") was used instead of contract_name.

i mean, two names could be different, especially for nested contracts.

Copy link
Collaborator Author

@0xkarmacoma 0xkarmacoma Jul 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is definitely not semantic preserving, because I think there was an issue. The except Exception: pass was swallowing exceptions like this:

error parsing symbols for contract FuzzCantCatch
Traceback (most recent call last):
  File "/Users/karma/projects/halmos/src/halmos/__main__.py", line 1304, in parse_symbols
    Mapper().parse_ast(contract_map[contract_name][0]["ast"])
  File "/Users/karma/projects/halmos/src/halmos/mapper.py", line 103, in parse_ast
    self.parse_ast(child_node, current_contract)
  File "/Users/karma/projects/halmos/src/halmos/mapper.py", line 97, in parse_ast
    self.append_node(
  File "/Users/karma/projects/halmos/src/halmos/mapper.py", line 79, in append_node
    raise ValueError(f"Contract {contract_name} not found")
ValueError: Contract  not found

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now you're right, I had some wrong assumptions about what we would find in the AST. For instance with this contract: https://github.com/karmacoma-eth/halmos-sandbox/blob/main/test/46_hellFunc.t.sol (one solidity file with no eponymous contract, and it contains 2 contract definitions)

Walking the AST in out/46_hellFunc.t.sol/FuzzCantCatch.json we find:

SourceUnit
  PragmaDirective
  UserDefinedValueTypeDefinition (Int)
  UsingForDirective
  UsingForDirective
  UsingForDirective
  UsingForDirective
  FunctionDefinition (add)
    Block
  FunctionDefinition (div)
    Block
  FunctionDefinition (mul)
    Block
  FunctionDefinition (sub)
    Block
  ContractDefinition (FuzzCantCatch)
    VariableDeclaration (numbr)
    VariableDeclaration (namber)
    VariableDeclaration (nunber)
    VariableDeclaration (mumber)
    VariableDeclaration (numbor)
    VariableDeclaration (numbir)
    FunctionDefinition (hellFunc)
      Block
    FunctionDefinition (_numbaar)
      Block
  ContractDefinition (HellCatcher)
    VariableDeclaration (fuzz)
    FunctionDefinition (setUp)
      Block
    FunctionDefinition (test_hellFunc)
      Block
    FunctionDefinition (testCatchEasyBugWithFuzz)
      Block

And if we walk the AST of out/46_hellFunc.t.sol/HellCatcher.json, we get exactly the same thing 😵 it's actually the AST of the whole source file, not just the contract

Copy link
Collaborator Author

@0xkarmacoma 0xkarmacoma Jul 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still, based on this I think we should indeed "guess" contract_name based on the json filename, and dig into the AST until we hit the right node.

We still have a couple problems:

  • how to handle another contract definition that's not the same as contract_name? Currently it would be processed if it has not been seen yet, or skipped if it has
  • how to handle contract name collisions (A.sol and B.sol both define a contract called C), I think right now it's nondeterministic (whichever one is processed first is added, the other one is skipped)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok I found what was causing the initial issue, in 46_hellFunc.t.sol:

type Int is uint256;

What happens is this:

  • it is outside of a contract definition
  • it results in an AST node of type UserDefinedValueTypeDefinition
  • because node_type != "SourceUnit", we try to append it to the current_contract mapping
  • but because we haven't encountered a ContractDefinition yet, it tries to append it to the default contract_name = "", which raises the exception

Similar thing going on here: https://github.com/karmacoma-eth/halmos-sandbox/blob/main/test/17_pAtHExPlOSion.t.sol, the free functions (FunctionDefinition AST nodes) before the contract definition cause the same exception

@0xkarmacoma 0xkarmacoma marked this pull request as draft July 31, 2024 01:37
+ rename to lookup_selector
+ rename address to selector
we don't need the id and visibility field
by making ContractMappingInfo a dict instead of a list
@0xkarmacoma
Copy link
Collaborator Author

Ok I think this is ready for another round of review, I added a couple more tests that show that we handle the previous issue. Plus a couple more goodies:

  • can now invoke halmos.mapper as a module to inspect a json file containing an ast. For instance:
python -m halmos.mapper tests/regression/out/Address.sol/Address.json

SourceUnit
  PragmaDirective (ignored node type)
  ContractDefinition: Address
    ErrorDefinition: AddressInsufficientBalance (added node with ast_node.selector='0xcd786059'
    ErrorDefinition: AddressEmptyCode (added node with ast_node.selector='0x9996b315'
    ErrorDefinition: FailedInnerCall (added node with ast_node.selector='0x1425ea42'
    FunctionDefinition: sendValue
      Block (ignored node type)
    FunctionDefinition: functionCall
      Block (ignored node type)
    FunctionDefinition: functionCall
      Block (ignored node type)
    FunctionDefinition: functionCallWithValue
      Block (ignored node type)
    FunctionDefinition: functionCallWithValue
      Block (ignored node type)
    FunctionDefinition: functionStaticCall
      Block (ignored node type)
    FunctionDefinition: functionStaticCall
      Block (ignored node type)
    FunctionDefinition: functionDelegateCall
      Block (ignored node type)
    FunctionDefinition: functionDelegateCall
      Block (ignored node type)
    FunctionDefinition: verifyCallResultFromTarget
      Block (ignored node type)
    FunctionDefinition: verifyCallResult
      Block (ignored node type)
    FunctionDefinition: verifyCallResult
      Block (ignored node type)
    FunctionDefinition: defaultRevert
      Block (ignored node type)
    FunctionDefinition: _revert
      Block (ignored node type)

### Results ###

Contract: Address
  0xcd786059: AddressInsufficientBalance
  0x9996b315: AddressEmptyCode
  0x1425ea42: FailedInnerCall
  • comes with a cutesie little Explanation helper
  • rename address to selector
  • rename find_nodes_by_address to lookup_selector
  • ContractMappingInfo.nodes is a dict instead of a list
  • ☝️ this lets us look up selectors in a contract scope, instead of scanning linearly the entire namespace
  • AstNodes are simplified, I don't think we need to keep id and visibility
  • only keep track of AstNodes that have a selector
  • things that are not scoped to a contract (e.g. global errors and events) are tracked separately under the contract_name = None "namespace", which can be retrieved with Mapper().get_by_name(None)

@0xkarmacoma 0xkarmacoma marked this pull request as ready for review August 1, 2024 01:44
@0xkarmacoma 0xkarmacoma merged commit 954ed59 into main Aug 1, 2024
33 checks passed
@0xkarmacoma 0xkarmacoma deleted the simplify-symbol-parsing branch August 1, 2024 22:37
@pillip
Copy link
Contributor

pillip commented Aug 3, 2024

@karmacoma-eth @daejunpark
Sorry for the late checking. I totally agree with the refactoring of parsing.
I think halmos.mapper is too good 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants