Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(cheatcodes): Record Account and Storage Access Cheatcodes #6310

Merged
merged 19 commits into from
Nov 17, 2023

Conversation

Inphi
Copy link
Contributor

@Inphi Inphi commented Nov 14, 2023

Motivation

This follows up the great work @refcell and @jameswenzel have done on #6087. It implements a record and storage access cheatcode interface based on #6125.

Solution

#6125 goes over the details of the implementation. However, the interface proposed has been tweaked to the following:

enum AccountAccessKind {
  Call, CallReturn, CallCode, StatiCall,
  Create,
  SelfDestruct,
  Resume,
}

struct ChainInfo { uint256 forkId; uint256 chainId; }
 
struct AccountAccess {
    ChainInfo chainInfo;
    AccountAccessKind kind;
    address account;
    address accessor;
    bool initialized;
    uint256 oldBalance;
    uint256 newBalance;
    bytes deployedCode;
    uint256 value;
    bytes data;
    bool reverted;
    StorageAccess[] storageAccesses;
}

struct StorageAccess {
    address account;
    bytes32 slot;
    bool isWrite;
    bytes32 previousValue;
    bytes32 newValue;
    bool reverted;
}

function startStateDiffRecording() external;
function stopAndReturnStateDiff() external returns (AccountAccess[] memory accesses);

Once cheats.startStateDiffRecording() is called, all state and account accesses will be recording starting from the next context, rather than the current one. For example, given the following code snippet:

function run() {
  B b = new B();
  cheats.startStateDiffRecording();
  A a = new A(b);
  a.foo();
}

contract A {
  B _b;
  constructor(B b) { b = b }
  function foo() external { _b.foo(); }
}

The recorded accesses will contain the following:

  • CREATE opcode on A
  • CALL on a.foo
  • CALL on b.foo

Which excludes the CREATE context for B's ctor because no AccountAccess record exists for the current context at the time recording was enabled. Without keeping track of all AccountAccess records even when recording is disabled, we cannot emit a well-defined AccountAccess for B's ctor.

Control-Flow Linking

A context being recorded may be temporarily switched via sub-calls. Once control flow is returned to that context, it's important to maintain an ordering of subsequent accesses. This is accomplished through the Resume access kind. A Resume AccountAccess record contains storage accesses that occur between context switches. For example, given the following:

function run() {
  cheats.startStateDiffRecording();
  a.foo();  
}

contract A {
  function foo() external {
    assembly { sstore(0, 0) }
    this.bar(); //
    assembly { sstore(1, 0) }
  }
  function bar() external {}
} 

We record the following:

  • AccountAccess for the A.foo Call
    • StorageAccess for slot 0 store
  • AccountAccess for the A.bar Call
  • AccountAccess for the A.foo Return
    • StorageAccess for slot 1 store

Here, the Return allows users determine the context where the storage access occurred. The Return account access is only created if there were subsequent storage accesses made after a context switch. This interface only aims to provide enough information to track state changes, rather than offering a comprehensive control flow trace.

@Inphi Inphi force-pushed the inphi/cheat-records branch 2 times, most recently from c217748 to bdfe4ed Compare November 14, 2023 16:02
Copy link
Collaborator

@mds1 mds1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't reviewed the implementation, but have some questions around UX and behavior. Overall looks great though agree with the direction!

crates/cheatcodes/defs/src/vm.rs Outdated Show resolved Hide resolved
crates/cheatcodes/defs/src/vm.rs Outdated Show resolved Hide resolved
crates/cheatcodes/defs/src/vm.rs Outdated Show resolved Hide resolved
crates/cheatcodes/defs/src/vm.rs Outdated Show resolved Hide resolved
crates/cheatcodes/defs/src/vm.rs Outdated Show resolved Hide resolved
crates/cheatcodes/defs/src/vm.rs Outdated Show resolved Hide resolved
crates/cheatcodes/defs/src/vm.rs Outdated Show resolved Hide resolved
@@ -84,6 +86,9 @@ record()
accesses(address)(bytes32[], bytes32[])
skip(bool)

recordStateDiff()
getStateDiff()(AccountAccess[])
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's say my test calls A.foo() where foo calls another contract B, so the full flow is:

  • A calls B
    • B reads slot 0
    • B calls C
      • C reads slot 3
    • B has control again and reads slot 5

Is the returned array ordered like this:

  • AccountAccess.A
  • AccountAccess.B
    • StorageAccess in B, slot 0
  • AccountAccess.C
    • StorageAccess in C, slot 3
  • AccountAccess.B (I guess technically it was accessed again when re-gaining control flow? If so perhaps we need a new access kind)
    • StorageAccess in B, slot 5

Copy link
Contributor Author

@Inphi Inphi Nov 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Depends on where the record was initiated. If you start recording just prior to A.foo(), then there wouldn't be any AccountAccess.A since A wasn't accessed. The first AccountAccess would be for B, with A as its accessor. So the returned array would be as you say, without the AccountAccess.A.

Yeah, I'm starting to agree with you on a new AccessKind needed for this return control flow. We do need to maintain the "chronological" order so StorageAccess C, slot 3 relates with StorageAccess B, slot 5. But it may not be immediately to users that the latter storage access isn't from C's context.

OTOH, this kinda defeats the purpose of having a storageAccesses field for AccountAccess if we need a new kind to handle this case. I'm not exactly sure the right approach should be. But I lean towards completely decoupling storage accesses from account accesses, while maintaining ordering. So you have a the following state diff array:

  • AccountAccess.B
  • StorageAccess.B, slot 0
  • AccountAccess.C
  • StorageAccess.C, slot 3
  • StorageAccess.B, slot 5

The main issue with this approach is that the user cannot tell the context StorageAccess.B, slot 5 is for. This should be evident from the StorageAccess.account field, but perhaps in cases like delegatecall it's a bit blurry.
This API depends on whether users need to know exactly which context an account access occurred.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Depends on where the record was initiated. If you start recording just prior to A.foo(), then there wouldn't be any AccountAccess.A since A wasn't accessed. The first AccountAccess would be for B, with A as its accessor. So the returned array would be as you say, without the AccountAccess.A.

Interesting, I'd expect this to have resulted in AccountAccess.A entry:

contract MyTest is Test {
  function testAccesses() external {
    vm.recordStateDiff();
    a.foo(); // `foo` now calls `b`
    AccountAccess[] memory accesses = vm.getStateDiff();
  }
}

And I think it should, because users will want the storage accesses that took place in the a.foo() call also


Hmm, yea this is tricky. A "return control flow" seems ok but feels hacky. And I don't think Solidity is flexible enough to allow the cheat to return a flat array of (AccountAccess | StorageAccess)[] so the UX there may be clunky because I think you'd need it return a struct of:

struct AccountOrStorageAccess {
  AccountAccess accountAccess; // empty if it was a storage access
  StorageAccess storageAccess; // empty if it was an account access
  bool isAccountAccess;
  bool isStorageAccess;
}

Copy link
Contributor Author

@Inphi Inphi Nov 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me clarify how recording occurs. In your code snippet, the a.foo() CALL will be recorded. What I was trying to explain is that state accesses in the same context a record begins are not recorded. That is the sstore in the following example won't be recorded:

contract MyTest is Test {
  function testAccesses() external {
    vm.recordStateDiff();
    assembly { sstore(0x0, 0x0) }
    a.foo(); // `foo` now calls `b`
    AccountAccess[] memory accesses = vm.getStateDiff();
  }
}

The a.foo() call generates an AccountAccess record for only the b contract, but not a since that's the only account being accessed there. So we have the following record:

AccountAccess {
  accessor: a,
  account: b,
  data: abi.encodeCall(B.foo,()),
  kind: CALL,
  ...
}

I guess, if there is a value transfer then it should generate two records for the credit and debit changes. Though this seems redundant as we'll need to represent this with new type of account access or even a new Access type struct.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, I do think that a.foo() call should capture 2 account accesses—one for A and one for B. One way to think about this is that if a has storage writes those should be captured, and you can't have a storage write without first having an account access.

I do agree that sstore should not be captured though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In your example, is the a.foo() an actual CALL, i.e. calling an external A.foo function for example? If so, then I think we're on the same page. If it's just a jump, then that kinda breaks the CALL kind concept we have going.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh sorry it was a CALL, I see how that is unclear

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just noting for completeness here that we discussed offline and agreed on adding a synthetic Resume access type to handle the "control flow returned to caller" access kind

crates/abi/abi/HEVM.sol Outdated Show resolved Hide resolved
@Inphi Inphi force-pushed the inphi/cheat-records branch 2 times, most recently from 1da7115 to aebbfa3 Compare November 15, 2023 20:43
@Inphi Inphi marked this pull request as ready for review November 16, 2023 04:47
Co-authored-by: refcell.eth <abigger87@gmail.com>
crates/abi/abi/HEVM.sol Outdated Show resolved Hide resolved
crates/cheatcodes/spec/src/vm.rs Outdated Show resolved Hide resolved
Copy link
Contributor

@refcell refcell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me now, really nice work @Inphi

Copy link
Member

@mattsse mattsse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some style nits,
defer to @mds1 and @DaniPopes

crates/cheatcodes/src/inspector.rs Outdated Show resolved Hide resolved
crates/cheatcodes/src/inspector.rs Outdated Show resolved Hide resolved
crates/cheatcodes/src/inspector.rs Outdated Show resolved Hide resolved
crates/cheatcodes/src/inspector.rs Outdated Show resolved Hide resolved
crates/cheatcodes/src/inspector.rs Outdated Show resolved Hide resolved
crates/cheatcodes/src/inspector.rs Outdated Show resolved Hide resolved
crates/cheatcodes/src/inspector.rs Outdated Show resolved Hide resolved
@mds1 mds1 mentioned this pull request Nov 16, 2023
@@ -5,6 +5,8 @@ struct DirEntry { string errorMessage; string path; uint64 depth; bool isDir; bo
struct FsMetadata { bool isDir; bool isSymlink; uint256 length; bool readOnly; uint256 modified; uint256 accessed; uint256 created; }
struct Wallet { address addr; uint256 publicKeyX; uint256 publicKeyY; uint256 privateKey; }
struct FfiResult { int32 exitCode; bytes stdout; bytes stderr; }
struct AccountAccess { address accessor; address account; uint256 kind; bool initialized; uint256 oldBalance; uint256 newBalance; bytes deployedCode; uint256 value; bytes data; bool reverted; StorageAccess[] storageAccesses; }
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we include the enum definition in this file and that instead of uint256 kind?

Copy link
Contributor Author

@Inphi Inphi Nov 16, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup will do. Though currently running into an abigen issue where it fails to generate proper bindings for an AccountAccess with too many fields.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah interesting, should be ok if we can't use the enum here since we can still use in Vm.sol in forge-std, so will defer to you + @mattsse / @Evalir here about the abigen issue

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I got it working by replacing both chainId and forkId with a struct h/t @refcell . Good to do anyways if we'll be adding this info to the rest of the cheatcode interface.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't worry about it, crates/abi is deprecated and will be removed soon. It's currently only used for trace/log decoding

crates/cheatcodes/src/inspector.rs Outdated Show resolved Hide resolved
crates/cheatcodes/src/inspector.rs Outdated Show resolved Hide resolved
Copy link
Member

@mattsse mattsse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to rename the variable

otherwise lgtm, pending @DaniPopes

I still prefer ref mut because it makes it instantly clear to me when looking at the Some that this is a &mut without checking the rhs
but I don't mind &mut

crates/cheatcodes/src/inspector.rs Outdated Show resolved Hide resolved
crates/cheatcodes/src/inspector.rs Outdated Show resolved Hide resolved
Copy link
Member

@DaniPopes DaniPopes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@mattsse
Copy link
Member

mattsse commented Nov 17, 2023

gg
sending it

@mattsse mattsse merged commit c948388 into foundry-rs:master Nov 17, 2023
19 checks passed
@refcell
Copy link
Contributor

refcell commented Nov 17, 2023

Wooot thank you @mattsse @DaniPopes 👑 👑 👑 👑

@mds1
Copy link
Collaborator

mds1 commented Nov 17, 2023

@Inphi Can you PR the cheats into forge-std's VmSafe in Vm.Sol, and the book also?

@Inphi
Copy link
Contributor Author

Inphi commented Nov 17, 2023

@Inphi Can you PR the cheats into forge-std's VmSafe in Vm.Sol, and the book also?

yup. Will do

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants