Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cache code size/hash in storage #893

Merged
1 change: 1 addition & 0 deletions Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

1 change: 1 addition & 0 deletions frame/evm/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -20,6 +20,7 @@ log = { workspace = true }
rlp = { workspace = true }
scale-codec = { package = "parity-scale-codec", workspace = true }
scale-info = { workspace = true }
sha3 = { version = "0.10", default-features = false }
# Substrate
frame-benchmarking = { workspace = true, optional = true }
frame-support = { workspace = true }
Expand Down
44 changes: 43 additions & 1 deletion frame/evm/src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -75,7 +75,8 @@ use frame_support::{
};
use frame_system::RawOrigin;
use impl_trait_for_tuples::impl_for_tuples;
use sp_core::{Hasher, H160, H256, U256};
use scale_info::TypeInfo;
use sp_core::{Decode, Encode, Hasher, H160, H256, U256};
use sp_runtime::{
traits::{BadOrigin, Saturating, UniqueSaturatedInto, Zero},
AccountId32, DispatchErrorWithPostInfo,
Expand Down Expand Up @@ -506,6 +507,11 @@ pub mod pallet {
#[pallet::getter(fn account_codes)]
pub type AccountCodes<T: Config> = StorageMap<_, Blake2_128Concat, H160, Vec<u8>, ValueQuery>;

#[pallet::storage]
#[pallet::getter(fn account_codes_metadata)]
Copy link
Collaborator

@koushiro koushiro Apr 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

better not to add pallet::getter syntax for new storages, because it will be deprecated by upstream.

pub type AccountCodesMetadata<T: Config> =
nanocryk marked this conversation as resolved.
Show resolved Hide resolved
StorageMap<_, Blake2_128Concat, H160, CodeMetadata, OptionQuery>;

#[pallet::storage]
#[pallet::getter(fn account_storages)]
pub type AccountStorages<T: Config> =
Expand All @@ -520,6 +526,23 @@ pub type BalanceOf<T> =
type NegativeImbalanceOf<C, T> =
<C as Currency<<T as frame_system::Config>::AccountId>>::NegativeImbalance;

#[derive(Clone, Copy, Eq, PartialEq, Encode, Decode, TypeInfo)]
pub struct CodeMetadata {
pub size: u64,
pub hash: H256,
}

impl CodeMetadata {
fn from_code(code: &[u8]) -> Self {
use sha3::Digest;

let size = code.len() as u64;
let hash = H256::from_slice(sha3::Keccak256::digest(&code).as_slice());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the account not have any code, we will compute a hash of an empty bytes vector at runtime, I wonder if it's the expected behavior.
If it's expected it's till unoptimized, it would be better to check if code.is_empty() and return an hardcoded constant in that case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the metadata should not even be computed and stored for an empty code, because it could be an address that later gets code deployed.


Self { size, hash }
}
}

pub trait EnsureAddressOrigin<OuterOrigin> {
/// Success return type.
type Success;
Expand Down Expand Up @@ -714,6 +737,7 @@ impl<T: Config> Pallet<T> {
}

<AccountCodes<T>>::remove(address);
<AccountCodesMetadata<T>>::remove(address);
#[allow(deprecated)]
let _ = <AccountStorages<T>>::remove_prefix(address, None);
}
Expand All @@ -730,6 +754,24 @@ impl<T: Config> Pallet<T> {
}

<AccountCodes<T>>::insert(address, code);

// Update metadata.
let meta = CodeMetadata::from_code(&code);
<AccountCodesMetadata<T>>::insert(address, meta);
}

/// Get the account metadata (hash and size) from storage if it exists,
/// or compute it from code and store it if it doesn't exist.
pub fn account_code_metadata(address: H160) -> CodeMetadata {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this function should return an option to express the case where the account not have any code.

if let Some(meta) = <AccountCodesMetadata<T>>::get(&address) {
return meta;
}

let code = <AccountCodes<T>>::get(&address);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems to me really unsafe, and not compatible with Weight v2, to force the reading of the code if the metadata are not set.

It would be better to return None and create a dedicated call for migration.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case we need to patch evm so that Handler::code_size / Handler::code_hash can return error. It's relatively trivial.

However, my view is that this is optional. If a chain is upgraded, then they should be recommended to "touch" all contracts upon upgrade (probably through a multi-call to ensure no other code are executed in between). If a chain starts with the new runtime, then the None path will never happen.

Copy link
Contributor

@librelois librelois Apr 1, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my view is that this is optional. If a chain is upgraded, then they should be recommended to "touch" all contracts upon upgrade (probably through a multi-call to ensure no other code are executed in between)

This is not possible for parachains, it will make a too big storage proof size. For decentralized parachains, metadata must be computed offchain and injected onchain by a referendum via governance, which requires the code to support hybrid state.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that data should be inserted via governance.

However in the meantime knowing the code hash and size is necessary and require to read the full code. Once it is computed it is better to store it than to have to compute it again the next time. Thus it seems better to not return an Option and keep the unhappy path, which will never be reached once all contracts have been migrated.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok

let meta = CodeMetadata::from_code(&code);

<AccountCodesMetadata<T>>::insert(address, meta);
meta
}

/// Get the account basic in EVM format.
Expand Down
8 changes: 8 additions & 0 deletions frame/evm/src/runner/stack.rs
Original file line number Diff line number Diff line change
Expand Up @@ -827,6 +827,14 @@ where
self.substate
.recursive_is_cold(&|a: &Accessed| a.accessed_storage.contains(&(address, key)))
}

fn code_size(&self, address: H160) -> U256 {
U256::from(<Pallet<T>>::account_code_metadata(address).size)
}

fn code_hash(&self, address: H160) -> H256 {
<Pallet<T>>::account_code_metadata(address).hash
}
}

#[cfg(feature = "forbid-evm-reentrancy")]
Expand Down