Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement storage (revision 2) module #311

Merged
merged 558 commits into from
May 20, 2020
Merged

Implement storage (revision 2) module #311

merged 558 commits into from
May 20, 2020

Conversation

Robbepop
Copy link
Collaborator

@Robbepop Robbepop commented Jan 14, 2020

New Contract Initialization & Finalization

Technical ToDo List

  • Add unit tests for SpreadLayout & PackedLayout impls of Rust primitives.
  • Normalize trait bounds for some storage abstractions.
  • Add expect for some iterator impls of storage collections.
  • Decide about usage of storage::Pack in storage2::stash::Entry internally.
    • We decided to use storage::Pack in the internals of data structures and disallow types T that cannot be put into such a storage::Pack. Users can opt-into using storage2::Box to be able to pack otherwise non-package types.
  • Add storage2::Bitvec: required for dynamic storage allocator.
  • Dynamic storage allocator.
  • Make storage2::Stash generic over a symbol type.
    • We opted not to do this for the sake of keeping generated Wasm file sizes low since this could explode in the face of monomorphization.
  • Implement missing non-storage traits for storage2::HashMap.
  • Write unit tests for storage2::collections::HashMap.
  • Write more unit tests for dynamic storage allocator.
  • Write more unit tests for storage2::collections::Bitvec.
  • Write SpreadLayout unit tests for storage collections:
    • storage::Vec
    • storage::SmallVec
    • storage::Stash
    • storage::HashMap
    • storage::Box
    • storage::Bitvec
  • Add unit tests for some storage abstractions:
    • storage::Pack
    • storage::Memory
    • storage::Box
  • Write unit tests for lazy abstractions.
    • LazyCell
    • LazyArray
    • LazyIndexMap
    • LazyHashMap
  • Make storage collections work on PackedLayout elements
    • Vec
    • SmallVec
    • Stash - still uses Pack<Entry<T>> internally instead of Entry<T>
    • HashMap - still uses Pack<ValueEntry<T>> internally instead of ValueEntry<T>
    • Bitvec - still uses Pack<Bits256> instead of Bits256 internally
  • Implement Drop for storage data structures to clean up their associated storage.
    • Box
    • Vec
    • Stash
    • SmallVec
    • HashMap
  • Improve interfaces so that it is possible to avoid the extraneous storage read in clear_packed_at implementations of some lazy data structures.
    • storage2::LazyArray
    • storage2::LazyIndexMap
    • storage2::LazyHashMap
    • Implement the trait setting for all built-in types.

Follow-up Work Items

  • Try to remove the code duplication in LazyIndexMap and LazyHashMap.
  • Make SpreadLayout::pull_spread and utility functions such as pull_spread_root return Result instead of panicking.
  • Implement storage::HashMap entry API
  • Adjust ink_lang
    • Introduce new #[ink(constructor)] syntax and semantics to adjust to the changes.
    • Adjust test cases for new #[ink(constructor)] syntax and semantics
  • Improve ergonomics for storage trait implementations.
    • Forward as packed should be simple to implement. (e.g. always packed)
  • Adjust example contracts for the changes.
  • Remove old storage module and replace all usage by the new storage2 module.
  • Smooth edges of initial dynamic storage allocator implementation.
  • Move storage2::collections::boxed into storage2::alloc.
  • Add storage2::Stash::take_drop.
    • Make use of storage2::Stash::take_drop in storage2::HashMap::remove
  • Implement utility storage abstraction Trie that uses storage subtries to pack storage entities.
  • Implement missing high-level storage data structures:
    • VecDeque: Storage data structure with an API similar to Rust's VecDeque.
    • BTreeMap: Re-implementation of old storage::BTreeMap.
    • BinaryHeap: Re-implementation of old storage::BinaryHeap.
  • Add PushOnDrop abstraction
    • Make use of PushOnDrop for dynamic storage allocator.
    • Rough implementation
      pub struct PushOnDrop<T>
      where
          T: SpreadLayout,
      {
          at: Key,
          what: T,
      }
      
      impl<T> Drop for PushOnDrop<T>
      where
          T: SpreadLayout,
      {
          fn drop(&mut self) {
              let prevent_drop = ManuallyDrop::new(self.what);
              push_spread_root::<T>(&*prevent_drop, &self.at);
          }
      }
  • Add storage2::Memory<T> to specify entities that are never synchronized with the contract storage. This is useful for intermediate state between messages.
  • Implement Debug impls that can properly handle the UnsafeCell field and displays the cache below.
  • Improve API of lazy::Entry, e.g. some methods should be &self instead of &mut self etc.
  • Add prefix bytes to storage::HashMap::to_offset_key key calculation.
  • Turn the data structure underlying to the DynamicAllocator for storage into a "generic" data structure usable by other ink! abstractions.
    • Find a proper name for it.
      • BitStash maybe?
  • Improve efficiency of finding the first free slot in the dynamic storage allocator. We can do this by using the information in the full mask of a 256-bit block instead of iterating over the u8 masks which should be much more efficient.
  • Implement more operations on storage collections:
    • storage::Vec
      • clear(&mut self): Empties the storage vector. More efficient than doing it manually. Might be useful for small storage vector instances.
      • set(&mut self, index: Index, new_value: T): Sets the value of an indexed elements without loading and returning the old value which is more efficient. There currently is no alternative to that.
  • Write proc. macros for deriving the new storage traits:
  • Bonus: Refactor storage::Key to be based on u128 instead of [u8; 32].
    • We do this because we hope that we can reduce some computational overhead by this in key offset calculations that have to be done always upon contract initialization or finalization.

Why?

Currently ink! has some major problems around contract initialization (contract instantiation), contract setup upon a call and finalization (contract tear-down after instantiation or call).

Contract instantiation and the setup phase upon calling a contract are treated the same even though they do completely different tasks and also have different assumptions about the state of the contract storage.

  • The contract instantiation has to assume that the contract storage is basically empty.
  • The setup phase (upon calling a contract) can actually assume a more or less properly setup contract storage and directly work on that.

Due to this major problem of unifying two concepts that should not have been unified many abstractions in ink! have developed in an incorrect way leading to bad UX and non optimal performance and memory usage characteristics.

New Design

This PR changes the way how contract instantiation and contract call setup phase are handled and adjusts the contract finalization.

So far #[ink(constructor)] messages always had to be &mut self and return nothing as if they were constructors of known object oriented languages. However, Rust constructors normally return Self and don't have a receiver. This design separated ink! from Rust quite a bit and made it awkward for Rustaceans to "unlearn" constructors.

The new design will treat ink! constructors just like Rust constructors. This has the consequence that you can no longer access the environment via self because there is no longer a self receiver in an #[ink(constructor)]. For that purpose we introduce another universal way of accessing the environment from within #[ink(constructor)] and #[ink(message)] functions via Self::env().

Example: ERC20 Constructor

OLD

#[ink(constructor)]
fn new(&mut self, initial_supply: Balance) {
    let caller = self.env().caller();
    self.total_supply.set(initial_supply);
    self.balances.insert(caller, initial_supply);
    self.env().emit_event(Transfer { from: None, to: Some(caller), value: initial_supply })

NEW

#[ink(constructor)]
fn new(initial_supply: Balance) -> Self {
    let caller = Self::env().caller();
    let mut balances = storage::BTreeMap::new();
    balances.insert(caller, initial_supply);
    Self::env().emit_event(Transfer { from: None, to: Some(caller), value: initial_supply });
    Self {
        total_supply: storage::Lazy::new(initial_supply),
        balances,
        allowances: storage::BTreeMap::new(),
    }
}

New Storage Primitives

So far ink! storage has been mainly managed by the following primitives:

  • storage::SyncCell: Low-level primitive to load and store a single cached cell of type T.
  • storage::SyncChunk: Low-level primitive to load and store a chunk of cached and cells of type T.
  • storage::Value: High-level primitive to manage a single T.
  • storage::Vec: High-level primitive to manage a vector of T.
    And other storage collections similar to storage::Vec that manage a whole set of storage entities each living in their own cell.

All of these primitives mainly have a single or multiple ink_primitives::Key instances to maintain a mapping to some concrete contract storage entries all the time. This was designed to allow to flush the contract storage at any point during contract execution which could be a valuable counter measurement against re-entrancy attacks upon calling another contract. However, this whole mechanic of enforcing all storage types to always have direct pointers into their mapped storage region has significant drawbacks. For once we cannot support Rust primitive types since they themselves cannot hold such a key.

The new storage abstraction data structures would no longer have their respective contract storage region encoded into themselves. Instead they rely on contract initialization and finalization routines to provide them with the correct mapping. So those mappings are now managed by the routines instead of by the data structures.

Furthermore this means that some already existing storage abstractions need adjustments, others might even be removed.

New Storage Types

The new storage types introduced by this PR design are the following:

Name Purpose
storage::Lazy<T> Allows to lazily load upon contract calls.
storage::Pack<T> Packs T into using just a single cell instead of distributing to as many cells as possible.
storage::Box<T> Stores an indirection to a dynamically allocated T on the storage.
storage::LazyMap<K, V> Low-level primitive and a generalization of the concept of a chunk by mapping from K instead of an index.

Removed or deprecated storage types are:

Name Reason
storage::SynCell<T> Replaced and generalized mainly by storage::Lazy<T>
storage::SynChunk<T> Replaced and generalized mainly by storage::LazyMap<T>
storage::Value<T> Used to represent a lazily loaded T, either use T or storage::Lazy<T>

The other storage collections will remain mostly the same but are having adjusted interfaces to meet the new requirements. E.g. they will have proper constructors, e.g. storage::Vec::new() and implement the new storage traits.

Example Use Cases

  1. You simply want to store and load a single i32.

    • You simply use i32 as type. It won't be lazily loaded but that wasn't a requirement.
  2. You want to have a lazily loadable i64 value.

    • You use storage::Lazy<i64>. It will load lazily but besides that behave the same as a i64.

Note that storage collections that manage multiple elements are always lazy.
Lazy loading storage primitives necessarily need to have a key somehow to lazily load from when needed.

Lazy

The structure of storage::Lazy<T> is roughly this:

enum Lazy<T> {
    Lazy(Key), // Has a single key when initialized as lazy.
    Loaded(T),  // Has the T when initialized by value or when already loaded.
}

However, due to the fact that we want to access the T of a lazy by reference and that the loading of a Lazy<T> should require a &self receiver instead of a &mut self receiver we need a few low-level adjustments to let that happen. Fore more details the reader should look into the source of the PR.

@Robbepop Robbepop changed the title [core] introduce singleton EnvInstance and make EnvAccessMut forward … Implement refactoring of contract initialization and finalization Jan 14, 2020
@Robbepop Robbepop changed the title Implement refactoring of contract initialization and finalization Implement new contract initialization and finalization Jan 14, 2020
@Robbepop Robbepop force-pushed the redo-init-and-flush branch from 6d24957 to 0f7b0e1 Compare February 4, 2020 19:13
core/src/storage/lazy.rs Outdated Show resolved Hide resolved
core/src/storage/lazy.rs Outdated Show resolved Hide resolved
@Robbepop Robbepop marked this pull request as ready for review February 19, 2020 14:28
Copy link
Collaborator

@ascjones ascjones left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WIP review (8/14 files).

core/src/storage/lazy/lazy_chunk.rs Outdated Show resolved Hide resolved
core/src/storage/traits/push.rs Outdated Show resolved Hide resolved
core/src/storage/lazy/lazy_chunk.rs Outdated Show resolved Hide resolved
core/src/storage/lazy/lazy_chunk.rs Outdated Show resolved Hide resolved
core/src/storage/lazy/lazy_chunk.rs Outdated Show resolved Hide resolved
core/src/storage/lazy/lazy_chunk.rs Outdated Show resolved Hide resolved
core/src/storage/lazy/lazy_chunk.rs Outdated Show resolved Hide resolved
core/src/storage/lazy/mod.rs Outdated Show resolved Hide resolved
core/src/storage/lazy/lazy_chunk.rs Outdated Show resolved Hide resolved
core/src/storage/lazy/lazy_chunk.rs Outdated Show resolved Hide resolved
core/src/storage/traits/pull.rs Outdated Show resolved Hide resolved
core/src/storage/traits/push.rs Outdated Show resolved Hide resolved
@athei athei linked an issue Feb 28, 2020 that may be closed by this pull request
11 tasks
@Robbepop Robbepop changed the title Implement new contract initialization and finalization Implement storage2 module Mar 27, 2020
@Robbepop Robbepop changed the title Implement storage2 module Implement storage (revision 2) module Mar 27, 2020
@use-ink use-ink deleted a comment from codecov-io May 7, 2020
Copy link
Collaborator

@ascjones ascjones left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

69/69 🌮

All in all LGTM.

There is a fair amount of "similar" code in collections, which makes sense but we must remember if we apply any bugfixes/optimizations to do so in other collections where there is a similar code block.

core/src/storage2/collections/hashmap/tests.rs Outdated Show resolved Hide resolved
core/src/storage2/collections/hashmap/mod.rs Outdated Show resolved Hide resolved
core/src/storage2/collections/hashmap/mod.rs Outdated Show resolved Hide resolved
Robbepop and others added 3 commits May 20, 2020 11:17
Co-authored-by: Andrew Jones <ascjones@gmail.com>
Co-authored-by: Andrew Jones <ascjones@gmail.com>
Co-authored-by: Andrew Jones <ascjones@gmail.com>
@ascjones
Copy link
Collaborator

Also I would suggest another pass from the unsafe gurus for just the unsafe blocks

Copy link
Collaborator

@ascjones ascjones left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

core/src/storage2/collections/hashmap/mod.rs Outdated Show resolved Hide resolved
core/src/storage2/collections/hashmap/mod.rs Outdated Show resolved Hide resolved
core/src/storage2/collections/hashmap/tests.rs Outdated Show resolved Hide resolved
Robbepop and others added 4 commits May 20, 2020 13:30
Co-authored-by: Andrew Jones <ascjones@gmail.com>
Co-authored-by: Andrew Jones <ascjones@gmail.com>
@Swader
Copy link

Swader commented May 20, 2020

Quick question here since it's merged but also mentioned in the v3.0 docs issue - is this going in already, or will it be part of the 3.0 release only?

@Robbepop
Copy link
Collaborator Author

Quick question here since it's merged but also mentioned in the v3.0 docs issue - is this going in already, or will it be part of the 3.0 release only?

Yes, this PR resembles parts of the upcoming ink! 3.0.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Refactor contract initialization and finalization
10 participants