Vision: Multi-block Migrations #7911
Comments
One solution can be: do a runtime migration which changes the logic of the storage to:
once all values in the old storage and the new storage are identical, do another runtime migration which changes the logic of the storage to:
|
We can always do a lazy upgrade, i.e. only migrate storage upon access. When reading something, try the new storage first, and if the value does not exist there, try the old one and migrate it to the new one. After long enough, most of the old storage should have been moved to the new one, and then we can do a final migration to move the remaining entries. |
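As a rough illustration of this migrate-on-access pattern, here is a minimal sketch assuming a hypothetical pallet whose map is being re-encoded from a `u32` to a `u64` value. The storage names (`OldValues`, `NewValues`) and the `value_of` helper are made up for illustration and are not from this issue:

```rust
// Hypothetical pallet sketching the lazy (migrate-on-access) approach:
// reads try the new map first and fall back to the old one, moving the
// entry across on the way.
#[frame_support::pallet]
pub mod pallet {
    use frame_support::pallet_prelude::*;

    #[pallet::config]
    pub trait Config: frame_system::Config {}

    #[pallet::pallet]
    pub struct Pallet<T>(_);

    /// Pre-migration storage, kept only until every key has been drained.
    #[pallet::storage]
    pub type OldValues<T: Config> =
        StorageMap<_, Blake2_128Concat, T::AccountId, u32, OptionQuery>;

    /// Post-migration storage with the new value encoding.
    #[pallet::storage]
    pub type NewValues<T: Config> =
        StorageMap<_, Blake2_128Concat, T::AccountId, u64, OptionQuery>;

    impl<T: Config> Pallet<T> {
        /// All reads go through this helper: try the new storage first; if the
        /// key is still in the old storage, convert it, write it to the new
        /// storage, and remove the old entry.
        pub fn value_of(who: &T::AccountId) -> Option<u64> {
            if let Some(v) = NewValues::<T>::get(who) {
                return Some(v);
            }
            OldValues::<T>::take(who).map(|old| {
                let new = u64::from(old); // hypothetical conversion
                NewValues::<T>::insert(who, new);
                new
            })
        }
    }
}
```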
We know more efficient storage migration schemes, but nobody has clarified why we'd ever justify the implementation effort. I'll describe one fairly straightforward approach: We first require a storage checkpoint protocol where one passes the storage through the availability and approval system, like parathread blocks. In other words, one node erasure codes the storage, either all or only part, distributes the pieces among the validators, and then approval checkers reconstruct the storage and check its Merkle root. We require checkpointing for system parathreads and multiple relay chains too, so it is rather important. We do the migration for the relay chain by first having one node migrate and checkpoint the new storage, i.e. publishing the relay chain state in a series of parathread blocks. We then continue running the chain almost as usual, except we now track both the pre- and post-migration Merkle roots. All nodes validate either the pre- or post-migration Merkle root updates, with the checkpoint blocks' approval checkers initially validating post-migration updates while also spot checking the other. After we approve and finalize the checkpoint, all nodes download the checkpoint and then replay the intervening blocks upon the checkpointed state. We need parathreads, storage checkpoints, and this double hashing before this makes any sense, but after those this becomes straightforward. We avoid most nodes recomputing the whole storage since they obtain it from the checkpoint and replay a small-ish number of blocks. |
@burdges Your scheme sounds like a way to do relay chain migrations. |
Hey, is anyone still working on this? Due to the inactivity this issue has been automatically marked as stale. It will be closed if no further activity occurs. Thank you for your contributions. |
Still relevant |
The Moonbeam team has begun to explore a multi-block migration strategy in moonbeam-foundation/moonbeam#527. This design expresses migrations as batches, which allows a migration to be split across multiple blocks. This way the strict execution time limit can be met and the migration can still run to completion. But this multi-block migration support introduces an additional challenge: normal transactions must not be processed while the migration is ongoing, or else they may corrupt the state. Ideally, transactions would accumulate in the transaction pool during this period and be processed as soon as the migration is complete. The Drupal web CMS (and others) has a notion of "maintenance mode". In this mode, site visitors cannot interact with the site except to view public static files; no db reads or writes are permitted at all. Such a maintenance mode may make sense at the FRAME level. These Drupal docs mention that
Maintenance mode could also serve as an emergency stop button when a chain is being attacked or a bug exploited. |
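For context, here is a hedged sketch of what "migrations as batches" can look like at the FRAME level, continuing the hypothetical pallet from the earlier snippet. `MigrationInProgress`, `ITEMS_PER_BLOCK`, and the flat item budget are illustrative stand-ins; the actual Moonbeam design may differ, and a real implementation would meter weight rather than count items:

```rust
// Inside the same hypothetical pallet module as the earlier sketch.
use frame_support::pallet_prelude::*;
use frame_system::pallet_prelude::BlockNumberFor;

/// Set to `true` by the runtime upgrade that starts the migration and
/// cleared once the old map has been fully drained.
#[pallet::storage]
pub type MigrationInProgress<T: Config> = StorageValue<_, bool, ValueQuery>;

/// Per-block item budget; a stand-in for a proper weight budget.
const ITEMS_PER_BLOCK: usize = 100;

#[pallet::hooks]
impl<T: Config> Hooks<BlockNumberFor<T>> for Pallet<T> {
    fn on_initialize(_n: BlockNumberFor<T>) -> Weight {
        if !MigrationInProgress::<T>::get() {
            return Weight::zero();
        }
        // `drain` removes each old entry as it is yielded, so the next
        // block simply resumes with whatever is still left.
        let mut migrated: u64 = 0;
        for (who, old) in OldValues::<T>::drain().take(ITEMS_PER_BLOCK) {
            NewValues::<T>::insert(&who, u64::from(old));
            migrated += 1;
        }
        if (migrated as usize) < ITEMS_PER_BLOCK {
            // Fewer items than the budget means the old map is now empty.
            MigrationInProgress::<T>::put(false);
        }
        // A real implementation would return an accurately benchmarked weight.
        T::DbWeight::get().reads_writes(migrated + 1, 2 * migrated + 1)
    }
}
```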
I really like @xlc's idea of lazy storage migrations. These could also be run during the |
Yes, a lazy migration is not hard. All accesses to the storage are done through the type alias (as
Once all old keys have been removed, we can do another runtime upgrade which only uses the new storage. |
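A sketch of that final step, again reusing the hypothetical `OldValues`/`NewValues` pallet from the snippets above: a one-shot `OnRuntimeUpgrade` sweeps whatever the lazy read path has not yet touched, after which the fallback and the old storage item can be deleted in a later release.

```rust
use core::marker::PhantomData;
use frame_support::{traits::OnRuntimeUpgrade, weights::Weight};

/// Final cleanup after the lazy migration: move any remaining old entries
/// so the next runtime can use the new storage exclusively.
pub struct FinishLazyMigration<T>(PhantomData<T>);

impl<T: Config> OnRuntimeUpgrade for FinishLazyMigration<T> {
    fn on_runtime_upgrade() -> Weight {
        let mut count: u64 = 0;
        // By this point only stragglers should remain, so a single block
        // is assumed to be enough to drain them all.
        for (who, old) in OldValues::<T>::drain() {
            NewValues::<T>::insert(&who, u64::from(old));
            count += 1;
        }
        T::DbWeight::get().reads_writes(count, 2 * count)
    }
}
```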
Hi @JoshOrndorff! Is there a way the Unique team can contribute to this issue? |
Hi everyone! It seems to me that the lazy migration would have some issues, namely:
So, it seems to me that the Multi-Block Migration should run over a span of several consecutive blocks, with a guarantee that no transaction will be executed during the migration. I wrote a gist with thoughts on how we can implement the Multi-Block Migration and what it could look like. |
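One way to approach that guarantee, sketched under the same assumptions as the snippets above, is a runtime-level call filter that only lets block-authoring essentials through while `MigrationInProgress` is set. The variant names (`System`, `Timestamp`) and the exact set of calls that must stay allowed are hypothetical and depend on the concrete runtime; note also that a plain call filter rejects calls at dispatch time, so actually keeping them queued in the transaction pool would need extra handling on the pool side:

```rust
use frame_support::traits::Contains;

/// A "maintenance mode" filter wired into `frame_system::Config::BaseCallFilter`.
/// While the migration flag is set, only block-authoring essentials pass.
pub struct MaintenanceFilter;

impl Contains<RuntimeCall> for MaintenanceFilter {
    fn contains(call: &RuntimeCall) -> bool {
        // `MigrationInProgress` is the hypothetical flag from the pallet sketch.
        if !MigrationInProgress::<Runtime>::get() {
            return true;
        }
        // Keep inherents and system housekeeping working; everything else is
        // rejected while the migration runs.
        matches!(call, RuntimeCall::System(_) | RuntimeCall::Timestamp(_))
    }
}

// Hypothetical wiring in the runtime:
// impl frame_system::Config for Runtime {
//     type BaseCallFilter = MaintenanceFilter;
//     /* ... */
// }
```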
Closed in lieu of paritytech/polkadot-sdk#198 |
In situations where the block time is subject to a hard cap, a Substrate chain needs to be able to execute a(n expensive) storage migration over the course of more than one block. This will likely happen in the context of running as a parachain on Polkadot, as validators will enforce a maximum block time on parachain blocks.
This likely cannot be addressed by the existing task-scheduling means, as the blockchain is not operational while migrating. It should likely be "frozen" to avoid inconsistent states.