Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ADR 041: In-Place Store Migrations #8646

Merged
merged 26 commits into from
Mar 4, 2021
Merged
Show file tree
Hide file tree
Changes from 4 commits
Commits
Show all changes
26 commits
Select commit Hold shift + click to select a range
a23b5fa
Initial draft
amaury1093 Feb 17, 2021
6f69bc5
Draft
amaury1093 Feb 19, 2021
9f24445
Add x/upgrade stuff
amaury1093 Feb 19, 2021
f153843
Tweaks
amaury1093 Feb 19, 2021
6993093
Merge branch 'master' into am/adr-in-place-migrations
jgimeno Mar 3, 2021
0eea75b
Update docs/architecture/adr-041-in-place-store-migrations.md
amaury1093 Mar 4, 2021
151149b
Reviews
amaury1093 Mar 4, 2021
01c3507
Merge branch 'am/adr-in-place-migrations' of ssh://github.com/cosmos/…
amaury1093 Mar 4, 2021
6dc29f9
Use migrator
amaury1093 Mar 4, 2021
94c0475
Update docs/architecture/adr-041-in-place-store-migrations.md
amaury1093 Mar 4, 2021
2f7fede
Update docs/architecture/adr-041-in-place-store-migrations.md
amaury1093 Mar 4, 2021
8494883
Update docs/architecture/adr-041-in-place-store-migrations.md
amaury1093 Mar 4, 2021
aaa991b
Update docs/architecture/adr-041-in-place-store-migrations.md
amaury1093 Mar 4, 2021
3579a10
Update docs/architecture/adr-041-in-place-store-migrations.md
amaury1093 Mar 4, 2021
865713e
Update docs/architecture/adr-041-in-place-store-migrations.md
amaury1093 Mar 4, 2021
cb361b7
Update docs/architecture/adr-041-in-place-store-migrations.md
amaury1093 Mar 4, 2021
1f1d4ce
Update docs/architecture/adr-041-in-place-store-migrations.md
amaury1093 Mar 4, 2021
cbda00b
Merge branch 'am/adr-in-place-migrations' of ssh://github.com/cosmos/…
amaury1093 Mar 4, 2021
31357e6
More fixes
amaury1093 Mar 4, 2021
0fc97c5
Add grpc, use functions
amaury1093 Mar 4, 2021
53a13ab
Add special case with 0 version
amaury1093 Mar 4, 2021
9e07b4d
Update docs/architecture/adr-041-in-place-store-migrations.md
amaury1093 Mar 4, 2021
41386ca
Update docs/architecture/adr-041-in-place-store-migrations.md
amaury1093 Mar 4, 2021
606aec1
Update docs/architecture/adr-041-in-place-store-migrations.md
amaury1093 Mar 4, 2021
633d07b
Remove useless err return
amaury1093 Mar 4, 2021
dd2b62c
Merge branch 'master' into am/adr-in-place-migrations
mergify[bot] Mar 4, 2021
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
174 changes: 174 additions & 0 deletions docs/architecture/adr-041-in-place-store-migrations.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,174 @@
# ADR 041: In-Place Store Migrations

## Changelog

- 17.02.2021: Initial Draft

## Status

Proposed
amaury1093 marked this conversation as resolved.
Show resolved Hide resolved

## Abstract

This ADR introduces a mechanism to perform in-place store migrations during chain upgrades.
amaury1093 marked this conversation as resolved.
Show resolved Hide resolved

## Context

When a chain upgrade introduces state-breaking changes inside modules, the current procedure consists of exporting the whole state into a JSON file (via the `simd export` command), running migration scripts on the JSON file (`simd migrate` command), clearing the stores (`simd unsafe-reset-all` command), and starting a new chain with the migrated JSON file as new genesis (optionally with a custom initial block height). An example of such a procedure can be seen [in the Cosmos Hub 3->4 migration guide](https://github.com/cosmos/gaia/blob/v4.0.3/docs/migration/cosmoshub-3.md#upgrade-procedure).

While [Cosmovisor](https://github.com/cosmos/cosmos-sdk/tree/v0.41.1/cosmovisor) aims to alleviate the difficulty of handling upgrades, this procedure is still cumbersome for multiple reasons:
amaury1093 marked this conversation as resolved.
Show resolved Hide resolved

- The procedure takes time. It can take hours to run the `export` command, plus some additional hours to run `InitChain` on the fresh chain using the migrated JSON.
- The exported JSON file can be heavy (~100MB-1GB), making it difficult to view, edit and transfer, which in turn introduces additional work to solve these problems (such as [streaming genesis](https://github.com/cosmos/cosmos-sdk/issues/6936)).

## Decision

We propose a migration procedure based on modifying the KV store in-place. This procedure does not manipulate intermediary JSON files.
amaury1093 marked this conversation as resolved.
Show resolved Hide resolved

### Module `ConsensusVersion`

We introduce a new method on the `AppModule` interface:

```go
type AppModule interface {
// --snip--
ConsensusVersion() uint64
}
```

This methods returns an `uint64` which serves as state-breaking versioning of the module. It MUST be incremented on each consensus-breaking change introduced by the module. To avoid potential errors with default values, the initial version of a module MUST be set to 1. In the SDK, version 1 corresponds to the modules in the v0.41 series.
amaury1093 marked this conversation as resolved.
Show resolved Hide resolved

### Module-Specific Migration Scripts
amaury1093 marked this conversation as resolved.
Show resolved Hide resolved

For each consensus-breaking change introduced by the module, a migration script from ConsensusVersion `N` to version `N+1` MUST be registered in the `Configurator` using its newly-added `RegisterMigration` method. All modules receive a reference to the configurator in their `RegisterServices` method on `AppModule`, and this is where the migration scripts should be registered.

```go
func (am AppModule) RegisterServices(cfg module.Configurator) {
amaury1093 marked this conversation as resolved.
Show resolved Hide resolved
// --snip--
cfg.RegisterMigration(types.ModuleName, 1, func(ctx sdk.Context) error {
// Perform in-place store migrations from ConsensusVersion 1 to 2.
})
cfg.RegisterMigration(types.ModuleName, 2, func(ctx sdk.Context) error {
// Perform in-place store migrations from ConsensusVersion 2 to 3.
})
amaury1093 marked this conversation as resolved.
Show resolved Hide resolved
// etc.
}
```

For example, if the current ConsensusVersion of a module is `N` , then `N-1` migration scripts MUST be registered in the configurator.
amaury1093 marked this conversation as resolved.
Show resolved Hide resolved

In the SDK, the migration scripts are handled by each module's keeper, because the keeper holds the `sdk.StoreKey` used to perform in-place store migrations. A `MigrationKeeper` interface is implemented by each keeper:

```go
// MigrationKeeper is an interface that the keeper implements for handling
// in-place store migrations.
type MigrationKeeper interface {
amaury1093 marked this conversation as resolved.
Show resolved Hide resolved
// Migrate1 migrates the store from version 1 to 2.
Migrate1(ctx sdk.Context) error
alessio marked this conversation as resolved.
Show resolved Hide resolved

// ...Add more MigrateN scripts here.
}
```

Since migration scripts manipulate legacy code, they should live inside the `legacy/` folder of each module, and be called by the keeper's implementation of `MigrationKeeper`.
amaury1093 marked this conversation as resolved.
Show resolved Hide resolved

```go
func (keeper BankKeeper) Migrate1(ctx sdk.Context) error {
return v042bank.MigrateStore(ctx, keeper.storeKey) // v042bank is package `x/bank/legacy/v042`.
}
```

Each module's migration scripts are specific to the module's store evolutions, and are not described in this ADR. An example of x/bank store key migrations following the introduction of ADR-028 length-prefixed addresses can be seen [here](https://github.com/cosmos/cosmos-sdk/blob/ef8dabcf0f2ecaf26db1c6c6d5922e9399458bb3/x/bank/legacy/v042/store.go#L15).

### Tracking Module Versions in `x/upgrade`

We introduce a new prefix store in `x/upgrade`'s store. This store will track each module's current version, it can be modelized as a `map[string]uint64` of module name to module ConsensusVersion, and will be used when running the migrations (see next section for details). The key prefix used is `0x1`, and the key/value format is:

```
0x1 | {module_name_bytes} => LittleEndian(module_consensus_version)
amaury1093 marked this conversation as resolved.
Show resolved Hide resolved
```

We add a new parameter `ModuleManager` `x/upgrade`'s `NewKeeper` constructor, where ModuleManager is:
amaury1093 marked this conversation as resolved.
Show resolved Hide resolved
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why ModuleManager goes to the x/upgrade.Keeper constructor? I thought that the x/upgrade.QueryServer will provide that functionality. In that case we would simply query the x/upgrade and as for old versions.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To query the current versions from loaded modules

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated this a little bit: we unfortunately cannot add it in the constructor, because of a definition cycle in app.go: UpgradeKeeper needs ModuleManager which needs UpgradeKeeper. The way proposed here is to add a SetVersionManager setter on UpgradeKeeper.


```go
type ModuleManager interface {
GetConsensusVersions() MigrationMap
}

// Map of module name => ConsensusVersion.
type MigrationMap map[string]uint64
amaury1093 marked this conversation as resolved.
Show resolved Hide resolved
```

When `InitChain` is run, the initial `ModuleManager.GetConsensusVersions()` value will be recorded to state. The UpgradeHandler signature needs to be updated to take a `MigrationMap`, as well as return an error:

```diff
- type UpgradeHandler func(ctx sdk.Context, plan Plan)
+ type UpgradeHandler func(ctx sdk.Context, plan Plan, migrationMap MigrationMap) error
```

Applying an upgrade will fetch the `MigrationMap` from the `x/upgrade` store and pass it into the handler.
amaury1093 marked this conversation as resolved.
Show resolved Hide resolved

```diff
func (k UpgradeKeeper) ApplyUpgrade(ctx sdk.Context, plan types.Plan) {
// --snip--
- handler(ctx, plan)
+ handler(ctx, plan, k.GetConsensusVersions()) // k.GetConsensusVersions() fetches the MigrationMap stored in state.
}
```

### Running Migrations

Once all the migration handlers are registered inside the configurator (which happens at startup), running migrations can happen by calling the `RunMigrations` method on `module.Manager`. This function will loop through all modules, and for each module:

- Fetch the old ConsensusVersion of the module from its `migrationMap` argument (let's call it `M`).
amaury1093 marked this conversation as resolved.
Show resolved Hide resolved
- Fetch the new ConsensusVersion of the module from the `ConsensusVersion()` method on `AppModule` (call it `N`).
- If `N>M`, run all registered migrations for the module sequentially `M -> M+1 -> M+2...` until `N`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's explain that if there is no old ConsensusVersion, no migration is run but the version is set in the new migration map. i.e. if we add module foo to our chain at version 5, no migrations will be run but 5 will be saved as the current version after migrations are run.

Copy link
Contributor Author

@amaury1093 amaury1093 Mar 4, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. The upgradeKeeper.SetCurrentConsensusVersions() method to store the current module versions is run right after the upgrade handler.


If a required migration is missing (e.g. if it has not been registered in the `Configurator`), then the `RunMigrations` function will error.

In practice, the `RunMigrations` method should be called from inside an `UpgradeHandler`.

```go
app.UpgradeKeeper.SetUpgradeHandler("my-plan", func(ctx sdk.Context, plan upgradetypes.Plan, migrationMap MigrationMap) {
err := app.mm.RunMigrations(ctx, migrationMap)
if err != nil {
amaury1093 marked this conversation as resolved.
Show resolved Hide resolved
panic(err)
}
amaury1093 marked this conversation as resolved.
Show resolved Hide resolved
})
```

Assuming a chain upgrades at block `N`, the procedure should run as follows:
amaury1093 marked this conversation as resolved.
Show resolved Hide resolved

- the old binary will halt in `BeginBlock` at block `N`. In its store, the ConsensusVersions of the old binary's modules are stored.
amaury1093 marked this conversation as resolved.
Show resolved Hide resolved
- the new binary will start at block `N`. The UpgradeHandler is set in the new binary, so will run at `BeginBlock` of the new binary. Inside `x/upgrade`'s `ApplyUpgrade`, the `MigrationMap` will be retrieved from the (old binary's) store, and passed into the `RunMigrations` functon, migrating all module stores in-place before the modules' own `BeginBlock`s.

## Consequences

### Backwards Compatibility

This ADR introduces a new method `ConsensusVersion()` on `AppModule`, which all modules need to implement. It also alters the UpgradeHandler function signature. As such, it is not backwards-compatible.

While modules MUST register their migration scripts when bumping ConsensusVersions, running those scripts using an upgrade handler is optional. An application may perfectly well decide to not call the `RunMigrations` inside its upgrade handler, and continue using the legacy JSON migration path.

### Positive

- Perform chain upgrades without manipulating JSON files.
- While no benchmark has been made yet, it is probable that in-place store migrations will take less time than JSON migrations. The main reason supporting this claim is that both the `simd export` command on the old binary and the `InitChain` function on the new binary will be skipped.

### Negative

- Module developers MUST correctly track consensus-breaking changes in their modules. If a consensus-breaking change is introduced in a module without its corresponding `ConsensusVersion()` bump, then the `RunMigrations` function won't detect the migration, and the chain upgrade might be unsuccessful. Documentation should clearly reflect this.

### Neutral

- Currently, Cosmovisor only handles JSON migrations. Its code should be updated to support in-place store migrations too.
amaury1093 marked this conversation as resolved.
Show resolved Hide resolved
- The SDK will continue to support JSON migrations via the existing `simd export` and `simd migrate` commands.
- The current ADR does not allow creating, renaming or deleting stores, only modifying existing store keys and values. The SDK already has the `StoreLoader` for those operations.

## Further Discussions

## References

- Initial discussion: https://github.com/cosmos/cosmos-sdk/discussions/8429
- Implementation of `ConsensusVersion` and `RunMigrations`: https://github.com/cosmos/cosmos-sdk/pull/8485
- Issue discussing `x/upgrade` design: https://github.com/cosmos/cosmos-sdk/issues/8514
5 changes: 0 additions & 5 deletions docs/architecture/readme.md~origin_master-docs

This file was deleted.