Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

State Sync for V2 Store (ADR 40) #10705

Closed
wants to merge 49 commits into from

Conversation

gsk967
Copy link
Contributor

@gsk967 gsk967 commented Dec 8, 2021

Description

Snapshot Sync For New Node

Closes: #XXXX


Author Checklist

All items are required. Please add a note to the item if the item is not applicable and
please add links to any relevant follow up issues.

I have...

  • included the correct type prefix in the PR title
  • added ! to the type prefix if API or client breaking change
  • targeted the correct branch (see PR Targeting)
  • provided a link to the relevant issue or specification
  • followed the guidelines for building modules
  • included the necessary unit and integration tests
  • added a changelog entry to CHANGELOG.md
  • included comments for documenting Go code
  • updated the relevant documentation or specification
  • reviewed "Files changed" and left comments if necessary
  • confirmed all CI checks have passed

Reviewers Checklist

All items are required. Please add a note if the item is not applicable and please add
your handle next to the items reviewed if you only reviewed selected items.

I have...

  • confirmed the correct type prefix in the PR title
  • confirmed ! in the type prefix if API or client breaking change
  • confirmed all author checklist items have been addressed
  • reviewed state machine logic
  • reviewed API design and naming
  • reviewed documentation is accurate
  • reviewed tests and test coverage
  • manually tested (if applicable)

Copy link

@erikdies erikdies left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First pass notes, still needs a qualified technical review.

@@ -54,6 +54,7 @@ Ref: https://keepachangelog.com/en/1.0.0/
* [\#10379](https://github.com/cosmos/cosmos-sdk/pull/10379) Add validation to `x/upgrade` CLI `software-upgrade` command `--plan-info` value.
* [\#10561](https://github.com/cosmos/cosmos-sdk/pull/10561) Add configurable IAVL cache size to app.toml
* [\10507](https://github.com/cosmos/cosmos-sdk/pull/10507) Add middleware for tx priority.
* [\#10430](https://github.com/cosmos/cosmos-sdk/pull/10430) ADR-040: Add store/v2 `MultiStore` implementation
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a hash to L56 to make the markdown consistent.
* [\#10507](https://github.com/cosmos/cosmos-sdk/pull/10507) Add middleware for tx priority.

// (store/types).Iterator interface.
func DBToStoreIterator(source dbm.Iterator) *dbAsStoreIter {
ret := &dbAsStoreIter{Iterator: source}
ret.Next()
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why iterate here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the DB iterator Next and Valid were combined into Next() bool, so you have to call Next once to know if the iterator is valid

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. I don't know if this is us, but not a fan of that API; I would assume that we were throwing away the first value.

Copy link
Contributor

@roysc roysc Dec 9, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, not my idea. The motivation was to reduce the size of the Iterator interface I think. I should have pushed against this harder, all it's done is create useless boilerplate code

// SnapshotSchema is an exported schema of store
message SnapshotSchema{
bytes key = 1;
bytes type = 2;
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Newline

require.Panics(t, func() { store.Set([]byte(""), []byte("value")) }, "setting an empty key should panic")

require.Equal(t, types.StoreTypeDB, store.GetStoreType())
store.GetStoreType()
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this doing?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nothing apparently, i'll remove


var errFoo = errors.New("dummy")

func TestAccessors(t *testing.T) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test is long, can it be broken into more understandable parts?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's mainly long because of the mocking code. Other than that it's just simple CRUD function tests

onKey []byte
}

func (dbVersionsFails) Versions() (dbm.VersionSet, error) { return nil, errors.New("dbVersionsFails") }
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why no receiver? (Just a question - I've never seen a go method without a receiver)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an interface wrapper which just exists to trigger a failure, so it doesn't actually need to use the receiver object

return
}

func (ts *Store) SetPruning(types.PruningOptions) {}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it worth logging something here, letting the caller know they have invoked a noop?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably not a big deal - this is copied from the old version and shouldn't actually be called by anything. it's an extraneous part of the Committer store interface.

However, now that I think of it, we could probably remove the v2/mem and v2/transient stores altogether. They are only used as part of the root store now and just wrap the dbadapter.Store

func (cs *cacheStore) Write() {
for skey, sub := range cs.substores {
sub.Write()
delete(cs.substores, skey)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we deleting the key here?

Copy link
Contributor

@roysc roysc Dec 9, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking that cache stores are invalidated after writing, so the substores can be cleared. But I actually misunderstood this, it seems they can be reused. So we shouldn't delete here, I'll fix

string([]byte{0x01}): "1",
})

var testCase = func(t *testing.T, iter types.Iterator, expected []string) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this take a description?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could add it, but I'm not sure what a useful description would be here

opts.Pruning = types.PruneNothing

// Ensure Store's commit is rolled back in each failure case...
t.Run("recover after failed Commit", func(t *testing.T) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like more test breakdowns like this. These r.Runs are readable and descriptive.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok I can look at adding more

Copy link
Contributor

@roysc roysc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pretty good, mostly just need some fixes on the schema handling


// SnapshotSchema is an exported schema of store
message SnapshotSchema{
bytes key = 1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's probably better if we put the whole schema in one message. Also, we don't need to include the type - the only store type that forms part of the committed state is StoreTypePersistent (so it's the one only accessible for past heights). So we can assume all stores are this type when sending this and just send an array of bytes keys.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok i will change the SnapshotSchema to send all the keys as an array of bytes keys.

sKey := item.Schema.GetKey()
sValue := item.Schema.GetType()
if len(sValue) != 1 || !validSubStoreType(types.StoreType(sValue[0])) {
return sdkerrors.Wrap(err, fmt.Sprintf("invalid mapping for store key: %v => %v", item.Schema.Key, item.Schema.Type))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit - can use sKey instead of item.Schema.Key

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok


case *storetypes.SnapshotItem_Store:
// set the new snapshot store schema to root-store
rs.schema = ret.StoreSchema
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We also need to write the new schema to the DB as here - this can be done after the sync is finished.

Also, I think there should be a sanity check to make sure the schema message is received before any Store or KV messages.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@roysc we need to delete the old schema and write the new schema after the sync ? or just the write the new schema to prefixdb

Copy link
Contributor

@roysc roysc Dec 9, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should check at the beginning that the existing schema is empty, then just write the new one - meant to link to this line

func (rs *Store) Restore(height uint64, format uint32, chunks <-chan io.ReadCloser, ready chan<- struct{}) error {
if height == 0 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should add a check this is being run on an empty store, ie. there are no saved versions and the schema is empty.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about if we restore one version snapshot then we will receive new version snapshot right ?
so we need check only that version exists or not right ?
@roysc

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

snapshot should only be done for a new instance of an application to catch up the state - see here. So we don't need to worry about multiple versions being restored for now. Just need to check that the db's versions.Count() == 0 so we know it's a fresh store instance

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

}

for key := range vs.schema {
getSubstore, err := vs.getSubstore(key)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit, don't name the var as a verb - sstore or just sub is better

@roysc
Copy link
Contributor

roysc commented Dec 8, 2021

One more nit actually - can we move these functions to a new file, like root/snapshot.go?

@roysc
Copy link
Contributor

roysc commented Dec 8, 2021

Oh I see @erikdies reviewed a lot of stuff from #10430. On second thought @gsk967 I should have asked you to open this targeting my branch for that PR, so this only contains the diff between them. We can still change it, and change back master once mine is merged, so this is easier to read. Sorry about that.

@@ -190,6 +190,12 @@ func TestMultistoreSnapshotRestore(t *testing.T) {
require.NoError(t, err)
require.Equal(t, sourceSubStore, targetSubStore)
}

// restore checking for non-empty store
target2 := newMultiStoreWithBasicData(t, memdb.NewDB(), 0)
Copy link
Contributor

@roysc roysc Dec 14, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does the same as above. We need to check that the restore fails if there is an existing data or schema in the target.

It would also be good to check that it fails if there's an existing saved version, even if it contains no data

My mistake, this checks the existing-version case, just need one for the existing-schema case

@gsk967 gsk967 closed this Dec 16, 2021
@github-actions github-actions bot added C:CLI C:x/authz C:x/gov C:x/nft T: ADR An issue or PR relating to an architectural decision record T: CI and removed Type: Build labels Dec 16, 2021
mergify bot pushed a commit that referenced this pull request Mar 4, 2022
## Description

Closes: #10705 


State Sync for V2 Store (ADR-40) 
---

### Author Checklist

*All items are required. Please add a note to the item if the item is not applicable and
please add links to any relevant follow up issues.*

I have...

- [ ] included the correct [type prefix](https://github.com/commitizen/conventional-commit-types/blob/v3.0.0/index.json) in the PR title
- [ ] added `!` to the type prefix if API or client breaking change
- [ ] targeted the correct branch (see [PR Targeting](https://github.com/cosmos/cosmos-sdk/blob/master/CONTRIBUTING.md#pr-targeting))
- [ ] provided a link to the relevant issue or specification
- [ ] followed the guidelines for [building modules](https://github.com/cosmos/cosmos-sdk/blob/master/docs/building-modules)
- [ ] included the necessary unit and integration [tests](https://github.com/cosmos/cosmos-sdk/blob/master/CONTRIBUTING.md#testing)
- [ ] added a changelog entry to `CHANGELOG.md`
- [ ] included comments for [documenting Go code](https://blog.golang.org/godoc)
- [ ] updated the relevant documentation or specification
- [ ] reviewed "Files changed" and left comments if necessary
- [ ] confirmed all CI checks have passed

### Reviewers Checklist

*All items are required. Please add a note if the item is not applicable and please add
your handle next to the items reviewed if you only reviewed selected items.*

I have...

- [ ] confirmed the correct [type prefix](https://github.com/commitizen/conventional-commit-types/blob/v3.0.0/index.json) in the PR title
- [ ] confirmed `!` in the type prefix if API or client breaking change
- [ ] confirmed all author checklist items have been addressed 
- [ ] reviewed state machine logic
- [ ] reviewed API design and naming
- [ ] reviewed documentation is accurate
- [ ] reviewed tests and test coverage
- [ ] manually tested (if applicable)
larry0x pushed a commit to larry0x/cosmos-sdk that referenced this pull request May 22, 2023
## Description

Closes: cosmos#10705 


State Sync for V2 Store (ADR-40) 
---

### Author Checklist

*All items are required. Please add a note to the item if the item is not applicable and
please add links to any relevant follow up issues.*

I have...

- [ ] included the correct [type prefix](https://github.com/commitizen/conventional-commit-types/blob/v3.0.0/index.json) in the PR title
- [ ] added `!` to the type prefix if API or client breaking change
- [ ] targeted the correct branch (see [PR Targeting](https://github.com/cosmos/cosmos-sdk/blob/master/CONTRIBUTING.md#pr-targeting))
- [ ] provided a link to the relevant issue or specification
- [ ] followed the guidelines for [building modules](https://github.com/cosmos/cosmos-sdk/blob/master/docs/building-modules)
- [ ] included the necessary unit and integration [tests](https://github.com/cosmos/cosmos-sdk/blob/master/CONTRIBUTING.md#testing)
- [ ] added a changelog entry to `CHANGELOG.md`
- [ ] included comments for [documenting Go code](https://blog.golang.org/godoc)
- [ ] updated the relevant documentation or specification
- [ ] reviewed "Files changed" and left comments if necessary
- [ ] confirmed all CI checks have passed

### Reviewers Checklist

*All items are required. Please add a note if the item is not applicable and please add
your handle next to the items reviewed if you only reviewed selected items.*

I have...

- [ ] confirmed the correct [type prefix](https://github.com/commitizen/conventional-commit-types/blob/v3.0.0/index.json) in the PR title
- [ ] confirmed `!` in the type prefix if API or client breaking change
- [ ] confirmed all author checklist items have been addressed 
- [ ] reviewed state machine logic
- [ ] reviewed API design and naming
- [ ] reviewed documentation is accurate
- [ ] reviewed tests and test coverage
- [ ] manually tested (if applicable)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C:CLI C:Store C:x/authz C:x/gov C:x/nft T: ADR An issue or PR relating to an architectural decision record T: CI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants