Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(cli/module)!: separate the app genesis state by module into folders when export/import/validate #13032

Closed
wants to merge 18 commits into from

Conversation

JayT106
Copy link
Contributor

@JayT106 JayT106 commented Aug 24, 2022

Description

Closes: #12808

updates:
add extra arguments in ExportAppStateAndValidators (API-breaking). Since #13437 already added this new API, it should be fine if this PR is accepted and released in the same SDK version.

Implementation

  • Export
    add split-modules flag in the cli export, the app will export the genesis state to the current working path and name it genesis. otherwise, the export cli will export the whole app genesis state into a genesis.json file (the current default export method)

  • Validate
    use validate-split-modules flag in the cli validate-genesis, to validate the exported genesis folder

  • Import (InitChain)
    a. clean out the app data directory, i.e. home/.simapp/data.
    b. copy/move the exported genesis folder to the app config directory, i.e. home/.simapp/config.
    c. copy/move genesis.json in thegenesis folder to the app config directory
    d. re-start the app

The structure in thegenesis folder will be (the gray background meaning it's a folder):

genesis

genesis.json
A(module)

genesis_A_0.bin

If the module state is more than 100 MB (currently hard cording), it will split into several files

B(module)

genesis_B_0.bin
genesis_B_1.bin
genesis_B_Y.bin

X(module)

genesis_X_0.bin

  • document update
  • integrating tests

Author Checklist

All items are required. Please add a note to the item if the item is not applicable and
please add links to any relevant follow up issues.

I have...

  • included the correct type prefix in the PR title
  • added ! to the type prefix if API or client breaking change
  • targeted the correct branch (see PR Targeting)
  • provided a link to the relevant issue or specification
  • followed the guidelines for building modules
  • included the necessary unit and integration tests
  • added a changelog entry to CHANGELOG.md
  • included comments for documenting Go code
  • updated the relevant documentation or specification
  • reviewed "Files changed" and left comments if necessary
  • confirmed all CI checks have passed

Reviewers Checklist

All items are required. Please add a note if the item is not applicable and please add
your handle next to the items reviewed if you only reviewed selected items.

I have...

  • confirmed the correct type prefix in the PR title
  • confirmed ! in the type prefix if API or client breaking change
  • confirmed all author checklist items have been addressed
  • reviewed state machine logic
  • reviewed API design and naming
  • reviewed documentation is accurate
  • reviewed tests and test coverage
  • manually tested (if applicable)

@JayT106 JayT106 requested a review from a team as a code owner August 24, 2022 21:45
@JayT106 JayT106 changed the title !feat(cli): separate the app genesis state by module into folders when export/import/validate feat(cli/module)!: separate the app genesis state by module into folders when export/import/validate Aug 24, 2022
@codecov
Copy link

codecov bot commented Aug 25, 2022

Codecov Report

Merging #13032 (156b973) into main (7781cdb) will decrease coverage by 0.11%.
The diff coverage is 49.55%.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main   #13032      +/-   ##
==========================================
- Coverage   53.92%   53.80%   -0.12%     
==========================================
  Files         643      653      +10     
  Lines       55475    57032    +1557     
==========================================
+ Hits        29913    30686     +773     
- Misses      23159    23877     +718     
- Partials     2403     2469      +66     
Impacted Files Coverage Δ
simapp/simd/cmd/root.go 70.35% <0.00%> (ø)
testutil/sims/simulation_helpers.go 17.14% <0.00%> (ø)
x/genutil/client/cli/validate_genesis.go 48.05% <26.08%> (-30.00%) ⬇️
simapp/export.go 13.24% <35.00%> (ø)
simapp/app.go 75.00% <38.46%> (ø)
types/module/module.go 64.51% <60.83%> (-1.74%) ⬇️
simapp/genesis.go 100.00% <0.00%> (ø)
simapp/test_helpers.go 30.25% <0.00%> (ø)
... and 10 more

@alexanderbez
Copy link
Contributor

Prior to continuing potentially wasted work on this PR, have we come to consensus on if we need this or not? I mean we did introduce recently the ability to export named/specific module states.

@JayT106 JayT106 force-pushed the genesis-state-folder branch from abd64fd to 17d2f06 Compare October 12, 2022 20:07
forZeroHeight, _ := cmd.Flags().GetBool(FlagForZeroHeight)
jailAllowedAddrs, _ := cmd.Flags().GetStringSlice(FlagJailAllowedAddrs)
modulesToExport, _ := cmd.Flags().GetStringSlice(FlagModulesToExport)
splitModules, _ := cmd.Flags().GetBool(FlagSplitModules)

Check warning

Code scanning / gosec

Returned error is not propagated up the stack.

Returned error is not propagated up the stack.
@JayT106
Copy link
Contributor Author

JayT106 commented Oct 12, 2022

Prior to continuing potentially wasted work on this PR, have we come to consensus on if we need this or not? I mean we did introduce recently the ability to export named/specific module states.

@alexanderbez
this PR is mainly to ease the memory usage for exporting/importing the whole genesis state if the app has a massive app genesis state (over several GBs). In #13437, the PR tries to export all of the module states and then write them into one file. it's not able to use it to initiate a new chain if I understand it correctly.

@JayT106 JayT106 changed the title feat(cli/module): separate the app genesis state by module into folders when export/import/validate feat(cli/module)!: separate the app genesis state by module into folders when export/import/validate Oct 12, 2022
@JayT106 JayT106 force-pushed the genesis-state-folder branch from d05698e to 12c9fad Compare October 12, 2022 20:57
@julienrbrt julienrbrt dismissed their stale review October 13, 2022 08:23

Thanks for the update. Re-reviewing now!

Copy link
Member

@julienrbrt julienrbrt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have left a few comments. Thanks for the PR, I see the use case now!
Can you as well add a CHANGELOG?

return err
}
if splitModules {
wd, err := os.Getwd()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't correct, as it will export it in the current directory. You don't need it as we have homeDir.

cmd.SetOut(cmd.OutOrStdout())
cmd.SetErr(cmd.OutOrStderr())
cmd.Println(string(sdk.MustSortJSON(encoded)))
if err := doc.SaveAs(filepath.Join(wd, "genesis", "genesis.json")); err != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should use homeDir instead. Are we sure we want to override the file if it exists? If so let's add an explanation in the command description.

if err != nil {
return servertypes.ExportedApp{}, err
}
app.ModuleManager.GenesisPath = wd
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
app.ModuleManager.GenesisPath = wd
app.ModuleManager.GenesisPath = filepath.Join(cast.ToString(app.appOpts.Get(flags.FlagHome)), "config", "genesis")

@@ -28,11 +34,30 @@ func (app *SimApp) ExportAppStateAndValidators(forZeroHeight bool, jailAllowedAd
app.prepForZeroHeightGenesis(ctx, jailAllowedAddrs)
}

genState := app.ModuleManager.ExportGenesisForModules(ctx, app.appCodec, modulesToExport)
appState, err := json.MarshalIndent(genState, "", " ")
wd, err := os.Getwd()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We shouldn't get the current directory.

@@ -235,6 +239,7 @@ func NewManager(modules ...AppModule) *Manager {
OrderExportGenesis: modulesStr,
OrderBeginBlockers: modulesStr,
OrderEndBlockers: modulesStr,
GenesisPath: "",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
GenesisPath: "",

It is go default.

@@ -320,19 +340,31 @@ func (m *Manager) InitGenesis(ctx sdk.Context, cdc codec.JSONCodec, genesisData
}

// ExportGenesis performs export genesis functionality for modules
func (m *Manager) ExportGenesis(ctx sdk.Context, cdc codec.JSONCodec) map[string]json.RawMessage {
return m.ExportGenesisForModules(ctx, cdc, []string{})
func (m *Manager) ExportGenesis(ctx sdk.Context, cdc codec.JSONCodec) (map[string]json.RawMessage, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is API breaking. Shouldn't we panic instead?

x/authz/client/testutil/grpc.go Show resolved Hide resolved
x/authz/client/testutil/grpc.go Show resolved Hide resolved
return nil
},
}

cmd.Flags().Bool(FlagValidateSplitModules, false, "validate the modules' genesis state in current working path")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
cmd.Flags().Bool(FlagValidateSplitModules, false, "validate the modules' genesis state in current working path")
cmd.Flags().Bool(FlagValidateSplitModules, false, "validate the modules' genesis state at the genesis.json directory")

return fmt.Errorf("error unmarshalling genesis doc %s: %s", genesis, err.Error())
}
if splitModules {
wd, err := os.Getwd()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think of using genesis so we support arguments too and keep the same behavior. We can get the directory of that file with filepath.Dir.

@alexanderbez
Copy link
Contributor

Prior to continuing potentially wasted work on this PR, have we come to consensus on if we need this or not? I mean we did introduce recently the ability to export named/specific module states.

@alexanderbez this PR is mainly to ease the memory usage for exporting/importing the whole genesis state if the app has a massive app genesis state (over several GBs). In #13437, the PR tries to export all of the module states and then write them into one file. it's not able to use it to initiate a new chain if I understand it correctly.

But isn't that at odds with the ORM which allows Genesis streaming (cc @aaronc). I don't think it's ideal to have two completely different ways to do one thing.

@JayT106
Copy link
Contributor Author

JayT106 commented Oct 13, 2022

But isn't that at odds with the ORM which allows Genesis streaming (cc @aaronc). I don't think it's ideal to have two completely different ways to do one thing.

Yes, but to solve the issue (RAM) completely using Genesis steaming, we need to do it within each module when executing the genesis export/import. That's a big API change.

Agree, we can do Genesis steaming for the current export/import when we got the rawjson from/to each module export/import.

My question:

  1. Is anyone implementing genesis: an option to separate the app genesis states to folders when export/import/validate the genesis state  #12808 (comment), will it be landed in the SDK v0.47(or 0v.48)?
  2. Is it able to backport to V0.46?

@aaronc
Copy link
Member

aaronc commented Oct 18, 2022

Prior to continuing potentially wasted work on this PR, have we come to consensus on if we need this or not? I mean we did introduce recently the ability to export named/specific module states.

@alexanderbez this PR is mainly to ease the memory usage for exporting/importing the whole genesis state if the app has a massive app genesis state (over several GBs). In #13437, the PR tries to export all of the module states and then write them into one file. it's not able to use it to initiate a new chain if I understand it correctly.

But isn't that at odds with the ORM which allows Genesis streaming (cc @aaronc). I don't think it's ideal to have two completely different ways to do one thing.

This approach I'm pretty sure is compatible with what I've outlined in #12972 and https://github.com/cosmos/cosmos-sdk/pull/13082/files#diff-e8221a593d36e8d8497d6f3970bf820a8bb3533426f13f6045cd941359483a37. That API tries to be flexible and allow for a few different approaches. This approach wouldn't be needed for modules which use the ORM and have built-in streaming, but sounds like it could be helpful for the current genesis structure.

@alexanderbez
Copy link
Contributor

@aaronc do you think or recommend that we take this approach then as well?

@aaronc
Copy link
Member

aaronc commented Oct 18, 2022

@aaronc do you think or recommend that we take this approach then as well?

Well it seems like it solves an immediate problem that people have. On the other hand, I wonder how often this scenario is needed. The logic for splitting a single module's JSON into multiple files seems a bit complex and renders the JSON more or less unreadable and uneditable.

@JayT106 can you describe more the use case you're targeting for this? Is it for dump state/restart upgrades or for some other internal analytics/testing usage?

An alternative approach which might be simpler but allow the same benefits and also keep the JSON usable to split a module's genesis based on object fields. For example, if we split the top-level fields in bank's GenesisState object into individual files we would get:

genesis.json
bank/
  balances.json
  supply.json
  denom_metadata.json
  send_enabled.json

This is essentially what the ORM does behind the scenes, but with support for actually streaming the data which should make memory requirements very low. Without the streaming memory usage would be higher but probably much better than what we have. Would this alternate approach meet your needs @JayT106 ?

@JayT106
Copy link
Contributor Author

JayT106 commented Oct 18, 2022

@aaronc do you think or recommend that we take this approach then as well?

Well it seems like it solves an immediate problem that people have. On the other hand, I wonder how often this scenario is needed. The logic for splitting a single module's JSON into multiple files seems a bit complex and renders the JSON more or less unreadable and uneditable.

@JayT106 can you describe more the use case you're targeting for this? Is it for dump state/restart upgrades or for some other internal analytics/testing usage?

An alternative approach which might be simpler but allow the same benefits and also keep the JSON usable to split a module's genesis based on object fields. For example, if we split the top-level fields in bank's GenesisState object into individual files we would get:

genesis.json
bank/
  balances.json
  supply.json
  denom_metadata.json
  send_enabled.json

This is essentially what the ORM does behind the scenes, but with support for actually streaming the data which should make memory requirements very low. Without the streaming memory usage would be higher but probably much better than what we have. Would this alternate approach meet your needs @JayT106 ?

@aaronc
The use case for us is to be able to export/import the genesis state for upgrading the chain without requiring a big cloud instance. The current app state of the Cronos chain already exceeds several GBs. So we would like to split the genesis state during state export/import, and certain module states (i.e. the evm module) are over GBs if we don't split the file into several chunks. Even if we can split the module genesis based on object fields, it still cannot avoid certain objects are the major resource comsumer in the app.

The logic for splitting a single module's JSON into multiple files seems a bit complex and renders the JSON more or less unreadable and uneditable.

I think the operator can choose whether they want to split it or not by a flag, we just need to provide the option. Or maybe provides some tooling if they would like to read or edit the files.

I can try to rewrite it by implementing #12808 (comment)
Just would like to know the timeline or if anyone is doing it to avoid duplicate work.

@aaronc
Copy link
Member

aaronc commented Oct 18, 2022

I can try to rewrite it by implementing #12808 (comment)
Just would like to know the timeline or if anyone is doing it to avoid duplicate work.

Nobody is doing it currently. What you're trying to do doesn't even need the API change because you're recomposing everything into a single runtime object per module right?

Just to confirm, do you think this alternate approach of splitting by top-level field would address the issue?

Also just curious, why is cronos doing export/import upgrades instead of using x/upgrade?

@JayT106
Copy link
Contributor Author

JayT106 commented Oct 18, 2022

Nobody is doing it currently. What you're trying to do doesn't even need the API change because you're recomposing everything into a single runtime object per module right?

that's correct!

Just to confirm, do you think this alternate approach of splitting by top-level field would address the issue?

No, but easing it (same as this PR) unless we are able to introduce a new module export/import API with the file streamer, which will require the downstream project to integrate. Or migrate the whole module state using the ORM module?

Also just curious, why is cronos doing export/import upgrades instead of using x/upgrade?

Yes, currently we are using x/upgrade for the chain upgrade
But we would like to use it as a backup feature for dealing with the consensus break cases (if it happens)

@JayT106 JayT106 mentioned this pull request Nov 1, 2022
19 tasks
@github-actions
Copy link
Contributor

github-actions bot commented Dec 3, 2022

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the stale label Dec 3, 2022
@github-actions github-actions bot closed this Dec 7, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

genesis: an option to separate the app genesis states to folders when export/import/validate the genesis state
7 participants