refactor snapshotter #8072

Merged
merged 16 commits into from
Aug 4, 2023

Conversation

mhofman
Member

@mhofman mhofman commented Jul 20, 2023

refs: #6527
refs: #8025
refs: #8031
Best reviewed commit-by-commit

Description

In order to support the migrations needed for #8025 #8031, and to implement #6527, we need to enable using the snapshot pathways between golang and JS outside of the cosmos state-sync mechanism.

This PR splits the snapshotter into a SwingStoreExportsHandler module responsible for the communication and synchronization with JS, and a simplified cosmos ExtensionSnapshotter that uses the new SwingStoreExportsHandler to hide the JS interactions.

Given the goroutine synchronization and multistep process of performing a SwingStore snapshot, there is a decent amount of interleaving needed, which is now expressed as facets provided as arguments to methods representing each step:

  • The SwingStoreExportsHandler's InitiateExport() takes a SwingStoreExportEventHandler, which has an ExportInitiated() method called from the goroutine it starts.
  • ExportInitiated() is provided a retrieveExport callback. The event handler implemented by the ExtensionSnapshotter stores that callback and invokes it in its SnapshotExtension() method (internally invoked by the cosmos snapshot manager).
  • retrieveExport() processes the data received from JS, and creates a SwingStoreExportProvider for it, which is provided as an argument to the ExportRetrieved() method of the SwingStoreExportEventHandler.
  • ExportRetrieved() is also implemented by the ExtensionSnapshotter and uses that provider to recover the snapshot artifacts and write them out as state-sync payloads.
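
As a rough illustration of the interleaving listed above, here is a minimal, self-contained Go sketch. The type and method names echo the ones in this description, but every signature, the `done` channel, the artifact names, and the way `ExportInitiated()` directly calls `SnapshotExtension()` (standing in for the cosmos snapshot manager) are simplifying assumptions, not the code in this PR.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ExportProvider is a simplified stand-in for SwingStoreExportProvider.
type ExportProvider struct {
	BlockHeight uint64
	Artifacts   []string // hypothetical artifact names
}

// EventHandler is a simplified stand-in for SwingStoreExportEventHandler.
type EventHandler interface {
	ExportInitiated(blockHeight uint64, retrieveExport func() error) error
	ExportRetrieved(provider ExportProvider) error
}

// ErrInProgress is a hypothetical error for a concurrent export attempt.
var ErrInProgress = errors.New("export already in progress")

// ExportsHandler is a simplified stand-in for SwingStoreExportsHandler.
type ExportsHandler struct {
	mu     sync.Mutex
	active bool
}

// InitiateExport synchronously rejects concurrent exports, then runs the
// remaining steps in a goroutine. The returned channel reports completion.
func (h *ExportsHandler) InitiateExport(height uint64, handler EventHandler) (<-chan error, error) {
	h.mu.Lock()
	defer h.mu.Unlock()
	if h.active {
		return nil, ErrInProgress
	}
	h.active = true
	done := make(chan error, 1)
	go func() {
		retrieveExport := func() error {
			// The real handler would process data received from JS here.
			provider := ExportProvider{BlockHeight: height, Artifacts: []string{"snapshot", "transcript"}}
			return handler.ExportRetrieved(provider)
		}
		err := handler.ExportInitiated(height, retrieveExport)
		h.mu.Lock()
		h.active = false
		h.mu.Unlock()
		done <- err
	}()
	return done, nil
}

// snapshotter plays the role of the ExtensionSnapshotter event handler.
type snapshotter struct {
	retrieve func() error
	payloads []string
}

func (s *snapshotter) ExportInitiated(height uint64, retrieveExport func() error) error {
	s.retrieve = retrieveExport
	// Stand-in for the cosmos snapshot manager invoking SnapshotExtension().
	return s.SnapshotExtension()
}

func (s *snapshotter) SnapshotExtension() error {
	if s.retrieve == nil {
		return errors.New("no export in progress")
	}
	return s.retrieve()
}

func (s *snapshotter) ExportRetrieved(p ExportProvider) error {
	for _, a := range p.Artifacts {
		s.payloads = append(s.payloads, fmt.Sprintf("%d/%s", p.BlockHeight, a))
	}
	return nil
}

func main() {
	h := &ExportsHandler{}
	s := &snapshotter{}
	done, err := h.InitiateExport(7, s)
	if err != nil {
		panic(err)
	}
	if err := <-done; err != nil { // explicitly await completion
		panic(err)
	}
	fmt.Println(s.payloads)
}
```

Because the error from `ExportInitiated()` flows back through the channel in this sketch, a failure during retrieval surfaces to whoever awaits it instead of being swallowed.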

The new layering allows the SwingStoreExportsHandler to more easily notice errors happening during the retrieval phase, which are currently swallowed by cosmos's snapshot manager. It also introduces a way to explicitly await the completion of a snapshot in progress, which will be needed for genesis export and the SwingStore shadow copy migration.

Finally, the swing-store import/export options are threaded all the way through into the SwingStoreExportsHandler interface, allowing different kinds of snapshot artifacts to be handled, as needed for the future use cases above.

Security Considerations

None

Scaling Considerations

None

Documentation Considerations

None

Testing Considerations

This is a refactor splitting an existing module. The tests covering that module were similarly split, and sometimes duplicated where appropriate.
With the split it may be possible to cover some more granular steps which could only be covered in aggregate before.

Upgrade Considerations

This PR does not change any behavior or stored data, as such it does not have any upgrade impact.

@mhofman mhofman requested review from michaelfig, gibson042 and JimLarson and removed request for michaelfig July 20, 2023 14:27
@mhofman
Member Author

mhofman commented Jul 21, 2023

I followed up with a lot of code comment clarifications / godoc fixes, so it may be easier to review if I'm allowed to rebase all the fixup commits. Any objections from reviewers?

@mhofman mhofman force-pushed the mhofman/refactor-snapshotter branch 2 times, most recently from a13849d to 47f41a5 Compare July 21, 2023 23:06
golang/cosmos/x/swingset/keeper/extension_snapshotter.go
payloadWriter snapshots.ExtensionPayloadWriter
}

// SwingsetSnapshotter manages Swingset snapshots, ensuring insensitivity to
Member

Is this a good place to document and/or reference external documentation of how a Swingset snapshot is structured and/or the general process for constructing one?

golang/cosmos/x/swingset/keeper/store_snapshotter.go
Comment on lines 172 to 174
// InitiateSnapshot initiates a SwingStore snapshot for the given
// height. If a snapshot is already in progress, this will fail. The snapshot
// processing is delegated to the provided `snapshotTaker`.
Member

Suggested change
- // InitiateSnapshot initiates a SwingStore snapshot for the given
- // height. If a snapshot is already in progress, this will fail. The snapshot
- // processing is delegated to the provided `snapshotTaker`.
+ // InitiateSnapshot synchronously verifies that there is not already a snapshot
+ // in progress and records the start of a new one for the provided block height,
+ // then launches a goroutine to asynchronously coordinate with JS for initiating
+ // and retrieving a snapshot, sending it to the provided `snapshotTaker` for processing,
+ // and ultimately releasing it.

That the behavior includes retrieval and processing also suggests to me that "initiate" is not the best name, although a similar argument would probably apply to using the name "take" for a function that synchronously starts a process which finishes asynchronously. It might be best to just reify the separation such that e.g. call sites look like

if err := snapshotter.InitiateSnapshot(height); err != nil {
	…
}
go snapshotter.FinishSnapshot(taker, exportOptions)

Member Author

That shape is not possible. Let's chat

@mhofman mhofman force-pushed the mhofman/refactor-snapshotter branch from 47f41a5 to 0583b45 Compare July 24, 2023 21:56
Member

@gibson042 gibson042 left a comment

As discussed in real time, I'd like to see a refactor in which the swing-store snapshotter communicates completion of its synchronous initiation to the main thread rather than directly driving the cosmos snapshot itself, which seems like an inversion of responsibility. But to reiterate, this need not be considered a blocking concern.

golang/cosmos/x/swingset/keeper/store_snapshotter.go
Comment on lines 19 to 21
// This module abstract the handling of swing-store snapshots, also known as
// swing-store imports/exports, and the necessary communication with the
// JS side.
Member

Do we have anywhere that describes how swing-store exports fit into the overall structure of cosmos state-sync snapshots? My recollection from our call is that there is some hook by which a module registers its integration(s) for cosmos to call when creating/loading/etc. a snapshot, and that the responsibility for creation is to produce an arbitrary number of protobuf "extension" entries with a common format version and for loading is to validate the format version of each extension entry and then consume it as appropriate.
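
For illustration only, the integration recalled above (emit any number of extension entries under one format version when creating a snapshot, then validate that version and consume each entry when loading) could be sketched roughly as follows. The callback types are local stand-ins loosely modeled on the cosmos snapshot extension hook; their exact shape, `formatV1`, and `exampleExtension` are assumptions made for this sketch.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// PayloadWriter and PayloadReader are local stand-ins for the callbacks a
// snapshot manager hands to an extension; the exact types are assumptions.
type PayloadWriter func(payload []byte) error
type PayloadReader func() ([]byte, error) // returns io.EOF when exhausted

const formatV1 uint32 = 1 // hypothetical format version

// exampleExtension is a hypothetical module integration.
type exampleExtension struct {
	artifacts [][]byte // entries to emit when a snapshot is created
	restored  [][]byte // entries consumed when a snapshot is loaded
}

// SnapshotExtension emits one payload entry per artifact.
func (e *exampleExtension) SnapshotExtension(height uint64, write PayloadWriter) error {
	for _, a := range e.artifacts {
		if err := write(a); err != nil {
			return err
		}
	}
	return nil
}

// RestoreExtension validates the format version, then consumes every entry.
func (e *exampleExtension) RestoreExtension(height uint64, format uint32, read PayloadReader) error {
	if format != formatV1 {
		return fmt.Errorf("unsupported snapshot format %d", format)
	}
	for {
		payload, err := read()
		if errors.Is(err, io.EOF) {
			return nil
		}
		if err != nil {
			return err
		}
		e.restored = append(e.restored, payload)
	}
}

// roundTrip plays the snapshot manager: it collects the entries written by
// src and replays them to dst under the given format version.
func roundTrip(src, dst *exampleExtension, height uint64, format uint32) error {
	var entries [][]byte
	if err := src.SnapshotExtension(height, func(p []byte) error {
		entries = append(entries, p)
		return nil
	}); err != nil {
		return err
	}
	next := 0
	return dst.RestoreExtension(height, format, func() ([]byte, error) {
		if next >= len(entries) {
			return nil, io.EOF
		}
		p := entries[next]
		next++
		return p, nil
	})
}

func main() {
	src := &exampleExtension{artifacts: [][]byte{[]byte("a"), []byte("b")}}
	dst := &exampleExtension{}
	fmt.Println(roundTrip(src, dst, 7, formatV1), len(dst.restored))
	fmt.Println(roundTrip(src, &exampleExtension{}, 7, 2)) // rejected format
}
```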

Contributor

@JimLarson JimLarson left a comment

Giving up on the commit-by-commit review halfway through the 5th commit - too much back-and-forth - and publishing the comments so far. I'm going to update to see recent commits and review the PR in its entirety.

@mhofman mhofman force-pushed the mhofman/refactor-snapshotter branch 3 times, most recently from caacb17 to 90e0bfd Compare July 30, 2023 23:01
@mhofman mhofman requested review from JimLarson and gibson042 July 30, 2023 23:02
@mhofman
Member Author

mhofman commented Jul 30, 2023

@michaelfig @JimLarson I believe I've addressed all feedback. It ended up being extensive changes, and introduces a couple conflicts with master, so right now I squashed everything and will rebase / relayer everything. I don't expect any code to change until your new review.

@JimLarson
Contributor

The rebase landed while I was in mid-review. I've submitted what was in-flight. Please address those comments as much as possible then please confirm that I have a stable target for review.

@mhofman
Member Author

mhofman commented Jul 31, 2023

I have added a sequence diagram of the snapshot creation process: https://github.com/Agoric/agoric-sdk/blob/mhofman/refactor-snapshotter/docs/architecture/state-sync.md

Member

@gibson042 gibson042 left a comment

Skimmed changes since last review.

Member Author

@mhofman mhofman left a comment

I think I've addressed all new feedback and then some, PTAL @JimLarson

@mhofman mhofman requested a review from JimLarson August 2, 2023 17:13
Contributor

@JimLarson JimLarson left a comment

LGTM at last! Thanks for applying all the previous feedback - I've verified that previous issues have been addressed to my satisfaction. Approval is conditional on applying the two comment suggestions here. One is trivial, the other explains why the dual-channel race (which is still present) is harmless.

@mhofman mhofman force-pushed the mhofman/refactor-snapshotter branch from cf97e7d to e08e8e0 Compare August 4, 2023 08:09
@mhofman mhofman force-pushed the mhofman/refactor-snapshotter branch from 56a38b4 to ead6730 Compare August 4, 2023 08:15
@mhofman mhofman added this pull request to the merge queue Aug 4, 2023
Merged via the queue into master with commit 3679b4c Aug 4, 2023
@mhofman mhofman deleted the mhofman/refactor-snapshotter branch August 4, 2023 10:20
mhofman added a commit that referenced this pull request Aug 7, 2023
mhofman added a commit that referenced this pull request Aug 7, 2023
mhofman added a commit that referenced this pull request Aug 16, 2023
mhofman added a commit that referenced this pull request Aug 16, 2023
4 participants