-
Notifications
You must be signed in to change notification settings - Fork 212
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: export state-sync snapshot without a DB write-lock #8619
Conversation
Add a new API to the exporter, `exporter.getHostKV(key)`, so that the cosmic-swingset state-sync exporter process can query `host.height` without accidentally creating a read-write transaction too. refs #8523
Exporting a state-sync snapshot is a read-only operation, and is designed to run "in the background", i.e. in parallel with normal mutating operations. It accomplishes this by opening a read-only transaction right away, effectively capturing a snapshot of the SQLite database state, to insulate the export process from ongoing writes by the execution host. The cosmic-swingset exporter starts with a query of `host.height`, to confirm that the database has not already advanced to a new block before this snapshot/read-transaction can be taken. Previously, this query worked by using `openSwingStore`, and then calling `hostStorage.hostKVStore.get('host.height')`. This had two problems: * TOCTTOU: the `hostKVStore.get` used a different DB connection (and different txn) than the exporter, so it might return a different height, negating the accuracy of the consistency check * read-write txn: `openSwingStore` creates a read-*write* txn, even when merely opening the DB (because it might need to create the initial tables). This txn is closed right away, before `openSwingStore()` returns, so it did not present a threat to ongoing operations. But if the exporter was created while the ongoing execution side already had its own read-write txn open (e.g. while `controller.run()` was running), then it would fail, and `makeSwingStoreExporter` would fail with `SQLITE_BUSY` Instead, we take advantage of the new `swingStoreExporter.getHostKV()` API, and use *it* to fetch `host.height`. Unlike the normal swingstore, the swingstore-exporter refrains from creating read-write transactions entirely. So the cosmic-swingset export code can safely query the height without fear of getting the wrong value or failing because of an ongoing write transaction. We think this should fix the SQLITE_BUSY errors. refs #8523
To be accurate, that problem does not apply in production cases because cosmic-swingset, the caller of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fantastic. Thanks for the amazing turn around!
Verified on a new patched mainnet follower node started from state-sync that producing a snapshot while syncing up after restore no longer triggers the SQLITE_BUSY error. |
…lock fix: export state-sync snapshot without a DB write-lock
…lock fix: export state-sync snapshot without a DB write-lock
…lock fix: export state-sync snapshot without a DB write-lock
Exporting a state-sync snapshot is a read-only operation, and is designed to run "in the background", i.e. in parallel with normal mutating operations. It accomplishes this by opening a read-only transaction right away, effectively capturing a snapshot of the SQLite database state, to insulate the export process from ongoing writes by the execution host.
The cosmic-swingset exporter starts with a query of
host.height
, to confirm that the database has not already advanced to a new block before this snapshot/read-transaction can be taken.Previously, this query worked by using
openSwingStore
, and then callinghostStorage.hostKVStore.get('host.height')
. This had two problems:TOCTTOU: the
hostKVStore.get
used a different DB connection (and different txn) than the exporter, so it might return a different height, negating the accuracy of the consistency checkread-write txn:
openSwingStore
creates a read-write txn, even when merely opening the DB (because it might need to create the initial tables). This txn is closed right away, beforeopenSwingStore()
returns, so it did not present a threat to ongoing operations. But if the exporter was created while the ongoing execution side already had its own read-write txn open (e.g. whilecontroller.run()
was running), then it would fail, andmakeSwingStoreExporter
would fail withSQLITE_BUSY
Instead, we take advantage of the new
swingStoreExporter.getHostKV()
API, and use it to fetchhost.height
. Unlike the normal swingstore, the swingstore-exporter refrains from creating read-write transactions entirely. So the cosmic-swingset export code can safely query the height without fear of getting the wrong value or failing because of an ongoing write transaction.We think this should fix the SQLITE_BUSY errors.
refs #8523