feat(swing-store): faster import of swing-store #8522

Merged 2 commits on Nov 10, 2023
@@ -6,6 +6,7 @@ import tmp from 'tmp';
import { kunser } from '@agoric/kmarshal';
import {
initSwingStore,
openSwingStore,
makeSwingStoreExporter,
importSwingStore,
} from '@agoric/swing-store';
@@ -36,7 +37,9 @@ test.before(async t => {

test('state-sync reload', async t => {
const [dbDir, cleanup] = await tmpDir('testdb');
const [importDbDir, cleanupImport] = await tmpDir('importtestdb');
t.teardown(cleanup);
t.teardown(cleanupImport);

const config = {
snapshotInitial: 2,
@@ -106,7 +109,11 @@ test('state-sync reload', async t => {
getArtifact: name => artifacts.get(name),
close: () => 0,
};
const ss2 = await importSwingStore(datasetExporter);
const ssi = await importSwingStore(datasetExporter, importDbDir);
Member

This means the SwingStore you get back from importSwingStore is no longer suitable for general use, because it's got all the commit-safety modes turned off, yeah?

Please update the docs in docs/data-export.md to mention this fact, around line 170-ish where the importSwingStore is shown.

Member Author

Correct, it is not. I have now removed the kernelStorage facet from the return value of importSwingStore.

await ssi.hostStorage.commit();
await ssi.hostStorage.close();
const ss2 = openSwingStore(importDbDir);
t.teardown(ss2.hostStorage.close);
const c2 = await makeSwingsetController(
ss2.kernelStorage,
{},
30 changes: 10 additions & 20 deletions packages/swing-store/docs/data-export.md
@@ -152,21 +152,9 @@ for (const name of exporter.getArtifactNames()) {

## Import

On other end of the export process is an importer. This is a new host application, which wants to start from the contents of the export, rather than initializing a brand new (empty) kernel state.
On the other end of the export process is an importer. This is used to restore kernel state, so that a new host application can simply continue mostly as if it had been previously executing. The expectation is that the import and the execution are 2 independent events, and the execution doesn't need to be aware it was imported.

When starting a brand new instance, host applications would normally call `openSwingStore(dirPath)` to create a new (empty) SwingStore, then call SwingSet's `initializeSwingset(config, .., kernelStorage)` to let the kernel initialize the DB with a config-dependent starting state:

```js
// this is done only the first time an instance is created:

import { openSwingStore } from '@agoric/swing-store';
import { initializeSwingset } from '@agoric/swingset-vat';
const dirPath = './swing-store';
const { hostStorage, kernelStorage } = openSwingStore(dirPath);
await initializeSwingset(config, argv, kernelStorage);
```

Once the initial state is created, each time the application is launched, it will build a controller around the existing state:
For reference, after the initial state is created, each time the application is launched, it builds a controller around the existing state:

```js
import { openSwingStore } from '@agoric/swing-store';
@@ -177,7 +165,7 @@ const controller = await makeSwingsetController(kernelStorage);
// ... now do things like controller.run(), etc
```

When cloning an existing kernel, the initialization step is replaced with `importSwingStore`. The host application should feed the importer with the export data and artifacts, by passing an object that has the same API as the SwingStore's exporter:
When cloning an existing kernel, the host application first imports and commits the restored state using `importSwingStore`. The host application should feed the importer with the export data and artifacts, by passing an object that has the same API as the SwingStore's exporter:

```js
import { importSwingStore } from '@agoric/swing-store';
@@ -188,11 +176,13 @@ const exporter = {
getArtifact(name) { /* return blob of artifact data */ },
};
const { hostStorage } = await importSwingStore(exporter, dirPath);
hostStorage.commit();
// now the swingstore is fully populated
// Update any hostStorage as needed
await hostStorage.commit();
await hostStorage.close();
// now the populated swingstore can be re-opened using `openSwingStore`
```

Once the new SwingStore is fully populated with the previously-exported data, the host application can use `makeSwingsetController()` to build a kernel that will start from the exported state.
Once the new SwingStore is fully populated with the previously-exported data, the host application can update any host-specific state before committing and closing the SwingStore. `importSwingStore` returns only the host facet of the SwingStore instance, as it is not suitable for immediate execution.
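
As a rough illustration of the exporter-shaped object described above, the sketch below wraps two in-memory `Map`s. `makeMapExporter` is a hypothetical helper, not part of `@agoric/swing-store`. The method set (`getExportData`, `getArtifactNames`, `getArtifact`, `close`) and the async-iterable return shapes are assumptions about the exporter API; the export section of this document remains the authoritative description.

```js
// Hypothetical helper (not part of @agoric/swing-store): wraps two Maps in an
// object shaped like the exporter API that importSwingStore consumes.
// Assumption: each data-returning method may be an async iterable.
function makeMapExporter(exportData, artifacts) {
  return {
    // yields [key, value] pairs of export data
    async *getExportData() {
      for (const [key, value] of exportData) {
        yield [key, value];
      }
    },
    // yields the name of every available artifact
    async *getArtifactNames() {
      yield* artifacts.keys();
    },
    // yields the artifact's bytes as a single chunk
    async *getArtifact(name) {
      if (!artifacts.has(name)) {
        throw new Error(`unknown artifact ${name}`);
      }
      yield artifacts.get(name);
    },
    async close() {
      // nothing to release for in-memory data
    },
  };
}
```

An object like this would be passed as the first argument to `importSwingStore(exporter, dirPath)`, followed by `hostStorage.commit()` and `hostStorage.close()` before the directory is reopened with `openSwingStore(dirPath)`.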

## Optional / Historical Data

@@ -223,14 +213,14 @@ Also note that when a vat is terminated, we delete all information about it, inc

When importing, the `importSwingStore()` function's options bag takes a property named `artifactMode`, with the same meanings as for export. Importing with the `operational` mode will ignore any artifacts other than those needed for current operations, and will fail unless all such artifacts were available. Importing with `replay` will ignore spans from old incarnations, but will fail unless all spans from current incarnations are present. Importing with `archival` will fail unless all spans from all incarnations are present. There is no `debug` option during import.

`importSwingStore()` returns a swingstore, which means its options bag also contains the same options as `openSwingStore()`, including the `keepTranscripts` option. This defaults to `true`, but if it were overridden to `false`, then the new swingstore will delete transcript spans as soon as they are no longer needed for operational purposes (e.g. when `transcriptStore.rolloverSpan()` is called).
While `importSwingStore()`'s options bag accepts the same options as `openSwingStore()`, since it returns only the host facet of a SwingStore, some of these options might not be meaningful, such as `keepTranscripts`.

So, to avoid pruning current-incarnation historical transcript spans when exporting from one swingstore to another, you must set (or avoid overriding) the following options along the way:

* the original swingstore must not be opened with `{ keepTranscripts: false }`, otherwise the old spans will be pruned immediately
* the export must use `makeSwingStoreExporter(dirpath, { artifactMode: 'replay'})`, otherwise the export will omit the old spans
* the import must use `importSwingStore(exporter, dirPath, { artifactMode: 'replay'})`, otherwise the import will ignore the old spans
* the `importSwingStore` call (and all subsequent `openSwingStore` calls) must not use `keepTranscripts: false`, otherwise the new swingstore will prune historical spans as new ones are created (during `rolloverSpan`).
* subsequent `openSwingStore` calls must not use `keepTranscripts: false`, otherwise the new swingstore will prune historical spans as new ones are created (during `rolloverSpan`).
Member

Retrospective nit: it might be a good idea to retain the admonition against having `keepTranscripts: false` in the `importSwingStore` options bag (maybe as a child bullet of the previous line). That line was mainly about the importance of including `artifactMode: 'replay'`, but it's also important not to prune transcripts during the import itself.
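
The four conditions in the bullet list above could be collected into a small pre-flight check. This is only a sketch: `findTranscriptPruningRisks` and the shape of its argument are hypothetical and not part of `@agoric/swing-store`. Treating `archival` as acceptable alongside `replay` is an assumption drawn from the mode descriptions in this document, not something the bullets state.

```js
// Hypothetical pre-flight check (not part of @agoric/swing-store). Each
// property is the options bag the host intends to pass at that stage.
function findTranscriptPruningRisks({
  originalOpenOptions = {},
  exportOptions = {},
  importOptions = {},
  reopenOptions = {},
}) {
  // Assumption: 'archival' retains a superset of what 'replay' retains.
  const keepsOldSpans = mode => mode === 'replay' || mode === 'archival';
  const risks = [];
  if (originalOpenOptions.keepTranscripts === false) {
    risks.push('original swingstore prunes old spans immediately');
  }
  if (!keepsOldSpans(exportOptions.artifactMode)) {
    risks.push('export omits old spans');
  }
  if (!keepsOldSpans(importOptions.artifactMode)) {
    risks.push('import ignores old spans');
  }
  if (reopenOptions.keepTranscripts === false) {
    risks.push('reopened swingstore prunes spans during rolloverSpan');
  }
  return risks;
}
```

An empty result means none of the listed conditions is violated; it does not prove the spans exist in the source swingstore.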


## Implementation Details

20 changes: 14 additions & 6 deletions packages/swing-store/src/importer.js
@@ -11,14 +11,16 @@ import { assertComplete } from './assertComplete.js';
*/

/**
* Function used to create a new swingStore from an object implementing the
* Function used to populate a swingStore from an object implementing the
Member

nit: It really is about creating a new swingStore. Changing it to say "populate a swingStore" kind of makes it sound like this could be used to fill a previously-created (perhaps empty, perhaps not) database, which suggests some sort of weird merge operation between the previous contents and the import dataset.

* exporter API. The exporter API may be provided by a swingStore instance, or
* implemented by a host to restore data that was previously exported.
* implemented by a host to restore data that was previously exported. The
* returned swingStore is not suitable for execution, and thus only contains
* the host facet for committing the populated swingStore.
*
* @param {import('./exporter').SwingStoreExporter} exporter
* @param {string | null} [dirPath]
* @param {ImportSwingStoreOptions} [options]
* @returns {Promise<import('./swingStore').SwingStore>}
* @returns {Promise<Pick<import('./swingStore').SwingStore, 'hostStorage' | 'debug'>>}
*/
export async function importSwingStore(exporter, dirPath = null, options = {}) {
if (dirPath && typeof dirPath !== 'string') {
@@ -27,8 +29,14 @@ export async function importSwingStore(exporter, dirPath = null, options = {}) {
const { artifactMode = 'operational', ...makeSwingStoreOptions } = options;
validateArtifactMode(artifactMode);

const store = makeSwingStore(dirPath, true, makeSwingStoreOptions);
const { kernelStorage, internal } = store;
const { hostStorage, kernelStorage, internal, debug } = makeSwingStore(
dirPath,
true,
{
unsafeFastMode: true,
...makeSwingStoreOptions,
},
);

// For every exportData entry, we add a DB record. 'kv' entries are
// the "kvStore shadow table", and are not associated with any
@@ -121,5 +129,5 @@ export async function importSwingStore(exporter, dirPath = null, options = {}) {
assertComplete(internal, checkMode);

await exporter.close();
return store;
return { hostStorage, debug };
}
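
One detail of the options handling in this hunk: because `unsafeFastMode: true` appears before the spread of the caller's remaining options, ordinary object-spread semantics mean an explicit `unsafeFastMode` supplied by the caller would take precedence over the default. The sketch below isolates that merging logic in a hypothetical `mergeImportOptions` function, a stand-in for illustration and not a function in `importer.js`.

```js
// Hypothetical stand-in mirroring the destructuring and spread shown in the
// importSwingStore hunk above; it is not part of importer.js.
function mergeImportOptions(options = {}) {
  // artifactMode is consumed by the importer; everything else is forwarded.
  const { artifactMode = 'operational', ...makeSwingStoreOptions } = options;
  return {
    artifactMode,
    // Later properties win, so a caller-provided unsafeFastMode overrides
    // the default of true.
    storeOptions: { unsafeFastMode: true, ...makeSwingStoreOptions },
  };
}

const defaults = mergeImportOptions();
const overridden = mergeImportOptions({
  artifactMode: 'replay',
  unsafeFastMode: false,
});
console.log(defaults.storeOptions.unsafeFastMode); // true
console.log(overridden.storeOptions.unsafeFastMode); // false
```

Placing the default after the spread instead would make fast mode unconditional for imports.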
30 changes: 27 additions & 3 deletions packages/swing-store/src/swingStore.js
@@ -203,9 +203,29 @@ export function makeSwingStore(dirPath, forceReset, options = {}) {
// mode that defers merge work for a later attempt rather than block any
// potential readers or writers. See https://sqlite.org/wal.html for details.

// However we also allow opening the DB with journaling off, which is unsafe
// and doesn't support rollback, but avoids any overhead for large
// transactions such as those performed during an import.

function setUnsafeFastMode(enabled) {
Member

There's probably a better name for this, but the functionality seems good to me.

const journalMode = enabled ? 'off' : 'wal';
const synchronousMode = enabled ? 'normal' : 'full';
!db.inTransaction || Fail`must not be in a transaction`;

db.unsafeMode(!!enabled);
// The WAL mode is persistent so it's not possible to switch to a different
// mode for an existing DB.
const actualMode = db.pragma(`journal_mode=${journalMode}`, {
simple: true,
});
actualMode === journalMode ||
filePath === ':memory:' ||
Fail`Couldn't set swing-store DB to ${journalMode} mode (is ${actualMode})`;
db.pragma(`synchronous=${synchronousMode}`);
}

// PRAGMAs have to happen outside a transaction
db.exec(`PRAGMA journal_mode=WAL`);
db.exec(`PRAGMA synchronous=FULL`);
setUnsafeFastMode(options.unsafeFastMode);

// We use IMMEDIATE because the kernel is supposed to be the sole writer of
// the DB, and if some other process is holding a write lock, we want to find
@@ -481,7 +501,11 @@ export function makeSwingStore(dirPath, forceReset, options = {}) {
}

/** @type {import('./internal.js').SwingStoreInternal} */
const internal = harden({ snapStore, transcriptStore, bundleStore });
const internal = harden({
snapStore,
transcriptStore,
bundleStore,
});

async function repairMetadata(exporter) {
return doRepairMetadata(internal, exporter);
6 changes: 5 additions & 1 deletion packages/swing-store/test/test-bundles.js
@@ -118,7 +118,11 @@ test('b0 import', async t => {
},
close: async () => undefined,
};
const { kernelStorage } = await importSwingStore(exporter);
const ss = await importSwingStore(exporter);
t.teardown(ss.hostStorage.close);
await ss.hostStorage.commit();
const serialized = ss.debug.serialize();
const { kernelStorage } = initSwingStore(null, { serialized });
const { bundleStore } = kernelStorage;
t.truthy(bundleStore.hasBundle(idA));
t.deepEqual(bundleStore.getBundle(idA), b0A);
1 change: 1 addition & 0 deletions packages/swing-store/test/test-exportImport.js
@@ -322,6 +322,7 @@ async function testExportImport(
}
t.is(failureMode, 'none');
const ssIn = await doImport();
t.teardown(ssIn.hostStorage.close);
await ssIn.hostStorage.commit();
let dumpsShouldMatch = true;
if (runMode === 'operational') {
2 changes: 2 additions & 0 deletions packages/swing-store/test/test-import.js
@@ -46,6 +46,7 @@ test('import empty', async t => {
t.teardown(cleanup);
const exporter = makeExporter(new Map(), new Map());
const ss = await importSwingStore(exporter, dbDir);
t.teardown(ss.hostStorage.close);
await ss.hostStorage.commit();
const data = convert(ss.debug.dump());
t.deepEqual(data, {
@@ -69,6 +70,7 @@ const importTest = test.macro(async (t, mode) => {

// now import
const ss = await importSwingStore(exporter, dbDir, { artifactMode });
t.teardown(ss.hostStorage.close);
await ss.hostStorage.commit();
const data = convert(ss.debug.dump());

32 changes: 23 additions & 9 deletions packages/swing-store/test/test-repair-metadata.js
@@ -4,7 +4,7 @@ import path from 'path';
import test from 'ava';
import sqlite3 from 'better-sqlite3';

import { importSwingStore } from '../src/index.js';
import { importSwingStore, openSwingStore } from '../src/index.js';

import { makeExporter, buildData } from './exports.js';
import { tmpDir } from './util.js';
@@ -19,8 +19,9 @@ test('repair metadata', async t => {
// then manually deleting the historical metadata entries from the
// DB
const exporter = makeExporter(exportData, artifacts);
const ss = await importSwingStore(exporter, dbDir);
await ss.hostStorage.commit();
const ssi = await importSwingStore(exporter, dbDir);
await ssi.hostStorage.commit();
await ssi.hostStorage.close();

const filePath = path.join(dbDir, 'swingstore.sqlite');
const db = sqlite3(filePath);
@@ -51,6 +52,8 @@ test('repair metadata', async t => {
t.deepEqual(ss2, [7]);

// now fix it
const ss = openSwingStore(dbDir);
t.teardown(ss.hostStorage.close);
await ss.hostStorage.repairMetadata(exporter);
await ss.hostStorage.commit();

@@ -62,6 +65,7 @@ test('repair metadata', async t => {

// repair should be idempotent
await ss.hostStorage.repairMetadata(exporter);
await ss.hostStorage.commit();

const ts4 = getTS.all('v1');
t.deepEqual(ts4, [0, 2, 5, 8]); // still there
@@ -76,11 +80,15 @@ test('repair metadata ignores kvStore entries', async t => {
const { exportData, artifacts } = buildData();

const exporter = makeExporter(exportData, artifacts);
const ss = await importSwingStore(exporter, dbDir);
await ss.hostStorage.commit();
const ssi = await importSwingStore(exporter, dbDir);
await ssi.hostStorage.commit();
await ssi.hostStorage.close();

// perform the repair with spurious kv entries
exportData.set('kv.key2', 'value2');

const ss = openSwingStore(dbDir);
t.teardown(ss.hostStorage.close);
await ss.hostStorage.repairMetadata(exporter);
await ss.hostStorage.commit();

@@ -95,14 +103,17 @@ test('repair metadata rejects mismatched snapshot entries', async t => {
const { exportData, artifacts } = buildData();

const exporter = makeExporter(exportData, artifacts);
const ss = await importSwingStore(exporter, dbDir);
await ss.hostStorage.commit();
const ssi = await importSwingStore(exporter, dbDir);
await ssi.hostStorage.commit();
await ssi.hostStorage.close();

// perform the repair with mismatched snapshot entry
const old = JSON.parse(exportData.get('snapshot.v1.4'));
const wrong = { ...old, hash: 'wrong' };
exportData.set('snapshot.v1.4', JSON.stringify(wrong));

const ss = openSwingStore(dbDir);
t.teardown(ss.hostStorage.close);
await t.throwsAsync(async () => ss.hostStorage.repairMetadata(exporter), {
message: /repairSnapshotRecord metadata mismatch/,
});
@@ -115,14 +126,17 @@ test('repair metadata rejects mismatched transcript span', async t => {
const { exportData, artifacts } = buildData();

const exporter = makeExporter(exportData, artifacts);
const ss = await importSwingStore(exporter, dbDir);
await ss.hostStorage.commit();
const ssi = await importSwingStore(exporter, dbDir);
await ssi.hostStorage.commit();
await ssi.hostStorage.close();

// perform the repair with mismatched transcript span entry
const old = JSON.parse(exportData.get('transcript.v1.0'));
const wrong = { ...old, hash: 'wrong' };
exportData.set('transcript.v1.0', JSON.stringify(wrong));

const ss = openSwingStore(dbDir);
t.teardown(ss.hostStorage.close);
await t.throwsAsync(async () => ss.hostStorage.repairMetadata(exporter), {
message: /repairTranscriptSpanRecord metadata mismatch/,
});