feat: Add a helper for publishing iteration results to chain storage #5432
New file (57 lines added):

```js
// @ts-check

import { observeIteration } from '@agoric/notifier';
import { E } from '@endo/far';

/**
 * Publish results of an async iterable into a chain storage node as an array
 * serialized using JSON.stringify or an optionally provided function (e.g.,
 * leveraging a serialize function from makeMarshal).
 * Array items are possibly-duplicated [index, value] pairs, where index is a
 * string (ascending numeric or terminal "finish" or "fail").
 *
 * @param {ERef<AsyncIterable>} source
 * @param {{ setValue: (val: any) => void }} chainStorageNode
 * @param {{ timerService: ERef<TimerService>, serialize?: (obj: any) => string }} powers
 */
export async function publishToChainNode(
  source,
  chainStorageNode,
  { timerService, serialize = JSON.stringify },
) {
  let nextIndex = 0n;
  let isNewBlock = true;
  let oldResults = [];
  let results = [];
  const makeAcceptor = forceIndex => {
    return value => {
      if (isNewBlock) {
        isNewBlock = false;
        oldResults = results;
        results = [];
        E(timerService)
          .delay(1n)
          .then(() => {
            isNewBlock = true;
          });
      }
      // To avoid loss when detecting the new block *after* already consuming
      // results produced within it, we associate each result with an index and
      // include results of the previous batch.
      // Downstream consumers are expected to deduplicate by index.
      // We represent the index as a string to maintain compatibility with
      // JSON.stringify and to avoid overly inflating the data size (e.g. with
      // "@qclass" objects from makeMarshal serialize functions).
      const index = forceIndex || String(nextIndex);
      nextIndex += 1n;
      results.push([index, value]);
      const combined = harden(oldResults.slice().concat(results));
      E(chainStorageNode).setValue(serialize(combined));
    };
  };
  await observeIteration(source, {
    updateState: makeAcceptor(),
    finish: makeAcceptor('finish'),
    fail: makeAcceptor('fail'),
  });
}
```
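The doc-comment above says downstream consumers are expected to deduplicate by index. A minimal consumer-side sketch (the helper name and sample data are illustrative, not part of this PR): merge successive published arrays into a map keyed by the string index, so entries repeated from the previous batch collapse.

```javascript
// Hypothetical consumer-side helper (not part of this PR): merge the
// possibly-overlapping [index, value] batches that publishToChainNode
// emits, deduplicating by the string index.
const dedupeByIndex = batches => {
  const byIndex = new Map();
  for (const batch of batches) {
    for (const [index, value] of batch) {
      // Later batches re-include entries from the previous one; last write wins.
      byIndex.set(index, value);
    }
  }
  return byIndex;
};

// Two consecutive published arrays, where the second repeats entry '1'.
const first = [['0', 'a'], ['1', 'b']];
const second = [['1', 'b'], ['2', 'c'], ['finish', 'done']];
const merged = dedupeByIndex([first, second]);
console.log(merged.size); // → 4
```

Because `Map` iterates in insertion order and numeric indexes are published in ascending order, iterating `merged` also yields results in publication order, with any terminal `"finish"`/`"fail"` entry last.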
New file (70 lines added):

```js
// @ts-check

import { Far } from '@endo/far';

const { details: X } = assert;

// TODO: Formalize segment constraints.
// Must be nonempty and disallow (unescaped) `.`, and for simplicity
// (and future possibility of e.g. escaping) we currently limit to
// ASCII alphanumeric plus underscore.
const pathSegmentPattern = /^[a-zA-Z0-9_-]{1,100}$/;

/**
 * Create a root storage node for a given backing function and root key.
 *
 * @param {(message: any) => any} toStorage a function for sending a
 *   storageMessage object to the storage implementation
 *   (cf. golang/cosmos/x/swingset/storage.go)
 * @param {string} rootKey
 */
export function makeChainStorageRoot(toStorage, rootKey) {
  assert.typeof(rootKey, 'string');

  function makeChainStorageNode(key) {
    const node = {
      getKey() {
        return key;
      },
      getChildNode(name) {
        assert.typeof(name, 'string');
        assert(
          pathSegmentPattern.test(name),
          X`Path segment must be a short ASCII identifier: ${name}`,
        );
        return makeChainStorageNode(`${key}.${name}`);
      },
      setValue(value) {
        assert.typeof(value, 'string');
        toStorage({ key, method: 'set', value });
      },
      async delete() {
        assert(key !== rootKey);
        // A 'set' with no value deletes a key if it has no children, but
        // otherwise sets data to the empty string and leaves all nodes intact.
        // We want to reject silently incomplete deletes (at least for now).
        // This check is unfortunately racy (e.g., a vat could wake up
        // and set data for a child before _this_ vat receives an
        // already-enqueued response claiming no children), but we can tolerate
        // that because transforming a deletion into a set-to-empty is
        // effectively indistinguishable from a valid reordering where a fully
        // successful 'delete' is followed by a child-key 'set' (for which
        // absent parent keys are automatically created with empty-string data).
        const childCount = await toStorage({ key, method: 'size' });
        if (childCount > 0) {
          assert.fail(X`Refusing to delete node with children: ${key}`);
        }
        toStorage({ key, method: 'set' });
      },
      // Possible extensions:
      // * getValue()
      // * getChildNames() and/or getChildNodes()
      // * getName()
      // * recursive delete
      // * batch operations
      // * local buffering (with end-of-block commit)
    };
    return Far('chainStorageNode', node);
  }

  const rootNode = makeChainStorageNode(rootKey);
  return rootNode;
}
```

Inline comment on `getKey()`:

> Good! The composer in #5400 can pass this method through directly.

Inline discussion on `delete()`:

> This issue has been resolved in Golang. Please remove the race and make this a direct wrapper.

> As I mentioned in #5385 (comment) the problem is solved and merged. I think you and @warner may be misunderstanding the intended semantics (implemented by #5394): The linkage information is just used to enumerate children. To demonstrate the separation between linkage and data, that linkage information is kept in

> Please remove this racy workaround. It is unnecessary, and the only reason I'm requesting changes.
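To make the storageMessage protocol above concrete, here is a toy in-memory backend standing in for the real bridge (an assumption for illustration; the actual backend is the Golang storage code referenced in the doc-comment, and the `published.psm` keys are made up). It honors the two methods the node relies on: `'set'` (where a missing value means delete) and `'size'` (count of direct children, inferred here from dotted keys).

```javascript
// Illustration only: an in-memory stand-in for the chain-side storage,
// implementing the message shapes emitted by makeChainStorageRoot nodes.
const makeMemoryBackend = () => {
  const data = new Map();
  // Direct children of `key` are keys one dot-separated segment deeper.
  const children = key => {
    const prefix = `${key}.`;
    return [...data.keys()].filter(
      k => k.startsWith(prefix) && !k.slice(prefix.length).includes('.'),
    );
  };
  const toStorage = ({ key, method, value }) => {
    switch (method) {
      case 'set':
        // A 'set' with no value is a delete, matching the comment in delete().
        if (value === undefined) data.delete(key);
        else data.set(key, value);
        return undefined;
      case 'size':
        return children(key).length;
      default:
        throw Error(`unknown storageMessage method: ${method}`);
    }
  };
  return { toStorage, data };
};

const { toStorage, data } = makeMemoryBackend();
toStorage({ key: 'published.psm', method: 'set', value: '{}' });
toStorage({ key: 'published.psm.metrics', method: 'set', value: '[]' });
console.log(toStorage({ key: 'published.psm', method: 'size' })); // → 1
```

With this stub, `makeChainStorageRoot(toStorage, 'published')` could be exercised end to end in a unit test, since `toStorage` is the only capability the node factory needs.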
New file (15 lines added):

```js
import { E, Far } from '@endo/far';
import { makeChainStorageRoot } from './lib-chainStorage.js';

export function buildRootObject(_vatPowers) {
  function makeBridgedChainStorageRoot(bridgeManager, bridgeId, rootKey) {
    // XXX: Should we validate uniqueness of rootKey, or is that an external concern?
    const toStorage = message => E(bridgeManager).toBridge(bridgeId, message);
    const rootNode = makeChainStorageRoot(toStorage, rootKey);
    return rootNode;
  }

  return Far('root', {
    makeBridgedChainStorageRoot,
  });
}
```
Review conversation:

> This was originally envisioned as a function that accepts a chainStorage node and returns a subscription kit, but upon digging in it seemed more useful to me for it to consume from an existing iterable and return nothing. Still another option would be to make it effectively a tee, but I think it makes more sense to have the caller compose e.g. `getKey()` and the base iterable.

> I don't think this is the right abstraction. It crosses layers and combines concerns in a muddy way. What are you trying to improve wrt the sketch in #5400?

> What layers do you think are being crossed and what concerns are being combined? Transmitting arbitrary data into a chainStorage node seems very respectful of layers and independent concerns. As for #5400, I looked it over carefully but didn't see anything to address the specific requirements of publishing to chain storage... am I just missing something?

> I guess what jumps out at me looking at the helper is that it is trying to combine the iterator observation with the block batching. I think that the block batching belongs in the cosmic-swingset layer, and the iteration client is handled by #5400.
>
> You're right that it needs to be done at a lower layer than #5400, but it should be even lower than JS entirely. I'm trying hard to "put the code where the data is". Since it is a cosmic-swingset concern (touching block height, event log, and merkle tree), it belongs outside of vats. I appreciate that you're trying to respect people's time, but I would really like it if you would reach out if something about the layering doesn't make sense or seems unnecessarily weird (e.g. the chainTimer hack).
>
> Here is an adapter that explains how #5400 accomplishes these platform concerns, presuming that the batching is solved at the cosmic-swingset layer:
>
> Is that clearer now? I'll be working on splitting the `x/pubstore` Golang module from `x/swingset`. I want to implement cosmos-specific publishing concerns there, such as batching, history, and event logging.
>
> With some polish and moving the code around until we have minimal interfaces, I think this could be quite good.
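As a sketch of the layering the reviewer argues for (my reading of the intent, not code from this PR or from #5400): if per-block batching moves down to the host layer, vat-side code simply sets values, and the host coalesces writes per key and commits once per block. All names below are hypothetical.

```javascript
// Hypothetical host-side batcher: collect setValue calls during a block,
// keeping only the last write per key, and flush them in one commit at
// end-of-block (where cosmic-swingset already has block boundaries).
const makeBlockBatcher = commit => {
  let pending = new Map();
  return {
    setValue: (key, value) => {
      pending.set(key, value); // last write per key wins within a block
    },
    endBlock: () => {
      const batch = [...pending.entries()];
      pending = new Map();
      commit(batch);
      return batch.length;
    },
  };
};

const committed = [];
const batcher = makeBlockBatcher(batch => committed.push(batch));
batcher.setValue('published.amm', 'v1');
batcher.setValue('published.amm', 'v2'); // supersedes v1 within the block
batcher.setValue('published.psm', 'x');
console.log(batcher.endBlock()); // → 2
```

Compared with `publishToChainNode`, this removes the timer-based new-block detection (and the duplicated previous batch) from the vat entirely, which is the substance of the "put the code where the data is" objection.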