-
Notifications
You must be signed in to change notification settings - Fork 213
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Document and clarify the virtual object implementation #1960
Conversation
e07c615
to
9efd717
Compare
|
||
A virtual object is an object with a durable identity whose state is state is automatically and transparently backed up in secondary storage. This means that when a given virtual object is not in active use, it need not occupy any memory within the executing vat. Thus a vat can manage an arbitrarilly large number of such objects without concern about running out of memory. | ||
|
||
A virtual object has a "kind", which defines what sort of behavior and state it will possess. You can think of a kind somewhat like a data type, but without direct linguistic support. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the following is too detailed or obscure for what you're trying to do, ignore. But "data type" is wrong, as a type can have multiple implementations. The "kind" here is a specific implementation. In conventional oo terms, it is more like a concrete class than like an interface type.
A virtual object has a "kind", which defines what sort of behavior and state it will possess. You can think of a kind somewhat like a data type, but without direct linguistic support. | |
A virtual object has a "kind", which defines what sort of behavior and state it will possess. You can think of a kind somewhat like an object implementation in the objects-as-closure pattern, but with explicit state management rather than lexically captured variables. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I was a little hesitant about my phrasing also. You're right that it's not a data type, which is why I described it as like a data type, hoping to draw an analogy. It is a type of type, but it's not abstract. One might almost want to use the world "class" if it weren't already hopelessly overloaded. "Kind" means "the things with this implementation", but I don't care for the description in terms of the objects-as-closure pattern, because I don't want to have to describe the objects-as-closure pattern nor assume people know what that means (a lot of people know it, but don't have a name for it, and this doesn't seem like the right place to try to first establish that connection). I think this paragraph still needs work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Try this on for size:
A virtual object has a "kind", which defines what sort of behavior and state it will possess. A kind is not exactly a data type, since it comes with a concrete implementation, but it indicates a family of objects that share a set of common behaviors and a common state template.
|
||
The `bodyMaker` function is passed the virtual object's state as a parameter, and is expected to return a new Javascript object with methods that close over this state. | ||
|
||
One distinguished method, named `initialize`, is called when new instances are first created, and is expected to initialize the state object. Any parameters passed to the maker function returned by `makeKind` are passed directly to the `initialize` method. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One distinguished method, named `initialize`, is called when new instances are first created, and is expected to initialize the state object. Any parameters passed to the maker function returned by `makeKind` are passed directly to the `initialize` method. | |
One distinguished method, named `initialize`, is called when new instances are first created, and is expected to initialize the state object. Any parameters passed to the maker function returned by `makeKind` are passed directly to the `initialize` method. | |
Although this `initialize` method is a method of the object returned by the `bodyMaker` it is not a method of the actual object being made. The maker returned by `makeKind` on makes objects with the remaining methods. |
The initialize
method is so special that I wonder whether it should be called, for example __initialize__
. It is weird that you cannot use this system to make objects that happen to have a normal method named initialize
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we had some way to declare initialize
as private
that would do 90% of the job, as the main purpose of the magic is to ensure that handing out a reference to the object is not also handing out a capability to reset all its state. If nobody tries to call initialize
a second time they'll never notice the magic is even there. (The other magic is the convention that the initialize method is named initialize
, but that's more of an aesthetic concern. I'd be open to marking it in some other way, though the specific suggestion of __initialize__
is so horribly butt ugly it makes me want to scream.)
Also (and this is really more of a pedantic point) initialize
is a method of the actual object being made; it's just that it's removed after it's used for safety reasons. In particular, if you had some kind of weird initialization dance that worked by having initialize
hand the object to some external worker code that called initialize
recursively, you could do that and it would work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's just that it's removed after it's used
With delete
? God that's ugly. We should definitely find a way to express the initialization function other than by making it a method of the object.
In particular, if you had some kind of weird initialization dance that worked by having initialize hand the object to some external worker code that called initialize recursively, you could do that and it would work.
That implies that you start executing before deleting it. That in turn implies that you start executing before hardening. We cannot do that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's just that it's removed after it's used
With delete? God that's ugly. We should definitely find a way to express the initialization function other than by making it a method of the object.
I'm open to any suggestion that preserves reasonable ergonomics. Keep in mind that, as @warner has pointed out, the initializer needs to able to see the object that the body maker returns (or will return), not just the state, because it may need to pass it to somebody else as part of initialization.
In particular, if you had some kind of weird initialization dance that worked by having initialize hand the object to some external worker code that called initialize recursively, you could do that and it would work.
That implies that you start executing before deleting it. That in turn implies that you start executing before hardening. We cannot do that.
What do you mean by "start executing"? Keep in mind that I'm talking here about what happens during initialization, when the state is still malleable because it hasn't been entirely established yet. We're still on the initialize
call stack.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you mean by "start executing"?
That you call initialize
while the object still has an initialize
method. This is fatally dangerous if
initializer needs to able to see the object that the body maker returns (or will return), not just the state, because it may need to pass it to somebody else as part of initialization.
because then it is giving other things access to the object a) before it is hardened, and b) while it still has an initialize
method that may be abused.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
__initialize__
is so horribly butt ugly it makes me want to scream
That problem goes away if we stop confusing the namespaces.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just realized a very important implementation detail that you missed (maybe because you're looking at the documentation PR instead of the code?) and that I straight up forgot (meaning, among other things, that my remarks above about whacky recursive initialization are just totally, utterly, and embarrassingly wrong): the initialize
function is extracted and deleted from the object, and then the object is hardened, before initialize
is called. Nobody gets to see the object before it's hardened.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, looking at the code, just found a bug in same, which I will now proceed to fix. Grrr.
|
||
However, the vat secondary storage system's `makeWeakStore` augments the one provided by the `@agoric/store` package in two important ways: | ||
|
||
- Weak store operations may use virtual object representatives as keys, and if you do this the corresponding underlying virtual object instance identities will be used for indexing rather than the object identities of the representative objects. (I.e., two object representives for which `sameKey` is true will operate on the same weak store entry). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought we decided to fix this using WeakRefs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Forgive me, but I totally don't understand what you mean by that. Fix what using weakrefs how?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We only need to weaken object identity if we're avoiding weakrefs. If we are now allowing ourselves to use weakrefs as long as we don't leak non-determinism, for each virtual object, we can ensure there is at most one javascript instance. When we unserialize a reference to that virtual object, we reuse the existing javascript object if it has not yet been collected. Virtual object identity and JavaScript object identity can remain exactly one-to-one, which helps us avoid a wide range of JavaScript programming hazards.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We will still rename sameStructure
to sameKey
and explain it as a less precise equivalent class. But it is less precise for pass-by-copy objects, comparing them only by contents, not identity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I recall the discussion now. That would allow us to reduce the current 3-layer implementation to a 2-layer implementation, which would simplify some things (though it would also require doing delicate WeakRef
things in exchange, so I'm not sure it's a net win on implementation complexity -- though enabling users to use ===
probably pays for it). Did we eventually decide we were OK committing to Node 14 as the minimum required version?
BTW, I've implemented a placeholder sameKey
function in the virtual object manager (PR forthcoming) but it only understands the inner key magic for virtual objects (everything else goes through object identity) and I was wondering about the work of integrating it with the current sameStructure
implementation. Presumably this effort can disappear in a puff of logic if ===
works?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would indeed disappear.
It's not just the user using ===
. It is all the hidden ways users depend on JS object identity which we'd need to teach them to avoid.
SES-shim needs to support earlier versions of Node. agoric-sdk does not. Endo will be somewhere between the two attn @kriskowal . This persistence system is right on the edge between Endo and agoric-sdk. Not sure yet which way it should go.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This persistence system is very tightly integrated with the SwingSet kernel and liveslots, so I don't think it winds up as an Endo thing per se. It's plausible we'll want something like it for Endo, and also plausible that the something like it, whatever it proves to be, can cannibalize the current implementation, but I think we first have to answer some questions about the role of CapTP in Endo. Off topic here, I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@warner I think we need to have a conversation about the priority of reworking things using weak refs, relative to the other things on our plate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we now have the option of mapping one-to-one with JS object identity, I strongly encourage us to.
|
||
The `bodyMaker` function is passed the virtual object's state as a parameter, and is expected to return a new Javascript object with methods that close over this state. | ||
|
||
One distinguished method, named `initialize`, is called when new instances are first created, and is expected to initialize the state object. Any parameters passed to the maker function returned by `makeKind` are passed directly to the `initialize` method. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
__initialize__
is so horribly butt ugly it makes me want to scream
That problem goes away if we stop confusing the namespaces.
function counterBody(state) { | ||
return { | ||
initialize(name = 'thing') { | ||
state.counter = 0; | ||
state.name = name; | ||
}, | ||
inc() { | ||
state.counter += 1; | ||
}, | ||
dec() { | ||
state.counter -= 1; | ||
}, | ||
reset() { | ||
state.counter = 0; | ||
}, | ||
rename(newName) { | ||
state.name = newName; | ||
}, | ||
getCount() { | ||
return state.counter; | ||
}, | ||
getName() { | ||
return state.name; | ||
}, | ||
}; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
function counterBody(state) { | |
return { | |
initialize(name = 'thing') { | |
state.counter = 0; | |
state.name = name; | |
}, | |
inc() { | |
state.counter += 1; | |
}, | |
dec() { | |
state.counter -= 1; | |
}, | |
reset() { | |
state.counter = 0; | |
}, | |
rename(newName) { | |
state.name = newName; | |
}, | |
getCount() { | |
return state.counter; | |
}, | |
getName() { | |
return state.name; | |
}, | |
}; | |
} | |
function makeCounterKit(state) { | |
const init = (name = 'thing') => { | |
state.counter = 0; | |
state.name = name; | |
}; | |
const body = { | |
inc() { | |
state.counter += 1; | |
}, | |
dec() { | |
state.counter -= 1; | |
}, | |
reset() { | |
state.counter = 0; | |
}, | |
rename(newName) { | |
state.name = newName; | |
}, | |
getCount() { | |
return state.counter; | |
}, | |
getName() { | |
return state.name; | |
}, | |
}; | |
return harden({ | |
init, | |
body, | |
}); | |
} |
makeCounterKit
makes a counter kit, which is a record with an init
function and a body
object. You can write it as above, which also gives a name to the body
object that the init
function and body
methods can use. Since none of these do, this one could be written with everything inline
function makeCounterKit(state) {
return harden({
init(name = 'thing') {
state.counter = 0;
state.name = name;
},
body: {
inc() {
state.counter += 1;
},
dec() {
state.counter -= 1;
},
reset() {
state.counter = 0;
},
rename(newName) {
state.name = newName;
},
getCount() {
return state.counter;
},
getName() {
return state.name;
},
},
});
}
Both of these harden before anything happens, protecting ourselves from undesired mutation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The kit properties should probably be named {init, self}
rather than {init, body}
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is still too complicated and ugly to my eyes and full of places to make mistakes.
Would you feel better if instead of deleting initialize
we just replaced it with ()=>{}
?
BTW, this example illustrates an interesting difference between our respective API intuitions. I tend to see any place where the user needs to add a harden
before passing to our SDK as an ergonomics problem. Basically, if they need to put in a harden
to be safe and they don't, then they've unwittingly made themselves vulnerable. If the context is such that the harden should always be there, then the SDK should do it for them. (Better would be a language where every API surface is hardened at birth, but that's not the language we have.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would you feel better if instead of deleting initialize we just replaced it with ()=>{}?
Worse. This goes in the wrong direction. Digs the hole deeper.
Better would be a language where every API surface is hardened at birth, but that's not the language we have
Agreed on that. I also agree it is not clear how to best recover from this mistake. But the recovery should not require subtle reasoning on the part of the programmer to stay safe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code should say what it means. In that regard, my code is much simpler than your's because they both describe an object with no initialize
method but only my code looks like it does that. Your code is smaller but misleads.
Patient: Doctor, it makes me want to scream when I confuse namespaces. |
f9990b1
to
e407e6b
Compare
Since this is a documentation PR that documents what's currently on master, I'm going to go ahead and merge it, but doing this is not a claim that we've resolved the API design discussion. |
@FUDCo |
No description provided.