-
Notifications
You must be signed in to change notification settings - Fork 77
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use vectors for calls and imports #145
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
(Sorry for the latency.)
FWIW, I'm still unconvinced that these changes provide any tangible benefit; but if one accepts the premise that this is what people want, then the changes look fine.
@jakobkummerow, PTAL. I introduced a few convenience macros to make manual handling of vectors (and values) less verbose. Also, fixed latent crash bugs in examples where out val_vecs weren't properly initialised (which might cause attempts to release an own reference if a val happens to materialise with ref kind). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not clear to me that passing struct containing a pointer and length, instead of just passing the pointer and length, is valuable. A plain pointer + length pair is simpler, more idiomatic in C APIs, and easier to bind to.
For example, witx tooling already knows how to map a high-level Interface-Types-like array
type into separate pointer and length arguments when generating a C binding, and it's a simple technique that generates idiomatic APIs in a principled way. It does mean that you can't infer the full high-level semantics from the C API signatures alone, however that's already true for other reasons, and it doesn't have any obvious practical downsides.
include/wasm.h
Outdated
@@ -510,7 +510,7 @@ WASM_API_EXTERN const wasm_memory_t* wasm_extern_as_memory_const(const wasm_exte | |||
WASM_DECLARE_REF(instance) | |||
|
|||
WASM_API_EXTERN own wasm_instance_t* wasm_instance_new( | |||
wasm_store_t*, const wasm_module_t*, const wasm_extern_t* const imports[], | |||
wasm_store_t*, const wasm_module_t*, own wasm_extern_vec_t* imports, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why did this change from a const T*
to an own T*
? If I understand the code here, this function isn't expected to take ownership of the array.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right! This was because no other places needed non-own vectors before, so I hadn't noticed it and supporting it required extra infrastructure. Fixed now.
It's a question of consistency. The API uses the vec abstraction in all other places, so it would be rather incoherent to not use it in this one. It also is more self-documenting. I haven't tried, but I would expect that bindings benefit from consistency as well. (You could argue that we shouldn't use vec anywhere except for return types, but I'm not sure that's preferable, especially on the C++ end. In any case, that would be a different and much larger change.) |
@jakobkummerow, PTAL. |
I think in terms of ownership the API right now is pretty inconsistent, and I think that @sunfishcode's suggestion could greatly improve the consistency and understandability of the API at-a-glance. For example I think it would be more understandable if vectors were only used as either return values or "here's ownership of an allocated chunk of memory". Currently vectors are sometimes used as ownership containers and sometimes used as "here's a temporary view into my data". Additionally the ownership itself is pretty:
Given this state the only real way to figure out the ownership of an API seems to be to look at the documentation, from the signature alone it's pretty unclear where ownership is being transferred. For example it seems like a weird gotcha that I think it could be more consistent to say "using a vector means transfer of ownership". That way if you ever see a |
FWIW I can say that at least from my experience consistency matters a lot with respect "what does everyone else in C do" just as much as within the API itself. For example to bind |
I see what you mean. In most cases, where ownership transfer occurs was driven by what information an object needs to keep around, and what it is more likely to reconstruct from scratch. In some cases this may be biased towards what was most natural for the V8 backend. However, "using a vector means transfer of ownership" may not be a sufficiently nuanced guiding principle, since you sometimes just want to return a pair of vector and length without passing ownership. That pairing was the primary purpose of the abstraction after all. Always passing ownership would either force premature copying, or extra levels of reference counting and indirection. The type representations are an example of this: e.g., just traversing the arguments of a function type should not require copying the vector and its types (which may get more complex in the future). Ownership ought to be more apparent in the C++ version, where we at least have types to represent it. Though I believe it is still explicit from the Happy to discuss how to make this more consistent. Heads-up though that I'm off to vacation tomorrow, so probably won't reply during the next week. |
Ah yes to be clear the C API, as is today, is clear about where ownership is transferred. The type signatures (and the For example why does I agree that APIs sometimes need to return two values, such as An example of this is the libgit2 API where |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code looks fine. (By which I don't mean to take a position in the ongoing discussion about what the best API design would be.)
@@ -648,8 +648,8 @@ public: | |||
Func() = delete; | |||
~Func(); | |||
|
|||
using callback = auto (*)(const Val[], Val[]) -> own<Trap>; | |||
using callback_with_env = auto (*)(void*, const Val[], Val[]) -> own<Trap>; | |||
using callback = auto (*)(const vec<Val>&, vec<Val>&) -> own<Trap>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just FYI, allocating the param/result data structures is highly performance sensitive for fast calls. For maximum optimization, it's important that an engine can materialize the respective objects "by hand", i.e. precisely predict the C++ object layout. That's trivially doable with a Val[]
and should still be fine with a const vec<Val>&
, so I'm just pointing this out in case further changes will be discussed/considered here.
(FWIW, we haven't gone that far yet in V8, we're still using malloc
here, which is fairly slow. C++-side stack allocation of dynamically sized arrays is not an option for us.)
I created a issue #150 for the ownership discussion. Landing this patch for now. |
Return to using vectors for function call arguments and results, as well as module imports. This was simplified to a straight array interface in #48, but additional sanity checks have been requested.
Unlike #134, this avoids the use of struct-by-value in the C API and implements the API change consistently in both C and C++ API.