Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: serde implementation for Neon #701

Closed
wants to merge 3 commits into from
Closed

RFC: serde implementation for Neon #701

wants to merge 3 commits into from

Conversation

kjvalencik
Copy link
Member

@kjvalencik kjvalencik commented Mar 18, 2021

Neon Serde

Why

Users would like an ergonomic way of converting between Rust and JavaScript types without serialization.

Prior work can be found in the community maintained neon-serde crate.

What

Serializer and Deserializer are implemented in neon-runtime instead of in neon for performance. It allows us to verify safety in a single place instead of in all operations. Further explanation is given in the module documentation.

Questions

  • What should we call the Error type? Currently SerdeError, but including serde in the name doesn't feel great.
  • Do we need additions to the public error API? It has the following features:
    • Implements Error and Display
    • Can be converted to a NeonResult with or_throw
    • Can be checked for a JavaScript exception
  • Added a ResultExt is is a superset of the functionality of JsResultExt. We may have overlapping impl issues if we try to support both. We may want to eventually deprecate and remove JsResultExt
  • Added to_js_value and from_js_value to Context. What are the thoughts on these names?
  • Should we add something for converting from a JsValue to the Value trait for easy chaining? Not sure what to call it. let data: MyData = cx.argument::<JsValue>(0)?.deserialize(&mut cx).or_throw(&mut cx)?;
  • How can we test this? Is there a public test suite? We probably need to at least attribute the pokedex data.

Future Expansion

  • More impl of ResultExt
  • Add a version of cx.argument that reduces the noise. let data: MyData = cx.argument_from_value(0)?;

@nokome
Copy link

nokome commented May 18, 2021

Hi @kjvalencik. Thank your for this PR (and all the other work that you are doing on neon) 🙏🏽 .

I stumbled across this whilst looking for an update on how to use neon-serde with neon (it appears to be broken with the latest N-API versions of neon). As a workaround I'm currently passing complicated nested serde_json to Javascript by serializing/deserializing using JSON. Obviously, I'd like to avoid this!

Since this is tagged "RFC" I thought I'd answer some of your questions (the ones that I fell that I am able to):

What should we call the Error type? Currently SerdeError, but including serde in the name doesn't feel great.

I agree SerdeError isn't great. How about separate SerializationError and DeserializationError?

Added to_js_value and from_js_value to Context. What are the thoughts on these names?

👍🏽 Both seem good to me.

Should we add something for converting from a JsValue to the Value trait for easy chaining?

This sounds really useful.

I'm an noobie Rust programmer, but please do let me know if you think there is anything I can do to help with this PR.

@kjvalencik
Copy link
Member Author

@nokome Thanks for the feedback! FYI, serializing as JSON is a perfectly valid way to transfer complicated structs. In fact, because the overhead of calling into V8 is so high for most data it's much faster to serialize to JSON than it is to go directly between Rust and JS data types, which is pretty surprising!

The only exception would be structs that include large binary types like ArrayBuffer.

@nokome
Copy link

nokome commented May 19, 2021

Thanks @kjvalencik, that's really useful to know!

@joshbenaron
Copy link

@kjvalencik so if we have a struct User, you propose converting that to a string (using serde) and then doing JSON.parse on the JS side?

Noting this is my use case:

pub struct DataItem {
    owner: String,
    tags: Vec<(String, String)>,
    data: Bytes
}

where DataItem::data can be 10GB worth of data (not small amounts). Is that still the most performant way?

@kjvalencik
Copy link
Member Author

kjvalencik commented Jun 28, 2021

@joshbenaron Binary data is about the only time transcoding directly is faster than JSON serialization. For this specific example, my recommendation would be to serialize the Bytes separately from the struct.

An advantage of this approach over using neon-serde is that the data can be moved zero-copy. serde would require copying the data.

For example:

#[derive(Serialize)]
pub struct DataItem {
    owner: String,
    tags: Vec<(String, String)>,

    #[serde(skip)]
    // This field is skipped and not included when serialized
    data: Bytes
}

// DataItem can be serialized and parsed as JSON, but won't include `data`
// `data` gets sent separately as an `ArrayBuffer`
let data = ArrayBuffer::external(&mut cx, item.data);

@joshbenaron
Copy link

joshbenaron commented Jun 28, 2021

@kjvalencik Ah I see. So just to clarify, is ArrayBuffer synonymous with neon::prelude::JsArrayBuffer?

@kjvalencik
Copy link
Member Author

@joshbenaron Yep, sorry about the incorrect name there. ArrayBuffer is the naming in JS.

@joshbenaron
Copy link

@kjvalencik Thanks for all your help! What about going from JS -> Rust. Should I split up ArrayBuffer and DataItem JSON and pass that to Rust? i.e. are the performance implications of splitting up the buffer the same in both directions (JS <-> Rust)?

@kjvalencik
Copy link
Member Author

@joshbenaron Same goes for the other direction (JS to Rust). I do have a couple follow-up questions.

  • Is 10GB a real number of is that an example? The default maximum size of the V8 heap is 1.5GB. Regardless of implementation, holding that much data in JavaScript is not possible.
  • Are the bytes actually accessed by JavaScript or is it simply carrying the data to another Rust call? If that's the case, JsBox can also be used efficiently (and wouldn't be subject to the 1.5GB limit).

@joshbenaron
Copy link

joshbenaron commented Jun 28, 2021

@kjvalencik in fact 10GB can be small. Essentially I have some JSON which has some data which can be any size. could be a petabyte. I create a signature based on all the fields and encode the JSON to binary using bincode. Does the memory limit apply in docker, and you can set the memory limit higher right?

And yes, JS preprocesses the data before going into Rust

@kjvalencik
Copy link
Member Author

@joshbenaron This is a limit of the V8 engine. The default heap size is 1.5GB. You cannot allocated more than this on the heap.

@joshbenaron
Copy link

@joshbenaron This is a limit of the V8 engine. The default heap size is 1.5GB. You cannot allocated more than this on the heap.

node --max-old-space-size=8192 app.js?

@kjvalencik
Copy link
Member Author

Yeah, that's the parameter to override.

@kjvalencik kjvalencik force-pushed the kv/serde branch 2 times, most recently from f0e0b3b to 61636ca Compare December 21, 2022 20:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants