Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

v8: expose new V8 5.5 serialization API #11048

Closed
wants to merge 3 commits into from

Conversation

addaleax
Copy link
Member

Expose the new serialization API that was added in V8 5.5 to userland. The JS API is virtually a direct copy of what V8 provides on the C++ level.

This is useful Node as a possible replacement for some internals that currently use JSON, like IPC, but is likely to be useful to general userland code as well.

Checklist
  • make -j4 test (UNIX), or vcbuild test (Windows) passes
  • tests and/or benchmarks are included
  • documentation is changed or added
  • commit message follows commit guidelines
Affected core subsystem(s)

V8

/cc @nodejs/v8

@addaleax addaleax added dont-land-on-v4.x semver-minor PRs that contain new features and should be released in the next minor version. v8 engine Issues and PRs related to the V8 dependency. labels Jan 28, 2017
@nodejs-github-bot nodejs-github-bot added build Issues and PRs related to build files or the CI. c++ Issues and PRs that require attention from people who are familiar with C++. v8 engine Issues and PRs related to the V8 dependency. labels Jan 28, 2017
Copy link
Member Author

@addaleax addaleax left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -61,7 +61,7 @@
'v8_android_log_stdout%': 0,

# Force disable libstdc++ debug mode.
'disable_glibcxx_debug%': 0,
'disable_glibcxx_debug%': 1,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change is weird, and I’m not sure what the right place to do it is. It’s required because the std::vector that is exposed by _GLIBCXX_DEBUG isn’t necessarily ABI-compatible with the one that’s exposed without the define; so either Node would also have to use that flag or V8 needs to disable it by default.

And in general, I’m not sure it’s intentional on @nodejs/v8’s side that debug builds expose a different ABI than release builds…?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should be able to override it from common.gypi but I don't expect it's necessary to do so.

Checked STL containers have different name mangling in libstdc++. I think the worst case is that the add-on fails to load at runtime, but not that it silently does the wrong thing.

Copy link
Member

@targos targos Jan 30, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW in V8 5.6 Release will return a std::pair<uint8_t*, size_t> instead of a vector.

Ref: https://github.com/v8/v8/blob/753a2b55781face0f10b8d9a6fb0da45ede2c693/include/v8.h#L1784

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Checked STL containers have different name mangling in libstdc++. I think the worst case is that the add-on fails to load at runtime, but not that it silently does the wrong thing.

Some of the functions do get inlined, even in debug builds. (Like, this modification is not just academic; without it, I could observe actual out-of-bound writes.)

FWIW in V8 5.6 Release will return a std::pair<uint8_t*, size_t> instead of a vector.

Ok, then at least we don’t need to worry about it. :)

doc/api/v8.md Outdated
Read raw bytes from the deserializer’s internal buffer. The `length` parameter
must correspond to the length of the buffer that was passed to
[`serializer.writeRawBytes()`][].
For use inside of a custom `options.writeHostObject`.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most of the wording here is taken directly from the V8 headers, btw.

@mscdex
Copy link
Contributor

mscdex commented Jan 28, 2017

How does this compare performance-wise to using JSON.parse()/JSON.stringify() for IPC? I looked into the possibility of using a binary protocol for IPC a little while ago and it wasn't even close to the JSON performance IIRC.

@addaleax
Copy link
Member Author

How does this compare performance-wise to using JSON.parse()/JSON.stringify() for IPC?

I haven’t really benchmarked that so far. A round-trip on the data in test/fixtures/url-tests.json shows that it’s about 15 % slower than JSON, a round-trip on 'x'.repeat(100000) works out about 4× faster than JSON for me; I guess it just depends a lot on what kind of data you’re sending. I’m also assuming there’s room for improvement in V8 here.

I’m definitely not suggesting that we make a decision to change the IPC procotol right now; it’s just an option we may want to keep an eye on.

Also: This protocol is a lot more expressive than JSON, so having this available is going to be useful on its own.

@targos
Copy link
Member

targos commented Jan 28, 2017

+1 but I think we should wait until the API and binary format are stable. ReleaseBuffer will be renamed to Release (and return a std::pair) in V8 5.6 for example.

@targos
Copy link
Member

targos commented Jan 28, 2017

What about allowing to extend the class and implement the optional methods instead of using an object of options?

@addaleax addaleax added the blocked PRs that are blocked by other issues or PRs. label Jan 28, 2017
@addaleax
Copy link
Member Author

What about allowing to extend the class and implement the optional methods instead of using an object of options?

That should work just as well. The options approach felt more Node-like, but I don’t feel strongly about it.

I think we should wait until the API and binary format are stable.

Sure, we can do that. I’ve marked the PR as “blocked” but tbh I am not too worried about the binary format given that it’s non-portable anyway.

@targos
Copy link
Member

targos commented Jan 28, 2017

I'm worried because I believe one of the first things users will do with this API is serialize complex objects and save them to disk for later comparison in unit tests.

@indutny
Copy link
Member

indutny commented Jan 28, 2017

Considering that it uses just v8.h, is there any reason to not do it in npm module?

I know getting things to core sounds great, but it is simpler to improve API while the module is managed separately. It can always be included in core later, when the API will be stabilized and well tested.

What do you think?

@addaleax
Copy link
Member Author

I'm worried because I believe one of the first things users will do with this API is serialize complex objects and save them to disk for later comparison in unit tests.

Yeah, sure. Like, I didn’t want to imply that I think waiting until the API is no longer experimental is a bad idea. :)

Considering that it uses just v8.h, is there any reason to not do it in npm module?

Basically no other reason than that I think (and I might be wrong) that Node core is going to be a consumer of this API anyway.

It can always be included in core later, when the API will be stabilized and well tested.

This PR is very very (maybe too) close to the V8 API, so if we do what @targos suggests and we just wait until V8 considers it stable we should be fine.

Copy link
Member

@jasnell jasnell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great to see. Almost there.

@@ -134,3 +134,150 @@ setTimeout(function() { v8.setFlagsFromString('--notrace_gc'); }, 60e3);
[V8]: https://developers.google.com/v8/
[here]: https://github.com/thlorenz/v8-flags/blob/master/flags-0.11.md
[`GetHeapSpaceStatistics`]: https://v8docs.nodesource.com/node-5.0/d5/dda/classv8_1_1_isolate.html#ac673576f24fdc7a33378f8f57e1d13a4

## Serialization API

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we are going to introduce this API, even tho I know this is going into the v8 module that is already clearly marked as being fluid based on what v8 chooses to do, we should mark this explicitly as being Experimental for the time being.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jasnell done!

const v8binding = process.binding('v8');
const serdesBinding = process.binding('serdes');
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

perhaps something more explicit such as v8serdes would be better? Not sure about that tho.
@nodejs/node-chakracore have you all considered implementing this mechanism yet?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I guess that depends if and how node-chakracore would want to implement this. I don’t think the binding name really matters a lot anyway.

lib/v8.js Outdated

exports.Serializer =
class Serializer extends serdesBinding.Serializer {
constructor(options = {}) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extremely minor nit but use of argument defaults has not benchmarked rather well yet. It's likely not something to worry about in this case, however

@@ -0,0 +1,93 @@
'use strict';
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A test that includes a pre-serialized buffer that can be round-tripped serialized->deser->re-serialized would be good also.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jasnell done!

Copy link
Contributor

@sam-github sam-github left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know this won't land until the v8 api is stable, and maybe better documented, but left some comments on docs I don't understand.

doc/api/v8.md Outdated
* `getDataCloneError` {Function} A callback that takes a single `message`
argument and returns an error object. Defaults to the `Error` constructor.
* `writeHostObject` {Function} A callback that takes a single `object`
argument and is used to write some kind of host object, if possible. If not,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's a host object?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's a host object?

An object that’s exposed from C++, i.e. by Node or an addon. Most of the options & methods here won’t be very useful to the average user… so I’ve gone ahead and implemented v8.[de]serialize shorthands on top of these classes.


Returns the stored internal buffer. This serializer should not be used once
the buffer is released. Calling this method results in undefined behavior
if a previous write has failed.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is that accurate? On failure, you aren't allowed to cleanup?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how do writes fail? do they throw errors? writeHeader/Value don't say they can throw.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On failure, you aren't allowed to cleanup?

I guess that depends on what you mean by “cleanup”? Any resources held by the serializer will be returned when it gets gc’ed.

how do writes fail? do they throw errors? writeHeader/Value don't say they can throw.

Yes, writeValue may throw errors. I’ve noted that in the corresponding section.

doc/api/v8.md Outdated
* `id` {Integer} A 32-bit unsigned integer.
* `arrayBuffer` {ArrayBuffer|SharedArrayBuffer} An `ArrayBuffer` instance.

Marks an `ArrayBuffer` as havings its contents transferred out of band.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

transferred? out-of-band?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

transferred? out-of-band?

Uh, yes. Basically says that the actual ArrayBuffer isn’t included in the resulting serialization, and that the data is made available to the deserializer through some other means.

If you have suggestions for better wording on anything, it might be a good idea to upstream them into V8, too.

doc/api/v8.md Outdated

#### deserializer.readHeader()

* Returns: {Boolean}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

true/false means?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

true/false means?

I’ve removed the return type from the documentation. Right now it’s always true except when an error is thrown.

Copy link
Member

@bnoordhuis bnoordhuis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left some comments. Overall impression is good, though.

lib/v8.js Outdated

readRawBytes(length) {
const offset = this._readRawBytes(length);
return Buffer.prototype.slice.call(this.buffer, offset, offset + length);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a comment explaining that the contorted .call is because this.buffer can be either a Buffer or a plain Uint8Array? It's not immediately obvious from looking at just the code.

static void ReadUint32(const FunctionCallbackInfo<Value>& args);
static void ReadUint64(const FunctionCallbackInfo<Value>& args);
static void ReadDouble(const FunctionCallbackInfo<Value>& args);
static void ReadRawBytes(const FunctionCallbackInfo<Value>& args);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider giving all methods inline linkage if they're not used outside this file.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider giving all methods inline linkage if they're not used outside this file.

These functions are passed to V8, so what at least gcc does is to expose them as weak symbols instead of regular symbols.

I can wrap everything in here in an anonymous namespace if you want, if it’s about hiding them – is there a reason we don’t do that in our other source files?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My thinking was that e.g. SerializerContext::New() is a generic enough name that it might cause symbol clashes when another file introduces the same class/method (so yes, hiding.)

is there a reason we don’t do that in our other source files?

No reason except inertia. I give most new code I write inline linkage.

(Also, I like inline better than namespace { ... } because the former doesn't require me to scroll up to figure out the namespace we're in. Maybe I should use a fancier editor.)


void SerializerContext::ThrowDataCloneError(Local<String> message) {
Local<Value> args[1] = { message };
Local<Value> getDataCloneError =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

get_data_clone_error? Same suggestion applies to a couple of places further down.

SerializerContext::SerializerContext(Environment* env,
Local<Object> wrap,
Local<Value> throwDataCloneError,
Local<Value> writeHostObject)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using camelCase for params is a minor stylistic issue although I kind of see why you're doing that here.

deserializer_(env->isolate(), data_, length_, this) {
object()->Set(env->context(), env->buffer_string(), buffer);
object()->Set(env->context(), env->read_host_object_string(), readHostObject);
deserializer_.SetSupportsLegacyWireFormat(true);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe explain in a comment that otherwise it always throws an "invalid or unsupported version" exception.


Local<Value> retValue = ret.ToLocalChecked();
if (!retValue->IsObject()) {
env()->ThrowTypeError("readHostObject must return an object");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not a blocker but I admit to not being fond of code that throws JS exceptions as a side effect.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not a blocker but I admit to not being fond of code that throws JS exceptions as a side effect.

What alternative would you prefer? I think I’d like this better than a CHECK or just not doing any type checking.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking of a Local<Value>* exception_out parameter but, on second thought, since the function->Call() can raise an exception anyway, that's pointless. Never mind.


Maybe<bool> ret = ctx->deserializer_.ReadHeader(ctx->env()->context());

if (ret.IsJust()) args.GetReturnValue().Set(ret.FromJust());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consistency: you just .FromMaybe(false) without an if elsewhere. I think I like this approach better, though; it's Obviously Correct, even if it's a little more verbose.


ser->SetClassName(FIXED_ONE_BYTE_STRING(env->isolate(), "Serializer"));
target->Set(FIXED_ONE_BYTE_STRING(env->isolate(), "Serializer"),
ser->GetFunction());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you use the overload that takes a v8::Context?


des->SetClassName(FIXED_ONE_BYTE_STRING(env->isolate(), "Deserializer"));
target->Set(FIXED_ONE_BYTE_STRING(env->isolate(), "Deserializer"),
des->GetFunction());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto. There are a few other places where you use old-style (implicit-style?) methods, such as the no-arg overloads of NumberValue() and Uint32Value().

@addaleax addaleax force-pushed the v8-serdes branch 2 times, most recently from 293c67f to 68fceb8 Compare February 4, 2017 08:38
@addaleax
Copy link
Member Author

addaleax commented Feb 4, 2017

@bnoordhuis I should have addressed most of your comments, PTAL

Copy link
Member

@bnoordhuis bnoordhuis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with some final comments.

static void ReadUint32(const FunctionCallbackInfo<Value>& args);
static void ReadUint64(const FunctionCallbackInfo<Value>& args);
static void ReadDouble(const FunctionCallbackInfo<Value>& args);
static void ReadRawBytes(const FunctionCallbackInfo<Value>& args);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My thinking was that e.g. SerializerContext::New() is a generic enough name that it might cause symbol clashes when another file introduces the same class/method (so yes, hiding.)

is there a reason we don’t do that in our other source files?

No reason except inertia. I give most new code I write inline linkage.

(Also, I like inline better than namespace { ... } because the former doesn't require me to scroll up to figure out the namespace we're in. Maybe I should use a fancier editor.)

MaybeLocal<Value> error =
get_data_clone_error.As<Function>()->Call(env()->context(),
object(),
1,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prefer arraysize(args) if you're passing an array.

ASSIGN_OR_RETURN_UNWRAP(&ctx, args.Holder());
Maybe<bool> ret =
ctx->serializer_.WriteValue(ctx->env()->context(),
args[0]);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fits on one or two lines, doesn't it?

DeserializerContext::DeserializerContext(Environment* env,
Local<Object> wrap,
Local<Value> buffer,
Local<Value> readHostObject)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

camelCase leftOver?

}

MaybeLocal<Object> DeserializerContext::ReadHostObject(Isolate* isolate) {
Local<Value> readHostObject =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Likewise.


Local<Value> retValue = ret.ToLocalChecked();
if (!retValue->IsObject()) {
env()->ThrowTypeError("readHostObject must return an object");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking of a Local<Value>* exception_out parameter but, on second thought, since the function->Call() can raise an exception anyway, that's pointless. Never mind.

if (ret.IsEmpty())
return MaybeLocal<Object>();

Local<Value> retValue = ret.ToLocalChecked();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return_value or rval if you think that's too wordy.

DeserializerContext* ctx;
ASSIGN_OR_RETURN_UNWRAP(&ctx, args.Holder());

Maybe<double> length_arg = args[0]->NumberValue(ctx->env()->context());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IntegerValue()? It returns an int64_t.

@sam-github sam-github dismissed their stale review February 14, 2017 22:03

can't figure out what changes I requested

@addaleax
Copy link
Member Author

@bnoordhuis Updated, but I’d like to wait a bit and rebase this on the V8 5.7 API + wait for https://bugs.chromium.org/p/v8/issues/detail?id=5926 to be resolved (unless you think that that’s silly).

targos pushed a commit to targos/node that referenced this pull request Mar 25, 2017
Refs: nodejs#11048

Below is the list of commits:

deps: cherry-pick 78c0be52d from V8 upstream

Original commit message:

  ValueSerializer: Promote scheduled exceptions from wasm::ErrorThrower.

  wasm::ErrorThrower doesn't actually throw exceptions, it just schedules them.
  As a result, this exception isn't handled properly by code which expects
  ValueDeserializer to actually throw. For instance, the unit tests use a
  TryCatch to catch and handle expected exceptions in unit tests.

  Before this patch, I see local unit test failures because a wasm decode test
  schedules one, but it isn't caught (and instead causes Context::New to fail
  at the beginning of the next test).

  BUG=685713

  Review-Url: https://codereview.chromium.org/2659483004
  Cr-Commit-Position: refs/heads/master@{nodejs#42718}

deps: cherry-pick 966355585 from V8 upstream

Original commit message:

  [d8] Use ValueSerializer for postMessage (instead of ad-hoc serializer)

  Review-Url: https://codereview.chromium.org/2643723010
  Cr-Commit-Position: refs/heads/master@{nodejs#42749}

deps: cherry-pick bf511b426 from V8 upstream

Original commit message:

  ValueSerializer: Support efficiently reading and writing one-byte strings.

  memcpy is faster than UTF-8 encoding/decoding. This yields 10-20% wins on
  serializing and deserializing long ASCII strings, according to
  blink_perf.bindings -- and these are already in a fast path where the entire
  string is known to be ASCII (but this has to be checked). The win may be
  larger for strings in Latin-1 but not ASCII (though I suspect this is an
  uncommon case).

  A change is also made to make ValueSerializerTest.EncodeTwoByteStringUsesPadding
  survive wire format version number changes.

  This is the first of a series of wire format changes from the previous Blink
  format. The deserializer continues to be able to read the old format, but
  Chromium M56 will no longer be able to read the messages written by this, in M58.

  BUG=chromium:686159

  Review-Url: https://codereview.chromium.org/2658793004
  Cr-Commit-Position: refs/heads/master@{nodejs#42753}

deps: cherry-pick 6f1639ed1 from V8 upstream

Original commit message:

  ValueSerializer: Distinguish between 'undefined' and an absent property.

  Dealing with this case requires a wire format change. It is possible that an
  element can be absent even in an array where the dense format was chosen
  (because the array initially had no holes), if the elements are modified while
  they are being serialized. In this case, a new tag for the "hole" is emitted.

  The logic to treat undefined in dense arrays as an absent property is restricted
  to versions of the wire format that this tag did not exist.

  BUG=chromium:686159,chromium:665820

  Review-Url: https://codereview.chromium.org/2660093002
  Cr-Original-Commit-Position: refs/heads/master@{nodejs#42784}
  Committed: https://chromium.googlesource.com/v8/v8/+/dc85f4c8338c1c824af4f7ee3274dc9f95d14e49
  Review-Url: https://codereview.chromium.org/2660093002
  Cr-Commit-Position: refs/heads/master@{nodejs#42800}

deps: cherry-pick c3856de37 from V8 upstream

Original commit message:

  ValueSerializer: Check for zero length before casting to FixedDoubleArray.

  Even though the elements kind is FAST_DOUBLE_ELEMENTS, if length is zero
  the isolate's empty_fixed_array is used. It's illegal to cast this to
  FixedDoubleArray, so we avoid the cast.

  BUG=chromium:686479

  Review-Url: https://codereview.chromium.org/2665313003
  Cr-Commit-Position: refs/heads/master@{nodejs#42867}

deps: cherry-pick 591cc0b4c from V8 upstream

Original commit message:

  ValueSerializer: Share string encoding code with String and RegExp objects.

  This avoids the need to pull in the UTF-8 encoding code from the public API,
  and allows it to take advantage of any supported way that i::String can be
  encoded (one- or two-byte).

  Backward compatibility is maintained, but this is the behavior beginning
  with this version.

  BUG=chromium:686159

  Review-Url: https://codereview.chromium.org/2665653004
  Cr-Commit-Position: refs/heads/master@{nodejs#42872}

deps: cherry-pick 79837f5f6 from V8 upstream

Original commit message:

  Improve ValueSerializer perf regression after 96635558

  BUG=687196
  R=jbroman@chromium.org

  Review-Url: https://codereview.chromium.org/2674613002
  Cr-Commit-Position: refs/heads/master@{nodejs#42938}

deps: cherry-pick 8990399dc from V8 upstream

Original commit message:

  ValueDeserializer: Only allow valid keys when deserializing object properties.

  The serializer won't ever write a more complex object. Not validating this
  allows other things to be used as keys, and converted to string when the
  property set actually occurs. It turns out this gives an opportunity to trigger
  OOM by giving an object a key which is a very large sparse array (whose string
  representation is very large).

  This case is now rejected by the deserializer.

  BUG=chromium:686511

  Review-Url: https://codereview.chromium.org/2697023002
  Cr-Commit-Position: refs/heads/master@{nodejs#43249}

deps: cherry-pick 68960eeb7 from V8 upstream

Original commit message:

  ValueDeserializer: Make sure that an exception is the legacy path.

  The entry points to the deserializer are responsible for ensuring that an
  exception is pending by the time they return. Some failures throw exceptions
  themselves, while others (like errors in the format) are exceptions caused by
  the deserializer, not coming from the runtime.

  Like the non-legacy path, a default deserialization exception should be thrown
  in such cases.

  BUG=chromium:693411

  Review-Url: https://codereview.chromium.org/2712713002
  Cr-Commit-Position: refs/heads/master@{nodejs#43390}

deps: cherry-pick 3b15d950e from V8 upstream

Original commit message:
  ValueSerializer: Add SetTreatArrayBufferViewsAsHostObjects() flag

  Add `ValueSerializer::SetTreatArrayBufferViewsAsHostObjects()` which
  instructs the `ValueSerializer` to treat ArrayBufferView objects as
  host objects.

  BUG=v8:5926

  Review-Url: https://codereview.chromium.org/2696133007
  Cr-Commit-Position: refs/heads/master@{nodejs#43281}

deps: cherry-pick 654351997 from V8 upstream

Original commit message:

  ValueSerializer: Add an explicit tag for host objects.

  This makes it no longer necessary to ensure that V8 and Blink have non-colliding
  tags, which makes it easier for them to evolve independently, and also makes
  the wire format more suitable for other V8 embedders, who would not
  necessarily be surveyed before V8 introduced a new tag that might collide
  with theirs.

  BUG=chromium:686159

  Review-Url: https://codereview.chromium.org/2709023003
  Cr-Commit-Position: refs/heads/master@{nodejs#43466}

PR-URL: nodejs#11752
Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl>
Reviewed-By: Franziska Hinkelmann <franziska.hinkelmann@gmail.com>
@addaleax
Copy link
Member Author

Rebased now that V8 5.7 has landed.

CI: https://ci.nodejs.org/job/node-test-commit/8687/

@bnoordhuis Mind taking another look?

@targos targos self-assigned this Mar 25, 2017
@addaleax addaleax removed the blocked PRs that are blocked by other issues or PRs. label Mar 25, 2017
Copy link
Member

@bnoordhuis bnoordhuis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still LGTM at a quick glance.

@@ -80,3 +87,110 @@ exports.getHeapSpaceStatistics = function() {

return heapSpaceStatistics;
};

/* V8 serialization API */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tiniest of style nits but is there a reason for mixing C and C++-style comments?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bnoordhuis I would say C-style comments feel a bit more heading-y than C++-style comments? I never consciously noticed but I think I use // for text that only refers to the next one or two statements, whereas /* … */ refers to a longer section of code. That also seems to match how we use eslint-disable comments in our codebase.

If you feel strongly about it, I have no problem changing the format in either way. :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Never looked at it that way. I suppose it's fine, lib/ is a mixture of both; // is more prevalent but that's probably also because of the copyright header.

Expose the new serialization API that was added in V8 5.5 to userland.
The JS API is virtually a direct copy of what V8 provides on the
C++ level.

This is useful Node as a possible replacement for some internals
that currently use JSON, like IPC, but is likely to be useful to
general userland code as well.
@addaleax
Copy link
Member Author

@targos Does you assigning this to yourself mean that I should wait for a review from you before merging, or that you would like to be the person who merges this, or something else?

@targos
Copy link
Member

targos commented Mar 27, 2017

@addaleax I'd like to review this, yes. Probably later today.


This throws an error if `value` cannot be serialized.

#### serializer.releaseBuffer()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we want to keep close to the V8 API, this should be called release(). Did you keep it like that because a Buffer is returned?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@targos I would assume V8 calling it Release is because ReleaseBuffer was already taken by the legacy method. But yeah, it’s nice that releaseBuffer() tells you the (otherwise not obvious) return type.

Copy link
Member

@targos targos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doc, JS and tests LGTM

lib/v8.js Outdated
i = bufferConstructorIndex;
} else {
i = arrayBufferViewTypeToIndex.get(objectToString(abView));
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this throw an error when i is undefined (unknown host object)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@targos Yes thanks for catching! Updated + test added

@targos targos removed their assignment Mar 27, 2017
@addaleax
Copy link
Member Author

Landed in 9b2dd47...1fde98b

@addaleax addaleax closed this Mar 29, 2017
@addaleax addaleax deleted the v8-serdes branch March 29, 2017 03:19
addaleax added a commit that referenced this pull request Mar 29, 2017
PR-URL: #11048
Reviewed-By: Michaël Zasso <targos@protonmail.com>
Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl>
addaleax added a commit that referenced this pull request Mar 29, 2017
Expose the new serialization API that was added in V8 5.5 to userland.
The JS API is virtually a direct copy of what V8 provides on the
C++ level.

This is useful Node as a possible replacement for some internals
that currently use JSON, like IPC, but is likely to be useful to
general userland code as well.

PR-URL: #11048
Reviewed-By: Michaël Zasso <targos@protonmail.com>
Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl>
addaleax pushed a commit that referenced this pull request Mar 31, 2017
Missed while reviewing 1fde98b ("v8: expose new V8 serialization API.")

PR-URL: #12118
Refs: #11048
Reviewed-By: Daniel Bevenius <daniel.bevenius@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Michaël Zasso <targos@protonmail.com>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
@jasnell jasnell mentioned this pull request Apr 4, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
build Issues and PRs related to build files or the CI. c++ Issues and PRs that require attention from people who are familiar with C++. semver-minor PRs that contain new features and should be released in the next minor version. v8 engine Issues and PRs related to the V8 dependency.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants