
src: use STL containers instead of v8 values for static module data #24384

Closed

Conversation


@joyeecheung joyeecheung commented Nov 15, 2018

Instead of putting the source code and the cache in v8::Objects,
put them in per-process std::unordered_maps. This has the following benefits:

  • It's slightly lighter in weight than storing things on the
    v8 heap. It may also be slightly faster, since the previous
    v8::Object was already in dictionary mode - though the difference
    is very small given the limited number of native modules.
  • The source and code cache generation templates are now much simpler
    since they just initialize static arrays and manipulate STL
    constructs. They are also easier to debug from the C++ side,
    especially early in the bootstrap process when no inspector
    can be attached.
  • The static native module data can be accessed independently of any
    Environment or Isolate, and it is easy to look up from the
    C++ side.
  • It's now impossible to mutate the source code used to compile
    native modules from JS land, since it's completely separate
    from the v8 heap. We can still get the constant strings from
    process.binding('natives') but that's all.

A few drive-by fixes:

  • Remove DecorateErrorStack in LookupAndCompile - we don't need to
    capture and decorate the exception when we hit errors during
    native module compilation, as those should be syntax errors that
    v8 already decorates well. We use CompileFunctionInContext, so
    there is no need to worry about wrappers either.
  • The code cache can be rejected when node is started with v8 flags.
    Instead of aborting in that case, simply keep a record in the
    native_module_without_cache set.
  • Refactor js2c.py a bit: reduce code duplication and inline Render()
    to make the one-byte/two-byte special handling easier to read.
Checklist
  • make -j4 test (UNIX), or vcbuild test (Windows) passes
  • tests and/or benchmarks are included
  • documentation is changed or added
  • commit message follows commit guidelines

@nodejs-github-bot nodejs-github-bot added c++ Issues and PRs that require attention from people who are familiar with C++. lib / src Issues and PRs related to general changes in the lib or src directory. labels Nov 15, 2018
@joyeecheung (Member Author)

Also @hashseed: as I understand it, there is no way to use the code cache if node is started with v8 flags, since that leads to a mismatch of the flag hash in the serialized code cache data. Is that correct?

return out;
}

Local<Set> ToJsSet(Local<Context> context, const std::set<std::string>& in) {
Member:

Hm, maybe this one could better be implemented as a ToV8Value() overload in util.h? It fits in with the others pretty well, I think (the downside being that we always use UTF8 there unconditionally…)

@joyeecheung (Member Author) Nov 16, 2018:

@addaleax The current implementation is only capable of converting std::set<const std::string> to an array... not exactly what we want here, but I can leave a TODO


namespace node {
namespace native_module {

// The native (C++) side of the native module compilation.
typedef typename std::map<std::string, UnionBytes> NativeModuleRecordMap;
typedef typename std::map<std::string, std::string> NativeModuleHashMap;
Member:

btw, did you consider using std::unordered_map and std::unordered_set over std::map/std::set? I don’t know if we need to worry a lot about the performance tradeoffs here, but I personally tend to use the unordered versions when I can

@joyeecheung (Member Author):

I don't think there is a reason not to use the unordered version... good idea, I'll change that

@refack (Contributor) left a review:

Wow! 😮 Looks great.

refack commented Nov 16, 2018

@joyeecheung could you gist the new node_javascript.cc and node_code_cache.cc? I'm curious how they turned out.

joyeecheung commented Nov 16, 2018

@refack If I remove the static char data, they look like this: https://gist.github.com/joyeecheung/9b41e0b9309d9492270ba23a92e2733d


refack commented Nov 16, 2018

@refack If I remove the static char data, they look like this: gist.github.com/joyeecheung/9b41e0b9309d9492270ba23a92e2733d

IMO they look much better than before 😄

@joyeecheung (Member Author):

So... interestingly enough, this uncovers a circular dependency involving internal/streams/lazy_transform.js on the Pis: the order of keys in internalBinding('native_module').source now depends on the implementation of unordered_map (previously it depended on the order in which js2c inserts the strings into the v8::Object), and on the Pis lazy_transform.js can somehow be required before crypto.js, which triggers the dependency issue.

I opened #24396 to fix this.

@joyeecheung joyeecheung added the blocked PRs that are blocked by other issues or PRs. label Nov 16, 2018
env->set_native_modules_code_cache_hash(native_modules_code_cache_hash);
Local<Object> NativeModuleLoader::GetSourceObject(
Local<Context> context) const {
return MapToObject(context, source_);
Member:

Is the return value cached / should it be cached? Right now it seems like every time internalBinding('native_module').source is accessed a new object is created.

@joyeecheung (Member Author) Nov 16, 2018:

It's not cached, but that doesn't really matter, since this is only used by tools/generate_code_cache.js; the native module loader in the binary only calls into a C++ function that can access the std::unordered_map directly.


joyeecheung commented Nov 16, 2018

So... interesting: std::unordered_map is slower than std::map in this case.

https://ci.nodejs.org/view/Node.js%20benchmark/job/benchmark-node-micro-benchmarks/275

                                                                                   confidence improvement accuracy (*)   (**)  (***)
 misc/startup.js mode='process' script='benchmark/fixtures/require-cachable' dur=1                -0.95 %       ±1.05% ±1.40% ±1.82%
 misc/startup.js mode='process' script='test/fixtures/semicolon' dur=1                      *      1.46 %       ±1.23% ±1.64% ±2.14%
 misc/startup.js mode='worker' script='benchmark/fixtures/require-cachable' dur=1                 -0.35 %       ±0.85% ±1.13% ±1.47%
 misc/startup.js mode='worker' script='test/fixtures/semicolon' dur=1                             -0.09 %       ±1.06% ±1.41% ±1.83%

https://ci.nodejs.org/view/Node.js%20benchmark/job/benchmark-node-micro-benchmarks/276

                                                                                   confidence improvement accuracy (*)   (**)  (***)
 misc/startup.js mode='process' script='benchmark/fixtures/require-cachable' dur=1         **     -1.58 %       ±1.18% ±1.57% ±2.04%
 misc/startup.js mode='process' script='test/fixtures/semicolon' dur=1                             0.13 %       ±1.37% ±1.82% ±2.37%
 misc/startup.js mode='worker' script='benchmark/fixtures/require-cachable' dur=1         ***     -2.43 %       ±0.86% ±1.14% ±1.49%
 misc/startup.js mode='worker' script='test/fixtures/semicolon' dur=1                              0.12 %       ±0.87% ±1.16% ±1.51%

I tried to Google around, and it turns out that's not too surprising: given that we have < 200 items, and we only insert them and never touch them again once the map is initialized, hashing the strings can be slower than searching the tree. See e.g. https://stackoverflow.com/questions/4846798/why-would-map-be-much-faster-than-unordered-map (this almost tempted me to try implementing a trie.. nah)


Trott commented Nov 18, 2018

Removing blocked label because 413fcad landed.

@Trott Trott removed the blocked PRs that are blocked by other issues or PRs. label Nov 18, 2018

refack commented Nov 18, 2018

CI (rebased on master): https://ci.nodejs.org/job/node-test-pull-request/18724/

@TimothyGu (Member) left a review:

LGTM other than some fairly superficial comments.

const cachedData = getCodeCache(key);
if (!cachedData.length) {
if (!(cachedData instanceof Uint8Array)) {
Member:

Use isUint8Array from util.types?

console.error(`Failed to generate code cache for '${key}'`);
process.exit(1);
}

const length = cachedData.length;
const length = cachedData.byteLength;
Member:

This change should be unneeded, right?

@joyeecheung (Member Author):

@TimothyGu Yeah, just so that it's clearer, since this length is only used to compute a string displayed in a human-readable format (KB/MB, etc.).

@joyeecheung (Member Author) Nov 19, 2018:

I'll switch this to const size = cachedData.byteLength - I think that makes it a bit more self-explanatory and less ambiguous.

src/node.cc

// TODO(joyeecheung): use NativeModule::Compile
// TODO(joyeecheung): use NativeModuleLoader::Compile
// We duplicate the string literals here since once we refactor the boostrap
Member:

nit: bootstrap

env->SetMethod(target, "compileFunction", NativeModule::CompileFunction);
env->SetMethod(target, "compileCodeCache", NativeModule::CompileCodeCache);
CHECK(target
->SetAccessor(env->context(),
Member:

I’d make this a method, as it signals that getting the value is a relatively expensive operation (and that the returned object is different every time this is called).

Ditto for GetCacheUsage().

@joyeecheung (Member Author):

Fixed a few more nits & rebased & squashed the fixup into one. CI: https://ci.nodejs.org/job/node-test-pull-request/18758/

@joyeecheung (Member Author):

Landed in 7778c03, thanks!

joyeecheung added a commit that referenced this pull request Nov 19, 2018
Instead of putting the source code and the cache in v8::Objects,
put them in per-process std::maps. This has the following benefits:

- It's slightly lighter in weight than storing things on the
  v8 heap. It may also be slightly faster, since the previous
  v8::Object was already in dictionary mode - though the difference
  is very small given the limited number of native modules.
- The source and code cache generation templates are now much simpler
  since they just initialize static arrays and manipulate STL
  constructs.
- The static native module data can be accessed independently of any
  Environment or Isolate, and it is easy to look up from the
  C++ side.
- It's now impossible to mutate the source code used to compile
  native modules from JS land, since it's completely separate
  from the v8 heap. We can still get the constant strings from
  process.binding('natives') but that's all.

A few drive-by fixes:

- Remove DecorateErrorStack in LookupAndCompile - We don't need to
  capture the exception to decorate when we encounter
  errors during native module compilation, as those errors should be
  syntax errors and v8 is able to decorate them well. We use
  CompileFunctionInContext so there is no need to worry about
  wrappers either.
- The code cache could be rejected when node is started with v8 flags.
  Instead of aborting in that case, simply keep a record in the
  native_module_without_cache set.
- Refactor js2c.py a bit, reduce code duplication and inline Render()
  to make the one-byte/two-byte special treatment easier to read.

PR-URL: #24384
Fixes: https://github.com/Remove
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Franziska Hinkelmann <franziska.hinkelmann@gmail.com>
Reviewed-By: Tiancheng "Timothy" Gu <timothygu99@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>

joyeecheung commented Nov 19, 2018

This should be backported together with #24382 and #24382 (for 10.x at least; I am not sure if we need to backport this to 8.x, mostly because I don't know whether 8.x patches are backported on an as-needed basis at this point, or whether we are still trying to backport everything possible onto 8.x).

targos pushed a commit that referenced this pull request Nov 19, 2018
rvagg pushed a commit that referenced this pull request Nov 28, 2018
@BridgeAR BridgeAR mentioned this pull request Dec 5, 2018
deepak1556 added a commit to electron/electron that referenced this pull request Dec 19, 2018
deepak1556 added a commit to electron/electron that referenced this pull request Dec 19, 2018
deepak1556 added a commit to electron/electron that referenced this pull request Dec 19, 2018
deepak1556 added a commit to electron/electron that referenced this pull request Dec 20, 2018
deepak1556 added a commit to electron/electron that referenced this pull request Dec 24, 2018
deepak1556 added a commit to electron/electron that referenced this pull request Dec 24, 2018
deepak1556 added a commit to electron/electron that referenced this pull request Dec 25, 2018
deepak1556 added a commit to electron/electron that referenced this pull request Dec 26, 2018
deepak1556 added a commit to electron/electron that referenced this pull request Jan 3, 2019
nornagon pushed a commit to electron/electron that referenced this pull request Jan 3, 2019
deepak1556 added a commit to electron/electron that referenced this pull request Jan 4, 2019
deepak1556 added a commit to deepak1556/atom-shell that referenced this pull request Jan 8, 2019
deepak1556 added a commit to electron/electron that referenced this pull request Jan 10, 2019
nornagon pushed a commit to electron/electron that referenced this pull request Jan 11, 2019
nornagon pushed a commit to electron/electron that referenced this pull request Jan 11, 2019
refack pushed a commit to refack/node that referenced this pull request Jan 14, 2019
8 participants