From 86722bdc066a49d2a282ac864c847b8d0cd925ab Mon Sep 17 00:00:00 2001 From: Antoine du Hamel Date: Tue, 11 May 2021 16:37:50 +0200 Subject: [PATCH] doc: add primordials guidelines PR-URL: https://github.com/nodejs/node/pull/38635 Reviewed-By: Akhil Marsonya Reviewed-By: Darshan Sen Reviewed-By: Rich Trott Reviewed-By: James M Snell Reviewed-By: Robert Nagy Reviewed-By: Matteo Collina Reviewed-By: Gireesh Punathil Reviewed-By: Geoffrey Booth --- doc/contributing/primordials.md | 595 ++++++++++++++++++++++++++++++++ 1 file changed, 595 insertions(+) create mode 100644 doc/contributing/primordials.md diff --git a/doc/contributing/primordials.md b/doc/contributing/primordials.md new file mode 100644 index 00000000000..fe20fdd9094 --- /dev/null +++ b/doc/contributing/primordials.md @@ -0,0 +1,595 @@ +# Usage of primordials in core + +The file `lib/internal/per_context/primordials.js` subclasses and stores the JS +built-ins that come from the VM so that Node.js built-in modules do not need to +later look these up from the global proxy, which can be mutated by users. + +Usage of primordials should be preferred for any new code, but replacing current +code with primordials should be +[done with care](#primordials-with-known-performance-issues). It is highly +recommended to ping the relevant team when reviewing a pull request that touches +one of the subsystems they "own". + +## Accessing primordials + +The primordials are meant for internal use only, and are only accessible for +internal core modules. User code cannot use or rely on primordials. It is +usually fine to rely on ECMAScript built-ins and assume that it will behave as +specified. + +If you would like to access the `primordials` object to help you with Node.js +core development or for tinkering, you can expose it on the global scope using +this combination of CLI flags: + +```bash +node --expose-internals -r internal/test/binding +``` + +## Contents of primordials + +### Properties of the global object + +Objects and functions on the global object can be deleted or replaced. Using +them from primordials makes the code more reliable: + +```js +globalThis.Array === primordials.Array; // true + +globalThis.Array = function() { + return [1, 2, 3]; +}; +globalThis.Array === primordials.Array; // false + +primordials.Array(0); // [] +globalThis.Array(0); // [1,2,3] +``` + +### Prototype methods + +ECMAScript provides a group of methods available on built-in objects that are +used to interact with JavaScript objects. + +```js +const array = [1, 2, 3]; +array.push(4); // Here `push` refers to %Array.prototype.push%. +console.log(JSON.stringify(array)); // [1,2,3,4] + +// %Array.prototype%.push is modified in userland. +Array.prototype.push = function push(val) { + return this.unshift(val); +}; + +array.push(5); // Now `push` refers to the modified method. +console.log(JSON.stringify(array)); // [5,1,2,3,4] +``` + +Primordials wrap the original prototype functions with new functions that take +the `this` value as the first argument: + +```js +const { + ArrayPrototypePush, +} = primordials; + +const array = [1, 2, 3]; +ArrayPrototypePush(array, 4); +console.log(JSON.stringify(array)); // [1,2,3,4] + +Array.prototype.push = function push(val) { + return this.unshift(val); +}; + +ArrayPrototypePush(array, 5); +console.log(JSON.stringify(array)); // [1,2,3,4,5] +``` + +### Safe classes + +Safe classes are classes that provide the same API as their equivalent class, +but whose implementation aims to avoid any reliance on user-mutable code. +Safe classes should not be exposed to user-land; use unsafe equivalent when +dealing with objects that are accessible from user-land. + +### Variadic functions + +There are some built-in functions that accept a variable number of arguments +(e.g.: `Math.max`, `%Array.prototype.push%`). It is sometimes useful to provide +the list of arguments as an array. You can use primordial function with the +suffix `Apply` (e.g.: `MathMaxApply`, `ArrayPrototypePushApply`) to do that. + +## Primordials with known performance issues + +One of the reasons why the current Node.js API is not completely tamper-proof is +performance: sometimes the use of primordials can cause performance regressions +with V8, which when in a hot code path, could significantly decrease the +performance of code in Node.js. + +* Methods that mutate the internal state of arrays: + * `ArrayPrototypePush` + * `ArrayPrototypePop` + * `ArrayPrototypeShift` + * `ArrayPrototypeUnshift` +* Methods of the function prototype: + * `FunctionPrototypeBind` + * `FunctionPrototypeCall`: creates performance issues when used to invoke + super constructors. + * `FunctionPrototype`: use `() => {}` instead when referencing a no-op + function. +* `SafeArrayIterator` +* `SafeStringIterator` +* `SafePromiseAll` +* `SafePromiseAllSettled` +* `SafePromiseAny` +* `SafePromiseRace` +* `SafePromisePrototypeFinally`: use `try {} finally {}` block instead. + +In general, when sending or reviewing a PR that makes changes in a hot code +path, use extra caution and run extensive benchmarks. + +## Implicit use of user-mutable methods + +### Unsafe array iteration + +There are many usual practices in JavaScript that rely on iteration. It's useful +to be aware of them when dealing with arrays (or `TypedArray`s) in core as array +iteration typically calls several user-mutable methods. This sections lists the +most common patterns in which ECMAScript code relies non-explicitly on array +iteration and how to avoid it. + +
+ +Avoid for-of loops on arrays + +```js +for (const item of array) { + console.log(item); +} +``` + +This code is internally expanded into something that looks like: + +```js +{ + // 1. Lookup @@iterator property on `array` (user-mutable if user-provided). + // 2. Lookup @@iterator property on %Array.prototype% (user-mutable). + // 3. Call that function. + const iterator = array[Symbol.iterator](); + // 1. Lookup `next` property on `iterator` (doesn't exist). + // 2. Lookup `next` property on %ArrayIteratorPrototype% (user-mutable). + // 3. Call that function. + let { done, value: item } = iterator.next(); + while (!done) { + console.log(item); + // Repeat. + ({ done, value: item } = iterator.next()); + } +} +``` + +Instead of utilizing iterators, you can use the more traditional but still very +performant `for` loop: + +```js +for (let i = 0; i < array.length; i++) { + console.log(array[i]); +} +``` + +The following code snippet illustrates how user-land code could impact the +behavior of internal modules: + +```js +// User-land +Array.prototype[Symbol.iterator] = () => ({ + next: () => ({ done: true }), +}); + +// Core +let forOfLoopBlockExecuted = false; +let forLoopBlockExecuted = false; +const array = [1, 2, 3]; +for (const item of array) { + forOfLoopBlockExecuted = true; +} +for (let i = 0; i < array.length; i++) { + forLoopBlockExecuted = true; +} +console.log(forOfLoopBlockExecuted); // false +console.log(forLoopBlockExecuted); // true +``` + +This only applies if you are working with a genuine array (or array-like +object). If you are instead expecting an iterator, a for-of loop may be a better +choice. + +
+ +
+ +Avoid array destructuring assignment on arrays + +```js +const [first, second] = array; +``` + +This is roughly equivalent to: + +```js +// 1. Lookup @@iterator property on `array` (user-mutable if user-provided). +// 2. Lookup @@iterator property on %Array.prototype% (user-mutable). +// 3. Call that function. +const iterator = array[Symbol.iterator](); +// 1. Lookup `next` property on `iterator` (doesn't exist). +// 2. Lookup `next` property on %ArrayIteratorPrototype% (user-mutable). +// 3. Call that function. +const first = iterator.next().value; +// Repeat. +const second = iterator.next().value; +``` + +Instead you can use object destructuring: + +```js +const { 0: first, 1: second } = array; +``` + +or + +```js +const first = array[0]; +const second = array[1]; +``` + +This only applies if you are working with a genuine array (or array-like +object). If you are instead expecting an iterator, array destructuring is the +best choice. + +
+ +
+ +Avoid spread operator on arrays + +```js +// 1. Lookup @@iterator property on `array` (user-mutable if user-provided). +// 2. Lookup @@iterator property on %Array.prototype% (user-mutable). +// 3. Lookup `next` property on %ArrayIteratorPrototype% (user-mutable). +const arrayCopy = [...array]; +func(...array); +``` + +Instead you can use other ECMAScript features to achieve the same result: + +```js +const arrayCopy = ArrayPrototypeSlice(array); +ReflectApply(func, null, array); +``` + +
+ +
+ +%Object.fromEntries% iterate over an array + +```js +{ + // Unsafe code example: + // 1. Lookup @@iterator property on `array` (user-mutable if user-provided). + // 2. Lookup @@iterator property on %Array.prototype% (user-mutable). + // 3. Lookup `next` property on %ArrayIteratorPrototype% (user-mutable). + const obj = ObjectFromEntries(array); +} + +{ + // Safe example using `SafeArrayIterator`: + const obj = ObjectFromEntries(new SafeArrayIterator(array)); +} + +{ + // Safe example without using `SafeArrayIterator`: + const obj = {}; + for (let i = 0; i < array.length; i++) { + obj[array[i][0]] = array[i][1]; + } + // In a hot code path, this would be the preferred method. +} +``` + +
+ +
+ +%Promise.all%, + %Promise.allSettled%, + %Promise.any%, and + %Promise.race% iterate over an array + +```js +// 1. Lookup @@iterator property on `array` (user-mutable if user-provided). +// 2. Lookup @@iterator property on %Array.prototype% (user-mutable). +// 3. Lookup `next` property on %ArrayIteratorPrototype% (user-mutable). +PromiseAll(array); // unsafe + +PromiseAll(new SafeArrayIterator(array)); +SafePromiseAll(array); // safe +``` + +
+ +
+ +%Map%, %Set%, %WeakMap%, and + %WeakSet% constructors iterate over an array + +```js +// User-land +Array.prototype[Symbol.iterator] = () => ({ + next: () => ({ done: true }), +}); + +// Core + +// 1. Lookup @@iterator property on %Array.prototype% (user-mutable). +// 2. Lookup `next` property on %ArrayIteratorPrototype% (user-mutable). +const set = new SafeSet([1, 2, 3]); + +console.log(set.size); // 0 +``` + +```js +// User-land +Array.prototype[Symbol.iterator] = () => ({ + next: () => ({ done: true }), +}); + +// Core +const set = new SafeSet(); +set.add(1).add(2).add(3); +console.log(set.size); // 3 +``` + +
+ +### Promise objects + +
+ +%Promise.prototype.finally% looks up then + property of the Promise instance + +```js +// User-land +Promise.prototype.then = function then(a, b) { + return Promise.resolve(); +}; + +// Core +let finallyBlockExecuted = false; +PromisePrototypeFinally(somePromiseThatEventuallySettles, + () => { finallyBlockExecuted = true; }); +process.on('exit', () => console.log(finallyBlockExecuted)); // false +``` + +```js +// User-land +Promise.prototype.then = function then(a, b) { + return Promise.resolve(); +}; + +// Core +let finallyBlockExecuted = false; +(async () => { + try { + return await somePromiseThatEventuallySettles; + } finally { + finallyBlockExecuted = true; + } +})(); +process.on('exit', () => console.log(finallyBlockExecuted)); // true +``` + +
+ +
+ +%Promise.all%, + %Promise.allSettled%, + %Promise.any%, and + %Promise.race% look up then + property of the Promise instances + +You can use safe alternatives from primordials that differ slightly from the +original methods: +* It expects an array (or array-like object) instead of an iterable. +* It wraps each promise in `SafePromise` objects and wraps the result in a new + `Promise` instance – which may come with a performance penalty. +* Because it doesn't look up `then` property, it may not be the right tool to + handle user-provided promises (which may be instances of a subclass of + `Promise`). + +```js +// User-land +Promise.prototype.then = function then(a, b) { + return Promise.resolve(); +}; + +// Core +let thenBlockExecuted = false; +PromisePrototypeThen( + PromiseAll(new SafeArrayIterator([PromiseResolve()])), + () => { thenBlockExecuted = true; } +); +process.on('exit', () => console.log(thenBlockExecuted)); // false +``` + +```js +// User-land +Promise.prototype.then = function then(a, b) { + return Promise.resolve(); +}; + +// Core +let thenBlockExecuted = false; +PromisePrototypeThen( + SafePromiseAll([PromiseResolve()]), + () => { thenBlockExecuted = true; } +); +process.on('exit', () => console.log(thenBlockExecuted)); // true +``` + +
+ +### (Async) Generator functions + +Generators and async generators returned by generator functions and async +generator functions are relying on user-mutable methods; their use in core +should be avoided. + +
+ +%GeneratorFunction.prototype.prototype%.next is + user-mutable + +```js +// User-land +Object.getPrototypeOf(function* () {}).prototype.next = function next() { + return { done: true }; +}; + +// Core +function* someGenerator() { + yield 1; + yield 2; + yield 3; +} +let loopCodeExecuted = false; +for (const nb of someGenerator()) { + loopCodeExecuted = true; +} +console.log(loopCodeExecuted); // false +``` + +
+ +
+ +%AsyncGeneratorFunction.prototype.prototype%.next is + user-mutable + +```js +// User-land +Object.getPrototypeOf(async function* () {}).prototype.next = function next() { + return new Promise(() => {}); +}; + +// Core +async function* someGenerator() { + yield 1; + yield 2; + yield 3; +} +let finallyBlockExecuted = false; +async () => { + try { + for await (const nb of someGenerator()) { + // some code; + } + } finally { + finallyBlockExecuted = true; + } +}; +process.on('exit', () => console.log(finallyBlockExecuted)); // false +``` + +
+ +### Text processing + +#### Unsafe string methods + +| The string method | looks up the property | +| ----------------------------- | --------------------- | +| `String.prototype.match` | `Symbol.match` | +| `String.prototype.matchAll` | `Symbol.matchAll` | +| `String.prototype.replace` | `Symbol.replace` | +| `String.prototype.replaceAll` | `Symbol.replace` | +| `String.prototype.search` | `Symbol.search` | +| `String.prototype.split` | `Symbol.split` | + +```js +// User-land +RegExp.prototype[Symbol.replace] = () => 'foo'; +String.prototype[Symbol.replace] = () => 'baz'; + +// Core +console.log(StringPrototypeReplace('ber', /e/, 'a')); // 'foo' +console.log(StringPrototypeReplace('ber', 'e', 'a')); // 'baz' +console.log(RegExpPrototypeSymbolReplace(/e/, 'ber', 'a')); // 'bar' +``` + +#### Unsafe string iteration + +As with arrays, iterating over strings calls several user-mutable methods. Avoid +iterating over strings when possible, or use `SafeStringIterator`. + +#### Unsafe `RegExp` methods + +Functions that lookup the `exec` property on the prototype chain: + +* `RegExp.prototype[Symbol.match]` +* `RegExp.prototype[Symbol.matchAll]` +* `RegExp.prototype[Symbol.replace]` +* `RegExp.prototype[Symbol.search]` +* `RegExp.prototype[Symbol.split]` +* `RegExp.prototype.test` + +```js +// User-land +RegExp.prototype.exec = () => null; + +// Core +console.log(RegExpPrototypeTest(/o/, 'foo')); // false +console.log(RegExpPrototypeExec(/o/, 'foo') !== null); // true +``` + +#### Don't trust `RegExp` flags + +RegExp flags are not own properties of the regex instances, which means flags +can be reset from user-land. + +
+ +List of RegExp methods that look up properties from + mutable getters + +| `RegExp` method | looks up the following flag-related properties | +| ------------------------------ | ------------------------------------------------------------------ | +| `get RegExp.prototype.flags` | `global`, `ignoreCase`, `multiline`, `dotAll`, `unicode`, `sticky` | +| `RegExp.prototype[@@match]` | `global`, `unicode` | +| `RegExp.prototype[@@matchAll]` | `flags` | +| `RegExp.prototype[@@replace]` | `global`, `unicode` | +| `RegExp.prototype[@@split]` | `flags` | +| `RegExp.prototype.toString` | `flags` | + +
+ +```js +// User-land +Object.defineProperty(RegExp.prototype, 'global', { value: false }); + +// Core +console.log(RegExpPrototypeSymbolReplace(/o/g, 'foo', 'a')); // 'fao' + +const regex = /o/g; +ObjectDefineProperties(regex, { + dotAll: { value: false }, + exec: { value: undefined }, + flags: { value: 'g' }, + global: { value: true }, + ignoreCase: { value: false }, + multiline: { value: false }, + unicode: { value: false }, + sticky: { value: false }, +}); +console.log(RegExpPrototypeSymbolReplace(regex, 'foo', 'a')); // 'faa' +```