diff --git a/0-module-and-module-source.md b/0-module-and-module-source.md new file mode 100644 index 0000000..f98d482 --- /dev/null +++ b/0-module-and-module-source.md @@ -0,0 +1,399 @@ +# First-class Module and ModuleSource + +## Synopsis + +Provide first-class `Module` and `ModuleSource` constructors and extend dynamic +import to operate on `Module` instances. + +A `ModuleSource` represents the result of compiling EcmaScript module source +text. +A `Module` instance represents the lifecycle of an EcmaScript module and allows +virtual import behavior. +Multiple `Module` instances can share a common `ModuleSource`. + +## Interfaces + +### ModuleSource + +```ts +interface ModuleSource { + constructor(source: string); +} +``` + +Semantics: A `ModuleSource` instance gives the holder no powers. +It represents a compiled EcmaScript module and does not capture any information +beyond what can be inferred from a module's source text. +Import specifiers in a module's source text cannot be interpreted without +further information. + +Note 1: `ModuleSource` does for modules what `eval` already does for scripts. +We expect content security policies to treat module sources similarly. +A `ModuleSource` instance constructed from text would not have an associated +origin. +A `ModuleSource` instance can be constructed from vetted text ([W3C Trusted +Types][trusted-types]) and host-defined import hooks may reveal module sources +that were vetted behind the scenes. + +Note 2: Multiple `Module` instances can be constructed from a single `ModuleSource`, +producing one exports namespace for each imported `Module` instance. + +Note 3: The internal record of a `ModuleSource` instance is immutable and +serializable. +This data can be shared without cost between realms of an agent or even agents +of an agent cluster. 
+ +### Module instances + +```ts +type ImportSpecifier = string; + +type ImportHook = (specifier: ImportSpecifier, importMeta: object) => + Promise<Module>; + +interface Module { + constructor( + source: ModuleSource, + options: { + importHook?: ImportHook, + importMeta?: object, + }, + ); + + readonly source: ModuleSource; +} +``` + +Semantics: A `Module` has a 1-1-1-1 relationship with a ***Module Environment +Record***, a ***Module Record*** and a ***module namespace exotic object***. + +The module has a lifecycle and fresh instances have not been linked, +initialized, or executed. + +Invoking dynamic import on a `Module` instance advances it and its transitive +dependencies to their end state. +Consistent with dynamic import for a stringly-named module, +dynamic import on a `Module` instance produces a promise for the corresponding +***Module Namespace Object***. + +Dynamic import induces calls to `importHook` for each unsatisfied dependency of +each module instance in separate events, before any dependency advances to the +link phase of its lifecycle. + +Dynamic import within the evaluation of a `Module` also invokes the +`importHook`. + +`Module` instances memoize the result of their `importHook` keyed on the given +Import Specifier. + +`Module` constructors, like `Function` constructors, are bound to a realm +and evaluate modules in their particular realm. + +## Examples + +### Import Kicker + +Any dynamic import function is suitable for initializing, linking, and +evaluating a module instance. +This necessarily implies advancing all of its transitive dependencies to their +terminal state or any one into a failed state. 
+ +```js +const source = new ModuleSource(``); +const instance = new Module(source, { + importHook, + importMeta: import.meta, +}); +const namespace = await import(instance); +``` + +### Module Idempotency + +Since the Module has a bound module namespace exotic object, importing the same +instance must yield the same result: + +```js +const source = new ModuleSource(``); +const instance = new Module(source, { + importHook, + importMeta: import.meta, +}); +const namespace1 = await import(instance); +const namespace2 = await import(instance); +namespace1 === namespace2; // true +``` + +### Reusing ModuleSource + +Module sources are backed by a shared immutable module source record +that can be instantiated multiple times, even locally. +Multiple `Module` instances can share a module source and produce +separate module namespaces. + +```js +const source = new ModuleSource(``); +const instance1 = new Module(source, { + importHook: importHook1, + importMeta: import.meta, +}); +const instance2 = new Module(source, { + importHook: importHook2, + importMeta: import.meta, +}); +instance1 === instance2; // false +const namespace1 = await import(instance1); +const namespace2 = await import(instance2); +namespace1 === namespace2; // false +``` + +### Intersection Semantics with Module Blocks + +Proposal: https://github.com/tc39/proposal-js-module-blocks + +In relation to module blocks, we can extend the proposal to accommodate both +the concept of a module block instance and that of a module block source: + +```js +const instance = module {}; +instance instanceof Module; +instance.source instanceof ModuleSource; +const namespace = await import(instance); +``` + +To avoid needing a throw-away module instance in order to get a module source, +we can extend the syntax: + +```js +const source = static module {}; +source instanceof ModuleSource; +const instance = new Module(source, { + importHook, + importMeta: import.meta, +}); +const namespace = await import(instance); +``` + +### 
Intersection Semantics with deferred execution + +The possibility to load the source, and create the instance with the default +`importHook` and the `import.meta` of the importer, that can be imported at any +given time, is sufficient: + +```js +import instance from 'module.js' deferred execution syntax; +instance instanceof Module; +instance.source instanceof ModuleSource; +const namespace = await import(instance); +``` + +If the goal is to also control the `importHook` and the `importMeta` of the +importer, then a new syntax can be provided to only get the `ModuleSource`: + +```ts +import source from 'module.js' static source syntax; +source instanceof ModuleSource; +const instance = new Module(source, { + importHook, + importMeta: import.meta, +}); +const namespace = await import(instance); +``` + +This is important, because it is analogous to block modules, but instead of +inline source, it is a source that must be fetched. + +### Intersection Semantics with import.meta.resolve() + +Proposal: https://github.com/whatwg/html/pull/5572 + +```ts +const importHook = async (specifier, importMeta) => { + const url = importMeta.resolve(specifier); + const response = await fetch(url); + const sourceText = await response.text(); + const source = new ModuleSource(sourceText); + return new Module(source, { + importHook, + importMeta: createCustomImportMeta(url), + }); +} + +const source = new ModuleSource(`export foo from './foo.js'`); +const instance = new Module(source, { + importHook, + importMeta: import.meta, +}); +const namespace = await import(instance); +``` + +In the example above, we re-use the `ImportHook` declaration for two instances, +the `source`, and the corresponding dependency for specifier `./foo.js`. When +the kicker `import(instance)` is executed, the `importHook` will be invoked +once with the `specifier` argument as `./foo.js`, and the `meta` argument with +the value of the `import.meta` associated to the kicker itself. 
As a result, +the `specifier` can be resolved based on the provided `meta` to calculate the +`url`, fetch the source, and create a new `Module` for the new source. This new +instance opts to reuse the same `importHook` function while constructing the +`meta` object. It is important to notice that the `meta` object has two +purposes: to be referenced by syntax in the source text (via `import.meta`) and +to be passed to the `importHook` for any dependencies of `./foo.js` itself. + +## Design + +A ***Module Source Record*** is an abstract class for immutable representations +of the dependencies, bindings, initialization, and execution behavior of a +module. + +Host-defined import-hooks may specialize module source records with annotations +such that the host can enforce content-security-policies. + +An ***EcmaScript Module Source Record*** is a concrete ***Module Source +Record*** for EcmaScript modules. + +`ModuleSource` is a constructor that accepts EcmaScript module source text and +produces an object with a [[ModuleSource]] slot referring to an ***EcmaScript +Module Source Record***. + +`ModuleSource` instances are handles on the result of compiling an EcmaScript +module's source text. +A module source has a [[ModuleSource]] internal slot that refers to a +***Module Source Record***. +Multiple `Module` instances can share a common module source. + +Module source records only capture information that can be inferred from static +analysis of the module's source text. + +Multiple `ModuleSource` instances can share a common ***Module Source Record*** +since these are immutable and so hosts have the option of sharing them between +realms of an agent and even agents of an agent cluster. + +The `Module` constructor accepts a source and an options bag. +A `Module` has a 1-1-1-1 relationship with a ***Module Environment Record***, +a ***Module Record***, and a ***Module Exports Namespace Exotic Object***. 
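The Semantics section states that each `Module` memoizes the result of its `importHook` per import specifier. The following is only a user-land sketch of that rule, not the proposal's mechanism: `memoizeImportHook` is a hypothetical helper, and a plain object stands in for the `Module` instance a real hook would return.

```js
// Hypothetical helper: wraps an import hook so that it runs at most once per
// specifier, mirroring the per-Module memo described in the Semantics section.
const memoizeImportHook = importHook => {
  const memo = new Map(); // specifier -> promise for the hook's result
  return (specifier, importMeta) => {
    if (!memo.has(specifier)) {
      memo.set(specifier, Promise.resolve(importHook(specifier, importMeta)));
    }
    return memo.get(specifier);
  };
};

// Usage with a placeholder hook that counts its invocations.
let calls = 0;
const hook = memoizeImportHook(async specifier => {
  calls += 1;
  return { specifier }; // placeholder for a Module instance
});

const first = hook('./dep.js', {});
const second = hook('./dep.js', {});
console.log(first === second, calls); // true 1
```

Storing the promise rather than the settled value means a second request made while the first is still pending also reuses the same result.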
+ +## Design Rationales + +### Should `importHook` be synchronous or asynchronous? + +When a source module imports from a module specifier, you might not have the +source at hand to create the corresponding `Module` to be returned. If +`importHook` is synchronous, then you must have the source ready when the +`importHook` is invoked for each dependency. + +Since the `importHook` is only triggered via the kicker (`import(instance)`), +going async there has no implications whatsoever. +In prior iterations of this proposal, the user was responsible for looping +through the dependencies and preparing each instance before kicking off the next +phase. That is no longer the case here, where the level of control over the +different phases is limited to the invocation of the `importHook`. + +### Can cycles be represented? + +Yes, `importHook` can return a `Module` that was either already imported with +`import()` or already returned by an `importHook`. + +### Idempotency of dynamic imports in ModuleSource + +Any `import()` expression inside a module source may result in an +`importHook` invocation on the `Module`, and the decision on whether or not to +call the `importHook` depends on whether or not the `Module` has already +invoked it for the `specifier` in question. So, a `Module` +must keep a map from every `specifier` to its corresponding `Module` to +guarantee the idempotency of those static and dynamic import statements. + +User-defined and host-defined import-hooks will likely enforce stronger +consistency between import behavior across module instances, but module +instances enforce local consistency and some consistency in aggregate by +induction of other modules. + +### toString + +Whether `ModuleSource` instances retain the original source may vary by host +and modules should reuse +[HostHasSourceTextAvailable](https://tc39.es/ecma262/#sec-hosthassourcetextavailable) +in deciding whether to reveal their source, as they might with a `toString` +method. 
+The [module block][module-blocks] proposal may necessitate the retention of +text on certain hosts so that hosts can transmit sources in their serial +representation of a module, as could be an extension to `structuredClone`. + +### Factoring ECMA-262 + +This proposal decouples a new ***Module Source Record*** and ***EcmaScript +Module Source Record*** from the existing ***Module Record*** class hierarchy +and introduces a concrete ***Virtual Module Record***. +The hope, if not expectation, is that this refactoring makes evident that +***Virtual Module Record***, ***Cyclic Module Record***, and the abstract +base class ***Module Record*** could be refactored into a single concrete +record (***Module Record***) since all meaningful variation can be expressed +with implementations of the abstract ***Module Source Record***. +But, this proposal does not go so far as to make that refactoring normative. + +This proposal does not impose on host-defined import behavior. + +### Referrer Specifier + +This proposal expressly avoids specifying a module referrer. +We are convinced that the `importMeta` object passed to the `Module` +constructor is sufficient to denote (have a host-specific referrer property +like `url` or a method like `resolve`) or connote (serve as a key in a `WeakMap` +side table) the referrer, since the import behavior carries that exact object to the +`importHook`, regardless of whether `import.meta` is identical to `importMeta`. +This allows virtual modules to emulate even hosts that provide empty +`import.meta` objects. + +## Design Variables + +### Whether to reify the `referrer` + +As written, we've avoided threading a `referrer` argument into the `Module` +constructor because the `importMeta` is sufficient for carrying information +about the referrer. +This is okay so long as `import.meta` and `importMeta` are different objects, +because a module should not be able to alter its own referrer over the course +of its evaluation. 
+Having `import.meta !== importMeta` is a topic of some confusion. +If these were made identical, we would need to thread `referrer` separately. + +### Relationship to Content-Security-Policy + +On hosts that implement a Content-Security-Policy, the [[Origin]] is +host-specific [[HostData]] of a module source. +A module source constructed from a trusted types object could also +inherit an origin. +All of this can occur outside the purview of ECMA-262 and +is orthogonal to the `Module` referrer, which can be different than the origin. + +### The name of module instances + +`Module` instance and `ModuleInstance` instance are both contenders +for the name of module instances. +We are tentatively entertaining `Module` because it feels likely that we arrive +in a world where `(module {}) instanceof Module`. +The naming calculus would shift significantly if module blocks are reified as +module source instead of module instance. + +We will no doubt arrive in a state of confusion with anything named "module" +without qualification. +For example, `WebAssembly.Module` more closely resembles a module source than a +module instance. +That would suggest module source should also be `Module`, and consequently +`ModuleInstance`. + +### Form of the kicker method + +A functionally equivalent proposal would add an `import` method to the `Module` +prototype to get a promise for the module's exports namespace instead of +overloading dynamic `import`. +Using dynamic import is consistent with an interpretation of the module blocks +proposal where module blocks evaluate to `Module` instances. + +### Options bag or flat arguments + +The options bag described here for the `Module` constructor leaves some +room open for accounting for import assertions. +It also raises questions about the default import hook and import meta, +which are answerable but have not yet been fully discussed. 
+ +[module-blocks]: https://github.com/tc39/proposal-js-module-blocks +[trusted-types]: https://w3c.github.io/webappsec-trusted-types/dist/spec/ diff --git a/1-static-analysis.md b/1-static-analysis.md new file mode 100644 index 0000000..00b43b2 --- /dev/null +++ b/1-static-analysis.md @@ -0,0 +1,298 @@ +# Surface Module Source Static Analysis + +## Synopsis + +Extend instances of `ModuleSource` such that they reflect certain results of +static analysis, like their `import` and `export` bindings, such that tools can +inspect module graphs. + +## Dependencies + +This proposal depends on [Module and ModuleSource][0] from the [Compartments +proposal](README.md) to introduce `ModuleSource`. + +## Design + +Extend [ModuleSource][0], such that instances have the following properties: + +- `bindings`, an `Array` of `Binding`s. +- `needsImportMeta`, a `boolean` indicating that the module contains + `import.meta` syntax. + +Where a `Binding` is an ordinary `Object` with one of the valid binding shapes +for each name or wildcard (`*`) bound by `import` or `export` in the text +of the module, in their order of appearance. + +- `{ import: string, from: string }` + + For example, `import { a } from 'a.js'` would produce + `{ import: 'a', from: 'a.js' }`. + + For example, `import { a, b } from 'ab.js'` would produce: + `{ import: 'a', from: 'ab.js' }` and + `{ import: 'b', from: 'ab.js' }`. + +- `{ import: string, as: string, from: string }` + + For example, `import { a as x } from 'a.js'` would produce + `{ import: 'a', as: 'x', from: 'a.js' }`. + +- `{ export: string }` + + For example, `export { x }` would produce + `{ export: 'x' }`. + +- `{ export: string, from: string }` + + For example, `export { x } from 'x.js'` would produce + `{ export: 'x', from: 'x.js' }`. + +- `{ export: string, as: string, from: string }` + + For example, `export { x as a } from 'x.js'` would produce + `{ export: 'x', as: 'a', from: 'x.js' }`. 
+ +- `{ importAllFrom: string, as: string }` + + For example, `import * as x from 'x.js'` would produce + `{ importAllFrom: 'x.js', as: 'x' }`. + +- `{ exportAllFrom: string }` + + For example, `export * from 'x.js'` would produce + `{ exportAllFrom: 'x.js' }`. + +- `{ exportAllFrom: string, as: string }` + + For example, `export * as x from 'x.js'` would produce + `{ exportAllFrom: 'x.js', as: 'x' }`. + +When using dynamic import to instantiate a `Module`, the JavaScript host will +continue to depend on the [[Module Source]] internal slot and the bindings +slots of the underlying Module Source Record for its own analysis, such that +user code cannot be confused by modifications to a `ModuleSource` instance that +might share the underlying immutable Module Source Record which in turn may be +safely shared among agents in an agent cluster. + +## Motivation + +A mechanism to statically analyze the shallow dependencies of a JavaScript +module will allow tools to create a module graph from module texts without +executing them, and without a heavy dependency on a full JavaScript parser. +This is the first step in many JavaScript module system tools including +build systems, bundlers, import map generators, and hot module replacement +systems, test dependency watchers. + +The weight and performance of a JavaScript meta-parser (about 1MB) often +precludes production use-cases that make direct use of JavaScript module +source. +Surfacing this feature at the language level will likely allow production +systems to operate directly on JavaScript sources instead of generated +artifacts. +This would make production systems more closely resemble systems tested during +development, and make debugging production systems map more closely to +development analogues. 
+ +* bundlers ([Browserify][browserify], [WebPack][webpack], [Parcel][parcel], + &c), virtualize loading but not evaluation of module graphs and emulate other + host environments, like a Node.js program emulating a web browser. +* import mappers ([import-map][import-map]) like bundlers need to be able to + collect transitive dependencies according to ECMAScript language and specific + host behaviors. + A ECMAScript native module loader interface would expedite evolution of import map + runtimes in JavaScript. +* hot module replacement (HMR) systems (WebPack, SnowPack, &c), which need the + ability to instantiate new module graphs when dependencies change and the + ability to bequeath subgraphs to new graphs. + * Node.js [defers][node-hmr] to ECMAScript to provide a module loader + interface to aid HMR. +* persistent testing apparatuses ([Jest][jest]), because a persistent service + reinstantiates whole module graphs to reconstruct tests and test subjects. + * Jest currently resorts to exploiting Node.js's [vm][vm-context] module to + instantiate separate realms and attempts ([and + fails][jest-ses-interaction]) to provide the illusion of a single realm by + patching client realms with some of the intrinsics of the host realm. + +## Examples + +### Analyzing a module graph + +The following code produces a module graph from modules plainly published on +the web using URLs as import specifiers and memo keys. +No modules are executed. + +```js +const graph = new Map(); + +const load = async url => { + if (graph.has(url)) { + return; + } + const response = await fetch(url); + + // Account for redirects. + if (response.url !== url) { + graph.set(url, new Set([response.url])); + return load(response.url); + } + + const edges = new Set(); + graph.set(url, edges); + + const text = await response.text(); + const source = new ModuleSource(text); + + const dependencies = []; + for (const binding of source.bindings) { + const from = binding.from ?? binding.importAllFrom ?? 
 binding.exportAllFrom; + if (from) { + const importUrl = new URL(from, url).href; + edges.add(importUrl); + dependencies.push(load(importUrl)); + } + } + await Promise.all(dependencies); +}; + +await load('https://example.com/example.js'); +``` + +### Hot module replacement + +Hot module replacement allows a developer to automatically reload a module when +any of its transitive dependencies change, invalidating any intermediate modules, +and allowing for graceful hand-off of module-scoped state when necessary. + +
+ Hot module replacement sketch + + This sketch outlines how one can use `Module` and `ModuleSource` to construct + a watcher graph that reuses these objects between reloads when possible. + The sketch assumes the existence of a fictitious `watch` interface that is a + parody of `fetch`, except producing a promise `changed` that will settle when + the response is no longer valid. + + ```js + const getImports = source => source.bindings.map(binding => + binding.from ?? + binding.importAllFrom ?? + binding.exportAllFrom + ).filter(Boolean); + + const sources = new Map(); + const modules = new Map(); + const watchers = new Map(); + const states = new Map(); + const getStates = new Map(); + + const invalidateModule = url => { + const watcher = watchers.get(url); + if (watcher) { + watcher(); + watchers.delete(url); + } + modules.delete(url); + // The source may be absent if this module was never loaded. + const source = sources.get(url); + for (const importSpecifier of source ? getImports(source) : []) { + const importUrl = new URL(importSpecifier, url).href; + invalidateModule(importUrl); + } + + // Hand-off state in preparation for an upgrade. 
+ const getState = getStates.get(url); + if (getState) { + states.set(url, getState()); + getStates.delete(url); + } + }; + + const invalidateSource = url => { + invalidateModule(url); + sources.delete(url); + }; + + const importHook = async (importSpecifier, importerMeta) => { + const url = new URL(importSpecifier, importerMeta.url).href; + let module = modules.get(url); + if (!module) { + let source = sources.get(url); + if (!source) { + const response = await watch(url); + response.changed.then(() => invalidateSource(url)); + const text = await response.text(); + source = new ModuleSource(text); + sources.set(url, source); + } + const registerGetState = getState => { + getStates.set(url, getState); + }; + const state = states.get(url); + const importMeta = { url, state, registerGetState }; + module = new Module(source, { importHook, importMeta }); + modules.set(url, module); + } + return module; + }; + + const watchModule = async (url, { signal }) => { + while (!signal.aborted) { + // Promise.withResolvers is standard as of ES2024. + const { promise, resolve } = Promise.withResolvers(); + watchers.set(url, resolve); + await importHook(url, import.meta); + await promise; + // Blink once to debounce coincident changes. + await new Promise(wake => setTimeout(wake, 100)); + } + }; + + const entrypoint = 'https://example.com/example.js'; + // Abort the controller to stop watching. + const controller = new AbortController(); + await watchModule(entrypoint, { signal: controller.signal }); + ``` + + This assumes a protocol for state hand-off: + + ```js + let state = import.meta.state; + import.meta.registerGetState(() => state); + ``` + +
+ +## Design Questions + +Do we also need to reflect `isAsync`? +This appears to depend on whether implementations need to know +whether execution will be asynchronous before actually beginning to execute. +XS appears to have managed to implement virtual module sources without +an explicit indicator. +ECMA-262 currently has [[isAsync]] on ***Cyclic Module Record***, +which would suggest that, if engines can be implemented without knowing +a source will be asynchronous, the specification will need to be refactored to +reflect that. + +## Design Rationales + +The property `needsImportMeta` allows virtual import hooks to omit properties +from the `importMeta` of any `Module` instance derived from the source, +having proof that the module will never access `import.meta`. +Concretely, `import.meta.resolve` would be a closure over the module's referrer +in hosts that provide it. +In module graphs with thousands of module instances that largely do not use +this property, avoiding the allocation of per-module closures can allow a +significant reduction in memory pressure. + +A similar optimization might be possible for `import`. +With the design as written, `needsImport` would only be `false` for modules +that make no use of static `import` or `export` `from` clauses and also never +use the syntactic form for dynamic `import`. +Since virtual module graphs can share relatively few `importHook` instances, +the potential savings would be negligible, so we've omitted this flag. 
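As a rough illustration of the `needsImportMeta` rationale, an import hook could skip allocating the per-module `resolve` closure for sources that never mention `import.meta`. This is a sketch under assumptions: `makeImportMeta` is a hypothetical helper, the `url` and `resolve` properties follow the host conventions mentioned in this document, and plain objects stand in for `ModuleSource` instances.

```js
// Hypothetical helper: builds the importMeta for a Module derived from
// `source`. `source` is any object exposing a boolean `needsImportMeta`.
const makeImportMeta = (source, url) => {
  if (!source.needsImportMeta) {
    // The module never reads import.meta, so only the referrer data that the
    // import hook itself needs is kept and no closure is allocated.
    return { url };
  }
  return {
    url,
    // Per-module closure over the referrer, as a host's import.meta.resolve
    // would be.
    resolve: specifier => new URL(specifier, url).href,
  };
};

// Plain objects standing in for analyzed module sources.
const lean = makeImportMeta({ needsImportMeta: false }, 'https://example.com/a.js');
const full = makeImportMeta({ needsImportMeta: true }, 'https://example.com/a.js');
```

The `url` property is retained in both cases because the same object is passed to the `importHook` when the module's own dependencies are resolved.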
+ +[0]: ./0-module-and-module-source.md +[browserify]: https://browserify.org/ +[import-map]: https://github.com/WICG/import-maps +[jest-ses-interaction]: https://github.com/facebook/jest/issues/11952 +[jest]: https://jestjs.io/ +[node-hmr]: https://github.com/nodejs/node/issues/40594 +[parcel]: https://parceljs.org/ +[vm-context]: https://nodejs.org/api/vm.html#vm_vm_createcontext_contextobject_options +[webpack]: https://webpack.js.org/ diff --git a/2-virtual-module-source.md b/2-virtual-module-source.md new file mode 100644 index 0000000..5d9b619 --- /dev/null +++ b/2-virtual-module-source.md @@ -0,0 +1,408 @@ +# Virtual Module Sources + +## Synopsis + +Extend the `Module` constructor such that it accepts virtual module sources: +objects that implement a protocol that is sufficient for virtualizing the +evaluation of modules in languages not anticipated by ECMA-262 or by host +implementations. + +## Motivation + +Hosts can add support for linking non-EcmaScript languages by creating +new kinds of ***Module Source Record***. +For example, a host could add a [[ModuleSource]] internal slot to +[`WebAssembly.Module`][web-assembly-module] and then link arbitrary WASM into +an EcmaScript module graph. + +However, user code cannot emulate a host in this way. +To extend or experiment with new languages or new module-like features, user +code needs a way to emulate or *virtualize* a module source record. + +Such a protocol would allow user code to integrate CommonJS, JSON, or WASM +modules on hosts that do not support them. +It would also open a field for experimentation with other kinds of resource +modules. + +## Description + +The first argument to the `Module` constructor is a source. +If that source is an object that does not have a [[Module Source]] internal +slot, we treat this object as a protocol for loading, binding, linking, +initializing, and executing the new `Module` instance. 
+ +## Interface + +```ts +type VirtualModuleSource = { + // Indicates the import and export bindings the module has + // between its module environment record, module namespace exotic object, + // and its dependencies. + bindings?: Array<Binding>, + + // Executes the module if it is imported. + // execute may return a promise, indicating that the module uses + // the equivalent of top-level-await. + execute?: (namespace: ModuleImportsNamespace, { + import?: (importSpecifier: string) => Promise, + importMeta?: Object, + globalThis: Object, + }) => void, + + // Indicates that execute needs to receive a dynamic import function + // bound to a Module instance. + needsImport?: boolean, + + // Indicates that execute needs to receive an importMeta. + needsImportMeta?: boolean, +}; +``` + +Bindings must be one of the shapes proposed in [module source static +analysis][1], such that each binding gets linked in a virtual module. + +The `Module` constructor from [Module and ModuleSource](./0-module-and-module-source.md) extends to +accept virtual module sources instead of `ModuleSource` and reflects whatever +`source` it's given as the `source` property of the returned instance. + +## Examples + +### JSON + +This protocol allows user code to create new module source constructors. +For example, for a host that does not support JSON, this could be +accomplished with a small virtual module source. + +```js +class JsonModuleSource { + bindings = [{ export: 'default' }]; + #object; + constructor(text) { + // Throw SyntaxError if the source is invalid, from here. + this.#object = JSON.parse(text); + }; + execute(imports) { + // Exports of multiple module instances backed by this + // source should be referentially independent. 
+ imports.default = structuredClone(this.#object); + }; +} + +const source = new JsonModuleSource('{ "meaning": 42 }'); +const module = new Module(source); +const { default: { meaning } } = await import(module); +``` + +Asset modules of various kinds would largely follow this pattern. 
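As one more instance of that pattern, here is a minimal sketch of a plain-text asset source. `TextModuleSource` is a hypothetical name, and because no engine currently provides the proposed `Module` constructor, the sketch exercises `execute` by hand with an ordinary object standing in for the ***Module Imports Namespace***.

```js
// Hypothetical virtual module source for plain-text assets.
class TextModuleSource {
  // A single default export, using the binding shapes from the static
  // analysis proposal.
  bindings = [{ export: 'default' }];
  #text;
  constructor(text) {
    this.#text = String(text);
  }
  execute(namespace) {
    // Strings are immutable, so every module instance can share the value.
    namespace.default = this.#text;
  }
}

// Driving the protocol manually; a real host would pass its own namespace.
const textSource = new TextModuleSource('hello');
const standInNamespace = {};
textSource.execute(standInNamespace);
```

Unlike the JSON case, no cloning is needed here, since a primitive string cannot be mutated through one module instance and observed through another.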
+ +### WASM + +On a host that does not provide support for WebAssembly, a virtual module +source would suffice. + +```js +class WasmModuleSource { + #module; + #imports; + #exports; + constructor(buffer) { + this.#module = new WebAssembly.Module(buffer); + this.#imports = WebAssembly.Module.imports(this.#module); + this.#exports = WebAssembly.Module.exports(this.#module); + this.bindings = [ + ...this.#imports.map(({ module, name }) => + ({ import: name, from: module })), + ...this.#exports.map(({ name }) => + ({ export: name })), + ]; + }; + async execute(namespace) { + const importObject = {}; + for (const { module, name } of this.#imports) { + // WebAssembly import objects are keyed by module, then by name. + (importObject[module] ??= {})[name] = namespace[name]; + } + const instance = await WebAssembly.instantiate(this.#module, importObject); + for (const { name } of this.#exports) { + namespace[name] = instance.exports[name]; + } + }; +} +``` + +### CommonJS + +There is no single perfect solution for binding CommonJS, especially +taking into account that an asynchronous loader cannot imitate Node.js's +synchronous loader. +But, any Node.js library that is portable to the web must be sufficiently +transparent to static analysis that bundlers can subsume it, and so, +large amounts of the CommonJS ecosystem are in fact portable. + +This is a sketch of one possible solution for binding CommonJS in ESM, +based on the solution in [Endo][endo]. + +
+ CommonJS virtual module source based on heuristic static analysis. + + ```js + class CjsModuleSource { + #requires; + constructor(source, url) { + this.source = source; + // Lexical analysis of a CommonJs module reveals the bindings + // for named exports and invents bindings for imported namespace + // objects for every lexically evident require call with a string argument, + // such that require just returns a property of the + // ESM internal namespace to the already-linked imported module + // namespace. + const { bindings, requires } = lexicallyAnalyzeCjs(source, url); + this.bindings = bindings; + this.#requires = requires; + }; + async execute(namespace, { importMeta, globalThis }) { + const functor = new globalThis.Function( + 'require', 'exports', 'module', '__filename', '__dirname', + this.source, // Inject source URL here (caveat Github Markdown) + ); + + namespace.default = Object.create(globalThis.Object.prototype); + + // Set all exported properties on the default export and call + // promoteToNamedExport to add them on the namespace for import *. + // Root namespace is only accessible for imports. + // Requiring from CommonJS gets the default field of the namespace. + const promoteToNamedExport = (prop, value) => { + // __esModule needs to be present for typescript-compiled modules to + // work, can't be skipped. + if (prop !== 'default') { + namespace[prop] = value; + } + }; + + const originalExports = new Proxy(namespace.default, { + set(_target, prop, value) { + promoteToNamedExport(prop, value); + namespace.default[prop] = value; + return true; + }, + defineProperty(target, prop, descriptor) { + if (Object.hasOwn(descriptor, 'value')) { + // This will result in non-enumerable properties being enumerable + // for named import purposes. + promoteToNamedExport(prop, descriptor.value); + } + // All the defineProperty trickery with getters used for lazy + // initialization will work. + // The trap is here only to elevate the values with promoteToNamedExport + // whenever possible. 
+ // Replacing getters with wrapped ones to facilitate + // propagating the lazy value to the namespace is not possible because + // defining a property with modified + // descriptor.get in the trap will cause an error. + // We use Object.defineProperty instead of Reflect.defineProperty for better + // error messages. + Object.defineProperty(target, prop, descriptor); + return true; + }, + }); + + // Machinery to distinguish module.exports assignment. + let finalExports = originalExports; + const module = Object.freeze({ + get exports() { + return finalExports; + }, + set exports(value) { + finalExports = value; + }, + }); + + const require = specifier => { + return namespace[this.#requires[specifier]]; + }; + + // Note: filename and dirname are not defined in this sketch; they are + // assumed to be derived from the module's location. + functor(require, originalExports, module, filename, dirname); + + // Promotes keys from redefined module.exports to top level namespace for import * + // Note: We could do it less consistently but closer to how node does it if + // we iterated over exports detected by the lexer. + const exportsHaveBeenReassigned = finalExports !== originalExports; + if (exportsHaveBeenReassigned) { + namespace.default = finalExports; + Object.keys(namespace.default || {}).forEach(prop => { + if (prop !== 'default') + namespace[prop] = namespace.default[prop]; + }); + } + } + } + ``` + + +### Pass-through module sources + +This example illustrates how a virtual module source can simply +reexport another module with no special logic in an executor. + +```js +import * as direct from 'real.js'; +const source = { bindings: [{ exportAllFrom: 'real.js', as: 'real' }] }; +const module = new Module(source); +const { real: indirect } = await import(module); +direct === indirect; // true +``` + +## Dependencies + +This proposal depends on [Module and ModuleSource][0] from the [Compartments +proposal](README.md) to introduce `ModuleSource`, and [ModuleSource +analysis][1] from the [Compartments proposal](README.md). 
+ +## Design + +For every `Module` instance that has a virtual module source, +module import machinery constructs a real ***Module Imports Namespace***, +an exotic object that allows user code to get and set the values for each +binding. + +Bindings must be one of the shapes proposed in [module source static +analysis][1], such that each binding gets linked in a virtual module as follows: + +- `{ import: string, from: string }` + + Links the named `import` property of the ***Module Imports Namespace*** + to the same name of the `from` module's ***Module Exports Namespace***. + +- `{ import: string, as: string, from: string }` + + Links the named `as` property of the ***Module Imports Namespace*** + to the `import` name of the `from` module's ***Module Exports Namespace***. + +- `{ export: string }` + + Links the named `export` property of this module's ***Module Imports + Namespace*** directly to its ***Module Exports Namespace***. + +- `{ export: string, as: string }` + + Links the named `export` property of this module's ***Module Imports + Namespace*** to the named `as` property of this module's ***Module Exports + Namespace***. + +- `{ export: string, as: string, from: string }` + + Links the named `export` property of the `from` module's ***Module Exports + Namespace*** to the named `as` property of this module's ***Module Exports + Namespace***, bypassing this module's ***Module Imports Namespace*** + entirely. + +- `{ importAllFrom: string, as: string }` + + Links the ***Module Exports Namespace*** of the module `importAllFrom` + to the name `as` in this module's ***Module Imports Namespace***. + +- `{ exportAllFrom: string }` + + Links all of the names except `default` exported by the `exportAllFrom` + module to this module's ***Module Exports Namespace***, bypassing this + module's ***Module Imports Namespace*** entirely. 
+ +- `{ exportAllFrom: string, as: string }` + + Links the ***Module Exports Namespace*** of the module `exportAllFrom` + to a property named `as` on this module's ***Module Exports Namespace***, + bypassing this module's ***Module Imports Namespace*** entirely. + +In the absence of a `bindings` property, module machinery presumes an empty +bindings array. + +> Modules without bindings may still have side-effects in global scope. + +Dynamic import of a `Module` with a virtual module source induces the memoized +`importHook` of the `Module` for each binding that has a `from` property. +When a `Module` instance exists for every transitive dependency, dynamic import +advances the linkage to all new namespaces, including all links between +***Module Exports Namespaces*** and ***Module Imports Namespaces*** according +to each source's bindings. + +In the absence of an `execute` property, dynamic import assumes an empty +execute function. + +> Modules without execution behavior may still have useful export bindings from +> other modules. + +Dynamic import will then execute the working set of modules according +to the existing rules of ordering, and will call `execute` for each virtual +module source, providing the linked ***Module Imports Namespace***, +a dynamic import function bound to the module instance only if the virtual +module source has a truthy `needsImport` property, and an `import.meta` object +only if the virtual module source has a truthy `needsImportMeta` property. + +## Design questions + +### Module source serializability invariant + +Caridy Patiño prefers an even lower level API for virtual module sources, +where instead of virtualizing evaluation, we provide a way to construct +a `Module` instance along with its imports and exports namespaces from their +bindings. +In this model, there would be no virtual module source, just modules. +This would protect the invariant that module sources are serializable. 
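As a rough illustration of the virtual module source protocol described under Design, the following sketch emulates how a loader might run a source that uses only the `{ export }` and `{ export, as }` binding shapes. The helper name `runVirtualSource` and the example source are hypothetical and not part of the proposal. The sketch assumes a synchronous `execute` and copies values once, whereas real module namespaces have live bindings.

```js
// Hypothetical helper, not part of the proposal: a loose emulation of running
// a virtual module source with `{ export }` and `{ export, as }` bindings.
function runVirtualSource(source) {
  // An absent `bindings` property is treated as an empty array and an absent
  // `execute` property as an empty function, as the Design section states.
  const bindings = source.bindings ?? [];
  const execute = source.execute ?? (() => {});

  // Stand-in for the Module Imports Namespace handed to `execute`.
  const importsNamespace = Object.create(null);
  execute(importsNamespace);

  // Stand-in for the Module Exports Namespace, populated from the bindings.
  const exportsNamespace = Object.create(null);
  for (const binding of bindings) {
    if (typeof binding.export === 'string' && binding.from === undefined) {
      const exportedName = binding.as ?? binding.export;
      exportsNamespace[exportedName] = importsNamespace[binding.export];
    }
  }
  return Object.freeze(exportsNamespace);
}

// Hypothetical virtual module source for illustration.
const answerSource = {
  bindings: [{ export: 'answer' }, { export: 'hidden', as: 'shown' }],
  execute(namespace) {
    namespace.answer = 42;
    namespace.hidden = 'visible';
  },
};

const answerExports = runVirtualSource(answerSource);
console.log(answerExports.answer, answerExports.shown);
```

The `execute` function is the part of a virtual module source that cannot be serialized, which is the motivation for the lower level alternative discussed in this section.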
+ +### Emulated JavaScript + +[Should virtual module sources support emulated +JavaScript?](https://github.com/tc39/proposal-compartments/issues/70) +That will require, in some cases, separation of initialization from execution +as separate phases. +Should the virtual module source protocol support separate paths for modules +that do not require an initialization phase? + +### Shape of the internal namespace object + +The `execute` method of a virtual module source needs access to both +the global object and internal view of bindings. +These could be addressed singly by a reification of the module environment +record, or with separate objects as written. +Future amendments to modules may eventually also add lexical names to the +module environment record that are not properties of the global object. + +## Limitations + +Module sources compiled by the `ModuleSource` constructor capture enough data +that engines can transfer them in many ways. +The internal representation of a module source can be an immutable record that +a JavaScript host can trivially share throughout an agent cluster. +Communicating JavaScript agent clusters can transfer module sources as data. + +In the most naive case, a module source is just a record of the original source +and its type, which is not limited to JavaScript modules, since hosts can define +additional module source types. +For example, [`WebAssembly.Module`][web-assembly-module] might be trivially +extended to serve as a module source, by adding a [[ModuleSource]] slot +referring to a concrete ***Module Source Record***. +In this case, it is sufficient for a pair of JavaScript agent clusters +to transfer the original source and type. + +In a more elaborate case, the module source does not even retain the original +text, but its compiled bytecode. +In this case, a pair of compatible JavaScript agent clusters might elect to +send and receive byte code. 
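The contrast between transferable records and virtual module sources can be sketched with `structuredClone`, which is available in recent browsers and in Node.js 17 or later. The `{ type, text }` record shape below is a hypothetical stand-in for a ***Module Source Record*** and is not defined by this proposal.

```js
// A plain record of source text and type survives structured clone.
// The shape of this record is an assumption made for illustration.
const sourceRecord = { type: 'js', text: 'export const answer = 42;' };
const clonedRecord = structuredClone(sourceRecord);
console.log(clonedRecord.text === sourceRecord.text); // same data, new object

// A virtual module source carries an `execute` function, and structured
// clone refuses to copy functions.
const virtualSource = {
  bindings: [{ export: 'answer' }],
  execute(namespace) {
    namespace.answer = 42;
  },
};

let cloneError = null;
try {
  structuredClone(virtualSource);
} catch (error) {
  cloneError = error;
}
console.log(cloneError && cloneError.name);
```

Any transfer of a virtual module source therefore has to go through a user-defined serial representation that the receiver knows how to turn back into an `execute` function.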
+ +Virtualized module sources are transmissible between JavaScript agent clusters +only if both the sender and receiver agree on a protocol for constructing a new +virtual module source from some serial representation. + +To wit, it would not be possible for a host to transmit arbitrary virtual +module sources between agents using a general purpose algorithm like +[structured clone][structured-clone]. +The protocol for communicating virtual module sources between hosts would +necessarily need to be implemented in user code. +Any general-purpose host-defined mechanism for transmitting module graphs would +necessarily fail when encountering a virtual module source. +For example, it may be possible to extend structured clone to transmit +instances of `ModuleSource` between any JavaScript hosts, or even +[`WebAssembly.Module`][web-assembly-module] when a host-defined extension is +present, but it would not be possible for this behavior to generalize to +virtual module sources. + +[0]: ./0-module-and-module-source.md +[1]: ./1-static-analysis.md +[web-assembly-module]: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/WebAssembly/Module +[structured-clone]: https://developer.mozilla.org/en-US/docs/Web/API/structuredClone +[endo]: https://github.com/endojs/endo diff --git a/3-evaluator.md b/3-evaluator.md new file mode 100644 index 0000000..71a0466 --- /dev/null +++ b/3-evaluator.md @@ -0,0 +1,166 @@ + +# Evaluators + +## Synopsis + +Provide an `Evaluators` constructor that produces a new `eval` function, +`Function` constructor, and `Module` constructor such that execution contexts +generated from these evaluators refer back to this set of evaluators, with a +given global object and virtualized host behavior for dynamic import in script +contexts. 
+ +## Interfaces + +```ts +interface Evaluators { + constructor({ + globalThis?: Object, + importHook?: ImportHook, + importMeta?: Object, + }); + + eval: typeof eval, + Function: typeof Function, + Module: typeof Module, +}; +``` + +## Motivation + +### Domain Specific Languages + +Tools like Mocha, Jest, and Jasmine install the verbs and nouns of their +domain-specific language in global scope. + +Isolating these changes currently requires creation of a new realm, +and creating new realms comes with the hazard of identity discontinuity. +For example, `array instanceof Array` is not as reliable as `Array.isArray`, +and the hazard is not limited to intrinsics that have anticipated this +problem with work-arounds like `Array.isArray` or thenable `Promise` adoption. + +Evaluators provide an alternate solution: evaluate modules or scripts in a +separate global scope with shared intrinsics. + +```js +const dsl = new Evaluators({ + globalThis: { + __proto__: globalThis, + describe, + before, + after, + } +}); + +const source = await import(entrypoint, { reflect: 'module-source' }); +const module = new dsl.Module(source); +await import(module); +``` + +In this example, only the entrypoint module for the DSL sees additional +globals. +The `Module` constructor adopts the host's import hook. +Notably, in this model, each entrypoint module could be granted separate +closures for the DSL. +Current DSLs cannot execute concurrently or depend on dynamic scope to track +the entrypoint that called each DSL verb. + +### Enforcing the principle of least authority + +On the web, the same-origin policy has become sufficiently effective at +preventing cross-site scripting attacks that attackers have been forced to +attack from within the same origin. +Conveniently for attackers, the richness of the JavaScript library ecosystem +has produced ample vectors to enter the same origin. 
+The vast bulk of a modern web application is its supply chain, including code +that will be eventually incorporated into the scripts that will run in the same +origin, but also the tools that generate those scripts, and the tools that +prepare the developer environment. + +The same-origin policy protects the rapidly deteriorating fiction that +web browsers mediate an interaction between just two parties: the service and +the user. +For modern applications, particularly platforms that mediate interactions among +many parties or simply have a deep supply chain, web application developers +need a mechanism to isolate third-party dependencies and minimize their access +to powerful objects like high resolution timers or network, compute, or storage +capability bearing interfaces. + +Some hosts, including a community of embedded systems represented at [ECMA +TC53][tc53], do not have an origin on which to build a same-origin policy, and +have elected to build their security model on isolated evaluators, through the +high-level Compartment interface. + +## Design + +Where ***Execution Contexts*** and instances of `eval`, `Function`, and +`Module` were previously bound to a realm, they become bound to their +evaluators instead, which in turn are bound to the realm. + +All references to %eval% must now refer to the +[[Realm]].[[Evaluators]].[[Eval]]. +All other references to [[Context]].[[Realm]] must be replaced with +[[Context]].[[Evaluators]].[[Realm]], particularly to address the +intrinsics of the realm. + +The rules for direct eval do not change as a consequence. +The name `eval` must be bound to the [[Context]].[[Evaluators]].[[Eval]] +for the `eval` special form to be interpreted as direct eval. +The creator of new evaluators must arrange for `evaluators.eval` +to be threaded into lexical scope for this to continue working. 
+For example: + +```js +const localThis = { __proto__: globalThis }; +const evaluators = new Evaluators({ globalThis: localThis }); +localThis.eval = evaluators.eval; +evaluators.eval(` + eval('var local = 42'); + typeof local === 'number'; // true +`); +``` + +The new global `Evaluators` constructor accepts a `globalThis`. +The ***Global Environment Record*** in ***Execution Contexts*** will +reach out to this object for properties in global scope. +If absent, the new evaluators will receive the `Evaluators` constructor's +own [[Context]].[[Evaluators]]. + +The `Evaluators` constructor accepts an `importHook` having the same +signature as that specified in [Module and ModuleSource][0] of the +[Compartments proposal](README.md). +The `importHook` serves multiple roles. + +The new `Module` constructor will adopt its evaluator's [[ImportHook]] and +[[ImportMeta]] if called without the corresponding `importHook` or `importMeta` +options. Module execution contexts will in turn use the [[ImportHook]] and +[[ImportMeta]] from their `Module` constructor. + +Dynamic import in script execution contexts will use their +[[Context]].[[Evaluators]].[[ImportHook]]. +The import hook will receive the given specifier and the +[[Context]].[[Evaluators]].[[ImportMeta]]. + +## Design Questions + +### Threading globals + +Host implementors may not be able to accommodate an arbitrary value for +`globalThis`. +The proposal as written asks for the best user experience, but may need to +adjust if host implementations cannot support an arbitrary object, or if +limitations on the given object are not sufficient. + +Evaluators will be useful in two different modes: + +- Sharing a `globalThis` +- Having a separate `globalThis` + +For separate globals, it would be okay for the `Evaluators` constructor to +receive a bag of properties to copy onto a global object constructed by the +host on their behalf. 
+ +For shared globals, copying properties isn't useful, so the argument +pattern would have to be different. + +[0]: ./0-module-and-module-source.md +[tc53]: https://www.ecma-international.org/technical-committees/tc53/ diff --git a/4-compartment.md b/4-compartment.md new file mode 100644 index 0000000..1e6f67c --- /dev/null +++ b/4-compartment.md @@ -0,0 +1,850 @@ + +# Compartment + +## Synopsis + +Compartments provide a high-level API for orchestrating modules and evaluators, +such that programs and modules can be collectively isolated to a particular +global scope. +We expect that compartments will be used to isolate Node.js packages and import +map scopes. + +Agoric's [SES shim][ses-shim], Sujitech's [ahead-of-time-SES][AOT-SES] and +Moddable's [XS][xs-compartments] are actively vetting this proposal +as a shim and native implementation respectively (2022). + +This proposal is a [SES Proposal][ses-proposal] milestone. + +## Motivation + +Many ECMAScript module behaviors are defined by the host. +The language needs a mechanism to allow programs running on one host to fully +emulate or virtualize the module loader behaviors of another host. + +Module loader behaviors defined by hosts include: + +* resolving a module import specifier to a full specifier, +* locating the source for a full module specifier, +* canonicalizing module specifiers that refer to the same module instance, +* populating the `import.meta` object. + +For example, on the web we can expect a URL to be a suitable full module +specifier, and for every module specifier to correspond to a URL. +We can expect the canonicalized module specifier to be reflected as +`import.meta.url`. +In Node.js, we can also expect import specifiers that are not fully qualified +URLs to receive special treatment. +However, in Moddable's XS engine, we can expect a module specifier +to resemble a UNIX file system path and not have a corresponding URL. 
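The host-specific resolution behaviors described above can be sketched as two interchangeable functions with the same signature as a `resolveHook`. The URL variant relies only on the standard `URL` constructor. The path variant is a hypothetical simplification that is loosely modeled on UNIX paths and is not a description of how XS or any other host actually resolves specifiers.

```js
// Web-style resolution: full specifiers are URLs, and import specifiers are
// resolved relative to the referrer's URL.
function resolveUrl(importSpecifier, referrerSpecifier) {
  return new URL(importSpecifier, referrerSpecifier).href;
}

// Path-style resolution (hypothetical): full specifiers resemble UNIX paths
// and have no corresponding URL. Specifiers that are not relative are
// returned unchanged, and ".." above the root is not handled.
function resolvePath(importSpecifier, referrerSpecifier) {
  if (!importSpecifier.startsWith('./') && !importSpecifier.startsWith('../')) {
    return importSpecifier;
  }
  const parts = referrerSpecifier.split('/');
  parts.pop(); // Drop the referrer's own file name.
  for (const segment of importSpecifier.split('/')) {
    if (segment === '.' || segment === '') {
      continue;
    }
    if (segment === '..') {
      parts.pop();
    } else {
      parts.push(segment);
    }
  }
  return parts.join('/');
}

console.log(resolveUrl('./dep.js', 'https://example.com/app/main.js'));
console.log(resolvePath('../lib/util', '/app/main'));
```

A program that can supply either function to a module loader can emulate the specifier conventions of a host other than the one it runs on, which is the virtualization this section argues for.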
+ +We can also expect to have only one module instance per canonical module +specifier in a given Realm, and for `import(specifier)` to be idempotent +for the lifetime of a Realm. +Tools that require separate module memos are therefore compelled to create +realms either using Node.js's [VM context][vm-context] or `<iframe>` and +[content security policies][csp] rather than a lighter-weight mechanism, +and consequently suffer identity discontinuities between instances from +different realms. + +* sub-realm sandboxes ([SES][ses] and [LavaMoat][lava-moat]) that virtualize + evaluating guest modules and limit access to globals and built-in modules. + This proposal prepares for the SES proposal to introduce `lockdown`, which + isolates all evaluators, including `eval`, `Function`, and this `Compartment` + module evaluator. That proposal will introduce the concern of per-compartment + globals and hardened shared intrinsics. + +Defining a module loader in the language also improves the language's ability +to evolve. +For example, a module loader interface that accounts for linking "virtual" +modules that are not JavaScript facilitates easier experimentation with linkage +against languages like WASM. +For another, a module loader interface allows for user space experimentation +with the notion of [import maps][import-map]. + +Defining a module loader in the language also provides valuable insight to the +design of every language feature that touches upon modules, and every new +module system feature adds uncertainty to the eventual inclusion of a module +loader to the language. + +One such insight is that module blocks will benefit from the notion of a module +descriptor as defined by this proposal. Module blocks roughly correspond to +compiled sources and are consequently not coupled to a particular host environment. +A module descriptor is necessary to carry properties of a module not captured +in the source, like the module referrer specifier and how to populate `import.meta`. 
+ +Additionally, having a module loader interface is a prerequisite for shimming built-in +modules. + +## Interfaces + +```ts +// A ModuleDescriptor captures a module source and per-compartment metadata. +type ModuleDescriptor = + + // Describes a module by referring to a *reusable* record of the compiled + // source. + // The module source captures a static analysis + // of the `import` and `export` bindings and whether the source ever utters + // the keywords `import` or `import.meta`. + // The compartment will then asynchronously _load_ the shallow dependencies + // of the module and memoize the promise for the + // result of loading the module and its transitive dependencies. + // If the compartment _imports_ the module, it will generate and memoize + // a module instance and execute the module. + // To execute the module, the compartment will construct an `import.meta` object. + // If the source utters `import.meta`, + // * The compartment will construct an `importMeta` object with a null prototype. + // * If the module descriptor has an `importMeta` property, the compartment + // will copy the own properties of the descriptor's `importMeta` over + // the compartment's `importMeta` using `[[Set]]`. + // The compartment will then begin initializing the module. + // The compartment memoizes a promise for the module exports namespace + // that will be fulfilled at a time already defined in 262 for dynamic import. + | { + record: ModuleSource, + + // Full specifier, which may differ from the `fullSpecifier` used + // as the corresponding key of the `modules` Compartment constructor option, + // or the `fullSpecifier` argument of the `loadHook` that returned + // the module descriptor. + // If present, this will be used as the referrerSpecifier for this record's + // importSpecifiers instead. + specifier?: string, + + // Properties to copy to the `import.meta` of the resulting module instance, + // if the source mentions `import.meta`. 
+ importMeta?: Object, + } + + // Describes a module by a *reusable* virtual module source. + // When the compartment _loads_ the module, it will use the bindings array + // of the virtual module source to discover shallow dependencies + // in lieu of compiling the source, and otherwise behaves identically + // to { source } descriptors. + // When the compartment _imports_ the module, it will construct a + // [[ModuleEnvironmentRecord]] and [[ModuleExportsNamespace]] based + // entirely on the bindings array of the virtual module source. + // If the virtual module source has a true `needsImport`, the + // compartment will construct an `import` that resolves + // the given import specifier relative to the module instance's full specifier + // and returns a promise for the module exports namespace of the + // imported module. + // If the virtual module source has a true `needsImportMeta` + // property, the compartment will construct an `importMeta` by the + // same process as any { source } module. + // `importMeta` will otherwise be `undefined`. + // The compartment will then execute the module by calling the + // `execute` function of the virtual module source, + // giving it the module environment record and an options bag containing + // either `import` or `importMeta` if needed. + // The compartment memoizes a promise for the module exports namespace + // that is fulfilled as already specified in 262 for dynamic import, + // where the behavior if the execute function is async is analogous to + // top-level-await for a module compiled from source. + | { + source: VirtualModuleSource, + + // See above. + specifier?: string, + + // See above. + importMeta?: Object, + } + + // Use a module source from the compartment in which this compartment + // was constructed. + // If the compartment _loads_ the corresponding module, the + // module source will be synchronously available only if it is already + // in the parent compartment's synchronous memo. 
+ // Otherwise, loading induces the parent compartment to load the module, + // but does not induce the parent compartment to load the module's transitive + // dependencies. + // This compartment then resolves the shallow dependencies according + // to its own resolveHook and loads the consequent transitive dependencies. + // If the compartment _imports_ the module, the behavior is equivalent + // to loading for the retrieved module source or virtual + // module source. + | { + source: string, + + // Properties to copy to the `import.meta` of the resulting module instance. + importMeta?: Object, + } + + // FOR SHARED MODULE INSTANCES: + + // To create an alias to an existing module instance given the full specifier + // of the module in a different compartment: + | { + namespace: string, + + // The compartment from which to draw the module instance, + // by default, this compartment. + compartment: Compartment, + } + + // To create an alias to an existing module instance given a module exports + // namespace object that can pass a brand check: + | { + namespace: ModuleExportsNamespace, + } + + // If the given namespace does not pass a ModuleExportsNamespace brandcheck, + // the compartment will not support live bindings to the properties of that + // object, but will instead produce an emulation of a module exports namespace, + // using a frozen snapshot of the own properties of the given object. + // The module exports namespace for this module does not reflect any future + // changes to the shape of the given object. + | { + // Any object that does not pass a module exports namespace brand check: + namespace: ^ModuleExportsNamespace, + }; + +type CompartmentConstructorOptions = { + // Globals to copy onto this compartment's unique globalThis. 
+ globals: Object, + + // The compartment uses the resolveHook to synchronously elevate + // an import specifier (as it appears in the source of a ModuleSource + // or bindings array of a VirtualModuleSource), to + // the corresponding full specifier, given the full specifier of the + // referring (containing) module. + // The full specifier is the memo key for the module. + resolveHook: (importSpecifier: string, referrerSpecifier: string) => string, + + // The compartment's load and import methods may need to load or execute + // additional modules. + // If the compartment does not have a module on hand, + // it will first consult the `modules` object for a descriptor of the needed module. + modules?: Record<string, ModuleDescriptor>, + + // If loading or importing a module misses the compartment's memo and the + // `modules` table, the compartment calls the asynchronous `loadHook`. + // Note: This name differs from the implementation of SES shim and a + // prior revision of this proposal, where it is currently called `importHook`. + loadHook?: (fullSpecifier: string) => Promise<ModuleDescriptor> +}; + +interface Compartment { + // Note: This single-argument form differs from earlier proposal versions, + // implementations of SES shim, and Moddable's XS, which accept three arguments, + // including a final options bag. + constructor(options?: CompartmentConstructorOptions): Compartment; + + // Accessor for this compartment's globals. + // The globalThis object initially has only "shared intrinsics", + // followed by compartment-specific "eval", "Function", "Module", and + // "Compartment", followed by any properties transferred from the "globals" + // constructor option with the semantics of Object.assign. + globalThis: Object, + + // Evaluates a program using this compartment's associated + // evaluators. 
+ // A subsequent proposal might add options to either the compartment + // constructor or the evaluate method to persist the global contour + // between calls to evaluate, making compartments a suitable + // tool for a REPL, such that a const or let binding from one evaluation + // persists to the next. + // Subsequent proposals might afford other modes, like sloppy globals mode. + // TODO is sloppyGlobalsMode only sensible in the context of the shim? + evaluate(source: string): any; + + // load causes a compartment to load module descriptors for the + // transitive dependencies of a specified module into its + // memo, but does not execute any modules. + // The load function is useful for tools like bundlers and importmap + // generators that load a module graph in an emulated host environment + // but cannot and should not emulate evaluation. + async load(fullSpecifier: string): Promise<void>; + + // import induces the asynchronous load phase and then executes + // the given module and any of its transitive dependencies that + // have not already begun initialization. + async import(fullSpecifier: string): Promise<ModuleExportsNamespace>; + + // TC53: some embedded systems hosts exclude promises and so require these + // synchronous variants of import and load. + // On hosts where these functions cannot succeed synchronously, + // they must throw an error without the side effect of initializing any + // additional modules. + // For modules that have a top-level await, these would return + // after the module first awaits but not after any subsequent promise queue + // job. + + // If a host supports both importNow and import, they must share + // a memo of full specifiers to promises for the module exports namespace. + // importNow and loadNow followed by import or load must not accidentally + // reexecute the underlying module or produce different namespaces. 
+ + importNow(fullSpecifier: string): ModuleExportsNamespace; + + loadNow(fullSpecifier: string): void; +} +``` + +## User Code + +Compartments can be implemented in user-code. +This is a partial and untested sketch. + +
+ Compartment in user code + + ```js + class Compartment { + #modules = new Map(); + #descriptors = new Map(); + #referrers = new Map(); + #globalThis = Object.create(null); + #resolveHook; + #loadHook; + #importHook; + #evaluators; + + constructor({ resolveHook, loadHook, globals }) { + this.#resolveHook = resolveHook; + this.#loadHook = loadHook; + this.#importHook = async (importSpecifier, importMeta) => { + const referrerSpecifier = this.#referrers.get(importMeta); + const fullSpecifier = this.#resolveHook( + importSpecifier, + referrerSpecifier + ); + return this.load(fullSpecifier); + }; + this.#evaluators = new Evaluators({ + globalThis: this.#globalThis, + importHook: this.#importHook + }); + // Copy eval, Function, and Module into the associated globalThis. + Object.assign(this.#globalThis, this.#evaluators); + Object.assign(this.#globalThis, globals); + } + + get globalThis() { + return this.#globalThis; + } + + evaluate(script) { + return this.#evaluators.eval(script); + } + + async #descriptor(specifier) { + let eventualDescriptor = this.#descriptors.get(specifier); + if (!eventualDescriptor) { + eventualDescriptor = this.#loadHook(specifier); + this.#descriptors.set(specifier, eventualDescriptor); + } + return eventualDescriptor; + } + + async load(specifier) { + let eventualModule = this.#modules.get(specifier); + if (!eventualModule) { + const descriptor = await this.#descriptor(specifier); + eventualModule = await this.#load(descriptor, specifier); + this.#modules.set(specifier, eventualModule); + } + return eventualModule; + } + + async #load(descriptor, specifier) { + if ("source" in descriptor) { + const importMeta = Object.create(null); + Object.assign(importMeta, descriptor.importMeta); + this.#referrers.set(importMeta, specifier); + return new this.#evaluators.Module( + descriptor.source, + { + importHook: this.#importHook, + importMeta, + } + ); + } else if ("specifier" in descriptor) { + const compartment = descriptor.compartment || this; + return compartment.load(descriptor.specifier); + } else if ("module" in 
descriptor) { + return descriptor.module; + } else if ("namespace" in descriptor) { + // Contingent on a Module.get utility that obtains a `Module` + // reference for a brand-checked namespace. + const module = Module.get(descriptor.namespace); + if (module !== undefined) { + return module; + } else { + // Contingent on virtual module source protocol amendment. + return this.#moduleForObject(descriptor.namespace); + } + } else { + throw new Error("Weird descriptor"); + } + } + + async import(specifier) { + return import(await this.load(specifier)); + } + + // Contingent on virtual module source protocol amendment. + #moduleForObject(exports) { + return new this.#evaluators.Module({ + bindings: Object.keys(exports).map(name => ({ export: name })), + execute(imports) { + Object.assign(imports, exports); + }, + }); + } + } + ``` +
+ +Since compartments can be implemented in terms of the lower-numbered layers of +this proposal, it may no longer be necessary for the language to provide the +`Compartment` constructor. +However, one of the primary motivations is being able to use compartments to +evolve import maps in user code. +For a user-code implementation of an import-map runtime to be viable, it must +be sufficiently terse to be inlined in a script block. +The above code demonstrates how much such a runtime would likely weigh. + +## Motivating Examples and Design Rationales + +### Multiple-instantiation + +This example illustrates the use of a new compartment to support multiple +instantiation of modules, reusing the host's compartment and module source +memos as a cache. +This example creates five instances of the example module and its transitive +dependencies. + +```js +for (let i = 0; i < 5; i += 1) { + new Compartment().import('https://example.com/example.js'); +} +``` + +Assuming that the language separately adopted hypothetical `import static` +syntax to defer execution but load a module and its transitive dependencies, a +bundler would be able to observe the need to capture the example module and its +transitive dependencies, such that the *only* instances are in guest compartments. + +```js +import static 'https://example.com/example.js'; +for (let i = 0; i < 5; i += 1) { + new Compartment().import('https://example.com/example.js'); +} +``` + +### Virtualized web compartment + +This example illustrates a very reductive emulation of a web-based compartment. +The module specifier domain is strictly URLs and import specifiers are resolved +relative to the referrer module specifier using URL resolution. +The compartment populates `import.meta.url` with the response URL. + +The compartment also ensures that the import specifiers of whatever module is +loaded get resolved relative to the physical location of the resource. 
+If the response URL shows that the fetch followed redirects, the `loadHook` +returns a reference (`{instance: response.url}`) to the actual module instead +of returning a record. +The compartment then follows up with a request for the redirected location. + +```js +const compartment = new Compartment({ + resolveHook(importSpecifier, referrerSpecifier) { + return new URL(importSpecifier, referrerSpecifier).href; + }, + async loadHook(url) { + const response = await fetch(url, { redirect: 'manual' }); + if (response.url !== url) { + return { instance: response.url }; + } + const source = await response.text(); + return { + source: new ModuleSource(source), + importMeta: { url: response.url }, + }; + }, +}); +await compartment.import('https://example.com/example.js'); +``` + +By returning an alias module descriptor, the compartment can ensure +that requests for both the request URL and the response URL refer +to the canonicalized module. + +--- + +For the same intended effect but a single fetch, we might +alternately use a `specifier` property of record module descriptors. +In the following example, both the request URL and the response URL +would realize cache keys in the compartment. + +```js +const compartment = new Compartment({ + resolveHook(importSpecifier, referrerSpecifier) { + return new URL(importSpecifier, referrerSpecifier).href; + }, + async loadHook(url) { + const response = await fetch(url); + const source = await response.text(); + return { + source: new ModuleSource(source), + specifier: response.url, + importMeta: { url: response.url }, + }; + }, +}); +await compartment.import('https://example.com/example.js'); +``` + +So, we guide the host to instead return the new cache key so the compartment +can memoize the `loadHook`, sending a single request for the canonicalized +module specifier. 
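As a rough illustration of the memoization described above, the following sketch keeps one memo of promises keyed on every requested specifier and one memo of settled descriptors keyed on the canonical specifier, so that a request URL and its response URL end up sharing one record. The helper name `createLoadMemo` and the two memo names are hypothetical and not part of this proposal. A real compartment would presumably keep such memos in internal slots.

```javascript
// Hypothetical helper, not part of the proposal: wraps a loadHook so that
// every specifier leading to the same canonical specifier shares one record.
function createLoadMemo(loadHook) {
  const promiseMemo = new Map(); // requested specifier -> promise for a record
  const recordMemo = new Map(); // canonical specifier -> settled record

  return function load(specifier) {
    let promise = promiseMemo.get(specifier);
    if (promise === undefined) {
      promise = Promise.resolve(loadHook(specifier)).then(descriptor => {
        // A descriptor may name its own canonical specifier.
        const canonical = descriptor.specifier ?? specifier;
        // The first descriptor to settle for a canonical specifier wins.
        if (!recordMemo.has(canonical)) {
          recordMemo.set(canonical, descriptor);
        }
        const record = recordMemo.get(canonical);
        // Alias the canonical specifier so later loads skip the loadHook.
        if (!promiseMemo.has(canonical)) {
          promiseMemo.set(canonical, Promise.resolve(record));
        }
        return record;
      });
      promiseMemo.set(specifier, promise);
    }
    return promise;
  };
}
```

With this shape, two concurrent calls for different specifiers whose descriptors report the same `specifier` resolve to the same object, and a later call for the canonical specifier is answered from the memo without invoking the `loadHook` again.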
+ +A design tension with the `specifier` property is that it invites a race +between two concurrent `loadHook` calls for different specifiers that +converge on the same response specifier. +Implementations must take care to ensure that the module record memo and the +module record *promise* memo point to the same record for a given ultimate full +specifier. + +### Virtualized Node.js compartment + +This example illustrates a very reductive emulation of a Node.js compartment. +As in a web-based loader, the domain of full module specifiers is fully +qualified URLs. + +However, Node.js import specifiers are rarely fully qualified URLs. +Instead, we distinguish *relative module specifiers* (those starting with `.` +or `..` path components) from *absolute module specifiers* (all others). +In turn, absolute module specifiers can refer either to Node.js built-in +modules or links to modules in other *packages*. + +> For the purpose of this proposal, we draw a distinction between the +> Node.js-specific terms *relative* and *absolute*, and the +> Compartment-specific terms *full module specifier* (a suitable memo key in a +> compartment) and *import module specifier* (a specifier as might appear in a +> static import or export statement or be passed to dynamic `import`, which +> must be promoted to a full specifier in the context of the full specifier +> of the referrer module, the *referrer specifier*). + +The `resolveHook` is synchronous, and as such is not in a position +to search the file system for a package in an ancestor `node_modules` directory +that matches the prefix of the absolute import specifier, and failing that, +to indicate a built-in Node.js module. +So, instead, the `resolveHook` produces a URL based on the referrer specifier +and carries the import specifier in the fragment. + +For the purposes of this example, we do not consider the case that an import +specifier may be a fully qualified URL. 
+ +```js +import url from 'node:url'; +import fs from 'node:fs/promises'; + +const compartment = new Compartment({ + resolveHook(importSpecifier, referrerSpecifier) { + if (importSpecifier.startsWith('./') || importSpecifier.startsWith('../')) { + return new URL(importSpecifier, referrerSpecifier).href; + } else if (importSpecifier.startsWith('node:')) { + return importSpecifier; + } else { + return new URL(`#${importSpecifier}`, referrerSpecifier).href; + } + }, + async loadHook(fullSpecifier) { + const { protocol, pathname, hash } = new URL(fullSpecifier); + if (protocol === 'node:') { + return { namespace: await import(fullSpecifier) }; + } + // The `hash` of a URL is the empty string when there is no fragment. + if (hash !== '') { + const packageDescriptor = findDescriptor( + + } + const path = url.fileURLToPath(fullSpecifier); + const link = await fs.readlink(path).catch(error => { + if (error.code === 'EINVAL') { + return undefined; + } else { + throw error; + } + }); + if (link !== undefined) { + return { + } + const source = await fs.readFile(path, 'utf8'); + return { + source: new ModuleSource(source), + importMeta: { url: fullSpecifier }, + }; + }, +}); +await compartment.import('file:///usr/share/node_modules/example/example.js'); +``` + +### Bundling or archiving + +Compartments can be employed to virtualize a foreign environment and generate +bundles, archives, or other stored forms of a program for transmission and +deferred execution. + +This first snippet is a minimal bundler. +It differs from the first minimal example only in that it generates a `sources` +Map and calls `load` instead of `import`, ensuring we do not attempt to run any +of the loaded code locally. 
+ +```js +const sources = new Map(); +const compartment = new Compartment({ + resolveHook(importSpecifier, referrerSpecifier) { + return new URL(importSpecifier, referrerSpecifier).href; + }, + async loadHook(fullSpecifier) { + const response = await fetch(fullSpecifier); + const source = await response.text(); + sources.set(fullSpecifier, source); + return { + source: new ModuleSource(source), + importMeta: { url: response.url }, + }; + }, +}); +await compartment.load('https://example.com/example.js'); +``` + +Then, we presumably serialize the `sources` map and recreate it in another +environment. +This next snippet uses the `sources` to reconstruct the original compartment's +module graph and execute it. + +```js +const evaluator = new Compartment({ + resolveHook(importSpecifier, referrerSpecifier) { + return new URL(importSpecifier, referrerSpecifier).href; + }, + loadHook(fullSpecifier) { + const source = sources.get(fullSpecifier); + if (source === undefined) { + throw new Error('Assertion failed: incomplete sources'); + } + return { + source: new ModuleSource(source), + importMeta: { url: fullSpecifier }, + }; + }, +}); +await evaluator.import('https://example.com/example.js'); +``` + +### Inter-compartment linkage + +One motivating use of compartments is to isolate Node.js-style packages and +limit their access to powerful modules and globals, to mitigate software supply +chain attacks. +With such an application, we would construct a special compartment for each package +and allow compartments to link modules across compartment boundaries. + +In this trivial example, we construct a pair of compartments, `even` and `odd`, +which in turn contain mutually dependent `even` and `odd` modules that +participate in a dependency cycle. +For simplicity, the domain of module specifiers is exactly the names `even` and +`odd`, and these compartments do not support resolution. 
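For orientation, the two modules in this example compute parity by mutual recursion. The following sketch is only an illustration and involves no compartments: it places both functions in a single scope to show the behavior the linked modules are expected to reproduce for non-negative integers.

```javascript
// Same bodies as the default exports of the `even` and `odd` modules,
// collapsed into one scope for illustration.
const isEven = n => n === 0 || isOdd(n - 1);
const isOdd = n => n !== 0 && isEven(n - 1);

console.log(isEven(10)); // true
console.log(isOdd(10)); // false
```

Each function refers to the other only when it is called, which is why the cycle between the two modules is harmless once both have been linked.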
+ +These compartments use `{ instance, compartment }` module descriptors to indicate +linkage across compartment boundaries. + +```js +const even = new Compartment({ + resolveHook: specifier => specifier, + loadHook: async specifier => { + if (specifier === 'even') { + return { source: new ModuleSource(` + import isOdd from 'odd'; + export default n => n === 0 || isOdd(n - 1); + `) }; + } else if (specifier === 'odd') { + return { instance: specifier, compartment: odd }; + } else { + throw new Error(`No such module ${specifier}`); + } + }, +}); + +const odd = new Compartment({ + resolveHook: specifier => specifier, + loadHook: async specifier => { + if (specifier === 'odd') { + return { source: new ModuleSource(` + import isEven from 'even'; + export default n => n !== 0 && isEven(n - 1); + `) }; + } else if (specifier === 'even') { + return { instance: specifier, compartment: even }; + } else { + throw new Error(`No such module ${specifier}`); + } + }, +}); +``` + +An alternative design that Agoric's SES shim and XS's native Compartment +explored used module exports namespace objects as handles that could be passed +between compartment hooks. +However, to support the bundler use case, it became necessary to add a method +that could get a module exports namespace object for a module that had not yet +been loaded (`compartment.module(specifier)`), much less instantiated or +executed. +The invention of a module descriptor allowed us to remove this complication, +among others: the `moduleMapHook` became superfluous since both the `modules` +constructor option and the `loadHook` could use module descriptors instead. + +### Linking with a virtual module source (JSON example) + +To support non-JavaScript languages, a compartment provides a `loadHook` that +returns virtual module source implementations. +This example virtual module source declares its bindings (equivalent to +`export default` in this case) and provides an executor. 
+The executor receives a module environment record according to the shape +declared in its bindings. +This compartment makes the simplifying assumption that all modules are JSON. +A more elaborate version of this example would switch on the response MIME type +and account for import assertions. + +```js +const compartment = new Compartment({ + resolveHook(importSpecifier, referrerSpecifier) { + return new URL(importSpecifier, referrerSpecifier).href; + }, + async loadHook(fullSpecifier) { + const response = await fetch(fullSpecifier); + const text = await response.text(); + const source = { + bindings: [ + { export: 'default' }, + ], + execute(env) { + env.default = JSON.parse(text); + } + }; + return { source }; + }, +}); +await compartment.import('https://example.com/example.json'); +``` + +### Export Aliases and Module Imports Namespace + +This example contrasts the properties of a module imports namespace and a module +exports namespace when bindings contain aliases. + +```js +const compartment = new Compartment({ + resolveHook(importSpecifier, referrerSpecifier) { + return new URL(importSpecifier, referrerSpecifier).href; + }, + loadHook(fullSpecifier) { + const source = { + bindings: [ + {export: 'internal', as: 'external'}, + ], + execute(env) { + env.internal = 42; // <---- + } + }; + return { source }; + }, +}); +const fullSpecifier = 'https://example.com/example.js'; +await compartment.load(fullSpecifier); +const { external } = compartment.importNow(fullSpecifier); +// ^----- +``` + +### Virtual module source reexports + +This example illustrates how a virtual module source can simply +reexport another module with no special logic in an executor. +This example makes the simplifying assumption that the compartment +does not support relative module specifiers and that module specifiers +are arbitrary names. 
+ +```js +const compartment = new Compartment({ + resolveHook(specifier) { + return specifier; + }, + loadHook(specifier) { + switch (specifier) { + case 'alex': + return { namespace: { a: 10, b: 20, c: 30 } }; + case 'blake': + return { source: { bindings: [{ exportAllFrom: 'alex' }] } }; + } + }, +}); + +const { a, b, c } = await compartment.import('blake'); +``` + +### Thenable Module Hazard + +An exported value named `then` can be statically imported, but dynamic import +confuses the module namespace for a thenable object. +The resolution of the promise returned by dynamic import, in this case, is the +eventual resolution of the thenable module. +And the eventual resolution is unlikely to be an intended effect. + +Consider `thenable.js`: + +```js +export function then(resolve) { + resolve(42); +} +``` + +A neighboring module might dynamically import this. + +```js +import('./thenable.js').then((x) => { + // x will be 42 in this case, not a module namespace object with a then + // function. +}); +``` + +This is the behavior of a dynamic import today, despite it being surprising. + +We have chosen to embrace this hazard since it would be worse to have +dynamic import and compartment import behave differently. + +However, with `compartment.importNow`, a program can mitigate this hazard. +With `importNow`, the following program will not invoke the `then` +function exported by `'./thenable.js'`. + +```js +await compartment.load('./thenable.js'); +const thenableNamespace = compartment.importNow('./thenable.js'); +``` + +## Design Questions + +### User code or native code + +There are some reasons to make native Compartments that are not fully addressed +by the lower-level primitives out of which they can be implemented in user +code. + +1. A native implementation may be able to avoid reifying some intermediate + objects, which may be important for embedded systems. +2. A higher-level API will be more approachable to a more casual user. +3. 
The runtime for a bundler in a web page might be considerably lighter in + terms of Compartment than it might be in terms of the constituent objects. + Bundler runtimes need to be as small as possible to meet the needs of + webpage delivery performance. + +[browserify]: https://browserify.org/ +[import-map]: https://github.com/WICG/import-maps +[lava-moat]: https://github.com/LavaMoat/LavaMoat +[node-hmr]: https://github.com/nodejs/node/issues/40594 +[ses-proposal]: https://github.com/tc39/proposal-ses +[ses-shim]: https://github.com/endojs/endo/tree/master/packages/ses +[AOT-SES]: https://github.com/DimensionDev/aot-secure-ecmascript +[webpack]: https://webpack.js.org/ +[xs-compartments]: https://blog.moddable.com/blog/secureprivate/ +[vm-context]: https://nodejs.org/api/vm.html#vm_vm_createcontext_contextobject_options +[redirect-manual]: https://fetch.spec.whatwg.org/#concept-filtered-response-opaque-redirect diff --git a/README.md b/README.md index 9daec55..7b3e15f 100644 --- a/README.md +++ b/README.md @@ -1,7 +1,5 @@ # Compartments -Evaluators for modules. - **Stage**: 1 **Champions**: @@ -18,740 +16,67 @@ Evaluators for modules. * Bradley Farias, GoDaddy * Jean-Francois Paradis, Salesforce -Agoric's [SES shim][ses-shim], Sujitech's [ahead-of-time-SES][AOT-SES] and -Moddable's [XS][xs-compartments] are actively vetting this proposal -as a shim and native implementation respectively (2022). -Most activity toward advancing this proposal occurs on those projects. - -## Synopsis - -Provide a mechanism for evaluating modules from ECMAScript module source code -and virtualizing module loader host behaviors. - -This proposal is a [SES Proposal][ses-proposal] milestone. - -## Motivation - -Many ECMAScript module behaviors are defined by the host. -The language needs a mechanism to allow programs running on one host to fully -emulate or virtualize the module loader behaviors of another host. 
- -Module loader behaviors defined by hosts include: - -* resolving a module import specifier to a full specifier, -* locating the source for a full module specifier, -* canonicalizing module specifiers that refer to the same module instance, -* populating the `import.meta` object. - -For example, on the web we can expect a URL to be a suitable full module -specifier, and for every module specifier to correspond to a URL. -We can expect the canonicalized module specifier to be reflected as -`import.meta.url`. -In Node.js, we can also expect import specifiers that are not fully qualified -URLs to receive special treatment. -However, in Moddable's XS engine, we can expect a module specifier -to resemble a UNIX file system path and not have a corresponding URL. - -We can also expect to have only one module instance per canonical module -specifier in a given Realm, and for `import(specifier)` to be idempotent -for the lifetime of a Realm. -Tools that require separate module memos are therefore compelled to create -realms either using Node.js's [VM context][vm-context] or `` and -[content security policies][csp] rather than a lighter-weight mechanism, -and consequently suffer identity discontinuties between instances from -different realms. - -Tools that will benefit from the ability to have multiple module graphs -in a single realm include: - -* bundlers ([Browserify][browserify], [WebPack][webpack], [Parcel][parcel], - &c), virtualize loading but not evaluation of module graphs and emulate other - host environments, like a Node.js program emulating a web browser. -* import mappers ([import-map][import-map]) like bundlers need to be able to - collect transitive dependencies according to ECMAScript language and specific - host behaviors. - A ECMAScript native module loader interface would expedite evolution of import map - runtimes in JavaScript. 
-* hot module replacement (HMR) systems (WebPack, SnowPack, &c), which need the - ability to instantiate new module graphs when dependencies change and the - ability to bequeath subgraphs to new graphs. - * Node.js [defers][node-hmr] to ECMAScript to provide a module loader - interface to aid HMR. -* persistent testing apparatuses ([Jest][jest]), because a persistent service - reinstantiates whole module graphs to reconstruct tests and test subjects. - * Jest currently resorts to exploiting Node.js's [vm][vm-context] module to - instantiate separate realms and attempts ([and - fails][jest-ses-interaction]) to provide the illusion of a single realm by - patching client realms with some of the intrinsics of the host realm. -* emulators ([JSDom][jsdom]) in which the emulated artifact may need a separate - module memo from the surrounding realm. -* sub-realm sandboxes ([SES][ses] and [LavaMoat][lava-moat]) that virtualize - evaluating guest modules and limit access to globals and built-in modules. - This proposal prepares for the SES proposal to introduce `lockdown`, which - isolates all evaluators, including `eval`, `Function`, and this `Compartment` - module evaluator. That proposal will introduce the concern of per-compartment - globals and hardened shared intrinsics. - -Defining a module loader in the language also improves the language's ability -to evolve. -For example, a module loader interface that accounts for linking "virtual" -modules that are not JavaScript facilitates easier experimentation with linkage -against languages like WASM. -For another, a module loader interface allows for user space experimentation -with the notion of [import maps][import-map]. - -Defining a module loader in the language also provides valuable insight to the -design of every language feature that touches upon modules, and every new -module system feature adds uncertainty to the eventual inclusion of a module -loader to the language. 
- -One such insight is that module blocks will benefit from the notion of a module -descriptor as defined by this proposal. Module blocks roughly correspond to -compiled sources and are consequently not coupled to a particular host environment. -A module descriptor is necessary to carry properties of a module not captured -in the source, like the module referrer specifier and how to populate `import.meta`. - -Additionally, having a module loader interface is a prerequisite for shimming built-in -modules. - -### Sketch - -Below is a rough sketch of potential interfaces. - -* A "module instance" consists of a "module exports namespace", a "module - environment record", and a "static module record". -* A "module exports namespace" is an exotic object that represents the exported - namespace of a module as already specified in ECMA 262. - This proposal does not alter module exports namespaces. -* A "module environment record" is the scope into which a module imports names - from other modules and exports names to other modules. - This proposal reifies a module environment record as an exotic object that - virtual modules can use to implement bindings. - An `import name as alias` binding will have a property with the lexically bound alias. - An `export name as alias` binding will have a property with the lexically bound name, - whereas the module exports namespace will have a property with the alias. - The environment record does not contain a property for any names that are - imported and reexported without a lexical binding. - -```ts -type ModuleExportsNamespace = Record; -type ModuleEnvironmentRecord = Record; - -// Bindings reflect the `import` and `export` statements of a module. -// A statement with multiple clauses decomposes into multiple bindings. 
-type Binding = - - // import { X } from 'X'; - // import { X as Y } from 'X'; - { import: string, as?: string, from: string } | - - // export { X } - // export { X as Y } - // export { X } from 'X'; - // export { X as Y } from 'X'; - { export: string, as?: string, from?: string } | - - // import * as X from 'X'; - { importAllFrom: string, as: string } | - - // export * from 'X'; - // export * as X from 'X'; - { exportAllFrom: string, as?: string }; - -// Compartments support ECMAScript modules and linkage to other kinds of modules, -// notably allowing for JSON or WASM. -// VirtualStaticModuleRecord is a *protocol* that compartments recognize if -// the `record` property of a ModuleDescriptor is neither a string nor -// an object that passes a StaticModuleRecord brand check. -// These amy provide an initializer function and may declare bindings for -// imported or exported names. -// The bindings correspond to the equivalent `import` and `export` declarations -// of an ECMAScript module. -type VirtualStaticModuleRecord = { - // Indicates the import and export bindings the module has - // between its module environment record, module exports namespace, - // and its dependencies. - bindings?: Array, - - // Initializes the module if it is imported. - // Initialize may return a promise, indicating that the module uses - // the equivalent of top-level-await. - // XXX The compartment will leave that promise to dangle, so an eventual - // rejection will necessarily go unhandled. - initialize?: (environment: ModuleEnvironmentRecord, { - import?: (importSpecifier: string) => Promise, - importMeta?: Object - }) => void, - - // Indicates that initialize needs to receive a dynamic import function that - // closes over the referrer module specifier. - needsImport?: boolean, - - // Indicates that initialize needs to receive an importMeta. 
- needsImportMeta?: boolean, -}; - -// Static module records are an opaque token representing the compilation -// of a module that can be reused across multiple compartments. -interface StaticModuleRecord { - // Static module records can be constructed from source. - // XS allows virtual module records and source descriptors to - // be precompiled as well. - constructor(source: string); - - // Static module records reflect their bindings for information only. - // Compartments use internal slots for the compiled code and bindings. - bindings: Array; - - // Indicates that initialize needs to receive a dynamic import function that - // closes over the referrer module specifier. - needsImport?: boolean, - - // Indicates that initialize needs to receive an importMeta. - needsImportMeta?: boolean, -} - -// A ModuleDescriptor captures a static module record and per-compartment metadata. -type ModuleDescriptor = - - // Describes a module by referring to a *reusable* record of the compiled - // source. - // The static module record captures a static analysis - // of the `import` and `export` bindings and whether the source ever utters - // the keywords `import` or `import.meta`. - // The compartment will then asynchronously _load_ the shallow dependencies - // of the module and memoize the promise for the - // result of loading the module and its transitive dependencies. - // If the compartment _imports_ the module, it will generate and memoize - // a module instance and initialize the module. - // To initialize the module, the compartment will construct an `import.meta` object. - // If the source utters `import.meta`, - // * The compartment will construct an `importMeta` object with a null prototype. - // * If the module descriptor has an `importMeta` property, the compartment - // will copy the own properties of the descriptor's `importMeta` over - // the compartment's `importMeta' using `[[Set]]`. - // The compartment will then begin initializing the module. 
- // The compartment memoizes a promise for the module exports namespace - // that will be fulfilled at a time already defined in 262 for dynamic import. - | { - record: StaticModuleRecord, - - // Full specifier, which may differ from the `fullSpecifier` used - // as the corresponding key of the `modules` Compartment constructor option, - // or the `fullSpecifier` argument of the `loadHook` that returned - // the module descriptor. - // If present, this will be used as the referrerSpecifier for this record's - // importSpecifiers instead. - specifier?: string, - - // Properties to copy to the `import.meta` of the resulting module instance, - // if the source mentions `import.meta`. - importMeta?: Object, - } - - // Describes a module by a *reusable* virtual static module record. - // When the compartment _loads_ the module, it will use the bindings array - // of the virtual static module record to discover shallow dependencies - // in lieu of compiling the source, and otherwise behaves identically - // to { source } descriptors. - // When the compartment _imports_ the module, it will construct a - // [[ModuleEnvironmentRecord]] and [[ModuleExportsNamespace]] based - // entirely on the bindings array of the virtual static module record. - // If the virtual static-module-record has a true `needsImport`, the - // compartment will construct an `import` that resolves - // the given import specifier relative to the module instance's full specifier - // and returns a promise for the module exports namespace of the - // imported module. - // If the virtual static-module-record has a true `needsImportMeta` - // property, the compartment will construct an `importMeta` by the - // same process as any { source } module. - // `importMeta` will otherwise be `undefined`. 
- // The compartment will then initialize the module by calling the - // `initialize` function of the virtual static module record, - // giving it the module environment record and an options bag containing - // either `import` or `importMeta` if needed. - // The compartment memoizes a promise for the module exports namespace - // that is fulfilled as already specified in 262 for dynamic import, - // where the behavior if the initialize function is async is analogous to - // top-level-await for a module compiled from source. - | { - record: VirtualStaticModuleRecord, - - // See above. - specifier?: string, - - // See above. - importMeta?: Object, - } - - // Use a static module record from the compartment in which this compartment - // was constructed. - // If the compartment _loads_ the corresponding module, the static - // module record will be synchronously available only if it is already - // in the parent compartment's synchronous memo. - // Otherwise, loading induces the parent compartment to load the module, - // but does not induce the parent compartment to load the module's transitive - // dependencies. - // This compartment then resolves the shallow dependencies according - // to its own resolveHook and loads the consequent transitive dependencies. - // If the compartment _imports_ the module, the behavior is equivalent - // to loading for the retrieved static module record or virtual static - // module record. - | { - record: string, - - // Properties to copy to the `import.meta` of the resulting module instance. - importMeta?: Object, - } - - // FOR SHARED MODULE INSTANCES: - - // To create an alias to an existing module instance in this compartment: - | { - // A full specifier for another module in this compartment. - instance: string, - } - - // To create an alias to an existing module instance given the full specifier - // of the module in a different compartment: - | { - instance: string, - - // A compartment instance. 
- // We do not preclude the possibility that compartment is the same compartment, - // as that is possible to achieve in a `loadHook`. - compartment: Compartment, - } - - // To create an alias to an existing module instance given a module exports - // namespace object that can pass a brand check: - | { - namespace: ModuleExportsNamespace, - } - - // If the given namespace does not pass a ModuleExportsNamespace brandcheck, - // the compartment will not support live bindings to the properties of that - // object, but will instead produce an emulation of a module exports namespace, - // using a frozen snapshot of the own properties of the given object. - // The module exports namespace for this module does not reflect any future - // changes to the shape of the given object. - | { - // Any object that does not pass a module exports namespace brand check: - namespace: ^ModuleExportsNamespace, - }; - -type CompartmentConstructorOptions = { - // Every Compartment has a reference to a global environment record that in - // turn contains a new globalThis object, global contour, and three - // specialized intrinsic evaluators: eval, Function, and Compartment instances. - // The new globalThis object contains a subset of the JavaScript language - // intrinsics (to be defined in this proposal) and other globals must be - // "endowed" to the compartment explicitly with the `globals` option. - // All of these evaluators close over the compartment's global environment - // record such that they evaluate code in that global environment. - // When borrowGlobals is false, a the new Compartment gets a new global - // environment record. - // When borrowGlobals is true, the new Compartment will have the - // same global environment record as associated with the Compartment - // constructor used to construct the compartment. 
- // To borrow the globals of an arbitrary compartment, use that compartment's - // Compartment constructor, like - // new compartment.globalThis.Compartment({ borrowGlobals: true }). - borrowGlobals: boolean, - - // Globals to copy onto this compartment's unique globalThis. - // Constructor options with globals and borrowGlobals: true would be incoherent and - // effect an exception. - globals: Object, - - // The compartment uses the resolveHook to synchronously elevate - // an import specifier (as it appears in the source of a StaticModuleRecord - // or bindings array of a VirtualStaticModuleRecord), to - // the corresponding full specifier, given the full specifier of the - // referring (containing) module. - // The full specifier is the memo key for the module. - // TODO This proposal does not yet account for import assertions, - // which evidence shows are actually import type specifiers as they were - // originally designed and may need to also participate in the memo key. - resolveHook: (importSpecifier: string, referrerSpecifier: string) => string, - - // The compartment's load and import methods may need to load or initialize - // additional modules. - // If the compartment does not have a module on hand, - // it will first consult the `modules` object for a descriptor of the needed module. - modules?: Record, - - // If loading or importing a module misses the compartment's memo and the - // `modules` table, the compartment calls the asynchronous `loadHook`. - // Note: This name differs from the implementation of SES shim and a - // prior revision of this proposal, where it is currently called `importHook`. - loadHook?: (fullSpecifier: string) => Promise -}; - -interface Compartment { - // Note: This single-argument form differs from earlier proposal versions, - // implementations of SES shim, and Moddable's XS, which accept three arguments, - // including a final options bag. 
- constructor(options?: CompartmentConstructorOptions): Compartment; - - // Accessor for this compartment's globals. - // If borrowGlobals is true, globalThis is object identical to the incubating - // compartment's globalThis. - // If borrowGlobals is false, globalThis is a unique, ordinary object - // intrinsic to this compartment. - // The globalThis object initially has only "shared intrinsics", - // followed by compartment-specific "eval", "Function", and "Compartment", - // followed by any properties transferred from the "globals" - // constructor option with the semantics of Object.assign. - globalThis: Object, - - // Evaluates a program program using this compartment's associated global - // environment record. - // A subsequent proposal might add options to either the compartment - // constructor or the evaluate method to persist the global contour - // between calls to evaluate, making compartments a suitable - // tool for a REPL, such that a const or let binding from one evaluation - // persists to the next. - // Subsequent proposals might afford other modes, like sloppy globals mode. - // TODO is sloppyGlobalsMode only sensible in the context of the shim? - evaluate(source: string): any; - - // load causes a compartment to load module descriptors for the - // transitive dependencies of a specified module into its - // memo, but does not initialize any modules. - // The load function is useful for tools like bundlers and importmap - // generators that load a module graph in an emulated host environment - // but cannot and should not emulate evaluation. - async load(fullSpecifier: string): Promise - - // import induces the asynchronous load phase and then initializes - // the given module and any of its transitive dependencies that - // have not already begun initialization. - async import(fullSpecifier: string): Promise; - - // TC53: some embedded systems hosts exclude promises and so require these - // synchronous variants of import and load. 
- // On hosts where these functions cannot succeed synchronously, - // they must throw an error without the side effect of initializing any - // additional modules. - // For modules that have a top-level await, these would return - // after the module first awaits but not after any subsequent promise queue - // job. - - // If a host supports both importNow and import, they must share - // a memo of full specifiers to promises for the module exports namespace. - // importNow and loadNow followed by import or load must not accidentally - // reinitialize the underlying module or produce different namespaces. - - importNow(fullSpecifier: string): ModuleExportsNamespace; - - loadNow(fullSpecifier: string): void; -} -``` - -## Motivating Examples and Design Rationales - -### Multiple-instantiation - -This example illustrates the use of a new compartment to support multiple -instantiation of modules, reusing the host's compartment and static module record -memos as a cache. -This example creates five instances of the example module and its transitive -dependencies. - -```js -for (let i = 0; i < 5; i += 1) { - new Compartment().import('https://example.com/example.js'); -} -``` - -Assuming that the language separately adopted hypothetical `import static` -syntax to defer execution but load a module and its transitive dependencies, a -bundler would be able to observe the need to capture the example module and its -transitive dependencies, such that the *only* instances are in guest compartments. - -```js -import static 'https://example.com/example.js'; -for (let i = 0; i < 5; i += 1) { - new Compartment().import('https://example.com/example.js'); -} -``` - -### Virtualized web compartment - -This example illustrates a very reductive emulation of a web-based compartment. -The module specifier domain is strictly URLs and import specifiers are resolved -relative to the referrer module specifier using URL resolution. 
-The compartment populates `import.meta.url` with the response URL. - -The compartment also ensures that the import specifiers of whatever module is -loaded get resolved relative to the physical location of the resource. -If the response URL shows that the fetch followed redirects, the `loadHook` -returns a reference (`{instance: response.url}`) to the actual module instead -of returning a record. -The compartment then follows up with a request for the redirected location. - -```js -const compartment = new Compartment({ - resolveHook(importSpecifier, referrerSpecifier) { - return new URL(importSpecifier, referrerSpecifier).href; - }, - async loadHook(url) { - const response = await fetch(url, { redirect: 'manual' }); - if (response.url !== url) { - return { instance: response.url }; - } - const source = await response.text(); - return { - record: new StaticModuleRecord(source), - importMeta: { url: response.url }, - }; - }, -}); -await compartment.import('https://example.com/example.js'); -``` - -By returning an alias module descriptor, the compartment can ensure -that requests for both the request URL and the response URL refer -to the canonicalized module. - ---- - -For the same intended effect but a single fetch, we might -alternatively use a `specifier` property of record module descriptors. -In the following example, both the request URL and the response URL -would realize cache keys in the compartment.
- -```js -const compartment = new Compartment({ - resolveHook(importSpecifier, referrerSpecifier) { - return new URL(importSpecifier, referrerSpecifier).href; - }, - async loadHook(url) { - const response = await fetch(url); - const source = await response.text(); - return { - record: new StaticModuleRecord(source), - specifier: response.url, - importMeta: { url: response.url }, - }; - }, -}); -await compartment.import('https://example.com/example.js'); -``` - -So, we guide the host to instead return the new cache key so the compartment -can memoize the `loadHook`, sending a single request for the canonicalized -module specifier. - -A design tension with the `specifier` property is that it invites a race -between two concurrent `loadHook` calls for different specifiers that -converge on the same response specifier. -Implementations must take care to ensure that the module record memo and the -module record *promise* memo point to the same record for a given ultimate full -specifier. - - -### Bundling or archiving - -Compartments can be employed to virtualize a foreign environment and generate -bundles, archives, or other stored forms of a program for transmission and -deferred execution. - -This first snippet is a minimal bundler. -It differs from the first minimal example only in that it generates a `sources` -Map and calls `load` instead of `import`, ensuring we do not attempt to run any -of the loaded code locally. 
- -```js -const sources = new Map(); -const compartment = new Compartment({ - resolveHook(importSpecifier, referrerSpecifier) { - return new URL(importSpecifier, referrerSpecifier).href; - }, - async loadHook(fullSpecifier) { - const response = await fetch(fullSpecifier); - const source = await response.text(); - sources.set(fullSpecifier, source); - return { - record: new StaticModuleRecord(source), - meta: { url: response.url }, - }; - }, -}); -await compartment.load('https://example.com/example.js'); -``` - -Then, we presumably serialize the `sources` map and recreate it in another -environment. -This next snippet uses the `sources` to reconstruct the original compartment -module graph and execute it. - -```js -const evaluator = new Compartment({ - resolveHook(importSpecifier, referrerSpecifier) { - return new URL(importSpecifier, referrerSpecifier).href; - }, - loadHook(fullSpecifier) { - const source = sources.get(fullSpecifier); - if (source === undefined) { - throw new Error('Assertion failed: incomplete sources'); - } - return { - record: new StaticModuleRecord(source), - // No fetch response exists here, so reuse the full specifier as the URL. - meta: { url: fullSpecifier }, - }; - }, -}); -await evaluator.import('https://example.com/example.js'); -``` - -### Inter-compartment linkage - -One motivating use of compartments is to isolate Node.js-style packages and -limit their access to powerful modules and globals, to mitigate software supply -chain attacks. -With such an application, we would construct a special compartment for each package -and allow compartments to link modules across compartment boundaries. - -In this trivial example, we construct a pair of compartments, `even` and `odd`, -which in turn contain mutually dependent `even` and `odd` modules that -participate in a dependency cycle. -For simplicity, the domain of module specifiers is exactly the names `even` and -`odd`, and these compartments do not support resolution.
- -These compartments use `{ instance, compartment }` module descriptors to indicate -linkage across compartment boundaries. - -```js -const even = new Compartment({ - resolveHook: specifier => specifier, - loadHook: async specifier => { - if (specifier === 'even') { - return { record: new StaticModuleRecord(` - import isOdd from 'odd'; - export default n => n === 0 || isOdd(n - 1); - `) }; - } else if (specifier === 'odd') { - return { instance: specifier, compartment: odd }; - } else { - throw new Error(`No such module ${specifier}`); - } - }, -}); - -const odd = new Compartment({ - resolveHook: specifier => specifier, - loadHook: async specifier => { - if (specifier === 'odd') { - return { record: new StaticModuleRecord(` - import isEven from 'even'; - export default n => n !== 0 && isEven(n - 1); - `) }; - } else if (specifier === 'even') { - return { instance: specifier, compartment: even }; - } else { - throw new Error(`No such module ${specifier}`); - } - }, -}); -``` - -An alternative design that Agoric's SES shim and XS's native Compartment -explored used module exports namespace objects as handles that could be passed -between compartment hooks. -However, to support the bundler use case, it became necessary to add a method -that could get a module exports namespace object for a module that had not yet -been loaded (`compartment.module(specifier)`), much less instantiated or -initialized. -The invention of a module descriptor allowed us to remove this complication, -among others: the `moduleMapHook` became superfluous since both the `modules` -constructor option and the `loadHook` could use module descriptors instead. - -### Linking with a virtual record (JSON example) - -To support non-JavaScript languages, a compartment provides a `loadHook` that -returns virtual-static-module-record implementations. -This example virtual-static-module-record declares its bindings (equivalent to -`export default` in this case) and provides an initializer.
-The initializer receives a module environment record according to the shape -declared in its bindings. -This compartment makes the simplifying assumption that all modules are JSON. -A more elaborate version of this example will switch on the response MIME type -and account for import assertions. - -```js -const compartment = new Compartment({ - resolveHook(importSpecifier, referrerSpecifier) { - return new URL(importSpecifier, referrerSpecifier).href; - }, - async loadHook(fullSpecifier) { - const response = await fetch(fullSpecifier); - const source = await response.text(); - const record = { - bindings: [ - {export: 'default'}, - ], - initialize(env) { - env.default = JSON.parse(source); - } - }; - return { record }; - }, -}); -await compartment.import('https://example.com/example.json'); -``` - -### Thenable Module Hazard - -An exported value named `then` can be statically imported, but dynamic import -confuses the module namespace for a thenable object. -The resolution of the promise returned by dynamic import, in this case, is the -eventual resolution of the thenable module. -And the eventual resolution is unlikely to be an intended effect. - -Consider `thenable.js`: - -```js -export function then(resolve) { - resolve(42); -} -``` - -A neighboring module might dynamically import this. - -```js -import('./thenable.js').then((x) => { - // x will be 42 in this case, not a module namespace object with a then - // function. -}) -``` - -This is the behavior of a dynamic import today, despite it being surprising. - -We have chosen to embrace this hazard since it would be worse to have -dynamic import and compartment import behave differently. - -However, with `compartment.importNow`, a program can mitigate this hazard. -With `importNow`, the following program will not invoke the `then` -function exported by `'./thenable.js'`. 
- -```js -await compartment.load('./thenable.js'); -const thenableNamespace = compartment.importNow('./thenable.js'); -``` - -[browserify]: https://browserify.org/ -[import-map]: https://github.com/WICG/import-maps -[jest-ses-interaction]: https://github.com/facebook/jest/issues/11952 -[jsdom]: https://www.npmjs.com/package/jsdom -[lava-moat]: https://github.com/LavaMoat/LavaMoat -[node-hmr]: https://github.com/nodejs/node/issues/40594 -[parcel]: https://parceljs.org/ -[ses-proposal]: https://github.com/tc39/proposal-ses -[ses-shim]: https://github.com/endojs/endo/tree/master/packages/ses -[AOT-SES]: https://github.com/DimensionDev/aot-secure-ecmascript -[webpack]: https://webpack.js.org/ -[xs-compartments]: https://blog.moddable.com/blog/secureprivate/ -[vm-context]: https://nodejs.org/api/vm.html#vm_vm_createcontext_contextobject_options -[redirect-manual]: https://fetch.spec.whatwg.org/#concept-filtered-response-opaque-redirect +# Synopsis + +Compartments are a mechanism for isolating and providing limited power to +programs within a shared realm. +Each compartment shares the intrinsics of a realm but has its own set of +evaluators (`eval`, `Function`, and a new evaluator, `Module`) and its own +global object. +Having a separate global object allows each compartment to be granted access to +only those powerful objects it needs, its own isolated evaluators, powerless +constructors, and shared prototypes. + +The Compartments proposal was approved for Stage 1 (exploration of a problem) +with the charter, "to compartmentalize host behaviors". +The problem we set out to solve was excess authority flowing from global scope +and host behaviors into third-party dependencies and plugins in large +applications. +Through exploring this problem, we discovered that the bulk of the solution, by +weight, was virtualizing the EcmaScript module loader. +Provided an EcmaScript module loader, we could then build a solution for +isolating code for both scripts and modules.
+ +Over the course of two years, we refined the Compartment class to account for +the need to make and import bundles, emulate various host module specifier +namespaces, link modules between multiple compartments, and support +non-EcmaScript module languages. + +We then began working with champions of module blocks, module fragments, +deferred import, and import reflection to ensure these proposals were coherent. +From these discussions, we discovered a set of lower-level interfaces from +which compartments could be constructed in user code that were more coherent +with these other proposals. + +With that, the Compartments proposal consists of five layers: + +- [Module and ModuleSource][0]: Provide first-class `Module` and + `ModuleSource` constructors and extend dynamic import to operate on `Module` + instances. + +- [Surface Module Source Static Analysis][1]: Extend instances of + `ModuleSource` such that they reflect certain results of static analysis, + like their `import` and `export` bindings, such that tools can inspect module + graphs. + +- [Virtual Module Sources][2]: Extend the `Module` constructor such that it + accepts virtual module sources: objects that implement a protocol that is + sufficient for virtualizing the evaluation of modules in languages not + anticipated by ECMA-262 or host implementations. + +- [Evaluators][3]: Provide an `Evaluators` constructor that produces + a new `eval` function, `Function` constructor, and `Module` constructor + such that execution contexts generated from these evaluators refer + back to this set of evaluators, with a given global object and virtualized + host behavior for dynamic import in script contexts. + +- [Compartment][4]: Compartments are a high-level mechanism for isolating + and providing limited power to programs within a shared realm. + Compartments can be implemented in user code using `Evaluators`, `Module`, + and `ModuleSource`. 
+ +[0]: ./0-module-and-module-source.md +[1]: ./1-static-analysis.md +[2]: ./2-virtual-module-source.md +[3]: ./3-evaluator.md +[4]: ./4-compartment.md