Divide into low-level chapters #71

Merged
merged 19 commits into from
Jul 19, 2022
399 changes: 399 additions & 0 deletions 0-module-and-module-source.md
@@ -0,0 +1,399 @@
# First-class Module and ModuleSource

## Synopsis

Provide first-class `Module` and `ModuleSource` constructors and extend dynamic
import to operate on `Module` instances.

A `ModuleSource` represents the result of compiling ECMAScript module source
text.
A `Module` instance represents the lifecycle of an ECMAScript module and allows
virtual import behavior.
Multiple `Module` instances can share a common `ModuleSource`.

## Interfaces

### ModuleSource

```ts
interface ModuleSource {
  constructor(source: string);
}
```

Semantics: A `ModuleSource` instance gives the holder no powers.
It represents a compiled ECMAScript module and does not capture any information
beyond what can be inferred from a module's source text.
Import specifiers in a module's source text cannot be interpreted without
further information.

Note 1: `ModuleSource` does for modules what `eval` already does for scripts.
We expect content security policies to treat module sources similarly.
A `ModuleSource` instance constructed from text would not have an associated
origin.
A `ModuleSource` instance can be constructed from vetted text ([W3C Trusted
Types][trusted-types]) and host-defined import hooks may reveal module sources
that were vetted behind the scenes.

Note 2: Multiple `Module` instances can be constructed from a single `ModuleSource`,
producing one exports namespace for each imported `Module` instance.

Note 3: The internal record of a `ModuleSource` instance is immutable and
serializable.
This data can be shared without cost between realms of an agent or even agents
of an agent cluster.

### Module instances

```ts
type ImportSpecifier = string;

type ImportHook = (specifier: ImportSpecifier, importMeta: object) =>
  Promise<Module>;

interface Module {
  constructor(
    source: ModuleSource,
    options: {
      importHook?: ImportHook,
Collaborator

@kriskowal the reason why I went with 2nd and 3rd arguments instead of an options bag here is that when importHook is omitted, what is importMeta for then? It seems that they are bound, and if one is present, the other must be present, which is easier to represent via arguments.

Member Author

This is of course still an open question, but we’ll need an options bag regardless for import assertions per #37 and it’s likely that there are reasonable defaults for both of these.

Member

What if I inherit importHook from Evaluator but I want to provide a special importMeta?

      importMeta?: object,
    },
  );

  readonly source: ModuleSource;
}
```

Semantics: A `Module` has a 1-1-1-1 relationship with a ***Module Environment
Record***, a ***Module Record*** and a ***module namespace exotic object***.

The module has a lifecycle and fresh instances have not been linked,
initialized, or executed.

Invoking dynamic import on a `Module` instance advances it and its transitive
dependencies to their end state.
Consistent with dynamic import for a stringly-named module,
dynamic import on a `Module` instance produces a promise for the corresponding
***Module Namespace Object***.

Dynamic import induces calls to `importHook` for each unsatisfied dependency of
each module instance in separate events, before any dependency advances to the
link phase of its lifecycle.
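The ordering described above can be illustrated with a minimal sketch that runs without the proposed API. It assumes a hypothetical in-memory dependency table (`graph`) and plain objects standing in for module instances; `load` and `kick` are illustrative names, not part of the proposal. The point is only that every `importHook` call completes before the single `link` step is recorded.

```javascript
// Hypothetical in-memory graph: specifier -> list of imported specifiers.
const graph = { 'main.js': ['a.js', 'b.js'], 'a.js': ['b.js'], 'b.js': [] };
const events = [];

// Stand-in for a user-defined importHook; it yields once, the way a real
// hook would while fetching source text.
const importHook = async (specifier) => {
  events.push(`hook:${specifier}`);
  await null;
  return { specifier, imports: graph[specifier] };
};

// Load phase: satisfy every dependency (once per specifier) before linking.
async function load(module, loaded = new Map()) {
  for (const specifier of module.imports) {
    if (!loaded.has(specifier)) {
      loaded.set(specifier, importHook(specifier));
      const dependency = await loaded.get(specifier);
      await load(dependency, loaded);
    }
  }
}

// Stand-in for the kicker: load everything, then record a single link step.
async function kick(root) {
  await load(root);
  events.push('link');
  return events;
}

const done = kick({ specifier: 'main.js', imports: graph['main.js'] });
```

This sketch memoizes by specifier across the whole graph for brevity; in the proposal each `Module` instance keeps its own memo.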

Dynamic import within the evaluation of a `Module` also invokes the
`importHook`.

`Module` instances memoize the result of their `importHook` keyed on the given
Import Specifier.
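As a rough illustration of this memoization, the sketch below wraps an import hook with a `Map` keyed on the specifier. The class name `MemoizedImportHook` and the counting hook are hypothetical and exist only for this example; the proposal keeps the equivalent memo inside each `Module` instance.

```javascript
// Hypothetical stand-in for the per-instance memo; not the proposed Module API.
class MemoizedImportHook {
  constructor(importHook, importMeta) {
    this.importHook = importHook;
    this.importMeta = importMeta;
    this.memo = new Map(); // specifier -> promise for the dependency
  }

  load(specifier) {
    if (!this.memo.has(specifier)) {
      // Store the promise itself so concurrent requests share one hook call.
      const result = Promise.resolve(this.importHook(specifier, this.importMeta));
      this.memo.set(specifier, result);
    }
    return this.memo.get(specifier);
  }
}

let calls = 0;
const countingHook = async (specifier) => {
  calls += 1;
  return { specifier };
};

const memoized = new MemoizedImportHook(countingHook, {});
const first = memoized.load('./dep.js');
const second = memoized.load('./dep.js');
```

Both calls return the same promise, and the underlying hook runs once for `./dep.js`.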

`Module` constructors, like `Function` constructors, are bound to a realm
and evaluate modules in their particular realm.

## Examples
Collaborator

@kriskowal I think we are missing the section about Module.get(namespace) from the original gist. Without this capability, there is no way to interconnect a manually created module subgraph with the host-created graph, so it is very limiting.

Member Author

For now, I’m assuming that the deferred execution proposal or import reflection accommodate this case and intend to circle back for Module.get if necessary or helpful.

Member

What is Module.get(namespace)?

Member Author

Module.get(ModuleNamespace) => Module would reveal the Module underlying a given module exports namespace object, such that it could be returned by an importHook.


### Import Kicker

Any dynamic import function is suitable for initializing, linking, and
evaluating a module instance.
This necessarily implies advancing all of its transitive dependencies to their
terminal state, or advancing any one of them into a failed state.

```js
const source = new ModuleSource(``);
const instance = new Module(source, {
  importHook,
  importMeta: import.meta,
Collaborator

Ok, now I'm even more confused about import.meta. Here, the calling code is passing its own import.meta as the value of the importMeta option. Does this make sense?

Member

It makes sense when you want to share the importMeta with the child module (e.g. child module uses new URL(..., import.meta.url))

Member Author

Passing import.meta is sufficient in cases where the host and guest module share the same referrer and the import.meta directly states the referrer, as in import.meta.url, which is the scope of this example.

It is also possible to arrange for an importHook for which passing import.meta directly from host to guest would not be sufficient, as would be the case if importMeta is used as a key in a WeakMap to address the corresponding referrer, since import.meta is not object identical to the importMeta passed to the Module constructor or importHook. For these cases, more ceremony is necessary but the motivating cases are all supported.

});
const namespace = await import(instance);
```

### Module Idempotency

Since the Module has a bound module namespace exotic object, importing the same
instance must yield the same result:

```js
const source = new ModuleSource(``);
const instance = new Module(source, {
  importHook,
  importMeta: import.meta,
});
const namespace1 = await import(instance);
const namespace2 = await import(instance);
namespace1 === namespace2; // true
```

### Reusing ModuleSource

Module sources are backed by a shared immutable module source record
that can be instantiated multiple times, even locally.
Multiple `Module` instances can share a module source and produce
separate module namespaces.

```js
const source = new ModuleSource(``);
const instance1 = new Module(source, {
  importHook: importHook1,
  importMeta: import.meta,
});
const instance2 = new Module(source, {
  importHook: importHook2,
  importMeta: import.meta,
});
instance1 === instance2; // false
const namespace1 = await import(instance1);
const namespace2 = await import(instance2);
namespace1 === namespace2; // false
```

### Intersection Semantics with Module Blocks

Proposal: https://github.com/tc39/proposal-js-module-blocks

In relation to module blocks, we can extend the proposal to accommodate both
the concept of a module block instance and that of a module block source:

```js
const instance = module {};
instance instanceof Module;
instance.source instanceof ModuleSource;
const namespace = await import(instance);
```

To avoid needing a throw-away module-instance in order to get a module source,
we can extend the syntax:

```js
const source = static module {};
source instanceof ModuleSource;
const instance = new Module(source, {
  importHook,
  importMeta: import.meta,
});
const namespace = await import(instance);
```

### Intersection Semantics with deferred execution

It is sufficient to be able to load the source and create an instance, with the
default `importHook` and the `import.meta` of the importer, that can be
imported at any given time:

```js
import instance from 'module.js' deferred execution syntax;
instance instanceof Module;
instance.source instanceof ModuleSource;
const namespace = await import(instance);
```

If the goal is to also control the `importHook` and the `importMeta` of the
importer, then a new syntax can be provided to only get the `ModuleSource`:

```ts
import source from 'module.js' static source syntax;
Collaborator

Question: If we have the second, giving us the ModuleSource, does that subsume the need for the first?

If so, and given the static module {} syntax above, could this one be

import source from static 'module.js';

Isn't this exactly the same issue as module reflection? However it is spelled, doesn't the ability to import the ModuleSource subsume all of these?

Member Author

Yes, importing module sources overlaps with import reflection and importing module instances overlaps with dynamic import. There are many ways that Module Harmony can unfold that cover all the motivating cases, but I’m in favor of separate module and static module syntaxes for both module blocks and import (by whatever spelling) , to cover both deferred execution and module source reflection.

import module from 'module';
module instanceof Module // true, maybe pre-loaded, must not be executed

import static module from 'module';
// module is a module source, albeit `ModuleSource`, `WebAssembly.Module`, or a virtual module source
// maybe virtual module sources are instances of `ModuleSource` at this juncture

(module {}) instanceof Module
(module {}).source instanceof ModuleSource
(static module {}) instanceof ModuleSource

Collaborator

@erights having only the static version of the syntax would be fine IMO, but having access to the "dynamic" version (which is basically static + current import.meta and default importHook) is also very useful and intuitive IMO.

source instanceof ModuleSource;
const instance = new Module(source, {
  importHook,
  importMeta: import.meta,
});
const namespace = await import(instance);
```

This is important because it is analogous to module blocks, but instead of
inline source, it is a source that must be fetched.

### Intersection Semantics with import.meta.resolve()

Proposal: https://github.com/whatwg/html/pull/5572

```ts
const importHook = async (specifier, importMeta) => {
  const url = importMeta.resolve(specifier);
Collaborator

The import.meta object is ambient and reachable by syntax. I am concerned about how much power it would convey if it has its own resolve function. Would this function cause I/O?

Member

I believe import.meta.resolve is already going to ship in both browsers and node - in browsers I believe it does not do i/o, because it's using the URL cache, and in node I believe it does do i/o, since it checks the disk.

Member

We can't limit what host put on the importMeta, why is that a question? 🤔

Member Author

@erights This is a reductive example. The importMeta can be a key in a side table to the underlying referrer, fully hiding resolution. Also, importMeta will not be object identical to import.meta and not shared by multiple modules. It’ll be copied onto import.meta if needed.

Collaborator

@erights there is a difference between the importMeta provided to the instance of Module, and the import.meta accessible from the evaluation of the source associated to the instance of Module. This is well described in the spec today. In essence, the developer is responsible for providing an importMeta object that can be passed to any subsequent invocation of the associated importHook, but when the source accesses import.meta, a new object is created, and it is populated by the importMeta properties. This means you can easily virtualize what ends up being accessible from within the source via import.meta syntax.

  const response = await fetch(url);
  const sourceText = await response.text();
  const source = new ModuleSource(sourceText);
  return new Module(source, {
    importHook,
    importMeta: createCustomImportMeta(url),
  });
};

const source = new ModuleSource(`export { foo } from './foo.js'`);
const instance = new Module(source, {
  importHook,
  importMeta: import.meta,
});
const namespace = await import(instance);
```

In the example above, we reuse the same `importHook` function for two
instances: the one constructed from `source` and the one for its dependency
with the specifier `./foo.js`. When the kicker `import(instance)` is executed,
the `importHook` is invoked once, with `./foo.js` as the `specifier` argument
and, as the `importMeta` argument, the value of the `import.meta` associated
with the kicker itself. As a result, the `specifier` can be resolved against
the provided `importMeta` to calculate the `url`, fetch the source, and create
a new `Module` for the new source. This new instance opts to reuse the same
`importHook` function while constructing a new `importMeta` object. It is
important to notice that the `importMeta` object has two purposes: to be
referenced by syntax in the source text (via `import.meta`) and to be passed
to the `importHook` for any dependencies of `./foo.js` itself.

## Design

A ***Module Source Record*** is an abstract class for immutable representations
of the dependencies, bindings, initialization, and execution behavior of a
module.

Host-defined import-hooks may specialize module source records with annotations
such that the host can enforce content-security-policies.

An ***ECMAScript Module Source Record*** is a concrete ***Module Source
Record*** for ECMAScript modules.

`ModuleSource` is a constructor that accepts ECMAScript module source text and
produces an object with a [[ModuleSource]] slot referring to an ***ECMAScript
Module Source Record***.

`ModuleSource` instances are handles on the result of compiling an ECMAScript
module's source text.
A module source has a [[ModuleSource]] internal slot that refers to a
***Module Source Record***.
Multiple `Module` instances can share a common module source.

Module source records only capture information that can be inferred from static
analysis of the module's source text.

Multiple `ModuleSource` instances can share a common ***Module Source Record***
since these are immutable and so hosts have the option of sharing them between
realms of an agent and even agents of an agent cluster.

The `Module` constructor accepts a source and an options bag.
A `Module` has a 1-1-1-1 relationship with a ***Module Environment Record***,
a ***Module Source Record***, and a ***Module Exports Namespace Exotic Object***.

## Design Rationales

### Should `importHook` be synchronous or asynchronous?

When a source module imports from a module specifier, you might not have the
source at hand to create the corresponding `Module` to be returned. If
`importHook` is synchronous, then you must have the source ready when the
`importHook` is invoked for each dependency.

Since the `importHook` is only triggered via the kicker (`import(instance)`),
going async there has no implications whatsoever.
In prior iterations of this proposal, the user was responsible for looping
through the dependencies and preparing each instance before kicking off the
next phase. That is no longer the case here, where control over the different
phases is limited to the invocation of the `importHook`.

### Can cycles be represented?

Yes, `importHook` can return a `Module` that was either already imported with
`import()` or already returned by an `importHook`.
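A small sketch of how a cycle closes, using plain objects as hypothetical stand-ins for `Module` instances and a synchronous hook for brevity (the proposed `importHook` returns a promise). The `registry`, `sources`, and `loadGraph` names are assumptions made for this example only.

```javascript
// Hypothetical dependency table with a cycle: a.js <-> b.js.
const sources = { 'a.js': ['b.js'], 'b.js': ['a.js'] };

// One stand-in instance per specifier, created on first request.
const registry = new Map();
const importHook = (specifier) => {
  if (!registry.has(specifier)) {
    registry.set(specifier, {
      specifier,
      imports: sources[specifier],
      dependencies: new Map(),
    });
  }
  // Returning an instance that was already handed out is what closes a cycle.
  return registry.get(specifier);
};

// Walk the graph once; instances already seen are not visited again.
function loadGraph(root) {
  const pending = [root];
  const seen = new Set();
  while (pending.length > 0) {
    const module = pending.pop();
    if (seen.has(module)) continue;
    seen.add(module);
    for (const specifier of module.imports) {
      const dependency = importHook(specifier);
      module.dependencies.set(specifier, dependency);
      pending.push(dependency);
    }
  }
  return seen.size;
}

const a = importHook('a.js');
const visited = loadGraph(a);
```

After loading, `a` depends on `b` and `b` depends on the very same `a` object, so the walk terminates after visiting two instances.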

### Idempotency of dynamic imports in ModuleSource

Any `import()` expression inside a module source may result in an `importHook`
invocation on the `Module`; whether the `importHook` is called depends on
whether the `Module` has already invoked it for the `specifier` in question.
So, a `Module` must keep a map from every `specifier` to its corresponding
`Module` to guarantee the idempotency of both static and dynamic imports.

User-defined and host-defined import-hooks will likely enforce stronger
consistency between import behavior across module instances, but module
instances enforce local consistency and some consistency in aggregate by
induction of other modules.

### toString

Whether `ModuleSource` instances retain the original source may vary by host,
and modules should reuse
[HostHasSourceTextAvailable](https://tc39.es/ecma262/#sec-hosthassourcetextavailable)
in deciding whether to reveal their source, as they might with a `toString`
method.
The [module block][module-blocks] proposal may necessitate the retention of
text on certain hosts so that hosts can transmit sources in their serial
representation of a module, as could be an extension to `structuredClone`.

### Factoring ECMA-262

This proposal decouples a new ***Module Source Record*** and ***ECMAScript
Module Source Record*** from the existing ***Module Record*** class hierarchy
and introduces a concrete ***Virtual Module Record***.
The hope if not expectation is that this refactoring makes evident that
***Virtual Module Record***, ***Cyclic Module Record***, and the abstract
base class ***Module Record*** could be refactored into a single concrete
record (***Module Record***) since all meaningful variation can be expressed
with implementations of the abstract ***Module Source Record***.
But, this proposal does not go so far as to make that refactoring normative.

This proposal does not impose on host-defined import behavior.

### Referrer Specifier

This proposal expressly avoids specifying a module referrer.
We are convinced that the `importMeta` object passed to the `Module`
constructor is sufficient to denote the referrer (by having a host-specific
referrer property like `url` or a method like `resolve`) or connote it (by
serving as a key in a `WeakMap` side table), since the import behavior carries
that exact object to the `importHook`, regardless of whether `import.meta` is
identical to `importMeta`.
This allows virtual modules to emulate even hosts that provide empty
`import.meta` objects.
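The "connote" case can be sketched as follows, under the assumption that referrers are URLs. The helpers `createImportMeta` and `resolveSpecifier` and the `example.com` URL are hypothetical; the sketch relies only on the standard `WeakMap` and `URL` globals.

```javascript
// Side table from importMeta objects to their (hidden) referrer URLs.
const referrers = new WeakMap();

// Hypothetical helper: produce an empty importMeta that still identifies
// a referrer to anyone holding the side table.
function createImportMeta(referrerUrl) {
  const importMeta = Object.freeze({});
  referrers.set(importMeta, referrerUrl);
  return importMeta;
}

// Hypothetical helper an importHook could call to resolve a specifier.
function resolveSpecifier(specifier, importMeta) {
  const referrer = referrers.get(importMeta);
  if (referrer === undefined) {
    throw new TypeError('Unrecognized importMeta object');
  }
  return new URL(specifier, referrer).href;
}

const meta = createImportMeta('https://example.com/app/main.js');
const resolved = resolveSpecifier('./foo.js', meta);
// resolved is 'https://example.com/app/foo.js'
```

Because the object itself carries no properties, the guest module would see nothing useful on `import.meta`, yet the hook can still resolve relative specifiers.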

## Design Variables

### Whether to reify the `referrer`

As written, we've avoided threading a `referrer` argument into the `Module`
constructor because the `importMeta` is sufficient for carrying information
about the referrer.
This is okay so long as `import.meta` and `importMeta` are different objects,
because a module should not be able to alter its own referrer over the course
of its evaluation.
Having `import.meta !== importMeta` is a topic of some confusion.
If these were made identical, we would need to thread `referrer` separately.
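To make the distinction concrete, here is a minimal sketch of the copy step as this section assumes it: the guest's `import.meta` is a fresh object populated from `importMeta`. The function name `createGuestImportMeta` is hypothetical, and the exact copying rules are not settled by this document.

```javascript
// Hypothetical model of how a guest's import.meta could be derived:
// a new null-prototype object that receives importMeta's own properties.
function createGuestImportMeta(importMeta) {
  return Object.assign(Object.create(null), importMeta);
}

const importMeta = { url: 'https://example.com/guest.js' };
const guestImportMeta = createGuestImportMeta(importMeta);

// The guest can mutate its own import.meta...
guestImportMeta.url = 'https://example.com/elsewhere.js';
// ...without changing the object the importHook will receive.
const hookSees = importMeta.url;
```

Under this model a module cannot alter its own referrer by writing to `import.meta`, which is why a separate `referrer` argument is unnecessary.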

### Relationship to Content-Security-Policy

On hosts that implement a Content-Security-Policy, the [[Origin]] is
host-specific [[HostData]] of a module source.
A module source constructed from a trusted types object could also
inherit an origin.
All of this can occur outside the purview of ECMA-262 and
is orthogonal to the `Module` referrer, which can be different than the origin.

### The name of module instances

`Module` and `ModuleInstance` are both contenders
for the name of module instances.
We are tentatively entertaining `Module` because it feels likely that we arrive
in a world where `(module {}) instanceof Module`.
The naming calculus would shift significantly if module blocks are reified as
module source instead of module instance.

We will no doubt arrive in a state of confusion with anything named "module"
without qualification.
For example, `WebAssembly.Module` more closely resembles a module source than a
module instance.
That would suggest that module sources should be named `Module` and,
consequently, that module instances should be named `ModuleInstance`.

### Form of the kicker method

A functionally equivalent proposal would add an `import` method to the `Module`
prototype to get a promise for the module's exports namespace instead of
overloading dynamic `import`.
Using dynamic import is consistent with an interpretation of the module blocks
proposal where module blocks evaluate to `Module` instances.

### Options bag or flat arguments

The options bag described here for the `Module` constructor leaves some
room open for accounting for import assertions.
It also raises questions about the default import hook and import meta,
which are answerable but have not yet been fully discussed.

[module-blocks]: https://github.com/tc39/proposal-js-module-blocks
[trusted-types]: https://w3c.github.io/webappsec-trusted-types/dist/spec/