Cross-thread/realm issues esp. with module blocks #74

Open
Jamesernator opened this issue Aug 6, 2022 · 2 comments

Jamesernator commented Aug 6, 2022

(This partially follows from this comment; however, it covers wider concerns and possible solutions.)

So the new proposal introduces Module, which allows customizing import behaviour with JS hooks. Additionally, one desirable capability is loading such modules in other realms or threads (e.g. in Workers).

However, as currently specified, Module cannot support evaluation inside another thread or realm, because Module instances are constructed with realm-bound objects that can't be exposed to other ShadowRealms, and certainly not to other threads.

For example, suppose we had this module:

function importHook(specifier, importMeta) {
    // ...
}

// Object in this realm
const importMeta = { someApi: () => { /* ... */ } };
const source = new ModuleSource(`
    console.log(import.meta);
`);
const mod = new Module(source, {
    importHook,
    importMeta,
});

We can't possibly evaluate this module in another thread or even a ShadowRealm due to the fact that importMeta is an object tied to the original realm.
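A rough illustration of the underlying constraint, assuming that anything crossing a thread boundary goes through something like structured clone (the mechanism postMessage uses today). This does not use the proposed Module API at all; it only shows that an object carrying a function, like the importMeta above, is rejected by structuredClone, while plain data is copied:

```javascript
// Same shape as the importMeta object above: it carries a function.
const importMeta = { someApi: () => { /* ... */ } };

let cloneError = null;
try {
    // Functions are not structured-cloneable, so this throws.
    structuredClone(importMeta);
} catch (err) {
    cloneError = err;
}
console.log(cloneError && cloneError.name);

// Plain data, by contrast, clones into a fresh, unrelated object.
const data = { url: "uuid:35a109f8-005e-4b03-bf87-464e4198872a" };
const dataClone = structuredClone(data);
console.log(dataClone.url === data.url, dataClone !== data);
```

This needs a runtime that provides a global structuredClone (e.g. recent browsers or Node 17+).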

Now host-vended modules don't necessarily share this property. Suppose we had the proposed form for loading a module as a Module (from the import reflection proposal):

import module someModule from "./someModule.js";

Such a module should be perfectly sharable into shadow realms or across threads, as the import.meta objects and such need not have been created yet. That is, this should be supportable:

import module someModule from "./someModule.js";

const shadowRealm = new ShadowRealm();
// This should be fine as the host can create import.meta in the right realm
const someExport = await shadowRealm.import(someModule, "default");

Okay, so this split might not be the biggest problem; one might be able to transfer module sources or similar into shadow realms/threads instead. From the user's perspective, it may simply be the case that modules created by new Module can't be shared. However, this "solution" really starts to break down in the presence of module blocks:

Module Blocks exacerbate these problems

So the entire point of the module block proposal is that you can get an unevaluated module and send it elsewhere to be evaluated. For those not familiar with the proposal it would allow something along these lines:

// worker.js

self.addEventListener("message", async ({ data: { moduleBlock } }) => {
    const mod = await import(moduleBlock);
    const result = await mod.default();
    postMessage({ result });
});

// main thread
// Module to evaluate in another thread
const moduleBlock = module {
    import * as someLib from "./someLib.js";
    
    export default function numberCrunch() {
        let total = 0;
        for (let i = 0; i < 100000000; i++) {
            total += i;
        }
        return total;
    }
}

const worker = new Worker("./worker.js");
worker.onmessage = ({ data: { result } }) => {
    // do something with result
}

// perform task in other thread
worker.postMessage({ moduleBlock });

However, an issue quickly arises with module blocks: for this to work, the host needs to be able to create the module block on the other thread. Because module blocks inherit their loading and import.meta behaviour from the parent module, a problem appears if we create a Module instance whose source contains a module block:

const source = new ModuleSource(`
    const moduleBlock = module {
        // Oops this object is in the wrong thread
        console.log(import.meta);
    };
    
    const worker = new Worker("./worker.js");
    
    // This breaks because this Module isn't a "host" module, this completely
    // breaks the entire module-blocks feature
    worker.postMessage({ moduleBlock });
`);
const mod = new Module(source, {
    importHook,
    importMeta,
});

// Throws an error
await import(mod);

To repair this, every module would need to know whether it was loaded using a userland Module. If it was, it would need to communicate module blocks to threads in a completely different way, essentially destroying the whole feature any time a non-host loader was used.

A rough solution idea

So this problem ultimately stems from the fact that host modules have a capability otherwise inaccessible to the currently specified modules: the ability to create objects in any realm/thread.

I'd like to propose a change that allows hosts to essentially offer the same capabilities to userland modules (if they want), while still preserving the usual run-to-completion invariants and such.

In particular I'd propose that Module objects are created with the following things:

  • A module source
  • An import hook: just a function that returns Module instances. This is fairly similar to the current proposal but doesn't capture importMeta
  • An importMeta initializer module that can be used to initialize import.meta in the host
  • Some arbitrary data that is structured-cloned along with the Module object

The exact shape isn't too important, but basically the idea is that everything can either be transferred into another realm, or there is a way to communicate back to the original realm for behaviour like importHook.

For example consider the following module:

function importHook(specifier, data) {
    // ... some resolution behaviour
    return someModule;
}

// Let's suppose for simplicity module blocks exist
const initializeImportMetaModule = module {
    export default function initializeImportMeta(data) {
        // Nothing particularly interesting just a passthrough
        return { url: data.url };
    }
};

const source = new ModuleSource(`
    import other from "other";
    
    export default "got: " + other;
`);
const data = { url: "uuid:35a109f8-005e-4b03-bf87-464e4198872a" };
const mod = new Module(source, { importHook, data, initializeImportMetaModule });

now to evaluate such a module in another realm (or thread, but the same behaviour applies):

const shadowRealm = new ShadowRealm();
const result = await shadowRealm.import(mod);

we essentially perform the following process:

  • First clone the module source, data and initializeImportMetaModule objects into the ShadowRealm
  • As part of the module linking process:
    • For each specifier in module: // In this case "other"
      • Schedule a Job to call importHook in the original realm with the specifier and original data
      • Await the job to get a Module object (in the original realm)
      • Clone the resulting module of importHook into the realm we're evaluating in
      • Recursively link the resulting module using the same process
  • Note: Now that linking is complete we can proceed to initialize import.meta
  • Import the initializeImportMetaModule into the realm using the same process
  • With the resulting module, call initializeImportMetaModule.default with the cloned data
  • Finally evaluate the module in the correct realm with the appropriate import.meta
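The steps above can be sketched as a small simulation. Everything here is a hypothetical stand-in: neither Module nor ShadowRealm is used, a "module" is just a plain cloneable record, makeRecord and linkInDestination are invented names, and the import.meta initializer is carried as source text so that it can be cloned and only compiled on the destination side. The sketch does not handle import cycles.

```javascript
// Hypothetical stand-in for a Module: a plain, fully cloneable record.
// `initMetaSource` plays the role of initializeImportMetaModule.
function makeRecord(specifiers, data, initMetaSource) {
    return { specifiers, data, initMetaSource };
}

const other = makeRecord([], { url: "uuid:other" }, "return { url: data.url };");
const main = makeRecord(["other"], { url: "uuid:main" }, "return { url: data.url };");

// The import hook stays in the "original realm" and is never cloned.
async function importHook(specifier, data) {
    if (specifier === "other") return other;
    throw new Error(`cannot resolve ${specifier} from ${data.url}`);
}

// Simulates linking in the destination: only clones cross the boundary.
async function linkInDestination(record, hook) {
    // Clone the source-like parts and the data into the destination.
    const local = structuredClone(record);
    const imports = {};
    for (const specifier of local.specifiers) {
        // Call the hook "in the original realm" with the original data,
        // then clone and recursively link whatever it returns.
        const resolved = await hook(specifier, record.data);
        imports[specifier] = await linkInDestination(resolved, hook);
    }
    // Compile the initializer on the destination side and call it
    // with the cloned data to produce import.meta there.
    const initializeImportMeta = new Function("data", local.initMetaSource);
    return { importMeta: initializeImportMeta(local.data), imports };
}

const linkedPromise = linkInDestination(main, importHook);
linkedPromise.then((linked) => console.log(linked.importMeta.url));
```

The point of the shape is that the only thing that ever runs in the original realm is importHook, and everything it hands back is data that can be copied.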

Now this is just one possible design. Other designs are possible, e.g. one where importHook itself must be a module that can be cloned to other threads, or tweaks like allowing the cloned data to be customized each time.

Jamesernator changed the title from "Cross-thread/realm issues esp with module blocks" to "Cross-thread/realm issues esp. with module blocks" on Aug 6, 2022

caridy (Collaborator) commented Aug 19, 2022

@Jamesernator the narrative so far has been that Module instances are not serializable, only ModuleSource instances are. The reason being that the coordination needed across realms for modules to be serializable violates the ShadowRealm's Callable boundary.

We are still weighing whether or not it is possible to serialize certain instances (depending on their configuration). This is subject to:

a) can we have module instances with default hooks?
b) can we transfer instances with default hooks?

Beyond this point, we haven't really gone deeper. At some point we discussed serialization of the hooks, but because those hooks would have to receive non-callables and return non-callables, that idea wasn't sound.

Jamesernator (Author) commented Aug 19, 2022

> the narrative so far has been that Module instances are not serializable, only ModuleSource instances are. The reason being that the coordination needed across realms for modules to be serializable violates the ShadowRealm's Callable boundary.

If the (in my opinion) logical integration with module blocks was done, then we have a weird divide between user created module blocks vs host created being serializable or not.

Basically it would be nice if hosts could grant equal power to userland modules, as otherwise the mere existence of a module loader breaks host APIs for those modules. That is, one can't use a loader unless one can also replace every host API that might possibly accept module blocks and somehow repair them. Such repair work would be extremely cumbersome. Consider the motivating problem for this issue: loading with a userland loader breaks the code that is being loaded:

const mod = new Module(new ModuleSource(`
    const workerMod = module {
        import foo from "foo";
        
        export default function lib() {
        
        }
    }
    
    const worker = new Worker("./someWorker.js");

    // This postMessage can't succeed because workerMod has an
    // importHook/importMeta in the current thread
    worker.postMessage({ module: workerMod });
`), { importHook, importMeta: {} });

It doesn't seem like a good proposal should require a loader to also repair Worker, and every other API that accepts host module blocks, so that they accept the loader's own module blocks too. Maintaining such repairs would take considerable effort, and in many cases the repairs would lag behind actual implementations of those APIs.

> The reason being that the coordination needed across realms for modules to be serializable violates the ShadowRealm's Callable boundary.

> but because those hooks will have to receive non-callables, and return non-callable, that idea wasn't sound.

The rough solution I proposed above solves both problems by requiring that any code to be executed inside other realms be essentially an uncompiled module. That is, instead of having a same-realm object that initializes import.meta:

const mod = new Module(someSource, {
    // Oops same realm object can't be cloned into a ShadowRealm (or thread)
    importMeta: {
        resolve() {
        
        },
    },
});

we have an actual module that is responsible for creating import.meta:

const mod = new Module(someSource, {
    // This module gets executed within the appropriate ShadowRealm/thread
    importMetaHook: module {
        export function initializeImportMeta() {
            return {
                resolve() {
                
                }
            }
        }
    }
});

that module is executed inside the ShadowRealm/thread that the module is to be evaluated in, as such all objects are of the correct realm.

The only tricky part is providing a way to communicate the data necessary for populating import.meta, which is why I suggest having some data object that is structured-cloned into the appropriate ShadowRealm/thread alongside importMetaHook and passed in as an argument. (With structured clone becoming part of ecma262, the necessary framework for such cloning would be available to use.)
