Determine how to express JS dependencies in Rust crates #36

Closed
aturon opened this issue Jan 30, 2018 · 16 comments

@aturon
Contributor

aturon commented Jan 30, 2018

As part of the overall packaging pipeline, we will need a way within Rust to express JS dependencies. This happens in two parts:

This issue is about the second step, supplying the JS. There are at least three sources for the JS imports we want to consider:

  • Functions expected to already exist on the host.
  • Functions coming from JS provided directly with the crate.
  • Functions coming from an npm package.

Let's discuss which of these options we want to support, and how to express them on the Rust side.

@ashleygwilliams
Member

ashleygwilliams commented Jan 31, 2018

i think, if possible, all 3 would be extremely useful. i would prioritize them:

  1. functions expected to already exist on the host.
  2. functions coming from an npm package.
  3. functions coming from JS provided directly with the crate.

with 1 being basically required, 2 a nice to have, and 3 is a stretch goal. mostly because if we can do 1 that is a work around for 2 and 3 until we get to them :). 2 is slightly prioritized over 3 because it also can function as a type of work around for 3 (separate JS files can always be published as packages if you really need to or set it up so they already exist on the host)

@lukewagner
Contributor

There's also JS generated by wasm-bindgen which isn't present in the crate but, after the wasm-bindgen step in the pipeline, might as well have been since it's all just ESMs that import and are imported by the .wasm module.

@aturon
Contributor Author

aturon commented Jan 31, 2018

I'd like to have some design work here going on in parallel with @ashleygwilliams's work on the npm publication side, if possible. Is anyone particularly interested in taking on the design questions here?

@koute

koute commented Jan 31, 2018

Functions coming from JS provided directly with the crate.

Since I have the most experience with this one (as I've implemented it already in stdweb and cargo-web) I'll bite.

I can think of many ways we might want to do this, but one fairly important use case is inline JavaScript. (See stdweb's js! or Emscripten's EM_ASM.)

So, very roughly, I think something like this would make sense:

  1. Teach rustc how to emit custom WASM sections.
  2. Write a procedural macro which will accept the following parameters in some way (actual syntax is up to bikeshedding):
    1. a snippet of JavaScript code,
    2. types of the arguments the snippet expects (native WASM types only),
    3. type of its return value,
    4. values it would have to pass to the snippet when running it.
  3. With that procedural macro generate an entry in a WASM custom section and an entry in WASM's import section, and then emit an expression which will call that import with given parameter values at the place where the macro was invoked.
  4. In the postprocessing step the postprocessor (wasm-bindgen, cargo-web or something else) would read the WASM, grab the custom section and convert it to normal module imports.
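As a rough illustration of step 4, here is a minimal sketch of how a postprocessor might locate a custom section in a .wasm binary. It relies only on the WebAssembly binary layout: an 8-byte header, then sections made of an id byte, a LEB128 size, and a body, where custom sections have id 0 and begin with a length-prefixed name. The section name `js_snippets` used in the usage note and all function names are made up for this example; they are not what wasm-bindgen or cargo-web actually use.

```rust
/// Encode a u32 as LEB128, the variable-length integer format
/// used by the WebAssembly binary format.
fn write_leb128_u32(mut value: u32, out: &mut Vec<u8>) {
    loop {
        let byte = (value & 0x7f) as u8;
        value >>= 7;
        if value == 0 {
            out.push(byte);
            return;
        }
        out.push(byte | 0x80); // high bit set: more bytes follow
    }
}

/// Decode a LEB128 u32 starting at `*pos`, advancing `*pos` past it.
fn read_leb128_u32(bytes: &[u8], pos: &mut usize) -> Option<u32> {
    let mut result: u32 = 0;
    let mut shift: u32 = 0;
    loop {
        let byte = *bytes.get(*pos)?;
        *pos += 1;
        if shift >= 32 {
            return None; // too many bytes for a u32
        }
        result |= u32::from(byte & 0x7f) << shift;
        if byte & 0x80 == 0 {
            return Some(result);
        }
        shift += 7;
    }
}

/// Build a custom section: id 0, section size, name length, name, payload.
fn encode_custom_section(name: &str, payload: &[u8]) -> Vec<u8> {
    let mut body = Vec::new();
    write_leb128_u32(name.len() as u32, &mut body);
    body.extend_from_slice(name.as_bytes());
    body.extend_from_slice(payload);

    let mut section = vec![0u8];
    write_leb128_u32(body.len() as u32, &mut section);
    section.extend_from_slice(&body);
    section
}

/// Walk the sections of `module` and return the payload of the first
/// custom section called `wanted`, if there is one.
fn find_custom_section<'a>(module: &'a [u8], wanted: &str) -> Option<&'a [u8]> {
    if module.len() < 8 || !module.starts_with(b"\0asm") {
        return None;
    }
    let mut pos = 8; // skip magic number and version
    while pos < module.len() {
        let id = module[pos];
        pos += 1;
        let size = read_leb128_u32(module, &mut pos)? as usize;
        let end = pos.checked_add(size)?;
        let body = module.get(pos..end)?;
        if id == 0 {
            let mut cursor = 0;
            let name_len = read_leb128_u32(body, &mut cursor)? as usize;
            let name_end = cursor.checked_add(name_len)?;
            if body.get(cursor..name_end)? == wanted.as_bytes() {
                return Some(&body[name_end..]);
            }
        }
        pos = end;
    }
    None
}
```

A tool would call something like `find_custom_section(&wasm_bytes, "js_snippets")` and then turn the payload into ordinary module imports; how the payload itself is structured is left open here.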

This would provide a minimal API surface based on which stuff like stdweb's js! macro (which isn't limited only to WASM-native types) could be built, and would be mostly pain-free to support from the bundler side.

Of course there is also the unresolved elephant-in-the-room in the form of TypeScript, which we also might or might not want to support.

If there would be a rough consensus that something like this is acceptable I would be willing to maybe write a full RFC.

@Pauan

Pauan commented Jan 31, 2018

@koute I agree with almost everything you said, I just want to touch on a couple points:

With that procedural macro generate an entry in a WASM custom section

Why does it need to generate a custom section? Isn't generating a normal wasm import good enough?

I imagine that the way it would work is that when you use the js! macro (or whatever syntax we decide on), it will generate a .wasm and .js file.

The .js file would contain the js! macro code (and export a bunch of functions, one function per js! macro), and the .wasm file would import the .js file and call the exports of the .js file.
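To make the idea concrete, here is a small sketch of the generation step: given the snippets collected from the macro invocations, emit one ES module with one exported function per snippet. The `Snippet` type and the `__snippet_N` naming scheme are invented for this illustration; the Rust side would then declare matching imports with the `#[wasm_module = "..."]` + `extern` syntax used elsewhere in this thread.

```rust
/// One inline snippet collected from a (hypothetical) `js!`-style macro.
struct Snippet<'a> {
    /// Names of the parameters the snippet body refers to.
    params: &'a [&'a str],
    /// JavaScript source used as the function body.
    body: &'a str,
}

/// Render all snippets as a single ES module, one export per snippet.
/// The `__snippet_N` names are an assumption made for this sketch.
fn render_js_module(snippets: &[Snippet<'_>]) -> String {
    let mut out = String::new();
    for (index, snippet) in snippets.iter().enumerate() {
        out.push_str(&format!(
            "export function __snippet_{}({}) {{\n    {}\n}}\n",
            index,
            snippet.params.join(", "),
            snippet.body
        ));
    }
    out
}
```

With this shape the .wasm file only ever sees ordinary imports, so no custom section is needed; the cost is that the compiler or macro has to write a second output file next to the .wasm.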

Of course there is also the unresolved elephant-in-the-room in the form of TypeScript, which we also might or might not want to support.

TypeScript is JavaScript. Specifically it's a strict superset of JavaScript with erased static types, and the static types do not affect the runtime at all[1], so the types are purely used at develop-time for the sake of the programmer.

TypeScript is compiled down to a combination of .js and .d.ts files. This is done so that the files can be consumed by either JavaScript or TypeScript and everything will Just Work(tm).

In other words, whatever support Rust gets for JavaScript should automatically work for TypeScript as well, so we can treat them as being the same thing.

What would be useful is to have a program that can take a TypeScript .d.ts file and output Rust bindings for it, but that should be a separate program, outside of rustc (and probably made by a third party).

Of course a program like that will only be approximate at best, because TypeScript's type system is extremely seriously unsound.


  • [1] Angular 2 does some funky pre-processing of the TypeScript, which it then uses to do dependency injection based on the types, but that's an Angular-specific thing, it's outside of TypeScript itself.

@koute

koute commented Feb 1, 2018

Why does it need to generate a custom section? Isn't generating a normal wasm import good enough?

Yes, that's also a valid alternative. You could also have Rust generate a .js file, however:

  • the custom sections approach could be made environment-agnostic (so you could, for example, put something other than JavaScript code in there, and use the resulting .wasm file in a non-web environment; and AFAIK we do want to keep the wasm32-unknown-unknown target mostly environment agnostic)
  • it potentially simplifies linking multiple .wasm modules together (since then you'd just concatenate the custom sections together instead of dealing with a gazillion .js files)

Although the .js file approach also has its pros, since it would use the same mechanism that will be used for importing regular .js deps.

Personally I would lean towards the custom section approach, but I'm curious what others think about it.

TypeScript is JavaScript. Specifically it's a strict superset of JavaScript with erased static types, and the static types do not affect the runtime at all, so the types are purely used at develop-time for the sake of the programmer.

In other words, whatever support Rust gets for JavaScript should automatically work for TypeScript as well, so we can treat them as being the same thing.

Yes, that's precisely the point, however without any mechanism to say "arg #n is of typescript type X" the generated snippets would be - effectively - untyped if you'd want to include it in a TS project.

@Pauan

Pauan commented Feb 1, 2018

@koute custom sections approach could be made environment-agnostic (so you could, for example, put something other than JavaScript code in there)

But you said the procedural macro will accept a snippet of JavaScript code, so it's already not target-agnostic. I believe the target-agnostic way of including outside code is to use #[wasm_module = "foo"] + extern.

Actually, would that procedural macro even need to be included in rustc? Couldn't you write a procedural macro that simply compiles down to #[wasm_module = "foo"] + extern?

we do want to keep the wasm32-unknown-unknown target mostly environment agnostic)

I think it should be an error to use the js! macro with wasm32-unknown-unknown (assuming we get new targets like wasm32-web-unknown and wasm32-node-unknown)

it potentially simplifies linking multiple .wasm modules together (since then you'd just concatenate the custom sections together instead of dealing with a gazillion .js files)

Are you suggesting that there will be a WebAssembly linker that can only link WebAssembly modules together (it cannot link JavaScript modules), but it somehow understands the custom section so it can generate the appropriate JavaScript code?

Yes, that's precisely the point, however without any mechanism to say "arg #n is of typescript type X" the generated snippets would be - effectively - untyped if you'd want to include it in a TS project.

Yes, but why is that a problem? TypeScript already handles untyped JavaScript perfectly fine (in fact that's one of its biggest selling points), and I don't see any benefit to typing the JavaScript snippets anyways, because they will be using WebAssembly types, so from TypeScript's perspective they will always be accepting type number and returning type number.

@koute

koute commented Feb 1, 2018

But you said the procedural macro will accept a snippet of JavaScript code, so it's already not target-agnostic.

In this case it would accept a snippet of JavaScript code, but it might just as well accept something else. (And anyway, verifying that the snippet passed is in fact valid JS is out of scope for what Rust itself should do, I think.)

Actually, would that procedural macro even need to be included in rustc? Couldn't you write a procedural macro that simply compiles down to #[wasm_module = "foo"] + extern?

That is a great question. I think that you can put externs wherever you want, so I guess it wouldn't need to be included in rustc? But for that you would have to have https://github.com/rust-lang-nursery/rust-wasm/issues/30 first.

In general the point of this wouldn't necessarily be making it work (if you use stdweb it already works); the point would be to standardize on something that the whole community agrees on so that external tools could depend on it and everyone doesn't have to keep reinventing the same wheel.

I think it should be an error to use the js! macro with wasm32-unknown-unknown (assuming we get new targets like wasm32-web-unknown and wasm32-node-unknown)

What I proposed wouldn't be a js! macro per se. It would be wasm_extern_snippet! or something along those lines. Basically it would be a limited, low-level mechanism to export snippets of code native to the host's environment. Right now you probably wouldn't use it directly due to WASM's quite limited support for types.

Are you suggesting that there will be a WebAssembly linker that can only link WebAssembly modules together (it cannot link JavaScript modules), but it somehow understands the custom section so it can generate the appropriate JavaScript code?

I don't know what the plans are for linking multiple .wasm files; however, technically it wouldn't have to understand the custom section, just be able to concatenate them together.
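A small sketch of why plain concatenation could be enough: if each entry in the section payload is self-delimiting, joining two payloads byte for byte yields another valid payload. The record layout below (a 4-byte little-endian length followed by UTF-8 text) is purely an assumption for illustration, not a format that anyone in this thread has specified.

```rust
/// Append one snippet to a payload as a length-prefixed record.
fn append_record(payload: &mut Vec<u8>, snippet: &str) {
    payload.extend_from_slice(&(snippet.len() as u32).to_le_bytes());
    payload.extend_from_slice(snippet.as_bytes());
}

/// Read back every record in a payload; returns None if it is malformed.
fn read_records(payload: &[u8]) -> Option<Vec<String>> {
    let mut records = Vec::new();
    let mut rest = payload;
    while !rest.is_empty() {
        let len_bytes = rest.get(..4)?;
        let tail = rest.get(4..)?;
        let len = u32::from_le_bytes([
            len_bytes[0],
            len_bytes[1],
            len_bytes[2],
            len_bytes[3],
        ]) as usize;
        let body = tail.get(..len)?;
        records.push(String::from_utf8(body.to_vec()).ok()?);
        rest = &tail[len..];
    }
    Some(records)
}
```

A linker that merges two modules would only need to append one payload to the other; it never has to look inside the records.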

Yes, but why is that a problem? TypeScript already handles untyped JavaScript perfectly fine (in fact that's one of its biggest selling points), and I don't see any benefit to typing the JavaScript snippets anyways

I don't know - you tell me. (: I'm just saying that it might be worth consideration, somehow, for when WASM starts supporting richer types than raw numerics. But I agree that this probably isn't super important for snippets meant to be used by a given crate internally.

@alexcrichton
Contributor

In a recent discussion one point that came up specifically about "coming from JS provided directly with the crate" is that we should strive to avoid moving output files on the filesystem. The reason why also led me to thinking that declaring dependencies on npm is also tricky!

The rationale, as I understood it, for not moving files on the filesystem was primarily about preserving ES6 import paths. That makes sense to me as well. For example, let's say we've got something like:

// src/lib.rs
#[wasm_module = "./foo.js"]
extern {
    fn foo();
}

// src/foo.js
import { bar } from './bar.js';

export function foo() {
    bar();
}

// src/bar.js
export function bar() {}

If the foo.js were inlined into the crate itself (or something like that) I think there'd be similar issues? But in general if rustc were to try and slurp up files into custom wasm sections it'd have to also know to slurp up bar.js, but to do that we'd have to actually parse JS which I imagine is not trivial to do!

@Pauan

Pauan commented Feb 2, 2018

@koute In general the point of this wouldn't necessarily be making it work (if you use stdweb it already works); the point would be to standardize on something that the whole community agrees on so that external tools could depend on it and everyone doesn't have to keep reinventing the same wheel.

I absolutely agree, which is why I suggested using extern, because it's the standard way of using non-Rust things in Rust.


@alexcrichton But in general if rustc were to try and slurp up files into custom wasm sections it'd have to also know to slurp up bar.js, but to do that we'd have to actually parse JS which I imagine is not trivial to do!

Yes, this is a tricky problem (PureScript also has this problem, and they decided that it wasn't worth it to fix it, much to my disappointment).

There are only three solutions:

  1. Parse the JS files and rename the imports.

    If you choose to parse the JS files, then things get even harder if it has dynamic import expressions:

    // src/foo.js
    let filename;
    
    if (Math.random() < 0.5) {
        filename = "bar";
    } else {
        filename = "qux";
    }
    
    export function foo() {
        // At runtime this will randomly import either `./bar.js` or `./qux.js`
        return import("./" + filename + ".js").then((x) => {
            console.log(x);
        });
    }

    Bundlers handle this by simply not allowing dynamic expressions with import().

    In other words, import("./bar.js") is okay, but import(some_dynamic_expression) is not okay.

    This is unfortunate, because the ECMAScript spec does fully allow dynamic import() expressions, and there are genuinely useful reasons to use dynamic import() expressions!

  2. Don't move the JS files at all.

  3. Move the JS files, but ensure that they are always kept in the same relative order.

    In other words:

    • If foo.js is in the same folder as bar.js, when moving they must be placed in the same folder

    • If foo.js is in the parent folder of bar.js, when moving it must be placed in the parent folder of bar.js

    Note: with this solution it's not necessary for it to parse JavaScript, it can simply slurp up all the .js files (along with the relative path for each JS file) and then recreate the folder structure later.

    If it's not desired to slurp up all the .js files, then we can have some mechanism for specifying exactly which .js files should be slurped (perhaps package.json could be used for that).

I'm personally in favor of solution 3 (moving the files but preserving the order). This has the benefit that dynamic import() expressions will still work correctly, and it also means we don't need to parse JavaScript. But I don't have strong opinions about it, any of the three solutions will work.
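A minimal sketch of solution 3, assuming the tool simply walks a crate directory: collect every .js file together with its path relative to the crate root, then write the files back out under a new root with the same relative layout. No JavaScript is parsed at any point. The function names are made up for this example, and a real tool would also want to reject relative paths that escape the output root.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Recursively collect every `.js` file under `root`, keyed by its
/// path relative to `root`. File contents are treated as opaque bytes.
fn collect_js_files(root: &Path) -> io::Result<Vec<(PathBuf, Vec<u8>)>> {
    let mut found = Vec::new();
    let mut pending = vec![root.to_path_buf()];
    while let Some(dir) = pending.pop() {
        for entry in fs::read_dir(&dir)? {
            let path = entry?.path();
            if path.is_dir() {
                pending.push(path);
            } else if path.extension().map_or(false, |ext| ext == "js") {
                let relative = path
                    .strip_prefix(root)
                    .expect("every visited path is under root")
                    .to_path_buf();
                found.push((relative, fs::read(&path)?));
            }
        }
    }
    found.sort(); // deterministic order, independent of directory iteration
    Ok(found)
}

/// Recreate the collected files under `out_root`, preserving each
/// file's relative path so that relative imports keep resolving.
fn restore_files(files: &[(PathBuf, Vec<u8>)], out_root: &Path) -> io::Result<()> {
    for (relative, contents) in files {
        let target = out_root.join(relative);
        if let Some(parent) = target.parent() {
            fs::create_dir_all(parent)?;
        }
        fs::write(&target, contents)?;
    }
    Ok(())
}
```

Because `./bar.js` is resolved relative to the importing file, `foo.js` and `bar.js` still find each other after `restore_files`, and a dynamic `import("./" + filename + ".js")` behaves the same as before the move.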

@sendilkumarn
Member

  3. Move the JS files, but ensure that they are always kept in the same relative order.

There will be a huge pile of JS files that we need to move around. Even if we specify selected .js files, chances are high that it will still be a huge pile.

As far as I understand, bindgen will generate a file that holds both the imports and the exports (bindings.js). Can we add dynamic imports to this file and then move only bindings.js around? This would help in two ways:

  1. It will not change anything for the current JS system.
  2. We can easily port those JS files to various targets (w.r.t. importing logic), since ESMs are not yet supported in Node.

@Pauan

Pauan commented Feb 9, 2018

@sendilkumarn There will be a huge pile of JS files that we need to move around. Even if we specify selected .js files, chances are high that it will still be a huge pile.

Are you sure? My understanding is that we only need to bundle up the JS files which are included with the Rust crate. Npm files will be handled in a later bundling stage (unrelated to rustc).

So the number of JS files should be pretty small. I imagine most Rust crates will have 0 JS files, some Rust crates will have 1 JS file, and it should be rare to have a Rust crate with more than 2 JS files in it.

And in any case, why is it a problem to have many JS files? If the linker can handle 2 files, then it should be able to handle infinite files (within memory limits).

@sendilkumarn
Member

Moving JS files will break their relative dependencies, right?

Linkers can handle them, but I am not sure about when there are circular deps.

Instead of moving the JS, does it make sense to point to these files? This would not break any existing libs.

@Pauan

Pauan commented Feb 11, 2018

@sendilkumarn Moving JS files will break their relative dependencies, right?

That's why I said that it needs to maintain the relative ordering of the files, to prevent breaking relative dependencies.

Linkers can handle them, but I am not sure about when there are circular deps.

Any ES6 linker/bundler must handle circular dependencies, it is mandated by the ECMAScript spec. So that's a given.

When I said "linker" I meant the wasm-npm-packager tool specifically. I'm not sure what to call the wasm-npm-packager tool, it's not really a linker, or a bundler.

Just to be clear, we are discussing how to handle JS dependencies with the Rust compiler and the wasm-npm-packager tool. So the workflow will be as described in https://github.com/rust-lang-nursery/rust-wasm/issues/35#issuecomment-364370573

In this case, my suggestion is that when compiling a Rust crate, the Rust compiler will simply grab all the .js files, insert them into the .wasm custom section, and then later the wasm-npm-packager tool extracts the .js files from the .wasm file (recreating the same folder structure).

It doesn't need to parse the JS, it doesn't need to understand JS imports, it is just moving files around, nothing else. So circular dependencies will work just fine. Relative imports will work just fine. Even dynamic import() will work just fine.

Instead of moving the JS, does it make sense to point to these files? This would not break any existing libs.

The JS files are spread across multiple crates and folders, and npm requires that all of the files are contained within a single root folder, so we have to move the files no matter what.

Theoretically, if Cargo changed the way that it fetches and extracts crates, it might be able to avoid moving the .js files (because Cargo would extract all of the crates into a single folder), but that sounds like a big and unnecessary change.

Moving the files really isn't a big deal at all. Package managers and compilers are constantly shifting files around: both npm and yarn move tens of thousands of files for a single npm install or yarn install.

As an optimization, it might make sense to use hard links rather than copying the files, but that's an internal implementation detail which can be optimized later (from the user's perspective it should have the same behavior either way).
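For illustration, the hard-link optimization could look roughly like this: try `std::fs::hard_link` first and fall back to `std::fs::copy` when linking fails (for example when source and target are on different filesystems). The helper name `link_or_copy` is made up for this sketch.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Place `source` at `target`, preferring a hard link and falling back
/// to a plain copy if the link cannot be created.
fn link_or_copy(source: &Path, target: &Path) -> io::Result<()> {
    if let Some(parent) = target.parent() {
        fs::create_dir_all(parent)?;
    }
    match fs::hard_link(source, target) {
        Ok(()) => Ok(()),
        // Linking can fail across filesystems or if `target` already
        // exists; copying gives the same visible result for the user.
        Err(_) => fs::copy(source, target).map(|_| ()),
    }
}
```

Either branch leaves a file with identical contents at `target`, which is why the choice can stay an internal implementation detail.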

@fitzgen
Member

fitzgen commented Aug 2, 2018

As discussed in today's WG meeting, we'd like to have a fleshed out RFC accepted for expressing npm dependencies by the Rust 2018 Release Candidate in 6 weeks: 2018-09-13.

@ashleygwilliams
Member

closing this in favor of the RFC rustwasm/rfcs#4

8 participants