Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

repl: add top-level static import #17285

Closed
wants to merge 1 commit into from

Conversation

devsnek
Copy link
Member

@devsnek devsnek commented Nov 24, 2017

Adds support for top-level static import, to achieve this some
rearranging of lib/module's esm loader was needed, and it was moved to
internal/loader/singleton.

Checklist
  • make -j4 test (UNIX), or vcbuild test (Windows) passes
  • tests and/or benchmarks are included
  • commit message follows commit guidelines
Affected core subsystem(s)

repl, module

@nodejs-github-bot nodejs-github-bot added the lib / src Issues and PRs related to general changes in the lib or src directory. label Nov 24, 2017
@devsnek devsnek force-pushed the feature/repl-toplevel-import branch 2 times, most recently from 72ae712 to 5178fc7 Compare November 24, 2017 07:24
@mscdex mscdex added esm Issues and PRs related to the ECMAScript Modules implementation. repl Issues and PRs related to the REPL subsystem. and removed lib / src Issues and PRs related to general changes in the lib or src directory. labels Nov 24, 2017
lib/repl.js Outdated
@@ -222,6 +226,15 @@ function REPLServer(prompt,
}
}

if (!wrappedCmd && experimentalModules && code.startsWith('import')) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about leading whitespace? The code.startsWith('import') will miss it.

specifiers = specifiers
.map(([imported, local]) => `${local} = exports['${imported}'];`)
.join('\n');

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👆 Can you provide an example of what the generated code will look like?

/* transforms
"import { x, y as z } from 'z'; x(); z + 1"

"(async () => {
Copy link
Member

@jdalton jdalton Nov 25, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@devsnek Awesome! Thanks for adding the transformation example.

Where is x and z initialized? Can you add that to the transformed source example?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

they aren't, like the await transform i rely on them attaching top scope

lib/repl.js Outdated
@@ -226,7 +226,7 @@ function REPLServer(prompt,
}
}

if (!wrappedCmd && experimentalModules && code.startsWith('import')) {
if (!wrappedCmd && experimentalModules && /^(\s+?)?import/.test(code)) {
Copy link
Member

@jdalton jdalton Nov 25, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could see the sniff getting pretty robust... what about leading multiline comments, e.g. /*x*/ import...

Since processTopLevelImport fails gracefully, you might just keep the preflight check super simple, e.g. /\bimport\b.test(code).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll just change it to code.includes('import') at this point

Copy link
Member

@jdalton jdalton Nov 26, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, the word boundary check just prevents matches for "important", "imports", etc.

@jdalton
Copy link
Member

jdalton commented Nov 25, 2017

BTW, I ❤️ all this official Acorn use in Node.

@devsnek
Copy link
Member Author

devsnek commented Nov 26, 2017

@guybedford i'm getting some strange behavior with this that i'm not sure how to debug. i did some pretty extensive searching through the loader but i'm honestly not sure how this error is caused, can you take a look?

@guybedford
Copy link
Contributor

@devsnek I think you'll need to try a few things to track it down. Some ideas which may or may not be valid - perhaps try removing the loader deletion code in https://github.com/nodejs/node/pull/17285/files#diff-27a256465ce1725db0b0222b7c754b94R8 to keep it around in memory to see if it could be some GC issue. Or try following the ModuleJob link logic to see that it is setting "this.module" to a valid module object and if so that it is persisting correctly.

@devsnek have you seen #15713?

@devsnek
Copy link
Member Author

devsnek commented Nov 26, 2017

@guybedford this seems to have fixed it, i tried to trace the scopes on why this was messed up but i wasn't able to figure it out. This makes me worried that perhaps this is a fix on the issue but not the solution to the problem.

@@ -91,16 +91,16 @@ class ModuleJob {
   }

   async run() {
-    const module = await this.instantiate();
+    await this.instantiate();
     try {
-      module.evaluate();
+      this.module.evaluate();
     } catch (e) {
       e.stack;
       this.hadError = true;
       this.error = e;
       throw e;
     }
-    return module;
+    return this.module;
   }
 }

have you seen #15713?

yes i have seen it, hence // TODO(devsnek): switch to dynamic import when implemented, it also will be behind an additional flag, so it won't really be available in this context for a while.

@benjamn
Copy link
Contributor

benjamn commented Nov 27, 2017

Does this PR support live bindings? In other words, if the imported module later re-exports something that was imported earlier in the REPL, will the value of that variable/binding be correctly updated in the REPL? If not, that's is a pretty big departure from the semantics of ECMAScript modules.

@devsnek
Copy link
Member Author

devsnek commented Nov 27, 2017

@benjamn it does not, I considered attaching every export as a getter/setter, but that seemed kinda over the top. Do other people have an opinion on this?

Copy link
Contributor

@benjamn benjamn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please have a look at some of the projects that support live bindings in the Node REPL:

Given that other projects have gotten live bindings to work, I believe an implementation without that functionality will not be well-received by the Node community.

I would like to say this PR just needs a little work, and even offer my help cleaning it up, but I honestly think you might be going down the wrong path here, and you should take some time to learn more about the ECMAScript modules spec (especially as it relates to the REPL), get familiar with prior work in this space, and maybe pick up a few of the (many) tools that exist to do AST-based code transformation more safely than you're doing it here.

Full disclosure: I'm the author of Reify, and @jdalton is the author of @std/esm, which is a fork of Reify.

.join(';');

// re-insert any code that followed the import:
// "import { a, b } from y; a(); b + 5"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please be aware that import declarations are hoisted, so

a(); b + 5; import { a, b }

is perfectly legal, because the import declaration is evaluated before the rest of the code.

// "import { a, b } from y; a(); b + 5"
// ^^^^^^^^^^
const following = root.body.slice(1)
.map(({ expression: { start, end } }, i) => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about nodes that are not ExpressionStatements (and thus do not have an .expression property)?

x = exports['x'];z = exports['y'];
});
x();return (z + 1)
})();"
Copy link
Contributor

@benjamn benjamn Nov 27, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The async function and await expression seem unnecessary here. Instead, you could just extract any ImportDeclaration nodes from the code, load those modules asynchronously first, and then evaluate the remaining code once the imported variables have been defined.

I believe the Node REPL's eval API supports a callback, which should make this relatively easy, but don't quote me on that (I haven't looked at that code recently).

See my other comments about hoisting for an explanation of why extracting ImportDeclarations and loading them first would be more correct than what you're doing here.

.import(${root.body[0].source.raw})
.then((exports) => {
${specifiers}
});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code needs to be capable of handling multiple import declarations.

@jdalton
Copy link
Member

jdalton commented Nov 27, 2017

@benjamn @devsnek

Given that other projects have gotten live bindings to work, I believe an implementation without that functionality will not be well-received by the Node community.

Okay, just throwing this out there:

Since this cowpath has already been paved by the @std/esm loader, the approach could modded to hand off import in the Node REPL to @std/esm instead of using Acorn alone to tackle it. 😊

The @std/esm loader uses Acorn too but, unlike standalone Acorn, also supports object-rest-spread (Node 8) and async generators (Node 9 --harmony). It follows Node's ESM rules by default, supports being loaded in the REPL, and supports import statements in the REPL as well.

If y'all are up for it, I'm totally down for working with Node folks and @devsnek on any features or tweaks needed to get the official Node REPL scenario running with @std/esm.

@devsnek
Copy link
Member Author

devsnek commented Nov 28, 2017

@jdalton i think that initially integrating @std/esm sounds like a good idea, and i was pretty excited by it, but the cons outweigh the pros. As time moves on having all the features and syntax stuff of js being handled without much upkeep on our part is great, but it comes with some downsides.

  1. the size of the repo is actually a thing, and continuously adding deps is pretty annoying when you're on a 128gb macbook air and this isn't the only thing you work on.
  2. implementing code node features with userland modules makes us see node core functionality from the perspective of userland modules
  3. node's repl, in my mind, represents a raw connection between a dev and node. as soon as we begin adding things between the two, we start adding discrepancy between how node operates and how it appears to operate.

And as a side note, also i've been planning to refactor the repl to be nicer on the eyes (getting rid of self, moving default eval to another file, etc)

@jdalton
Copy link
Member

jdalton commented Nov 28, 2017

@devsnek

I don't think any of your points are insurmountable. @std/esm is a zero dependency ~70kB package. To put that into context punycode, which I was an admin on, was used by Node core and is ~49kB and Acorn is ~460kB. Let me know if ya have a change of heart. Happy to help anytime.

@devsnek
Copy link
Member Author

devsnek commented Nov 28, 2017

@jdalton if you wanted to implement it i would recommend not waiting for me to figure out how std/esm's api works. I'll still keep this open but if people want something else i say lets do whats best for everyone.

@devsnek devsnek force-pushed the feature/repl-toplevel-import branch 9 times, most recently from a6cf253 to 2006a9e Compare December 1, 2017 15:48
@devsnek
Copy link
Member Author

devsnek commented Dec 13, 2017

@bmeck just to be clear, you're saying no to

> import { x } from "a"
> import { x } from "b"

in addition to

> import { x } from "a"; import { x } from "b";

the second would never work, the first currently does work.

p.s. please everyone put in > when discussing repl inputs/outputs it can get hard to tell

@evanlucas
Copy link
Contributor

evanlucas commented Dec 13, 2017

@jdalton technically, anyone can request changes on a pull request I'm pretty sure. Issues should only go to the TSC if consensus cannot be reached.

Edit: @bmeck beat me to it :]

@benjamn
Copy link
Contributor

benjamn commented Dec 13, 2017

What sort of APIs exist in Node and/or V8 that could enable a contributor like @devsnek to make progress on the Environment Record implementation?

I don't think import.meta needs to work in the REPL, since the REPL isn't really a module per se, so it doesn't have a meaningful import.meta.url for example.

@MylesBorins
Copy link
Contributor

@benjamn if we implement import.meta.require it might be neccessary though... unless the repl exposes that lexical variable on it's own scope

@bmeck
Copy link
Member

bmeck commented Dec 13, 2017

@devsnek

I feel like separate inputs replacing the binding is fine. My example above that I would not want to work is a single input, same binding id imported, and same specifier used in both import declarations.

@benjamn
Copy link
Contributor

benjamn commented Dec 13, 2017

In my discussion of @bmeck's examples, each line should be taken as a command entered by itself. However, I believe the only reason this matters is that the import declarations are evaluated before the rest of the command. In other words, import declarations are hoisted within the command itself, but not hoisted before previous commands.

My example above that I would not want to work is a single input, same binding id imported, and same specifier used in both import declarations.

Making this an error would be fine with me, since my only reason for wanting to allow replacement is so that you can change your mind later in the REPL session.

@MylesBorins Is the REPL ever going to stop exposing require, though? I'm not against import.meta in the REPL if there's a good reason for it; I just don't think it matters much to this PR.

@bmeck
Copy link
Member

bmeck commented Dec 13, 2017

@benjamn we can do this by rewriting the source in a slightly more complex way. I don't think V8 currently has a way to do this that can be sloppy mode.

> import {x} from './foo';
> import {x} from './foo';

can roughly be run in any way that acts like an environment record / fulfills the initialization using equivalents of spec based methods matching. The spec actually uses environment records that have objects backing them and we can use those, but I don't want to define the behavior in terms for descriptors.

Given the above we can probably reuse most of the existing stuff and just be sure the REPL doesn't use the global environment record when running. Roughly:

> import {x} from './foo';
/*
Object.defineProperty(REPL_CONTEXT_OBJECT, 'x', {
  get() {x; let x}, // setup TDZ
  set() {x; let x}, // setup TDZ
  configurable: true
});
with (REPL_CONTEXT_OBJ) {
  {
    let _ = await import('./foo'); // this isn't quite right, it should avoid the .then() export if it exists
    Object.defineProperty(REPL_CONTEXT_OBJECT, 'x', {
      get() {return _.x;}, // remove TDZ, still not writable
      set(v) {const x=null;x=v}, // give same error as writing to immutable binding
      configurable: true
    });
  }
}
*/

Running twice.

Where REPL_CONTEXT_OBJ is never directly mutable by userland code except variable declaration/access/assignment in the REPL's input source text. The API can create a view of this namespace like a Module Namespace Object if it wants to be reified to user code probably, but I don't think letting people use Object.defineProperty on it would be a safe thing since it is not possible to fully be replicated with an environment record.

This avoids the leakage of the implementation being done with objects behind the scenes.

@benjamn
Copy link
Contributor

benjamn commented Dec 13, 2017

@bmeck If you're talking about Environment Records that are actually backed by objects, then I think we're pretty much back to the implementation in this PR. The REPL context object also is not accessible to REPL command code, as far as I know.

I would much prefer changing the default behavior of the REPL so that context !== global, and using that context object for imported bindings, instead of wrapping every REPL command with a with (context) {...} block that accomplishes the same thing.

As long as live bindings work, I guess I don't care if they're backed by getters that could have been defined with Object.defineProperty or some other magic, though I'd prefer not to invent new language features (as I'm sure you would, too).

@bmeck
Copy link
Member

bmeck commented Dec 13, 2017

@benjamn

If you're talking about Environment Records that are actually backed by objects, then I think we're pretty much back to the implementation in this PR. The REPL context object also is not accessible to REPL command code, as far as I know.

Yes, this is all just some minor implementation changes to avoid doing something that doesn't fit with spec mechanism observable behavior. I only want to prevent observable differences by explicitly stating the context object uses descriptors/is exposed in some way, the implementation being backed by Objects seems fine to me.

I will need some time to fully review the PR in depth and look for any leaks, but wanted to be clear as the behaviors above were not fully defined when reading it on first read through.

@benjamn
Copy link
Contributor

benjamn commented Dec 13, 2017

This discussion is really making me think TC39 should dedicate some time to thinking about module syntax in REPL environments, or anywhere commands are evaluated incrementally. Maybe a discussion of REPL deviations belongs in an Annex section of the ECMA-262 spec? Besides Node, I know of several TC39 members who work on DevTools for their sponsor companies, and I'm sure they've been frustrated by these issues before.

@bmeck
Copy link
Member

bmeck commented Dec 13, 2017

crossreference for V8 issue "Provide a REPL version for parsing top-level code": https://bugs.chromium.org/p/v8/issues/detail?id=6903

@alexkozy
Copy link
Member

As an author of original DevTools hack, I am not sure that this pull request is move into right direction. Top level await was mostly fun feature that does not change semantic significantly, contains trivial rewrites and can be easily workarounded by user by introducing top level (async function main() { ... }) wrapper. Dynamic imports are coming and will be available soon, so user would be able to right await import(...) maybe we should just wait.

V8 repl mode can address this issue as well and based on number of differences between normal JavaScript and JavaScript in repl - it probably requires some attention from guys who works on spec maybe it is time when we need something that will be supported by different browser and Node.js.

@bmeck
Copy link
Member

bmeck commented Dec 13, 2017

@ak239 This PR does add a very specific solution to a problem. await import() has a differing behavior from an import declaration if the target module has a then export:

Given foo:

// ./foo
export let x = 'foo';
export function then () {
  return null;
}

An import declaration grabs the binding

> import {x} from './foo'; x
'foo'

while a dynamic import gets the result of resolving the module namespace (null).

> (await import('./foo')).x
TypeError: Cannot read property 'x' of null

@benjamn
Copy link
Contributor

benjamn commented Dec 13, 2017

@ak239 Top-level await isn't on any standards path yet, so I find it really weird that you would (1) go ahead and implement top-level await in the Node REPL, (2) admit that it's just a fun hack, but then (3) say this PR is a bad idea because you can use the top-level await hack instead. I respectfully reject this line of reasoning.

Consider what you're saying: instead of typing

> import assert from "assert"

you would have Node REPL users type

> let assert = (await import("assert")).default

every time they want to import a module? That seems like a move in the wrong direction, not to mention it sacrifices live bindings.

Negative comments are fine in a forum like this, but they should be constructive. Please read through this PR in detail before jumping to conclusions about how much of a hack it is, and especially before commenting to that effect. If you read my initial review comments, you'll see that I was pessimistic too, but I've come around to the goals of this approach (even if some details still need to be worked out).

@alexkozy
Copy link
Member

alexkozy commented Dec 13, 2017

I have not implemented top-level await in the Node REPL. I implemented and landed this approach in Chrome DevTools before Node.js. This approach was adopted by Node.js community (thanks @TimothyGu).

I am pretty sure that we should have right balance and do not write own language by rewriting original JavaScript using acorn. At least we should first figure out the spec for this kind of rewriting to have the same behavior in different repl environments including Node.js repl and Chrome DevTools console. So my constructive point: collect a lot of different use cases when repl behavior should be different vs non-repl behavior, talk about this with language spec experts, implement native repl support in V8 and finally remove top level await rewriting from codebase.

@benjamn
Copy link
Contributor

benjamn commented Dec 13, 2017

We can make progress on a REPL spec in parallel with this PR (and other REPL experimentation, for that matter). The beauty of the REPL is that it doesn't have a legacy problem. If Node makes a backwards-incompatible change to the way the REPL works, you don't have to go back and rewrite all your previous REPL commands. They worked in the past, and slightly different commands work now. In other words, we can easily change our minds about this stuff, so we shouldn't pretend we have to figure everything out before making any changes.

The fact that this PR uses Acorn to parse import declarations is an implementation detail, easily removed whenever V8 decides to expose its own parser (which would be great).

@bmeck
Copy link
Member

bmeck commented Dec 14, 2017

@benjamn I have created https://github.com/bmeck/js-repl-goal to move discussion about such a goal there, we can make the requirements on a PR basis and I don't see any reason to block PRs that look future safe.

@ak239 lots of people on this comment thread are in TC39 actually so there are plenty of language experts, we just haven't been talking fully in spec language acting as if we will try to upstream the behavior. Made a repository to talk about it, though unsure if we should propose it to TC39 without browser buy in.

@gsathya
Copy link
Member

gsathya commented Dec 14, 2017

The developer ergonomics of having this does seem compelling and there seems to some appetite from the community for having this.

Looking at the implementor concerns, maybe it would be useful to land this initially behind a flag (say, --repl=module)? This would also alleviate concerns about having both script and module like behavior.

Then, we could get better feedback on usage numbers and patterns and make a more informed decision about having this in V8 or going through a standards body.

@devsnek
Copy link
Member Author

devsnek commented Dec 14, 2017

it's already behind the esm flag, I don't really see a need to flag it in release, people who use nightly can bring up any issues they find

@TimothyGu
Copy link
Member

Closing this PR for now. I personally don't think this is a good idea, and even if it becomes one we'd need much more work to get this feature fully consistent with user expectations (see https://github.com/bmeck/js-repl-goal).

@TimothyGu TimothyGu closed this Jan 22, 2018
@benjamn
Copy link
Contributor

benjamn commented Jan 24, 2018

@TimothyGu I respectfully disagree with your personal opinion, and I believe this PR should remain open, unless you can provide some other reasons for closing it. For example, if you've discussed this with other Node/TSC members, and they have reasonable objections to it, that would count for more than one person's opinion, in my opinion. As another example that goes beyond one person's opinion, if you knew about a plan to remove acorn parsing/transformation from the Node REPL, that might pose some problems for the current implementation of this PR.

I genuinely welcome any additional explanation you can provide. In the meantime, here are my thoughts on why this PR is valuable:

Supporting static import declarations in the REPL is a good idea, especially if you're concerned about user expectations, because not supporting them would be inconsistent with user expectations. Since the REPL does not make a distinction based on .js/.mjs file extensions (and it's not clear to me how it could), I think most Node developers would expect the REPL to accommodate some combination of ESM and CJS semantics, as long as both are supported by Node. As CJS/Script becomes more of a second-class environment behind ESM, it would be strange for Node's default REPL to remain purely a Script environment, only able to use require and dynamic import().

Your comment about "just bank[ing] on const { importName } = await import(path)" can be challenged by observing that live bindings don't work with that pattern. That's important because this PR takes care to preserve live bindings in the REPL, thanks to @devsnek's patience with the feedback that he got from me and @jdalton and others. If preserving live bindings feels like needless complexity to you, I would ask that you stay involved in this discussion, rather than using your position to end it.

The discussion in this PR has focused primarily on the appropriate implementation (e.g., whether/how to use the REPL context object for the imported symbols), and details of the interactions between imported symbols and other REPL declarations. I'm confident that the general approach of this PR can accommodate whatever semantics are eventually formalized. In other words, this PR is compatible with any future development in this area. If I'm wrong about that, I'd love to know why, since that would certainly count as a reason to close this PR.

The basic behavior of REPL import declarations, as implemented in this PR, seems fairly straightforward:

  • extract any import declarations from REPL commands,
  • instantiate/evaluate the imported modules asynchronously,
  • expose the imported symbols to the REPL (enforcing appropriate redeclaration semantics), and then
  • evaluate the remaining code.

If we can agree on that behavior, then I think we can work out the precise implementation and semantics over time. Please note: this freedom is unusual in programming language design! Usually you have to get everything right early on, or you'll be stuck with the wrong decisions later. Fortunately, because we're talking about the REPL, which is a session-based programming environment, there is less of a legacy code problem, as I explained in this comment.

With all of that in mind, I believe this PR provides something really valuable: a chance for Node developers to use an experimental feature, gated behind the --experimental-modules flag, to gain a better understanding of its benefits, ergonomics, and edge cases.

@devsnek
Copy link
Member Author

devsnek commented Jan 24, 2018

@benjamn for what its worth i'm also in agreement with tim, i just hadn't gotten around to closing this yet. i would love for these features to be part of node but this pr was a hack at best.

@devsnek devsnek deleted the feature/repl-toplevel-import branch January 24, 2018 16:03
@bmeck
Copy link
Member

bmeck commented Jan 24, 2018

To note, the semantics of the REPL goal linked above are not set in stone, it has not been presented at any stage to TC39. I do think having it behind a flag would be fine and even help to define what spec a REPL goal should be aiming for; but, getting declaration semantics correct is difficult without VM support.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
esm Issues and PRs related to the ECMAScript Modules implementation. repl Issues and PRs related to the REPL subsystem.
Projects
None yet
Development

Successfully merging this pull request may close these issues.