Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal: --config=file command line argument #11997

Closed
jasnell opened this issue Mar 23, 2017 · 22 comments · Fixed by #12028
Closed

Proposal: --config=file command line argument #11997

jasnell opened this issue Mar 23, 2017 · 22 comments · Fixed by #12028
Labels
cli Issues and PRs related to the Node.js command line interface. feature request Issues that request new features to be added to Node.js.

Comments

@jasnell
Copy link
Member

jasnell commented Mar 23, 2017

In a number of conversations (e.g. #11888), the idea of introducing either a NODEOPT environment variable that would allow arbitrary command line flags to be set via the environment, or a .noderc configuration file that could provide configuration options has been discussed. These choices have their merits and their disadvantages. One of the key disadvantages for me is the potential for the configuration collisions across applications and spooky action coming from environment settings that are unknown to the user. I much prefer explicit opt-in solutions.

One approach that could work that can potentially fill the gap is allowing configuration settings to be set using a .noderc type configuration file that is explicitly passed in via command line argument:

$ node --config=/some/path/.noderc app.js

If a user wanted to make use of environment variables in various environments, they could easily do so via variable expansion:

$ export NODEOPT=/some/path/.noderc_production
$ node --config=${NODEOPT}
@jasnell jasnell added the feature request Issues that request new features to be added to Node.js. label Mar 23, 2017
@mscdex mscdex added the cli Issues and PRs related to the Node.js command line interface. label Mar 23, 2017
@jchip
Copy link

jchip commented Mar 23, 2017

The key requirement I wanted to address with #11888 is being able to do the preload even in node invocations I didn't start explicitly.

For example, if I do node -r ./preload.js app.js, and app.js in turn invokes node, then that would not have the effects of preload.js.

How can I get that with this approach?

@addaleax
Copy link
Member

I’d really like it if we could set env vars and flags in (almost) 1:1 correspondence. That would make our API more consistent and give users control over which node processes they want to target.

Also, I’m not really a fan of adding a config file for Node, except maybe for the REPL (because that’s an actual application that happens to ship with the platform).

@mscdex
Copy link
Contributor

mscdex commented Mar 23, 2017

Using an environment variable for each flag might cause issues/headaches if a node process forks, since process.env is used as the default environment and some people may not be expecting flags to leak into such child processes. *shrug*

I realize there are already other environment variables that could have similar issues, but I think adding a new one per flag could make it worse?

@addaleax
Copy link
Member

@mscdex That’s more or less what I meant – people can use environment variables if they want that recursive behaviour, and flags otherwise. Imho that’s a pretty traditional approach, and one of the primary reasons environment variables exist.

@jasnell
Copy link
Member Author

jasnell commented Mar 23, 2017

The key problem I have with that approach is that the environment variables would have an effect even if the users don't want to use them, and many users may not be aware of them at all. For some environment variables (like NODE_DISABLE_COLORS) that's not a big deal at all. For others (like the proposed NODE_PRELOAD) it could have huge unforeseen impact. It's too indiscriminate to allow all flags to be set that way -- note that there are several flags that we explicitly decided would not be usable as environment variables specifically because of security concerns. It's far too easy for untrusted code to set environment variables in ways the average user typically would never notice.

@addaleax
Copy link
Member

note that there are several flags that we explicitly decided would not be usable as environment variables specifically because of security concerns

Right, that’s why I said “almost” – I can see that there are valid reasons for exceptions. I don’t think NODE_PRELOAD should be such an exception, though.

It's far too easy for untrusted code to set environment variables in ways the average user typically would never notice.

I’d be really curious to a scenario in which this is true… if you allow untrusted code to run and set arbitrary environment variables, it seems to me you’d have a problem anyway (messing with HOME, SHELL, LD_PRELOAD, etc.).

@jasnell
Copy link
Member Author

jasnell commented Mar 23, 2017

I'd be really curious to a scenario in which this is true

One scenario is a variation of this: http://blog.npmjs.org/post/141702881055/package-install-scripts-vulnerability ... Install scripts run under the users identity and permissions and have the ability to modify both file system and environment variables in persistent ways that may not be detected. Yes, users can turn off install scripts per the vulnerability report, but how many users know about that and how many actually do turn them off? Sure, this is not a problem that is specific to the use of environment variables and there are all sorts of other risks and vulnerabilities this would allow, but it's one way that a NODE_PRELOAD environment variable could be abused to inject malicious code in every Node.js process running under a user's account. I don't think we should be making it any easier.

@jasnell
Copy link
Member Author

jasnell commented Mar 23, 2017

(and yes, the install script problem generally applies to all environment variables including HOMe, SHELL, LD_PRELOAD, etc... which is why install scripts really ought to be turned off by default)

@jchip
Copy link

jchip commented Mar 23, 2017

@jasnell for the security concern, I'd like to mention what I did in #11888, where NODE_PRELOAD can only specify modules from the installed Node's global node_modules. I know you said you are stewing over it. I wonder if that's an option opened for discussion?

@jasnell
Copy link
Member Author

jasnell commented Mar 23, 2017

Yeah, I haven't completely ruled it out, by any means. I'm still exploring and kicking around ideas. My key concern here is that I don't want to make it easier for attackers. I definitely not saying that this would introduce a new security risk but I'm afraid that it could make an existing one easier to exploit. So I'm wanting to be very careful in how we proceed.

@sam-github
Copy link
Contributor

I don't object to this proposal, but it doesn't address the use-case that NODEOPT="-r node-report" or NODE_PRELOAD=node-report does.

This use-case requires the developer of the code that runs node to pre-anticipate that they want runtime configuration, and to implement it by writing their start script appropriately. They could do this already, actually, with some variant of:

scripts: {
  "start": "node $(cat ~/.nodeopt) server/server.js"
}

However, that doesn't solve my use-cases.

The problem I've run into (not just once, but over-and-over again, with strong-pm/strong-supervisor, with several indepent projects at multiple companies building docker containers for node servers, when webexed into customer sites at 3am, when looking at building tooling for running node apps in the cloud, etc.) is how to start and configure thirdparty apps, apps that I didn't write, that weren't written to be runtime configurable, and where I can't programmatically rewrite their start scripts. This --config suggestion, and all the "just hand modify the app's start script" suggestions are not possible under these conditions.

There is only one convention for starting node servers, and that is npm start: https://docs.npmjs.com/cli/start

What this means in practice is that node server's aren't directly runnable by node... they are actually shell scripts. The script may be quite simple ("node --abort-on-uncaught-exception server/server.js"), or might be more complex ("./start.sh").

This means that tooling that packages and deploys generic node servers, ones that are written in the conventional way, but have not installed any production tooling, and have not specially written their start script, have no way of being customized in production. I believe this is one cause of the poor production debug and deploy tooling situation for node ATM.

#2738 and strongloop/strong-supervisor#56 (comment) for example, --max-old-space is something a cloud environment might want to set, and could do with NODEOPT=--max-old-space=XXX, but at the moment our only choice is to exhaustively tell every single app author about the CLI opt, and how to tune it.

"NODEOPT=--abort-on-uncaught-exception" would also be useful.

Deployment tools aren't the only use-case, @jasnell (I think) mentioned how something like NODEOPT could surprisingly apply to node process spawned by npm life-cycle scripts. Its true, it could surprise people. However, life-cycle scripts can literally NOT be modified by a user, if I do npm install, and in the middle of it one of the install scripts spawns a node child, and its dying, and I want to inject node-report, because something is happening that I can't reproduce when running the package script standalone, I'm currently dead in the water. npm downloaded the package and ran the script, I don't have any way to modify the script before npm runs it. Being able to do NODEOPT="-rnode-report --abort-on-uncaught-exception" would give me control. Spooky action at a distance? Sure, it is, but when you can't get close to the misbehaving code, you need action-at-a-distance.

I don't think every node option should be valid in NODEOPT (-e makes no sense), but right now we are doing a one-by-one addition of env vars.

More related issues (non-exhaustive, just from my notes):

@jasnell
Copy link
Member Author

jasnell commented Mar 23, 2017

After speaking with a few folks about it in detail, I'm going to drop my general objection to NODEOPT but still definitely prefer it over the NODE_PRELOAD proposal.

My preference would be for:

  • Any security-related command line flag (such as --tls-cipher-list and the --revert= flags) to be unsupported in NODEOPT by default, perhaps with an explicit opt in that would allow those to be used. (The --tls-cipher-list flag was one of those that was specifically limited to command line when it was introduced).
  • child_process to require an explicit opt-in in order to automatically inherit NODEOPT.
  • For Node.js embedders (e.g. electron) to explicitly be required to opt-in to NODEOPT support.

@gibfahn
Copy link
Member

gibfahn commented Mar 23, 2017

Any security-related command line flag (such as --tls-cipher-list and the --revert= flags) to be unsupported in NODEOPT by default, perhaps with an explicit opt in

+1, I think we'd definitely need opt-in/opt-out support with this feature.

child_process to require an explicit opt-in in order to automatically inherit NODEOPT

Do you mean that the parent would need a command line parameter, or that you'd have to opt-in in code in the child? I'd be +1 on the first, the second seems unnecessary (and would break a lot of the use cases for this).

When we say opt-in or opt-out, I'm assuming the only way to do that would be with a command-line option like --nodeopt=on|off|danger, and that you'd only have to do that in the parent node process. Is that correct?

@addaleax
Copy link
Member

child_process to require an explicit opt-in in order to automatically inherit NODEOPT.

I kinda don’t get what the point of having an environment variable is then? They are specifically there for the situation where you want to pass options to an entire process hierarchy…

@jchip
Copy link

jchip commented Mar 24, 2017

My requirement is to be able to get the -r effect even if node is invoked without any command line parameters. The NODEOPT approach is more extensive.

Any kind of required opt-in/opt-out via command line parameter nullifies the intend of the feature, because if I have control of the command line parameter, then whatever I need to do is already possible as is.

There've been a lot of ideas going around, but so far I see that having some kind of ~/.noderc file is receiving the least -1s (none that I've seen actually).

Here is another idea I've been thinking:

  • Allow a file ~/.nodeopt, inside which you can specify one command line parameter per line
  • Have an env var NODEOPT_RC. If it's set to true or 1, then node automatically reads ~/.nodeopt for command line parameters.

Is this an acceptable approach?

@jasnell
Copy link
Member Author

jasnell commented Mar 24, 2017

@addaleax ... yes, I get that, but there are a number of backwards compatibility concerns to consider. Command line args are not currently inherited automatically by child processes. Existing child processes could fail or experience issues if they suddenly start automatically inheriting command line options set by NODEOPT -- keeping in mind that the person launching the node process may not be even aware that some deep dependency is even launching a child process that could be impacted.

@bnoordhuis
Copy link
Member

I could get behind a NODEOPT flag as long as it's ignored when getuid() != geteuid(). There is precedence in other languages/runtimes, e.g., PERL5OPT.

That said, I'd be remiss if I didn't point out the drawbacks:

1a. Set-and-forget. You set it in your .bashrc and six months later you run into this obscure issue that takes you two days to debug. Turns out it's that NODEOPT=--obscure-flag you copy/pasted from Stack Overflow without really understanding what it does. That's 90% of modern-day programming right there for you.

1b. A variation on 1a that affects us: people reporting issues that we cannot reproduce because the reporter neglects to mention they set NODEOPT. Technical solutions like a --safe-mode flag won't help because there is always a subset of people that simply won't use it before reporting an issue.

2a. People tend to want to rewrite the environment variable when passing it to child processes, e.g. to change the location of the log file, but inevitably end up getting it wrong. Not really our problem but still a problem. (Also, how does it interact with process.execArgv?)

2b. Core probably needs to apply rewrite rules as well, e.g., for port numbers. We already do that but it would have to be extended to cover NODEOPT as well. Argument parsing is a messy affair.

@jasnell
Copy link
Member Author

jasnell commented Mar 24, 2017

@jchip ... just to follow up on the earlier conversation around NODE_PRELOAD, after running it through a few scenarios, I don't think limiting it to just global installed modules actually buys much. For the security concerns I have, the issue is less about where those are loaded from and more around the lack of visibility when security related settings are used or modified.

@jasnell
Copy link
Member Author

jasnell commented Mar 24, 2017

@gibfahn ... for child_process, I mean opt-in via an option when launching a child_process with fork() (and specifically only with fork())

const fork = require('child_process').fork;
fork('script', {nodeopt: true});

We could allow the default value for the nodeopt option to be modified at the parent but I really don't like the idea of blindly passing it along.

@sam-github
Copy link
Contributor

I have some code around that implements NODEOPT, but doesn't yet blacklist options that aren't safe (-e, -p, ...), I'll PR it. It uses SafeGetenv(), so NODEOPT isn't inherited across setuid().

@jchip I've seem some fairly strong -1s on ~/.noderc in other threads, though not here because I don't think its being considered seriously. I'm not keen on it. Injecting env vars into production envs can usually be done easily (through cloud consoles, docker run -e ..., etc). Injecting files isn't so easy. They also have persistence problems, the files exist past the lifetime of the processes using them, whereas env vars exist only in process memory so disappear with the process.

@bmeck
Copy link
Member

bmeck commented Mar 24, 2017

I think we should use a whitelist not blacklist for any security concerns.

@sam-github
Copy link
Contributor

@bmeck pls comment on the PR, there isn't a list per-se, black or white, and it looks like a "whitelisty" approach would mean adding a case statement into node.cc for almost every v8 option, which would be pretty hairy to maintain I suspect you will see.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cli Issues and PRs related to the Node.js command line interface. feature request Issues that request new features to be added to Node.js.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

8 participants