Skip to content
This repository has been archived by the owner on Apr 22, 2023. It is now read-only.

Node should not automatically change rlimits #8704

Closed
bmeck opened this issue Nov 10, 2014 · 54 comments
Closed

Node should not automatically change rlimits #8704

bmeck opened this issue Nov 10, 2014 · 54 comments

Comments

@bmeck
Copy link
Member

bmeck commented Nov 10, 2014

This changed in 6820054 .
Which appears to have landed in 0.11.14 but there is not a clear reason why.

I feel it should be reverted.

  1. principle of least surprise is violated by altering system resource limits automatically, not even configurable via a command line switch
  2. we can no longer set a low limit for node and higher limit in child processes
  3. this alters logic where EMFILE for soft limits was respected in 0.10.x- but now only hard limits are respected in 0.11.x+
@trevnorris
Copy link

/cc @bnoordhuis

@rvagg
Copy link
Member

rvagg commented Nov 10, 2014

  1. is slightly embarrassing for Node not to be a good UNIX citizen here (see also 1)

Perhaps just a commandline flag to opt in that can be clearly documented for the poor folks stuck in the world of OSX.

@bnoordhuis
Copy link
Member

Counterarguments are that RLIMIT_NOFILE is an antiquated concept and that the default file descriptor limit on OS X is ridiculously low. (And many OS X users are fairly clueless about how to remedy that; the reason I added that code is that I got sick of closing the same bogus bug reports all the time.)

@bmeck @rvagg I don't mind reverting it but can you give a real-world example where this change in behavior is actually problematic?

@bmeck
Copy link
Member Author

bmeck commented Nov 11, 2014

@bnoordhuis it is merely unexpected and not in line with other tools. I do not have real world use cases that have done this in production, but it certainly freaked me out seeing me set ulimit and node ignore the soft limit.

@OrangeDog
Copy link

For a more concrete example, when running restricted code in a VM in a child process, I don't want my ulimit settings to be ignored.

@indutny
Copy link
Member

indutny commented Nov 11, 2014

@bnoordhuis I told ya people will notice it

@piscisaureus
Copy link

  1. we can no longer set a low limit for node and higher limit in child processes

It's not possible to restore the old rlimit after fork? Seems like a pretty simple patch.

@bmeck
Copy link
Member Author

bmeck commented Nov 11, 2014

@piscisaureus we can't have node have a low limit while spawn() creates children with higher limits than node since node will automatically ignores the soft limit

@piscisaureus
Copy link

we can't have node have a low limit while spawn() creates children with higher limits than node since node will automatically ignores the soft limit

But you can't nowadays either. There's no way to set the rlimit for child processes.
This problem seems completely academic to me.

@bmeck
Copy link
Member Author

bmeck commented Nov 11, 2014

we do not expose this on the spawn API directly but the C APIs work just fine, same if you use bash's ulimit. Generally only useful when your node process wants to limit connections and have children handle many more fds, but using node as a message broker an not wanting to have a ton of fds leaking forever I can set the limit low for the broker.

run.sh

ulimit -Sn 1000
node broker.js $WORKER_FD_LIMIT $NUM_FDS_TO_OPEN

broker.js

require('child_process').exec(
   'ulimit -Sn '+process.argv[2]+'; node -p "for (var i = 0; i < '+process.argv[3]+'; i++) {require(\'fs\').openSync(process.execPath, \'r\')}"',
   function (err, stdout, stderr) {
     console.log(stdout)
     console.log(stderr)
   }
)

@piscisaureus
Copy link

@bmeck So is this code from some project? I don't believe it.

I agree completely with bnoordhuis that RLIMIT_NOFILE is an antiquated concept. I get the feeling there is almost no-one that uses this as a feature while it there a ton of users that are bitten by it one way or another.

But perhaps that default RLIMIT_NOFILE sometimes has it's purpose, which I doubt. Right now, the first thing every experienced node user has to do on a new box is raise the rlimit. This affects all processes, even those for which a low limit would have been justified. Talk about being a good UNIX citizen.

I have very little tolerance for keeping (or adding) features just because there's a hypothetical use case.
Ignoring the limit seems like the way to go.

@bmeck
Copy link
Member Author

bmeck commented Nov 11, 2014

@piscisaureus not from a production thing, as stated above. I am very against being the odd man out and setting this without at least exposing a CLI flag not to do this.

@bmeck
Copy link
Member Author

bmeck commented Nov 11, 2014

as an aside, being a good unix citizen by automatically changing server settings that could have been set by the admin??

we also are not talking about keeping or adding a feature, we are talking about removing.

@trevnorris
Copy link

@bnoordhuis Thoughts on (my preference) ifdef'ing the change to affect OSX only or at least adding a CLI option to not automatically change this?

@yunong
Copy link
Member

yunong commented Nov 12, 2014

How could node possible know about the resource constraints, runtime environment and many other variables of the system? I emphatically agree with @rvagg and @bmeck here -- Node.js should behave like any other good Unix citizen (PostgreSQL comes to mind).

This will induce subtle bugs to users -- I can see this taking many hours to track down and fix -- since it's being done implicitly. The costs far outweigh the benefits here.

@OrangeDog
Copy link

@piscisaureus and @bnoordhuis here's my production code that this will break, if you don't believe bmeck.

var options = {
    encoding: 'utf8',
    timeout: 1500,
    killSignal: 'SIGKILL'
};

var command = 'ulimit -t 2 && ulimit -f 0 && ulimit -v 1000000 && ulimit -n 128 && nice node ' + path.join(__dirname, 'sandbox.js');

var child = child_process.exec(command, options, function(err, stdout, stderr) {
    var results = stdout.split(/[\r\n]+/).filter(function(line) { return line !== ''; });
    err = err || stderr.split(/[\r\n]+/).filter(function(line) { return line !== ''; }).join('\n');
    callbackFn(err, results);
});

child.stdin.write(script);
child.stdin.end();

@bnoordhuis
Copy link
Member

Thoughts on (my preference) ifdef'ing the change to affect OSX only or at least adding a CLI option to not automatically change this?

An #ifdef introduces different behavior on different platforms, that's undesirable, but a flag is acceptable.

@bmeck
Copy link
Member Author

bmeck commented Nov 12, 2014

@bnoordhuis introducing different implicit behavior is what this rlimit code is doing already. If we flag it I am going to want big red text for sys admins to find quickly.

@brycebaril
Copy link

Making node act subtly different than it used to and different than most other applications is going to add to the support burden. It quite possibly will end up just being a shell game of support issues, moving them from one place to another.

Independent of whether or not we may collectively think rlimits are a useful feature, it is a statement that "Node.js knows better than unix" which isn't a positive statement. The story "Why does Node.js ignore rlimits? Well, developers writing in Node are dumb." is not a message we should want to send. Being anything but a good unix citizen has a discrediting effect on Node and only lends strength to the "Node Hipsters" image.

@sam-github
Copy link

Node doesn't ignore hard limits. It can't ignore hard limits, the system imposes them.

Soft limits are just defaults, much like the process group, fds 0,1,2, the controlling terminal, the default handling of SIGHUP, the environment, etc. Saying that node changing its process settings on startup to be more appropriate to that of a unix server than a generic application is some violation of unix convention or "good unix citizenship" is bewildering.

I think this an excellent change, though I'm sure it will hurt someone. Every change can do that. IMO, we should make node great for the majority use-case, and still be workable for the minority here by adding:

  • command line settings of the file limit (for those who want to artificially set lower limits)
  • exposure of process.ulimit() (or the more modern get/set rlimit equivalents) so code can choose its own limits (this would be particularly handy for unit tests, I suspect)

@rmg
Copy link

rmg commented Nov 12, 2014

/me deletes many words

What @sam-github wrote while I was typing.

@OrangeDog
Copy link

@sam-github that's all well and good, but there already exists a way to change limits. Why break that and then add two more?

@bmeck
Copy link
Member Author

bmeck commented Nov 12, 2014

@sam-github it ignores soft limits. I never thought it changed hard limits. Find me other server software that does this, so far my only examples are GUI applications that take this approach to things. Do you want to explain why all the blog posts on ulimits are now incorrect when people expect to change the limits to something other than the hard limit? Do you want to explain to sys admins why soft limits don't work with node? Even if we expose process.ulimit what are we going to explain to windows people, more broken APIs, great. Do you want countless articles about an old solved problem replaced by an new problem that does not have the SEO to compete with all the existing docs?

This does not sound nicer to deal w/ than learning 1 thing; we would replace it with more APIs, avoiding convention, and the expectations you to know the system better than your users. Cause why would anyone ever have a use case or need backwards compatibility for something back from the 80s.

Find me other server software before you start saying this won't affect people / no one will notice.

How do you think this issue came into being? A soft limit was ignored, scripts failed, and a huge question of why does node not respect soft limits when ALL other programming environments do. Remember that node is a runtime/programming environment not a super specialized tool that should ALWAYS bump the soft limit.

@sindresorhus
Copy link

This was actually a very good change for OS X users. The default limit is ridiculously low and it forces every users to deal with this at some point. That's usually fine for hard core developers, but do realize a lot of beginners are now using Node because of Yeoman, Grunt, Gulp, Bower and other front-end focused tools. The low limit has caused countless of support issues and annoyed users not understanding the EMFILE errors. Sometimes the right thing is being pragmatic than theoretically correct. I would strongly suggest this is kept as default on at least OS X. I'm fine with an opt-out flag.

@SBoudrias
Copy link

I agree with @sam-github and @sindresorhus on this one. Considering the majority use case and offering a way to by-pass for the minority use case would be the way to go.

This is not a minor fixes that won't impact node users. It fixes a major issue everyone starting with node hit at some point.

@sam-github
Copy link

@bmeck

Find me other server software

Apache docs suggest it attempts to up its ulimits by calling setrlimit: http://httpd.apache.org/docs/2.2/vhosts/fd-limits.html

Nginx also has builtin support for upping the file limit, though it default to not doing so. nginx doesn't expect every admin to have to muck around calling ulimit to change the nginx fd limits, it allows it to be done via conf: http://nginx.org/en/docs/ngx_core_module.html#worker_rlimit_nofile

And node is used for more than server software, it seems people are hitting the limits on OS X even for non-server usage (hopefully, people operating node as a server are upping its hard limits if they want good concurrency, which is kind of the point of node as a server).

before you start saying this won't affect people / no one will notice.

Please, what I said was the exact opposite of what you are attributing to me: "I think this an excellent change, though I'm sure it will hurt someone."

You, for example. Its very frustrating, I'm sure your debugging time was quite painful, but you are only one person, there are many others affected by current behaviour who don't even know what ulimit is, much less how to use it to up the limits for node, or why their gulp CLIs are failing, and are not particularly sympathetic to a "we could fix this in node, but instead would like you to spend more time configuring your shell environment".

@bmeck
Copy link
Member Author

bmeck commented Nov 15, 2014

@sam-github a quick grep off Apache httpd's svn shows it setting some things, but not automatically and not RLIMIT_NOFILE ( https://gist.github.com/bmeck/f08f26e53a330c958e07 ). I'm not arguing we shouldn't allow changing the limits, I'm arguing against doing it automatically / implicitly.

@sam-github
Copy link

@bmeck I understand what you are arguing, I just disagree.

I think more people would be hurt by increasing nofiles to the hard-limit being opt-in, as you want, than by it being opt-out.

To be hurt by opt-out, you need to somehow depend on node FAILING, which is possible, as you have proved, but I don't think its as common as wanting node to SUCCEED.

I guess we need more votes. :-) Anybody who wants to jump in with links to down-stream bugs reported by poor users not intimate with ulimits, please do.

Also, if we had an opt-in API: process.ulimit({nofile: "max"}) would eventually show up at the top of every script used on OS X, and I think it would be widely viewed as a process.makeNodeActuallyWork() API, and APIs like that are unfortunate. But someone could do that, make it opt-in, and then wait for a series of bug reports along the lines of "why on earth is this not the default?", at which point this conversation could continue.

@bmeck
Copy link
Member Author

bmeck commented Nov 17, 2014

@sam-github as @piscisaureus pointed out on irc the default limit on linux kernels is also low (soft 1024, hard 4096). I don't see this as a process.makeNodeActuallyWork since I spend a lot of time doing system management, and also don't hear this complaint much from linux people. As for the open issue count, it is fairly small even with closed issues on Node's issue tracker.

I think the votes will come down to sys-admin vs non for how this should work. Non sys admins would welcome the change, even if they eventually have to learn how to bump the hard limits. Sys admins would see this as a wart and would want process.ulimit({nofile: 'respected'}).

@sam-github
Copy link

I've run/adminned linux systems since the early 90's, and @rmg is most definitely sysadmin/devops as well as a developer, so while there is a debate, its a mischaracterization to describe it as sysadmin vs others.

@bmeck
Copy link
Member Author

bmeck commented Nov 17, 2014

@ghost
Copy link

ghost commented Nov 18, 2014

The problem is that OSX has a particularly bad default configuration but people choose OSX in the first place because of its supposed "just works" ideology which it can never completely fulfill. Extra implicit magic is bad, especially for something with as many broad use-cases as node. What about just changing the EMFILE error message to be more informative and inform OSX users how to fix the problem?

@OrangeDog
Copy link

@bmeck I don't know what linux you're using, but ulimit -o isn't a thing in bash.

@sam-github
Copy link

Speaking of classic unix software tool philosophy and ulimits, ran into this last night while hacking on my favorite file watcher: https://github.com/clibs/entr/blob/master/entr.c#L137

@rmg
Copy link

rmg commented Nov 18, 2014

Part of the disagreement in this thread seems to be predicated on the notion that soft limits fall under the sysadmin domain.

From a FreeBSD getrlimit(2) and Darwin getrlimit(2):

Only the super-user may raise the maximum limits. Other users may only alter rlim_cur within the range from 0 to rlim_max or (irreversibly) lower rlim_max.

From a POSIX setrlimit(3):

Soft limits may be changed by a process to any value that is less than or equal to the hard limit. A process may (irreversibly) lower its hard limit to any value that is greater than or equal to the soft limit. Only a process with appropriate privileges can raise a hard limit.

From a Linux setrlimit(2):

an unprivileged process may only set its soft limit to a value in the range from 0 up to the hard limit, and (irreversibly) lower its hard limit. A privileged process (under Linux: one with the CAP_SYS_RESOURCE capability) may make arbitrary changes to either limit value.

These strongly suggest to me that it is perfectly appropriate (and expected) behaviour for any process, including node, to adjust its own soft limits as it sees fit because they are tools for the developer to use. In other words, soft limits are dev tuneable and hard limits are sysadmin tuneable.

My dev-hat opinion:

  • this saves me explaining ulimit to people deploying my apps

My ops-hat opinion:

  • this saves me the step in every deployment where I increase the soft limit
  • this does not impede my ability to enforce resource restrictions because that is done with hard limits

@bmeck
Copy link
Member Author

bmeck commented Feb 6, 2015

since 0.12 has landed I'm going to close this unless there are further comments and consider node's stance on unix conventions as being wavy when discussing

@jasnell
Copy link
Member

jasnell commented Jun 3, 2015

@bmeck @piscisaureus @sam-github @joyent/node-coreteam @bnoordhuis ... what do we want to do with this one? Worth keeping open? Is this resolved?

@rvagg
Copy link
Member

rvagg commented Jun 3, 2015

argh! I don't think either @bmeck or I want to give up on this but it's landed so it'd probably be a breaking change for both io.js and 0.12 to revert it.

@bmeck
Copy link
Member Author

bmeck commented Jun 3, 2015

i would prefer it be reverted, but yes it would be a breaking change to revert, though was a breaking change to add as well.

@jasnell
Copy link
Member

jasnell commented Jun 4, 2015

Well, should we put it up for TSC review?
On Jun 3, 2015 4:48 PM, "Bradley Meck" notifications@github.com wrote:

i would prefer it be reverted, but yes it would be a breaking change to
revert, though was a breaking change to add as well.


Reply to this email directly or view it on GitHub
#8704 (comment).

@rvagg
Copy link
Member

rvagg commented Jun 4, 2015

sure, let's have that discussion at least

@trevnorris
Copy link

I don't really care much either way, but a revert should require much more helpful error messages about what happened. Make it very easy for newb devs to find a fix.

@jasnell
Copy link
Member

jasnell commented Jun 4, 2015

Ok, I've marked it tsc-agenda but let's lay out some of the basics and see if we can work it out here first.

  1. Is this change limited to Node.js or did it land in io.js also?
  2. What is the concise one sentence reason why the change was made? What does this fix?
  3. What breaks if it's reverted?
  4. What is the work around / alternative if it is reverted?

@piscisaureus
Copy link

I stated my opinion here and I don't have much to add to it. At some point I also did some digging to find other software that changes the soft limit, and it turned out that chrome does it too (and others that I don't remember).

The only argument I am susceptible to is that node shouldn't raise the limit for the (non-node) child processes it may spawn. Therefore:

-1 on reverting this patch
+1 on restoring the original ulimit after fork()

@jasnell

Is this change limited to Node.js or did it land in io.js also?

It landed before the fork.

What is the concise one sentence reason why the change was made? What does this fix?

OS X has a really low soft "open file" limit. This makes node crash with an 'EMFILE' error when the number of open files (or network connections) goes up.

What breaks if it's reverted?

Node starts crashing unexpectedly again.

What is the work around / alternative if it is reverted?

The workaround is that all node users have to manually raise the limit (for all processes) with the ulimit command. This has to be done after every reboot.

@OrangeDog
Copy link

A bit biased. A valid workaround is to start node or your shell with increased limits.
Doesn't have to be every process, and doesn't have to be manually re-done every reboot (.bashrc could deal with shell limits, or aliasing node).

@sindresorhus
Copy link

It would be really disappointing if this was reverted. The majority of node users benefited from this change. You could of course improve the error message (should be done regardless), but the reality is that every single OS X user would have to fix this manually, which makes it harder to get started with Node, and wastes a LOT of time.

@Fishrock123
Copy link

How about we only turn this on for OS X? I am very -1 on anything other than having this (or equivalent functionality) as the default for OS X.

@rmg
Copy link

rmg commented Jun 4, 2015

I still don't understand the objection to the behaviour. It's not changing the hard limit, which is the only limit that the sys admin (aka, user) can reasonably expect to be obeyed. It seems more reasonable to me that a handful of apps adopt some boilerplate to lower their soft limit on startup than have the majority of apps adopt some boilerplate to raise their soft limit like the commit in question does already.

With the current behaviour, the workaround boilerplate is:

// this app has a memory leak, so lower the to hide our ENOMEM with an EMFILE crash
require('posix').setrlimit('nofile', {soft: 100});

If reverted, the boilerplate is:

// this app can handle a lot more than the soft limit because node is awesome
var posix = require('posix');
var limits = posix.getrlimit('nofile');
posix.setrlimit('nofile', {soft: limits.hard});

@bmeck
Copy link
Member Author

bmeck commented Jun 4, 2015

@rmg my problem is:

  1. are not following convention, which even if this is a common problem saw breakage when this rolled out. notably it broke the sandbox module.
  2. are overriding any soft limit a user may be using, thus forcing hard limits.
  3. cannot recover the original soft limit.

@rmg
Copy link

rmg commented Jun 4, 2015

@bmeck thank you, those are excellent points that I hadn't been able to distill from this thread.

Somewhat ironically, between your points and my own boilerplate examples, I suddenly care a lot less about this behaviour one way or the other.

@piscisaureus
Copy link

The resolution was:

  • The current (rlimit-changing) behavior will be maintained.
  • Node/libuv should restore the ulimit for the child processes it spawns.
  • Support for leaving the ulimit untouched by setting an an environment variable will be added.
  • Bradley Meck is going to create (node and libuv) patches for this.

So it doesn't need to be discussed at the TSC meeting of June 17th.

@rvagg
Copy link
Member

rvagg commented Jun 17, 2015

can you remove the label please?

@piscisaureus
Copy link

Closing this issue; let's continue when the PRs come in.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests