Skip to content
This repository has been archived by the owner on Apr 22, 2023. It is now read-only.

(cluster) --debug-port is forced #8159

Closed
soyuka opened this issue Aug 13, 2014 · 24 comments
Closed

(cluster) --debug-port is forced #8159

soyuka opened this issue Aug 13, 2014 · 24 comments
Labels

Comments

@soyuka
Copy link

soyuka commented Aug 13, 2014

Hi,

https://github.com/joyent/node/blob/master/lib/cluster.js#L291

    if (!hasDebugArg)
      execArgv = ['--debug-port=' + debugPort].concat(execArgv);

Why is debugging forced in the cluster module? Is it a wanted behavior? Did someone forgot to remove it?

It's breaking some code of ours (Unitech/pm2#610 (comment)) and leading to a memory leak:

  • restart a cluster a lot
  • debug port is going crazy and breaks node and the V8 engine

May I PR on this to remove those debug forcing stuff?

@sam-github
Copy link

What do you mean by "forced"? node was called with the --debug flag, debug was explicitly requested, not "forced".

In v0.11, if you call node --debug=5000 clustered-app.js, then the master should listen on 5000, worker 1 on 50001.

In v0.10, calling node with --debug=9000 would get a stream of Failed to open socket messages as both master and workers all try to open the same port.

Why does the v0.11 behaviour cause a memory leak, and what do you mean by debug port "going crazy"?

@soyuka
Copy link
Author

soyuka commented Aug 15, 2014

First, I'm saying forced because without adding a --debug option on the master, the cluster is having the --debug flag, and we might not want it because of ports.
Those ports are increasing every time we launch a process again and reaches the 65535 limit. V8 throws Debug port must be in range 1024 to 65535. but the debug port is still increasing and become a memory leak.

I could try to find a proper way of reproducing this, but if the master has no --debug argument why should the child had one?

@Unitech
Copy link

Unitech commented Aug 15, 2014

Yes the --debug flag is set by default (https://github.com/joyent/node/blob/master/lib/cluster.js#L291).

Concerning the debug port, the cluster module increment the id every time a new worker is created. So assigning .debugPort + worker_id (https://github.com/joyent/node/blob/master/lib/cluster.js#L276), cause at a certain point to assign a port that is outside the port limit and so I get this:

Debug port must be in range 1024 to 65535.
Usage: node [options] [ -e script | script.js ] [arguments] 
       node debug script.js [arguments] 

Options:
  -v, --version        print node's version
  -e, --eval script    evaluate script
  -p, --print          evaluate script and print result
  -i, --interactive    always enter the REPL even if stdin
                       does not appear to be a terminal
  --no-deprecation     silence deprecation warnings
  --trace-deprecation  show stack traces on deprecations
  --v8-options         print v8 command line options
  --max-stack-size=val set max v8 stack size (bytes)

So the worker is directly killed, then spawned back, then killed, till V8 throw strange errors (EMFILE...).

@sam-github
Copy link

--debug flag is set by default (https://github.com/joyent/node/blob/master/lib/cluster.js#L291)

I don't see the --debug set on that line, only --debug-port, and --debug-port doesn't cause the port to be opened (not until SIGUSR1).

I agree that with enough workers, or a high-enough base port, eventually the debug-port strategy of "base+worker ID" won't work. That is a problem.

But the "memory leak" and "debug forced" still isn't making sense to me.

Missing from this report is how node is being called. node --debug, node --debug=XX, .. something else?

@cjihrig
Copy link

cjihrig commented Aug 15, 2014

@sam-github with regards to the "base+worker ID" problem - how do you feel about letting the user pass in their own execArgv (see #7750, although it would require slight tweaking for this case)? The current solution is a good baseline, but for advanced cases, I think we should let them pass their own port numbers. The alternative I see is a linear port scan. We should also provide some rollover mechanism to keep the value between 1024 and 65535.

@soyuka
Copy link
Author

soyuka commented Aug 18, 2014

+1 @cjihrig, we already hack the fork execArgv directly from the cluster.settings option. But the debug port should be the one the user choose when forking not automatic anyway.

Linear port scan is not really great and should not be needed only for debugging tasks.

@sam-github
Copy link

User doesn't have access to the .fork() if they are using a supervisor or cluster wrapper. Forcing debug port for workers to be chosen by the user just pushes the problem back into their lap, and then when they choose the master base-port, + the worker ID, and it breaks, we get to say "that's your problem", so I'm not that excited about it.

That said, since it means cluster debugging wouldn't work out of the box with node 0.12, much like it doesn't for v0.10, it gives me a feature to add to strong-supervisor, where I'd implement a debug port allocation strategy for all workers.

To see how unworkable debugging clustered apps was in v0.10, check out the workarounds in this blog: http://strongloop.com/strongblog/whats-new-nodejs-v0-12-debugging-clusters/. That blog is pretty much a description of the problem to be solved.

One thing here is that cluster IDs always go up, workers don't get the "lowest possible" ID. That make sense, I don't propose changing it.

But for port, we could perhaps always allocate "debug ports" as the lowest possible above the base debug port. Until workers exit, the effect would be that the port allocation would be exactly as it is now, where worker debug port === base port + worker ID. But after workers start to exit, they would re-use the lowest available port.

This would mean that the debugPort for a worker should be exposed as a property on a Worker, since its not necessarily derivable from simply knowing the worker ID.

If --debug/--debug-brk don't propagate automatically from a master down to the workers, strong-supervisor would probably do something like the above itself, modifying the execArgv, or using whatever other mechanism is available.

As a no-one-has-time-for-this idea... it might be possible to ONLY have the master open a debug TCP port, and extend the debug protocol to include selection of a worker ID. The master would then use node cluster IPC to forward the debug commands to the currently selected woker. This would be much more like debugging "threaded" apps in other languages. Since the debug agent is getting re-written anyhow for v0.12 (having gotten deleted from v8), maybe this isn't out of the question.

@bnoordhuis and @bajtos, what do you think?

@bnoordhuis
Copy link
Member

As a no-one-has-time-for-this idea... it might be possible to ONLY have the master open a debug TCP port, and extend the debug protocol to include selection of a worker ID. The master would then use node cluster IPC to forward the debug commands to the currently selected woker. This would be much more like debugging "threaded" apps in other languages. Since the debug agent is getting re-written anyhow for v0.12 (having gotten deleted from v8), maybe this isn't out of the question.

The V8 debugger protocol is close to being deprecated and will probably get dropped by upstream sometime in the next 6-12 months. So, while your suggestion would work as an interim solution, that means we'll have to revisit it in the not too distant future.

(In case you're wondering what the deal is with the debugger protocol: the plan is to replace it with Chrome's remote debugging protocol. I don't know if your suggestion would work with that, too early to tell.)

@sam-github
Copy link

I got the impression from the v8 list that the new chrome remote debugging protocol is implemented still by using V8's DebugCommandProcessor.processDebugJSONRequest, and from you that node would remimplement the TCP debug protocol in a thread, calling into the above request handler. Is that a bit up in the air, still?

My thinking was that if node is forced to implement its own debug protocol (or reimplement and maintain itself the old debug protocol), then it makes it easier for us to extend it with cluster-specific commands (before, we would have had to maintain a fork to v8, which is uncomfortable).

Are you anticipatining that Node will eventually move over to a version of Chrome's socket.io-based debug protocol? Implementing it anew, or perhaps ripping it out of Chrome?

@bnoordhuis
Copy link
Member

I got the impression from the v8 list that the new chrome remote debugging protocol is implemented still by using V8's DebugCommandProcessor.processDebugJSONRequest, and from you that node would remimplement the TCP debug protocol in a thread, calling into the above request handler. Is that a bit up in the air, still?

That's the plan for v0.12. But for v0.13...

Are you anticipatining that Node will eventually move over to a version of Chrome's socket.io-based debug protocol? Implementing it anew, or perhaps ripping it out of Chrome?

...the plan is to copy Chrome's implementation (where the word "copy" is part literal, part conceptual.)

Chrome's debugger is a mix of C++ and JS. The C++ glue will have to be redone for node.js but the JS code can probably be used as-is.

Unitech added a commit to Unitech/node that referenced this issue Aug 28, 2014
@soyuka
Copy link
Author

soyuka commented Aug 28, 2014

I don't see the --debug set on that line, only --debug-port, and --debug-port doesn't cause the port to be opened (not until SIGUSR1).

Still, the debug-port is increased on every call that you do and then the Debug port must be in range 1024 to 65535. error occurs. It's not really safe and should be fixed before v0.12 shouldn't it?

Could we only make this call when a --debug flag is present? It'd be a temporary fix of course.

@cjihrig
Copy link

cjihrig commented Aug 28, 2014

It'd be a temporary fix of course.

Famous last words. If you're looking for a workaround for that error, couldn't you just ensure that the port is in the valid range instead of completely removing the flag altogether?

@soyuka
Copy link
Author

soyuka commented Aug 28, 2014

What are you talking about? I'm just saying that --debug-port should only be append if execArgv.debug is set like you proposed early. Or left to the user choice for instance.

Ofc it should be in a valid range, and that's the main part of the issue because after a while it raises over the 65635 limit and we can't do anything about it.

But as @sam-github mentions the debug part of node will be rewritten so we might find a better solution to those cluster ports later. In the mean time just left this option to the user choice and if port is already taken throw an error.

@mjseidel
Copy link

@joyent/node-coreteam nominate:0.12.3

This is a nasty surprise for callers of cluster.fork().

mjseidel pushed a commit to mjseidel/node that referenced this issue Apr 13, 2015
See: nodejs#8159

Per above discussion, cluster.fork() currently sends --debug-port to
forked processes regardless of whether debug is enabled, and more
importantly, without any bounds checking. This will rather unexpectedly
crash and Node process that forks enough times.

This patch simply bounds checks --debug-port and doesn't set arg if out
of bounds (V8 requires 1024 < debug-port < 65535). This will prevent
surprises to callers of cluster.fork() while waiting for the debug part
of node to be rewritten as mentioned in issue discussion above.
mjseidel pushed a commit to mjseidel/node that referenced this issue Apr 13, 2015
See: nodejs#8159

Per above discussion, cluster.fork() currently sends --debug-port to
forked processes regardless of whether debug is enabled, and more
importantly, without any bounds checking. This will rather unexpectedly
crash any Node process that forks enough times.

This patch simply bounds checks --debug-port and doesn't set arg if out
of bounds (V8 requires 1024 < debug-port < 65535). This will prevent
surprises to callers of cluster.fork() while waiting for the debug part
of node to be rewritten as mentioned in issue discussion above.
mjseidel pushed a commit to mjseidel/node that referenced this issue Apr 15, 2015
See: nodejs#8159

Per above discussion, cluster.fork() currently sends --debug-port to
forked processes regardless of whether debug is enabled, and more
importantly, without any bounds checking. This will rather unexpectedly
crash any Node process that forks enough times.

This patch simply bounds checks --debug-port and doesn't set arg if out
of bounds (V8 requires 1024 < debug-port < 65535). This will prevent
surprises to callers of cluster.fork() while waiting for the debug part
of node to be rewritten as mentioned in issue discussion above.
@nikmartin
Copy link

I'm not positive I understand all of the above posts, but on node v0.12.2 my worker processes are being forked with debug set:

2232 ?        Ssl    0:00 /usr/local/bin/node /usr/local/lib/node_modules/forever/bin/monitor boot.js
 7094 ?        S      1:03  \_ /usr/local/bin/node /var/myapp-server/boot.js
 7095 ?        Sl    11:49      \_ /usr/local/bin/node --debug-port=5859 /var/myapp-server/app.js
 7096 ?        Sl    13:07      \_ /usr/local/bin/node --debug-port=5860 /var/myapp-server/app.js

Is there a way to stop this from happening in this version? This app is in production and even though these ports are closed, I'm concerned about the performance implications of this, and the fact that this port cycling could eventually overflow if enough workers are [re]started

@cjihrig
Copy link

cjihrig commented Jun 17, 2015

Debug port is no longer set by default as of nodejs/node@309c0f4.

@misterdjules
Copy link

@cjihrig Can nodejs/node@309c0f4 be backported to joyent/node in some form, or is everything in there a breaking change?

@cjihrig
Copy link

cjihrig commented Jun 18, 2015

@misterdjules conservatively, I would say not to backport it. It's only related to debugging, and has a very simple workaround to get the old behavior (pass a debug flag to the cluster master), but it's technically still a behavior change. Your call :-)

@sam-github
Copy link

It is a behaviour change, but the existing behaviour causes problems with 0.12. Also, the debug port passing to cluster workers was a feature introduced in 0.12, that started to cause problems when it was used in production, so fixing this kind of thing will make 0.12 more stable, I think.

@euskode
Copy link

euskode commented Jun 19, 2015

Specifically, what are these production problems?

Isn't this simply setting the port but not actually putting each cluster worker into full debug mode?

@nikmartin
Copy link

My main concern about --debug in production is that there might be a lot more plumbing wired up inside node when the debug flag is set that causes more overhead than necessary

@sam-github
Copy link

@nikmartin there is not

@sam-github
Copy link

@euskode https://github.com/joyent/node/blob/master/src/node.cc#L2903, node fails to start if port is out-of-bounds, even if debugging is never turned on and the port is never used.

@soyuka
Copy link
Author

soyuka commented Jun 19, 2015

My issue here was a memory leak, consequence of the debug port increasing
on each cluster instance reload.
Le ven. 19 juin 2015 à 22:15, Sam Roberts notifications@github.com a
écrit :

@euskode https://github.com/euskode
https://github.com/joyent/node/blob/master/src/node.cc#L2903, node fails
to start if port is out-of-bounds, even if debugging is never turned on and
the port is never used.


Reply to this email directly or view it on GitHub
#8159 (comment).

cjihrig added a commit to nodejs/node that referenced this issue Jul 22, 2015
Currently, each cluster worker is assigned an ever increasing
--debug-port argument. A long running cluster application that
does not use the debugger can run into errors related to the
port range. This commit mitigates the problem by only setting
the debug port if the master is started with debug arguments, or
the user explicitly defines debug arguments for the worker. This
commit also adds a new debug port offset counter that is only
incremented when a worker is created that utilizes debugging.

Fixes: nodejs/node-v0.x-archive#8159
Refs: #1524
PR-URL: #1949
Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl>
Reviewed-By: Oleg Elifantiev <oleg@elifantiev.ru>
cjihrig added a commit to nodejs/node that referenced this issue Jul 24, 2015
Currently, each cluster worker is assigned an ever increasing
--debug-port argument. A long running cluster application that
does not use the debugger can run into errors related to the
port range. This commit mitigates the problem by only setting
the debug port if the master is started with debug arguments, or
the user explicitly defines debug arguments for the worker. This
commit also adds a new debug port offset counter that is only
incremented when a worker is created that utilizes debugging.

Fixes: nodejs/node-v0.x-archive#8159
Refs: #1524
PR-URL: #1949
Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl>
Reviewed-By: Oleg Elifantiev <oleg@elifantiev.ru>
cjihrig added a commit to nodejs/node that referenced this issue Jul 30, 2015
Currently, each cluster worker is assigned an ever increasing
--debug-port argument. A long running cluster application that
does not use the debugger can run into errors related to the
port range. This commit mitigates the problem by only setting
the debug port if the master is started with debug arguments, or
the user explicitly defines debug arguments for the worker. This
commit also adds a new debug port offset counter that is only
incremented when a worker is created that utilizes debugging.

Fixes: nodejs/node-v0.x-archive#8159
Refs: #1524
PR-URL: #1949
Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl>
Reviewed-By: Oleg Elifantiev <oleg@elifantiev.ru>
cjihrig added a commit to nodejs/node that referenced this issue Aug 1, 2015
Currently, each cluster worker is assigned an ever increasing
--debug-port argument. A long running cluster application that
does not use the debugger can run into errors related to the
port range. This commit mitigates the problem by only setting
the debug port if the master is started with debug arguments, or
the user explicitly defines debug arguments for the worker. This
commit also adds a new debug port offset counter that is only
incremented when a worker is created that utilizes debugging.

Fixes: nodejs/node-v0.x-archive#8159
Refs: #1524
PR-URL: #1949
Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl>
Reviewed-By: Oleg Elifantiev <oleg@elifantiev.ru>
cjihrig added a commit to nodejs/node that referenced this issue Aug 3, 2015
Currently, each cluster worker is assigned an ever increasing
--debug-port argument. A long running cluster application that
does not use the debugger can run into errors related to the
port range. This commit mitigates the problem by only setting
the debug port if the master is started with debug arguments, or
the user explicitly defines debug arguments for the worker. This
commit also adds a new debug port offset counter that is only
incremented when a worker is created that utilizes debugging.

Fixes: nodejs/node-v0.x-archive#8159
Refs: #1524
PR-URL: #1949
Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl>
Reviewed-By: Oleg Elifantiev <oleg@elifantiev.ru>
cjihrig added a commit to nodejs/node that referenced this issue Aug 4, 2015
Currently, each cluster worker is assigned an ever increasing
--debug-port argument. A long running cluster application that
does not use the debugger can run into errors related to the
port range. This commit mitigates the problem by only setting
the debug port if the master is started with debug arguments, or
the user explicitly defines debug arguments for the worker. This
commit also adds a new debug port offset counter that is only
incremented when a worker is created that utilizes debugging.

Fixes: nodejs/node-v0.x-archive#8159
Refs: #1524
PR-URL: #1949
Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl>
Reviewed-By: Oleg Elifantiev <oleg@elifantiev.ru>
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

No branches or pull requests

10 participants