pm2 reload downtime in cluster mode #3143
Comments
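Could you use this snippet (from the docs):
var http = require("http"),
    app = require("express")();
// Answer every request with a 404; res.send(404) is deprecated in Express 4
app.use("/", function (req, res) {
  return res.sendStatus(404);
});
var server = http.createServer(app);
server.listen(4000, function () {
  // Tell PM2 that this worker is ready to accept connections
  process.send('ready');
});
// On SIGINT, stop accepting new connections, then exit
process.on('SIGINT', function () {
  server.close(function (err) {
    process.exit(err ? 1 : 0);
  });
});
And then start it with |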
I have tried with the above snippet and experience the same issue. |
Hello, I have the same bug. It used to work, but now graceful reload is not doing its job: my processes are all reloaded at the same time.
I downgraded to pm2@2.1.1. It works if i > 2; if i = 2, both processes are rebooted at the same time. |
I was looking at some of the code in the commit that @brendonboshell referenced. I noticed that this line which is the
if (!proc.process.pid) {
  console.error('app=%s id=%d does not have a pid', proc.pm2_env.name, proc.pm2_env.pm_id);
  proc.pm2_env.status = cst.STOPPED_STATUS;
  return cb(null, { error : true, message : 'could not kill process w/o pid'});
}
has what looks like an error object as the second parameter. Is that correct? That is, we usually pass the error object as the first parameter in the callback, and this pattern is followed in the other parts of the code in that file. But it might be that this isn't a "true" error condition, which is why it's being signaled as a property and passed in the second param. |
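The two callback styles contrasted in the comment above can be sketched as follows. The function names (`killStrict`, `killLenient`, `callerSeesFailure`) are hypothetical and not part of PM2; the sketch only illustrates that a caller checking `err` alone will not notice a failure reported as `cb(null, { error: true, ... })`.

```javascript
// Hypothetical helpers for illustration only; neither exists in PM2.

// Conventional error-first style: the failure travels in the first argument.
function killStrict(proc, cb) {
  if (!proc.pid) {
    return cb(new Error('could not kill process w/o pid'));
  }
  return cb(null, { error: false });
}

// Style seen in the quoted snippet: the first argument stays null and the
// failure is reported as a flag on the result object.
function killLenient(proc, cb) {
  if (!proc.pid) {
    return cb(null, { error: true, message: 'could not kill process w/o pid' });
  }
  return cb(null, { error: false });
}

// A caller that only checks `err`. The callbacks above run synchronously,
// so `seen` is already set when this function returns.
function callerSeesFailure(killFn) {
  let seen = false;
  killFn({ pid: undefined }, function (err, result) {
    if (err) seen = true;
  });
  return seen;
}
```

With the lenient style, the caller would also have to check `result.error` to detect the missing pid.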
I can't get version 2.1.1 to reload with zero downtime in my environment using the same approach that @brendonboshell has used with |
@laurentdebricon on ver 2.1.1 I just added an extra process with
|
It just seems to be those first two. If I scale out to 10 instances and reload, the first two are reloaded at the same time and the rest are distributed.
|
Bump |
Can confirm, 3.1.2 loses requests on reload, switching to 2.1.1 fixes the issue |
@brendonboshell @laurentdebricon I am running on 3.2.2 and I have fixed this problem for myself by using |
For using an ecosystem.json file, I found that
#!/bin/bash
cd /path/to/appname
PM2_CONCURRENT_ACTIONS=1 pm2 reload ecosystem.json
Source: Lines 88 to 93 in 9178610
|
I'm still experiencing downtime when using pm2 reload {processname}. Most of the times when I |
I am facing the same issue as you faced. Do you have any solution for that? |
I am facing the same issue, any solution ? |
+1, same issue, using pm2 (5.1.2) in cluster mode, with a Node.js app and a JSON config in the app root. |
Can confirm, even the "graceful restart" that is supposed to happen via |
+1, same issue with pm2(5.3.0), nextjs and nginx. |
Please see the issue I reported previously - "pm2 reload causing connection timeout and downtime". Using pm2 reload causes connection errors and downtime when running a process in cluster mode. Since I have been stuck running 1.0.1 in production for some time, I have been motivated to track down the cause of this issue. The issue still appears in the latest version 2.6.1.
Using git bisect I have been able to track this issue down to commit d0a3f49 "(god)(stopProcessId) refactor: now it only kill process without disconnecting in cluster mode"
To reproduce
server.js:
Run ./bin/pm2 --no-daemon on master/2.6.1.
Run ../../pm2/bin/pm2 start server.js -i 2 --name api
Run ab -n 100000 -c 1 http://127.0.0.1:4000/v1/
While ab is running, run
../../pm2/bin/pm2 reload api && sleep 2 && ../../pm2/bin/pm2 reload api && sleep 2 && ../../pm2/bin/pm2 reload api && sleep 2 && ../../pm2/bin/pm2 reload api && sleep 2 && ../../pm2/bin/pm2 reload api && sleep 2 && ../../pm2/bin/pm2 reload api && sleep 2 && ../../pm2/bin/pm2 reload api && sleep 2 && ../../pm2/bin/pm2 reload api && sleep 2 && ../../pm2/bin/pm2 reload api && sleep 2 && ../../pm2/bin/pm2 reload api && sleep 2 && ../../pm2/bin/pm2 reload api && sleep 2 && ../../pm2/bin/pm2 reload api && sleep 2 && ../../pm2/bin/pm2 reload api && sleep 2 && ../../pm2/bin/pm2 reload api && sleep 2 && ../../pm2/bin/pm2 reload api && sleep 2 && ../../pm2/bin/pm2 reload api && sleep 2 && ../../pm2/bin/pm2 reload api && sleep 2 && ../../pm2/bin/pm2 reload api && sleep 2 && ../../pm2/bin/pm2 reload api && sleep 2 && ../../pm2/bin/pm2 reload api && sleep 2
Observe the following output:
git revert d0a3f49 and observe that ab completes without error.
I have also reproduced this issue with a small script.
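The author's script is not included here. As a rough stand-in, the following sketch sends requests one at a time (comparable to `ab -c 1`) and counts those that never receive an HTTP response. The function names and the timeout handling are assumptions. Any HTTP status, including the 404 returned by the sample server, counts as a success, because the symptom under test is dropped connections, not status codes.

```javascript
const http = require('http');

// Send one GET request. Resolves to true if a full HTTP response arrives,
// false on a connection error, an aborted response, or a timeout.
function requestOnce(url, timeoutMs) {
  return new Promise(function (resolve) {
    const req = http.get(url, function (res) {
      res.resume(); // drain the body so the socket is released
      res.on('end', function () { resolve(true); });
      res.on('error', function () { resolve(false); });
    });
    req.setTimeout(timeoutMs, function () {
      req.destroy(new Error('timeout')); // surfaces as an 'error' event below
    });
    req.on('error', function () { resolve(false); });
  });
}

// Issue `total` requests sequentially and count successes and failures.
async function probe(url, total, timeoutMs) {
  const counts = { ok: 0, failed: 0 };
  for (let i = 0; i < total; i++) {
    if (await requestOnce(url, timeoutMs)) {
      counts.ok++;
    } else {
      counts.failed++;
    }
  }
  return counts;
}

// Format the counts as a one-line report.
function summarize(counts) {
  const total = counts.ok + counts.failed;
  return counts.failed + ' of ' + total + ' requests failed';
}
```

To use it, call `probe('http://127.0.0.1:4000/v1/', 100000, 2000).then(function (c) { console.log(summarize(c)); })` while the reload loop above runs in another terminal. The 2000 ms timeout is an arbitrary choice.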
Solution
It appears that reverting d0a3f49 solves the issue, but I am not sure what the motivation for that change was. I have been running 2.6.1 in production for about a week and, since I regularly use pm2 reload, I have noticed a number of connection errors in my nginx logs. I suspect this issue is related to the above.
Update (10 Sep)
I have downgraded to PM2 2.1.1 overnight on my production machine. See this chart from my status server for the past 24 hours. I use pm2 reload every hour, and you can clearly see many periods of downtime while running PM2 2.6.1 and no downtime while running PM2 2.1.1.