IPC channel stops delivering messages to cluster workers #9706

davidvetrano · 2016-11-20T04:40:33Z

Version: 6.9.1 (also reproduced on 4.4.1 and 0.12.4)
Platform: OS X, possibly others (Unable to repro on Linux but I believe it does happen but with less frequency)
Subsystem: Cluster / IPC

When many IPC messages are sent between the master process and cluster workers, IPC channels to workers stop delivering messages. I have not been unable to restore working functionality of the workers and so they must be killed to resolve the issue. Since IPC has stopped working, simply using Worker.destroy() does not work since the method will wait for the disconnect event which never arrives (because of this issue).

I am able to repro on OS X by running the following script:

var cluster = require('cluster');
var express = require('express'); // tested with 4.14.0

const workerCount = 2;
const WTMIPC = 25;
const MTWIPC = 25;

if (cluster.isMaster) {
    var workers = {}, worker;
    for (var i = 0; i < workerCount; i++) {
        worker = cluster.fork({});
        workers[worker.process.pid] = worker;
    }

    var workerPongReceivedTime = {};
    cluster.on('online', function(worker) {
        worker.on('message', function(message) {
            var currentTime = Date.now();
            if (message.type === 'pong') {
                workerPongReceivedTime[worker.process.pid] = currentTime;
                console.log('received pong\tmaster-to-worker\t' + (message.timeReceived - message.timeSent) + '\tworker-to-master ' + (currentTime - message.timeSent));
            } else if (message.type === 'fromEndpoint') {
                for (var i = 0; i < MTWIPC; i++) {
                    worker.send({ type: 'toWorker' });
                }
            }
        });
    });

    setInterval(function() {
        var currentTime = Date.now();
        console.log('sending ping');
        Object.keys(workers).forEach(function(workerPid) {
            workers[workerPid].send({ type: 'ping', time: Date.now() });

            if (currentTime - workerPongReceivedTime[workerPid] > 10000) {
                console.log('Worker missed pings: ' + workerPid);
            }
        });
    }, 1000);
} else {
    var app = express();

    app.get('/test', function(req, res) {
        for (i = 0; i < WTMIPC; i++) {
            process.send({ type: 'fromEndpoint' });
        }

        res.send({ test: 123 });
    });

    app.listen(7080, function() {
        console.log('server started');
    });

    process.on('message', function(message) {
        if (message.type === 'ping') {
            process.send({ type: 'pong', timeSent: message.time, timeReceived: Date.now() });
        }
    });
}

and using ApacheBench to place the server under load as follows:

ab -n 100000 -c 200 'http://localhost:7080/test'

I see the following, for example:

server started
server started
sending ping
received pong	master-to-worker	1	worker-to-master 1
received pong	master-to-worker	0	worker-to-master 1
sending ping
received pong	master-to-worker	1	worker-to-master 3
received pong	master-to-worker	19	worker-to-master 21
sending ping
received pong	master-to-worker	2	worker-to-master 5
received pong	master-to-worker	4	worker-to-master 7
sending ping
received pong	master-to-worker	3	worker-to-master 4
received pong	master-to-worker	4	worker-to-master 6
sending ping
received pong	master-to-worker	9	worker-to-master 10
received pong	master-to-worker	2	worker-to-master 10
sending ping
received pong	master-to-worker	2	worker-to-master 4
received pong	master-to-worker	4	worker-to-master 6
sending ping
received pong	master-to-worker	2	worker-to-master 4
received pong	master-to-worker	4	worker-to-master 6

... (about 10k - 60k requests later) ...

sending ping
sending ping
sending ping
sending ping
sending ping
sending ping
sending ping
sending ping
sending ping
sending ping
Worker missed pings: 97462
sending ping
Worker missed pings: 97462
Worker missed pings: 97463
sending ping
Worker missed pings: 97462
Worker missed pings: 97463

As I alluded to earlier, I have seen an issue on Linux which I believe is related but I have been so far unable to repro using this technique on Linux.

The text was updated successfully, but these errors were encountered:

santigimeno · 2016-11-20T07:33:36Z

What's ulimit value for open files? You can see the value by executing ulimit -a

It could be you're hitting that value. Does increasing that value help? See: http://superuser.com/a/303058

davidvetrano · 2016-11-21T23:32:04Z

@santigimeno I'm able to repro with the hard and soft file descriptor limits set to 1M.

santigimeno · 2016-11-22T09:19:30Z

@davidvetrano yes, I could reproduce the issue on OS X and FreeBSD.

What I have observed is that at some point the master doesn't receive an NODE_HANDLE_ACK message in response to a NODE_HANDLE command sent to the workers causing that the ping messages are being stored in the child_process ._handleQueue and they're never actually sent. In fact, the problem is that, for some reason, the NODE_HANDLE message is actually sent (apparently with success) to the workers, but the workers never receive it. I thought this was not possible in the IPC channel as it's an AF_UNIX SOCK_STREAM connection. Any idea why this could be happening?

/cc @bnoordhuis

pidgeonman · 2016-12-13T23:53:31Z

I lose IPC communication on Linux too, having 41 workers and rather heavy DB access on each. No error is given on console.

rooftopsparrow · 2017-01-25T21:26:40Z

Update: After testing again, v6.1 does in fact have has the bug. Please disregard.

I've taken the above snippet and did a manual "git bisect" on all minor versions on macOS (10.12.2) and the behavior is not exhibited on v6.1.0 but is introduced in ^v6.2.0, and is still prevalent in v7.

~~I'm currently running a real git bisect on the commits between v6.1 and v6.2 to hopefully identify the commit where this regression took place.~~

santigimeno · 2017-01-28T08:33:17Z

Yeah, I have also reproduced it in 4.7.1.

Trott · 2017-07-16T03:04:36Z

@santigimeno This should remain open?

/cc @bnoordhuis, @cjihrig, @mcollina

bnoordhuis · 2017-07-17T15:44:10Z

@santigimeno #13235 fixed this, didn't it?

That PR is on track for v6.x and it seems reasonable to me to also target v4.x since it's a rather insidious bug.

santigimeno · 2017-07-17T16:08:21Z

I think so. I still can reproduce it with current master on FreeBSD

santigimeno · 2017-07-17T16:33:52Z

@bnoordhuis sorry I hadn't read your comment before answering...

#13235 fixed this, didn't it?

I had forgotten about this one but it certainly looks like it could have been solved by #13235, but from a quick check it doesn't look it's solved so it may be a different issue.

It is possible that `recvmsg()` may return an error on ancillary data reception when receiving a `NODE_HANDLE` message (for example `MSG_CTRUNC`). This would end up, if the handle type was `net.Socket`, on a `message` event with a non null but invalid `sendHandle`. To improve the situation, send a `NODE_HANDLE_NACK` that'll cause the sending process to retransmit the message again. In case the same message is retransmitted 3 times without success, close the handle and print a warning. Fixes: #9706 PR-URL: #13235 Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl> Reviewed-By: Colin Ihrig <cjihrig@gmail.com>

MylesBorins · 2017-08-15T00:12:17Z

I've landed the fix in v6.x-staging in 5160d3d

It is possible that `recvmsg()` may return an error on ancillary data reception when receiving a `NODE_HANDLE` message (for example `MSG_CTRUNC`). This would end up, if the handle type was `net.Socket`, on a `message` event with a non null but invalid `sendHandle`. To improve the situation, send a `NODE_HANDLE_NACK` that'll cause the sending process to retransmit the message again. In case the same message is retransmitted 3 times without success, close the handle and print a warning. Fixes: #9706 PR-URL: #13235 Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl> Reviewed-By: Colin Ihrig <cjihrig@gmail.com>

santigimeno · 2017-08-15T16:36:39Z

To be clear. This is still an issue, #13235 didn't fix it.

It is possible that `recvmsg()` may return an error on ancillary data reception when receiving a `NODE_HANDLE` message (for example `MSG_CTRUNC`). This would end up, if the handle type was `net.Socket`, on a `message` event with a non null but invalid `sendHandle`. To improve the situation, send a `NODE_HANDLE_NACK` that'll cause the sending process to retransmit the message again. In case the same message is retransmitted 3 times without success, close the handle and print a warning. Fixes: #9706 PR-URL: #13235 Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl> Reviewed-By: Colin Ihrig <cjihrig@gmail.com>

pitaj · 2018-02-19T08:56:05Z

Believe I'm running into this issue while working on parallelizing a build process (creating a pool of workers and distributing hundreds of tasks among them). That's on Node 8.8.1

rooftopsparrow · 2018-02-21T17:50:43Z

Hello,

I was hoping in node 8 this was fixed but alas it seems to be even worse now. Using the script provided above its very easy to repoduce, and its alarmingly easy to see bad effects when you increase the worker count from 2 to something like 8. I was able to see the immediate effects when doing a larger cluster.

here are logs from the script right after ab was started

sending ping
received pong	master-to-worker	0	worker-to-master 1
received pong	master-to-worker	0	worker-to-master 1
received pong	master-to-worker	0	worker-to-master 1
received pong	master-to-worker	0	worker-to-master 1
received pong	master-to-worker	1	worker-to-master 1
received pong	master-to-worker	1	worker-to-master 1
received pong	master-to-worker	1	worker-to-master 1
received pong	master-to-worker	1	worker-to-master 1
sending ping
received pong	master-to-worker	81	worker-to-master 157
received pong	master-to-worker	131	worker-to-master 180
sending ping
received pong	master-to-worker	1	worker-to-master 5
received pong	master-to-worker	1054	worker-to-master 1058
received pong	master-to-worker	54	worker-to-master 58
sending ping
received pong	master-to-worker	4	worker-to-master 9
received pong	master-to-worker	8	worker-to-master 12
sending ping
received pong	master-to-worker	5	worker-to-master 9
received pong	master-to-worker	9	worker-to-master 16
sending ping
received pong	master-to-worker	4	worker-to-master 8
received pong	master-to-worker	8	worker-to-master 12
received pong	master-to-worker	4431	worker-to-master 4440
sending ping
received pong	master-to-worker	5	worker-to-master 5
received pong	master-to-worker	14	worker-to-master 19
received pong	master-to-worker	5995	worker-to-master 6012
sending ping
received pong	master-to-worker	5	worker-to-master 10
received pong	master-to-worker	8	worker-to-master 12
received pong	master-to-worker	5961	worker-to-master 5969
received pong	master-to-worker	4958	worker-to-master 4966
received pong	master-to-worker	3957	worker-to-master 3965
received pong	master-to-worker	2953	worker-to-master 2961
received pong	master-to-worker	1949	worker-to-master 1957
received pong	master-to-worker	949	worker-to-master 957
sending ping
received pong	master-to-worker	3	worker-to-master 11
received pong	master-to-worker	7	worker-to-master 15
received pong	master-to-worker	11	worker-to-master 19
sending ping
received pong	master-to-worker	4	worker-to-master 12
received pong	master-to-worker	7	worker-to-master 16
received pong	master-to-worker	11	worker-to-master 20
sending ping
Worker missed pings: 39339
Worker missed pings: 39340
Worker missed pings: 39341
received pong	master-to-worker	3	worker-to-master 11
received pong	master-to-worker	7	worker-to-master 15
received pong	master-to-worker	10	worker-to-master 19
sending ping
Worker missed pings: 39339
Worker missed pings: 39340
Worker missed pings: 39341
received pong	master-to-worker	3	worker-to-master 12
received pong	master-to-worker	8	worker-to-master 16
received pong	master-to-worker	11	worker-to-master 20
sending ping
Worker missed pings: 39339
Worker missed pings: 39340
Worker missed pings: 39341
Worker missed pings: 39343
received pong	master-to-worker	4	worker-to-master 12
received pong	master-to-worker	7	worker-to-master 16
received pong	master-to-worker	11	worker-to-master 20
received pong	master-to-worker	10188	worker-to-master 10210
received pong	master-to-worker	9211	worker-to-master 9222
received pong	master-to-worker	8210	worker-to-master 8221
received pong	master-to-worker	7206	worker-to-master 7217
received pong	master-to-worker	6202	worker-to-master 6213
received pong	master-to-worker	5202	worker-to-master 5213
received pong	master-to-worker	4192	worker-to-master 4203
received pong	master-to-worker	3182	worker-to-master 3193
received pong	master-to-worker	2178	worker-to-master 2189
received pong	master-to-worker	1173	worker-to-master 1184
received pong	master-to-worker	173	worker-to-master 184

santigimeno · 2018-02-21T18:06:38Z

@pitaj @rooftopsparrow what platforms are you observing the issue on?

rooftopsparrow · 2018-02-21T18:32:43Z

This was tested on macOS 10.13.2, node v8.9.4, and latest express.

pitaj · 2018-02-21T23:48:18Z

@santigimeno I tested it in win32, Node 8.8.1

santigimeno · 2018-02-22T09:39:50Z

@rooftopsparrow can you check using this PR that includes libuv@1.19.2. libuv/libuv@e6168df might have improved things for OS X.

rooftopsparrow · 2018-02-22T17:36:11Z

Yes, I can try that when I get a moment. I've also done a little more exploring to figure out what is going on with these missing ( or really really delayed ) pings and will hopefully have a better idea of what is going later

pitaj · 2018-02-22T19:08:17Z

@santigimeno I'm currently building that PR on my Window 10 x64 system here, then I'll test it.

santigimeno · 2018-02-22T20:46:29Z

@pitaj the fix I was thinking about was specifically for OS X so I don't think you'll find any difference.

santigimeno · 2018-02-25T17:28:07Z

I've been running the test for several hours on OS X with 8 workers and the messages seem to be indeed VERY delayed but the original issue where the IPC channel is blocked never happened.

pitaj · 2018-02-26T15:24:24Z

It appears to me that if you send enough messages through the IPC channel, it will completely lock up the main process. I'm running mocha tests and the timeout just doesn't happen, nor does a hard-coded timeout. This seems to occur at exactly 1000 messages per child process (8000 handles total).

Edit: it's been running for 17 hours now and still hasn't concluded.

gireeshpunathil · 2018-05-21T05:49:43Z

ping @pitaj - is there a conclusion on this?

pitaj · 2018-05-21T05:52:37Z

It seems like it eventually concludes if given enough time... but it might be exponential time as very small changes in message number (like 990 vs 1000) result in very large changes in the amount of time required.

gireeshpunathil · 2018-05-21T05:59:28Z

@pitaj - thanks for the quick response, let me investigate. Does the same test case in the OP apply as such, or is there a more refined version?

gireeshpunathil · 2018-05-21T11:29:49Z

thanks @pitaj - that was a vital info. So it turns out to be a performance issue as opposed to functional. When the slowness is observed, do you see a very large number of connections in TIME_WAIT state?

such as:
netstat -na | wc -l showing in thousands?

In my recreate when I profiled Node, I see these patterns in the workers. Out of total 48569 samples,

   ticks  total  nonlib   name
  22836   47.0%   47.8%  T _kpersona_get
   1986    4.1%    4.2%  T _kpersona_find
   1272    2.6%    2.7%  T _sendto$NOCANCEL
   1022    2.1%    2.1%  T _mach_port_move_member
    686    1.4%    1.4%  T _pwrite$NOCANCEL

were spent in these obsure routines which I assume to belong to Darwin's tcp layer.

In MAC the clearing up of the closed sockets seem to be slowing down in-flight socket operations in the TCP space, this is a speculation based on the debugging in the connected issue, no documented evidence to support it. However, the circumstances are matching. Frequency of the issue in Linux being very less also supports this theory.

Proof: If I remove tcp from the picture by removing the express code, and put up a 100ms interval in which you send the IPC message ('fromEndpoint') I don't see missing pings, and the behavior is consistent across platforms.

Related to nodejs/help#1179 and the relevant debug data is nodejs/help#1179 (comment)

I would suggest to see if there are tcp tunables at the OS level that can address the throttle at the tcp stack, or else consider reducing load per server through horizontal scaling.

pitaj · 2018-05-21T20:49:28Z

@gireeshpunathil I'm on Windows, and it seems like your comment is mostly about Linux?

gireeshpunathil · 2018-05-22T04:35:56Z

@pitaj, thanks. I was following the code and the platform from the original postings.

Are you using the same code on Windows, or something different? if so, please pass it on. Also, what is the observation - similar to mac os, same as mac os, or different?

I too tested in windows, and I got some surprising result (certain tunings to the original test,and we get complete hang!) . We need to separate that issue from this, so let me hear from you.

gireeshpunathil · 2018-05-22T06:41:35Z

looked at the windows hang, and understood the reason too.

every time a client connects, 25 messages (fromEndpoint) go from the node to the master.
every time the master receives a message of that type, it sends 25 messages back (625 messages per worker)

So depending on the number of concurrent requests, performance can really vary, and the dependancy between the requests and the latency is exponential (s you already observed earlier):

but it might be exponential time as very small changes in message number (like 990 vs 1000) result in very large changes in the amount of time required.

There is nothing Windows specific issue observed here from Node.js perspective, other than potential difference in the system configuration / resources. So my original proposal on horizontal scaling stands.

gireeshpunathil · 2018-05-29T09:10:51Z

Not a Node.js bug, closing. Exponential stress in the tcp layer causes process to slow down, suggested to share work between multiple hosts.

pitaj · 2018-05-29T22:21:52Z

@gireeshpunathil I have some repro code that doesn't use any TCP AFAIK, unless the IPC channel uses TCP itself.

I can throw that up on a gist later today.

pitaj · 2018-05-30T02:31:59Z

https://gist.github.com/pitaj/e8463a023847697e72e7f01ac8ed2fd6

gireeshpunathil · 2018-05-31T06:58:47Z

Thanks @pitaj for the repro. Turns out that the windows issue is unrelated (compelte hang) to the originally posted issue (slow response) on macos and family.

I am able to reproduce the hang. Looking at multiple dumps, I see that the main thread of different processes (including the master) are engaged in:

node.exe!uv_pipe_write_impl(uv_loop_s * loop, uv_write_s * req, uv_pipe_s * handle, const uv_buf_t * bufs, unsigned int nbufs, uv_stream_s * send_handle, void(*)(uv_write_s *, int) cb) Line 1347	C
node.exe!uv_write(uv_write_s * req, uv_stream_s * handle, const uv_buf_t * bufs, unsigned int nbufs, void(*)(uv_write_s *, int) cb) Line 139	C
node.exe!node::LibuvStreamWrap::DoWrite(node::WriteWrap * req_wrap, uv_buf_t * bufs, unsigned __int64 count, uv_stream_s * send_handle) Line 345	C++
node.exe!node::StreamBase::Write(uv_buf_t * bufs, unsigned __int64 count, uv_stream_s * send_handle, v8::Local<v8::Object> req_wrap_obj) Line 222	C++
node.exe!node::StreamBase::WriteString<1>(const v8::FunctionCallbackInfo<v8::Value> & args) Line 300	C++
node.exe!node::StreamBase::JSMethod<node::LibuvStreamWrap,&node::StreamBase::WriteString<1> >(const v8::FunctionCallbackInfo<v8::Value> & args) Line 408	C++
node.exe!v8::internal::FunctionCallbackArguments::Call(v8::internal::CallHandlerInfo * handler) Line 30	C++
node.exe!v8::internal::`anonymous namespace'::HandleApiCallHelper<0>(v8::internal::Isolate * isolate, v8::internal::Handle<v8::internal::HeapObject> new_target, v8::internal::Handle<v8::internal::HeapObject> fun_data, v8::internal::Handle<v8::internal::FunctionTemplateInfo> receiver, v8::internal::Handle<v8::internal::Object> args, v8::internal::BuiltinArguments) Line 110	C++
node.exe!v8::internal::Builtin_Impl_HandleApiCall(v8::internal::BuiltinArguments args, v8::internal::Isolate * isolate) Line 138	C++
node.exe!v8::internal::Builtin_HandleApiCall(int args_length, v8::internal::Object * * args_object, v8::internal::Isolate * isolate) Line 126	C++
[External Code]

this is a known issue with libuv where multiple parties attempt to write to the same pipe, from either sides, under rare situations.

#7657 posted this originally, and libuv/libuv#1843 fixed it recently. It will be sometime before Node.js consumes it.

Notable changes: - Building via cmake is now supported. PR-URL: libuv/libuv#1850 - Stricter checks have been added to prevent watching the same file descriptor multiple times. PR-URL: libuv/libuv#1851 Refs: nodejs#3604 - An IPC deadlock on Windows has been fixed. PR-URL: libuv/libuv#1843 Fixes: nodejs#9706 Fixes: nodejs#7657 - uv_fs_lchown() has been added. PR-URL: libuv/libuv#1826 Refs: nodejs#19868 - uv_fs_copyfile() sets errno on error. PR-URL: libuv/libuv#1881 Fixes: nodejs#21329 - uv_fs_fchmod() supports -A files on Windows. PR-URL: libuv/libuv#1819 Refs: nodejs#12803 PR-URL: nodejs#21466 Reviewed-By: Anna Henningsen <anna@addaleax.net> Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl> Reviewed-By: Santiago Gimeno <santiago.gimeno@gmail.com> Reviewed-By: James M Snell <jasnell@gmail.com>

Notable changes: - Building via cmake is now supported. PR-URL: libuv/libuv#1850 - Stricter checks have been added to prevent watching the same file descriptor multiple times. PR-URL: libuv/libuv#1851 Refs: #3604 - An IPC deadlock on Windows has been fixed. PR-URL: libuv/libuv#1843 Fixes: #9706 Fixes: #7657 - uv_fs_lchown() has been added. PR-URL: libuv/libuv#1826 Refs: #19868 - uv_fs_copyfile() sets errno on error. PR-URL: libuv/libuv#1881 Fixes: #21329 - uv_fs_fchmod() supports -A files on Windows. PR-URL: libuv/libuv#1819 Refs: #12803 PR-URL: #21466 Reviewed-By: Anna Henningsen <anna@addaleax.net> Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl> Reviewed-By: Santiago Gimeno <santiago.gimeno@gmail.com> Reviewed-By: James M Snell <jasnell@gmail.com>

Notable changes: - Building via cmake is now supported. PR-URL: libuv/libuv#1850 - Stricter checks have been added to prevent watching the same file descriptor multiple times. PR-URL: libuv/libuv#1851 Refs: nodejs#3604 - An IPC deadlock on Windows has been fixed. PR-URL: libuv/libuv#1843 Fixes: nodejs#9706 Fixes: nodejs#7657 - uv_fs_lchown() has been added. PR-URL: libuv/libuv#1826 Refs: nodejs#19868 - uv_fs_copyfile() sets errno on error. PR-URL: libuv/libuv#1881 Fixes: nodejs#21329 - uv_fs_fchmod() supports -A files on Windows. PR-URL: libuv/libuv#1819 Refs: nodejs#12803 PR-URL: nodejs#21466 Reviewed-By: Anna Henningsen <anna@addaleax.net> Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl> Reviewed-By: Santiago Gimeno <santiago.gimeno@gmail.com> Reviewed-By: James M Snell <jasnell@gmail.com>

Notable changes: - Building via cmake is now supported. PR-URL: libuv/libuv#1850 - Stricter checks have been added to prevent watching the same file descriptor multiple times. PR-URL: libuv/libuv#1851 Refs: #3604 - An IPC deadlock on Windows has been fixed. PR-URL: libuv/libuv#1843 Fixes: #9706 Fixes: #7657 - uv_fs_lchown() has been added. PR-URL: libuv/libuv#1826 Refs: #19868 - uv_fs_copyfile() sets errno on error. PR-URL: libuv/libuv#1881 Fixes: #21329 - uv_fs_fchmod() supports -A files on Windows. PR-URL: libuv/libuv#1819 Refs: #12803 Backport-PR-URL: #24103 PR-URL: #21466 Reviewed-By: Anna Henningsen <anna@addaleax.net> Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl> Reviewed-By: Santiago Gimeno <santiago.gimeno@gmail.com> Reviewed-By: James M Snell <jasnell@gmail.com>

mscdex added the cluster Issues and PRs related to the cluster subsystem. label Nov 20, 2016

santigimeno added child_process Issues and PRs related to the child_process subsystem. freebsd Issues and PRs related to the FreeBSD platform. os Issues and PRs related to the os subsystem. labels Nov 22, 2016

gireeshpunathil added performance Issues and PRs related to the performance of Node.js. macos Issues and PRs related to the macOS platform / OSX. labels May 21, 2018

gireeshpunathil closed this as completed May 29, 2018

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

IPC channel stops delivering messages to cluster workers #9706

IPC channel stops delivering messages to cluster workers #9706

davidvetrano commented Nov 20, 2016 •

edited

Loading

santigimeno commented Nov 20, 2016

davidvetrano commented Nov 21, 2016

santigimeno commented Nov 22, 2016

pidgeonman commented Dec 13, 2016

rooftopsparrow commented Jan 25, 2017 •

edited

Loading

santigimeno commented Jan 28, 2017

Trott commented Jul 16, 2017

bnoordhuis commented Jul 17, 2017

santigimeno commented Jul 17, 2017

santigimeno commented Jul 17, 2017

MylesBorins commented Aug 15, 2017

santigimeno commented Aug 15, 2017

pitaj commented Feb 19, 2018

rooftopsparrow commented Feb 21, 2018

santigimeno commented Feb 21, 2018

rooftopsparrow commented Feb 21, 2018

pitaj commented Feb 21, 2018

santigimeno commented Feb 22, 2018

rooftopsparrow commented Feb 22, 2018

pitaj commented Feb 22, 2018

santigimeno commented Feb 22, 2018

santigimeno commented Feb 25, 2018

pitaj commented Feb 26, 2018 •

edited

Loading

gireeshpunathil commented May 21, 2018

pitaj commented May 21, 2018

gireeshpunathil commented May 21, 2018

gireeshpunathil commented May 21, 2018

pitaj commented May 21, 2018

gireeshpunathil commented May 22, 2018

gireeshpunathil commented May 22, 2018

gireeshpunathil commented May 29, 2018

pitaj commented May 29, 2018

pitaj commented May 30, 2018

gireeshpunathil commented May 31, 2018

IPC channel stops delivering messages to cluster workers #9706

IPC channel stops delivering messages to cluster workers #9706

Comments

davidvetrano commented Nov 20, 2016 • edited Loading

santigimeno commented Nov 20, 2016

davidvetrano commented Nov 21, 2016

santigimeno commented Nov 22, 2016

pidgeonman commented Dec 13, 2016

rooftopsparrow commented Jan 25, 2017 • edited Loading

santigimeno commented Jan 28, 2017

Trott commented Jul 16, 2017

bnoordhuis commented Jul 17, 2017

santigimeno commented Jul 17, 2017

santigimeno commented Jul 17, 2017

MylesBorins commented Aug 15, 2017

santigimeno commented Aug 15, 2017

pitaj commented Feb 19, 2018

rooftopsparrow commented Feb 21, 2018

santigimeno commented Feb 21, 2018

rooftopsparrow commented Feb 21, 2018

pitaj commented Feb 21, 2018

santigimeno commented Feb 22, 2018

rooftopsparrow commented Feb 22, 2018

pitaj commented Feb 22, 2018

santigimeno commented Feb 22, 2018

santigimeno commented Feb 25, 2018

pitaj commented Feb 26, 2018 • edited Loading

gireeshpunathil commented May 21, 2018

pitaj commented May 21, 2018

gireeshpunathil commented May 21, 2018

gireeshpunathil commented May 21, 2018

pitaj commented May 21, 2018

gireeshpunathil commented May 22, 2018

gireeshpunathil commented May 22, 2018

gireeshpunathil commented May 29, 2018

pitaj commented May 29, 2018

pitaj commented May 30, 2018

gireeshpunathil commented May 31, 2018

davidvetrano commented Nov 20, 2016 •

edited

Loading

rooftopsparrow commented Jan 25, 2017 •

edited

Loading

pitaj commented Feb 26, 2018 •

edited

Loading