-
Notifications
You must be signed in to change notification settings - Fork 7.3k
cluster: tests multiple open of ports in workers #8027
cluster: tests multiple open of ports in workers #8027
Conversation
I spent about an hour trying to fix this, and can see some of the problems, but threading through the cluster messaging protocol to fix this is more than I can do tonight.
|
See nodejs/node-v0.x-archive#8027, work-around is to create a single dgram socket, and to pass it to Lynx.
@sam-github Thanks much for the tests. @indutny while I hate failing tests, these do point out issues in the implementation. What do you think about merging them? |
OK :) |
I'm -1 on including tests that are failing, unless we're able to bring in the fixes at the same time. We want to ship with a green build. My main question is if these are differences between how it works in 0.10 and 0.11? Do you want these to be considered blockers for releasing 0.12? Figuring out the right api for worker exclusive listening is desired, but I don't think we can get that properly addressed before 0.12. |
@sam-github So, for the v0.12 release we'll have to do one of the following:
While no one thinks the current implementation is 100% correct, we won't want to merge failing tests because that would indicate something that we've added to the roadmap. In your opinion, since you obviously have a lot of experience with this issue, what's the work effort to fix these failing tests? Meaning, if they could be fixed we'd be more than happy to merge both the tests and the fixes. |
I think the cluster master asserting and dying should be blocking for 0.12. This isn't really a PR, its OK not to merge, its more of an issue with attached reproductions in node unit test form, though if you agree that some of the failures are blocking, I can seperate them out. Below is rundown of test output for 0.10 and 0.11: dgram-3: pass on 0.10, asserts the cluster master in 0.11. I think this is 0.12 blocking.
dgram-4: fails on both 10 and 11, I don't know why, but I suspect that the second open is returning the first, now closed, socket instead of an open one. possibly just "unsupported", though that sucks
listen-2 and listen-3 both fail on 0.10, it seems closing servers does not work even in 0.10
listen-2 passes on 0.11 listen-3 fails on 0.11, it kills the cluster master I think not being able to close a server is pretty close to release blocking, and that asserting the cluster master is definitely release blocking
|
@bnoordhuis I'd like your feedback on the current @tjfontaine Based on #8027 (comment) I'd like your feedback on if any/all of those cases would/should be considered blockers for v0.12. I've had little to do with either of these areas of code thus far, but if we do consider any of these a v0.12 blocker I'm willing to work on it. |
I hate the implementation - it's one horrible hack on top of another - but feature-wise, it seems pretty okay. |
Why does dgram do an explicit This would fit my expectations of how dgram would work in cluster. bind to a port (including 0), and you get a socket/port shareable between workers. Don't explicitly bind, and the workers socket is its own, dynamically and ephemerally bound to a un-shared port (as net's actively connected stream sockets are). |
I'm not 100% sure but it could be legacy from the libev days. The explicit bind() isn't necessary with libuv though it doesn't hurt either. I suppose one minor upside is that it catches 'out of ephemeral ports' errors early. |
Note that the explicit bind runs afoul of #5587, causing even dgram sockets that don't need any sharing to fail to be useful on Windows, and in a rather nasty way: they assert the cluster master. |
From @bnoordhuis' response, it seems like the @tjfontaine Think we should consider this a blocker for v0.12 until these tests pass? @sam-github If so, can you help out creating fixes for these tests? |
@sam-github there's now a work around for the implicit |
Reviewed-by: Trevor Norris <trev.norris@gmail.com> Reviewed-by: Fedor Indutny <fedor@indutny.com>
Reviewed-by: Trevor Norris <trev.norris@gmail.com> Reviewed-by: Fedor Indutny <fedor@indutny.com>
Adding a compatibility section to node.exe embedded manifest so that Node is declared explicitly compatible with Windows 8.1. Required so that os.release() can return the correct version on Windows 8.1. See http://msdn.microsoft.com/en-us/library/windows/desktop/ms724451(v=vs.85).aspx Reviewed-by: Trevor Norris <trev.norris@gmail.com>
Reviewed-by: Trevor Norris <trev.norris@gmail.com>
SSL_CTX is shared between multiple connections and is not a right place to store per-connection data. fix #8348 Reviewed-By: Trevor Norris
Add a simple test to cover workers' implementation of Worker.prototype.destroy(). Before adding this test, this code wouldn't be covered by the tests suite, and any regression introduced in workers' implementation of Worker.prototype.destroy wouldn't be caught. Fixes: #8223 Reviewed-by: Trevor Norris <trev.norris@gmail.com>
When V8 started supporting Promises natively it also introduced a microtack queue. This feature operates similar to process.nextTick(), and created an issue where neither knew when the other had run. This patch has nextTick() call the microtask queue runner at the end of processing callbacks in the nextTickQueue. Fixes: #7714 Reviewed-by: Trevor Norris <trev.norris@gmail.com>
Reviewed-by: Trevor Norris <trev.norris@gmail.com>
Increase the performance of new Buffer construction by initializing all properties before SetIndexedPropertiesToExternalArrayData call. Reviewed-by: Trevor Norris <trev.norris@gmail.com>
When calling write() after end() has been called on an OutgoingMessage, an error is emitted and the write's callback is called with an instance of Error. Fix #7477. Reviewed-By: Fedor Indutny <fedor@indutny.com>
Export External getters for a internal structs: SSL, SSL_CTX.
Expose basic hooks for AsyncWrap via the async_wrap binding. Right now only the PROVIDER types are exposed. This is a preliminary step before more functionality is added. PR-URL: #8110 Signed-off-by: Trevor Norris <trev.norris@gmail.com> Reviewed-by: Fedor Indutny <fedor@indutny.com> Reviewed-by: Alexis Campailla <alexis@janeasystems.com> Reviewed-by: Julien Gilli <julien.gilli@joyent.com>
When instantiating a new AsyncWrap allow the parent AsyncWrap to be passed. This is useful for cases like TCP incoming connections, so the connection can be tied to the server receiving the connection. Because the current architecture instantiates the *Wrap inside a v8::FunctionCallback, the parent pointer is currently wrapped inside a new v8::External every time and passed as an argument. This adds ~80ns to instantiation time. A future optimization would be to add the v8::External as the data field when creating the v8::FunctionTemplate, change the pointer just before making the call then NULL'ing it out afterwards. This adds enough code complexity that it will not be attempted until the current approach demonstrates it is a bottle neck. PR-URL: #8110 Signed-off-by: Trevor Norris <trev.norris@gmail.com> Reviewed-by: Fedor Indutny <fedor@indutny.com> Reviewed-by: Alexis Campailla <alexis@janeasystems.com> Reviewed-by: Julien Gilli <julien.gilli@joyent.com>
Call a user-defined callback at specific points in the lifetime of an asynchronous event. Which are on instantiation, just before/after the callback has been run. **If any of these callbacks throws an exception, there is no forgiveness or recovery. A message will be displayed and a core file dumped.** Currently these only tie into AsyncWrap, meaning no call to a hook callback will be made for timers or process.nextTick() events. Though those will be added in a future commit. Here are a few notes on how to make the hooks work: - The "this" of all event hook callbacks is the request object. - The zero field (kCallInitHook) of the flags object passed to setupHooks() must be set != 0 before the init callback will be called. - kCallInitHook only affects the calling of the init callback. If the request object has been run through the create callback it will always run the before/after callbacks. Regardless of kCallInitHook. - In the init callback the property "_asyncQueue" must be attached to the request object. e.g. function initHook() { this._asyncQueue = {}; } - DO NOT inspect the properties of the object in the init callback. Since the object is in the middle of being instantiated there are some cases when a getter is not complete, and doing so will cause Node to crash. PR-URL: #8110 Signed-off-by: Trevor Norris <trev.norris@gmail.com> Reviewed-by: Fedor Indutny <fedor@indutny.com> Reviewed-by: Alexis Campailla <alexis@janeasystems.com> Reviewed-by: Julien Gilli <julien.gilli@joyent.com>
Float libuv/libuv@484a3a9 to fix incorrect indentation in REPL.
Add documentation for the callback parameter of http.ClientRequest's and http.ServerResponse's write and end methods.
In `tls.markdown`, there was a misuse of 'a' which has been replaced with 'an'. In `timers.markdown`... line 31: misuse of 'a', replaced with 'an' line 59: unclear wording, haywire 'a', added new comma
Dtrace probes were removed from libuv recently, but their usage by node was not completely removed, causing build breaks on SmartOS. Even though the build is working on other platforms, these probes are not fired by libuv anymore, so there's no point in using them on these platforms too. Reviewed-by: Trevor Norris <trev.norris@gmail.com>
PR-URL: #8847 Reviewed-by: Trevor Norris <trev.norris@gmail.com>
Marking these two tests as flaky, since they have been failing intermittenly in recent builds: test-debug-signal-cluster test-cluster-basic
PR-URL: #8546 Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
PR-URL: #8845 Reviewed-by: Trevor Norris <trev.norris@gmail.com>
Reviewed-By: Fedor Indutny <fedor@indutny.com> PR-URL: #6442
Fix a Windows-only build error that was introduced in commit 1183ba4 ("zlib: support concatenated gzip files"). Rename the NO_ERROR and FAILED enumerations, they conflict with macros of the same name in <winerror.h>. PR-URL: #8893 Reviewed-By: Fedor Indutny <fedor@indutny.com> Reviewed-By: Rod Vagg <rod@vagg.org> Reviewed-by: Timothy J Fontaine <tjfontaine@gmail.com>
In cases where many small writes are made to a stream lacking _writev, the array data structure backing the WriteReq buffer would greatly increase GC pressure. Specifically, in the fs.WriteStream case, the clearBuffer routine would only clear a single WriteReq from the buffer before exiting, but would cause the entire backing array to be GC'd. Switching to [].shift lessened pressure, but still the bulk of the time was spent in memcpy. This replaces that structure with a linked list-backed queue so that adding and removing from the queue is O(1). In the _writev case, collecting the buffer requires an O(N) loop over the buffer, but that was already being performed to collect callbacks, so slowdown should be neglible. PR-URL: #8826 Reviewed-by: Timothy J Fontaine <tjfontaine@gmail.com> Reviewed-by: Trevor Norris <trev.norris@gmail.com>
Add documentation for the callback parameter of http.ClientRequest's and http.ServerResponse's end methods. Signed-off-by: Julien Gilli <julien.gilli@joyent.com>
The url.parse() function now checks whether an escapable character is in the URL before trying to escape it. PR-URL: #8638 [trev.norris@gmail.com: Switch to use continue instead of if] Signed-off-by: Trevor Norris <trev.norris@gmail.com>
Make "--with-intl=none" the default and add "intl-none" option to vcbuild.bat. If icu data is missing print a warning unless either --download=all or --download=icu is set. If set then automatically download, verify (MD5) and unpack the ICU data if not already available. There's a "list" of URLs being used, but right now only the first is picked up. The logic works something like this: * If there is no directory deps/icu, * If no zip file (currently icu4c-54_1-src.zip), * Download zip file (icu-project.org -> sf.net) * Verify the MD5 sum of the zipfile * If bad, print error and exit * Unpack the zipfile into deps/icu * If deps/icu now exists, use it, else fail with help text Add the configuration option "--with-icu-source=..." Usage: * --with-icu-source=/path/to/my/other/icu * --with-icu-source=/path/to/icu54.zip * --with-icu-source=/path/to/icu54.tgz * --with-icu-source=http://example.com/icu54.tar.bz2 Add the configuration option "--with-icu-locals=...". Allows choosing which locales are used in the "small-icu" case. Example: configure --with-intl=small-icu --with-icu-locales=tlh,grc,nl (Also note that as of this writing, neither Klingon nor Ancient Greek are in upstream CLDR data. Serving suggestion only.) Don't use hard coded ../../out paths on windows. This was suggested by @misterdjules as it causes test failures. With this fix, "out" is no longer created on windows and the following can run properly: python tools/test.py simple Reduce space by about 1MB with ICU 54 (over without this patch). Also trims a few other source files, but only conditional on the exact ICU version used. This is to future-proof - a file that is unneeded now may be needed in future ICUs. Also: * Update distclean to remove icu related files * Refactor some code into tools/configure.d/nodedownload.py * Update docs * Add test PR-URL: #8719 Fixes: #7676 (comment) [trev.norris@gmail.com small change to test's whitespace and logic] Signed-off-by: Trevor Norris <trev.norris@gmail.com>
PR-URL: #8964 Reviewed-by: Trevor Norris <trev.norris@gmail.com>
Socket.prototype.connect() sometimes throws on bad inputs after an asynchronous operation. This commit makes the input validation synchronous. This commit also removes some hard coded IP addresses. PR-URL: #8180 Fixes: #8140 Reviewed-By: Trevor Norris <trev.norris@gmail.com> Reviewed-By: Fedor Indutny <fedor@indutny.com> Reviewed-By: Timothy J Fontaine <tjfontaine@gmail.com>
If the data length passed to smalloc.alloc() the array_length will be zero, causing an overflow check to fail. This prevents that from happening. Signed-off-by: Trevor Norris <trev.norris@gmail.com>
Note that test/simple/test-cluster-dgram-4.js is a simplified case, since dgram.send() for clients does an implict .bind(0), so it could also be written as: A = dgram.createSocket('udp4') A.send(...) A.close(); ... time passes... B = dgram.createSocket('udp4') B.send(...) B.close(); Above was the original form, dscape/lynx kills the cluster master when run in a worker. IMO, dgram-4 is the most serious of this, using multiple dgram client sockets is currently impossible with cluster. Also note test/simple/test-cluster-net-listen-2.js, the TCP equivalent of dgram-4, passes, because TCP close causes the cluster master to be notified, UDP close does not. Also, note that I use the ephemeral "flag" port `0` in the tests, the same failure is seen with any port.
In case that would help the tests pass... but it does not.
Note that test/simple/test-cluster-dgram-4.js is a simplified case, since
dgram.send() for clients does an implict .bind(0), so it could also be written
as:
Above was the original form, dscape/lynx kills the cluster master when run in
a worker.
IMO, dgram-4 is the most serious of this, using multiple dgram client sockets
is currently impossible with cluster.
Also note test/simple/test-cluster-net-listen-2.js, the TCP equivalent of
dgram-4, passes, because TCP close causes the cluster master to be notified,
UDP close does not.
Also, note that I use the ephemeral "flag" port
0
in the tests, the samefailure is seen with any port.