-
Notifications
You must be signed in to change notification settings - Fork 29.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Domain usage broken in 0.12.x and later #4359
Comments
@okeuday I haven't had the time to take a closer look at this yet, I'll do that in a few hours. Is this a regression from v0.12.8 to v0.12.9, or does this not work for any v0.12 version? |
@misterdjules I have checked to determine the msg_size test doesn't work with v0.12.8. I am not sure about v0.12.x versions previous to 8. The output when the msg_size test doesn't work is similar to:
(when the msg_size test is working, the forward output is much larger, since it forwards until the timeout value expires) The PREFIX above is the prefix provided to the configure script for CloudI. The msg_size test is forwarding service requests between all the different CloudI API programming language implementations and it shows that it is unable to handle the exception used to return back to polling on its socket, within nodejs. CloudI is an application server for all the services, including the nodejs services. |
@misterdjules I have confirmed that v0.12.0 works fine with the msg_size test. So, the problem is somewhere between 1 and 8. |
@misterdjules The problem occurs with v0.12.2 but not with v0.12.1, so it is related to a change that was put into the v0.12.2 release. |
Could it have been f2a45ca? That was landed right before 0.12.1 though |
@evanlucas It shouldn't be, since v0.12.1 works dependably for the msg_size test. The domain.js file didn't change between the v0.12.1 tag and the v0.12.2 tag, so this must be related to C++ source code. |
@evanlucas Perhaps it is f2a45ca , I misread the version information about it. |
@evanlucas I really am not familiar with the node.js source code, however, I don't see how f2a45ca would be responsible due to the differences in the github repo based on the version tags v0.12.1 and v0.12.2. The main change that I have seen is moving deps/uv from version 1.0.2 to version 1.4.2 which would relate to socket use in the CloudI API. The other changes for the v0.12.2 release don't seem like they should be related, but only based on a brief glance without previous knowledge of the node.js source code. |
@okeuday How can I run the msg_size.js test once I built CloudI with |
@misterdjules You need to do |
@okeuday When running
Here's what I get with node v0.12.2:
So the output with v0.12.1 doesn't seem "larger" than with v0.12.2. It will be very time consuming for me to make sense of the whole cloudi code base, and to figure out what is expected and what is a bug. So instead of having me use the whole cloudi code base, can you please isolate the |
@misterdjules Your output from node v0.12.1 shows that CloudI didn't stay up due to one or more of the services failing initialization, so it was probably a file path problem for node as you were switching versions (since the file path to node is detected by the configure script and used in There isn't an easy way to do this test outside of CloudI, since it is testing CloudI and its integration with the various programming languages. It is easy to remove all the integration tests, but msg_size within the configuration file ( If it is too problematic to reproduce, I can try to narrow it down further. I have been compiling the node repository tagged at v0.12.2 with specific sections from the v0.12.1 tag, to try and see what is responsible for the problem. So far, it doesn't look like the deps/uv update is related to the problem. |
Yes, that would be very much appreciated. |
@misterdjules The commit that causes the problem is 7e9d2f8 and is related to socket usage. Domain usage may be related to reproducing this problem, but it should be mainly a socket problem. I am not sure about how to fix it and have not attempted to revert the change on a particular (local) branch (to test). |
@okeuday I'm not sure I understand how 7e9d2f8 is related to socket usage. Would you mind clarifying? The |
@misterdjules Both https://github.com/nodejs/node/commits/v0.12.2-release and I assumed it was related to sockets due to |
@okeuday 0b5731a is a commit that was merged in the v0.12 branch as part of regular merges we used to do of the v0.10 branch into the v0.12 branch. That commit points to a tree that is node v0.10.x, not v0.12.x, which is why the file When bisecting commits to find the one that caused a regression in v0.12 you should not use these v0.10 commits, only the merge commits that merged them into v0.12. |
@misterdjules I used The main socket usage by the msg_size test is initialized at https://github.com/CloudI/CloudI/blob/develop/src/api/javascript/CloudI.js#L220-L223 (CloudI has an OS process forking the process used for executing node.js, creating a socket file descriptor for the javascript source code to use). Separate objects are being used for the same file descriptor due to problems doing it as a single socket object (which I don't remember). If this approach is not meant to work going forward (beyond the 7a37910 commit), the https://github.com/CloudI/CloudI/blob/develop/src/api/javascript/CloudI.js#L220-L223 code could change, as long as some approach may work. |
@okeuday It's great that you found the commit that is responsible for the regression. Now, libuv is responsible for many things other than providing an abstraction for network sockets. So if the problem was triggered by a libuv upgrade, I wouldn't narrow it down to a network socket issue just yet. What would be great is to have a small program that reproduces the problem outside of Cloudi. Without such a thing, it will be difficult for us to help you find the source of the problem and fix it. Because it's not clear what the problem is, I'll close this issue until we have more information. We can still assist you in coming up with that minimal script that reproduces the issue by continuing the discussion here or in #node-dev on Freenode. |
I used the commit CloudI/CloudI@071d716 to fix the problem in the Javascript CloudI API and that works with node.js 4.2.3. I won't have time to write a separate test, but this information should be enough to help someone else do so. Thanks! |
I understand the contents of #66 . This issue is not meant to focus on the removal of domain. Instead, it is meant to report broken behavior when using domain with node.js versions 0.12.x and later when using the source code https://github.com/CloudI/CloudI/blob/develop/src/api/javascript/CloudI.js (the integration test which is unable to function is https://github.com/CloudI/CloudI/blob/develop/src/tests/msg_size/msg_size.js).
I do not see a way to avoid domain usage (https://github.com/strongloop/zone doesn't support cluster/crypto/fs/process/tls/https/udp/zlib, async-listener doesn't support exceptions (nodejs/node-v0.x-archive#6011 (comment))) and with domain usage being broken, I do not see a way to fix the source code for node.js versions 0.12.x and higher.
If you recreate the failure in the msg_size integration test, you will see in the cloudi.log output that forwards are unable to take place. The exception created by the forward_ function call (in msg_size.js) is not seen as it was before, with domain in 0.10.x. The more complicated messaging integration test (https://github.com/CloudI/CloudI/blob/develop/src/tests/messaging/messaging.js) does not have problems running with node.js versions 0.12.x and higher.
The text was updated successfully, but these errors were encountered: