AIX: hang on pseudo-tty/no_interleaved_stdio #9765

Closed
mhdawson opened this issue Nov 23, 2016 · 5 comments
Labels
aix Issues and PRs related to the AIX platform.

Comments

@mhdawson
Member

  • Version: master
  • Platform: AIX
  • Subsystem: message

As an example:
https://ci.nodejs.org/job/node-test-commit-aix/2049/nodes=aix61-ppc64/console

I've seen this test hang twice in the last couple of days:

ok 1240 message/vm_dont_display_syntax_error
  ---
  duration_ms: 0.166
  ...
@mhdawson
Member Author

Processes on machine before cleanup

# ps -ef |grep iojs
    iojs 2949218 6619382   0 05:45:45      -  0:00 gmake test-ci
    iojs 3145926 4915380  29                  0:00 <defunct>
    iojs 3473600 3015330  20                  0:00 <defunct>
    iojs 3866720 1639028  24                  0:00 <defunct>
    iojs 4915380 5898258   0   Nov 18      -  0:06 /usr/bin/python tools/test.py -p tap --logfile test.tap --mode=release --flaky-tests=dontcare addons doctool inspector known_issues message parallel pseudo-tty sequential
    iojs 5177518       1   0   Nov 19      -  0:00 gmake run-ci -j 5
    iojs 5308610 1376986   0   Nov 18      -  0:00 gmake test-ci
    iojs 5898258 2818470   0   Nov 18      -  0:00 gmake test-ci
    root 6029456 3211754   0 09:00:17  pts/5  0:00 grep iojs
    iojs 6619382 3604782   0 05:38:24      -  0:00 gmake run-ci -j 5
    iojs 6881478 1704638   0   Nov 22      -  0:00 gmake test-ci
    iojs 1442260       1   0   Nov 14      -  4:10 java -Xmx128m -Dorg.jenkinsci.plugins.gitclient.Git.timeOut=30 -jar /home/iojs/slave.jar -secret b9dc892bc932e06f3d0784216ad9b09d5e2fa7e7ec4ff80702eed2d67729e761 -jnlpUrl https://ci.nodejs.org/computer/test-osuosl-aix61-ppc64_be-1/slave-agent.jnlp
    iojs 2294190 3474074  29                  0:00 <defunct>
    iojs 2818470       1   0   Nov 18      -  0:00 gmake run-ci -j 5
    iojs 3604782 1442260   0 05:38:24      -  0:00 /bin/sh -xe /tmp/hudson308746056710968896.sh
    iojs 3670424 2949218   0 05:46:42      -  0:06 /usr/bin/python tools/test.py -p tap --logfile test.tap --mode=release --flaky-tests=dontcare addons doctool inspector known_issues message parallel pseudo-tty sequential
    iojs 1376986       1   0   Nov 18      -  0:00 gmake run-ci -j 5
    iojs 1639028 6881478   0   Nov 22      -  0:04 /usr/bin/python tools/test.py -p tap --logfile test.tap --mode=release --flaky-tests=dontcare addons doctool inspector known_issues message parallel pseudo-tty sequential
    iojs 1704638       1   0   Nov 22      -  0:00 gmake run-ci -j 5
    iojs 2490932 3670424  29                  0:00 <defunct>
    iojs 3015330 3342862   0   Nov 19      -  0:04 /usr/bin/python tools/test.py -p tap --logfile test.tap --mode=release --flaky-tests=dontcare addons doctool inspector known_issues message parallel pseudo-tty sequential
    iojs 3342862 5177518   0   Nov 19      -  0:00 gmake test-ci
    iojs 3474074 5308610   0   Nov 18      -  0:06 /usr/bin/python tools/test.py -p tap --logfile test.tap --mode=release --flaky-tests=dontcare addons doctool inspector known_issues message parallel pseudo-tty sequential
#

@mhdawson
Member Author

The test that usually runs next, after message/vm_dont_display_syntax_error, is:

ok 1241 pseudo-tty/no_interleaved_stdio
  ---
  duration_ms: 0.150

@mhdawson
Member Author

Discussed previously here: #8847

This seems to be related to an issue with Python on AIX. Last time we could not recreate it. Since we are now seeing it more frequently, I will submit a PR while @gireeshpunathil continues to investigate the Python issue.

@mhdawson
Member Author

Putting together a PR to exclude the test for now. Just testing here: https://ci.nodejs.org/job/node-test-commit-aix/2057/

@mhdawson changed the title from "AIX: hang on message/vm_dont_display_syntax_error" to "AIX: hang on pseudo-tty/no_interleaved_stdio" on Nov 23, 2016
@Fishrock123
Contributor

😐 This sounds like an AIX platform bug.

@mscdex added the aix label (Issues and PRs related to the AIX platform.) on Nov 23, 2016
addaleax pushed a commit that referenced this issue Dec 5, 2016
pseudo-tty/no_interleaved_stdio has hung a few times
in the last couple of days on AIX. We believe
it is not a Node.js issue but an issue with Python
on AIX. It's being investigated under:
#7973.
Excluding this additional test until we can
resolve the Python issue.

Fixes #9765
PR-URL: #9772
Reviewed-By: Sam Roberts <sam@strongloop.com>
Reviewed-By: Gibson Fahnestock <gibfahn@gmail.com>
addaleax pushed a commit to addaleax/node that referenced this issue Dec 8, 2016
MylesBorins pushed a commit that referenced this issue Dec 20, 2016
MylesBorins pushed a commit that referenced this issue Dec 21, 2016
MylesBorins pushed a commit that referenced this issue Dec 21, 2016
gibfahn pushed a commit that referenced this issue Mar 17, 2017
The tests in pseudo-tty take the form of a child node process writing
some data and exiting, while the parent Python process consumes the data
through a pseudo-tty implementation and validates the result.

Although there is no synchronization between child and parent, this
works on most platforms. The exception is AIX, where under race
conditions the child can exit before the parent has set up its read loop.

Fixing the race condition is ideally done by sending ACK messages back
and forth, but that involves massive changes and has side effects. The
workaround is to address the problem on AIX alone, by adding a
reasonable delay.

PR-URL: #11715
Fixes: #7973
Fixes: #9765
Fixes: #11541
Reviewed-By: Michael Dawson <michael_dawson@ca.ibm.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Gibson Fahnestock <gibfahn@gmail.com>
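
The parent/child arrangement described in this commit message can be sketched roughly as follows. This is only an illustration of the pattern, not the change made in #11715: the child here is a small Python script standing in for the node test, the names `run_child_on_pty` and `exit_delay_for` are hypothetical, and the 0.1 second delay is an arbitrary value chosen for the example.

```python
import errno
import os
import pty
import subprocess
import sys

# Hypothetical stand-in for a pseudo-tty test: write one line, then
# optionally linger for a moment before exiting.
CHILD_SOURCE = """
import sys, time
sys.stdout.write("hello from child\\n")
sys.stdout.flush()
time.sleep(float(sys.argv[1]))
"""


def exit_delay_for(platform_name):
    """Return how long the child should linger before exiting.

    Assumption for this sketch: only AIX gets a delay, and the value
    is illustrative rather than taken from the real fix.
    """
    return 0.1 if platform_name.startswith("aix") else 0.0


def run_child_on_pty(exit_delay_s=0.0):
    """Run the child with its stdio attached to a pty and collect its output."""
    master_fd, slave_fd = pty.openpty()
    child = subprocess.Popen(
        [sys.executable, "-c", CHILD_SOURCE, str(exit_delay_s)],
        stdin=slave_fd, stdout=slave_fd, stderr=slave_fd)
    # The parent keeps only the master side. Nothing here tells the child
    # to wait until the read loop below has started, which is the missing
    # synchronization the commit message refers to.
    os.close(slave_fd)

    chunks = []
    while True:
        try:
            data = os.read(master_fd, 1024)
        except OSError as err:
            # Linux reports EIO on the master once the slave side is closed.
            if err.errno == errno.EIO:
                break
            raise
        if not data:  # other platforms may signal end of output with EOF
            break
        chunks.append(data)

    os.close(master_fd)
    child.wait()
    return b"".join(chunks)


if __name__ == "__main__":
    output = run_child_on_pty(exit_delay_for(sys.platform))
    print(repr(output))
```

Keeping the delay in one platform-keyed helper mirrors the stated trade-off: other platforms are left untouched, and the full ACK-based handshake is avoided.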
italoacasas pushed a commit to italoacasas/node that referenced this issue Mar 20, 2017
jungx098 pushed a commit to jungx098/node that referenced this issue Mar 21, 2017
ryzokuken added a commit to ryzokuken/node that referenced this issue Mar 18, 2018
Rename the tests appropriately and mention the subsystem in each name.
Also, make a few basic changes so that the tests conform to the standard test structure.

1. Renamed test-regress-GH-io-1068 to test-tty-stdin-end
2. Renamed test-regress-GH-io-1811 to test-zlib-kmaxlength-rangeerror
3. Renamed test-regress-GH-node-9326 to test-kill-segfault-freebsd
4. Renamed test-timers-regress-nodejsGH-9765 to test-timers-setimmediate-infinite-loop
5. Renamed test-tls-pfx-nodejsgh-5100-regr to test-tls-pfx-authorizationerror
6. Renamed test-tls-regr-nodejsgh-5108 to test-tls-tlswrap-segfault

Fixes: nodejs#19105
Refs: nodejs#19105
Refs: https://github.com/nodejs/node/blob/master/doc/guides/writing-tests.md#test-structure
lpinca pushed a commit that referenced this issue Mar 18, 2018
Rename the tests appropriately alongside mentioning the subsystem.
Also, make a few basic changes to make sure the tests conform to the
standard test structure.

- Rename test-regress-GH-io-1068 to test-tty-stdin-end
- Rename test-regress-GH-io-1811 to test-zlib-kmaxlength-rangeerror
- Rename test-regress-GH-node-9326 to test-kill-segfault-freebsd
- Rename test-timers-regress-GH-9765 to test-timers-setimmediate-infinite-loop
- Rename test-tls-pfx-gh-5100-regr to test-tls-pfx-authorizationerror
- Rename test-tls-regr-gh-5108 to test-tls-tlswrap-segfault

PR-URL: #19332
Fixes: #19105
Reviewed-By: Gireesh Punathil <gpunathi@in.ibm.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Weijia Wang <starkwang@126.com>
Reviewed-By: Yuta Hiroto <hello@hiroppy.me>
Reviewed-By: Richard Lau <riclau@uk.ibm.com>
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: Shingo Inoue <leko.noor@gmail.com>
MylesBorins pushed a commit that referenced this issue Mar 20, 2018
MylesBorins pushed a commit that referenced this issue Mar 20, 2018
BethGriggs pushed a commit that referenced this issue Dec 3, 2018