This repository has been archived by the owner on Jun 18, 2021. It is now read-only.

node-report CI runs failing to start node on smartos #84

Closed
hhellyer opened this issue Apr 25, 2017 · 14 comments

@hhellyer
Contributor

hhellyer commented Apr 25, 2017

The smartos machines seem to be failing to run node 6 in our CI runs. I found this when testing my PR branch but have confirmed it below with a build against master using Node 6:

https://ci.nodejs.org/view/post-mortem/job/nodereport-continuous-integration/154/MACHINE=smartos16-64/console

The error occurs during setup, before node-report is installed:
ld.so.1: node: fatal: relocation error: file /home/iojs/build/workspace/nodereport-continuous-integration/MACHINE/smartos16-64/node-v6.10.2-sunos-x64/bin/node: symbol _ZNSt8__detail15_List_node_base7_M_hookEPS0_: referenced symbol not found

It looks likely to be a setup or installation error on the build machines. It extracted node-v6.10.2-sunos-x64.tar.gz to test with. (Full details in the console output above.)

@gibfahn
Member

gibfahn commented Apr 25, 2017

cc/ @nodejs/platform-smartos

@misterdjules

Looking into it.

@misterdjules

On that machine, the latest v6.x node binary downloaded from nodejs.org has the following runtime dependencies:

[root@e993206b-b8f0-4e8f-b8fd-508bffe36e16 /var/tmp/node-v6.10.2-sunos-x64]# ldd ./bin/node 
        libkstat.so.1 =>         /lib/64/libkstat.so.1
        libumem.so.1 =>  /lib/64/libumem.so.1
        libsocket.so.1 =>        /lib/64/libsocket.so.1
        libnsl.so.1 =>   /lib/64/libnsl.so.1
        librt.so.1 =>    /lib/64/librt.so.1
        libsendfile.so.1 =>      /lib/64/libsendfile.so.1
        libstdc++.so.6 =>        /usr/lib/64/libstdc++.so.6
        libstdc++.so.6 (GLIBCXX_3.4.18) =>       (version not found)
        libstdc++.so.6 (CXXABI_1.3) =>   (version not found)
        libm.so.2 =>     /lib/64/libm.so.2
        libgcc_s.so.1 =>         /usr/lib/64/libgcc_s.so.1
        libpthread.so.1 =>       /lib/64/libpthread.so.1
        libc.so.1 =>     /lib/64/libc.so.1
        libmp.so.2 =>    /lib/64/libmp.so.2
        libmd.so.1 =>    /lib/64/libmd.so.1
[root@e993206b-b8f0-4e8f-b8fd-508bffe36e16 /var/tmp/node-v6.10.2-sunos-x64]# 
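A minimal sketch of how the ldd check above could be scripted in a CI setup step, so that a job reports the unmet dependency directly instead of failing later with a relocation error. The function name unmet_deps is hypothetical; it only filters ldd output for the "not found" markers shown above and reads from stdin, so it can be tried without a SmartOS machine.

```shell
# Print every line of ldd output that reports an unmet dependency,
# e.g. "libstdc++.so.6 (GLIBCXX_3.4.18) => (version not found)".
# Reads ldd output on stdin; leading whitespace is stripped.
unmet_deps() {
  awk '/not found/ { sub(/^[ \t]+/, ""); print }'
}

# Example use on a build machine (not executed here):
#   if [ -n "$(ldd ./bin/node | unmet_deps)" ]; then
#     echo "node binary has unmet runtime dependencies" >&2
#   fi
```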

We can see that the dependency on the C++ runtime is not met. Looking into the binary, we can see that it was built to look for runtime libraries in a directory created by the gcc-4.8 package:

[root@e993206b-b8f0-4e8f-b8fd-508bffe36e16 /var/tmp/node-v6.10.2-sunos-x64]# elfdump -d ./bin/node | egrep '(RUNPATH|RPATH)'
      [13]  RUNPATH           0x2c2231            /opt/local/gcc48/x86_64-sun-solaris2.11/lib/amd64:/opt/local/gcc48/lib/amd64
      [14]  RPATH             0x2c2231            /opt/local/gcc48/x86_64-sun-solaris2.11/lib/amd64:/opt/local/gcc48/lib/amd64
[root@e993206b-b8f0-4e8f-b8fd-508bffe36e16 /var/tmp/node-v6.10.2-sunos-x64]# 
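As a rough sketch, the RUNPATH inspection above can be turned into a check for search directories that do not exist on the machine. The function names runpath_dirs and missing_dirs are hypothetical, and the field positions assume elfdump -d output shaped like the lines shown above.

```shell
# Print each RUNPATH/RPATH directory from `elfdump -d` output (stdin),
# one per line, with duplicates removed.
runpath_dirs() {
  awk '$2 == "RUNPATH" || $2 == "RPATH" { print $4 }' | tr ':' '\n' | sort -u
}

# Read directory names on stdin and print those that do not exist.
missing_dirs() {
  while IFS= read -r dir; do
    [ -d "$dir" ] || printf '%s\n' "$dir"
  done
}

# Example use on a build machine (not executed here):
#   elfdump -d ./bin/node | runpath_dirs | missing_dirs
```

Any directory printed by the last command is one the runtime linker will skip before it falls back to the default library paths.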

However, that package is not installed on that machine:

[root@e993206b-b8f0-4e8f-b8fd-508bffe36e16 /var/tmp/node-v6.10.2-sunos-x64]# pkgin list | grep gcc
gcc47-libs-4.7.4nb2  The GNU Compiler Collection (GCC) support shared libraries
gcc49-4.9.3nb1       The GNU Compiler Collection (GCC) - 4.9 Release Series
gcc49-libs-4.9.3nb2  The GNU Compiler Collection (GCC) support shared libraries
gccmakedep-1.0.3     Create dependencies in Makefiles using gcc
[root@e993206b-b8f0-4e8f-b8fd-508bffe36e16 /var/tmp/node-v6.10.2-sunos-x64]#

That package couldn't be installed on that machine because it is not available in the pkgsrc repository that it uses.

Thus, at runtime, the linker chooses to link to the C++ runtime library of the global zone (/usr/lib/64/libstdc++.so.6), after trying to use the C++ runtime library in the gcc-4.8 package's directories:

[root@e993206b-b8f0-4e8f-b8fd-508bffe36e16 /var/tmp/node-v6.10.2-sunos-x64]# LD_DEBUG=all ./bin/node 2>&1 | grep libstdc | head
07429: file=libstdc++.so.6;  needed by bin/node
07429: find object=libstdc++.so.6; searching
07429:  trying path=/opt/local/gcc48/x86_64-sun-solaris2.11/lib/amd64/libstdc++.so.6
07429:  trying path=/opt/local/gcc48/lib/amd64/libstdc++.so.6
07429:  trying path=/lib/64/libstdc++.so.6
07429:  trying path=/usr/lib/64/libstdc++.so.6
07429: file=/usr/lib/64/libstdc++.so.6  [ ELF ]; generating link map
07429:             libstdc++.so.6              GLIBCXX_3.4.18
07429:             libstdc++.so.6              CXXABI_1.3
07429: file=/usr/lib/64/libstdc++.so.6;  analyzing  [ RTLD_LAZY RTLD_GLOBAL RTLD_WORLD RTLD_NODELETE ]
[root@e993206b-b8f0-4e8f-b8fd-508bffe36e16 /var/tmp/node-v6.10.2-sunos-x64]#
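The trace above can be summarised with two small filters: one listing the paths the linker tried, one printing the file it finally loaded for a given library. This is only a sketch; tried_paths and resolved_lib are hypothetical names, and the patterns assume LD_DEBUG output formatted like the lines above.

```shell
# Print each path the runtime linker tried, in order (trace on stdin).
tried_paths() {
  awk '/trying path=/ { sub(/.*trying path=/, ""); print }'
}

# Print the full path that was loaded for library $1 (trace on stdin).
# A load shows up as "file=<path> [ ELF ]; generating link map".
resolved_lib() {
  awk -v lib="$1" '
    /generating link map/ {
      for (i = 1; i <= NF; i++) {
        if ($i ~ /^file=/) {
          path = substr($i, 6)            # drop the "file=" prefix
          n = split(path, parts, "/")
          if (parts[n] == lib) print path
        }
      }
    }'
}

# Example use on a build machine (not executed here):
#   LD_DEBUG=all ./bin/node 2>&1 | resolved_lib libstdc++.so.6
```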

So it seems to me the problem is that node v6 binaries are built with gcc 4.8, but SmartOS 16.x images do not ship with a gcc-4.8 runtime. This is something that is described at nodejs/build#628 (comment).

The latest SmartOS 15.4 LTS images should not have these issues, but the Jenkins job would need to make sure that the gcc-4.8 runtime libraries are installed.

Let me know if I can do anything else to help.

@hhellyer
Contributor Author

@misterdjules - Ok, I can follow that we don't have the right gcc libraries installed and it sounds like there's a difference in their availability for 15.x and 16.x.
I don't have access to the build machines; @nodejs/build do, but I don't know what to ask them to do.
I'd assume for 15.x they need to install the libraries but what do we do about 16.x?
(I may well have missed something!)

@hhellyer
Contributor Author

Node worked on smartos for my latest build of a branch I was raising a PR for: https://ci.nodejs.org/view/post-mortem/job/nodereport-continuous-integration/156/

I haven't seen anything to say anyone has updated the machines but if it works when I next build master I'll close this.

@misterdjules

> Node worked on smartos for my latest build of a branch I was raising a PR for: https://ci.nodejs.org/view/post-mortem/job/nodereport-continuous-integration/156/

That's because build 156 has a different NODE_VERSION parameter than build 154. Build 156 uses nightly builds, which are built using GCC 4.9:

[root@headnode (coal) /var/tmp]# elfdump -d node-v8.0.0-nightly20170425ba7bac5c37-sunos-x64/bin/node | grep RUNPATH
      [13]  RUNPATH           0x38f2ba            /opt/local/lib/:/opt/local/gcc49/x86_64-sun-solaris2.11/lib/amd64:/opt/local/gcc49/lib/amd64
[root@headnode (coal) /var/tmp]# 

for which runtime libraries are available on SmartOS 16.x images.

> Ok, I can follow that we don't have the right gcc libraries installed and it sounds like there's a difference in their availability for 15.x and 16.x.

Correct.

> I don't have access to the build machines; @nodejs/build do, but I don't know what to ask them to do.
> I'd assume for 15.x they need to install the libraries but what do we do about 16.x?

I think for now we should consider that some binary releases available from nodejs.org just don't work on SmartOS 16.x images, so there's no point in running some of these jobs on these images.

For SmartOS 15.x images, when testing against versions of node that are built with GCC 4.8, the Jenkins job should run the pkgsrc command that installs the GCC 4.8 runtime libraries, with something like pkgin install -y gcc48-libs-4.8.4nb1. When testing against versions of node that are built with GCC 4.9, the Jenkins job should run the pkgsrc command that installs the GCC 4.9 runtime libraries, with something like pkgin install -y gcc49-libs-4.9.3nb1.
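One way to avoid hard-coding which Node release needs which runtime is to derive the package name from the binary's own RUNPATH. The sketch below assumes the gccNN-libs naming seen in the pkgin list output earlier in the thread; gcc_libs_pkg is a hypothetical helper, and whether the resulting package exists in a given pkgsrc repository still has to be checked separately.

```shell
# Read `elfdump -d` output on stdin and print the pkgsrc runtime package
# implied by any /opt/local/gccNN/ directory in RUNPATH/RPATH,
# e.g. /opt/local/gcc48/lib/amd64 -> gcc48-libs.
gcc_libs_pkg() {
  tr ' \t:' '\n\n\n' |
    sed -n 's|^/opt/local/\(gcc[0-9][0-9]*\)/.*|\1-libs|p' |
    sort -u
}

# Example use on a build machine (not executed here):
#   elfdump -d ./bin/node | gcc_libs_pkg
```

For the v6.10.2 binary examined above this would print gcc48-libs; for the v8 nightly it would print gcc49-libs.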

@gibfahn
Member

gibfahn commented Apr 26, 2017

> I think for now we should consider that some binary releases available from nodejs.org just don't work on SmartOS 16.x images, so there's no point in running some of these jobs on these images.

Is there an easy way to work out which ones we should skip, i.e. a Node release line or a command like:

somecommand && exit 0
# or
[[ "$NODE_VERSION" =~ ^v?([0-9]+) ]] && (( BASH_REMATCH[1] > 7 )) && exit 0

> For SmartOS 15.x images, when testing against versions of node that are built with GCC 4.8, the Jenkins job should run the pkgsrc command that installs the GCC 4.8 runtime libraries, with something like pkgin install -y gcc48-libs-4.8.4nb1. When testing against versions of node that are built with GCC 4.9, the Jenkins job should run the pkgsrc command that installs the GCC 4.9 runtime libraries, with something like pkgin install -y gcc49-libs-4.9.3nb1.

So we need to be installing these every time we run the job?

@misterdjules

> Is there an easy way to work out which ones we should skip, i.e. a Node release line or a command like:

I don't know what releases are built with which compiler, so I can't recommend anything. @nodejs/build probably knows.

> So we need to be installing these every time we run the job?

Yes, or have it installed when the build agent is provisioned. I don't know how @nodejs/build handle these.

@jbergstroem
Member

@misterdjules we don't install any specific compilers; just follow whatever pkgin or smartos provides. See below link to supported platforms.

I'm not a fan of having Jenkins install any packages at any time. If we need to update gcc specifically for node-report we should probably do so on separate machines to keep node.js testing in line with supported platforms.

@mhdawson
Member

mhdawson commented Apr 26, 2017

This is the logic I had added to the release jobs:

# smartos14 is only supported in Node versions 7 and lower
NODE_VERSION=$(python tools/getnodeversion.py)
MAJOR_VERSION=$(echo "$NODE_VERSION" | cut -d '.' -f 1)
# Second field of the 'release' line; only its first four characters (the year) are compared below
SMARTOS_VERSION=$(grep 'release' /etc/pkgsrc_version | awk '{print $2}')
BUILD_RELEASE="DONT_RUN"
echo "$SMARTOS_VERSION"
if [[ "${SMARTOS_VERSION:0:4}" -ge 2015 && "${MAJOR_VERSION}" -gt 7 ]]; then
  BUILD_RELEASE="RUN"
fi

if [[ "${SMARTOS_VERSION:0:4}" -eq 2014 && "${MAJOR_VERSION}" -ge 0 && "${MAJOR_VERSION}" -le 7 ]]; then
  BUILD_RELEASE="RUN"
fi

if [[ "${SMARTOS_VERSION:0:4}" -eq 2013 && "${MAJOR_VERSION}" -eq 0 ]]; then
  BUILD_RELEASE="RUN"
fi

if [[ "$BUILD_RELEASE" = "RUN" ]]; then
  : # ... code to build.
else
  echo "Node version ${MAJOR_VERSION}.x is not built on ${SMARTOS_VERSION}, skipping"
fi
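For reuse outside the release jobs, the same decision table could be folded into a single function, sketched here in POSIX sh. The name should_build is hypothetical, and the sketch assumes the pkgsrc release string starts with a four-digit year, as the ${SMARTOS_VERSION:0:4} comparisons above imply.

```shell
# should_build PKGSRC_RELEASE NODE_MAJOR
# Succeeds (exit status 0) when the release-job rules above say this Node
# major version is built on a machine with this pkgsrc release.
# Note: sets the plain variables year and major (no `local` in POSIX sh).
should_build() {
  year=$(printf '%s' "$1" | cut -c1-4)
  major=$2
  if [ "$year" -ge 2015 ]; then
    [ "$major" -gt 7 ]
  elif [ "$year" -eq 2014 ]; then
    [ "$major" -ge 0 ] && [ "$major" -le 7 ]
  elif [ "$year" -eq 2013 ]; then
    [ "$major" -eq 0 ]
  else
    return 1
  fi
}

# Example use (not executed here):
#   should_build "$SMARTOS_VERSION" "$MAJOR_VERSION" || exit 0
```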

@mhdawson
Member

@jbergstroem just to note this is not specific to testing node-report but any module with native code. We've not seen it before since CITGM is not enabled for smartos.

It would seem that for modules we should make sure that we only test on machines that match the support level.

So when we test a module with Node version 6, we should only run on the smartos14 test machines. I think we should be able to use a variation on what I posted for the release jobs in the comment above to get the right combination. We'll have to add smartos14 to the matrix, but I see we do have some test machines at that level.

@gibfahn is that enough info to make the required changes?

@rnchamberlain
Contributor

Just for reference, see also nodejs/node#11444

@rnchamberlain rnchamberlain self-assigned this May 26, 2017
@rnchamberlain
Contributor

I've added smartos14-64 to the configuration matrix, and set the combination filter field to this:

(MACHINES=="all" || MACHINES.contains(MACHINE))
 && !((NODE_VERSION=="v4" || NODE_VERSION=="v5") && (MACHINE.contains("s390") || MACHINE.contains("aix")))
 && !((NODE_VERSION=="v6" || NODE_VERSION=="v7") && (MACHINE.contains("smartos15") || MACHINE.contains("smartos16")))
 && !((NODE_VERSION.contains("v8")) && (MACHINE.contains("smartos14")))

So Node versions 6 and 7 run on smartos14 only, and Node version 8 runs on smartos15 or smartos16 only.
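To sanity-check a filter like this outside Jenkins, its exclusion rules can be mirrored in a small shell function. This is an illustrative sketch only: combo_allowed is a hypothetical name, the MACHINES selection clause is left out, and the real filter is the Groovy expression above.

```shell
# combo_allowed NODE_VERSION MACHINE
# Succeeds when the exclusion rules of the combination filter permit the pair.
combo_allowed() {
  case "$1" in
    v4|v5) case "$2" in *s390*|*aix*) return 1 ;; esac ;;
    v6|v7) case "$2" in *smartos15*|*smartos16*) return 1 ;; esac ;;
  esac
  # The filter uses contains("v8") here, so match v8 anywhere in the string.
  case "$1" in
    *v8*) case "$2" in *smartos14*) return 1 ;; esac ;;
  esac
  return 0
}

# Example use (not executed here):
#   combo_allowed v6 smartos16-64 || echo "combination filtered out"
```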

@rnchamberlain
Contributor

CI runs using node 6 are now ok, running the node-report tests on test-joyent-smartos14-x64-2 (smartos14-64), e.g.:

https://ci.nodejs.org/view/post-mortem/job/nodereport-continuous-integration/180/
