Separate terminfo library in ncurses breaks Python build #4049

Closed
vanzod opened this issue Jan 18, 2017 · 19 comments

@vanzod
Member

vanzod commented Jan 18, 2017

Since --with-termlib was added to ncurses-6.0.eb (see PR #3545), building Python against this ncurses with Python-2.7.12-foss-2016b.eb fails because the linker cannot find the required symbols:

./libpython2.7.so: error: undefined reference to 'tgetnum'
./libpython2.7.so: error: undefined reference to 'PC'
./libpython2.7.so: error: undefined reference to 'BC'
./libpython2.7.so: error: undefined reference to 'UP'
./libpython2.7.so: error: undefined reference to 'tgetent'
./libpython2.7.so: error: undefined reference to 'tgetstr'
./libpython2.7.so: error: undefined reference to 'tgetflag'

If ncurses is built without that flag, the Python build completes successfully.

@ocaisa Suggestions?

@ocaisa
Member

ocaisa commented Jan 19, 2017

I think this is something that can be fixed in the Python easyblock: when it sets up ncurses linking, it can check whether the termlib static library exists and, if so, append it to the static linking list.
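
A minimal sketch of that check, for illustration only. The function name `ncurses_static_libs` and its arguments are hypothetical and are not taken from the actual Python easyblock; the only assumption carried over from the thread is that `libtinfo.a` exists under the ncurses `lib` directory when ncurses was configured with --with-termlib.

```python
import os


def ncurses_static_libs(ncurses_root, libnames=("ncurses", "tinfo")):
    """Return the static ncurses libraries that exist under <ncurses_root>/lib.

    libtinfo.a is only expected when ncurses was built with --with-termlib,
    so it is appended only if the file is actually there.
    """
    libdir = os.path.join(ncurses_root, "lib")
    found = []
    for name in libnames:
        path = os.path.join(libdir, "lib%s.a" % name)
        if os.path.isfile(path):
            found.append(path)
    return found
```

The order matters for static linking: `libncurses.a` comes first because it refers to symbols that `libtinfo.a` defines.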

@vanzod
Member Author

vanzod commented Jan 19, 2017

Oddly enough, when I add libtinfo.a to the list of static libraries, the linker complains that it is not PIC compliant. Any idea why I am seeing such an error?

@ocaisa
Member

ocaisa commented Jan 19, 2017 via email

@vanzod
Member Author

vanzod commented Jan 20, 2017

@ocaisa Thanks for the clarification. The problem was that, although my system has both the dummy and the GCCcore versions of ncurses 6.0, EB picks the dummy one for static linking, which generates the missing symbols error. So we still need a way to solve the problem when the --with-termlib flag is used for ncurses.

@boegel
Member

boegel commented Jan 20, 2017

@ocaisa you can build with -fPIC when a dummy toolchain is used by specifying CFLAGS via buildopts, cf. https://github.com/hpcugent/easybuild-easyconfigs/blob/master/easybuild/easyconfigs/z/zlib/zlib-1.2.8.eb#L17
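
As a rough illustration of the buildopts suggestion above, an easyconfig fragment for a dummy-toolchain build might look like the following. This is a sketch, not a copy of the linked zlib easyconfig: the exact flags (the `-O2` level in particular) and the toolchain version string are assumptions.

```python
# Fragment of a hypothetical easyconfig that uses the dummy toolchain.
toolchain = {'name': 'dummy', 'version': ''}

# Assumption: with the dummy toolchain $CFLAGS has to be set by hand.
# -fPIC makes the static library usable inside a shared object;
# an explicit -O level is included so the build does not fall back to the compiler default.
buildopts = 'CFLAGS="-O2 -fPIC"'
```

With a configure/make style build, the value of `buildopts` is appended to the `make` command line.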

Updating the Python easyblock for the case where ncurses is known to require additional libraries sounds like a good idea.

boegel added this to the 3.x milestone Jan 20, 2017
@vanzod
Member Author

vanzod commented Jan 20, 2017

@ocaisa
Member

ocaisa commented Jan 20, 2017 via email

@boegel
Member

boegel commented Jan 21, 2017

@ocaisa Assuming that the system C compiler is GCC is a pretty safe bet, no?

@ocaisa
Member

ocaisa commented Jan 21, 2017

Sure, of course, but the fix is a hack compared to the capabilities available when you use an understood compiler and the recommended approach of toolchainopts.

@boegel
Member

boegel commented Jan 21, 2017

easybuilders/easybuild-framework#1233 is an opportunity to fix that

@vanzod
Member Author

vanzod commented Jan 23, 2017

@ocaisa Even with the "hack" in my PR we are still stuck at the fact that the terminfo library is not position independent. Here is the full error stack.

/usr/software/software/Core/binutils/2.26/bin/ld.gold: error: /usr/software/software/Core/ncurses/6.0/lib/libtinfo.a(lib_tgoto.o): requires dynamic R_X86_64_PC32 reloc against 'tparm' which may overflow at runtime; recompile with -fPIC
/usr/software/software/Core/binutils/2.26/bin/ld.gold: error: /usr/software/software/Core/ncurses/6.0/lib/libtinfo.a(lib_tparm.o): requires dynamic R_X86_64_PC32 reloc against '_nc_prescreen' which may overflow at runtime; recompile with -fPIC
/usr/software/software/Core/binutils/2.26/bin/ld.gold: error: /usr/software/software/Core/ncurses/6.0/lib/libtinfo.a(lib_tputs.o): requires dynamic R_X86_64_PC32 reloc against 'stdout' which may overflow at runtime; recompile with -fPIC
/usr/software/software/Core/binutils/2.26/bin/ld.gold: error: /usr/software/software/Core/ncurses/6.0/lib/libtinfo.a(trim_sgr0.o): requires dynamic R_X86_64_PC32 reloc against 'tparm' which may overflow at runtime; recompile with -fPIC
/usr/software/software/Core/binutils/2.26/bin/ld.gold: error: /usr/software/software/Core/ncurses/6.0/lib/libtinfo.a(comp_error.o): requires dynamic R_X86_64_PC32 reloc against '_nc_globals' which may overflow at runtime; recompile with -fPIC
/usr/software/software/Core/binutils/2.26/bin/ld.gold: error: /usr/software/software/Core/ncurses/6.0/lib/libtinfo.a(comp_hash.o): requires dynamic R_X86_64_PC32 reloc against '_nc_get_hash_info' which may overflow at runtime; recompile with -fPIC
/usr/software/software/Core/binutils/2.26/bin/ld.gold: error: /usr/software/software/Core/ncurses/6.0/lib/libtinfo.a(doalloc.o): requires dynamic R_X86_64_PC32 reloc against 'realloc' which may overflow at runtime; recompile with -fPIC
/usr/software/software/Core/binutils/2.26/bin/ld.gold: error: /usr/software/software/Core/ncurses/6.0/lib/libtinfo.a(lib_baudrate.o): requires dynamic R_X86_64_32 reloc which may overflow at runtime; recompile with -fPIC
/usr/software/software/Core/binutils/2.26/bin/ld.gold: error: /usr/software/software/Core/ncurses/6.0/lib/libtinfo.a(lib_cur_term.o): requires dynamic R_X86_64_PC32 reloc against 'cur_term' which may overflow at runtime; recompile with -fPIC
/usr/software/software/Core/binutils/2.26/bin/ld.gold: error: /usr/software/software/Core/ncurses/6.0/lib/libtinfo.a(lib_napms.o): requires dynamic R_X86_64_PC32 reloc against '__errno_location' which may overflow at runtime; recompile with -fPIC
/usr/software/software/Core/binutils/2.26/bin/ld.gold: error: /usr/software/software/Core/ncurses/6.0/lib/libtinfo.a(lib_setup.o): requires dynamic R_X86_64_PC32 reloc against 'TABSIZE' which may overflow at runtime; recompile with -fPIC
/usr/software/software/Core/binutils/2.26/bin/ld.gold: error: /usr/software/software/Core/ncurses/6.0/lib/libtinfo.a(lib_ti.o): requires dynamic R_X86_64_PC32 reloc against '_nc_find_type_entry' which may overflow at runtime; recompile with -fPIC
/usr/software/software/Core/binutils/2.26/bin/ld.gold: error: /usr/software/software/Core/ncurses/6.0/lib/libtinfo.a(lib_ttyflags.o): requires dynamic R_X86_64_PC32 reloc against 'tcsetattr' which may overflow at runtime; recompile with -fPIC
/usr/software/software/Core/binutils/2.26/bin/ld.gold: error: /usr/software/software/Core/ncurses/6.0/lib/libtinfo.a(name_match.o): requires dynamic R_X86_64_PC32 reloc against '_nc_globals' which may overflow at runtime; recompile with -fPIC
/usr/software/software/Core/binutils/2.26/bin/ld.gold: error: /usr/software/software/Core/ncurses/6.0/lib/libtinfo.a(read_entry.o): requires dynamic R_X86_64_PC32 reloc against 'malloc' which may overflow at runtime; recompile with -fPIC
/usr/software/software/Core/binutils/2.26/bin/ld.gold: error: /usr/software/software/Core/ncurses/6.0/lib/libtinfo.a(access.o): requires dynamic R_X86_64_PC32 reloc against '__xstat' which may overflow at runtime; recompile with -fPIC
/usr/software/software/Core/binutils/2.26/bin/ld.gold: error: /usr/software/software/Core/ncurses/6.0/lib/libtinfo.a(alloc_ttype.o): requires dynamic R_X86_64_PC32 reloc against 'malloc' which may overflow at runtime; recompile with -fPIC
/usr/software/software/Core/binutils/2.26/bin/ld.gold: error: /usr/software/software/Core/ncurses/6.0/lib/libtinfo.a(comp_captab.o): requires dynamic R_X86_64_32 reloc which may overflow at runtime; recompile with -fPIC
/usr/software/software/Core/binutils/2.26/bin/ld.gold: error: /usr/software/software/Core/ncurses/6.0/lib/libtinfo.a(db_iterator.o): requires dynamic R_X86_64_PC32 reloc against '_nc_globals' which may overflow at runtime; recompile with -fPIC
/usr/software/software/Core/binutils/2.26/bin/ld.gold: error: /usr/software/software/Core/ncurses/6.0/lib/libtinfo.a(free_ttype.o): requires dynamic R_X86_64_PC32 reloc against '_nc_user_definable' which may overflow at runtime; recompile with -fPIC
/usr/software/software/Core/binutils/2.26/bin/ld.gold: error: /usr/software/software/Core/ncurses/6.0/lib/libtinfo.a(getenv_num.o): requires dynamic R_X86_64_32 reloc which may overflow at runtime; recompile with -fPIC
/usr/software/software/Core/binutils/2.26/bin/ld.gold: error: /usr/software/software/Core/ncurses/6.0/lib/libtinfo.a(home_terminfo.o): requires dynamic R_X86_64_PC32 reloc against '_nc_globals' which may overflow at runtime; recompile with -fPIC
/usr/software/software/Core/binutils/2.26/bin/ld.gold: error: /usr/software/software/Core/ncurses/6.0/lib/libtinfo.a(lib_options.o): requires dynamic R_X86_64_PC32 reloc against 'cur_term' which may overflow at runtime; recompile with -fPIC
/usr/software/software/Core/binutils/2.26/bin/ld.gold: error: /usr/software/software/Core/ncurses/6.0/lib/libtinfo.a(lib_raw.o): requires dynamic R_X86_64_PC32 reloc against '_nc_set_tty_mode_sp' which may overflow at runtime; recompile with -fPIC
/usr/software/software/Core/binutils/2.26/bin/ld.gold: error: /usr/software/software/Core/ncurses/6.0/lib/libtinfo.a(entries.o): requires dynamic R_X86_64_PC32 reloc against '_nc_head' which may overflow at runtime; recompile with -fPIC
/usr/software/software/Core/binutils/2.26/bin/ld.gold: error: /usr/software/software/Core/ncurses/6.0/lib/libtinfo.a(init_keytry.o): requires dynamic R_X86_64_32 reloc against '_nc_tinfo_fkeys' which may overflow at runtime; recompile with -fPIC
/usr/software/software/Core/binutils/2.26/bin/ld.gold: error: /usr/software/software/Core/ncurses/6.0/lib/libtinfo.a(lib_has_cap.o): requires dynamic R_X86_64_PC32 reloc against 'cur_term' which may overflow at runtime; recompile with -fPIC
/usr/software/software/Core/binutils/2.26/bin/ld.gold: error: /usr/software/software/Core/ncurses/6.0/lib/libtinfo.a(key_defined.o): requires dynamic R_X86_64_PC32 reloc against 'SP' which may overflow at runtime; recompile with -fPIC
/usr/software/software/Core/binutils/2.26/bin/ld.gold: error: /usr/software/software/Core/ncurses/6.0/lib/libtinfo.a(add_tries.o): requires dynamic R_X86_64_PC32 reloc against 'calloc' which may overflow at runtime; recompile with -fPIC
./Modules/posixmodule.c:7578: warning: the use of `tempnam' is dangerous, better use `mkstemp'
./Modules/posixmodule.c:7631: warning: the use of `tmpnam_r' is dangerous, better use `mkstemp'
collect2: error: ld returned 1 exit status

@vanzod
Member Author

vanzod commented Jan 23, 2017

Another thing I am trying to understand is why EB picks the ncurses built with the dummy toolchain instead of the one built with GCCcore, which would make much more sense. Looking through the attached log, the problem seems to arise when loading the dependencies for libreadline. The weird thing is that if I run the same Lmod command on the system, it returns GCCcore/ncurses as the dependency. At this point I am totally lost.
easybuild-Python-2.7.12-20170123.103930.sHAOs.log.txt

@vanzod
Member Author

vanzod commented Jan 31, 2017

I think I found the source of the issue. First of all, this affects only Python-3.5.2 for the foss-2016b and intel-2016b toolchains (maybe others, but I am working with those two right now), since it has XZ-5.2.2 as a dependency. XZ-5.2.2 in turn depends on the gettext-0.19.8 library. However, EB (taking the foss toolchain as an example) cannot use a hypothetical gettext-0.19.8-GCC-5.4.0-2.26.eb: at this toolchain level (at the dummy level the gettext dependencies are stripped down) it would depend on libxml2-2.9.4-GCC-5.4.0-2.26.eb, which in turn depends on XZ-5.2.2-GCC-5.4.0-2.26.eb, and that closes a dependency loop. For this reason EB builds XZ-5.2.2-GCC-5.4.0-2.26.eb with gettext-0.19.8.eb, which depends on ncurses-6.0.eb. When Python-3.5.2 is then compiled, it fails because it links against the ncurses library with the stripped symbols.
For the same reason it is also not possible to use a gettext-0.19.8-GCC-5.4.0-2.26.eb easyconfig, since this would depend on the gettext at the dummy level.

Unfortunately, I do not currently see a way to break the dependency loop and avoid linking against the dummy gettext. Suggestions?
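
The loop described above can be made concrete with a small depth-first search. This is only an illustrative sketch: `find_cycle` is a hypothetical helper, not EasyBuild's dependency resolver, and the graphs contain only the easyconfig names mentioned in this comment.

```python
def find_cycle(deps, start):
    """Depth-first search from `start`; return the first dependency cycle found, or None."""
    path = []

    def visit(node):
        if node in path:
            # We came back to a node on the current path: that is a loop.
            return path[path.index(node):] + [node]
        path.append(node)
        for dep in deps.get(node, []):
            cycle = visit(dep)
            if cycle:
                return cycle
        path.pop()
        return None

    return visit(start)


# gettext at the GCC level pulls in libxml2, which pulls XZ back in.
with_gcc_gettext = {
    "XZ-5.2.2-GCC-5.4.0-2.26": ["gettext-0.19.8-GCC-5.4.0-2.26"],
    "gettext-0.19.8-GCC-5.4.0-2.26": ["libxml2-2.9.4-GCC-5.4.0-2.26"],
    "libxml2-2.9.4-GCC-5.4.0-2.26": ["XZ-5.2.2-GCC-5.4.0-2.26"],
}

# gettext at the dummy level has no libxml2 dependency, so there is no loop.
with_dummy_gettext = {
    "XZ-5.2.2-GCC-5.4.0-2.26": ["gettext-0.19.8"],
    "gettext-0.19.8": ["ncurses-6.0"],
}

print(find_cycle(with_gcc_gettext, "XZ-5.2.2-GCC-5.4.0-2.26"))
print(find_cycle(with_dummy_gettext, "XZ-5.2.2-GCC-5.4.0-2.26"))
```

The first call reports the XZ, gettext, libxml2, XZ loop; the second returns `None`, which is why the dummy-level gettext ends up being used.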

@ocaisa
Member

ocaisa commented Feb 1, 2017

It's not a bug; the problem is the XZ easyconfig, which explicitly asks for gettext with the dummy toolchain:

('gettext', '0.19.8', '', True),

The way to fix this would be to supply a gettext easyconfig at the GCC or GCCcore level and remove the explicit specification of the dummy toolchain (the True argument in the dependency spec) for XZ.
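
To spell out the difference, here is a sketch of the two dependency specs side by side. The variable names and the helper `uses_dummy_toolchain` are hypothetical and exist only for this illustration; in an easyconfig both lists would simply be called `dependencies`.

```python
# Current spec: the fourth element (True) pins gettext to the dummy toolchain.
dependencies_pinned = [
    ('gettext', '0.19.8', '', True),
]

# Proposed spec: without the fourth element, the toolchain for gettext is
# left to EasyBuild's normal dependency resolution.
dependencies_proposed = [
    ('gettext', '0.19.8'),
]


def uses_dummy_toolchain(dep):
    """True if a dependency tuple explicitly requests the dummy toolchain."""
    return len(dep) >= 4 and dep[3] is True
```

The proposed form only works if a matching gettext easyconfig exists at the GCC or GCCcore level.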

@vanzod
Member Author

vanzod commented Feb 1, 2017

@ocaisa Yes, I figured that out 10 minutes after I sent the message. Check the edited post to see why it is not possible to use a gettext-0.19.8-GCC-5.4.0-2.26.eb unless I remove the libxml2 dependency from it.

@vanzod
Member Author

vanzod commented May 9, 2017

Thanks to @boegel we finally found that the problem arises when Python is built while the GCC/5.4.0-2.26 module is already loaded in the environment and ncurses-6.0 is built with separate terminfo symbols at the dummy level. The reason why the dummy ncurses gets loaded instead of the GCCcore one comes from how $MODULEPATH is managed by EB. Here is the breakdown of the issue.

In the initial environment the GCC/5.4.0-2.26 module is loaded. $MODULEPATH is then:

<prefix>/GCC/5.4.0-2.26:<prefix>/GCCcore/5.4.0:<prefix>/Core

EB issues a module use Core command, which pushes the Core path to the top of the list:

<prefix>/Core:<prefix>/GCC/5.4.0-2.26:<prefix>/GCCcore/5.4.0

EB loads GCC/5.4.0-2.26 as a dependency, which pushes the corresponding path to the top:

<prefix>/GCC/5.4.0-2.26:<prefix>/Core:<prefix>/GCCcore/5.4.0

However, since GCCcore/5.4.0 is a conditional dependency in the GCC/5.4.0-2.26 module and it is already present in $MODULEPATH, the corresponding path remains after the Core path, causing the Core/ncurses/6.0 library path to get added to LDFLAGS instead of the GCCcore/5.4.0/ncurses/6.0 one.
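
The sequence above can be replayed with a few lines of Python. This is a simplified model of the $MODULEPATH updates as described in this comment, not the actual Lmod or EasyBuild code; the helper names are made up for the sketch.

```python
def prepend(paths, new):
    """Put `new` at the front, moving it there if it is already in the list."""
    return [new] + [p for p in paths if p != new]


def prepend_if_absent(paths, new):
    """Put `new` at the front only if it is not in the list yet (the conditional case)."""
    return paths if new in paths else [new] + paths


GCC = "<prefix>/GCC/5.4.0-2.26"
GCCCORE = "<prefix>/GCCcore/5.4.0"
CORE = "<prefix>/Core"

# Directories that provide an ncurses/6.0 module, per the thread.
provides_ncurses = {CORE, GCCCORE}

modulepath = [GCC, GCCCORE, CORE]                    # initial environment, GCC already loaded
modulepath = prepend(modulepath, CORE)               # EB: module use Core
modulepath = prepend(modulepath, GCC)                # EB loads GCC/5.4.0-2.26
modulepath = prepend_if_absent(modulepath, GCCCORE)  # already present, so it is not moved

# The first directory in the search path that provides the module wins.
chosen = next(p for p in modulepath if p in provides_ncurses)

print(":".join(modulepath))
print(chosen)
```

In this model the Core directory ends up ahead of GCCcore, so the Core (dummy) ncurses is the one that gets picked.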

@boegel
Member

boegel commented May 9, 2017

Long story short: don't load (EasyBuild-generated) modules before running EasyBuild.

We'll look into implementing a detection/warning mechanism (or maybe something stricter) for this in easybuilders/easybuild-framework#153
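
For illustration, a pre-flight check along these lines could inspect `$LOADEDMODULES`, the colon-separated list that module tools such as Lmod maintain. This is a hypothetical sketch, not the mechanism tracked in the framework issue; the function names are made up, and the check cannot tell EasyBuild-generated modules apart from other modules.

```python
import os


def loaded_modules(env=None):
    """Return the modules listed in $LOADEDMODULES, or an empty list."""
    env = os.environ if env is None else env
    return [m for m in env.get("LOADEDMODULES", "").split(":") if m]


def check_clean_environment(env=None):
    """Return a warning string if any modules are loaded, else None."""
    mods = loaded_modules(env)
    if mods:
        return "modules loaded before running EasyBuild: %s" % ", ".join(mods)
    return None
```

Calling `check_clean_environment()` with no arguments reads the current process environment; passing a dict makes the check easy to try out without touching the real environment.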

@pforai
Contributor

pforai commented Aug 10, 2017

One other thing to note is that only the dummy version of ncurses 6.0 builds with a separate tinfo library, while the others still include it. That may bite in some unexpected ways when one moves to include dummy in minimal toolchains and tries to rebuild a bigger tree on top of this.

@boegel
Member

boegel commented Aug 10, 2017
