Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

-flto on clang/gcc? #7400

Closed
indutny opened this issue Jun 24, 2016 · 27 comments
Closed

-flto on clang/gcc? #7400

indutny opened this issue Jun 24, 2016 · 27 comments
Labels
build Issues and PRs related to build files or the CI. stalled Issues and PRs that are stalled.

Comments

@indutny
Copy link
Member

indutny commented Jun 24, 2016

I wonder if we should give a try to the Link Time Optimizations flag in clang. It appears to be used when building v8 (though, as far as I can tell no on mac), and may be useful for building OpenSSL and/or core itself.

Thoughts?

cc @nodejs/collaborators

@indutny
Copy link
Member Author

indutny commented Jun 24, 2016

cc @bnoordhuis

@mscdex mscdex added the build Issues and PRs related to build files or the CI. label Jun 24, 2016
@indutny
Copy link
Member Author

indutny commented Jun 24, 2016

Here are tls-throughput result comparison on my mac (10 runs):

tls/throughput.js dur="5" type="buf" size="2": ./out/Release/node-flto: 6.1295 ./out/Release/node: 5.6876 ....... 7.77%
tls/throughput.js dur="5" type="buf" size="1024": ./out/Release/node-flto: 1567 ./out/Release/node: 1455 ........ 7.70%
tls/throughput.js dur="5" type="buf" size="1048576": ./out/Release/node-flto: 4167.8 ./out/Release/node: 4026.7 . 3.50%
tls/throughput.js dur="5" type="asc" size="2": ./out/Release/node-flto: 5.5664 ./out/Release/node: 5.0363 ...... 10.52%
tls/throughput.js dur="5" type="asc" size="1024": ./out/Release/node-flto: 1462.2 ./out/Release/node: 1336.8 .... 9.38%
tls/throughput.js dur="5" type="asc" size="1048576": ./out/Release/node-flto: 3947.5 ./out/Release/node: 3847.4 . 2.60%
tls/throughput.js dur="5" type="utf" size="2": ./out/Release/node-flto: 5.5544 ./out/Release/node: 5.0786 ....... 9.37%
tls/throughput.js dur="5" type="utf" size="1024": ./out/Release/node-flto: 1328.5 ./out/Release/node: 1255.7 .... 5.80%
tls/throughput.js dur="5" type="utf" size="1048576": ./out/Release/node-flto: 3051.1 ./out/Release/node: 2985.1 . 2.21%

@indutny indutny changed the title -flto on clang? -flto on clang/gcc? Jun 24, 2016
@indutny
Copy link
Member Author

indutny commented Jun 24, 2016

I suspect that it will also slightly improve start time and overall performance.

@mscdex
Copy link
Contributor

mscdex commented Jun 24, 2016

Out of curiosity, was this prompted by news of ThinLTO?

@bnoordhuis
Copy link
Member

@indutny Did you build with CXX=clang++? If yes, can you pass LINK=clang++ as well? GYP links with g++ by default and in that case you won't get cross-compilation unit LTO.

@indutny
Copy link
Member Author

indutny commented Jun 24, 2016

@mscdex yup

@bnoordhuis yep, and also I am on OS X, so gcc is not an option anyway.

@indutny
Copy link
Member Author

indutny commented Jun 24, 2016

@bnoordhuis btw, g++ has -flto option too. Though, it accepts number of threads there.

@indutny
Copy link
Member Author

indutny commented Jun 24, 2016

@bnoordhuis
Copy link
Member

btw, g++ has -flto option too.

What I mean is that when you compile with clang++ but link with g++, then g++ won't know how to do LTO across files because the .o files don't contain GIMPLE (because they contain LLVM bitcode.)

indutny added a commit to indutny/io.js that referenced this issue Jun 24, 2016
Introduce `--with-lto` configure option to use Link Time Optimization
compiler pass in gcc/clang compilers.

This flag slightly improves performance of C/C++ heavy code:

    tls/throughput.js dur="5" type="buf" size="2": ./out/Release/node-flto: 6.1295 ./out/Release/node: 5.6876 ....... 7.77%
    tls/throughput.js dur="5" type="buf" size="1024": ./out/Release/node-flto: 1567 ./out/Release/node: 1455 ........ 7.70%
    tls/throughput.js dur="5" type="buf" size="1048576": ./out/Release/node-flto: 4167.8 ./out/Release/node: 4026.7 . 3.50%
    tls/throughput.js dur="5" type="asc" size="2": ./out/Release/node-flto: 5.5664 ./out/Release/node: 5.0363 ...... 10.52%
    tls/throughput.js dur="5" type="asc" size="1024": ./out/Release/node-flto: 1462.2 ./out/Release/node: 1336.8 .... 9.38%
    tls/throughput.js dur="5" type="asc" size="1048576": ./out/Release/node-flto: 3947.5 ./out/Release/node: 3847.4 . 2.60%
    tls/throughput.js dur="5" type="utf" size="2": ./out/Release/node-flto: 5.5544 ./out/Release/node: 5.0786 ....... 9.37%
    tls/throughput.js dur="5" type="utf" size="1024": ./out/Release/node-flto: 1328.5 ./out/Release/node: 1255.7 .... 5.80%
    tls/throughput.js dur="5" type="utf" size="1048576": ./out/Release/node-flto: 3051.1 ./out/Release/node: 2985.1 . 2.21%

See: nodejs#7400
@indutny
Copy link
Member Author

indutny commented Jun 24, 2016

@bnoordhuis oh, of course. Do you think we need to do something about it?

@mscdex
Copy link
Contributor

mscdex commented Jun 24, 2016

FWIW I tried -flto last night with gcc/g++ (5 and 6) and didn't see any performance change (although the binary was a few MB smaller). I even tried adding other lto-related arguments and changing linkers, but still nothing. I still could have been doing something wrong though.

@bnoordhuis
Copy link
Member

@indutny The PR probably needs to enforce that LINK and LINK.target == CXX.target, unless explicitly overridden.

@RReverser
Copy link
Member

and didn't see any performance change (although the binary was a few MB smaller

In fact, that's pretty common case - from what I observed, LTO in gcc usually gives only cross-object dead code elimination (which is already a good thing, even though in theory it can do more).

@indutny
Copy link
Member Author

indutny commented Jun 25, 2016

Alright, so clang only then?

@RReverser
Copy link
Member

Well, I think it's still favorable for release builds, no matter what compiler we're using.

@jbergstroem
Copy link
Member

So, if we wanted to play around with release binaries for 64-bit linux we'd want to set up a recent clang on centos5?

@mscdex
Copy link
Contributor

mscdex commented Jun 28, 2016

@RReverser I'm not so sure this would be good to have in release builds (and/or as a default) if -flto strips out symbols that may be used by third party addons.

@indutny
Copy link
Member Author

indutny commented Jun 28, 2016

@jbergstroem gcc supports this too.

addaleax pushed a commit to addaleax/node that referenced this issue Feb 3, 2017
Introduce `--with-lto` configure option to use Link Time Optimization
compiler pass in gcc/clang compilers.

This flag slightly improves performance of C/C++ heavy code:

    tls/throughput.js dur="5" type="buf" size="2": ./out/Release/node-flto: 6.1295 ./out/Release/node: 5.6876 ....... 7.77%
    tls/throughput.js dur="5" type="buf" size="1024": ./out/Release/node-flto: 1567 ./out/Release/node: 1455 ........ 7.70%
    tls/throughput.js dur="5" type="buf" size="1048576": ./out/Release/node-flto: 4167.8 ./out/Release/node: 4026.7 . 3.50%
    tls/throughput.js dur="5" type="asc" size="2": ./out/Release/node-flto: 5.5664 ./out/Release/node: 5.0363 ...... 10.52%
    tls/throughput.js dur="5" type="asc" size="1024": ./out/Release/node-flto: 1462.2 ./out/Release/node: 1336.8 .... 9.38%
    tls/throughput.js dur="5" type="asc" size="1048576": ./out/Release/node-flto: 3947.5 ./out/Release/node: 3847.4 . 2.60%
    tls/throughput.js dur="5" type="utf" size="2": ./out/Release/node-flto: 5.5544 ./out/Release/node: 5.0786 ....... 9.37%
    tls/throughput.js dur="5" type="utf" size="1024": ./out/Release/node-flto: 1328.5 ./out/Release/node: 1255.7 .... 5.80%
    tls/throughput.js dur="5" type="utf" size="1048576": ./out/Release/node-flto: 3051.1 ./out/Release/node: 2985.1 . 2.21%

Fixes: nodejs#7400
@Trott
Copy link
Member

Trott commented Jul 8, 2017

This should stay open, yes? (Hi! I'm triaging inactive issues!)

@bnoordhuis
Copy link
Member

#7408 is the accompanying (but stalled) pull request.

@addaleax @indutny Is this still something you want to do?

@octaviansoldea
Copy link

octaviansoldea commented Jul 4, 2018

Hello

I am trying to build Node.js with lto. However, I get the following error, when executing the tests, i.e. when running - make test - :


Path: abort/test-addon-uv-handle-leak
assert.js:270
    throw err;
    ^

AssertionError [ERR_ASSERTION]: uv loop at [0x3882800] has active handles
[0x7f59f800b490] timer
	Close callback: 0x7f5a08d1a130 CloseCallback(uv_handle_s*) [/home/octavian/Octavian/node_pgo_lto/07July/03/node_lto_for_debug/test/addons/uv-handle-leak/build/Release/binding.node]
	Data: 0x7f5a08f1b120 example_instance [/home/octavian/Octavian/node_pgo_lto/07July/03/node_lto_for_debug/test/addons/uv-handle-leak/build/Release/binding.node]
	(First field): 0x7f5a08f1adc0  [/home/octavian/Octavian/node_pgo_lto/07July/03/node_lto_for_debug/test/addons/uv-handle-leak/build/Release/binding.node]
[0x7f59f800b530] timer
	Close callback: 0x7f5a08d1a130 CloseCallback(uv_handle_s*) [/home/octavian/Octavian/node_pgo_lto/07July/03/node_lto_for_debug/test/addons/uv-handle-leak/build/Release/binding.node]
	Data: (nil) 
[0x7f59f800b5d0] timer
	Close callback: 0x7f5a08d1a130 CloseCallback(uv_handle_s*) [/home/octavian/Octavian/node_pgo_lto/07July/03/node_lto_for_debug/test/addons/uv-handle-leak/build/Release/binding.node]
	Data: 0x42 
/home/octavian/Octavian/node_pgo_lto/07July/03/node_lto_for_debug/out/Release/node[40943]: ../src/debug_utils.cc:218:void node::CheckedUvLoopClose(uv_loop_t*): Assertion `0 && "uv_loop_close() while having open handles"' failed.
 1: 0x728c80 node::Abort() [/home/octavian/Octavian/node_pgo_lto/07July/03/node_lto_for_debug/out/Release/node]
 2: 0x74bf30  [/home/octavian/Octavian/node_pgo_lto/07July/03/node_lto_for_debug/out/Release/node]
 3: 0x718525  [/home/octavian/Octavian/node_pgo_lto/07July/03/node_lto_for_debug/out/Release/node]
 4: 0x82fc16 node::worker::Worker::~Worker() [/home/octavian/Octavian/node_pgo_lto/07July/03/node_lto_for_debug/out/Release/node]
 5: 0x82fe01 node::worker::Worker::~Worker() [/home/octavian/Octavian/node_pgo_lto/07July/03/node_lto_for_debug/out/Release/node]
 6: 0x743c13 node::Environment::RunCleanup() [/home/octavian/Octavian/node_pgo_lto/07July/03/node_lto_for_debug/out/Release/node]
 7: 0x1062131  [/home/octavian/Octavian/node_pgo_lto/07July/03/node_lto_for_debug/out/Release/node]
 8: 0x7acf4e node::Start(int, char**) [/home/octavian/Octavian/node_pgo_lto/07July/03/node_lto_for_debug/out/Release/node]
 9: 0x7f5a0c40a830 __libc_start_main [/lib/x86_64-linux-gnu/libc.so.6]
10: 0x71de89 _start [/home/octavian/Octavian/node_pgo_lto/07July/03/node_lto_for_debug/out/Release/node]

    at Object.<anonymous> (/home/octavian/Octavian/node_pgo_lto/07July/03/node_lto_for_debug/test/abort/test-addon-uv-handle-leak.js:56:5)
    at Module._compile (internal/modules/cjs/loader.js:689:30)
    at Object.Module._extensions..js (internal/modules/cjs/loader.js:700:10)
    at Module.load (internal/modules/cjs/loader.js:599:32)
    at tryModuleLoad (internal/modules/cjs/loader.js:538:12)
    at Function.Module._load (internal/modules/cjs/loader.js:530:3)
    at Function.Module.runMain (internal/modules/cjs/loader.js:742:12)
    at startup (internal/bootstrap/node.js:260:19)
    at bootstrapNodeJSCore (internal/bootstrap/node.js:584:3)
Command: out/Release/node --experimental-worker /home/octavian/Octavian/node_pgo_lto/07July/03/node_lto_for_debug/test/abort/test-addon-uv-handle-leak.js

Could you please help? Did anybody encounter the same error? I saw that there was some activity in the recent past regarding memory leaks, therefore, I am wondering if this is something related to newer version of Node.js or not.

I am looking forward to hearing from you.

Thank you in advance,

@octaviansoldea

(edit by @addaleax: formatting)

@addaleax
Copy link
Member

addaleax commented Jul 4, 2018

@octaviansoldea Which compiler/OS/libc are you using?

Generally, that test is very new, and you shouldn’t worry about it failing with this error at this point

@addaleax
Copy link
Member

addaleax commented Jul 4, 2018

It might also help to have something like “steps to reproduce”?

@octaviansoldea
Copy link

octaviansoldea commented Jul 5, 2018

@addaleax

Thank you for your message and feedback. Here are the steps I am using when compiling Node.js.

The versions of gcc and g++ used are
gcc (Ubuntu 5.4.1-2ubuntu116.04) 5.4.1 20160904
g++ (Ubuntu 5.4.1-2ubuntu1
16.04) 5.4.1 20160904

The Linux version I am using is
50~16.04.1-Ubuntu SMP Wed May 30 11:18:27 UTC 2018

The modifications I am introducing are related to two files
configure and common.gypi, and they are described in the attached file
lto_code_ changes.txt.

For compilation, please use

./configure --enable-lto

at the configuration step, and

make -j number

at proper compilation. In this context, I recommend using

number = 1.5 * number_of_processors_available.

Moreover, please note that the line

'lto': ' -flto=4 -fuse-linker-plugin -ffat-lto-objects ',

influences the linking time. One can use "... -flto=number ...", where number is different
than 4, provided there are enough processors. In this context, I recommend using

number = 1.5 * number_of_processors_available

as mentioned above.

Any help and feedback is greatly appreciated.

@octaviansoldea

octaviansoldea pushed a commit to octaviansoldea/node that referenced this issue Jul 5, 2018
This modification allows for compiling with link time optimization (lto)
using the flag --enable-lto.

Refs: nodejs#7400
addaleax pushed a commit that referenced this issue Jul 13, 2018
This modification allows for compiling with link time optimization (lto)
using the flag --enable-lto.

Refs: #7400

PR-URL: #21677
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Anna Henningsen <anna@addaleax.net>
targos pushed a commit that referenced this issue Jul 14, 2018
This modification allows for compiling with link time optimization (lto)
using the flag --enable-lto.

Refs: #7400

PR-URL: #21677
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Anna Henningsen <anna@addaleax.net>
@jasnell
Copy link
Member

jasnell commented Oct 17, 2018

Ping... what's the status on this?

@ChALkeR
Copy link
Member

ChALkeR commented Oct 17, 2018

#21677 landed and does roughly the same as #7408, but with a different flag name and only for Linux.

Perhaps that should be enough to mark this resolved?

Or should the discussion for other OS and/or enabling it by default happen in this issue?

@bnoordhuis
Copy link
Member

Closing due to inactivity.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
build Issues and PRs related to build files or the CI. stalled Issues and PRs that are stalled.
Projects
None yet
Development

No branches or pull requests