-
Notifications
You must be signed in to change notification settings - Fork 30.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
osx, build: Node v6.3.0 won't build on Clang 3.2 (OSX 10.8 Darwin 12.5.0) #7618
Comments
/cc @nodejs/build regarding our OSX testing infra |
I'm not familiar with clang, so I'm not entirely sure which version we should be looking at, but the last version number on the failing machine shows 3.2 which is less than 3.4. shrug |
Urg. I tested 3.4 because that was the oldest version listed in https://github.com/nodejs/node/blob/master/BUILDING.md. If 3.2 needs to be supported, then we can detect builtin presence and fallback to one of the alternative definitions. |
@mscdex @zbjornson I had assumed that the clang version was the I think this clang level is the default one for OSX |
Apple versions clang kinda unusually. Their 4.2 is based on LLVM 3.2, and their 5.1 is based on LLVM 3.4. I can work on a patch today, unless the answer is to require people who build to update xcode to a version with LLVM 3.4+ support. |
@zbjornson No, no warning. OS X 10.8 with Apple LLVM version 4.2 (clang-425.0.24) (based on LLVM 3.2svn). For anyone who wants to try it, you have to build with |
Opening gambit: #7644 |
Contra gambit: #7645 :) |
Said builtins are not supported by older versions of apple-gcc, breaking the build on OS X 10.8. Fixes: nodejs#7618 Refs: nodejs#4290 Refs: nodejs#7157
Any update on this ? |
I have a small change (addressing my own comment) that I seem to have forgotten to push and that's sitting on a computer that I can't get to 'till tonight. Not sure what @bnoordhuis wants to do, but I'd propose moving forward with #7645 because it benchmarks faster at least on Windows. Will push that change and post the updated benchmarks tonight. |
I'm not hugely invested in this issue so I don't really care which way it goes. The thing I like about my own PR is that it reduces line count and removes the intrinsics (something I've never been a fan of) but if it's that much slower with VS, then I guess we should go with Zach's PR. |
(Really sorry about the delay.) I'm also not highly invested in this. #7645 currently benchmarks faster on windows in 5 of 6 cases (slower for aligned 16-bit; faster for other combos of size/alignment) and I think would be okay to merge. If anyone wants, I could work to make the final case to match Ben's on perf. I haven't had time to benchmark on linux recently. If we go with Ben's, I do have an npm package for fast byte swapping that would satisfy anyone else who needs as-fast-as-possible swapping. Thus, I don't really care which moves forward. Edit Looked at linux benchmarks. Even with |
So to summarise, the two options are #7644 (@bnoordhuis ), which makes the code cleaner/more maintainable but reduces performance on windows, or #7645 (@zbjornson ), which increases performance on Windows but keeps the intrinsics. @zbjornson Do you have a rough estimate of how significant the speedup will be? Is it only windows which is currently slower on #7644? As VS 2013 has been dropped on master, the windows performance loss only affects us if it happens on VS 2015, so it'd be useful to have some specific numbers. If you've written a benchmark you could point me to, I'd be happy to try it with VS 2015. It'd be great to have master building on Clang 3.2 again either way! EDIT: Looking at #7645 (comment), it sounds like #7644 also includes the performance boost from #7157, so we'll get @zbjornson's performance improvements either way! |
Mostly correct, except for the edit:
Less important:
If you'd like to play with benchmarks, they're in https://github.com/nodejs/node/blob/master/benchmark/buffers/buffer-swap.js. You probably want to just pick one len (a high value like 8192 so it uses the C++ impl) and drop the n, otherwise it's dreadfully slow to run them all. Given the above, I think it would make sense to use the intrinsics on Windows only and remove them from the rest, which will keep perf where it's easy and commonly available, and cut the line count down. I can try to get to this tonight. Edit example godbolt: https://godbolt.org/g/4rvmAa |
@zbjornson You're right, the edit was unclear. I meant that the performance improvements from #7157 were in both #7644 and #7645 except for windows as mentioned earlier. It would seem to make a lot of sense to only keep the intrinsics in for windows if that's where they make a big performance difference. Is there a way to get to at least the previous performance for aligned 16-bit types? The speedup for the other types might make it worth it even if not.
Could you explain a little more here? #7645 doesn't get much faster that what? Are you saying that #7645 doesn't get much faster than #7644 (i.e. using the builtins doesn't give a large boost without using non-default compiler flags)? So I guess your updated PR would be halfway between #7645 and #7644? That makes sense to me (as long as @bnoordhuis approves). |
Sorry, to be clear: #7645 does not have a perf regression for aligned 16-bit types on Windows, if that's what you mean. Rather, Ben's PR gives an improvement to this case.
I meant, even if someone were to compile node with Here's what the hand-vectorized code looks like: Working on the halfway PR now that I've figured that stuff out. :) |
OK, #7645 is now modified to use builtins for Windows only and vanilla C++ for the rest. Benchmarks faster than or comparably to #7644 in all cases on Windows, and aligned cases on Linux: I could keep fiddling with this to improve unaligned perf on Linux, but given that unaligned is less common than aligned, I'd also be happy with this as-is. |
Removes use of builtins that are unavailable for older clang. Per benchmarks, only uses builtins on Windows, where speedup is significant. Fixes: nodejs#7618
Removes use of builtins that are unavailable for older clang. Per benchmarks, only uses builtins on Windows, where speedup is significant. Also adds test for unaligned ucs2 buffer write. Between #3410 and #7645, bytes were swapped twice on bigendian platforms if buffer was not two-byte aligned. See comment in #7645. PR-URL: #7645 Fixes: #7618 Reviewed-By: Anna Henningsen <anna@addaleax.net> Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl> Reviewed-By: James M Snell <jasnell@gmail.com>
Removes use of builtins that are unavailable for older clang. Per benchmarks, only uses builtins on Windows, where speedup is significant. Also adds test for unaligned ucs2 buffer write. Between #3410 and #7645, bytes were swapped twice on bigendian platforms if buffer was not two-byte aligned. See comment in #7645. PR-URL: #7645 Fixes: #7618 Reviewed-By: Anna Henningsen <anna@addaleax.net> Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl> Reviewed-By: James M Snell <jasnell@gmail.com> Conflicts: src/node_buffer.cc
The build is failing on OSX 10.8 but passing on 10.10. It looks like the problem is due to this commit: 4014ecb, which was added in #7157 and backported in #7546. I note that @zbjornson said that:
But I'm still getting this problem. I assume something similar to #4290 would fix it. @bnoordhuis
Failing machine:
Passing machine:
Error:
The text was updated successfully, but these errors were encountered: