Issues with musl libc #13
Comments
@4ricci did you have a workaround for the issues you reported or was something else wrong on the system?
These patches fix the first three problems I described:
--- a/src/libbu/affinity.c
+++ b/src/libbu/affinity.c
@@ -58,11 +58,7 @@ parallel_set_affinity(int cpu)
/* Linux and BSD pthread affinity */
-#ifdef HAVE_CPU_SET_T
- cpu_set_t set_of_cpus; /* bsd */
-#else
- cpuset_t set_of_cpus; /* linux */
-#endif
+ cpu_set_t set_of_cpus;
int ret;
int ncpus = bu_avail_cpus();
--- a/misc/tools/env2c/CMakeLists.txt
+++ b/misc/tools/env2c/CMakeLists.txt
@@ -55,6 +55,7 @@ endif(BRLCAD_ENABLE_ADDRESS_SANITIZER AND ${BRLCAD_OPTIMIZED_BUILD} MATCHES "OFF
add_executable(env2c env2c.cxx)
target_link_libraries(env2c Threads::Threads)
+target_link_options(env2c PRIVATE -Wl,-z,stack-size=2097152)
if (O3_COMPILER_FLAG)
# If we have the O3 flag, use it (See this. A proper solution could be to use
--- a/src/util/ttcp.c
+++ b/src/util/ttcp.c
@@ -20,6 +20,7 @@
*/
#undef _POSIX_C_SOURCE
+#define _POSIX_C_SOURCE 200112L
#undef _XOPEN_SOURCE
#define BSD43
/* #define BSD42 */
I don't know what causes the
Thank you for the response, and an interesting set of changes. Alas, the changes cannot be committed as-is because they're not portable: other Linux distros need cpuset_t, for example, and -Wl,-z,stack-size is not valid on various compilers (e.g., Visual Studio, IBM compiler, ...). The posix_source change might be okay, but it's doing the complete opposite of what is intended (i.e., there is non-POSIX code in here that conforms with bsd43 instead).
With the first issue, what's going wrong is that our cmake test for cpu_set_t changed, and the preprocessor symbol used here (HAVE_CPU_SET_T) wasn't updated to reflect it. Looks like it was broken in 9649d61. The cmake test is arguably wrong, so the proper fix is to change the cmake logic so HAVE_CPU_SET_T correctly toggles to the right construct, and musl should just work. I'm going to test a change here for that and, if you would, let me know if it works for you.
For the other two issues, can you provide more information on what went wrong? An error log would help.
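For illustration, a configure-time check for cpu_set_t usually comes down to whether a small probe like the one below compiles and links. This is only a sketch and not BRL-CAD's actual CMake test; the function name count_allowed_cpus is hypothetical, and the probe assumes a Linux-style <sched.h> (glibc or musl) where the type and the CPU_* macros are exposed by _GNU_SOURCE.

```c
/* Must be defined before the first system header is included. */
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <sched.h>

/*
 * Hypothetical probe: if this compiles, cpu_set_t and the CPU_* macros
 * are available. Returns the number of CPUs the calling thread may run
 * on, or -1 if the affinity mask could not be read.
 */
int count_allowed_cpus(void)
{
    cpu_set_t set_of_cpus;

    CPU_ZERO(&set_of_cpus);
    /* pid 0 means "the calling thread" */
    if (sched_getaffinity(0, sizeof(set_of_cpus), &set_of_cpus) != 0)
        return -1;
    return CPU_COUNT(&set_of_cpus);
}
```

If a probe of this shape builds on both glibc and musl, the symbol the build system derives from it (HAVE_CPU_SET_T in the discussion above) should end up defined on both.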
2. After further investigation, I discovered
Applying this change:
--- a/misc/tools/env2c/env2c.cxx
+++ b/misc/tools/env2c/env2c.cxx
@@ -89,7 +89,10 @@ process_file(env_outputs &env_t)
return;
}
while (std::getline(fs, sline)) {
- if (!std::regex_match(sline, getenv_regex)) {
+ std::cerr << "Doing line: " << sline << "\n";
+ bool match = std::regex_match(sline, getenv_regex);
+ std::cerr << "Done line: " << sline << "\n";
+ if (!match) {
continue;
}
std::smatch envvar;
@@ -217,7 +220,7 @@ main(int argc, const char *argv[])
/*************************************************************/
/* Process the files (mulithreaded mapping for performance) */
/*************************************************************/
- unsigned int hwc = std::thread::hardware_concurrency();
+ unsigned int hwc = 1;
if (!hwc) {
hwc = 10;
}
produces this output:
If I understand correctly, this happens because musl has a small default thread stack size; it's not the program's fault.
Of course, my patch was just a quick workaround, since the build system ignores
3. On musl, without an appropriate macro,
Actually it is POSIX, but an older version (200112L instead of 200809L). Anyway, defining
4. I already provided the build log.
Issue 4 is caused by librt, which has the same name as a libc component. The correct solution would be to rename the conflicting libraries (or at least just librt), but since AFAIK this is not going to happen, I fear BRL-CAD will never work with musl. Fixing the other problems might still make sense on non-musl systems; if not, then this issue could be closed.
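To make the stack-size point in item 2 concrete, here is a minimal pthreads sketch, not taken from BRL-CAD. It reads the default stack size a new thread would get and runs a worker on an explicitly sized stack. The function names are hypothetical, and the 2097152-byte figure is simply the value from the link flag mentioned in this thread.

```c
#include <pthread.h>
#include <stddef.h>

/* Default stack size for newly created threads, or 0 on error. */
size_t default_thread_stack_size(void)
{
    pthread_attr_t attr;
    size_t size = 0;

    if (pthread_attr_init(&attr) != 0)
        return 0;
    if (pthread_attr_getstacksize(&attr, &size) != 0)
        size = 0;
    pthread_attr_destroy(&attr);
    return size;
}

/* Worker that deliberately uses 512 KiB of its own stack. */
static void *touch_stack(void *arg)
{
    volatile char buf[512 * 1024];
    size_t i;

    for (i = 0; i < sizeof(buf); i += 4096)
        buf[i] = 1;                 /* touch one byte per page */
    buf[sizeof(buf) - 1] = 1;
    *(int *)arg = buf[sizeof(buf) - 1];
    return NULL;
}

/* Run touch_stack on a thread with the requested stack size.
 * Returns 0 on success, -1 on failure. */
int run_with_stack(size_t stack_bytes, int *out)
{
    pthread_attr_t attr;
    pthread_t tid;
    int rc;

    if (pthread_attr_init(&attr) != 0)
        return -1;
    rc = pthread_attr_setstacksize(&attr, stack_bytes);
    if (rc == 0)
        rc = pthread_create(&tid, &attr, touch_stack, out);
    pthread_attr_destroy(&attr);
    if (rc != 0)
        return -1;
    return pthread_join(tid, NULL) == 0 ? 0 : -1;
}
```

Calling run_with_stack(2097152, &result) gives the worker 2 MiB regardless of the libc default. Judging by the diff above, env2c uses std::thread, which has no portable way to request a stack size, so a linker-level setting is a plausible workaround there.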
It will take some time to sort through each of the issues you've reported here, but I wanted to address this latest since it's a rather well-known issue, not new.
Speaking of which, I believe that's a POSIX violation in itself. In my draft, it's mentioned around page 2434 of 1003.1 how libraries are to be resolved in order to avoid naming collisions like this.
Do you have a citation for the name being reserved? My understanding is the name of the library is not dictated by POSIX.1b (and it's not in my draft). It was merely Sun's original convention in the early 90s. Many platforms rolled the aio realtime interface into libc and no longer even have a librt. Other platforms are POSIX-conformant and do not have a librt (but do sport the headers and functions defined by POSIX.1b). That said, BRL-CAD's librt library predates POSIX by more than 10 years and is our flagship library with many custom integrations that would be detrimentally and expensively affected by a name change.
As I started off with, this is not a new issue and I dare say was even a much harder problem in the mid-90s. For all platforms thus far, including incredibly esoteric systems with hard-coded librt's, we've been able to figure out solutions without renaming and impacting our integrations. There is almost certainly a solution that does not involve renaming librt.
Note, the comment you cite in our build system is regarding direct installation into /usr such that installing BRL-CAD could very well overwrite a system library (not just because of librt, there are several potential collisions). The general recommendation of LSB and other standards is to not have user apps in /usr regardless, instead defaulting to /opt or a subdir like /usr/local/.
To figure out a solution for this, I think the other issues will need to be addressed first, and then review docs on the compiler+linker to see how we can avoid collision (e.g., via versioning, naming, prioritization, location, etc).
On a system with posix man-pages installed, you can look at the descriptions of
This is the behavior of the compiler, not the dynamic linker, but the argument that conforming libraries should not use those names still stands, in my opinion. But I understand renaming would be infeasible for BRL-CAD.
A couple quick comments:
On the "-Wl,-z,stack-size" issue: if the CMake logic tests at configure time to see if those options are valid for the current compiler, it should be fine to go ahead and apply them when they are valid. I'd much rather do that than modify the tool to skip long lines, since the point of that utility is to survey the code looking for environment flags the user can set, and I don't want to miss one because the line is long. That would be very unexpected behavior for the user. I doubt it's an issue right now, but that would be exactly the kind of weird corner case problem that would have a developer pulling their hair out down the road wondering why it doesn't work... I don't know if we have a linker options test set up at the moment, but it certainly should be possible to do (even up to running a long line regex match test case, if necessary).
On the librt issue, I should perhaps mention that there is an option short of an all-encompassing, across-the-board rename (if it comes to that and musl in the end truly does preclude a functional solution with the librt name). CMake gives us an option to change the output file name of a target without otherwise having to alter the build logic, and that command could be hidden behind a user level CMake option to allow the output name to be changed if necessary. That will of course still not work for situations where external integrations expect BRL-CAD's librt to be present, but for use cases where such integrations are not a concern that is a technically feasible answer.
Oh, nice. I was able to build successfully and it seems to work. I didn't expect it to be as easy as
Regarding the
Hi @r-ricci I just wanted to update you that I've been working on a musl-based build recently, and have been working to get all the necessary changes incorporated. Thank you for all your efforts and debugging the issues. Having gone through it myself, now, I'm impressed how far along you got in discerning the issues and finding fixes.
As an aside, I found another simpler workaround to the issue of ld resolving to /lib/ld-musl-x86_64.so.1 for librt. If you export LD_PRELOAD=/path/to/build/lib/librt.so, that avoids the relocation symbol errors and compilation succeeds. Of course, that's just a temporary workaround and has to be updated and set post-install too, but it is an option that seems to work. What'll probably be needed unless/until musl changes their ld is to create a cmake test that checks if ld resolves librt to anything other than what we specify and, if it does, modify the output name. That said, I did find the specific line in musl's ld code where it skips libraries named 'librt' (and 'libpthread' and 'libm' and 'libdl' and 'libutil' and 'libxnet'), so I'll see if they're possibly amenable to a patch that simply doesn't.
Thanks again for your efforts and feedback!
We're now building cleanly out of the box on musl, so this issue can be closed. Please feel free to open new tickets if you observe any additional issues. Thank you again for all the excellent rundown details on each issue. They definitely were helpful.
I tried to build brlcad-7.32.4-1 on a musl-based linux system (Void Linux) and I encountered various issues. These don't occur on the glibc version of Void Linux.
1. brlcad/src/libbu/affinity.c, lines 61 to 65 in 0f3f062: It looks like neither musl nor glibc has cpuset_t, while both have cpu_set_t, but for some reason HAVE_CPU_SET_T is not defined with musl. I think the ifdef could be removed completely.
2. misc/tools/env2c/env2c.cxx segfaults. The problem is not related to brlcad and could be fixed by using a link flag such as -Wl,-z,stack-size=2097152. However, the LDFLAGS environment variable is not used when building this program; is this intentional?
3. gethostbyname(3), used in src/util/ttcp.c, is no longer in POSIX. glibc provides it by default, while musl requires an appropriate feature macro, e.g. #define _POSIX_C_SOURCE 200112L.
4. Various programs crash at runtime during the build with several errors of the kind "Error relocating: symbol not found". Any ideas about what could be the cause?
The logs below illustrate the last problem, after I applied patches to fix the first three.
cmake log
cmake.txt
build log
build.txt
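As a small illustration of the gethostbyname(3) item above, the sketch below requests the older POSIX level before including any system header and then calls the function. It is not code from ttcp.c. The helper name resolve_ipv4 is hypothetical, and the #ifndef guard is there only so the snippet does not clash with a level set on the compiler command line.

```c
/* Request POSIX.1-2001, which still specifies gethostbyname().
 * This has to come before the first system header. */
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif
#include <arpa/inet.h>
#include <netdb.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: resolve `name` to its first IPv4 address and
 * write it to `out` in dotted-quad form.
 * Returns 0 on success, -1 on failure.
 */
int resolve_ipv4(const char *name, char *out, size_t outlen)
{
    struct hostent *he = gethostbyname(name);
    struct in_addr addr;

    if (he == NULL || he->h_addrtype != AF_INET || he->h_addr_list[0] == NULL)
        return -1;
    memcpy(&addr, he->h_addr_list[0], sizeof(addr));
    return inet_ntop(AF_INET, &addr, out, (socklen_t)outlen) != NULL ? 0 : -1;
}
```

Passing a numeric address such as "127.0.0.1" exercises the call without needing a working resolver. New code would normally use getaddrinfo(3) instead, which is the interface POSIX kept.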