llvm-strip throws invalid relocation offset when using --strip-debug #22291

Closed
dschuff opened this issue Jul 30, 2024 Discussed in #22289 · 11 comments

@dschuff
Member

dschuff commented Jul 30, 2024

Discussed in #22289

Originally posted by gmarella July 28, 2024
Hi,

I am trying to use the llvm-strip that ships with emscripten to strip the debug information from a static library compiled with emcc. The strip command always fails with the error "invalid relocation offset". This looked basic when we started, but we have not been able to get it working. Any input?

make debug-strip
cp out/libsimple_thread.a out/libsimple_thread.a.orig.debug
~/repos/emsdk/upstream/bin/llvm-strip --strip-debug out/libsimple_thread.a
~/repos/emsdk/upstream/bin/llvm-strip: error: 'out/libsimple_thread.a': 'simple_thread.o': invalid relocation offset
make: *** [debug-strip] Error 1

I also tried removing the debug sections from simple_thread.o first and then creating the archive, but hit the same error during the archive creation step.

~/repos/emsdk/upstream/bin/llvm-objcopy --strip-debug ./out/simple_thread.o ./out/simple_thread_lean.o
~/repos/emsdk/upstream/bin/llvm-ar crus libmylib.a out/simple_thread_lean.o
/Users/gmarella/repos/emsdk/upstream/bin/llvm-ar: error: libmylib.a: 'simple_thread_lean.o': invalid relocation offset

Code structure is as follows (Gist is also at https://gist.github.com/gmarella/9caa8a88fdd221c1fed4062e5a81c782 )

  • Makefile
  • src
    • main.cpp
    • simple_thread.cpp
    • simple_thread.h

Library code

simple_thread.cpp

#include <iostream>
#include <string>
#include <pthread.h>
#include <unistd.h>

using namespace std;

void* thread_function(void *)
{
    cout << "Inside thread function" << endl;
    while (true)
    {
        cout << "Looping in thread" << endl;
        sleep(1);
    }

    return nullptr;
}

simple_thread.h

#pragma once

void* thread_function(void *);

main.cpp

#include <stddef.h>
#include <string.h>
#include <stdint.h>
#include <iostream>

#include <pthread.h>
#include "simple_thread.h"

using namespace std;

void InitThreads()
{
    pthread_t thread;
    pthread_create(&thread, NULL, &thread_function, NULL);
    pthread_join(thread, NULL);
}

int main()
{
    InitThreads();

    return 0;
}

Makefile

EMSCRIPTEN_BASE=~/repos/emsdk/upstream/emscripten
LLVM_TOOLS_BASE=~/repos/emsdk/upstream/bin

EMCC = $(EMSCRIPTEN_BASE)/emcc
EMAR = $(EMSCRIPTEN_BASE)/emar

OUT_DIR = out
SERVE_DIR = serve
SRC_DIR = src

# Define the source file(s)
MAIN_SRC = $(SRC_DIR)/main.cpp 

LIB_SRC_FILES = simple_thread.cpp
LIB_SRC = $(addprefix $(SRC_DIR)/, $(LIB_SRC_FILES))
LIB_OBJS = $(LIB_SRC_FILES:.cpp=.o)
LIB_NAME = libsimple_thread.a

# Define the HTML file
HTML_OUT = index

# Compiler flags
CFLAGS = -g -pthread -fPIC

LFLAGS = -s WASM=1 --emrun

TARGET = output

all: $(TARGET)

$(OUT_DIR):
	mkdir -p $(OUT_DIR)

$(SERVE_DIR):
	mkdir -p $(SERVE_DIR)

$(LIB_OBJS): $(LIB_SRC) | $(OUT_DIR)
	$(EMCC) $(CFLAGS) -c $(LIB_SRC) -o $(OUT_DIR)/$@

$(LIB_NAME): $(LIB_OBJS)
	$(EMAR) rcs $(OUT_DIR)/$(LIB_NAME) $(OUT_DIR)/$(LIB_OBJS)

$(TARGET): $(MAIN_SRC) $(LIB_NAME) | $(SERVE_DIR)
	$(EMCC) $(CFLAGS) $(LFLAGS) $(MAIN_SRC) $(OUT_DIR)/$(LIB_NAME) -o $(SERVE_DIR)/$(HTML_OUT).html

# Rule to clean the output (for cleanup purposes)
clean:
	rm -rf $(OUT_DIR) $(SERVE_DIR)

debug-strip:
	cp $(OUT_DIR)/$(LIB_NAME) $(OUT_DIR)/$(LIB_NAME).orig.debug
	$(LLVM_TOOLS_BASE)/llvm-strip --strip-debug $(OUT_DIR)/$(LIB_NAME)

@dschuff
Member Author

dschuff commented Jul 30, 2024

I can confirm that running --strip-debug on an object file and then attempting to add it to an archive file breaks. When running a debug build with asserts, I got this:

/usr/lib/gcc/x86_64-linux-gnu/14/../../../../include/c++/14/bits/stl_vector.h:1127: reference std::vector<llvm::object::WasmSection>::operator[](size_type) [_Tp = llvm::object::WasmSection, _Alloc = std::allocator<llvm::object::WasmSection>]: Assertion '__n < this->size()' failed.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0.	Program arguments: llvm-strip --strip-debug hello_world.a
 #0 0x00007efc2e1eb021 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /s/llvm-upstream/llvm-project/llvm/lib/Support/Unix/Signals.inc:723:11
 #1 0x00007efc2e1eb51b PrintStackTraceSignalHandler(void*) /s/llvm-upstream/llvm-project/llvm/lib/Support/Unix/Signals.inc:798:1
 #2 0x00007efc2e1e9516 llvm::sys::RunSignalHandlers() /s/llvm-upstream/llvm-project/llvm/lib/Support/Signals.cpp:105:5
 #3 0x00007efc2e1ebcb5 SignalHandler(int) /s/llvm-upstream/llvm-project/llvm/lib/Support/Unix/Signals.inc:413:1
 #4 0x00007efc2d8591a0 (/lib/x86_64-linux-gnu/libc.so.6+0x3d1a0)
 #5 0x00007efc2d8a70ec __pthread_kill_implementation ./nptl/pthread_kill.c:44:76
 #6 0x00007efc2d859102 gsignal ./signal/../sysdeps/posix/raise.c:27:6
 #7 0x00007efc2d8424f2 abort ./stdlib/abort.c:81:7
 #8 0x00007efc2dad30fe (/lib/x86_64-linux-gnu/libstdc++.so.6+0xd30fe)
 #9 0x00007efc2ebbb214 std::vector<llvm::object::WasmSection, std::allocator<llvm::object::WasmSection>>::operator[](unsigned long) /usr/lib/gcc/x86_64-linux-gnu/14/../../../../include/c++/14/bits/stl_vector.h:1127:2
#10 0x00007efc2ebb4f38 llvm::object::WasmObjectFile::parseLinkingSectionSymtab(llvm::object::WasmObjectFile::ReadContext&) /s/llvm-upstream/llvm-project/llvm/lib/Object/WasmObjectFile.cpp:841:31
#11 0x00007efc2ebb3d1b llvm::object::WasmObjectFile::parseLinkingSection(llvm::object::WasmObjectFile::ReadContext&) /s/llvm-upstream/llvm-project/llvm/lib/Object/WasmObjectFile.cpp:638:23
#12 0x00007efc2ebb0275 llvm::object::WasmObjectFile::parseCustomSection(llvm::object::WasmSection&, llvm::object::WasmObjectFile::ReadContext&) /s/llvm-upstream/llvm-project/llvm/lib/Object/WasmObjectFile.cpp:1169:21
#13 0x00007efc2ebafd8e llvm::object::WasmObjectFile::parseSection(llvm::object::WasmSection&) /s/llvm-upstream/llvm-project/llvm/lib/Object/WasmObjectFile.cpp:396:5
#14 0x00007efc2ebaf8ad llvm::object::WasmObjectFile::WasmObjectFile(llvm::MemoryBufferRef, llvm::Error&) /s/llvm-upstream/llvm-project/llvm/lib/Object/WasmObjectFile.cpp:382:10
#15 0x00007efc2ebb98f2 std::__detail::_MakeUniq<llvm::object::WasmObjectFile>::__single_object std::make_unique<llvm::object::WasmObjectFile, llvm::MemoryBufferRef&, llvm::Error&>(llvm::MemoryBufferRef&, llvm::Error&) /usr/lib/gcc/x86_64-linux-gnu/14/../../../../include/c++/14/bits/unique_ptr.h:1076:34
#16 0x00007efc2ebaf29f llvm::object::ObjectFile::createWasmObjectFile(llvm::MemoryBufferRef) /s/llvm-upstream/llvm-project/llvm/lib/Object/WasmObjectFile.cpp:72:7
#17 0x00007efc2eb895c2 llvm::object::ObjectFile::createObjectFile(llvm::MemoryBufferRef, llvm::file_magic, bool) /s/llvm-upstream/llvm-project/llvm/lib/Object/ObjectFile.cpp:203:12
#18 0x00007efc2eb9b160 llvm::object::SymbolicFile::createSymbolicFile(llvm::MemoryBufferRef, llvm::file_magic, llvm::LLVMContext*, bool) /s/llvm-upstream/llvm-project/llvm/lib/Object/SymbolicFile.cpp:71:12
#19 0x00007efc2e9f6ba0 llvm::object::SymbolicFile::createSymbolicFile(llvm::MemoryBufferRef) /s/llvm-upstream/llvm-project/llvm/include/llvm/Object/SymbolicFile.h:176:12
#20 0x00007efc2e9ef270 getSymbolicFile(llvm::MemoryBufferRef, llvm::LLVMContext&, llvm::object::Archive::Kind, llvm::function_ref<void (llvm::Error)>) /s/llvm-upstream/llvm-project/llvm/lib/Object/ArchiveWriter.cpp:527:10
#21 0x00007efc2e9ec564 computeMemberData(llvm::raw_ostream&, llvm::raw_ostream&, llvm::object::Archive::Kind, bool, bool, llvm::SymtabWritingMode, SymMap*, llvm::LLVMContext&, llvm::ArrayRef<llvm::NewArchiveMember>, std::optional<bool>, llvm::function_ref<void (llvm::Error)>) /s/llvm-upstream/llvm-project/llvm/lib/Object/ArchiveWriter.cpp:855:12
#22 0x00007efc2e9eaa11 llvm::writeArchiveToStream(llvm::raw_ostream&, llvm::ArrayRef<llvm::NewArchiveMember>, llvm::SymtabWritingMode, llvm::object::Archive::Kind, bool, bool, std::optional<bool>, llvm::function_ref<void (llvm::Error)>) /s/llvm-upstream/llvm-project/llvm/lib/Object/ArchiveWriter.cpp:1056:49
#23 0x00007efc2e9eebb0 llvm::writeArchive(llvm::StringRef, llvm::ArrayRef<llvm::NewArchiveMember>, llvm::SymtabWritingMode, llvm::object::Archive::Kind, bool, bool, std::unique_ptr<llvm::MemoryBuffer, std::default_delete<llvm::MemoryBuffer>>, std::optional<bool>, llvm::function_ref<void (llvm::Error)>) /s/llvm-upstream/llvm-project/llvm/lib/Object/ArchiveWriter.cpp:1316:13
#24 0x00007efc2eef8711 llvm::objcopy::deepWriteArchive(llvm::StringRef, llvm::ArrayRef<llvm::NewArchiveMember>, llvm::SymtabWritingMode, llvm::object::Archive::Kind, bool, bool) /s/llvm-upstream/llvm-project/llvm/lib/ObjCopy/Archive.cpp:70:17
#25 0x00007efc2eef855e llvm::objcopy::executeObjcopyOnArchive(llvm::objcopy::MultiFormatConfig const&, llvm::object::Archive const&) /s/llvm-upstream/llvm-project/llvm/lib/ObjCopy/Archive.cpp:105:10
#26 0x0000556cadc48ae3 executeObjcopy(llvm::objcopy::ConfigManager&) /s/llvm-upstream/llvm-project/llvm/tools/llvm-objcopy/llvm-objcopy.cpp:179:21
#27 0x0000556cadc4839e llvm_objcopy_main(int, char**, llvm::ToolContext const&) /s/llvm-upstream/llvm-project/llvm/tools/llvm-objcopy/llvm-objcopy.cpp:251:15
#28 0x0000556cadc4c455 main /s/llvm-upstream/llvm-project/build/tools/llvm-objcopy/llvm-objcopy-driver.cpp:17:3
#29 0x00007efc2d843b8a __libc_start_call_main ./csu/../sysdeps/nptl/libc_start_call_main.h:74:3
#30 0x00007efc2d843c45 call_init ./csu/../csu/libc-start.c:128:20
#31 0x00007efc2d843c45 __libc_start_main ./csu/../csu/libc-start.c:347:5

@dschuff
Member Author

dschuff commented Jul 30, 2024

This same crash happens when stripping the object file first and then trying to add it to an archive. This suggests that the stripping is corrupting the object file in such a way that it crashes the object file parser.

@dschuff
Member Author

dschuff commented Jul 30, 2024

Actually looking at the file, the debug sections are gone, but the reloc sections that point to them are still there. Maybe the problem is that the linking section parser is finding the symbols pointing to the sections that are gone, and asserting, rather than handling this error in a more reasonable way. More generally we should probably be resilient to bogus symbols and relocations. And probably we should also strip relocations that point to debug sections when stripping only debug info.
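
One way to check for this condition in an object file is to walk the wasm section list and look for `reloc.*` custom sections whose target is no longer present. The following is a minimal sketch, not part of any LLVM tool. It assumes the layout described in the WebAssembly tool-conventions linking document (a `reloc.*` custom section starts with the index of its target section) and LLVM's convention of naming these sections `reloc.` plus the target section name, with `reloc.CODE` and `reloc.DATA` for the code and data sections. The function names `read_uleb128`, `list_sections` and `dangling_relocs` are hypothetical.

```python
WASM_MAGIC = b"\0asm"


def read_uleb128(buf, pos):
    """Decode an unsigned LEB128 integer; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:
            return result, pos
        shift += 7


def list_sections(buf):
    """Return [(section_id, custom_name_or_None, body_bytes)] for a wasm binary."""
    if buf[:4] != WASM_MAGIC:
        raise ValueError("not a wasm binary")
    pos = 8  # skip the 4-byte magic and 4-byte version
    sections = []
    while pos < len(buf):
        sec_id = buf[pos]
        size, pos = read_uleb128(buf, pos + 1)
        end = pos + size
        name = None
        body_start = pos
        if sec_id == 0:  # custom section: payload starts with its name
            name_len, name_pos = read_uleb128(buf, pos)
            name = buf[name_pos:name_pos + name_len].decode("utf-8")
            body_start = name_pos + name_len
        sections.append((sec_id, name, buf[body_start:end]))
        pos = end
    return sections


def dangling_relocs(buf):
    """Names of reloc.* sections whose target index or target section is missing."""
    sections = list_sections(buf)
    custom_names = {name for sec_id, name, _ in sections if sec_id == 0}
    dangling = []
    for sec_id, name, body in sections:
        if sec_id != 0 or not name.startswith("reloc."):
            continue
        # Assumption: the first field of a reloc section is the target section index.
        target_index, _ = read_uleb128(body, 0)
        target_name = name[len("reloc."):]
        index_ok = target_index < len(sections)
        name_ok = target_name in ("CODE", "DATA") or target_name in custom_names
        if not (index_ok and name_ok):
            dangling.append(name)
    return dangling
```

For example, `dangling_relocs(open("out/simple_thread_lean.o", "rb").read())` would be expected to report names such as `reloc..debug_info` if the stripped object still carries relocation sections for debug sections that were removed. The name-based check is only a heuristic, and this sketch does not inspect the symbol table in the `linking` section, which the later comments identify as the actual trigger of the parser failure.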

@gmarella

@dschuff That makes sense now. Is there any workaround, such as an option to remove the reloc sections that belong to the stripped debug sections?

@dschuff
Member Author

dschuff commented Jul 30, 2024

Unfortunately it doesn't look like it. Even if you remove the relocation sections, it looks like it's the invalid symbols in the symbol table (that point into the stripped debug info section) that are causing the problem.

Having said that, as @kripken said in #22289, the expected way to do stripping is to link first and then strip the wasm file after linking. That would avoid this problem and still let you have both a debuggable binary and a smaller binary to work with.

Thinking about what the "right" behavior should be here: clearly the object file parser should not crash when reading corrupted symbol tables if possible. But the symbol table is still "invalid" because there are symbols pointing into nonexistent sections. And stripping out sections should probably (?) not also modify the symbol table. So what should the linker do if it encounters such symbols?

If this use case is going to be made to actually work, the linker will probably have to distinguish between symbols pointing into missing known sections (which IMO should probably be an "irrecoverable" form of corruption) and symbols pointing into missing custom sections (which maybe the linker can just tolerate). But it's not clear how far-reaching a change this would be. The linker would probably have to avoid adding such symbols to the linker symbol table at load time, so that it would tolerate such symbols being unresolved, not let such symbols resolve dependencies from other symbols, tolerate duplicates, etc. It might or might not be worth it.

Is there a particular reason you need to be able to strip the object/library files instead of the linked wasm file?
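
The link-first, strip-afterwards approach suggested above can be sketched as a small helper that mirrors the `debug-strip` target from the Makefile but operates on the linked wasm file instead of the archive. This is only an illustration: the function name `strip_linked_wasm`, its parameters, and the injectable `run` argument are hypothetical, and the `llvm-strip` location depends on where the emsdk is installed. The only tool invocation used is `llvm-strip --strip-debug <file>`, as in the original report.

```python
import shutil
import subprocess
from pathlib import Path


def strip_linked_wasm(wasm_path, llvm_strip="llvm-strip", run=subprocess.run):
    """Keep a debug copy of a linked .wasm file, then strip debug info in place.

    Returns the path of the unstripped copy.
    """
    wasm_path = Path(wasm_path)
    # Same ".orig.debug" suffix as the Makefile's debug-strip target.
    debug_copy = wasm_path.with_name(wasm_path.name + ".orig.debug")
    shutil.copyfile(wasm_path, debug_copy)
    # Strip the already linked module, not the .o or .a inputs.
    run([llvm_strip, "--strip-debug", str(wasm_path)], check=True)
    return debug_copy
```

With the Makefile above, the linked module would be written next to `serve/index.html`, so a call might look like `strip_linked_wasm("serve/index.wasm", llvm_strip="/path/to/emsdk/upstream/bin/llvm-strip")`, where the tool path is a placeholder. Passing `run` explicitly is just a convenience for trying the helper without the toolchain installed.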

@gmarella

@dschuff Our use case is a little different: we build the WASM by linking a set of static libraries. For ease of local development, we want to choose which libraries contribute debug symbols. For the libraries we don't need to debug, we want to strip the debug symbols first and then build the WASM, along with a dedicated DWARF symbol file that contains debug info only from the required libraries.

@sbc100
Collaborator

sbc100 commented Aug 5, 2024

I opened llvm/llvm-project#102002 on the LLVM side. Sadly it looks like it might be a fair amount of work to resolve this.

@gmarella

gmarella commented Aug 6, 2024

Thanks @sbc100 .

@gmarella

@sbc100 I see that the fix has been merged on the LLVM side: llvm/llvm-project#102978

Is there a timeline for when the fix will be picked up by emscripten?

@dschuff
Member Author

dschuff commented Aug 23, 2024

The fix should be in 3.1.65 which was released yesterday.

@sbc100
Collaborator

sbc100 commented Aug 23, 2024

Closing this issue for now, please re-open if it persists.

sbc100 closed this as completed Aug 23, 2024