Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bitcode reader asserts in lazyLoadOneMetadata #56194

Open
arsenm opened this issue Jun 24, 2022 · 6 comments
Open

Bitcode reader asserts in lazyLoadOneMetadata #56194

arsenm opened this issue Jun 24, 2022 · 6 comments

Comments

@arsenm
Copy link
Contributor

arsenm commented Jun 24, 2022

While trying to load some bitcode from an older build from a few months ago, I ran into this assertion in the bitcode reader. Attached is patch adding reduced lit test

llvm-dis: /home/matt/src/llvm-project/llvm/lib/Bitcode/Reader/MetadataLoader.cpp:1058: void llvm::MetadataLoader::MetadataLoaderImpl::lazyLoadOneMetadata(unsigned int, (anonymous namespace)::(anonymous namespace)::PlaceholderQueue &): Assertion `ID < (MDStringRef.size()) + GlobalMetadataBitPosIndex.size()' failed.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0.	Program arguments: /home/matt/src/llvm-project/build_debug/bin/llvm-dis
 #0 0x0000000000c2fcaa llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /home/matt/src/llvm-project/llvm/lib/Support/Unix/Signals.inc:569:11
 #1 0x0000000000c2fe5b PrintStackTraceSignalHandler(void*) /home/matt/src/llvm-project/llvm/lib/Support/Unix/Signals.inc:636:1
 #2 0x0000000000c2e4d6 llvm::sys::RunSignalHandlers() /home/matt/src/llvm-project/llvm/lib/Support/Signals.cpp:103:5
 #3 0x0000000000c30595 SignalHandler(int) /home/matt/src/llvm-project/llvm/lib/Support/Unix/Signals.inc:407:1
 #4 0x00007f377084e520 (/lib/x86_64-linux-gnu/libc.so.6+0x42520)
 #5 0x00007f37708a2a7c __pthread_kill_implementation ./nptl/./nptl/pthread_kill.c:44:76
 #6 0x00007f37708a2a7c __pthread_kill_internal ./nptl/./nptl/pthread_kill.c:78:10
 #7 0x00007f37708a2a7c pthread_kill ./nptl/./nptl/pthread_kill.c:89:10
 #8 0x00007f377084e476 gsignal ./signal/../sysdeps/posix/raise.c:27:6
 #9 0x00007f37708347f3 abort ./stdlib/./stdlib/abort.c:81:7
#10 0x00007f377083471b _nl_load_domain ./intl/./intl/loadmsgcat.c:1177:9
#11 0x00007f3770845e96 (/lib/x86_64-linux-gnu/libc.so.6+0x39e96)
#12 0x00000000007104d2 llvm::MetadataLoader::MetadataLoaderImpl::lazyLoadOneMetadata(unsigned int, (anonymous namespace)::(anonymous namespace)::PlaceholderQueue&) /home/matt/src/llvm-project/llvm/lib/Bitcode/Reader/MetadataLoader.cpp:0:3
#13 0x0000000000706edc llvm::MetadataLoader::MetadataLoaderImpl::resolveForwardRefsAndPlaceholders((anonymous namespace)::(anonymous namespace)::PlaceholderQueue&) /home/matt/src/llvm-project/llvm/lib/Bitcode/Reader/MetadataLoader.cpp:1104:18
#14 0x0000000000706b6a llvm::MetadataLoader::MetadataLoaderImpl::parseMetadata(bool) /home/matt/src/llvm-project/llvm/lib/Bitcode/Reader/MetadataLoader.cpp:1025:7
#15 0x000000000071117f llvm::MetadataLoader::parseMetadata(bool) /home/matt/src/llvm-project/llvm/lib/Bitcode/Reader/MetadataLoader.cpp:2346:17
#16 0x00000000006d6f68 llvm::MetadataLoader::parseModuleMetadata() /home/matt/src/llvm-project/llvm/lib/Bitcode/Reader/MetadataLoader.h:47:40
#17 0x000000000068ffdf (anonymous namespace)::BitcodeReader::parseModule(unsigned long, bool, llvm::function_ref<llvm::Optional<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>> (llvm::StringRef)>) /home/matt/src/llvm-project/llvm/lib/Bitcode/Reader/BitcodeReader.cpp:3921:35
#18 0x000000000067196b (anonymous namespace)::BitcodeReader::parseBitcodeInto(llvm::Module*, bool, bool, llvm::function_ref<llvm::Optional<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>> (llvm::StringRef)>) /home/matt/src/llvm-project/llvm/lib/Bitcode/Reader/BitcodeReader.cpp:4117:10
#19 0x0000000000670d39 llvm::BitcodeModule::getModuleImpl(llvm::LLVMContext&, bool, bool, bool, llvm::function_ref<llvm::Optional<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>> (llvm::StringRef)>) /home/matt/src/llvm-project/llvm/lib/Bitcode/Reader/BitcodeReader.cpp:7355:13

0001-Bitcode-Add-asserting-bitcode-testcase.zip

@arsenm
Copy link
Contributor Author

arsenm commented Oct 18, 2022

@scott-linder I'm starting to suspect this is somehow involved with the "DIOpArg" in the test case. This seems to be in the heterogenous debug info patches and triggers the assert in upstream

@arsenm
Copy link
Contributor Author

arsenm commented Oct 18, 2022

Further reduced with llvm-reduce

0001-Bitcode-Add-asserting-bitcode-testcase.patch.zip

@arsenm
Copy link
Contributor Author

arsenm commented Jun 29, 2023

ping, running into this again

@slinder1
Copy link
Contributor

slinder1 commented Jul 6, 2023

TLDR:

I don't know that I can resolve the issue fundamentally until we upstream. The new DIExpr metadata node requires a new bitcode record code (in amd-stg-open it is bitc::METADTA_EXPR), and there is no concept of "skipping" unknown metadata in the bitcode reader.

More details:

The assertion is most immediately due to the lack of a default: case for an unknown record code in lazyLoadModuleMetadataBlock. The original intention seems to be to abort lazy-loading when a known-record-code that is not directly supported for lazy-loading is encountered. However, it doesn't consider unknown or corrupted codes, so it just silently falls through and ignores it without disabling lazy-loading. Later lazyLoadOneMetadata asserts because the record's code is not lazy-loadable.

However even with lazy-loading disabled (this can be forced with the hidden -disable-ondemand-mds-loading flag) the record is fundamentally unknown to the reader, and it will fail in other ways. (Note that another latent bug means one has to hack out the body of ~PlaceholderQueue destructor to actually see any error messages generated.)

We also used sequential IDs for the new metadata downstream, which means we have actually already changed the values used between releases as we have merged in upstream (i.e. when the new METADATA_ASSIGN_ID kind was added upstream with the same value as METADATA_EXPR in our fork). This is an oversight on my part, and I simply missed the merge where the values were bumped up. We should have chosen much higher ID values to avoid this, but in any case there is just no way to load a bitcode module produced with heterogeneous debug-info using the upstream compiler.

@arsenm
Copy link
Contributor Author

arsenm commented Jul 11, 2023

Independent of the debugging support, the bitcode reader seems really fragile and should be more capable of dropping this

We also used sequential IDs for the new metadata downstream, which means we have actually already changed the values used between releases as we have merged in upstream (i.e. when the new METADATA_ASSIGN_ID kind was added upstream with the same value as METADATA_EXPR in our fork).

Is this a real problem given that you're supposed to be able to just drop metadata?

@slinder1
Copy link
Contributor

Certainly I agree there is a bug here (in fact several) but it amounts to "fix the whole of the metadata reader", which I can't promise I have the cycles for.

The issue with the overlapping IDs is that technically there is nothing which prevents different kinds of metadata from having overlapping encoding spaces. I.e. an amd-stg-open compiler might generate some METADATA_EXPR metadata in such a way that the overlapping metadata with the same ID from upstream could actually parse it. Maybe such a case is outlandish, but it seems like there could be miscompiles (potentially even silently) if something meaningful to codegen were parsed in this way. Much more likely this would just uncover other bugs in the compiler around corrupt metadata, though.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants