Implements llvm-readobj with tests. [3/5] #2

red1bluelost · 2023-11-09T00:55:56Z

Adds dumping of PGOBBAddrMap in readobj. Also includes tests for dumping.

PR Series

The current code for PGOBBAddrMap to be upstreamed is split into five PRs:

Object and ObjectYAML - Implements PGOBBAddrMap in Object and ObjectYAML with tests [1/5] llvm/llvm-project#71750
AsmPrinter - Implements PGOBBAddrMap in AsmPrinter with tests [2/5] #1
llvm-readobj - (this one)
llvm-objdump - https://github.com/red1bluelost/llvm-project/tree/pgo-bb-addr-map--llvm-objdump
llvm obj2yaml - https://github.com/red1bluelost/llvm-project/tree/pgo-bb-addr-map--llvm-obj2yaml

If you would like to try testing PGOBBAddrMap locally on a program, PR-3 llvm-readobj is likely the minimum code needed to meaningfully use this feature.

…ooking options for a custom subcommand (llvm#71975) …ooking options for a custom subcommand. (llvm#71776)" This reverts commit b88308b. The build-bot is unhappy (https://lab.llvm.org/buildbot/#/builders/186/builds/13096), `GroupingAndPrefix` fails after `TopLevelOptInSubcommand` (the newly added test). Revert while I look into this (might be related to test sharding, but not sure).

```
[----------] 3 tests from CommandLineTest
[ RUN ] CommandLineTest.TokenizeWindowsCommandLine2
[ OK ] CommandLineTest.TokenizeWindowsCommandLine2 (0 ms)
[ RUN ] CommandLineTest.TopLevelOptInSubcommand
[ OK ] CommandLineTest.TopLevelOptInSubcommand (0 ms)
[ RUN ] CommandLineTest.GroupingAndPrefix
#0 0x00ba8118 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/home/tcwg-buildbot/worker/clang-armv7-global-isel/stage1/unittests/Support/./SupportTests+0x594118)
#1 0x00ba5914 llvm::sys::RunSignalHandlers() (/home/tcwg-buildbot/worker/clang-armv7-global-isel/stage1/unittests/Support/./SupportTests+0x591914)
#2 0x00ba89c4 SignalHandler(int) (/home/tcwg-buildbot/worker/clang-armv7-global-isel/stage1/unittests/Support/./SupportTests+0x5949c4)
#3 0xf7828530 __default_sa_restorer /build/glibc-9MGTF6/glibc-2.31/signal/../sysdeps/unix/sysv/linux/arm/sigrestorer.S:67:0
#4 0x00af91f0 (anonymous namespace)::CommandLineParser::ResetAllOptionOccurrences() (/home/tcwg-buildbot/worker/clang-armv7-global-isel/stage1/unittests/Support/./SupportTests+0x4e51f0)
#5 0x00af8e1c llvm::cl::ResetCommandLineParser() (/home/tcwg-buildbot/worker/clang-armv7-global-isel/stage1/unittests/Support/./SupportTests+0x4e4e1c)
#6 0x0077cda0 (anonymous namespace)::CommandLineTest_GroupingAndPrefix_Test::TestBody() (/home/tcwg-buildbot/worker/clang-armv7-global-isel/stage1/unittests/Support/./SupportTests+0x168da0)
#7 0x00bc5adc testing::Test::Run() (/home/tcwg-buildbot/worker/clang-armv7-global-isel/stage1/unittests/Support/./SupportTests+0x5b1adc)
#8 0x00bc6cc0 testing::TestInfo::Run() (/home/tcwg-buildbot/worker/clang-armv7-global-isel/stage1/unittests/Support/./SupportTests+0x5b2cc0)
#9 0x00bc7880 testing::TestSuite::Run() (/home/tcwg-buildbot/worker/clang-armv7-global-isel/stage1/unittests/Support/./SupportTests+0x5b3880)
#10 0x00bd7974 testing::internal::UnitTestImpl::RunAllTests() (/home/tcwg-buildbot/worker/clang-armv7-global-isel/stage1/unittests/Support/./SupportTests+0x5c3974)
#11 0x00bd6ebc testing::UnitTest::Run() (/home/tcwg-buildbot/worker/clang-armv7-global-isel/stage1/unittests/Support/./SupportTests+0x5c2ebc)
#12 0x00bb1058 main (/home/tcwg-buildbot/worker/clang-armv7-global-isel/stage1/unittests/Support/./SupportTests+0x59d058)
#13 0xf78185a4 __libc_start_main /build/glibc-9MGTF6/glibc-2.31/csu/libc-start.c:342:3
```

… functions (llvm#72069) Fixes a bug introduced by commit f95b2f1 ("Reland [InstrProf][compiler-rt] Enable MC/DC Support in LLVM Source-based Code Coverage (1/3)").

The InstrProfiling pass was refactored when introducing support for MC/DC such that the creation of the data variable was abstracted and called only once per function from ::run(). Because ::run() only iterated over functions that were not fully inlined, and because it only created the data variable for the first intrinsic that it saw, functions fully inlined into other instrumented callers would end up without a data variable, resulting in loss of coverage information.

This patch does the following:

1.) Move the call of createDataVariable() to getOrCreateRegionCounters() so that the creation of the data variable will happen indirectly either from ::new() or during profile intrinsic lowering when it is needed. This effectively restores the behavior prior to the refactor and ensures that all data variables are created when needed (and not duplicated).

2.) Process all MC/DC bitmap parameter intrinsics in ::run() prior to calling getOrCreateRegionCounters(). This ensures bitmap regions are created for each function including functions that are fully inlined. It also ensures that the bitmap region is created for each function prior to the creation of the data variable because it is referenced by the data variable. Again, duplication is prevented if the same parameter intrinsic is inlined into multiple functions.

3.) No longer pass the MC/DC intrinsic to createDataVariable(). This decouples the creation of the data variable from a specific MC/DC intrinsic. Instead, with #2 above, store the number of bitmap bytes required in the PerFunctionProfileData in the ProfileDataMap along with the function's CounterRegion and BitmapRegion variables. This ties the bitmap information directly to the function to which it belongs, and the data variable created for that function can reference that.

Internal builds of the unittests with msan flagged mempcpy_test.

==6862==WARNING: MemorySanitizer: use-of-uninitialized-value
#0 0x55e34d7d734a in length llvm-project/libc/src/__support/CPP/string_view.h:41:11
#1 0x55e34d7d734a in string_view llvm-project/libc/src/__support/CPP/string_view.h:71:24
#2 0x55e34d7d734a in __llvm_libc_9999_0_0_git::testing::Test::testStrEq(char const*, char const*, char const*, char const*, __llvm_libc_9999_0_0_git::testing::internal::Location) llvm-project/libc/test/UnitTest/LibcTest.cpp:284:13
#3 0x55e34d7d4e09 in LlvmLibcMempcpyTest_Simple::Run() llvm-project/libc/test/src/string/mempcpy_test.cpp:20:3
#4 0x55e34d7d6dff in __llvm_libc_9999_0_0_git::testing::Test::runTests(char const*) llvm-project/libc/test/UnitTest/LibcTest.cpp:133:8
#5 0x55e34d7d86e0 in main llvm-project/libc/test/UnitTest/LibcTestMain.cpp:21:10
SUMMARY: MemorySanitizer: use-of-uninitialized-value llvm-project/libc/src/__support/CPP/string_view.h:41:11 in length

What's going on here is that mempcpy_test.cpp's Simple test is using ASSERT_STREQ with a partially initialized char array. ASSERT_STREQ calls Test::testStrEq, which constructs a cpp::string_view. That constructor calls the private method cpp::string_view::length. When built with msan, the loop is transformed into multi-byte access, which then fails upon access.

I took a look at libc++'s __constexpr_strlen, which just calls __builtin_strlen(). Replacing the implementation of cpp::string_view::length with a call to __builtin_strlen() may still result in out-of-bounds access when the test is built with msan.

It's not safe to use ASSERT_STREQ with a partially initialized array. Initialize the whole array so that the test passes.
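As a rough illustration of the fix described above (not the actual test code), the sketch below value-initializes the destination buffer before copying into it, so a later string comparison never reads uninitialized bytes. `copy_and_advance` and `copy_into_zeroed_buffer` are hypothetical names; `copy_and_advance` is a portable stand-in for `mempcpy`, built on `std::memcpy`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for mempcpy: copies n bytes and returns a pointer
// one past the last byte written.
void *copy_and_advance(void *dst, const void *src, std::size_t n) {
  std::memcpy(dst, src, n);
  return static_cast<char *>(dst) + n;
}

// Copies a short string into a larger buffer whose bytes are all
// initialized up front, then compares the result as a C string.
bool copy_into_zeroed_buffer() {
  const char *src = "12345";
  char dest[10] = {};  // every byte is zero, not only the copied prefix
  void *end = copy_and_advance(dest, src, 6);  // 5 characters + terminator
  return end == dest + 6 && std::strcmp(dest, "12345") == 0;
}
```

With `char dest[10];` instead, the four trailing bytes would stay uninitialized, which is the state the msan report above complains about when a wide strlen reads past the terminator.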

We'd like a way to select the current thread by its thread ID (rather than its internal LLDB thread index). This PR adds a `-t` option (`--thread_id` long option) that tells the `thread select` command to interpret the `<thread-index>` argument as a thread ID. Here's an example of it working: ``` michristensen@devbig356 llvm/llvm-project (thread-select-tid) » ../Debug/bin/lldb ~/scratch/cpp/threading/a.out (lldb) target create "/home/michristensen/scratch/cpp/threading/a.out" Current executable set to '/home/michristensen/scratch/cpp/threading/a.out' (x86_64). (lldb) b 18 Breakpoint 1: where = a.out`main + 80 at main.cpp:18:12, address = 0x0000000000000850 (lldb) run Process 215715 launched: '/home/michristensen/scratch/cpp/threading/a.out' (x86_64) This is a thread, i=1 This is a thread, i=2 This is a thread, i=3 This is a thread, i=4 This is a thread, i=5 Process 215715 stopped * thread #1, name = 'a.out', stop reason = breakpoint 1.1 frame #0: 0x0000555555400850 a.out`main at main.cpp:18:12 15 for (int i = 0; i < 5; i++) { 16 pthread_create(&thread_ids[i], NULL, foo, NULL); 17 } -> 18 for (int i = 0; i < 5; i++) { 19 pthread_join(thread_ids[i], NULL); 20 } 21 return 0; (lldb) thread select 2 * thread #2, name = 'a.out' frame #0: 0x00007ffff68f9918 libc.so.6`__nanosleep + 72 libc.so.6`__nanosleep: -> 0x7ffff68f9918 <+72>: cmpq $-0x1000, %rax ; imm = 0xF000 0x7ffff68f991e <+78>: ja 0x7ffff68f9952 ; <+130> 0x7ffff68f9920 <+80>: movl %edx, %edi 0x7ffff68f9922 <+82>: movl %eax, 0xc(%rsp) (lldb) thread info thread #2: tid = 216047, 0x00007ffff68f9918 libc.so.6`__nanosleep + 72, name = 'a.out' (lldb) thread list Process 215715 stopped thread #1: tid = 215715, 0x0000555555400850 a.out`main at main.cpp:18:12, name = 'a.out', stop reason = breakpoint 1.1 * thread #2: tid = 216047, 0x00007ffff68f9918 libc.so.6`__nanosleep + 72, name = 'a.out' thread llvm#3: tid = 216048, 0x00007ffff68f9918 libc.so.6`__nanosleep + 72, name = 'a.out' thread llvm#4: tid = 216049, 
0x00007ffff68f9918 libc.so.6`__nanosleep + 72, name = 'a.out' thread llvm#5: tid = 216050, 0x00007ffff68f9918 libc.so.6`__nanosleep + 72, name = 'a.out' thread llvm#6: tid = 216051, 0x00007ffff68f9918 libc.so.6`__nanosleep + 72, name = 'a.out' (lldb) thread select 215715 error: invalid thread #215715. (lldb) thread select -t 215715 * thread #1, name = 'a.out', stop reason = breakpoint 1.1 frame #0: 0x0000555555400850 a.out`main at main.cpp:18:12 15 for (int i = 0; i < 5; i++) { 16 pthread_create(&thread_ids[i], NULL, foo, NULL); 17 } -> 18 for (int i = 0; i < 5; i++) { 19 pthread_join(thread_ids[i], NULL); 20 } 21 return 0; (lldb) thread select -t 216051 * thread llvm#6, name = 'a.out' frame #0: 0x00007ffff68f9918 libc.so.6`__nanosleep + 72 libc.so.6`__nanosleep: -> 0x7ffff68f9918 <+72>: cmpq $-0x1000, %rax ; imm = 0xF000 0x7ffff68f991e <+78>: ja 0x7ffff68f9952 ; <+130> 0x7ffff68f9920 <+80>: movl %edx, %edi 0x7ffff68f9922 <+82>: movl %eax, 0xc(%rsp) (lldb) thread select 3 * thread llvm#3, name = 'a.out' frame #0: 0x00007ffff68f9918 libc.so.6`__nanosleep + 72 libc.so.6`__nanosleep: -> 0x7ffff68f9918 <+72>: cmpq $-0x1000, %rax ; imm = 0xF000 0x7ffff68f991e <+78>: ja 0x7ffff68f9952 ; <+130> 0x7ffff68f9920 <+80>: movl %edx, %edi 0x7ffff68f9922 <+82>: movl %eax, 0xc(%rsp) (lldb) thread select -t 216048 * thread llvm#3, name = 'a.out' frame #0: 0x00007ffff68f9918 libc.so.6`__nanosleep + 72 libc.so.6`__nanosleep: -> 0x7ffff68f9918 <+72>: cmpq $-0x1000, %rax ; imm = 0xF000 0x7ffff68f991e <+78>: ja 0x7ffff68f9952 ; <+130> 0x7ffff68f9920 <+80>: movl %edx, %edi 0x7ffff68f9922 <+82>: movl %eax, 0xc(%rsp) (lldb) thread select --thread_id 216048 * thread llvm#3, name = 'a.out' frame #0: 0x00007ffff68f9918 libc.so.6`__nanosleep + 72 libc.so.6`__nanosleep: -> 0x7ffff68f9918 <+72>: cmpq $-0x1000, %rax ; imm = 0xF000 0x7ffff68f991e <+78>: ja 0x7ffff68f9952 ; <+130> 0x7ffff68f9920 <+80>: movl %edx, %edi 0x7ffff68f9922 <+82>: movl %eax, 0xc(%rsp) (lldb) help thread select Change the 
currently selected thread. Syntax: thread select <cmd-options> <thread-index> Command Options Usage: thread select [-t] <thread-index> -t ( --thread_id ) Provide a thread ID instead of a thread index. This command takes options and free-form arguments. If your arguments resemble option specifiers (i.e., they start with a - or --), you must use ' -- ' between the end of the command options and the beginning of the arguments. (lldb) c Process 215715 resuming Process 215715 exited with status = 0 (0x00000000) ```

Linalg op fusion (`Linalg/Transforms/Fusion.cpp`) used to generate invalid fused producer ops: ``` error: 'linalg.conv_2d_nhwc_hwcf' op expected type of operand #2 ('tensor<1x8x16x4xf32>') to match type of corresponding result ('tensor<?x?x?x?xf32>') note: see current operation: %24 = "linalg.conv_2d_nhwc_hwcf"(%21, %22, %23) <{dilations = dense<1> : tensor<2xi64>, operandSegmentSizes = array<i32: 2, 1>, strides = dense<2> : tensor<2xi64>}> ({ ^bb0(%arg9: f32, %arg10: f32, %arg11: f32): %28 = "arith.mulf"(%arg9, %arg10) <{fastmath = #arith.fastmath<none>}> : (f32, f32) -> f32 %29 = "arith.addf"(%arg11, %28) <{fastmath = #arith.fastmath<none>}> : (f32, f32) -> f32 "linalg.yield"(%29) : (f32) -> () }) {linalg.memoized_indexing_maps = [affine_map<(d0, d1, d2, d3, d4, d5, d6) -> (d0, d1 * 2 + d4, d2 * 2 + d5, d6)>, affine_map<(d0, d1, d2, d3, d4, d5, d6) -> (d4, d5, d6, d3)>, affine_map<(d0, d1, d2, d3, d4, d5, d6) -> (d0, d1, d2, d3)>]} : (tensor<1x?x?x3xf32>, tensor<3x3x3x4xf32>, tensor<1x8x16x4xf32>) -> tensor<?x?x?x?xf32> ``` This is a problem because the input IR to greedy pattern rewriter during `-test-linalg-greedy-fusion` is invalid. This commit fixes tests such as `mlir/test/Dialect/Linalg/tile-and-fuse-tensors.mlir` when verifying the IR after each pattern application (llvm#74270).

This has been flaky for a while, for example https://lab.llvm.org/buildbot/#/builders/96/builds/50350

```
Command Output (stdout):
--
lldb version 18.0.0git (https://github.com/llvm/llvm-project.git revision 3974d89)
clang revision 3974d89
llvm revision 3974d89
"can't evaluate expressions when the process is running."
```

```
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
#0 0x0000ffffa46191a0 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/lib/python3.8/site-packages/lldb/_lldb.cpython-38-aarch64-linux-gnu.so+0x529a1a0)
#1 0x0000ffffa4617144 llvm::sys::RunSignalHandlers() (/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/lib/python3.8/site-packages/lldb/_lldb.cpython-38-aarch64-linux-gnu.so+0x5298144)
#2 0x0000ffffa46198d0 SignalHandler(int) (/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/lib/python3.8/site-packages/lldb/_lldb.cpython-38-aarch64-linux-gnu.so+0x529a8d0)
#3 0x0000ffffab25b7dc (linux-vdso.so.1+0x7dc)
#4 0x0000ffffab13d050 /build/glibc-Q8DG8B/glibc-2.31/string/../sysdeps/aarch64/multiarch/memcpy_advsimd.S:92:0
#5 0x0000ffffa446f420 lldb_private::process_gdb_remote::GDBRemoteRegisterContext::PrivateSetRegisterValue(unsigned int, llvm::ArrayRef<unsigned char>) (/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/lib/python3.8/site-packages/lldb/_lldb.cpython-38-aarch64-linux-gnu.so+0x50f0420)
#6 0x0000ffffa446f7b8 lldb_private::process_gdb_remote::GDBRemoteRegisterContext::GetPrimordialRegister(lldb_private::RegisterInfo const*, lldb_private::process_gdb_remote::GDBRemoteCommunicationClient&) (/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/lib/python3.8/site-packages/lldb/_lldb.cpython-38-aarch64-linux-gnu.so+0x50f07b8)
#7 0x0000ffffa446f308 lldb_private::process_gdb_remote::GDBRemoteRegisterContext::ReadRegisterBytes(lldb_private::RegisterInfo const*) (/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/lib/python3.8/site-packages/lldb/_lldb.cpython-38-aarch64-linux-gnu.so+0x50f0308)
#8 0x0000ffffa446ec1c lldb_private::process_gdb_remote::GDBRemoteRegisterContext::ReadRegister(lldb_private::RegisterInfo const*, lldb_private::RegisterValue&) (/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/lib/python3.8/site-packages/lldb/_lldb.cpython-38-aarch64-linux-gnu.so+0x50efc1c)
#9 0x0000ffffa412eaa4 lldb_private::RegisterContext::ReadRegisterAsUnsigned(lldb_private::RegisterInfo const*, unsigned long) (/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/lib/python3.8/site-packages/lldb/_lldb.cpython-38-aarch64-linux-gnu.so+0x4dafaa4)
#10 0x0000ffffa420861c ReadLinuxProcessAddressMask(std::shared_ptr<lldb_private::Process>, llvm::StringRef) (/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/lib/python3.8/site-packages/lldb/_lldb.cpython-38-aarch64-linux-gnu.so+0x4e8961c)
#11 0x0000ffffa4208430 ABISysV_arm64::FixCodeAddress(unsigned long) (/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/lib/python3.8/site-packages/lldb/_lldb.cpython-38-aarch64-linux-gnu.so+0x4e89430)
```

Judging by the backtrace, something is trying to read the pointer authentication address/code mask registers. This explains why I've not seen this issue locally, as the buildbot runs on Graviton 3, which has the pointer authentication extension. I will try to reproduce, fix and re-enable the test.

…vm#75394) Calling one of the pthread join/detach interceptors on an already joined/detached thread causes asserts such as:

AddressSanitizer: CHECK failed: sanitizer_thread_arg_retval.cpp:56 "((t)) != (0)" (0x0, 0x0) (tid=1236094)
#0 0x555555634f8b in __asan::CheckUnwind() compiler-rt/lib/asan/asan_rtl.cpp:69:3
#1 0x55555564e06e in __sanitizer::CheckFailed(char const*, int, char const*, unsigned long long, unsigned long long) compiler-rt/lib/sanitizer_common/sanitizer_termination.cpp:86:24
#2 0x5555556491df in __sanitizer::ThreadArgRetval::BeforeJoin(unsigned long) const compiler-rt/lib/sanitizer_common/sanitizer_thread_arg_retval.cpp:56:3
#3 0x5555556198ed in Join<___interceptor_pthread_tryjoin_np(void*, void**)::<lambda()> > compiler-rt/lib/asan/../sanitizer_common/sanitizer_thread_arg_retval.h:74:26
#4 0x5555556198ed in pthread_tryjoin_np compiler-rt/lib/asan/asan_interceptors.cpp:311:29

The asserts are replaced by error codes.

…cast` (llvm#79162) Makes `TransferReadAfterWriteToBroadcast` correctly propagate scalability flags.

This is to add test coverage for a change in llvm#73342

…e size (llvm#79245) Reported in 134fcc6 Incorrect opcode is used b/c there is a `[[fallthrough]]` at line 2386.

…#76551)" Test updated to expect i8 gep. Original message: This adopts a similar behavior to AArch64 SVE, where bool vectors are represented as a vector of chars with 1/8 the number of elements. This ensures the vector always occupies a power of 2 number of bytes. A consequence of this is that vbool64_t, vbool32_t, and vbool16_t can only be used with a vector length that guarantees at least 8 bits.

…lvm#78637) Improve codegen for (trunc X to <3 x i8>) by converting it to a sequence of 3 ST1.b, but first converting the truncate operand to either v8i8 or v16i8, extracting the lanes for the truncate results and storing them. At the moment, there are almost no cases in which such vector operations will be generated automatically. The motivating case is non-power-of-2 SLP vectorization: llvm#77790 PR: llvm#78637

This will make it easier for folks who have patches that are not targeting LLVM 18 -- they can write the release notes in the LLVM 19 release notes immediately.

Turns out I was using DbgMarker::getDbgValueRange rather than the helper utility in Instruction::getDbgValueRange, which checks for null-ness. Original commit message follows.

[DebugInfo][RemoveDIs] Convert debug-info modes when loading bitcode (llvm#78967)

As part of eliminating debug-intrinsics in LLVM, we'll shortly be pushing the conversion from "old" dbg.value mode to "new" DPValue mode out from when the pass manager runs, to when modules are loaded. This patch adds that conversion process and some (temporary) options to llvm-lto{,2} to help test it.

Specifically: now whenever we load a bitcode module, consider a flag of whether to "upgrade" it into the new debug-info mode, and if we're lazily materializing functions then do that lazily too.

Doing this exposes an error in the IRLinker/materializer handling of DPValues, where we need to transfer the debug-info format flag correctly, and in ValueMapper we need to remap the Values that DPValues point at. I've added some test coverage in the modified tests; these will be exercised by our llvm-new-debug-iterators buildbot.

This upgrading of debug-info won't be happening for the llvm18 release, instead we'll turn it on after the branch date, then push the boundary of where "new" debug-info starts and ends down into the existing debug-info upgrade path over the course of the next release.

Minor cast warning that was missed in previous patch. Fixed with explicit cast.

This patch fixes: lldb/source/Plugins/Language/CPlusPlus/LibCxx.cpp:1108:39: error: format specifies type 'long long' but the argument has type 'std::time_t' (aka 'long') [-Werror,-Wformat] lldb/source/Plugins/Language/CPlusPlus/LibCxx.cpp:1116:64: error: format specifies type 'long long' but the argument has type 'std::time_t' (aka 'long') [-Werror,-Wformat]

The use of SmallSetVector saves 0.58% of heap allocations during the compilation of a large preprocessed file, namely X86ISelLowering.cpp, for the X86 target. During the experiment, the final size of ToCheck was 8 or less 88% of the time.

Extends `vector.insert_strided_slice` and `vector.extract_strided_slice` to allow scalable input and output vectors. For scalable sizes, the corresponding slice size has to match the corresponding dimension in the output/input vector (insert/extract, respectively).

This is supported:
```mlir
vector.extract_strided_slice %1 { offsets = [0, 3, 0], sizes = [1, 1, 4], strides = [1, 1, 1] } : vector<1x4x[4]xi32> to vector<1x1x[4]xi32>
```

This is not supported:
```mlir
vector.extract_strided_slice %1 { offsets = [0, 3, 0], sizes = [1, 1, 2], strides = [1, 1, 1] } : vector<1x4x[4]xi32> to vector<1x1x[2]xi32>
```

…const char *, size_t *) (llvm#79215)

40dcf24 had changed the format specifier to fix the build for their local system. Unfortunately, that disagrees with some other systems, such as this buildbot: https://lab.llvm.org/buildbot/#/builders/37/builds/30440 This patch fixes the issue for all systems by casting.
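To illustrate the casting approach in general terms (this is a sketch, not the code from the patch): `std::time_t` may be `long` on one platform and `long long` on another, so no single `printf` length modifier matches it everywhere. Casting the argument to the type the format string names removes the mismatch. `format_seconds` is a hypothetical helper.

```cpp
#include <cassert>
#include <cstdio>
#include <ctime>
#include <string>

// Formats a time_t portably: the argument is cast to long long so that it
// always agrees with the %lld conversion, whatever time_t is underneath.
std::string format_seconds(std::time_t seconds) {
  char buf[32];
  std::snprintf(buf, sizeof(buf), "%lld", static_cast<long long>(seconds));
  return std::string(buf);
}
```

Changing only the format specifier, by contrast, fixes the warning on one system and reintroduces it on another, which is the back-and-forth the message above describes.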

…pp (NFC)

…lobalOps.cpp (NFC)

…rmOps.cpp (NFC)

…PUTransformOps.cpp (NFC)

A quick examination suggests that the current code in the codebase does not lead to incorrect annotations. However, the intention is for the object after the function to be annotated in a way that only its contents are unpoisoned and the rest is poisoned. This commit makes it explicit and avoids potential issues in future. In addition, I have implemented a few tests for a function that helped me identify the specific argument value. Notice: there is no known scenario where old code results in incorrect annotation.

A CPU may prefer to not sink splat operands, one reason being that it could require a S2V transfer buffer to move scalars into buffers.

…lvm#79307) Co-authored-by: Felix <felix@Dirks-MacBook-Pro.local>

…#79334) My previous patch, "Re-exec TSan with no ASLR if memory layout is incompatible on Linux (llvm#78351)" (0784b1e) hoisted the 'personality' call, to share the code between Android and non-Android Linux. Unfortunately, this eager call to 'personality' may trigger sandbox violations on non-Android Linux. This patch fixes the issue by only calling 'personality' on non-Android Linux if the memory mapping is incompatible. This may still cause a sandbox violation, but only if it was going to abort anyway due to an incompatible memory mapping. (The behavior on Android Linux is unchanged by this patch or the previous patch.)

… debug mode on PowerPC (llvm#79169) Adding the debug hardening modes into PowerPC target to check the assertion messages, to fix the PPC rhel bot [1]. This is needed as 58780b8 changed the assertion to trap in production hardening modes. Properly fixing these assertions is tracked by llvm#79216. [1]: https://lab.llvm.org/buildbot/#/builders/57/builds/32354 Co-authored-by: Maryam Moghadas <maryammo@ca.ibm.com>

We recently noticed that the unwrap_iter.h file was pushing macros, but it was pushing them again instead of popping them at the end of the file. This led to libc++ basically swallowing any custom definition of these macros in user code:

#define min HELLO
#include <algorithm>
// min is not HELLO anymore, it's not defined

While investigating this issue, I noticed that our push/pop pragmas were actually entirely wrong too. Indeed, instead of pushing macros like `move`, we'd push `move(int, int)` in the pragma, which is not a valid macro name. As a result, we would not actually push macros like `move` -- instead we'd simply undefine them. This led to the following code not working:

#define move HELLO
#include <algorithm>
// move is not HELLO anymore

Fixing the pragma push/pop incantations led to a cascade of issues because we use identifiers like `move` in a large number of places, and all of these headers would now need to do the push/pop dance.

This patch fixes all these issues. First, it adds a check that we don't swallow important names like min, max, move or refresh as explained above. This is done by augmenting the existing system_reserved_names.gen.py test to also check that the macros are what we expect after including each header.

Second, it fixes the push/pop pragmas to work properly and adds missing pragmas to all the files I could detect a failure in via the newly added test.

rdar://121365472
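For readers unfamiliar with the push/pop dance, here is a minimal sketch of the pattern, assuming a compiler that supports `#pragma push_macro` / `#pragma pop_macro` (GCC, Clang and MSVC do). It is not libc++'s actual header code; the `demo` namespace and the function names are hypothetical, and `refresh` is used only because it is one of the names listed above.

```cpp
#include <cassert>

// User code defines a macro whose name collides with a library identifier.
#define refresh 42

// --- start of a hypothetical library header ---
#pragma push_macro("refresh")  // save the user's definition (bare name only)
#undef refresh                 // the identifier is usable as a name again
namespace demo {
inline int refresh(int x) { return x + 1; }
inline int call_refresh(int x) { return refresh(x); }
}  // namespace demo
#pragma pop_macro("refresh")   // restore the user's definition
// --- end of the hypothetical library header ---

// After the header, the user's macro is intact and expands to 42.
int user_macro_value() { return refresh; }
```

The pragma argument must be the bare macro name; per the description above, a string such as `"move(int, int)"` names no macro, so nothing is saved and the following `#undef` simply discards the user's definition. Ending the header with a second `push_macro` instead of `pop_macro` would likewise leave the macro undefined for the user.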

…auto & to allow moving from BBAddrMap objects. (llvm#79456) std::move on `const auto &` references is essentially a noop. Changing to `auto &&` to actually allow moving.
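A small sketch of why this matters, using a hypothetical `Item` type rather than `BBAddrMap`: `std::move` applied to a `const` reference produces a `const Item &&`, which cannot bind to the move constructor, so overload resolution silently picks the copy constructor.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical element type that records whether it has been moved from.
struct Item {
  int value = 0;
  bool was_moved_from = false;
  explicit Item(int v) : value(v) {}
  Item(const Item &other) : value(other.value) {}
  Item(Item &&other) noexcept : value(other.value) {
    other.was_moved_from = true;
  }
};

static std::vector<Item> make_items(int n) {
  std::vector<Item> items;
  items.reserve(n);
  for (int i = 0; i < n; ++i)
    items.emplace_back(i);
  return items;
}

static int count_moved_from(const std::vector<Item> &items) {
  int moved = 0;
  for (const Item &item : items)
    moved += item.was_moved_from ? 1 : 0;
  return moved;
}

// `const auto &`: std::move yields const Item&&, so each element is copied.
int moved_via_const_ref(int n) {
  std::vector<Item> src = make_items(n), dst;
  for (const auto &item : src)
    dst.push_back(std::move(item));
  return count_moved_from(src);
}

// `auto &&`: binds as Item&, std::move yields Item&&, so elements are moved.
int moved_via_forwarding_ref(int n) {
  std::vector<Item> src = make_items(n), dst;
  for (auto &&item : src)
    dst.push_back(std::move(item));
  return count_moved_from(src);
}
```

Only the source vector is inspected, so any moves that `dst` performs internally while growing do not affect the counts.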

attempts, NFC. If several iterations of reordering of orders are required, a different algorithm needs to be used.

llvm#79481) We used to support a /branch comment to specify a branch with commits to backport to the release branch. However, now that we can use pull requests this is not needed. This also simplifies the process, because now the cherry-pick job can create the pull request directly instead of having it split across two separate jobs.

Adds tests for llvm-readobj PGOBBAddrMap. Updates readobj for the redesign with PGOAnalysisMap. Updates tests after moving PGO analyses to after each function. Updates readobj with bitfield features.

Add custom combine to lower load <3 x i8> as the more efficient sequence below:

ldrb wX, [x0, #2]
ldrh wY, [x0]
orr wX, wY, wX, lsl #16
fmov s0, wX

At the moment, there are almost no cases in which such vector operations will be generated automatically. The motivating case is non-power-of-2 SLP vectorization: llvm#77790

red1bluelost mentioned this pull request Nov 9, 2023

Implements PGOBBAddrMap in Object and ObjectYAML with tests [1/5] llvm/llvm-project#71750

Merged

red1bluelost changed the title ~~Implements llvm-readobj with tests.~~ Implements llvm-readobj with tests. [3/5] Nov 9, 2023

red1bluelost mentioned this pull request Nov 9, 2023

Implements PGOBBAddrMap in AsmPrinter with tests [2/5] #1

Closed

red1bluelost force-pushed the pgo-bb-addr-map--asm-printer branch 2 times, most recently from f0518af to b3d7e15 Compare November 21, 2023 01:13

red1bluelost force-pushed the pgo-bb-addr-map--llvm-readobj branch from 8573a87 to d8cb9c1 Compare November 22, 2023 17:19

red1bluelost force-pushed the pgo-bb-addr-map--asm-printer branch from 0997d5f to f3a651a Compare November 22, 2023 23:07

red1bluelost force-pushed the pgo-bb-addr-map--llvm-readobj branch from d8cb9c1 to 4535093 Compare November 22, 2023 23:25

red1bluelost force-pushed the pgo-bb-addr-map--asm-printer branch from f3a651a to c3e8eb7 Compare November 30, 2023 13:19

red1bluelost force-pushed the pgo-bb-addr-map--llvm-readobj branch from 4535093 to eb20d0e Compare November 30, 2023 13:27

red1bluelost force-pushed the pgo-bb-addr-map--asm-printer branch from c3e8eb7 to b08d416 Compare December 12, 2023 15:35

red1bluelost mentioned this pull request Dec 12, 2023

[SHT_LLVM_BB_ADDR_MAP][AsmPrinter] Implements PGOAnalysisMap emitting in AsmPrinter with tests. llvm/llvm-project#75202

Merged

red1bluelost force-pushed the pgo-bb-addr-map--asm-printer branch from b7e6ea6 to e365057 Compare December 14, 2023 16:08

red1bluelost force-pushed the pgo-bb-addr-map--llvm-readobj branch from eb20d0e to aa6a500 Compare January 4, 2024 00:55

red1bluelost force-pushed the pgo-bb-addr-map--llvm-readobj branch from aa6a500 to 10fed1a Compare January 8, 2024 20:07

banach-space and others added 8 commits January 24, 2024 08:18

[mlir][vector] Support scalable vec in `TransferReadAfterWriteToBroad…

d50705e

…cast` (llvm#79162) Makes `TransferReadAfterWriteToBroadcast` correctly propagate scalability flags.

[RISCV] Add tests for reverse shuffles of i1 vectors. NFC

f3b495f

This is to add test coverage for a change in llvm#73342

[mlir][Bazel] Add missing dependency after 750e90e

3446601

[Driver] Use StringRef::consume_front (NFC)

5404a37

[Transforms] Use llvm::pred_size and llvm::predecessors (NFC)

873a7bb

[AMDGPU] Use llvm::none_of (NFC)

18a3c7a

[DebugInfo] Use std::size (NFC)

b0763a1

[X86][CodeGen] Fix crash when commute operands of Instruction for cod…

33ecef9

…e size (llvm#79245) Reported in 134fcc6 Incorrect opcode is used b/c there is a `[[fallthrough]]` at line 2386.

MaskRay and others added 27 commits January 25, 2024 10:17

[ELF] Fix terminology: TLS optimizations instead of TLS relaxation. NFC

849951f

[libc++] Add base for LLVM 19 release notes (llvm#78990)

2550ce4

This will make it easier for folks who have patches that are not targeting LLVM 18 -- they can write the release notes in the LLVM 19 release notes immediately.

[libc] Fix type warning on gcc in float to str (llvm#79482)

8a0ff19

Minor cast warning that was missed in previous patch. Fixed with explicit cast.

[lldb][NFCI] Remove unused method BreakpointIDList::FindBreakpointID(…

59a6525

…const char *, size_t *) (llvm#79215)

Apply clang-tidy fixes for llvm-qualified-auto in MLProgramOps.cpp (NFC)

b9f7371

Apply clang-tidy fixes for llvm-qualified-auto in PipelineGlobalOps.c…

9cebb28

…pp (NFC)

Apply clang-tidy fixes for readability-identifier-naming in PipelineG…

4e71335

…lobalOps.cpp (NFC)

Apply clang-tidy fixes for bugprone-macro-parentheses in NVGPUTransfo…

70fdaef

…rmOps.cpp (NFC)

Apply clang-tidy fixes for performance-unnecessary-value-param in NVG…

a7759fb

…PUTransformOps.cpp (NFC)

[RISCV] Add Tune to DontSinkSplatOperands (llvm#79199)

594b92a

A CPU may prefer to not sink splat operands, one reason being that it could require a S2V transfer buffer to move scalars into buffers.

[libc] Add fminf128 and fmaxf128 implementations for Linux x86_64. (l…

0b0cce8

…lvm#79307) Co-authored-by: Felix <felix@Dirks-MacBook-Pro.local>

[mlir][sparse] fix mismatch between enter/exitWhileLoop (llvm#79493)

982c815

[llvm-objdump,SHT_LLVM_BB_ADDR_MAP,NFC] Use auto && instead of const …

313d33e

…auto & to allow moving from BBAddrMap objects. (llvm#79456) std::move on `const auto &` references is essentially a noop. Changing to `auto &&` to actually allow moving.

[SLP][NFC]Improve BottomTopTop reordering of orders for multi-iterations

92ae2ca

attempts, NFC. If several iterations of reordering of orders are required, a different algorithm needs to be used.

Implements llvm-readobj handling without tests yet.

4d78bde

Adds tests for llvm-readobj PGOBBAddrMap. Updates readobj for the redesign with PGOAnalysisMap. Updates tests after moving PGO analyses to after each function. Updates readobj with bitfield features.

red1bluelost force-pushed the pgo-bb-addr-map--llvm-readobj branch from 10fed1a to 4d78bde Compare January 25, 2024 22:32

red1bluelost closed this Jan 25, 2024
