Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SEGFAULT when using both libassert::stringify and cpptrace #103

Closed
MatthiasJ1 opened this issue Sep 10, 2024 · 7 comments
Closed

SEGFAULT when using both libassert::stringify and cpptrace #103

MatthiasJ1 opened this issue Sep 10, 2024 · 7 comments
Labels
bug Something isn't working

Comments

@MatthiasJ1
Copy link

Compile command

g++ -g -std=c++20 main.cpp -lassert -lcpptrace -ldwarf

Header for all 3 snippets

#include <vector>
#include <string>
#include <iostream>
#include <libassert/assert.hpp>
using namespace std;

Using libassert::stringify

int main() {
    vector<string> a = {"hi"};
    cout << libassert::stringify(a) << endl;
}
> ./a.out                                                                                                                        
std::vector<std::string>: ["hi"]

Using cpptrace

int main() {
    vector<string> a = {"hi"};
    cout << cpptrace::generate_trace().frames[0].to_string() << endl;
}
> ./a.out                                                                                                                        
0x0000000100a01ce3 in main at main.cpp:10:13

Using both

int main() {
    vector<string> a = {"hi"};
    cout << libassert::stringify(a) << endl;
    cout << cpptrace::generate_trace().frames[0].to_string() << endl;
}
> ./a.out                                                                                                                        
std::vector<std::string>: ["hi"]
zsh: segmentation fault  ./a.out
@jeremy-rifkin
Copy link
Owner

Hi, thanks for opening this. Any segfaults are a high priority bug. I wasn't able to reproduce locally with the following setup:

// in a cpptrace clone
git checkout v0.7.0
mkdir build && cd build
cmake .. -DCMAKE_BUILD_TYPE=Debug -DCMAKE_EXPORT_COMPILE_COMMANDS=On -DCPPTRACE_BUILD_TESTING=On -DCMAKE_PREFIX_PATH=~/foobar -DCMAKE_INSTALL_PREFIX=~/foobar -GNinja && ninja install
// in a libassert clone
git checkout v2.1.0
mkdir build && cd build
cmake .. -DCMAKE_BUILD_TYPE=Debug -DCMAKE_EXPORT_COMPILE_COMMANDS=On -DCPPTRACE_BUILD_TESTING=On -DCMAKE_PREFIX_PATH=~/foobar -DCMAKE_INSTALL_PREFIX=~/foobar -DLIBASSERT_USE_EXTERNAL_CPPTRACE=On -GNinja

Then I compiled the test program using both with g++ test.cpp -isystem ~/foobar/include -L ~/foobar/lib -Wall -lassert -lcpptrace -ldwarf -lzstd -lz

› ./a.out
std::vector<std::string>: ["hi"]
0x0000555745304b63 at /path/to/repro/a.out

How did you compile and install libassert and cpptrace? What versions?

One thing that would additionally be helpful to check is that frames[0] actually exist before using it - if you could verify with .at(0) that would be very helpful.

@MatthiasJ1
Copy link
Author

I verified that frames[0] exists and the segfault is within to_string().
I compiled everything with both clang18 and clang 15 from Conda-Forge on macOS M1 and got the same results.


cpptrace
url: https://github.com/jeremy-rifkin/cpptrace/archive/refs/tags/v0.7.0.tar.gz
sha256: b5c1fbd162f32b8995d9b1fefb1b57fac8b1a0e790f897b81cdafe3625d12001

mkdir build && cd build
cmake .. -GNinja -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=$PREFIX -DCPPTRACE_USE_EXTERNAL_LIBDWARF=ON
ninja install

libassert
url: https://github.com/jeremy-rifkin/libassert/archive/refs/tags/v2.1.0.tar.gz
sha256: e42405b49cde017c44c78aacac35c6e03564532838709031e73d10ab71f5363d

mkdir build && cd build
cmake .. -GNinja -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=$PREFIX -DLIBASSERT_USE_EXTERNAL_CPPTRACE=ON -DLIBASSERT_USE_EXTERNAL_MAGIC_ENUM=ON
ninja install

I can give you the full Conda build config files if you'd like and I can also possibly debug this at some point on my machine but I don't have time at the moment.

Here is the build environment for libassert with clang18:

- bzip2 1.0.8
- c-ares 1.33.1
- ca-certificates 2024.8.30
- cctools_osx-arm64 1009.2
- clang 18.1.8
- clang-18 18.1.8
- clang_impl_osx-arm64 18.1.8
- clang_osx-arm64 18.1.8
- clangxx 18.1.8
- clangxx_impl_osx-arm64 18.1.8
- clangxx_osx-arm64 18.1.8
- cmake 3.30.3
- compiler-rt 18.1.8
- compiler-rt_osx-arm64 18.1.8
- icu 75.1
- krb5 1.21.3
- ld64_osx-arm64 907
- libclang-cpp18.1 18.1.8
- libcurl 8.9.1
- libcxx 18.1.8
- libcxx-devel 18.1.8
- libedit 3.1.20191231
- libev 4.33
- libexpat 2.6.3
- libiconv 1.17
- libllvm18 18.1.8
- libnghttp2 1.58.0
- libssh2 1.11.0
- libuv 1.48.0
- libxml2 2.12.7
- libzlib 1.3.1
- llvm-tools 18.1.8
- llvm-tools-18 18.1.8
- ncurses 6.5
- ninja 1.12.1
- openssl 3.3.2
- rhash 1.4.4
- sigtool 0.1.3
- tapi 1300.6.5
- xz 5.2.6
- zstd 1.5.6
- cpptrace 0.7.0
- libdwarf 0.11.0
- magic_enum 0.9.6

@jeremy-rifkin
Copy link
Owner

jeremy-rifkin commented Sep 12, 2024

Thanks so much, I can reproduce on an M1 Mac. I'm pasting my script below, mainly for my own remembrance. I have clang 15 and used external libdwarf, but looks like that didn't make a difference.

mkdir test && cd test

git clone https://github.com/jeremy-rifkin/cpptrace.git
cd cpptrace
git checkout v0.7.0
mkdir build && cd build
cmake .. -GNinja -DCMAKE_BUILD_TYPE=Release -DCMAKE_PREFIX_PATH=~/foobar -DCMAKE_INSTALL_PREFIX=~/foobar -DCPPTRACE_USE_EXTERNAL_LIBDWARF=Off && ninja install
cd ../..

// in a libassert clone
git clone https://github.com/jeremy-rifkin/libassert.git
cd libassert
git checkout v2.1.0
mkdir build && cd build
cmake .. -GNinja -DCMAKE_BUILD_TYPE=Release -DCMAKE_PREFIX_PATH=~/foobar -DCMAKE_INSTALL_PREFIX=~/foobar -DLIBASSERT_USE_EXTERNAL_CPPTRACE=On -DLIBASSERT_USE_EXTERNAL_MAGIC_ENUM=On && ninja install
cd ../..

mkdir test && cd test
cat <<eos > test.cpp
#include <vector>
#include <string>
#include <iostream>
#include <libassert/assert.hpp>
using namespace std;

int main() {
    vector<string> a = {"hi"};
    cout << libassert::stringify(a) << endl;
    cout << cpptrace::generate_trace().frames.at(0).to_string() << endl;
}
eos

clang++ test.cpp -isystem ~/foobar/include -L ~/foobar/lib -Wall -lassert -lcpptrace -ldwarf -lzstd -lz -std=c++17 -g

Edit: I was able to reproduce with -DCMAKE_BUILD_TYPE=Debug too, that'll make a big difference for triaging this.

Edit: And sanitizers too, fantastic

@jeremy-rifkin
Copy link
Owner

There's something really spooky going on here. Some initial triage:

There are a series of string format calls, and one fails but the others are fine. The failing one is https://github.com/jeremy-rifkin/cpptrace/blob/0742b42dadaac62436cb226a7d084738a8f82d1a/src/cpptrace.cpp#L147

                str += microfmt::format(":{}", line.value());

That calls https://github.com/jeremy-rifkin/cpptrace/blob/0742b42dadaac62436cb226a7d084738a8f82d1a/src/utils/microfmt.hpp#L281

    template<typename... Args>
    std::string format(const char* fmt, Args&&... args) {
        return detail::format<sizeof...(args)>(fmt, fmt + std::strlen(fmt), {detail::format_value(args)...});
    }

Which calls https://github.com/jeremy-rifkin/cpptrace/blob/0742b42dadaac62436cb226a7d084738a8f82d1a/src/utils/microfmt.hpp#L189

        template<std::size_t N, typename It>
        std::string format(It fmt_begin, It fmt_end, std::array<format_value, N> args) {

It's taking an array by value, there should be no possible way to see array.data() be a pointer in the null page here... But lldb shows that it is. And some printbugging of array.data():

-----------------0x16bd7a560
-----------------0x107f01320
-----------------0x107f01308
-----------------0xa

With the 0xa array.data() originating from the str += microfmt::format(":{}", line.value()); call.

One thing here is that my two libraries, libassert and cpptrace, use this microfmt code, which I'm realizing is an ODR problem. Between cpptrace 0.7.0 and libassert 2.1.0 the files match verbatim. This I don't think is ODR, but I'm going to at least namespace them differently to prevent future problems.

I'm stumped as to the cause here, I'm going to have to pick things apart more in-depth.

@jeremy-rifkin
Copy link
Owner

jeremy-rifkin commented Sep 12, 2024

Ah, I understand. The format_value class differs pre/post-c++17. I’ll fix this tomorrow and create patch releases for both libraries. This is an embarrassing mistake; thank you so much for the thorough bug report!

Edit to elaborate more, in case my future self or anyone else stumbles across this again: The smoking gun is https://godbolt.org/z/shbfxhbha. The std::string_view changes how the array of format_values is passed and that's a very troubling ABI break. I don't know how it ever worked on x86.

@MatthiasJ1
Copy link
Author

Awesome, thanks for writing/maintaining these libraries!

jeremy-rifkin added a commit that referenced this issue Sep 13, 2024
@jeremy-rifkin jeremy-rifkin added bug Something isn't working resolved in next release labels Sep 13, 2024
jeremy-rifkin added a commit that referenced this issue Sep 13, 2024
jeremy-rifkin added a commit to jeremy-rifkin/cpptrace that referenced this issue Sep 13, 2024
jeremy-rifkin added a commit that referenced this issue Sep 13, 2024
@jeremy-rifkin
Copy link
Owner

I've released v2.1.1 with a fix for this

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants