Bugfix for fmt::printf on Power9 architecture with the XL compiler #3256

Merged
vitaut merged 8 commits into fmtlib:master on Jan 13, 2023

Conversation

kennyweiss
Contributor

I agree that my contributions are licensed under the {fmt} license, and agree to future changes to the licensing.

This PR works around a bug in fmt::printf() encountered with the IBM XL compiler on the Power9 architecture (e.g. on the Lassen supercomputer).

The issue appears to be that XL is not copying the format_specs struct in the printf_arg_formatter constructor to the arg_formatter base class.

fmt/include/fmt/printf.h

Lines 237 to 238 in a73a9b6

  printf_arg_formatter(OutputIt iter, format_specs<Char>& s, context_type& ctx)
      : base{iter, s, locale_ref()}, context_(ctx) {}
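
For readers unfamiliar with the pattern, here is a minimal sketch of what this constructor relies on. The types `specs_t`, `locale_t`, `base_formatter` and `derived_formatter` are hypothetical stand-ins, not fmt's real definitions. A derived constructor brace-initializes a base that holds a reference member, and on a conforming compiler the reference ends up bound to the caller's object while the locale handle is default-constructed (null).

```cpp
#include <cassert>

// Hypothetical stand-ins for format_specs and locale_ref (not fmt's types).
struct specs_t { int width = 0; };
struct locale_t { const void* impl = nullptr; };

// Aggregate base with a reference member, loosely mirroring arg_formatter.
struct base_formatter {
  int out;
  const specs_t& specs;
  locale_t locale;
};

// Derived class that brace-initializes its base in the mem-initializer list,
// in the same style as the printf_arg_formatter constructor quoted above.
struct derived_formatter : base_formatter {
  int& context;
  derived_formatter(int iter, specs_t& s, int& ctx)
      : base_formatter{iter, s, locale_t()}, context(ctx) {}
};

// Returns true when the base subobject holds exactly what was passed in.
inline bool base_is_initialized() {
  specs_t s;
  s.width = 7;
  int ctx = 0;
  derived_formatter f(1, s, ctx);
  return f.out == 1 && &f.specs == &s && f.specs.width == 7 &&
         f.locale.impl == nullptr;
}
```

If the base initializer were skipped, as the XL build appears to do, `specs` and `locale` would be left holding indeterminate values, which would be consistent with a crash the first time they are read.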

This was causing the printf tests to segfault, e.g.:

EXPECT_EQ("42", test_sprintf("%1$d", 42));

I resolved the problem by adding an explicit constructor to the arg_formatter base class.

This regression appears to have been introduced between fmt@7.1.3 and fmt@8.0.0.

I tested this using the following compiler/config options:

set(CMAKE_C_COMPILER "/usr/tce/packages/xl/xl-2022.08.19/bin/xlc" CACHE PATH "")
set(CMAKE_CXX_COMPILER "/usr/tce/packages/xl/xl-2022.08.19/bin/xlC" CACHE PATH "")
set(CMAKE_C_FLAGS "--gcc-toolchain=/usr/tce/packages/gcc/gcc-8.3.1" CACHE STRING "")
set(CMAKE_CXX_FLAGS "--gcc-toolchain=/usr/tce/packages/gcc/gcc-8.3.1" CACHE STRING "")

set(CMAKE_CXX_STANDARD "14" CACHE STRING "")
set(CMAKE_BUILD_TYPE "Debug" CACHE STRING "")
set(CMAKE_SHARED_LINKER_FLAGS "${CMAKE_SHARED_LINKER_FLAGS} -Wl,-rpath,/usr/tce/packages/xl/xl-2022.08.19/lib" CACHE STRING "Adds a missing rpath for libraries associated with the fortran compiler")

set(FMT_HEADER_ONLY ON CACHE BOOL "")
set(FMT_TEST ON CACHE BOOL "")

Note: There are a few other failing tests with this compiler. I will post separate issues for them.

Contributor

@vitaut vitaut left a comment

Thanks for the PR, but it looks like it doesn't work in some CI configs. Also, since this is a workaround for printf, it would be better to put it there (i.e. initialize the members in printf_arg_formatter's ctor).

@kennyweiss
Contributor Author

Thanks @vitaut.

Since the specs member of the arg_formatter base class is a reference, I couldn't find a way to initialize it in printf_arg_formatter other than through a constructor in the base class. If you know of a better way, I'd be very interested in hearing about it.
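
To make the constraint concrete, here is a small sketch with hypothetical names (`ref_base`, `ptr_base` and friends are illustrations, not fmt code). A reference member has to be bound by the initializer of the class that declares it, so a derived class can only reach it by forwarding to a base constructor or by aggregate-initializing the base. A pointer member, by contrast, can be assigned from the derived constructor body.

```cpp
#include <cassert>

// Hypothetical stand-in for format_specs.
struct specs_t { int width = 0; };

// Base holding a reference: it can only be bound when the base is initialized.
struct ref_base {
  const specs_t& specs;
  explicit ref_base(const specs_t& s) : specs(s) {}
};

struct ref_derived : ref_base {
  // The derived class cannot write `specs = ...` in its body to bind the
  // reference; it has to forward to a base constructor.
  explicit ref_derived(const specs_t& s) : ref_base(s) {}
};

// Base holding a pointer: the derived constructor body may assign it later.
struct ptr_base {
  const specs_t* specs = nullptr;
};

struct ptr_derived : ptr_base {
  explicit ptr_derived(const specs_t& s) { specs = &s; }
};

// Returns true when both variants end up referring to the caller's object.
inline bool both_see_same_specs() {
  specs_t s;
  s.width = 3;
  ref_derived r(s);
  ptr_derived p(s);
  return &r.specs == &s && p.specs == &s;
}
```

The pointer variant trades the reference's "always bound" guarantee for the freedom to initialize late, which is the kind of trade-off at stake in this thread.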

(I think) I resolved the error with gcc-4.8 by adding a second constructor that does not require the temporary default locale_ref.

kennyweiss added a commit to LLNL/axom that referenced this pull request Jan 3, 2023
@vitaut
Contributor

vitaut commented Jan 4, 2023

Please submit a bug report to IBM. Could you also post the error here for future reference?

@kennyweiss
Contributor Author

kennyweiss commented Jan 4, 2023

Thanks @vitaut -- I've created a simple reproducer and will submit to IBM.

Description of the error:

The code in fmt@master compiles with XLC, but segfaults when running the printf unit tests.

In case it helps, here's the stacktrace from gdb:

>gdb ./bin/printf-test

GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-120.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "ppc64le-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from <fmt>/build-xl/bin/printf-test...done.
(gdb) run
Starting program: <fmt>/build-xl/bin/printf-test
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
[==========] Running 38 tests from 1 test suite.
[----------] Global test environment set-up.
[----------] 38 tests from printf_test
[ RUN      ] printf_test.no_args
[       OK ] printf_test.no_args (0 ms)
[ RUN      ] printf_test.escape
[       OK ] printf_test.escape (0 ms)
[ RUN      ] printf_test.positional_args

Program received signal SIGSEGV, Segmentation fault.
0x0000200000178e3c in __dynamic_cast () from /lib64/libstdc++.so.6
Missing separate debuginfos, use: debuginfo-install libgcc-4.8.5-44.el7.ppc64le libstdc++-4.8.5-44.el7.ppc64le
(gdb) bt
#0  0x0000200000178e3c in __dynamic_cast () from /lib64/libstdc++.so.6
#1  0x0000000010195c14 in std::has_facet<fmt::v9::format_facet<std::locale> > (__loc=...) at /usr/tce/packages/gcc/gcc-8.3.1/rh/usr/bin/../lib/gcc/ppc64le-redhat-linux/8/../../../../include/c++/8/bits/locale_classes.tcc:110
#2  0x00000000101a14e8 in fmt::v9::detail::write_loc (out=..., value=..., specs=..., loc=...) at <fmt>/include/fmt/format-inl.h:126
#3  0x000000001007a218 in fmt::v9::detail::write<char, fmt::v9::appender, int, 0> (out=..., value=42, specs=..., loc=...) at <fmt>/include/fmt/format.h:2170
#4  0x000000001007a3ec in fmt::v9::detail::arg_formatter<char>::operator()<int> (this=0x7fffffffad80, value=42) at <fmt>/include/fmt/format.h:3588
#5  0x000000001007a4b0 in fmt::v9::detail::printf_arg_formatter<fmt::v9::appender, char>::operator()<int, 0> (this=0x7fffffffad80, value=42) at <fmt>/include/fmt/printf.h:261
#6  0x0000000010090c14 in fmt::v9::visit_format_arg<fmt::v9::detail::printf_arg_formatter<fmt::v9::appender, char>, fmt::v9::basic_printf_context<fmt::v9::appender, char> >(fmt::v9::detail::printf_arg_formatter<fmt::v9::appender, char>&&, fmt::v9::basic_format_arg<fmt::v9::basic_printf_context<fmt::v9::appender, char> > const&) ( vis=<unknown type in <fmt>/build-xl/bin/printf-test, CU 0x0, DIE 0x86298>, arg=...) at <fmt>/include/fmt/core.h:1643
#7  0x00000000100a4b64 in fmt::v9::detail::vprintf<char, fmt::v9::basic_printf_context<fmt::v9::appender, char> > (buf=warning: RTTI symbol not found for class 'fmt::v9::basic_memory_buffer<char, 500ul, std::allocator<char> >' ..., format=..., args=...) at <fmt>/include/fmt/printf.h:512
#8  0x00000000100a4de0 in fmt::v9::vsprintf<fmt::v9::basic_string_view<char>, char> (fmt=..., args=...) at <fmt>/include/fmt/printf.h:559
#9  0x00000000100b66bc in fmt::v9::sprintf<fmt::v9::basic_string_view<char>, int, char> (fmt=..., args=@0x7fffffffb4c0: 42) at <fmt>/include/fmt/printf.h:576
#10 0x00000000100b678c in test_sprintf<int> (format=..., args=@0x7fffffffb4c0: 42) at <fmt>/test/printf-test.cc:41
#11 0x000000001018f124 in printf_test_positional_args_Test::TestBody (this=0x10404990) at <fmt>/test/printf-test.cc:73
#12 0x0000000010225680 in testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void> (object=0x10404990, method=&virtual testing::Test::TestBody(), location=0x102fd2d4 "the test body" at <fmt>/test/gtest/gmock-gtest-all.cc:4098
#13 0x00000000102257a0 in testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void> (object=0x10404990, method=&virtual testing::Test::TestBody(), location=0x102fd2d4 "the test body") at <fmt>/test/gtest/gmock-gtest-all.cc:4134
#14 0x00000000102d067c in testing::Test::Run (this=0x10404990) at <fmt>/test/gtest/gmock-gtest-all.cc:4173
#15 0x00000000102d09a8 in testing::TestInfo::Run (this=0x10400990) at <fmt>/test/gtest/gmock-gtest-all.cc:4352
#16 0x00000000102d0c38 in testing::TestSuite::Run (this=0x104005b0) at <fmt>/test/gtest/gmock-gtest-all.cc:4506
#17 0x00000000102d1a48 in testing::internal::UnitTestImpl::RunAllTests (this=0x104001c0) at <fmt>/test/gtest/gmock-gtest-all.cc:7346
#18 0x0000000010223330 in testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool> (object=0x104001c0, method=(bool (UnitTestImpl::*)(UnitTestImpl * const)) 0x102d1590 <testing::internal::UnitTestImpl::RunAllTests()>, location=0x102fd46c "auxiliary test code (environments or event listeners)") at <fmt>/test/gtest/gmock-gtest-all.cc:4098
#19 0x00000000102234d4 in testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool> (object=0x104001c0, method=(bool (UnitTestImpl::*)(UnitTestImpl * const)) 0x102d1590 <testing::internal::UnitTestImpl::RunAllTests()>, location=0x102fd46c "auxiliary test code (environments or event listeners)") at <fmt>/test/gtest/gmock-gtest-all.cc:4134
#20 0x00000000102d20b4 in testing::UnitTest::Run (this=0x103f11e0 <_$STATIC+10160>) at <fmt>/test/gtest/gmock-gtest-all.cc:6929
#21 0x00000000101912e0 in RUN_ALL_TESTS () at <fmt>/test/gtest/./gtest/gtest.h:12393
#22 0x00000000101913b8 in main (argc=1, argv=0x7fffffffc278) at <fmt>/test/test-main.cc:38
(gdb)

Comment on lines 3586 to 3587
  arg_formatter(buffer_appender<Char> it, const format_specs<Char>& s)
      : out(it), specs(s) {}
Contributor

This constructor is redundant, please remove.

Contributor

This is a workaround for gcc-4.8.
#3256 (comment)

Contributor

@vitaut vitaut Jan 6, 2023

Ah, missed that.

@vitaut
Contributor

vitaut commented Jan 9, 2023

I wonder if there is a cleaner workaround that doesn't involve introducing redundant ctors. Would using a function that constructs arg_formatter work? Something along the lines of:

  auto make_arg_formatter(OutputIt iter, format_specs<Char>& s) -> arg_formatter<Char> {
    return {iter, s, locale_ref()};
  }

  printf_arg_formatter(OutputIt iter, format_specs<Char>& s, context_type& ctx)
      : base(make_arg_formatter(iter, s)), context_(ctx) {}

Then we can keep the workaround localized to where the compiler bug occurs.
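
A self-contained sketch of this factory-function idea follows. All names in it (`specs_t`, `locale_t`, `base_formatter`, `make_base_formatter`, `derived_formatter`) are hypothetical stand-ins, and it is not the code that was eventually merged. The point is only that the aggregate is built inside an ordinary function and the derived constructor then copy-constructs its base from the result, so the base type needs no extra constructors and keeps its reference member.

```cpp
#include <cassert>

// Hypothetical stand-ins for format_specs and locale_ref (not fmt's types).
struct specs_t { int width = 0; };
struct locale_t { const void* impl = nullptr; };

// Aggregate base with a reference member; no user-declared constructors.
struct base_formatter {
  int out;
  const specs_t& specs;
  locale_t locale;
};

// Free factory: the aggregate initialization happens here, in a plain
// function, rather than in the derived class's mem-initializer list.
inline base_formatter make_base_formatter(int iter, const specs_t& s) {
  return {iter, s, locale_t()};
}

struct derived_formatter : base_formatter {
  int& context;
  // The base is initialized from the factory's result through the implicit
  // copy constructor.
  derived_formatter(int iter, specs_t& s, int& ctx)
      : base_formatter(make_base_formatter(iter, s)), context(ctx) {}
};

// Returns true when the base still refers to the caller's specs object.
inline bool factory_binds_specs() {
  specs_t s;
  s.width = 5;
  int ctx = 0;
  derived_formatter f(2, s, ctx);
  return f.out == 2 && &f.specs == &s && f.locale.impl == nullptr &&
         &f.context == &ctx;
}
```

Because the reference inside the returned aggregate is bound to the caller's `specs_t`, copying the aggregate into the base keeps it pointing at the same object.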

@kennyweiss
Contributor Author

Thanks @vitaut
I reworked the branch to convert arg_formatter::specs to a const pointer and added a guarded bugfix for the XL compiler in the printf_arg_formatter constructor.

Would you like me to rework it using the above suggestion?

@vitaut
Contributor

vitaut commented Jan 10, 2023

Would you like me to rework it using the above suggestion?

Yes, please. Using a "factory" function is cleaner because it will allow us to keep the reference while still keeping the workaround localized to printf implementation.

@kennyweiss
Contributor Author

Thanks for the suggestion @vitaut -- I think we're getting there!
XL seems happy with the explicit copy constructor for the arg_formatter base class and so does my local gcc-4.8.

Comment on lines 3596 to 3599
  static auto make_arg_formatter(iterator iter, format_specs<Char>& s)
      -> arg_formatter {
    return {iter, s, locale_ref()};
  }
Contributor

Please move this function to printf.h where it is used (it doesn't need to be in arg_formatter) and add a comment that it is a workaround for the XL compiler bug, with a link to the bug report if possible.

Contributor Author

Done.

Unfortunately, the IBM ticket is not publicly accessible.

This avoids the need to pass a temporary (default) locale_ref.
Use this within printf_arg_formatter constructor as workaround
for XL compiler bug that optimizes away base class initializer.

Also: Reverts conversion of internal `specs` variable back to a const ref

Per PR suggestion.
@vitaut vitaut merged commit bfc0924 into fmtlib:master Jan 13, 2023
@vitaut
Contributor

vitaut commented Jan 13, 2023

Merged, thanks. Please comment here once you know which version of the XL compiler fixes the issue.

@kennyweiss
Contributor Author

Thanks -- will do.
