Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix code causing spurious Wstringop-overflow warning #3333

Closed
wants to merge 2 commits into from

Conversation

codeinred
Copy link
Contributor

Despite attempts to disable -Wstringop-overflow for tests, building the library from source still results in spurious Wstringop-overflow warnings. This occurs both for downstream consumers (eg, those using CMake Package Manager), and for people building the library from scratch.

Rather than attempting to disable the warning, I've modified the code such that the compiler is able to "figure out" that fmt::detail::code_point_length returns a value between 1 and 4, inclusive. The change also allows using bitshifts to calculate the codepoint length, which avoids a second array access.

While the Wstringop-overflow warning was spurious in this case, the new code should be more performant, and downstream consumers will no longer have the headache of dealing with a spurious warning when compiling their code.

See Issue #2989, as well as #3054.

@vitaut
Copy link
Contributor

vitaut commented Mar 3, 2023

Please provide a godbolt repro that demonstrates the warning. The link in the referenced issue doesn't actually show any warnings.

@codeinred
Copy link
Contributor Author

codeinred commented Mar 3, 2023

Here's a reproduction that demonstrates the warning. It seems to occur when compiling any code that prints vectors or other ranges.

The code needs to be compiled at -O3, because when compiled in debug mode, the FMT_ASSERT statements are left in, and these do bounds checks.

Here is the code:

#include <fmt/core.h>
#include <fmt/ranges.h>
#include <vector>

void print_vec(std::vector<int> nums) {
    fmt::print("{}\n", nums);
}

Edit: The initial reproduction works on fmt 9.1.0. This example is reproducible on fmt trunk as well, provided that -DNDEBUG is passed.

Comment on lines +2212 to +2222
// Returns the length of a codepoint. Returns 1 for invalid codepoints.
// This is equivalent to
//
// int len = code_point_length_impl(c);
// return len + !len;
//
// This is useful because it allows the compiler to check that the
// length is within the range [1, 4]
FMT_CONSTEXPR inline auto code_point_length_impl_2(char c) -> int {
return static_cast<int>((0x3a55000000000000ull >> (2 * (static_cast<unsigned char>(c) >> 3))) & 0x3) + 1;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need both code_point_length_impl and code_point_length_impl_2, let's merge this logic into code_point_length_impl and make the latter take unsigned char to have one case instead of 3.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On a second thought, this logic can be moved directly into code_point_length and both impl functions removed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The return value of code_point_length is used as len and as len + !len.

First case (first and last lines):

fmt/include/fmt/format.h

Lines 654 to 664 in 3daf338

int len = code_point_length_impl(*s);
// Compute the pointer to the next character early so that the next
// iteration can start working on the next character. Neither Clang
// nor GCC figure out this reordering on their own.
const char* next = s + len + !len;
using uchar = unsigned char;
// Assume a four-byte character and load four bytes. Unused bits are
// shifted out.
*c = uint32_t(uchar(s[0]) & masks[len]) << 18;

Second case:

fmt/include/fmt/core.h

Lines 2215 to 2223 in 3daf338

FMT_CONSTEXPR auto code_point_length(const Char* begin) -> int {
if (const_check(sizeof(Char) != 1)) return 1;
int len = code_point_length_impl(static_cast<char>(*begin));
// Compute the pointer to the next character early so that the next
// iteration can start working on the next character. Neither Clang
// nor GCC figure out this reordering on their own.
return len + !len;
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, right. Then I think we should merge code_point_length_impl into utf8_decode and code_point_length_impl_2 into code_point_length. Unless, of course, there is a similar false positive in utf8_decode.

// This is useful because it allows the compiler to check that the
// length is within the range [1, 4]
FMT_CONSTEXPR inline auto code_point_length_impl_2(char c) -> int {
return static_cast<int>((0x3a55000000000000ull >> (2 * (static_cast<unsigned char>(c) >> 3))) & 0x3) + 1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this logic deserves some explanation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi! Sorry for the delay.

code_point_length_impl is initially implemented like this, which gets the length of a code point based on the first character in that codepoint.

MT_CONSTEXPR inline auto code_point_length_impl(char c) -> int {
  return "\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\0\0\0\0\0\0\0\0\2\2\2\2\3\3\4"
      [static_cast<unsigned char>(c) >> 3];
}

For valid codepoints, we return the length of the codepoint, and for invalid codepoints we return a length of 0.

When used in code_point_length, the length of invalid codepoints gets converted to 1, so we don't actually need to distinguish between valid and invalid codepoints for code_point_length.

As a result, we could write it like this:

MT_CONSTEXPR inline auto code_point_length_impl_2(char c) -> int {
  return "\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\2\2\2\2\3\3\4"
      [static_cast<unsigned char>(c) >> 3];
}

Now we have 4 possible return values: [1, 2, 3, 4]. If we instead use [0, 1, 2, 3] and just add one, it looks like this now:

MT_CONSTEXPR inline auto code_point_length_impl_2(char c) -> int {
  return ("\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\1\1\1\1\2\2\3"
      [static_cast<unsigned char>(c) >> 3]) + 1;
}

Because this array only contains values 0..3, each value can be represented in 2 bits. This means that the entire array fits in a 64 bit integer.

We can get that integer like this:

arr = [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,2,2,3]
result = 0
pow = 1

for x in arr:
    result += pow * x
    pow *= 4

print(f'0x{result:x}')

This gives us 0x3a55000000000000.

That logic basically indexes into this integer, treating it as an array of 2-bit values, resulting in the final implementation:

FMT_CONSTEXPR inline auto code_point_length_impl_2(char c) -> int {
  return static_cast<int>((0x3a55000000000000ull >> (2 * (static_cast<unsigned char>(c) >> 3))) & 0x3) + 1;
}

This has two benefits:

  • The compiler recognizes that the return value is always between 1 and 4, so it gets rid of the stringop-overflow warning
  • We no longer need to access memory when computing the length of a codepoint

That being said the code could definitely be less ugly.

I'll apply your recommendations with regard to merging code_point_length_impl into utf8_decode, and merging code_point_length_impl_2 into code_point_length. I haven't noticed any similar false positives in utf8_decode.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense, thanks for the explanation.

@vitaut
Copy link
Contributor

vitaut commented Mar 18, 2023

Merged with some tweaks in d9bc5f1. Thanks for the PR.

@vitaut vitaut closed this Mar 18, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants