Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Some ranges of char are misprinted or don't compile #2634

Closed
brevzin opened this issue Dec 7, 2021 · 18 comments · Fixed by #3158
Closed

Some ranges of char are misprinted or don't compile #2634

brevzin opened this issue Dec 7, 2021 · 18 comments · Fixed by #3158

Comments

@brevzin
Copy link
Contributor

brevzin commented Dec 7, 2021

First example:

#include <ranges>
#include <string>
#include <fmt/ranges.h>

int main() {
    std::string line = "a,b-c,d-e,f";
    fmt::print("{}\n", line | std::views::split(','));
}

With C++20, this prints the expected/desired:

[['a'], ['b', '-', 'c'], ['d', '-', 'e'], ['f']]

But with C++23, this prints:

[a, b-c, d-e, f]

This has something to do with the mapper hijacking the underlying range as a string view and then printing it unquoted. Not sure exactly.

Second example:

#include <ranges>
#include <string>
#include <fmt/ranges.h>


struct X {
    template <std::ranges::range R>
    X(R&& r)
        : sv(&*std::ranges::begin(r), std::ranges::distance(r))
    { }

    auto begin() { return sv.begin(); }
    auto end() { return sv.end(); }

    std::string_view sv;
};

int main() {
    std::string line = "a,b-c,d-e,f";
    auto x = line | std::views::split(',')
                  | std::views::transform([](auto part){ return X(part); })
                  ;

    fmt::print("{}\n", x);
}

On C++20, this fails to compile with (this error I don't understand):

/opt/compiler-explorer/libs/fmt/trunk/include/fmt/ranges.h:526:21: error: call of overloaded 'write<char>(fmt::v8::appender&, const X&)' is ambiguous
  526 |   return write<Char>(out, v);
      |          ~~~~~~~~~~~^~~~~~~~
In file included from /opt/compiler-explorer/libs/fmt/trunk/include/fmt/ranges.h:18,
                 from <source>:3:
/opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:2174:20: note: candidate: 'constexpr fmt::v8::enable_if_t<(((std::is_class<T>::value && (! fmt::v8::detail::is_string<T>::value)) && (! std::is_same<T, Char>::value)) && (! std::is_same<const T&, decltype (fmt::v8::detail::arg_mapper<Context>().map(value))>::value)), OutputIt> fmt::v8::detail::write(OutputIt, const T&) [with Char = char; OutputIt = fmt::v8::appender; T = X; Context = fmt::v8::basic_format_context<fmt::v8::appender, char>; fmt::v8::enable_if_t<(((std::is_class<T>::value && (! is_string<T>::value)) && (! std::is_same<T, Char>::value)) && (! std::is_same<const T&, decltype (arg_mapper<Context>().map(value))>::value)), OutputIt> = fmt::v8::appender; decltype (arg_mapper<Context>().map(value)) = unformattable_const]'
 2174 | FMT_CONSTEXPR auto write(OutputIt out, const T& value) -> enable_if_t<
      |                    ^~~~~
/opt/compiler-explorer/libs/fmt/trunk/include/fmt/format.h:2185:20: note: candidate: 'constexpr fmt::v8::enable_if_t<(fmt::v8::detail::type_constant<decltype (fmt::v8::detail::arg_mapper<Context>().map(declval<const T&>())), typename Context::char_type>::value == fmt::v8::detail::type::custom_type), OutputIt> fmt::v8::detail::write(OutputIt, const T&) [with Char = char; OutputIt = fmt::v8::appender; T = X; Context = fmt::v8::basic_format_context<fmt::v8::appender, char>; fmt::v8::enable_if_t<(type_constant<decltype (arg_mapper<Context>().map(declval<const T&>())), typename Context::char_type>::value == type::custom_type), OutputIt> = fmt::v8::appender; typename Context::char_type = char; decltype (arg_mapper<Context>().map(declval<const T&>())) = unformattable_const]'
 2185 | FMT_CONSTEXPR auto write(OutputIt out, const T& value)
      |                    ^~~~~

On C++23, this fails to compile for a different reason (this issue is because core.h checks if string_view is constructible from X, which it is, but then tries to construct it from const X&, which it's not):

/opt/compiler-explorer/libs/fmt/trunk/include/fmt/core.h:1358:12: error: no matching function for call to 'std::basic_string_view<char>::basic_string_view(const X&)'
 1358 |     return std_string_view<char_type>(val);
      |            ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This example is an attempt at reducing the actually-desired:

auto x = line | std::views::split(',') | std::views::transform(std::views::split('-'));

Which fails on the ambiguous write call.

@vitaut
Copy link
Contributor

vitaut commented Dec 27, 2021

The behavior in the first example is caused by range items becoming convertible to string_view in C++23: https://godbolt.org/z/eYvdszbof. So I guess this is expected although a bit weird and we should add escaping.

@vitaut
Copy link
Contributor

vitaut commented Dec 27, 2021

Range elements convertible to string_view are now escaped: 796662a.

    std::string line = "a,b-c,d-e,f";
    fmt::print("{}\n", line | std::views::split(','));

now produces the following output in C++23 (https://godbolt.org/z/f465jEq4s):

["a", "b-c", "d-e", "f"]

which seems reasonable.

@vitaut
Copy link
Contributor

vitaut commented Dec 27, 2021

The problem in the second example is caused by only partial support for non-const-iterable ranges. In particular, write_range_entry taking a const reference:

OutputIt write_range_entry(OutputIt out, const Arg& v) {

and a few other places.

A PR to improve support for non-const-iterable/formattable types would be welcome.

@brevzin
Copy link
Contributor Author

brevzin commented Dec 27, 2021

Range elements convertible to string_view are now escaped: 796662a.

    std::string line = "a,b-c,d-e,f";
    fmt::print("{}\n", line | std::views::split(','));

now produces the following output in C++23 (https://godbolt.org/z/f465jEq4s):

["a", "b-c", "d-e", "f"]

which seems reasonable.

Not sure I agree with this. If I'm just printing a vector<vector<char>> for instance, I don't think I'd want to print the inner ranges as string_view necessarily. I mean, maybe I do, maybe I don't -- but I don't think the library should assume that. That's why P2286 has an s specifier, for instance so that I could do "{::s}" if I want to print it as a range of strings, or "{}" if I want to print it as a range of range of char.

@vitaut
Copy link
Contributor

vitaut commented Dec 28, 2021

vector<char> is not convertible to string_view so it's not printed as a string. Although I don't think it's a problem, I'm OK with disabling implicit conversions when printing elements of ranges. A PR is welcome.

@brevzin
Copy link
Contributor Author

brevzin commented Dec 28, 2021

vector<char> is not convertible to string_view so it's not printed as a string. Although I don't think it's a problem, I'm OK with disabling implicit conversions when printing elements of ranges. A PR is welcome.

vector<char> is definitely convertible to string_view (... in C++23).

@vitaut
Copy link
Contributor

vitaut commented Dec 28, 2021

vector<char> is definitely convertible to string_view (... in C++23).

Wait, what? o_O

@brevzin
Copy link
Contributor Author

brevzin commented Dec 28, 2021

vector<char> is definitely convertible to string_view (... in C++23).

Wait, what? o_O

Most contiguous ranges of char: http://eel.is/c++draft/string.view.cons#12

@sunmy2019
Copy link
Contributor

sunmy2019 commented Feb 6, 2022

vector<char> is definitely convertible to string_view (... in C++23).

Wait, what? o_O

It's already implemented. https://godbolt.org/z/EzWr71z3c

@brevzin
Copy link
Contributor Author

brevzin commented Feb 6, 2022

vector<char> is definitely convertible to string_view (... in C++23).

Wait, what? o_O

It's already implemented. https://godbolt.org/z/EzWr71z3c

Yes, that's what my comment was pointing out.

@sunmy2019
Copy link
Contributor

IMO, we can take Python as an analogy.

If something is convertible to string_view, printing that string_view is very natural. Just like __str__ is provided in Python. The original output ['a', 'b', 'c'] is also sensible.

So, it's the way a vector<char> converts to a string_view caused the problem. (which is not provided before C++23)

Now assume there is a user defined string with operator string_view and iterator. The most natural way is to use operator string_view rather that using ranges iterator. Now standards library also provides us one, using it should be very natural.

The way a vector<char> converts to a string_view is natual in a C++ manner. Maybe we can consider using fmt::join to wrap it?

Also be very careful to change the code. Otherwise, the formatter<vector<char>> can compile in C++23 both with and without the help of ranges.h, and provides different results. (maybe by providing some specialization for vector<char> in ranges.h) Then ODR violation #2357 occurs again.

@denchat
Copy link
Contributor

denchat commented Sep 16, 2022

static_assert(std::is_convertible_v<std::vector<char>, std::string_view>);

Do GCC v12.2 stdlibc++'s and Clang v16 libc++'s is_convertible implementation need to be relaxed here?
MSVC v19.30 seems to be okey with it.
https://godbolt.org/z/rMY4K4d7b

@vitaut
Copy link
Contributor

vitaut commented Sep 16, 2022

MSVC seems broken: vector shouldn't be convertible to string_view for obvious reasons. Please report an issue to microsoft.

@denchat
Copy link
Contributor

denchat commented Sep 16, 2022

Ah, the is_convertible does require implicit conversion which string_view lacks of.
https://en.cppreference.com/w/cpp/types/is_convertible

string_view has only the c++23 explicit ctor from a range passed by forwarding ref.

...and with other range contrains also.

MSVC string_view seems to allow implicit ctor from a range. That might solely be an extension design decision, however it is supposed to be conformed with /permissive- present but currently isn't.

std::vector<char> a;
std::string_view c = a;  // MSVC doesn't produce an error

https://godbolt.org/z/G6Tce5qTh

@mwinterb
Copy link
Contributor

It's not an extension. The constructor used to be implicit, see #3030 / https://wg21.link/p2499.

@vitaut
Copy link
Contributor

vitaut commented Sep 16, 2022

Yeah, it was temporarily implicit in a working draft but it's neither valid C++20 nor valid C++23.

@denchat
Copy link
Contributor

denchat commented Sep 17, 2022

Microsoft/STL opensource side already fixes the ctor string_view(Range&&):

https://github.com/microsoft/STL/blob/67a96a910fb53aa4bc12801fb9683d30b2647c16/stl/inc/xstring?plain=1#L1270-L1288

@vitaut
Copy link
Contributor

vitaut commented Oct 1, 2022

Item 1 now gives the same output in C++20 (https://godbolt.org/z/qhcWPer7G) and C++23 (https://godbolt.org/z/er8qzjrYx):

#include <fmt/ranges.h>
#include <ranges>

int main() {
  std::string line = "a,b-c,d-e,f";
  fmt::print("{}\n", line | std::views::split(','));
}

Output:

[[a], [b, -, c], [d, -, e], [f]]

The only issue is that individual characters are not quoted which should be easy to fix (a PR is welcome).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants