Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feedback needed: compile-time API improvements #2078

Closed
alexezeder opened this issue Dec 28, 2020 · 10 comments
Closed

Feedback needed: compile-time API improvements #2078

alexezeder opened this issue Dec 28, 2020 · 10 comments

Comments

@alexezeder
Copy link
Contributor

What is it about?

I have a portion of ideas about compiling format string at compile-time (not formatting at compile-time since I kinda stuck there), maybe some of these ideas were proposed before.

Since there is still no support for arg_id I was thinking that maybe I can provide full support of parsing formatting string. And by starting with that I ended up with some not so bad ideas for {fmt} in general that I want to share to get some feedback.

Ideas

  1. First of all, I was thinking about the idea to fallback on runtime formatting in general and I think it's not a great idea honestly. Because I, as a user of {fmt}, expect to get some formatters structure prepared at compile-time by using FMT_COMPILE or _cf, so all my calls to {fmt} can be just optimized-out or, let's say, be well-prepared. But with this kind of fallback, I would get runtime calls to the static library (w/o FMT_HEADER_ONLY) or somehow optimized inlined calls (with FMT_HEADER_ONLY). And all of this without any indication, which is not intuitive at all.

    But if formatting string compilation fully supports parsing then there is not any need for that fallback I guess.

  2. Currently, FMT_COMPILE and _cf just mark formatting string so it can be compiled later inside of fmt::format* functions, so this code:

    constexpr auto compiled_string = FMT_COMPILE("{} {}") /* or "{}"_cf */; // not actually compiled
    fmt::format(compiled_string, true, 42.0f);
    fmt::format(compiled_string, nullptr, 42);
    fmt::format(compiled_string, "foo", std::byte{42});

    parses formatting string 3 times, but it can parse it only once.

    In the case of parsing it only once, compiled_string would not have prepared formatters because there are no arguments passed, so no type information, but it would have compiled replacement fields and of course it would validate the passed formatting string. So the main task of fmt::format* functions here is to match passed arguments to the compiled ones and to prepare formatters for the compiled replacement fields.

    Of course, there are additional template instantiations with this approach, but string parsing is not the easiest operation too.

  3. Support for the named arguments with compiled formatting string can be introduced with no runtime cost because we can get all argument names at compile-time. But it requires us to change fmt::arg to be able to hold a name as template parameter - fmt::arg<"name">(42), it's possible since C++20, also "name"_a can produce templated version too in C++20 (like _cf).

    In this case, the matching procedure in fmt::format* functions would get an argument index based on a compiled formatting string, so the is no need to change field and spec_field since they already accept argument index as ID template parameter.

    Actually, I don't know what to do with named arguments without a static name (unavailable at compile-time), should they be supported in that case or there is no much sense about it...

  4. With the current formatter implementation there is no way to optimize some paths at compile-time. Currently, we have parse function in formatter that should prepare it for the following argument passed via format function. Thereby spec_field-preparing function parse_specs just invokes parse function at compile-time, and later format proceeds the same routine as it would do with the non-compiled format, with all that dec/hex/... or 'G'/'g'/'f'/'L'/... checks that cannot be optimized out. This results in larger produced code when some specifiers were used - disappointing example on Compiler Explorer.

    This problem can be solved by using not generic formatter, but formatter that is specific for passed specs, so the format function, in this case, can use if constexpr for choosing the right paths. But this maneuver requires us to forget about the parse member-function from some generic formatter but use a free parse function that will produce specially prepared formatter parameterized on parsed at compile-time specs.

    Of course, dynamic width and precision should be handled separately, because they are runtime values, but a need for their handling can be retrieved at compile-time, this means no extra cost when they are not used.

    To save compatibility with not compile-time API formatting a formatter can have a default branch (that is current now) which will be executed when parsed specs were not passed to it, so specs can be passed as std::optional for example to get the presence of them at compile-time. It's probably even possible to implement in C++17, but I would stick with C++20 for simplicity.

    The entire idea can be represented by the following pseudocode:

    struct parsed_specs {
        int align;
    };
    
    template<typename T, std::optional<parsed_specs> specs = {}>
    class formatter {
        constexpr void format(char* in) {
            if constexpr (specs.has_value()) {
                // using compiled specs
                if constexpr (specs->align == 0)
                ...
            } else {
                // ew, using runtime
            }
        }
    };
    
    consteval parsed_specs parse_specs(std::string_view str);
    
    template<typename T>
    consteval auto parse(std::string_view str) {
        constexpr auto specs = parse_specs(str);
        return formatter<T, specs>{};
    }
  5. Even with implementing idea 4 we can end up with a problem related to not perfectly efficient formatting when we know dynamic width or precision values at compile-time. Here is pseudocode to illustrate the problem:

    constexpr auto compiled_string = "{:{}.{}f}"_cf;
    
    template<int width, int precision>
    auto format_float(float value) {
        return fmt::format(compiled_string, value, width, precision);
    }

    In this case, we can create a special compiled argument (aka argument with a static value) that can be used in formatter to optimize some paths, in the same manner as we can use non-dynamic width or precision in parsed specs there. Those arguments with static value can be passed to the parse function (mentioned in idea 4) and that function can decide what to do next with them.

    So, the final solution for the situation represented above can look like this pseudocode:

    constexpr auto compiled_string = "{:{}.{}f}"_cf;
    
    template<int width, int precision>
    auto format_float(float value) {
        return fmt::format(compiled_string, value, fmt::carg<width>{}, fmt::carg<precision>{});
        // fmt::carg<42> or fmt::carg<"name", 42> or even 42_ca
    }

Final note

In general, these ideas may look like - ah, just add more templates, but they actually can improve performance in runtime by making more at compile-time and compile-time API looks like just about that.

Here I'm asking for feedback not only from maintainers but also from the users of {fmt}. Any corrections or suggestions would be helpful.

@vitaut
Copy link
Contributor

vitaut commented Dec 29, 2020

Thanks for the suggestions. Here are some thoughts:

  1. The fallback should be there for portability reasons to provide graceful degradation for compilers with broken constexpr such as MSVC. Otherwise we'll put the burden on users. Format string compilation is just an optimization and it's perfectly safe to skip it.

  2. Parsing depends on argument types so the API will be awkward (which is why FMT_COMPILE was introduced in the first place). In practice format string compilation should be rarely used (only when formatting is a bottleneck) so I don't think it's a problem.

  3. It would be nice to support named arguments. We shouldn't change the arg function but using UDLs seems reasonable.

  4. (and 5) I don't think it's feasible to change the extension API, but some optimizations can be done in the compile-time formatter itself. There is also diminishing return in optimizing rare cases, so it is better be driven by real-world examples.

@alexezeder
Copy link
Contributor Author

alexezeder commented Dec 30, 2020

The fallback should be there for portability reasons to provide graceful degradation for compilers with broken constexpr such as MSVC. Otherwise we'll put the burden on users.

As far as I understand, there are 2 APIs which {fmt} supports - runtime and compile-time. There is also some strange situation in the compile-time API (compile.h) - both constexpr if and non-constexpr if functions presented, but AFAIK FMT_COMPILE uses constexpr if functions only, so I'm talking only about that part of this API. Just to be sure that we are talking about the same things.

Ok, so there are 2 APIs, this means that I, as a user, can decide which one to use - if I don't care so much about my runtime speed, then I use the runtime API with or without FMT_STRING checks (or should I say only with now 🙂), but if I'm okay to sacrifice a compilation time, then I can use the compile-time API. If I decide to use the compile-time API, I should get it only if it's available on my compiler / STD library. But if I want to use it and my compiler / STD library does not support this API, then I should get an error about that. When I get an error, I can decide what to do next - either use the runtime API or upgrade used tools to make the compile-time API supported for me.

If I get it correctly, @vitaut, you want to make downgrade to runtime API calls implicit. In that case, I'm sorry, but I disagree with you here because I think it should be done explicitly. The user should decide which API it uses, especially when it has all the needed information for that. IMHO, it's not a burden, but it's a conscious choice.

Format string compilation is just an optimization and it's perfectly safe to skip it.

My views differ from yours here, because as I said above, I'm considering the compile-time API as a different approach where I can get a runtime performance boost. Actually, both APIs have some different trade-offs, so mixing them together and making the compile-time API optional extension of the runtime API does not make much sense for me. It's probably why this issue (or let me call it discussion) exists.

Parsing depends on argument types so the API will be awkward (which is why FMT_COMPILE was introduced in the first place).

Actually, the interface wouldn't be changed, it will be the same function calls. But what is now called a string compilation would be separated into two steps - string parsing and matching arguments with parsed entities. The string parsing step would just prepare a collection of replacement fields with a name (or id) and non-parsed specs. While matching would replace these fields with ready-to-use field or spec_field structures, in the case of spec_field it also should create a formatter so it would parse specs here.

But a need for this change is disputable, and it should be backed up with some test showing that the proposed change actually improves compilation time when the same string is used more than twice. Even then, there should be another test done to check that this change does not degrade compilation time significantly when a string is used only once. So this is just an idea, it's an interesting observation for me, maybe it would be interesting for you too.

Also, I realized that probably parsing step (or matching substep of it, doesn't matter) should create a formatter even for field structures, since there can be implemented some compile-time optimizations for empty specs in format function mentioned in idea 4.

It would be nice to support named arguments.

Yes, that's where I started 😄

We shouldn't change the arg function but using UDLs seems reasonable.

There is no problem to add a new overload for that function, so the original would not be changed, proof.

I don't think it's feasible to change the extension API

As I said before - I see the compile-time API as a different approach, this is why it's okay for me to create a separate function specific for that API. For the runtime API, everything remains the same - it's completely or almost flawless in terms of efficiency. But the compile-time API suffers from shared solutions with the runtime API, but as I wrote in idea 4 - we can give it a huge improvement just by using a different formatter-creation solution there.

We can also even use a classic formatter which is not paired with a separate function to support existing code, but that's implicit. I prefer to make it explicit, especially in cases where it can be done by changing few lines, like by marking "yes, I'm aware that it can be done better at compile-time, but let me do it the way I want it to do".

but some optimizations can be done in the compile-time formatter itself.

And idea 4 is all about that - to make the creation of compile-time optimized formatter possible. And even more - it's about making this in a generic way, so this solution can be used not only inside of {fmt} but also in code that uses it.

There is also diminishing return in optimizing rare cases, so it is better be driven by real-world examples.

Yes, that's true. But it's actually not about optimizing some specific use cases, but about providing a way to make some new optimizations possible. And I hope you can agree with me that some optimization should be done like for the example I prepared in the previous message.

@vitaut
Copy link
Contributor

vitaut commented Jan 3, 2021

If I decide to use the compile-time API, I should get it only if it's available on my compiler / STD library.

That's not how {fmt} works. Pretty much all the APIs have reasonable fallbacks, for example, compile-time checks fall back on runtime checks on C++11 and broken compilers, integer formatting falls backs on a naive algorithm if there is no __builtin_clz, etc. It can be sometimes useful to enforce the use of a specific API but it shouldn't be the default.

you want to make downgrade to runtime API calls implicit

Yes, it used to be implicit but there was a regression which broke this.

But a need for this change is disputable, and it should be backed up with some test showing that the proposed change actually improves compilation time when the same string is used more than twice. Even then, there should be another test done to check that this change does not degrade compilation time significantly when a string is used only once. So this is just an idea, it's an interesting observation for me, maybe it would be interesting for you too.

Right. I agree that it's an interesting idea and possibly worth exploring in the future.

There is no problem to add a new overload for that function, so the original would not be changed, proof.

Sure, but it will be inconsistent with the rest of the API.

But the compile-time API suffers from shared solutions with the runtime API, but as I wrote in idea 4 - we can give it a huge improvement just by using a different formatter-creation solution there.

I'm not sure if the added complexity is worth it but if there are compelling use cases then we could explore it. But we should try to avoid creating a second extension API as much as possible for usability reasons.

I hope you can agree with me that some optimization should be done like for the example I prepared in the previous message.

I'm a bit skeptical about that particular example because fancy formatting normally means that it's intended for display to users. This is exactly the case where perf doesn't matter much (although we already do a decent job beating printf and other facilities). This is why I was saying that it should be driven by real-world examples. All of the perf-critical cases from existing codebases I've seen so far used pretty basic formatting. Also there are optimization opportunities in runtime formatting which will make it even less important.

@vitaut vitaut closed this as completed Jan 3, 2021
@alexezeder
Copy link
Contributor Author

alexezeder commented Jan 3, 2021

That's not how {fmt} works. Pretty much all the APIs have reasonable fallbacks... It can be sometimes useful to enforce the use of a specific API but it shouldn't be the default.

That's good because it wouldn't probably work on older compilers w/o those fallbacks. I realized that {fmt} just provides one single API, that can be tweaked somehow, for example by using the FMT_COMPILE macro. And I'm not saying that there is no such thing as compile-time API, of course, it's presented and it works, it's preparing formatters at compile-time to optimize some cases at runtime and it's providing support for formatting at compile-time now. But, yeah, it's just an extension.

Maybe some new API in {fmt} can solve my problems... 😄

I'm a bit skeptical about that particular example because fancy formatting normally means that it's intended for display to users. ... This is why I was saying that it should be driven by real-world examples.

Yes, it should be that way. But since I'm talking about something theoretical I can only provide an idea. And this idea is pretty simple - to get better performance we should optimize runtime paths at compile-time. The example I provided shows that without that kind of optimization format_to function with FMT_COMPILE macro can just inline some part of the library (even if the static library is used), and of course, this can give some performance boost, but it's not what I expected.

Actually, my recent observation of log libraries (what else to do on weekend) brought me back to the same idea. I would definitely be happy to use {fmt} to set pattern for messages (log entries) like:

auto logger = my_logger<
  "{date_time:%Y-%m-%d %H:%M:%S} {thread:^25(named)} {level:(one_letter)(colored)} {file}:{line} {text:(truncated:6)}"
>{};
logger.log_info("message");
// probable output: 2021-01-03 21:21:21        main_thread        I source.cpp:42 mess..

// and here some format() function in my_logger class
template<auto file, int line>
void logger::format(auto out, auto entry) {
  fmt::format_to(
    out,
    FMT_COMPILE(format), // format is a template parameter of my_logger class, FMT_COMPILE used to get compiled format
    fmt::arg<"date_time">(entry.data_time),
    fmt::arg<"thread">(entry.text),
    ...
    fmt::arg<"channel">(entry.channel), // not used by passed pattern
    ...
    fmt::arg<"file", file>(), // file can be passed as template parameter, doesn't make sense, but why not
    ...
    fmt::arg<"text">(entry.text)
  );
}

and to know that my logger object is optimized for the passed pattern only. Maybe this example is closer to real-world tho' it's another theoretical example.

Anyway, I will try to add support of named parameters into {fmt} compile-time API to make it on par with runtime API. Also, I will probably experiment with proposed ideas to see where they would lead me since it's cool stuff IMO, but let see how it would break through my laziness. 🙂

@vitaut
Copy link
Contributor

vitaut commented Jan 7, 2021

I realized that {fmt} just provides one single API, that can be tweaked somehow, for example by using the FMT_COMPILE macro.

Exactly.

Maybe some new API in {fmt} can solve my problems

Another option is to have a flag that disables the fallback to runtime formatting similar to FMT_ENFORCE_COMPILE_STRING.

@alexezeder
Copy link
Contributor Author

alexezeder commented Jan 8, 2021

Another option is to have a flag that disables the fallback to runtime formatting similar to FMT_ENFORCE_COMPILE_STRING.

IMHO this idea is good, because of the reasons I wrote previously. But there, actually, I meant my problems with all that templated formatter and separate parse function stuff.

@alexezeder
Copy link
Contributor Author

After the research I've done while working on fully functional compile-time parsing of formatting string, I have two revisits of proposed ideas.

Idea 2
This idea is impossible to implement, at least without specifying that only { and } symbols pair means nested replacement field in format specs of custom types. There is a syntax rule for non-custom types that there are only 2 of these nested replacement fields possible: dynamic width and precision. But for custom types, there is no obligatory special treatment for those symbols, so every custom formatter can have its own logic to call int next_arg_id() or void check_arg_id(int):

  • it can call these functions for nested replacement fields {} as built-in formatter does
  • it can call these functions also for any other symbol pairs, like [ and ], or even a single symbol
  • it can skip these {} standard replacement fields completely.

Because of that flexibility, there is no option to make type-independent parsing of format specs. Thus a type-independent parsing pass on a formatting string to collect information of replacement fields and needed arguments is not possible too.

But what does this flexibility actually provide?.. Besides the custom look of replacement fields (meh), it provides a fancy feature - varying amount of dynamic specs. It works well only for format strings with automatic indexing because int next_arg_id() can be called only for those strings. Also, formatter doesn't know the current argument index, so it's unable to call void check_arg_id(int) with the incremented index in manual indexing mode. I didn't realize that it's possible, but it's so fun that I'm not so upset that this 2-pass idea is impossible to implement.

Idea 4
The main goal here remains the same as I wrote before, but the proposed implementation was too specific. This part especially:

free parse function that will produce specially prepared formatter parameterized on parsed at compile-time specs.

It should be a free parse function that returns just any formatter type it wants, like in following pseudocode:

template<typename T, int some_parameter>
struct custom_formatter_with_static_specs {
    constexpr void format(char* in) {
        // everything is available at compile-time
    }
};

template<typename T>
struct custom_formatter_with_dynamic_specs {
    arg_ref some_parameter_ref;

    constexpr void format(char* in) {
        // ew, using runtime
    }
};

constexpr parsed_specs parse_specs(std::string_view str);

template<typename T>
constexpr auto parse(std::string_view str) {
    constexpr auto specs = parse_specs(str);
    if constexpr (specs.has_ref()) {
        return custom_formatter_with_dynamic_specs<T>{specs.get_some_parameter_ref()};
    } else {
        return custom_formatter_with_static_specs<T, specs.get_some_parameter_value()>{};
    }
}

It made me realize how the current extension API can be left untouched even with adopting the proposed extension API, here is pseudocode (again):

// {fmt} library code
namespace fmt {
    template <typename T, typename Char>
    struct formatter<T, Char>; // current fmt formatter, has parse() method

    namespace detail {
        template <typename T, typename Char>
        struct formatter_with_static_specs<T, Char>; // new fmt formatter, no parse() method
        // ... other formatters for non-custom types here
    }

    template<typename T>
    constexpr bool is_custom_type();

    template<typename T, enable_if_t<!is_custom_type<T>()>>
    constexpr auto parse(std::string_view str) {
        constexpr auto specs = parse_specs(str);
        // choose one of formatter from fmt::detail
        return detail::some_specific_formatter{};
    }

    template<typename T>
    constexpr bool has_fmt_formatter(); // has fmt::formatter<T> instance

    template<typename T, enable_if_t<is_custom_type<T>() && has_fmt_formatter<T>()>>
    constexpr auto parse(std::string_view str) {
        auto fmt = fmt::formatter<T>;
        fmt.parse(str);
        return fmt;
    }
}

// proposed user code, using parse()
struct custom_type{};

struct custom_formatter{}; // no parse() method

template<>
constexpr auto fmt::parse<custom_type>(std::string_view str) {
    return custom_formatter();
}

// legacy user code, using fmt::formatter<>
struct custom_type{};

template<>
struct fmt::formatter<custom_type> : fmt::formatter<std::string> {}; 
// has parse() method, as well as fmt::formatter<std::string> does

Of course, some tests are needed for that.

Just keeping everyone informed and probably making some kind of todo list for myself here 😄

@alexezeder
Copy link
Contributor Author

3. We shouldn't change the arg function but using UDLs seems reasonable.

Coming back to the first answer of yours, @vitaut. Rereading it I realized that maybe I get it wrong back then. As far as I can see, there is nothing against making it using UDL, by it I mean a named argument with its name available at compile-time. Am I right?

@vitaut
Copy link
Contributor

vitaut commented Apr 16, 2021

Am I right?

I think so.

@alexezeder
Copy link
Contributor Author

Noice 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants