Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Output Ranges for Functionality & Additional Safety #3756

Closed
wants to merge 14 commits into from

Conversation

ThePhD
Copy link
Contributor

@ThePhD ThePhD commented Dec 15, 2023

The following snippet will overrun a too-small buffer:

char my_tiny_char_buffer[4];
fmt::format_to(my_tiny_char_buffer, "Beeg number: {}", 58629475448);

This pull request changes that by turning all places that took OutputIt with format_to to use Output&& instead and use the concept of output_ranges OR output_iterator, where appropriate.

This PR is not intended to make it into mainline {fmt}, just to prove it is possible to do so and pass all the tests to show it doesn't break the existing interfaces.

@ThePhD ThePhD marked this pull request as ready for review December 15, 2023 06:09
— 💡 MSVC fixed this linker errors (at some point); should check for newer versions and just use the char32_t-based version properly.
- format_to(some_char_array, ...) will now do the right thing for the most part, and also supports appending to existing ranges (using the same conceptual ideals as as std::ranges::to)
- The API for format_to is a bit complicated since we do appending and also pass/support all the existing iterator-based use cases.
- format_into is a cleaner API that accepts ranges and output iterators, and does not do appending for those ranges that it recognizes (just fills in the `[begin, end)`). This is necessary to avoid strange behaviors on whether or not a container may or may not meet the appending requirements (e.g., more consistent in generic code).
- We lose a lot of compile-time performance with how we're checking things now, so not the best job we could be doing....
- Much of that compile-time performance can be gained back with more rigorous return-style enable_if and such, but that is beyond my work
- Actual performance for the calls themselves are only marginally smaller for pointer-based iterators (fmt is, internally, lacking continguous iterator support. Counterpoint; buffers are the only case that matters! (this is a bit of a lie lmaoo))
Copy link
Contributor

@vitaut vitaut left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR!

Improving safety for the array of known bound case sounds reasonable but unfortunately there are too many problems with the proposed implementation (and to some extent design) so it will require major changes. The most obvious one is that it is a fairly big breaking change. Fortunately there are ways to redesign this to make it less breaking.

You mentioned that you don't intend this to be merged so I'm a bit confused with what the plan here is. But if you change your mind, the first thing to do would be to split this into three PRs:

  1. Handling the array of known bound in fmt/core.h without introducing a dependency on ranges and making any changes below the buffer layer.
  2. Handling the array of known bound in format string compilation. This needs complete redesign but we could start with the context hack and iterate.
  3. Adding the new (output) ranges API in fmt/ranges.h. You'll need to provide motivation for the changes there.

@ThePhD
Copy link
Contributor Author

ThePhD commented Dec 21, 2023

Ah, making the changes to the format string compilation might be fully beyond my understanding at the moment. But, otherwise, I (think?) I can successfully split the changes into different overloads over multiple files again with the only range stuff in the top-level. I'll PR the known-bound for fmt/core.h seperately too from this one, I guess.

You mentioned that you don't intend this to be merged so I'm a bit confused with what the plan here is.

More of a proof of concept that this can be done, rather than asserting it should be. There's design motivation as to why this should be done but that's an entire C++ proposal, which I think is actually bigger and more involved than I could ever fit in this PR. Even as I chunk this into pieces to make it more acceptable, I'd like to keep this PR here still as a proof that we passed the existing tests and didn't break any of the tested use cases. (At least, not anything picked up by the tests or compilation of said tests, anyways.)

@vitaut
Copy link
Contributor

vitaut commented Dec 23, 2023

Ah, making the changes to the format string compilation might be fully beyond my understanding at the moment.

You actually did it already by changing basic_format_context. Those changes are not needed for anything other than format string compilation and need most revision. This is why I'm suggesting the split.

There's design motivation as to why this should be done but that's an entire C++ proposal, which I think is actually bigger and more involved than I could ever fit in this PR.

Do you have a paper number for this proposal?

@vitaut
Copy link
Contributor

vitaut commented Dec 23, 2023

Closing since this PR is not intended for merging.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants