Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

C++: return std::string_view instead of char* for variable length strings #550

Closed
tkohlman opened this issue Mar 23, 2018 · 7 comments
Closed

Comments

@tkohlman
Copy link

I nice optimization and/or enhancement would be to return std::string_view instead of const char* for variable length strings. Applications may wish to use a string without copying its bytes somewhere else.

To achieve this now, code must read and save the string's length, then read the string, then construct the string_view.

auto usernameLength = message.usernameLength();
std::string_view username(message.username(), usernameLength);

It cannot be done in a single line, because the length field must be read first. Also note that the length is copied into a uint32_t field twice, once in message.usernameLength() and a second time in message.username(). Returning std::string_view would eliminate one of the copies.

It would be nice to be able to do one of the following:

std::string_view username = message.username();

auto username = message.username();

This might require a C++17 compatibility flag of some kind.

@mjpt777
Copy link
Contributor

mjpt777 commented Mar 23, 2018

The string view is a nice feature. However requiring C++ 17 might be too much of an ask. We often get requests to provide back ports to C++ 98 and before!

@ksergey
Copy link
Contributor

ksergey commented Mar 23, 2018

#if __cplusplus > 201402L
  ...
#endif

Or make something like in fmtlib (https://github.com/fmtlib/fmt/blob/master/include/fmt/core.h)

@tmontgomery
Copy link
Contributor

A simple version check would be sufficient. As @ksergey mentions.

But, I would not remove the method to return the const char *. It is simply useful for too many things. Just adding a std::string_view method would be good enough.

@tkohlman
Copy link
Author

std::string_view does have a data() function, similar to std::string.

@tmontgomery
Copy link
Contributor

@tkohlman I am aware.

std::string_view is C++17. Using it to get a const char * would not be that useful for most users I know who just want a direct pointer into the buffer at the varData location and want it with the minimum amount of unknown side effects (for optimization purposes).

@tkohlman
Copy link
Author

@tmontgomery

Is the const char* useful without knowing the length of the varData field? The returned const char* is not guaranteed to be null terminated, so I think that most use cases would require the user to first call the corresponding length function. The length function involves a redundant memcpy, so if anything, std::string_view is an optimization over the existing scheme. With std::string_view, the code would essentially merge the two functions, eliminate a memcpy, and return a pair of values.

That being said, I understand a desire to maintain backwards compatibility in the API. For my purposes, I would be happy with a std::string_view field = message.fieldView() function for varData fields.

@tmontgomery
Copy link
Contributor

tmontgomery commented Mar 23, 2018

@tkohlman

C++17 is a full stop for most users.

I think you are making a lot of assumptions that don't hold in the usages I know of.

The const char * is generated for all varData field types. So, it is access into the field. It doesn't assume it's a NULL terminated string... although, it could be one. The field may be very predictable in length based on contents and length is not that important except there is no other way to handle fields that could vary in size in SBE. Just 2 length options is enough to warrant its usage. So, I would not make assumptions in this regard. The method is what it is.... access directly into the field.

Grabbing the length doesn't move the position. And the redundant memcpy is easily elided. All the optimizations you mention are also possible without std::string_view and are easier for the compiler to see (not to mention users to visually inspect and understand).

In my experience, what could be optimized usually means the optimizer can be easily defeated on. Simple straight forward logic, though, is usually more easily and predictably optimized. Especially in large projects.

But as with anything, measuring in actual usage in an app has no comparison. The use of const char *, I do know in several apps gets optimized very very well. Even when grabbing the length first.
std::string_view MAY be better in an app. It might not be. Simply removing const char * because the compiler might optimize std::string_view better would not be a great option.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants