Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support named arguments #169

Closed
wants to merge 12 commits into from
Closed

Support named arguments #169

wants to merge 12 commits into from

Conversation

jamboree
Copy link
Contributor

@jamboree jamboree commented Jun 5, 2015

Allows code like below:

int x = 1, y = 2;
fmt::print("point: ({x}, {y})", FMT_CAPTURE(x, y));

@vitaut
Copy link
Contributor

vitaut commented Jun 6, 2015

Wow, really impressive stuff! On a cursory glance, the implementation seems very reasonable, but I'd like to look into it in more details before merging. Does the name map handling add a lot of overhead?

@jamboree
Copy link
Contributor Author

jamboree commented Jun 6, 2015

I don't have any benchmark so I can't tell the numbers, but hopefully it won't add much overhead. What it does is simple: The args are distinguished into normal args and named args by fmt::arg(name, value), inside any formatting function, the named args will be extracted into a flat-map(name->index), the map uses stack space, no allocation needed, if there's no named args, I think the overhead here is negligible. In doing the formatting, there's small overhead that the parser needs to check whether it's arg_index or arg_name, if it's an arg_name, then a binary search is performed on the map.

@vitaut
Copy link
Contributor

vitaut commented Jun 7, 2015

I very much like the feature, but building maps of names arguments adds quite a bit to the compiled code size (bloat-test from format-benchmark) even when not using them:

Without named arguments: 34912 bytes
With named arguments: 256096 bytes

Could you move map construction from the individual formatting functions to BasicFormatter::format? This will bring the per-function call overhead to the old level. To do this you'll probably need to introduce a new type in Value::Type to represent a named argument (NamedArg object), something like:

  enum Type {
    // ...
    CSTRING, STRING, WSTRING, POINTER, CUSTOM,
    NAMED_ARG
  };

The real type will have to go to NamedArg.

Then the name map can be constructed dynamically in BasicFormatter::format by looking at the arguments of type NAMED_ARG. This only needs to be done if a format string contains an argument name, so there will be little overhead for the case when named arguments are not used.

Hope this makes sense.

@jamboree
Copy link
Contributor Author

jamboree commented Jun 8, 2015

Here's the result I got with MSVC2015 (x64):

Without named arguments: 98,304 bytes
With named arguments: 151,552 bytes
After moving the map sorting into BasicFormatter::format: 141,312 bytes

Not a big improvement :/

@vitaut
Copy link
Contributor

vitaut commented Jun 8, 2015

I meant moving not just map sorting but all map management to BasicFormatter::format so that the code

  template <typename... Args> \
  void func(arg_type arg0, const Args & ... args) { \
    typename fmt::internal::ArgArray<sizeof...(Args)>::Type array; \
   const int count = fmt::internal::NamedArgsCounter<Args...>::value; \
   typename fmt::internal::BasicArgMap<Char>::value_type mapArray[count + 1]; \
   fmt::internal::add_named_args<0>(mapArray, args...); \
   fmt::internal::BasicArgMap<Char> map(mapArray, count); \
   func(arg0, fmt::internal::make_arg_list<Char>(array, &map, fmt::internal::strip_name(args)...)); \
  }

becomes

  template <typename... Args> \
  void func(arg_type arg0, const Args & ... args) { \
    typename fmt::internal::ArgArray<sizeof...(Args)>::Type array; \
    func(arg0, fmt::internal::make_arg_list<Char>(array, args...)); \
  }

as it used to be. Then the compiled code size shouldn't increase (for user code). It won't be possible to fully allocate map on stack, but I think it's not critical. If desired this can be optimized by allocating some map elements on stack.

@jamboree
Copy link
Contributor Author

jamboree commented Jun 8, 2015

There are some problems:

  • NamedArg is not a single type, it's a templated wrapper, so you can't have a single NAMED_ARG type.
  • You have to store the names somewhere, if not in the map, then you have to add at least one more data member to Arg.
  • The storage strategy differs for args under MAX_PACKED_ARGS, this may add more complexity.
  • To build the map, you have to walk through the ArgList at runtime, find out which are named and add them to the map.

Note that the only "map management" thing left in the code was add_named_args, which should be nop (i.e. optimized away) if there's no named-args.

So...I don't think you can magically add names without paying something here.

@vitaut
Copy link
Contributor

vitaut commented Jun 9, 2015

Good points. Let try to answer and clarify my suggestion a bit:

  • NamedArg is not a single type, it's a templated wrapper, so you can't have a single NAMED_ARG type.

This can be addressed by making NamedArg wrap Arg, something like:

struct NamedArg {
  StringRef name;
  Arg arg;
};

NamedArg arg(StringRef name, T const& value) {
  NamedArg named_arg;
  named_arg.name = name;
  named_arg.arg = MakeValue<char>(value);
  named_arg.arg.type = make_type(value);
  return named_arg;
}

Then the type information will be captured by the Arg object inside NameArg.

  • You have to store the names somewhere, if not in the map, then you have to add at least one more data member to Arg.

Names can be stored in temporary NamedArg object which are constructed by fmt::arg anyway.

  • The storage strategy differs for args under MAX_PACKED_ARGS, this may add more complexity.

You are right, the storage strategy depends on MAX_PACKED_ARGS, but I don't think it affects named arguments. All the complexity is hidden in ArgList::operator[] which doesn't need to change. You can use it as is, the only difference in the proposed scheme is that it will return NAMED_ARG in addition to existing types.

  • To build the map, you have to walk through the ArgList at runtime, find out which are named and add them to the map.

Yes, but only if there are named arguments, so there will be no or little overhead when they are not used.

Note that the only "map management" thing left in the code was add_named_args, which should be nop (i.e. optimized away) if there's no named-args.

This is a bit more complicated than it seems. I haven't looked into it in details, but even adding a member to ArgList may cause noticeable code increase because an argument that used to be passed in registers now has to be copied onto stack and passed as an address.

If it sounds too complicated or unclear, I can try implementing the idea with additional NAMED_ARG type based on your PR myself and we'll see if it works.

@jamboree
Copy link
Contributor Author

jamboree commented Jun 9, 2015

This can be addressed by making NamedArg wrap Arg, something like:

struct NamedArg {
  StringRef name;
  Arg arg;
};

Hmm, this seems recursive, Arg is based on Value which is a variant of possible types and now you're including NamedArg itself as one of the possible types?

@vitaut
Copy link
Contributor

vitaut commented Jun 9, 2015

Hmm, this seems recursive, Arg is based on Value which is a variant of possible types and now you're including NamedArg itself as one of the possible types?

Yes, but it's not a problem as Value can contain a pointer to (forward declared) NamedArg. Normally recursion depth will be only two or three (depending on how you count) unless someone writes something like this =):

fmt::format("{}", fmt::arg("x", fmt::arg("y", ...)));

which can be rejected at compile time.

}
EXPECT_THROW_MSG(format("{0:{1}}", 0, (INT_MAX + 1ul)),
FormatError, "number is too big");

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are the tests above removed?

@vitaut
Copy link
Contributor

vitaut commented Jun 9, 2015

Thanks for making the changes, the compiled code size is back to 34912 which is great! Two minor comments:

  1. Since an argument name acts more like an identifier maybe we don't need a wide char version of it. What do you think?
  2. An inline comment in https://github.com/cppformat/cppformat/pull/169/files#r32016076

Otherwise this looks good and I'm happy to merge this in.

@vitaut
Copy link
Contributor

vitaut commented Jun 9, 2015

And one more thing: could you squash the WIP commits to keep the history clean? Thanks!

@jamboree
Copy link
Contributor Author

jamboree commented Jun 9, 2015

  • I'm not too familiar with git commands, what should I do to squash the commits? (I'll google it...)
  • That was a merge mistake.
  • You have to convert wchar_t string to char string first or how could you compare them?

@vitaut
Copy link
Contributor

vitaut commented Jun 9, 2015

  • I'm not too familiar with git commands, what should I do to squash the commits? (I'll google it...)

Answers to this SO question might help.

  • You have to convert wchar_t string to char string first or how could you compare them?

I guess conversion is the only way with std::map. If names where stored in a sorted array one could use binary search and std::lexicographical_compare which can handle different types. Maybe let's keep wide strings for now and discuss this question separately.

@jamboree
Copy link
Contributor Author

Close this for a new PR.

@jamboree jamboree closed this Jun 10, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants