Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add FMT_REDUCE_INT_INSTANTIATIONS flag #1781

Merged
merged 4 commits into from
Jul 19, 2020

Conversation

kammce
Copy link
Contributor

@kammce kammce commented Jul 17, 2020

Reduce code bloat by removing multiple instantiation of int_writer based
on the parameter.

Rationale:

  • The only functions that gains a speedup by int size would be
    int_writer::on_dec()'s call to count_digits which uses CLZ. Thus to
    still take advantage of this speedup, we store the size of the int
    so we can use a switch statement to call the correct count_digits.
  • All other implementations of count_digits require some sort of looping
    that terminates when the value hits zero regardless of what sized int
    it is.

Caveats:

  • There is a performance hit when dealing with and passing around
    64-bit/128-bit values compared to 32-bit values on 32-bit platforms,
    and with 64-bit values on 64-bit systems. But this should not reduce the
    performance that dramatically.
  • There is also a performance hit for on_dec() due to the addition of a
    switch case. But, due to it size, this should reduce to a jump table.

Resolves #1778

I agree that my contributions are licensed under the {fmt} license, and agree to future changes to the licensing.

@kammce
Copy link
Contributor Author

kammce commented Jul 17, 2020

@vitaut is there a make target to run a performance test?

@vitaut
Copy link
Contributor

vitaut commented Jul 18, 2020

There is a performance hit when dealing with and passing around
64-bit/128-bit values compared to 32-bit values on 32-bit platforms,
and with 64-bit values on 64-bit systems.

This is a non-starter.

A better option would be to keep int_writer parameterized on integer type and add a configuration option to make uint32_or_64_or_128_t only map to the largest type. This won't add overhead for the common case.

@kammce
Copy link
Contributor Author

kammce commented Jul 18, 2020

Hmmm, I like it. I can make that change.

Reduce code bloat by removing multiple instantiation of int_writer based
on the <typename UInt> parameter.

Rationale:
- The only functions that gains a speedup by int size would be
  int_writer::on_dec()'s call to count_digits which uses CLZ. Thus to
  still take advantage of this speedup, we store the size of the int
  so we can use a switch statement to call the correct count_digits.
- All other implementations of count_digits require some sort of looping
  that terminates when the value hits zero regardless of what sized int
  it is.

Caveats:
- There is a performance hit when dealing with and passing around
  64-bit/128-bit values compared to 32-bit values on 32-bit platforms,
  and with 64-bit values on 64-bit systems. But this should not reduce the
  performance that dramatically.
- There is also a performance hit for on_dec() due to the addition of a
  switch case. But, due to it size, this should reduce to a jump table.

Resolves fmtlib#1778
When defined and set to zero, will use the largest available integer
container for writing ints. The has the benefit of reducing instances
the of int_writer class which will reduce the binary cost.
@kammce kammce changed the title Remove <typename UInt> from int_writer Add FMT_USE_SMALLEST_INT flag Jul 18, 2020
@vitaut
Copy link
Contributor

vitaut commented Jul 19, 2020

Looks good but I suggest renaming FMT_USE_SMALLEST_INT to something that more clearly describes the effect of this option. Maybe FMT_REDUCE_INT_INSTANTIATIONS?

Add comment above FMT_REDUCE_INT_INSTANTIATIONS definition describing
why a developer would use it.
@kammce kammce changed the title Add FMT_USE_SMALLEST_INT flag Add FMT_REDUCE_INT_INSTANTIATIONS flag Jul 19, 2020
include/fmt/core.h Outdated Show resolved Hide resolved
@kammce kammce requested a review from vitaut July 19, 2020 19:39
@vitaut vitaut merged commit d11849b into fmtlib:master Jul 19, 2020
@vitaut
Copy link
Contributor

vitaut commented Jul 19, 2020

Thanks!

@kammce kammce deleted the optimize-int-writer branch July 19, 2020 20:18
@vitaut
Copy link
Contributor

vitaut commented Jul 19, 2020

@kammce, BTW for future reference, what platform are you targeting and what are the savings with -DFMT_REDUCE_INT_INSTANTIATIONS=1?

@kammce
Copy link
Contributor Author

kammce commented Jul 19, 2020

TL;DR: Saved 2024 bytes

I'm surprised I didn't mention that at the start.

Target CPU: Cortex M3 (although I test on M4F, M7 etc)
Target MCU: stm32f103 (64kB of flash)
Compiler: gcc-arm-none-eabi-9-2020-q2-update
Optimization Level: -Os

Following flags were used to get the binary as small as possible during testing:

#define FMT_STATIC_THOUSANDS_SEPARATOR ','
#define FMT_USE_FLOAT 0
#define FMT_USE_DOUBLE 0
#define FMT_USE_LONG_DOUBLE 0
#define FMT_REDUCE_INT_INSTANTIATIONS 1

Binary increased by 10,812 bytes when using fmt with the above flags turned on.
12,836 bytes is the increase with -D FMT_REDUCE_INT_INSTANTIATIONS 0.

Thus the binary amount saved is 2024 bytes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Reduce Binary Size by Removing <typename Uint> from int_writer
2 participants