Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WString: Optimize a bit more #7758

Closed

Conversation

jjsuwa-sys3175
Copy link
Contributor

@jjsuwa-sys3175 jjsuwa-sys3175 commented Dec 11, 2020

  • remove common subexpressions from concat(), replace(const String&, const String&), remove() and trim()
  • optimize substring(), due to the fact that concat(const char*, unsigned int) accepts non-null-terminated char sequence
  • optimize insert(size_t, char), same as the abovementioned way
  • optimize replace(char, char), toLowerCase() and toUpperCase() by range-based for loop, due to the fact that end() also returns nullptr when begin() returns nullptr
  • mark length(), setCharAt(), operator[], end() and replace(char, char) as __attribute__((flatten)) that inlines their bodies completely and performs more aggressive optimizations
  • eliminate unneeded truncation-to-16bit-op extui Ay,Ax,0,16 in changeBuffer() by replacing uint16_t with unsigned int
  • make setLen() implicit null-terminating (as per espressif/arduino-esp32@22a488c), and rearrange to eliminate explicit null-termination around calling setLen()
  • move some of function bodies in '.h', which their functions call ctor from inside of, to '.cpp' (@TD-er's suggestion)

"FSBrowser" building result indicates savings 224 bytes of IROM.

[before]
IROM : 313468
IRAM : 26957
DATA : 1516
RODATA : 1612
BSS : 26136

[after]
IROM : 313244
IRAM : 26957
DATA : 1516
RODATA : 1612
BSS : 26136

@mcspr
Copy link
Collaborator

mcspr commented Dec 15, 2020

remove duped body of ctor(StringSumHelper &&rval)

btw, any idea why string has this helper? Looking at some string implementations like std::string or folly::fbstring, both use the original class

@jjsuwa-sys3175
Copy link
Contributor Author

btw, any idea why string has this helper?

it's for optimization of String addition, i understand.

GCC recognizes such a phrase:

// b ... f can be accepted by String ctor
String a = b + c + d + e + f;

and will convert to:

String a;
do {
    StringSumHelper temp = b;
    temp += c;
    temp += d;
    temp += e;
    temp += f;
    a = temp;
} while (0);

@mcspr
Copy link
Collaborator

mcspr commented Dec 15, 2020

Ah. I guess it's another case of pre-c++11 code? I tried a small test removing Helper and replacing it with:

       String operator+(const String& lhs, const String& rhs);
       String operator+(String&& lhs, const String& rhs);
       String operator+(const String& lhs, String&& rhs);
       String operator+(String&& lhs, String&& rhs);
       String operator+(const String& lhs, const char* rhs);
       String operator+(String&& lhs, const char* rhs);

(raw mcve patch here -> https://gist.github.com/mcspr/aab2d925b20af22282ee0959b840d3e1)

Which seems to do the same thing - chained things avoid allocation, using moveable objects as temporaries. I'll test a bit more during the week to make it into something looking like a PR (which hopefully does not include all of 'concat(numeric type)' overrides...)

@jjsuwa-sys3175
Copy link
Contributor Author

I'll test a bit more during the week to make it into something looking like a PR (which hopefully does not include all of 'concat(numeric type)' overrides...)

thx for reply @mcspr, so looking forward to it :)

@@ -388,7 +388,8 @@ class StringSumHelper: public String {
}
};

extern const String emptyString;
// global empty string to allow returning const String& with nothing
inline const String emptyString;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does this fix or improve?

@TD-er
Copy link
Contributor

TD-er commented May 24, 2021

Just another request for optimization:
This is done in the .h file:

unsigned char startsWith(const char *prefix) const { return this->startsWith(String(prefix)); }
unsigned char startsWith(const __FlashStringHelper *prefix) const { return this->startsWith(String(prefix)); }

And for a few more basic functions. (e.g. replace, 'endsWith) However, since this is in the .h file, it is done inline. Meaning the call to String(....)is also added to every call for these functions where the argument isn't already aString`.
This does add up to the compiled binary size.

So please implement this in the .cpp to reduce binary size.

* remove common subexpressions from `concat()`, `replace(const String&, const String&)`, `remove()` and `trim()`

* optimize `substring()`, due to the fact that `concat(const char*, unsigned int)` accepts non-null-terminated char sequence

* optimize `insert(size_t, char)`, same as the abovementioned way

* optimize `replace(char, char)`, `toLowerCase()` and `toUpperCase()` by range-based `for` loop, due to the fact that `end()` also returns `nullptr` when `begin()` returns `nullptr`

* mark `length()`, `setCharAt()`, `operator[]`, `end()` and `replace(char, char)` as `__attribute__((flatten))` that inlines their bodies completely and performs more aggressive optimizations

* eliminate unneeded truncation-to-16bit-op `extui Ay,Ax,0,16` in `changeBuffer()` by replacing `uint16_t` with `unsigned int`

* make `setLen()` implicit null-terminating (as per espressif/arduino-esp32@22a488c), and rearrange to eliminate explicit null-termination around calling `setLen()`

* move some of function bodies in '.h', which their functions call ctor from inside of, to '.cpp' (@TD-er's suggestion)

"FSBrowser" building result indicates savings 224 bytes of IROM.

[before]
IROM   : 313468
IRAM   : 26957
DATA   : 1516
RODATA : 1612
BSS    : 26136

[after]
IROM   : 313244
IRAM   : 26957
DATA   : 1516
RODATA : 1612
BSS    : 26136
@TD-er
Copy link
Contributor

TD-er commented May 27, 2021

I have spent quite a lot of time on my own project to prevent lots of casts to String from a flash string, by adding function wrappers to frequently used functions accepting a flash string as an argument.
This reduced the compiled binary size A LOT(!!!). For the largest binary I could gain over 70k in size.

But there is still a lot to gain as a lot of compares like operator == of the String class do not have a variant accepting a flash string as an argument.
So can you add those too (implement in the .cpp !!!) like:

unsigned char operator ==(const __FlashStringHelper *fstr) const;

unsigned char String::operator ==(const __FlashStringHelper *fstr) const {
  return equals(fstr);
}

There isn't a version of equals accepting a flash string, so it will copy-construct it to a String, like it also does now. But at least it will not generate code for this copy construct every single time it is called.
So this will not affect performance, but at least it will generate smaller binaries.
Even better would be to have a version of the equals function with a flash string as an argument, so you won't even need to copy it.

Just to give an example of what is implemented for the startsWith function, in the .h file:

        unsigned char startsWith(const __FlashStringHelper * prefix) const {
            return this->startsWith(String(prefix));
        }

Since it is implemented in the .h file, it is rather useless as it does generate the same (inline) code as the compiler would if only the function existed with const String& as argument.
By moving this to the .cpp file, you actually reduce the build size as the cast to String is then only added once to the binary.

@earlephilhower What do you think of such optimizations?
You could even wrap them in a define so people not using the String class a lot will not see an increase of the String object, but for those using it a lot, will see a (sometimes) significant drop in binary size.

ptr.len = len;
ptr.buff[len] = 0;
Copy link
Contributor

@TD-er TD-er May 27, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't there be a check to see if ptr.buff[len] is allocated?
If allocation failed, ptr.buff may not be allocated at all, or not being the expected size.
Not sure if it may happen at all in all occurences where setLen is called, but maybe best to at least have a check here to see if buff isn't a nullptr. (and maybe only allow to reduce the value)

@jjsuwa-sys3175
Copy link
Contributor Author

closing because outdated.

@jjsuwa-sys3175 jjsuwa-sys3175 deleted the WString_OptiMore branch July 5, 2022 06:16
This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants