-
Notifications
You must be signed in to change notification settings - Fork 60
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve SimpleJsonEncoder.escapeString memory usage #61
Conversation
I also created and ran a Java microbenchmark: Before:
After:
Is notable the reduced GC count and time, and increased throughput, almost 3x |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your contribution! Please see my comments.
} | ||
if (ch < ' ') { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why don't you keep this if in the switch (default) block?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code looked simpler that way, but I didn't realize that it would add an unnecessary if condition check when the current char is an escaped char. Moved the if back to the default block
escaped = escapeCharacter(ch); | ||
} | ||
if (escaped != null) { | ||
writer.append(str.substring(lastCopyIndex, currentIndex)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As this method already gets the char array representation of str
(via str.toCharArray()
), I assume that using java.io.Writer#write(char[], int, int)
would be a lot faster than slicing the string. Could you check this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed to Writer.write
instead of append, and it performed way better than the first changes. Also performed better than the upstream version of this code, I will put the reports on the comments bellow
currentIndex++; | ||
} | ||
if (lastCopyIndex != currentIndex) { | ||
writer.append(str.substring(lastCopyIndex)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
...same here java.io.Writer#write(char[], int, int)
is probably way faster.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated the code
See the summary of the benchmark results between the upstream version and before this changes: Upstream:
After:
testEncode -> 25% throughtput gain My understanding is that with the current changes, there are gains when there are few characters to escape, but there are losses when encoding strings with more characters to escape. I'm going to think how to re-work those changes. I also uploaded the benchmark code on a new git repo: https://github.com/fabiojb/logbook-gelf-benchmark |
int lastCopyIndex = 0; | ||
char[] chars = str.toCharArray(); | ||
for (final char ch : chars) { | ||
String escaped = null; | ||
switch (ch) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be beneficial to refactor the switch block to a private static method and call it via final String escaped = escapeIfNeeded(ch);
. Another possible optimization would be to have private static final char
attributes for all the escape sequences to test for.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm going to work on a way to escape the string only if is needed.
I did some tests putting some values on static final
variables, but it didn't show a significant difference in the performance tests
@osiegmar I have submitted two new changes, and finished the improvements I had in mind. First 90fb46a, reduced memory allocation:
Second, 53ca94b is a more verbose change, that escapes characters only when it's needed. It did improve the throughput:
To better understand the improvements, you can compare this results against the Upstream results in the previous comment. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry but this commit went way too far. While it might/does improve the performance it hurts the simplicity/maintainability due to its API changes and +150 lines of code. I don't want to sacrifice this for the sake of performance.
d1cc3a0
to
5d512cd
Compare
No problem, I understand your point. I removed the last commit, but can you still review the commit ca7259f that improves memory usage? |
5d512cd
to
ca7259f
Compare
@@ -107,7 +107,8 @@ private void appendKey(final String key) throws IOException { | |||
*/ | |||
@SuppressWarnings("checkstyle:cyclomaticcomplexity") | |||
private void escapeString(final String str) throws IOException { | |||
for (final char ch : str.toCharArray()) { | |||
for (int i = 0; i < str.length(); i++) { | |||
char ch = str.charAt(i); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a final in front of char ch
and I can approve it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
Iterate over the current String internal values instead of making a copy of its content with String.toCharArray
ca7259f
to
ee36fad
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
I'll be part of https://github.com/osiegmar/logback-gelf/milestone/1 which is to be released when the test coverage is (greatly) improved. Unfortunately I hadn't had enough time for that and cannot promise a specific date. |
write String slices instead of single chars to the Writer. Iterate over the string to be escaped and write a new slice every time a char needs to be escaped.