-
Notifications
You must be signed in to change notification settings - Fork 23
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reduce size of buffer, stringBuffer and tape. #42
base: main
Are you sure you want to change the base?
Conversation
I believe in future, the JsonValue might be deep copied, thus the size of byte[] and long[] is important. |
e72bac3
to
9fcfa59
Compare
In class JsonValue, the default size of buffer and stringBuffer, as well as long[] tape in class Tape, is 34M. But in practice it's not necessary. This patch reduce the size of them from 34M to its actual size.
@@ -34,7 +34,8 @@ public JsonValue parse(byte[] buffer, int len) { | |||
} | |||
|
|||
private byte[] padIfNeeded(byte[] buffer, int len) { | |||
if (buffer.length - len < PADDING) { | |||
if (buffer.length - len < PADDING && len < capacity) { | |||
byte[] paddedBuffer = new byte[len + PADDING]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason for maintaining the paddedBuffer
all the time, regardless of whether it is necessary or not, is to avoid allocations on hot paths. However, I see at least two issues with padding in general. Firstly, it requires adding this extra branch. Secondly, it complicates the API: on one hand, the user doesn't need to be aware of it, but on the other hand, if they want to achieve the best performance, they should pad the input. Therefore, I've been considering removing the need for padding altogether. It should be possible, although I haven't thoroughly researched this topic.
To summarize: I'd start by verifying if removing the padding is possible. If so, I'd remove it and test the performance of the parser. If there is no regression compared to the current version with padding, we have a win-win situation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think by design, the padding is 64 bytes. However, the 'paddedBuffer' is 34MB, that's such a waste. I'm just change the padding size to 64 bytes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with you that this is a waste. However, in your approach, you are potentially allocating a new array on every call of the parse
method, which can be costly.
I've been working on removing the padding entirely. It's a bit complicated, but we will see if it is feasible. I'll report back.
|
||
int stringBufferLen = stringParser.getStringBufferIdx(); | ||
byte[] newStringBuffer = new byte[stringBufferLen]; | ||
System.arraycopy(stringBuffer, 0, newStringBuffer, 0, stringBufferLen); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this change related to #36? I'm asking because I'm a bit concerned that we need another allocation on the parsing path.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. I think there must be an allocation somewhere, if we want to save the information of "old" data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And as I mentioned above, the default size of buffer and stringBuffer, as well as long[] tape in class Tape, is 34M. If we allocated 34M * 3 for each element, the cost is way too much.
In class JsonValue, the default size of buffer and stringBuffer, as well as long[] tape in class Tape, is 34M.
But in practice it's not necessary.
This patch reduce the size of them from 34M to its actual size.