-
Notifications
You must be signed in to change notification settings - Fork 30.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
buffer: optimize Buffer#toString() #2027
Conversation
return ''; | ||
if (arguments.length === 0) | ||
return this.utf8Slice(0, length); | ||
return slowToString(this, arguments[0], arguments[1], arguments[2]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems to be a tiny bit faster than slowToString.apply(this, arguments)
but that may change when we upgrade to V8 4.4.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, seems 4.2 does a pretty good job too, once it warms up. I'll switch this over to .apply()
.
f858003
to
4b66a40
Compare
return ''; | ||
if (arguments.length === 0) | ||
return this.utf8Slice(0, length); | ||
return slowToString.apply(this, arguments); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't it be faster to do return slowToString.call(this, arguments[0], arguments[1], arguments[2]);
? Or maybe pass this
as the first argument and avoid .call()
/.apply()
altogether?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See this comment. The initial version called slowToString(this, arguments[0], ...)
but when I ran more benchmarks, it turned out that .apply()
is faster by about 25-30% once the optimizing compiler kicks in.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Won't function(encoding, start, end) {
and return slowToString.apply(this, [encoding, start, end]);
work here? There seems to be no reason to use arguments
. Could you test that, please?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why create a new array every time when there is already arguments
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, true, an array is slow.
In my local microbenchmark function(encoding, start, end) {
and return .call(this, encoding, start, end)
wins for all number of arguments (except three, where .apply(this, arguments)
is as fast).
The problem with return slowToString.call(this, arguments[0], arguments[1], arguments[2]);
is in arguments
, not in .call()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm. Can't get my test to perform accurately. Seems the true performance hit is using undefined arguments[N]
values. Welp, seems we have some cleaning up to do in places like: https://github.com/nodejs/io.js/blob/v2.3.0/src/node.js#L339-L350
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@trevnorris What do you mean cleaning up that particular section of code? That code is switching the arguments length and only passing that many arguments.
FWIW I already benchmarked various alternative function calling methods for this patch on top of the next branch (v8 4.3):
Replacing apply()
with .call(this, arguments[0], arguments[1], ...)
slows down the cases when there are arguments passed, and there is a slight performance hit in the zero argument case (with apply I saw ~510% increase, but call showed ~470%).
Replacing apply()
with a direct function call, passing in the context as an extra argument performs about the same as using .call()
.
Replacing apply()
with a switch on arguments.length
and using either .call()
or passing the context in the < 3 cases (using .apply()
as default), the zero argument case is a bit lower IIRC (~470% increase), but now the non-zero argument cases are no longer affected.
So just using .apply()
instead of several-line switch is shorter and even a tad faster on the zero argument case. I haven't tested these scenarios on the master branch (v8 4.2) though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mscdex Is args
there an arguments
object or an Array
? I'm aware that referencing undefined values on an arguments
object does have significant overhead, but my benchmarks show that that is not the case for a real array.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@trevnorris I did not test with an array, just arguments
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mscdex neither had I before this PR. Some testing showed that referencing undefined members in an array doesn't have any performance impact. Only side effect is the argument length being too long on the called function.
LGTM with one style nit |
Do we have data to back that up? Most of my calls to Also, why did you split up the function, is this purely something for the compiler to avoid unpacking the arguments until they are used? |
I'm basing it off the number of implicit toString() calls you get in scripts that do
V8 is a method JIT, not a tracing JIT. It optimizes whole methods, not individual code paths.
Two reasons: small methods are more likely to get inlined at the call site and generally result in tighter machine code. |
@@ -379,6 +378,16 @@ Buffer.prototype.toString = function(encoding, start, end) { | |||
}; | |||
|
|||
|
|||
Buffer.prototype.toString = function() { | |||
const length = this.length | 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the int32 conversion on the length a safeguard in case the length has been altered or this
isn't actually a Buffer instance?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Neither, really. I just like my variables to have the type I expect them to have. I can remove it if you want.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just wanted to make sure there wasn't something I didn't see. Don't bother taking it out.
One question, but code LGTM. |
This is my only concern going forward, I feel like this 'optimization' is entirely speculative at this point, and it does introduce some more complexity to maintaining this code in the future. If we can put together some actual statistics that show this will be more beneficial, then I think that'd be great. However, from a different view point, if we are going to do this code path style optimization, why not offer the same for the other If you really want to do this, why not offer |
It's a win no matter how you slice it: it makes the default case faster without regressing the non-default case. I don't find the complexity argument convincing. You should see some of the other code in the lib/ directory! |
I've not run the benchmark, but otherwise LGTM. |
Do we have stats for that? Or we just taking each others word on these things :)
Hardly a convincing rebuttal either, "look, its as bad everywhere else!". Not meaning to be a PITA, its just that this potentially touches on a lot of code, and I'm just trying to help :) |
Change in complexity is minimal at best, and there's an included benchmark to verify the result. And in terms of allowed complexity. We allow some crazy stuff to be done, far beyond what this patch does, for even minimal performance gains. I'm responsible for some of those myself. This patch is doing nothing out of the ordinary. |
LGTM |
@trevnorris sorry, I missed that benchmark. |
@dcousens such is the world of JIT compiled languages. :) |
Break up Buffer#toString() into a fast and slow path. The fast path optimizes for zero-length buffers and no-arg method invocation. The speedup for zero-length buffers is a satisfying 700%. The no-arg toString() operation gets faster by about 13% for a one-byte buffer. This change exploits the fact that most Buffer#toString() calls are plain no-arg method calls. Rewriting the method to take no arguments means a call doesn't go through an ArgumentsAdaptorTrampoline stack frame in the common case. PR-URL: nodejs#2027 Reviewed-By: Brian White <mscdex@mscdex.net> Reviewed-By: Christian Tellnes <christian@tellnes.no> Reviewed-By: Daniel Cousens <email@dcousens.com> Reviewed-By: Jeremiah Senkpiel <fishrock123@rocketmail.com> Reviewed-By: Trevor Norris <trev.norris@gmail.com>
4b66a40
to
8350f3a
Compare
@bnoordhuis I'm running your new benchmark script against master and v2.3.1 and it seems slower now:
|
@rvagg That may actually be a byproduct of a patch I'm responsible for and landed after this one about preventing Buffer methods from aborting. |
@trevnorris 700 steps forward, 800 steps back? |
backing up to this commit it looks like you're right @trevnorris, you've ruined great gains!
636% and 534% perf improvement at this commit, but going back below 100% @ master |
Okay, the |
yeah, I'm not overly concerned since I don't think len=0 is a particularly common case, I just needed to know whether this was a notable item for the changelog and the answer is no |
So is this commit still useful? |
Probably. Will need some massaging to regain as much perf as possible. |
Break up Buffer#toString() into a fast and slow path. The fast path optimizes for zero-length buffers and no-arg method invocation. The speedup for zero-length buffers is a satisfying 700%. The no-arg toString() operation gets faster by about 13% for a one-byte buffer. This change exploits the fact that most Buffer#toString() calls are plain no-arg method calls. Rewriting the method to take no arguments means a call doesn't go through an ArgumentsAdaptorTrampoline stack frame in the common case. PR-URL: nodejs#2027 Reviewed-By: Brian White <mscdex@mscdex.net> Reviewed-By: Christian Tellnes <christian@tellnes.no> Reviewed-By: Daniel Cousens <email@dcousens.com> Reviewed-By: Jeremiah Senkpiel <fishrock123@rocketmail.com> Reviewed-By: Trevor Norris <trev.norris@gmail.com>
Break up Buffer#toString() into a fast and slow path. The fast path
optimizes for zero-length buffers and no-arg method invocation.
The speedup for zero-length buffers is a satisfying 700%. The no-arg
toString() operation gets faster by about 13% for a one-byte buffer.
This change exploits the fact that most Buffer#toString() calls are
plain no-arg method calls. Rewriting the method to take no arguments
means a call doesn't go through an ArgumentsAdaptorTrampoline stack
frame in the common case.
R=@trevnorris?
CI: https://jenkins-iojs.nodesource.com/view/iojs/job/iojs+any-pr+multi/64/