-
Notifications
You must be signed in to change notification settings - Fork 29.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] lib: optimize util.format() #215
Conversation
Make a number of functions from require('_linklist') return the list or the item just inserted. Makes the API a little less unwieldy to use.
Add a simple (too simple!) benchmark for util.format().
Optimize util.format() by parsing the format string and generating specialized code on the fly, then caching the result in a LRU list. This change is based on the observation that most applications that log extensively, often have very skewed distributions of log patterns. It's common to see top 10 or top 25 of popular patterns, followed by a long (sometimes very long) tail of less popular patterns. This is a work in progress: the common case is currently 2-25x faster but the worst case - every pattern unique - is 4-5x slower.
source += ' return s;'; | ||
source += ' };'; | ||
|
||
return Function('inspect', source)(inspect); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note to self: the generated code gets optimized as of https://codereview.chromium.org/821553003/
@bnoordhuis Side note, I've noticed that |
I have benchmark results using template literals vs use "+" vs util.format. Just FYI: |
@yosuke-furukawa OMG! xD |
@yosuke-furukawa I'll have to refer you to http://mrale.ph/blog/2012/12/15/microbenchmarks-fairy-tale.html. The benchmark you show can be pretty much optimized out completely. |
} | ||
|
||
function untaint(s) { | ||
return s.replace(/[\\"]|[^\x20-\x7F]/g, function(c) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn’t /[\\"\r\n\u2028\u2029]/g
enough?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now that you bring it up: probably not. With Harmony template strings being available now, the generated code is susceptible to template literal injection. I'll have to look at this again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/[\\"\r\n\u2028\u2029]/g
is enough, and template strings don’t affect the parsing of double-quoted string literals.
Since this introduces a new module ( |
source += ', a' + i; | ||
|
||
source += ') { var argc = arguments.length, s = "";' + body; | ||
source += ' for (var i = ' + argc + ' + 1; i < arguments.length; i += 1) {'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you want to use an internal argc
instead of arguments.length
for for
loop as a cached value? Otherwise argc
is unused.
Since #848 landed, we can probably introduce sprintf as an internal module. |
Is anything happening with this at this point, or can it be closed? |
I'll close it and revisit when I have time. |
Optimize util.format() by parsing the format string and generating
specialized code on the fly, then caching the result in a LRU list.
This change is based on the observation that most applications that
log extensively, often have very skewed distributions of log patterns.
It's common to see top 10 or top 25 of popular patterns, followed by
a long (sometimes very long) tail of less popular patterns.
This is a work in progress: the common case is currently 2-25x faster
but the worst case - every pattern unique - is 4-5x slower.
WIP, soliciting comments.