-
-
Notifications
You must be signed in to change notification settings - Fork 20
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve performance of sortBy #92
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It gives me a warm feeling on the inside to see my hard work on the benchmarks being put to good use! I found one potential improvement.
index.js
Outdated
} | ||
var lteFn = | ||
typeof rs[0].prop === 'string' || typeof rs[0].prop === 'number' ? | ||
function(a, b) { return a <= b; } : |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function can be hoisted. May slightly benefit performance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The function above behaves incorrectly for NaN
, as this demonstrates:
> Z.sort ([NaN, 2, NaN, 0, NaN, 1, NaN])
[NaN, NaN, 0, 1, NaN, 2, NaN]
We must either remove the special case for typeof rs[0].prop === 'number'
or ensure via other means that NaN
is handled appropriately.
Z.lte
was only recently updated to handle NaN
, in #88.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The fast-path for numbers and strings is very important for performance.
I think we should consider whether it's worth the code complexity and performance to handle NaN
. For instance, Array$prototype$lte
doesn't handle sparse arrays because sparse arrays can be considered to not be a part of the array type we define. In the same way Number$prototype$lte
could not handle NaN
as NaN
could be considered not part of the number type we define.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be best for these changes to be refactoring only. In a subsequent pull request you could make the case for not supporting NaN
by showing the performance difference.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
⚡
Amazing work, Simon! |
index.js
Outdated
var rs = reduce(reducer, [], foldable); | ||
if (rs.length === 0) { | ||
return foldable; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this special case make an appreciable difference? I would have thought that sorting an empty structure would be an efficient operation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I now see that this is guarding rs[0].prop
below. I would still like to remove the special case. We could define lteFn
like so:
var type = rs.length > 0 ? typeof rs[0].prop : 'undefined';
var lteFn = type === 'number' || type === 'string' ? ... : lte;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
⚡
index.js
Outdated
var foldableIsArray = Array.isArray(foldable); | ||
var idx = 0; | ||
function reducer(xs, x) { | ||
xs.push({idx: idx, x: x, prop: f(x)}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you consider prop
to be a better name than fx
for the f(x)
property?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm indifferent. For prop
I didn't actually look at the old name. The thought behind prop
is that f
returns some property of x
(not property as in a JS property of an object but in the more general sense of the word). But I'd have no problem in switching it back.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer fx
. :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
⚡
index.js
Outdated
lteFn(b.prop, a.prop) ? a.idx - b.idx : -1 : | ||
1; | ||
}); | ||
if (foldableIsArray) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since foldableIsArray
is only referenced here, let's replace this identifier with Array.isArray(foldable)
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea. I was using it in several places earlier but that went away.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
⚡
index.js
Outdated
return rs; | ||
} else { | ||
var result = empty(foldable.constructor); | ||
var of = foldable.constructor['fantasy-land/of']; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this safe? Could foldable.constructor
be Function
?
index.js
Outdated
var result = empty(foldable.constructor); | ||
var of = foldable.constructor['fantasy-land/of']; | ||
for (var i = 0; i < rs.length; i += 1) { | ||
result = result['fantasy-land/concat'](of(rs[i].x)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this safe? Are we sure that Array is the only built-in type that can support sorting?
package.json
Outdated
@@ -19,6 +19,9 @@ | |||
}, | |||
"devDependencies": { | |||
"fantasy-land": "3.5.0", | |||
"knuth-shuffle": "^1.0.8", | |||
"list": "^2.0.7", | |||
"ramda": "^0.25.0", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use the "1.2.x"
form rather than the "^1.2.3"
form.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
⚡
index.js
Outdated
for (var i = 0; i < this.length; i += 1) { | ||
acc = f(acc, this[i]); | ||
} | ||
return acc; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this change make an appreciable difference?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've measured the difference in the benchmark suite for List. I'd say the difference is quite significant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Significant indeed!
Now that I know we'll keep the code I'll share my formatting request. I'd like to see this:
for (var idx = 0; idx < this.length; idx += 1) acc = f(acc, this[idx]);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
⚡
@Avaq Props on the benchmark system! The output in particular is really nice 👍 |
@paldepind, I'm happy to push tweaks to your branch. I'd love to merge this pull request soon. :) |
@davidchambers You're very welcome to that David 👍 |
Looks good to me @davidchambers. |
index.js
Outdated
@@ -2132,9 +2132,7 @@ | |||
return xs; | |||
} | |||
var rs = reduce(reducer, [], foldable); | |||
if (rs.length === 0) return foldable; | |||
|
|||
var type = typeof rs[0].fx; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the benefit of this? Replacing one special case with another that doesn't terminate as early?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer typeof (rs[0] && rs[0].fx)
because the unsafe expression (rs[0].fx
) and its guard appear very close together. My preference is not strong, though, so feel free to reinstate the rs.length === 0
check. :)
index.js
Outdated
for (idx = 0; idx < rs.length; idx += 1) { | ||
result = concat(result, of(F, rs[idx].x)); | ||
} | ||
return result; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This does seem to affect performance. On my computer, the Foldable benchmark went from 308 Hz to 233 Hz. That is, the previous is about 33% faster.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm concerned about correctness. Representatives of built-in types do not have fantasy-land/
-prefixed members, so replacing concat(x, y)
with x['fantasy-land/concat'](y)
is not necessarily valid. Can we be sure that result
will have a fantasy-land/concat
member and that F
will have a fantasy-land/of
member?
fantasy-land/concat
implementations are provided for the following built-in types: String, Array, and Object.
fantasy-land/of
implementations are provided for the following built-in types: Array and Function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are any of the build-in types, except for Array, both a foldable and a monoid?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a StrMap
, Object
also implements Foldable
and Monoid
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gabejohnson Sorry, I forgot to mention that it has to be an Applicative as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahhh...I missed the of
. Yeah, Array
is the only one that fits the bill AFAIK.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we reinstate the ['fantasy-land/concat']
and ['fantasy-land/of']
optimizations, let's add a comment explaining why these optimizations are safe.
ℹ️ @paldepind, I rebased this branch. |
package.json
Outdated
@@ -19,6 +19,7 @@ | |||
}, | |||
"devDependencies": { | |||
"fantasy-land": "3.5.0", | |||
"list": "2.0.7", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Avaq, should this be listed here or in bench/package.json?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's used as a library like any other (just so happens to be used in benchmarks), so it should be listed here. The bench/package.json
file only exists to allow for the installation of any version of choice, of the package you're benchmarking against.
LGTM 🎉 |
This PR improves the performance of
sortBy
and thus alsosort
. These are the benchmark results I get when runningsort
on an array of 100 numbers.In this case sorting an array got 214 times faster and sorting a foldable (using List as the benchmark subject) got 85 times faster 😄
Note however that to achieve the performance increase I had to use fewer sanctuary-type-class functions internally in the implementation of
sortBy
. It seems that using sanctuary-type-class functions results in some overhead. I created two benchmarks to try and measure the overhead of the functions used insidesortBy
.Then I plugged in Ramda and sanctuary-type-class in the above benchmarks and got the following results:
Getting the minimum of an array of 100 numbers seems to be 242 times as slow with sanctuary-type-classes and creating an array of size 5 by prepending is about 3.3 times slower than Ramda.
Z.lte
in particular seems to be be slow and it is still used insidesortBy
in some cases and in these casessortBy
is still slow due toZ.lte
ending up as the bottleneck.