-
Notifications
You must be signed in to change notification settings - Fork 30.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
events: improve performance by ~120% #44627
Conversation
this is extremely interesting! well done on the find |
Duplicate of #39939. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
the previous PR seems to be stalled
I think a key benchmark that should be added and checked here is one that uses a configurable number of unique event handlers to potentially highlight the difference between "fast mode" (best when very few properties added to it) and "slow mode" objects (best when lots of properties added to it), kinda like benchmark/es/map-bench.js but with key names that are not dependent upon the number of loop iterations. |
Beware this causes a shared prototype { proto: null } instead of object.create doesn't and should be benchmarks as it does change behavior however slight |
It should also be noted that we used to use this exact same storage method until #11930. |
lgtm |
Benchmark CI: https://ci.nodejs.org/view/Node.js%20benchmark/job/benchmark-node-micro-benchmarks/1191/ Results
|
I created a benchmark as described in #44627 (comment) and the results are basically what I suspected:
Code'use strict';
const common = require('../common.js');
const { EventEmitter } = require('events');
const bench = common.createBenchmark(main, {
events: [1, 2, 3, 5],
n: [1e6],
});
function main({ events, n }) {
const ee = new EventEmitter();
const listeners = [];
for (let k = 0; k < 10; k += 1)
listeners.push(() => {});
const eventNames = [];
for (let k = 0; k < events; ++k)
eventNames.push(`dummy${k}`);
bench.start();
for (let i = 0; i < n; i += 1) {
for (const eventName of eventNames) {
for (let k = listeners.length; --k >= 0; /* empty */) {
ee.on(eventName, listeners[k]);
}
}
for (const eventName of eventNames) {
for (let k = listeners.length; --k >= 0; /* empty */) {
ee.removeListener(eventName, listeners[k]);
}
}
}
bench.end(n);
} In this case it only takes two unique events for the regression to show up, which is a bit surprising as I thought it used to take more for V8 to transition objects to slow mode. After seeing these results, I don't think the changes in this PR are worth it, considering most event emitters will see handlers added for multiple events and not just one. |
@mscdex I basically run the same benchmark on my machine (M1 Max, 64 GB, arm64), and added more values to events (10 and 20), and my results are a lot different than yours:
|
I ran my benchmarks on an i7-3770K running Linux 5.4.0 (and compared using benchmark/compare.R if that makes a difference) 🤷 I also ran with larger values for |
I agree. I will continue working on this. Can you share with me more information about the fast/slow objects in the v8? That would be educational for everybody. |
v8 tries to optimize access to objects based on their "shape". If the shape changes too much, v8 gives up trying to optimize it and switches into a fallback "dictionary" mode, which is basically just a raw hash map. The properties of an object and the types of the values make up the shape (to some degree, check out https://mathiasbynens.be/notes/shapes-ics for an in depth explanation), so adding new properties to an object ( in bad hacky idea land, we could manually try calling |
In any case, we may want to land the benchmark, that seems a good idea to have one to catch those kind of perf regression in the future. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should not land due to regressions.
@@ -338,7 +341,7 @@ EventEmitter.init = function(opts) { | |||
|
|||
if (this._events === undefined || | |||
this._events === ObjectGetPrototypeOf(this)._events) { | |||
this._events = ObjectCreate(null); | |||
this._events = new NullObject(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change is fairly problematic, I think. At the very least it's semver-major. Unfortunately we have a number of ecosystem modules that touch this._events
directly and changing the type of thing that is can definitely break stuff.
I don't want to be rude. I would be grateful for any input from EventEmitter experts here. I updated the benchmark script attached in this thread to determine if it is reasonable to publish and subscribe thousands of different events on an EventEmitter instance (of course there is a warning when more than 10 event ids are attached). It looks like this is a terrible idea, but if anyone here would confirm that it is terrible, it would close the case for me once and for all. Thank you for any reply. |
Object.create(null)
is 20x slower on Node 14, and 10x slower on Node 18. This is mainly related to the bug I've opened: https://bugs.chromium.org/p/v8/issues/detail?id=13273#c2V8 is super optimized for
new Cls()
, but not forObject.create(null)