Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimizations #246

Merged
merged 6 commits into from
Jul 13, 2021
Merged

Optimizations #246

merged 6 commits into from
Jul 13, 2021

Conversation

zodern
Copy link
Contributor

@zodern zodern commented Mar 29, 2021

This PR tries to optimize:

  • Importing a module that has already been imported a large number of times. Each time the module is imported, it loops through all previously added setters to check if any need to be called. When importing @material-ui/icons, most of the 3 minutes reify runs is spent on this
  • Changing a single exported value in a module that is imported a large number of times. Svelte has a module that most svelte components import specific functions from, and the module also exports the component instance that is being initialized. Each time svelte creates a new component instance and changes that export, reify checks every setter to see if any need to be called, even though there usually are only a couple for the modified export. In one of my apps, this almost doubles the time svelte spends when switching pages.
  • Re-exporting with export ... from 'module'. Many large npm packages that reify is slow at importing, such as @material-ui/icons and mathjs have a large number of these. When importing @material-ui/icons, ~20 seconds is spent running setters for parent modules, and most of the setters re-export values.

Changes:

  • Check for changes once per name instead of once per setter. As far as I can tell, the only time some setters would have different previous values than others is if some of them are new and have never been ran. It now tracks the setters that have never been called separately so it can ensure they are still run.
  • When using the export ... from 'module' syntax, the re-exported values are not available within the module. If the re-exported value changes, there is no reason to run the other getters since their value shouldn't be affected by it. When setters are created using the shorthand to re-export, Reify now records which exports the setter modifies. When those are the only setters ran for a parent, reify only runs the parent's getters for the modified exports instead of all of them. This optimization is currently not used for export all declarations (export * from 'module').

Performance improvement:

The times are from the second run to ensure the cache is up to date. I modified the package.json files for @material-ui/icons and mathjs in node_modules to enable using reify with them.

Importing @material-ui/icons in Node 14

import { Web, Add } from '@material-ui/icons/esm/index.js';

commonjs: 3 seconds
before: 3 minutes
after: 4 seconds. ~1/3 of the extra second compared to commonjs is from reify's Node compile hook
The changes also reduces memory usage by ~12mb.

import * as icons from '@material-ui/icons/esm/index.js';

before: 3 minutes
after: 12 seconds

Tracking changes when running getters instead of when running setters would remove the difference between this and import { Web, Add } from '@material-ui/icons/esm/index.js';.

Importing mathjs in Node 14

require('mathjs/lib/esm');

commonjs: 800ms
before: 1730ms
after: 1450ms

I haven't spent enough time looking into why mathjs is still slow to import to find the cause. It does have some export all declarations (export * from 'module') which the re-export optimizations are disabled for.

Copy link
Owner

@benjamn benjamn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zodern Thanks for tackling this! It's refreshing to see a PR from someone who takes as much care as you do.

I will take a deeper look in the next few days, but I wanted to give you a heads up that I've cleaned up the repository a bit (enabled GitHub Actions and @dependabot, bumped dependencies, fixed a test, and closed/merged all the other PRs).

Would you mind rebasing? I think that will allow the tests to run on this PR, among other benefits.

Comment on lines 463 to 465
if (getter === void 0) {
return GETTER_ERROR;
}
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like you discovered the same optimization as @jimrandomh in #242! Great minds, etc. I'll merge that PR first, since it came first, though that will cause a conflict here next time you rebase. Please feel free to resolve the conflict however you see fit, @zodern.

@benjamn
Copy link
Owner

benjamn commented Apr 22, 2021

@zodern Sorry for keeping you waiting here! I have a few commits I would like to push to this branch. Can you enable pushes to your branch? Alternatively, I could open another PR targeting your branch, but that's several more steps. Thanks!

@zodern
Copy link
Contributor Author

zodern commented Apr 22, 2021

@benjamn the option should be enabled for you to push to this branch.

benjamn added a commit to zodern/reify that referenced this pull request Apr 22, 2021
To make things a bit clearer, I renamed entry._lastValues to
entry.snapshots and setter.last to setter.snapshot, and I factored out the
snapshotting logic into a helper function called updateSnapshot, which
replaces the changed function from before.

Recording entry._lastValues[name] was a good idea (in addition to
recording setter.last for each setter function), because it allows us to
determine if the value has changed before looking at any of the setter
functions. However, I think we still need to record setter.snapshot, in
order to be sure we pass any new results to every applicable setter.

If the value has changed, then we need to pass it to every setter function
and update each setter.snapshot property to the current snapshot.

If the value has not changed, we still have to loop through the setter
functions to make sure they've all received an initial value, which
requires recording/checking setter.snapshot for each setter function.

We could potentially skip this work if we know no new setter functions
have been added since we last broadcast a value, but (unless this becomes
a performance problem) I prefer the safety of storing setter.snapshot on
each setter function.

Either way, the new snapshot becomes the setter.snapshot object for every
setter registered for the given name, which is much more efficient than
recomputing a separate setter.last snapshot for each setter, as we were
doing before PR benjamn#246.
@benjamn
Copy link
Owner

benjamn commented Apr 22, 2021

My mistake! Didn't catch that my local zodern-master branch was your master branch.

When you have a chance, I'd love to hear any thoughts you have on the commits I pushed. Still working a few other things, but I hope my adjustments make sense so far?

@zodern
Copy link
Contributor Author

zodern commented Apr 26, 2021

I really like the changes you made. It does feel much safer.

I tried this branch in a Meteor app I'm working on that uses Svelte for some interactive visualizations, and I am trying to get it to run at 60 fps. Svelte updates its current_component export when a component is being updated. With the latest commit on this branch, reify adds ~12ms per frame. Switching to c4b2a93 reduced it to ~6ms per frame. If I have some time this week or next I will look at this closer and see if we can reduce it.

@StorytellerCZ
Copy link

StorytellerCZ commented Jun 7, 2021

Any update on this @zodern @benjamn ?

@filipenevola filipenevola mentioned this pull request Jun 11, 2021
12 tasks
@zodern
Copy link
Contributor Author

zodern commented Jul 2, 2021

I checked the performance again for import { Web, Add } from '@material-ui/icons/esm/index.js';, this time in a web browser (Edge 93.0.933.1). The times are a little slower than normal since I recorded a profile at the same time.

Time spent by runSetters:

These are two other branches with other experiments, but they seemed to have minimal if any effect when importing @material-ui (it still took ~1000ms):

Svelte

I created a repro that has 102 components that cause there to be ~2,500 setters across 73 names for the svelte/internal module. For comparison, the largest svelte app I work on, which only uses svelte for a small part of the UI, has 1,548 setters for the svelte/internal module.

The times are how long runSetters takes during an update that causes the current_component export from svelte/internal to be modified 31 times.

Next steps

As is, the PR is much faster when importing large modules. If you think it is ready, we could merge it, and create separate PR's for any additional improvements.

We get a large performance improvement by reducing how often reify loops over each setter. I wasn't able to find a way to make it faster while keeping the looping. In the snapshots-with-uninitialized-tracking branch I re-implemented the uninitialized setters on top of your latest changes. I think this implementation should fix some of the problems with how I initially did it.

One commit in the optimization-experiments branch (Cache getter and export names) halves the remaining time spent in the Svelte reproduction. Caching export names would probably have to be done with #181, but they didn't have as much of an improvement as caching the getter names.

Copy link
Owner

@benjamn benjamn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy to merge this PR and release it, and continue iterating in follow-up PRs. Thanks again for pushing this forward @zodern!

zodern and others added 6 commits July 13, 2021 12:48
When re-exporting values, the values can not be used within the module. This allows us to skip running all getters since we know exactly which exports were modified.
To make things a bit clearer, I renamed entry._lastValues to
entry.snapshots and setter.last to setter.snapshot, and I factored out the
snapshotting logic into a helper function called updateSnapshot, which
replaces the changed function from before.

Recording entry._lastValues[name] was a good idea (in addition to
recording setter.last for each setter function), because it allows us to
determine if the value has changed before looking at any of the setter
functions. However, I think we still need to record setter.snapshot, in
order to be sure we pass any new results to every applicable setter.

If the value has changed, then we need to pass it to every setter function
and update each setter.snapshot property to the current snapshot.

If the value has not changed, we still have to loop through the setter
functions to make sure they've all received an initial value, which
requires recording/checking setter.snapshot for each setter function.

We could potentially skip this work if we know no new setter functions
have been added since we last broadcast a value, but (unless this becomes
a performance problem) I prefer the safety of storing setter.snapshot on
each setter function.

Either way, the new snapshot becomes the setter.snapshot object for every
setter registered for the given name, which is much more efficient than
recomputing a separate setter.last snapshot for each setter, as we were
doing before PR benjamn#246.
This logic was slightly off, since it combined all the config.name strings
in one initNames array and all the config.key strings in one initKeys
array, then passed those arrays to runSetters, even though not all keys
are applicable for all names.

Fortunately, tracking uninitialized setters should no longer be necessary,
since we make sure every setter receives the latest value (using the
setter.snapshot property to prevent duplicates).
@benjamn benjamn merged commit d6e2d63 into benjamn:master Jul 13, 2021
@benjamn
Copy link
Owner

benjamn commented Jul 13, 2021

@zodern @StorytellerCZ @filipenevola I've published these changes to npm as reify@0.22.0. You should now be able to depend on v0.22.0 in other packages, though I published this version with the next tag instead of latest for the time being (happy to update latest whenever you like). Cheers!

benjamn added a commit that referenced this pull request Aug 28, 2021
Minor version bump due to the extent/risk of PR #246, though no visible
behavioral changes are expected (other than performance improvements).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants