[Feature Request] Allow direct import of utils.PlotlyJSONEncoder for faster Dash startup time #2174

Closed
anders-kiaer opened this issue Feb 10, 2020 · 10 comments

Comments

@anders-kiaer

This issue may belong in the related plotly.py repository, but the motivation comes from using Dash. Please feel free to transfer it to plotly.py if that is more useful.


Running

time python -c "import dash"

real	0m2.172s
user	0m1.183s
sys	0m0.566s

shows a real (wall-clock) time typically above 2 seconds on my system, even when the whole Python distribution is installed locally on the same computer. With a Python distribution on a shared network disk, it of course gets slower. 🕙

Running (with Python 3.7 or higher)

python -X importtime -c "import dash"

shows that a large portion of the time is spent on unused plotly.graph_objs imports. The import stems from this line in Dash
https://github.com/plotly/dash/blob/8ee358826cd4318a3edc52fbf71e0bdda2369984/dash/dash.py#L23
where dash/dash.py is using PlotlyJSONEncoder from plotly.utils.
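For anyone who wants to rank the slowest imports without reading the raw report by eye, here is a minimal sketch. It assumes the usual three-column `-X importtime` report written to stderr (self, cumulative, module name); the helper name `slowest_imports` is hypothetical, and `json` is used only so the example runs without Dash installed.

```python
import subprocess
import sys


def slowest_imports(module, top=5):
    """Return up to `top` (cumulative_us, name) pairs, slowest first."""
    # -X importtime (Python 3.7+) writes one report line per import to stderr.
    result = subprocess.run(
        [sys.executable, "-X", "importtime", "-c", f"import {module}"],
        capture_output=True,
        text=True,
        check=True,
    )
    rows = []
    for line in result.stderr.splitlines():
        if not line.startswith("import time:"):
            continue
        parts = line[len("import time:"):].split("|")
        if len(parts) != 3:
            continue
        try:
            cumulative_us = int(parts[1])
        except ValueError:
            continue  # header row, not a measurement
        rows.append((cumulative_us, parts[2].strip()))
    return sorted(rows, reverse=True)[:top]


if __name__ == "__main__":
    # Replace "json" with "dash" to reproduce the measurement discussed here.
    for cumulative_us, name in slowest_imports("json"):
        print(f"{cumulative_us / 1000:8.1f} ms  {name}")
```

The cumulative column includes the time of nested imports, so a package that pulls in many submodules shows up near the top even if its own code is cheap.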

As a quick hack, changing that line to

from _plotly_utils.utils import PlotlyJSONEncoder

reduces the import dash wall time to 0.6-0.7 seconds, i.e. around a 70% reduction in import time for dash. 🏎 Starting a Dash app suddenly felt much more instant (which is nice during development).

I guess the plotly Python package is free to change the "private" package _plotly_utils without a major release, so the hack above is not a good permanent solution for Dash. (Dash does not pin the plotly version today, so a new major release of plotly might already break previous Dash releases, but that is a separate issue 🙂.)

I guess the best solution might be to change plotly/__init__.py to not import all subpackages, such that the consumer of the plotly package can choose what to import. It is common to have to import subpackages explicitly, e.g.

python -c "import matplotlib; print(matplotlib.pyplot.__file__)"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
AttributeError: module 'matplotlib' has no attribute 'pyplot'

would not work, while

python -c "import matplotlib.pyplot; print(matplotlib.pyplot.__file__)"
[...]/python3.7/site-packages/matplotlib/pyplot.py

does. The plotly.py documentation also typically shows lines like

import plotly.graph_objects as go

which is an example of explicit subpackage import, and would work even if plotly/__init__.py does not import graph_objects into its namespace.
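The same behaviour can be reproduced without matplotlib or plotly. The sketch below builds a throwaway package with an empty `__init__.py`; the names `demo_pkg` and `sub` are made up for illustration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package (hypothetical name `demo_pkg`) with an empty __init__.py.
root = Path(tempfile.mkdtemp())
pkg_dir = root / "demo_pkg"
pkg_dir.mkdir()
(pkg_dir / "__init__.py").write_text("")
(pkg_dir / "sub.py").write_text("VALUE = 42\n")
sys.path.insert(0, str(root))
importlib.invalidate_caches()

import demo_pkg

# The parent package does not expose the submodule on its own ...
available_before = hasattr(demo_pkg, "sub")

# ... but an explicit import loads it and binds it on the parent.
import demo_pkg.sub

available_after = hasattr(demo_pkg, "sub")
print(available_before, available_after, demo_pkg.sub.VALUE)
```

So code that already writes the explicit form keeps working whether or not the parent package imports the subpackage for it.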

Related issue: #740

@chriddyp
Member

Yeah, doing a direct import would be an easy win.

We don't want to import from an underscore-prefixed (private) module; the official import would be

from plotly.utils import PlotlyJSONEncoder

However, it appears that this import is still slow.

I'll transfer this over to plotly.py and we'll see what we can do. This would certainly be much easier than #740.

@chriddyp transferred this issue from plotly/dash Feb 10, 2020
@chriddyp changed the title from "[Feature Request] Faster startup of Dash apps" to "[Feature Request] Allow direct import of utils.PlotlyJSONEncoder for faster Dash startup time" Feb 10, 2020
@chriddyp
Member

@nicolaskruchten @emmanuelle @jonmmease - If we could make from plotly.utils import PlotlyJSONEncoder faster by preventing plotly.graph_objs from being imported as a side effect, that would be a big win for Dash's hot-reload development speed.

Note - I'm just speculating that plotly.graph_objs is being imported as a side effect as doing from plotly.utils import PlotlyJSONEncoder in my terminal took about 4 seconds.

@anders-kiaer
Author

> Yeah, doing a direct import would be an easy win.

Thanks for your reply @chriddyp 👍

> I'm just speculating that plotly.graph_objs is being imported as a side effect as doing from plotly.utils import PlotlyJSONEncoder in my terminal took about 4 seconds.

Can confirm it is imported as a side effect. This is due to how the Python import system works (every __init__.py on the way down the package tree is executed), combined with these lines in plotly/__init__.py.
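A small self-contained sketch of that side effect, using a throwaway package rather than plotly itself. The names `eager_pkg`, `heavy` and `utils` are hypothetical stand-ins for plotly, plotly.graph_objs and plotly.utils.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Throwaway package; `eager_pkg`, `heavy` and `utils` are hypothetical names.
root = Path(tempfile.mkdtemp())
pkg_dir = root / "eager_pkg"
pkg_dir.mkdir()
# The package __init__.py eagerly imports a (pretend) expensive submodule.
(pkg_dir / "__init__.py").write_text("from . import heavy\n")
(pkg_dir / "heavy.py").write_text("LOADED = True\n")
(pkg_dir / "utils.py").write_text("class Encoder:\n    pass\n")
sys.path.insert(0, str(root))
importlib.invalidate_caches()

# Only the small utility class is requested ...
from eager_pkg.utils import Encoder

# ... yet eager_pkg/__init__.py ran first and pulled in the heavy submodule.
heavy_was_imported = "eager_pkg.heavy" in sys.modules
print(heavy_was_imported)
```

The importer of `eager_pkg.utils` has no way to opt out: the cost of `heavy` is paid as soon as anything inside the package is imported.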

@anders-kiaer
Author

anders-kiaer commented Feb 24, 2020

Maybe an easy and non-breaking change for plotly.py, which still gives the performance increase (especially for Dash hot reload) on Python 3.7+, could be to use the newly added module-level __getattr__ (see PEP 562 for details) to do lazy imports.

I.e. pseudocode for plotly/__init__.py:

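import importlib
import sys

# Subpackages to expose lazily; the full list is elided, as in the import below.
_LAZY_SUBPACKAGES = {"graph_objs", "tools"}

def __getattr__(name):
    # Perform lazy loading of `name` when asked for. See PEP 562 ("Rationale") for details.
    if name in _LAZY_SUBPACKAGES:
        return importlib.import_module("." + name, __name__)
    raise AttributeError("module {!r} has no attribute {!r}".format(__name__, name))

if sys.version_info < (3, 7):
    # Module-level `__getattr__` (PEP 562) is only available from Python 3.7,
    # so keep the direct imports as today for users on Python 3.6 (or older).
    from plotly import graph_objs, tools  # ... and the remaining subpackages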
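To check that this kind of lazy loading really defers the expensive import, here is a runnable sketch on a throwaway package (Python 3.7+ only). It is not plotly's implementation; `lazy_pkg`, `heavy` and `utils` are hypothetical names.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Source of the package __init__.py: submodules are imported on first access.
LAZY_INIT = '''
import importlib

_SUBMODULES = {"heavy", "utils"}

def __getattr__(name):
    # Called only when normal attribute lookup on the package fails (PEP 562).
    if name in _SUBMODULES:
        return importlib.import_module("." + name, __name__)
    raise AttributeError("module {!r} has no attribute {!r}".format(__name__, name))
'''

root = Path(tempfile.mkdtemp())
pkg_dir = root / "lazy_pkg"
pkg_dir.mkdir()
(pkg_dir / "__init__.py").write_text(LAZY_INIT)
(pkg_dir / "heavy.py").write_text("LOADED = True\n")
(pkg_dir / "utils.py").write_text("class Encoder:\n    pass\n")
sys.path.insert(0, str(root))
importlib.invalidate_caches()

# Importing the small utility no longer drags in the heavy submodule.
from lazy_pkg.utils import Encoder
heavy_loaded_early = "lazy_pkg.heavy" in sys.modules

# The first attribute access triggers the import through __getattr__.
import lazy_pkg
value = lazy_pkg.heavy.LOADED
heavy_loaded_on_access = "lazy_pkg.heavy" in sys.modules

print(heavy_loaded_early, heavy_loaded_on_access, value)
```

After the first access the import system binds the submodule as a regular attribute of the package, so `__getattr__` is not consulted again for that name.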

@nicolaskruchten
Contributor

We can definitely carve out a fast path for Dash just importing the encoder, but these gains will all be lost when people use PX or graph_objects directly. I think what we need is some sort of global setting to disable this auto-loading behaviour.

@chriddyp
Member

Or perhaps Dash forks the encoder and uses its own version.

@nicolaskruchten
Contributor

Right, but that will only give performance gains when folks use raw dicts and lists for making figures. We need a way to make Dash apps fast even when they use PX :)

@jonmmease
Contributor

Startup time will be greatly improved by #2368.

Thanks to @anders-kiaer for pointing out the potential of PEP 562. Baking this into our code generation class hierarchy really helps.

@anders-kiaer
Author

🎉 I would consider this issue solved now after #2368 and lazy imports (if someone uses 🐍 Python < 3.7 and wants the speedup, they should just update their Python minor version... Python 3.7 was released back in 2018, and Python 3.6 has its EOL/last security fix next year).

Thanks for implementing PEP 562 @jonmmease! Looking forward to testing it out in Dash 🚀

@chriddyp
Member

Thanks again @anders-kiaer for bringing this up with us, and steering us in the right direction with lazy loading!
