Very slow performance of `create_annotated_heatmap` for small dataset #2299

haphaeu · 2020-03-20T12:20:48Z

This takes over 4 s to run, which seems excessively long for such a small dataset:

import time
import numpy as np
import pandas as pd
import plotly.figure_factory as ff

dfi = pd.DataFrame(np.random.rand(3, 6))

t0 = time.time()

fig = ff.create_annotated_heatmap(
            z=dfi.values,
            x=dfi.columns.tolist(),
            y=dfi.index.tolist(),
            annotation_text=dfi.round(2).values,
        )

print(f'elapsed time {time.time()-t0:.3f} s')


nicolaskruchten · 2020-03-20T12:59:57Z

Most of this time is spent importing plotly.graph_objects... If you import this before you start timing, you'll find that the figure factory itself is pretty fast.

The import speed is a known issue that we chip away at but have no clear path to resolving today, see e.g. #740 or #2174
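One way to check where the time goes is to time the import on its own. The sketch below is only an illustration and not part of plotly: `time_import` is a hypothetical helper, and it gives a meaningful number only for a module that has not yet been imported in the current process.

```python
import importlib
import sys
import time


def time_import(module_name):
    """Import module_name and return (elapsed_seconds, was_already_loaded).

    If the module is already in sys.modules the import is a cache hit,
    so the elapsed time will be close to zero.
    """
    already_loaded = module_name in sys.modules
    t0 = time.perf_counter()
    importlib.import_module(module_name)
    elapsed = time.perf_counter() - t0
    return elapsed, already_loaded


# A stdlib module is used here so the sketch runs anywhere; substitute
# "plotly.graph_objects" to measure the import discussed in this thread.
elapsed, cached = time_import("json")
print(f"import took {elapsed:.3f} s (already loaded: {cached})")
```

Running this in a fresh interpreter with `"plotly.graph_objects"` as the argument should show whether the import accounts for most of the first-run delay.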

haphaeu · 2020-03-20T13:17:15Z

But the import statement is outside the timed region. Something else appears to be imported the first time create_annotated_heatmap is called.

This can be seen in the snippet below.

A further problem is that I'm using this within a Dash callback, and it seems to import these objects every time, making every graph update take 4 seconds...

import time
import numpy as np
import pandas as pd
import plotly.figure_factory as ff


def create_fig(dfi):
    return ff.create_annotated_heatmap(
            z=dfi.values,
            x=dfi.columns.tolist(),
            y=dfi.index.tolist(),
            annotation_text=dfi.round(2).values,
        )

        
dfi = pd.DataFrame(np.random.rand(3, 6))

t0 = time.time()
create_fig(dfi)
print(f'elapsed time {time.time()-t0:.3f} s')

t0 = time.time()
create_fig(dfi)
print(f'elapsed time {time.time()-t0:.3f} s')

Output:

elapsed time 3.568 s
elapsed time 0.182 s

nicolaskruchten · 2020-03-20T13:47:35Z

So in the output above the first call is slow, but the second one is quite fast... This should hold for the third, fourth, etc. Basically, once things are loaded the performance should be quite good. Admittedly, this is still really annoying for local development.
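The first-call versus later-call pattern can be measured with a small helper like the one below. This is only a sketch: `time_calls` and `slow_first_call` are hypothetical names, and the stand-in function simulates a one-off setup cost with `time.sleep` rather than calling plotly.

```python
import time


def time_calls(fn, repeats=3):
    """Call fn() `repeats` times and return the elapsed seconds of each call."""
    timings = []
    for _ in range(repeats):
        t0 = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - t0)
    return timings


_initialised = False


def slow_first_call():
    """Stand-in for a function that pays a one-off cost on its first call."""
    global _initialised
    if not _initialised:
        time.sleep(0.2)  # simulated lazy import / setup (an assumption)
        _initialised = True
    return sum(range(1000))


timings = time_calls(slow_first_call, repeats=3)
for i, t in enumerate(timings, start=1):
    print(f"call {i}: {t:.3f} s")
```

Replacing `slow_first_call` with `lambda: create_fig(dfi)` from the snippet above would reproduce the kind of measurement shown in this thread.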

jonmmease · 2020-04-10T18:31:44Z

Using Python 3.7 with the PR at #2368, the code snippet in the original post runs much faster on my workstation:

plotly 4.6: 0.328 s
PR: 0.046 s
7x speedup.

haphaeu · 2020-05-08T12:15:08Z

Insisting a little on this one due to other factors not mentioned above. It seems that performance when run from a Dash app is not only related to importing.

I've recreated the snippet above, running it both in bare Python/plotly and from within a Dash app.

There seems to be a 4 s overhead during the first run when things are imported. After that, heatmap creation takes 0.25 s to 0.3 s in bare Python, while from Dash it takes 1.25 s to 1.3 s, hence around 5x slower.

Bare python/plotly:

elapsed time 4.270 s
elapsed time 0.250 s
elapsed time 0.343 s
elapsed time 0.371 s
elapsed time 0.241 s

From Dash:

Update took 5.778 s
Update took 1.258 s
Update took 1.342 s
Update took 1.306 s
Update took 1.233 s
Update took 1.261 s

And this is the snippet being benchmarked from Dash. There are 4 dataframes (4 plots being created), and each dataframe is a flat 6 x 13.

    # Fragment from inside the Dash callback; limits, dfs and colorscale()
    # are defined elsewhere in the app and are not shown here.
    figs = []

    for limit, dfi in zip(limits, dfs):

        fig = ff.create_annotated_heatmap(
            z=dfi.values,
            x=dfi.columns.tolist(),
            y=dfi.index.tolist(),
            annotation_text=dfi.round(2).values,
            colorscale=colorscale(limit, dfi.max().max()),
        )
        fig.update_layout(
            xaxis=dict(title="Period [s]", dtick=1),
            yaxis=dict(title="Height [m]", dtick=0.25, autorange="reversed"),
            clickmode="event+select",
        )

        figs.append(fig)
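For reference, per-update timings like the "Update took ..." lines above can be collected by wrapping the callback in a timing decorator. The following is a minimal sketch under assumptions: `timed` and `update_figures` are hypothetical names, and the placeholder body stands in for the figure-building loop.

```python
import functools
import time


def timed(label):
    """Decorator that prints how long each call to the wrapped function took."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            t0 = time.perf_counter()
            try:
                return fn(*args, **kwargs)
            finally:
                # Runs whether fn returns normally or raises.
                print(f"{label} took {time.perf_counter() - t0:.3f} s")
        return wrapper
    return decorator


@timed("Update")
def update_figures(n_figs):
    # Placeholder for the figure-building loop; returns dummy values.
    return [i * i for i in range(n_figs)]


result = update_figures(4)
```

In a Dash app, the decorator would presumably sit between `@app.callback(...)` and the function definition, so that the timed wrapper is the function that gets registered.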

nicolaskruchten · 2020-05-08T12:32:02Z

@haphaeu can you confirm that these results were obtained using plotly version 4.7, which was just released and contains many performance enhancements?

haphaeu · 2020-05-08T17:39:59Z

No, those results were obtained with version 4.5.2. Well spotted.

Using the same conda environment, I've updated only plotly, so all other packages are the same.

Here's the re-run with plotly=4.7.1:

Bare python/plotly:

elapsed time 1.167 s
elapsed time 0.036 s
elapsed time 0.038 s
elapsed time 0.066 s
elapsed time 0.074 s

From Dash:

Update took 1.120 s
Update took 0.172 s
Update took 0.177 s
Update took 0.164 s
Update took 0.201 s
Update took 0.188 s

Indeed significantly faster.

Thanks and nicely done!

nicolaskruchten · 2020-05-08T18:01:27Z

Great! Is this running on Python 3.7? If not, upgrading to Python 3.7 might increase import performance further :)

haphaeu · 2020-05-08T18:14:42Z

This is Python 3.8

nicolaskruchten · 2020-05-08T18:31:37Z

Ah well, no free lunch there then :)

haphaeu closed this as completed May 8, 2020
