Fast image: add `binary_string` parameter to imshow #2691

emmanuelle · 2020-08-06T13:57:58Z

Work in progress!

This PR adds a use_binary_string parameter to imshow, which default value is

True for multichannel images (where the Image trace is always used)
False for single-channel images, in which case the Heatmap trace is used (when True, an Image trace is used)

I added another new parameter contrast_rescaling in order to reconcile the computation of zmin and zmax, which were computed differently for 2D/Heatmap and 3D/Image before (2D used the min and max of the array as Heatmap does, and 3D used 0 for the min and a max value deduced from the data type and the image range). The former behaviour corresponds to the image value of the newly introduced contrast_rescaling parameter, while dtype corresponds to the latter, and when the value is None it is set internally to ensure backwards compatibility with the current version.

To do

hovertemplate when using source
remove color from hover when data have been rescaled
update doc notebook

emmanuelle · 2020-08-06T14:45:37Z

The scikit-image code which is used here will be vendored later on, but for now I imported the function from scikit-image for faster development.

emmanuelle · 2020-08-06T17:40:32Z

.circleci/config.yml

@@ -405,6 +405,9 @@ jobs:
            if [ "${CIRCLE_BRANCH}" != "doc-prod" ]; then
              pip uninstall -y plotly
              cd ../packages/python/plotly
+              # To be removed after plotly.js release
+              pip install inflect black tox
+              python3 setup.py updateplotlyjsdev --devbranch fast-image


my idea here was to be able to take a look at the doc even when the corresponding feature is not released in plotly.js, but it does not work since the exported html asks for the regular plotly.js bundle so the feature is not available. Too bad! Or maybe I can find a way to link to another bundle...

nicolaskruchten · 2020-08-10T16:31:50Z

For the naming of contrast_rescaling how about z_bounds_mode and for the values how about "minmax" for the current heatmap logic and "heuristic" or "infer" or something like that for the current image logic?

Aside: I'm amused that basically we have two different sets of logic in this one function and we end up adding new parameters that have one value for one side and another value for the other side and the ability to force it either way :) Makes sense!

emmanuelle · 2020-08-14T12:33:14Z

I've benchmarked Pillow and pypng for different images and different compression levels. The code of the benchmarks is

import numpy as np
from skimage import data
from plotly.express._imshow import _array_to_b64str
from time import time
import pandas as pd
import plotly.express as px

img_rgb = data.astronaut()

sparse_img = np.zeros((512, 512, 3), dtype=np.uint8)
sparse_img[100:100, 150:-150] = 200

img_scientific = data.retina()

images = [img_rgb, img_scientific, sparse_img]
sizes = [im.size for im in images]

timings = []

for mode, img, size in zip(
        ['natural', 'scientific', 'sparse'],
        images,
        sizes):
    for backend in ['png', 'pil']:
        for compression_level in range(0, 10):
            t_start = time()
            _ = (_array_to_b64str(img, backend=backend, compression=compression_level))
            t_end = time()
            timings.append({
                'type':mode, 
                'compression':compression_level,
                'backend': backend,
                'time': t_end - t_start,
                'size_reduction': size/len(_)
                })

df = pd.DataFrame(timings)
fig_1 = px.scatter(
        df,
        x='time', y='size_reduction',
        color='backend',
        hover_data=['compression'],
        height=400, width=900,
        facet_col='type',
        title='size reduction vs. time for a natural image (astronaut)')
fig_1.update_yaxes(matches=None, showticklabels=True)
fig_1.update_xaxes(matches=None)
fig_1.show()

fig_2 = px.scatter(
        df,
        x='compression', y='size_reduction',
        color='backend',
        hover_data=['compression'],
        height=400, width=900,
        facet_col='type',
        title='size reduction vs. time for a natural image (astronaut)')
fig_2.update_yaxes(matches=None, showticklabels=True)
fig_2.update_xaxes(matches=None)
fig_2.show()

fig_3 = px.scatter(
        df,
        x='compression', y='time',
        color='backend',
        hover_data=['compression'],
        height=400, width=900,
        facet_col='type',
        title='size reduction vs. time for a natural image (astronaut)')
fig_3.update_yaxes(matches=None, showticklabels=True)
fig_3.update_xaxes(matches=None)
fig_3.show()

and the figures are

Pillow is more efficient in the sense that it reaches a higher level of compression for a given time (except for the very sparse image for which pypng is a little bit better). Probably we should use Pillow when it's installed. How about having a backend argument in imshow with values auto, png and pil? Ideally we would also give the possibility to control the compression_level parameter. The default value of pil is 6, which is a bit overkill for this benchmark (the best trade-off is 4 for these cases).

nicholas-esterer · 2020-08-14T12:42:18Z

I agree, it looks like Pillow is the bestest 👍
I'm all about benchmarks like this, this is great work!

emmanuelle · 2020-08-14T12:48:58Z

I agree, it looks like Pillow is the bestest +1
I'm all about benchmarks like this, this is great work!

Thanks! And quite easy with px once you've got all your data in a dataframe! Although it would have been great here to have an option within px.scatter to unmatch the axes and leave a little more space for the ticks (@nicolaskruchten is this a parameter which we should add? It could make it easier for people to use px instead of graph_objects).

nicolaskruchten · 2020-08-14T12:56:50Z

Great benchmark! I'd love to see this in a full Dash context with layout.images to weigh the tradeoff between png's speed and pillow's extra compression, especially after going through gzip compression and decompression on the other end. If it turns out that the optimum happens to be "less compression" then png's speed might make it a better fit. As we saw, the encoding-time was the biggest gain in this optimization, and the transfer time was not that important relatively speaking, so it could be that faster encoding dominates...

That said, I'm fine with a parameter that's auto/pillow/png also.

an option within px.scatter to unmatch the axes and leave a little more space for the ticks

This is pretty easy with .update_(x|y)axes(matches=None) and you can now leave more space for the ticks with facet_(row|col)_spacing ... this is all in the docs now too ;) https://plotly.com/python/facet-plots/

emmanuelle · 2020-08-14T13:10:58Z

Oh yes I had forgotten about facet_col_spacing. I remembered the example about independent axes, maybe we should add one sentence about spacing or a link to the example with spacing in the independent axes example.

Also, png is not really faster than pil, if you take a look at the compression factor (ie the size of the original image divided by the length of the b64 string) vs time. In a diagram of time vs "compression level" (which is some index but it's not completely clear what this is), then png is faster but it achieves a smaller size reduction.

And yes it's interesting to try in a Dash app. Related to this, my previous Dash app demo was a bit flawed I think because I had used a tiled image (the same image on a 4x4 grid) and apparently png compression is capable of optimizing this kind of images quite a lot. I'll use the retina image (https://scikit-image.org/docs/stable/auto_examples/data/plot_scientific.html#sphx-glr-auto-examples-data-plot-scientific-py) for a Dash app, it's quite big (1440^2) with some sparsity, so it's quite representative of CZI-relevant images.

nicolaskruchten · 2020-08-14T13:30:25Z

Yeah I know compression is an input and not an output but my point is that if the optimum is at level 0 then png dominates because it's faster... full dash benchmark will tell us :)

emmanuelle · 2020-08-17T20:49:39Z

I think the PR is ready for a quick/temporary review, while the "real review" will happen after the source feature has been released with plotly js.

doc/python/imshow.md

packages/python/plotly/plotly/express/_imshow.py

nicolaskruchten · 2020-08-31T13:55:30Z

Can we get the CI to pass plz? I think there's a dependency issue with either png or PIL

emmanuelle · 2020-08-31T14:53:17Z

Can we get the CI to pass plz? I think there's a dependency issue with either png or PIL

I think we can't get the CI to pass before the js release? Or am I missing something?

nicolaskruchten · 2020-08-31T17:20:48Z

I think we can't get the CI to pass before the js release?

in principle yes but some of the error messages here seem to be from something other than "can't find the validator" :)

nicolaskruchten · 2020-08-31T17:27:08Z

Just doing this results in no z or color in the hoverlabel, which is a bit of a downer... intentional?

import plotly.express as px
import numpy as np
img = np.arange(100).reshape((10, 10))
fig = px.imshow(img, binary_string=True)
fig.show()

emmanuelle · 2020-08-31T19:48:59Z

Just doing this results in no z or color in the hoverlabel, which is a bit of a downer... intentional?
import plotly.express as px
import numpy as np
img = np.arange(100).reshape((10, 10))
fig = px.imshow(img, binary_string=True)
fig.show()

Good question. It is intentional, but subject to debate. The idea is not to display z/color if the numerical value is different from the original array element, because it can be misleading.

If you do

import plotly.express as px
import numpy as np
img = np.arange(100).reshape((10, 10)).astype(np.uint8)
fig = px.imshow(img, binary_string=True, contrast_rescaling='infer')
fig.show()

then you will have the hover because there is no rescaling.

The alternative would be to always display z, and add an example in the imshow tutorial warning that it's not the same value and showing how to modify the hovertemplate.

nicolaskruchten · 2020-08-31T20:28:41Z

It is intentional, but subject to debate

Right, OK, I remember this now. The current behaviour is good, but I think should be mentioned in the docs if possible please.

nicolaskruchten · 2020-08-31T20:31:21Z

doc/python/imshow.md


 ```python
 import plotly.express as px
 from skimage import data
 img = data.astronaut()
 # Increase contrast by clipping the data range between 50 and 200
-fig = px.imshow(img, zmin=50, zmax=200)
+fig = px.imshow(img, zmin=50, zmax=200, binary_string=False)


we should explain why the binary_string=False is necessary here, and/or explain what happens if you don't do it... in this case the hover ends up different, right? We don't do it in the example immediately below, though, is that intentional?

I modified this example so that it does not use binary_string, which has not been introduced for imshow at this point of the tutorial, and then I added two examples about rescaling and hover in the section on imshow and binary_string.

emmanuelle · 2020-08-31T20:31:47Z

Some updates:

I vendored the pypng code (just one file! 2000+ LOC though)
for the CI I fixed some dependency problems in the test and the current failures come from the fact that the plotly js version is a "dirty" one.

nicolaskruchten · 2020-09-03T02:12:24Z

OK I fixed up the CI failures due to too-modern code in the inlined stuff for Python 2.7 and 3.5 tests to pass, and I updated Plotly.js and fixed up those failures in the docs. The remaining failures are CI artifacts that should go away on master so I also merged :)

@emmanuelle if you could add a changelog entry (straight onto master would be fine :) that'd be great!

emmanuelle added 8 commits June 12, 2020 16:29

Merge branch 'master' of https://github.com/plotly/plotly.py

f628e94

Merge branch 'master' of https://github.com/plotly/plotly.py

d2f8b72

Merge branch 'master' of https://github.com/plotly/plotly.py

3ef3764

Merge branch 'master' of https://github.com/plotly/plotly.py

902a98e

run updateplotlyjsdev using js fast-image branch

71b14c1

work in progress on imshow with source

1984998

added dev dep

e6feb33

added dev dep

e551672

emmanuelle commented Aug 6, 2020

View reviewed changes

emmanuelle added 2 commits August 6, 2020 23:03

more tests and docstring

34cc0de

range_color case

febdfaf

emmanuelle closed this Aug 7, 2020

emmanuelle reopened this Aug 7, 2020

emmanuelle added 3 commits August 11, 2020 09:33

values for contrast_rescaling are now minmax and infer

01195f8

vendored code from scikit-image for rescale_intensity

81ce9f6

different backends for b64 creation

d3d88b8

emmanuelle added 4 commits August 17, 2020 12:18

more tests and documented parameters in docstring

bd25f05

no color in hovertemplate when contrast has been modified

78d2454

modified tutorial

0752df7

modified requirements

34c1acf

emmanuelle requested a review from nicolaskruchten August 17, 2020 20:47

codegen

d3cb22b

nicolaskruchten reviewed Aug 31, 2020

View reviewed changes

doc/python/imshow.md Show resolved Hide resolved

nicolaskruchten reviewed Aug 31, 2020

View reviewed changes

packages/python/plotly/plotly/express/_imshow.py Outdated Show resolved Hide resolved

nicolaskruchten reviewed Aug 31, 2020

View reviewed changes

packages/python/plotly/plotly/express/_imshow.py Show resolved Hide resolved

emmanuelle added 2 commits August 31, 2020 17:02

add generated files

7b117dc

remove pypng dependency

4db5fe4

use rgba256 colormodel

63a4dc9

emmanuelle added 2 commits August 31, 2020 21:56

try to fix CI

29aab27

fix test

32ca251

nicolaskruchten reviewed Aug 31, 2020

View reviewed changes

emmanuelle and others added 8 commits September 2, 2020 11:01

improve examples

cdbfefe

add jpg compression option

03ce500

improve docstring

c0247d2

bump plotly.js to 1.55.1

ff6bd6e

Merge branch 'master' into fast-image

3640abc

blacken png

56494ba

remove f-string

cb72f88

fix png.py tests

96cebe3

nicolaskruchten merged commit 96cebe3 into master Sep 3, 2020

emmanuelle mentioned this pull request Sep 3, 2020

Add facet_col and animation_frame argument to imshow #2746

Merged

mkcor mentioned this pull request Sep 30, 2020

Example using 3D multichannel image. scikit-image/scikit-image#4946

Merged

emmanuelle deleted the fast-image branch December 1, 2020 13:46

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fast image: add `binary_string` parameter to imshow #2691

Fast image: add `binary_string` parameter to imshow #2691

emmanuelle commented Aug 6, 2020 •

edited

Loading

emmanuelle commented Aug 6, 2020

emmanuelle Aug 6, 2020

nicolaskruchten commented Aug 10, 2020

emmanuelle commented Aug 14, 2020

nicholas-esterer commented Aug 14, 2020

emmanuelle commented Aug 14, 2020

nicolaskruchten commented Aug 14, 2020

emmanuelle commented Aug 14, 2020

nicolaskruchten commented Aug 14, 2020

emmanuelle commented Aug 17, 2020

nicolaskruchten commented Aug 31, 2020

emmanuelle commented Aug 31, 2020

nicolaskruchten commented Aug 31, 2020

nicolaskruchten commented Aug 31, 2020

emmanuelle commented Aug 31, 2020

nicolaskruchten commented Aug 31, 2020

nicolaskruchten Aug 31, 2020

emmanuelle Sep 2, 2020

emmanuelle commented Aug 31, 2020

nicolaskruchten commented Sep 3, 2020

Fast image: add binary_string parameter to imshow #2691

Fast image: add binary_string parameter to imshow #2691

Conversation

emmanuelle commented Aug 6, 2020 • edited Loading

emmanuelle commented Aug 6, 2020

emmanuelle Aug 6, 2020

Choose a reason for hiding this comment

nicolaskruchten commented Aug 10, 2020

emmanuelle commented Aug 14, 2020

nicholas-esterer commented Aug 14, 2020

emmanuelle commented Aug 14, 2020

nicolaskruchten commented Aug 14, 2020

emmanuelle commented Aug 14, 2020

nicolaskruchten commented Aug 14, 2020

emmanuelle commented Aug 17, 2020

nicolaskruchten commented Aug 31, 2020

emmanuelle commented Aug 31, 2020

nicolaskruchten commented Aug 31, 2020

nicolaskruchten commented Aug 31, 2020

emmanuelle commented Aug 31, 2020

nicolaskruchten commented Aug 31, 2020

nicolaskruchten Aug 31, 2020

Choose a reason for hiding this comment

emmanuelle Sep 2, 2020

Choose a reason for hiding this comment

emmanuelle commented Aug 31, 2020

nicolaskruchten commented Sep 3, 2020

Fast image: add `binary_string` parameter to imshow #2691

Fast image: add `binary_string` parameter to imshow #2691

emmanuelle commented Aug 6, 2020 •

edited

Loading