Fast image: add `binary_string` parameter to imshow #2691
The scikit-image code which is used here will be vendored later on, but for now I imported the function from scikit-image for faster development. |
.circleci/config.yml
Outdated
```diff
@@ -405,6 +405,9 @@ jobs:
 if [ "${CIRCLE_BRANCH}" != "doc-prod" ]; then
 pip uninstall -y plotly
 cd ../packages/python/plotly
+# To be removed after plotly.js release
+pip install inflect black tox
+python3 setup.py updateplotlyjsdev --devbranch fast-image
```
my idea here was to be able to take a look at the doc even when the corresponding feature is not released in plotly.js, but it does not work since the exported html asks for the regular plotly.js bundle so the feature is not available. Too bad! Or maybe I can find a way to link to another bundle...
For the naming of Aside: I'm amused that basically we have two different sets of logic in this one function and we end up adding new parameters that have one value for one side and another value for the other side and the ability to force it either way :) Makes sense! |
I've benchmarked Pillow and pypng for different images and different compression levels. The code of the benchmarks is

```python
from time import time

import numpy as np
import pandas as pd
import plotly.express as px
from plotly.express._imshow import _array_to_b64str
from skimage import data

# Three test images: a natural photo, a scientific image and a sparse array
img_rgb = data.astronaut()
sparse_img = np.zeros((512, 512, 3), dtype=np.uint8)
# Note: 100:100 is an empty slice, so this assignment leaves the array all zeros
sparse_img[100:100, 150:-150] = 200
img_scientific = data.retina()
images = [img_rgb, img_scientific, sparse_img]
sizes = [im.size for im in images]

# Time the encoding for each image, backend and compression level
timings = []
for mode, img, size in zip(['natural', 'scientific', 'sparse'], images, sizes):
    for backend in ['png', 'pil']:
        for compression_level in range(0, 10):
            t_start = time()
            b64str = _array_to_b64str(
                img, backend=backend, compression=compression_level
            )
            t_end = time()
            timings.append({
                'type': mode,
                'compression': compression_level,
                'backend': backend,
                'time': t_end - t_start,
                'size_reduction': size / len(b64str),
            })
df = pd.DataFrame(timings)

fig_1 = px.scatter(
    df,
    x='time', y='size_reduction',
    color='backend',
    hover_data=['compression'],
    height=400, width=900,
    facet_col='type',
    title='size reduction vs. time')
fig_1.update_yaxes(matches=None, showticklabels=True)
fig_1.update_xaxes(matches=None)
fig_1.show()

fig_2 = px.scatter(
    df,
    x='compression', y='size_reduction',
    color='backend',
    hover_data=['compression'],
    height=400, width=900,
    facet_col='type',
    title='size reduction vs. compression level')
fig_2.update_yaxes(matches=None, showticklabels=True)
fig_2.update_xaxes(matches=None)
fig_2.show()

fig_3 = px.scatter(
    df,
    x='compression', y='time',
    color='backend',
    hover_data=['compression'],
    height=400, width=900,
    facet_col='type',
    title='time vs. compression level')
fig_3.update_yaxes(matches=None, showticklabels=True)
fig_3.update_xaxes(matches=None)
fig_3.show()
```

Pillow is more efficient in the sense that it reaches a higher level of compression for a given time (except for the very sparse image, for which pypng is a little bit better). We should probably use Pillow when it's installed. How about having a `backend` argument in imshow with values `auto`, `png` and `pil`? Ideally we would also give the possibility to control the |
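The `auto` / `png` / `pil` choice proposed in this comment could be sketched roughly as below. This is only an illustration: `resolve_backend` and its `pil_available` argument are hypothetical names, not part of plotly, and the fallback order simply assumes the preference for Pillow suggested by the benchmark.

```python
from importlib.util import find_spec


def resolve_backend(backend="auto", pil_available=None):
    """Map the proposed `backend` value to a concrete encoder name.

    Hypothetical helper for illustration only, not plotly's implementation.
    """
    if backend not in ("auto", "png", "pil"):
        raise ValueError("backend must be 'auto', 'png' or 'pil'")
    if pil_available is None:
        # Detect Pillow without importing it
        pil_available = find_spec("PIL") is not None
    if backend == "pil" and not pil_available:
        raise ImportError("backend='pil' requires Pillow to be installed")
    if backend == "auto":
        # Assumption: prefer Pillow when installed, otherwise fall back to pypng
        return "pil" if pil_available else "png"
    return backend
```

Passing `pil_available` explicitly makes the choice easy to test without touching the installed packages; leaving it as `None` checks the current environment.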
I agree, it looks like Pillow is the bestest 👍 |
Thanks! And quite easy with |
Great benchmark! I'd love to see this in a full Dash context with That said, I'm fine with a parameter that's
This is pretty easy with |
Oh yes I had forgotten about Also, And yes it's interesting to try in a Dash app. Related to this, my previous Dash app demo was a bit flawed I think because I had used a tiled image (the same image on a 4x4 grid) and apparently png compression is capable of optimizing this kind of images quite a lot. I'll use the retina image (https://scikit-image.org/docs/stable/auto_examples/data/plot_scientific.html#sphx-glr-auto-examples-data-plot-scientific-py) for a Dash app, it's quite big (1440^2) with some sparsity, so it's quite representative of CZI-relevant images. |
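To illustrate why a tiled image can flatter a compression benchmark, here is a minimal standard-library sketch. It uses `zlib` (PNG compression is DEFLATE-based) on raw bytes and ignores PNG filtering, so it is a simplified stand-in rather than a measurement of either backend. The byte strings and the `size_reduction` helper are made up for the example.

```python
import random
import zlib

random.seed(0)

# One noisy "tile" of 64 * 64 bytes, repeated 16 times as a stand-in
# for an image built from copies of the same tile
tile = bytes(random.randrange(256) for _ in range(64 * 64))
tiled = tile * 16

# Noise of the same total size, with no repeated structure
noisy = bytes(random.randrange(256) for _ in range(len(tiled)))


def size_reduction(raw, level=6):
    """Ratio of raw size to DEFLATE-compressed size."""
    return len(raw) / len(zlib.compress(raw, level))


print("tiled:", size_reduction(tiled))
print("noisy:", size_reduction(noisy))
```

The repeated tiles compress far better than the unstructured data, because every copy after the first can be encoded as back-references to earlier bytes.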
Yeah I know compression is an input and not an output but my point is that if the optimum is at level 0 then png dominates because it's faster... full dash benchmark will tell us :) |
I think the PR is ready for a quick/temporary review, while the "real review" will happen after the source feature has been released with plotly js. |
Can we get the CI to pass plz? I think there's a dependency issue with either png or PIL |
I think we can't get the CI to pass before the js release? Or am I missing something? |
in principle yes but some of the error messages here seem to be from something other than "can't find the validator" :) |
Just doing this results in no

```python
import plotly.express as px
import numpy as np
img = np.arange(100).reshape((10, 10))
fig = px.imshow(img, binary_string=True)
fig.show()
```
Good question. It is intentional, but subject to debate. The idea is not to display z/color if the numerical value is different from the original array element, because it can be misleading. If you do

```python
import plotly.express as px
import numpy as np
img = np.arange(100).reshape((10, 10)).astype(np.uint8)
fig = px.imshow(img, binary_string=True, contrast_rescaling='infer')
fig.show()
```

then you will have the hover because there is no rescaling. The alternative would be to always display z, and add an example in the imshow tutorial warning that it's not the same value and showing how to modify the hovertemplate.
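A small sketch of why rescaled values can be misleading in the hover, assuming the binary string stores 8-bit values obtained by a linear rescale of `[zmin, zmax]` to `[0, 255]`. The helper name `rescale_to_uint8` is hypothetical and this is not plotly's code.

```python
def rescale_to_uint8(values, zmin, zmax):
    """Linearly map [zmin, zmax] to [0, 255], clipping values outside the range.

    Hypothetical helper for illustration only.
    """
    if zmax <= zmin:
        raise ValueError("zmax must be greater than zmin")
    return [
        min(255, max(0, round((v - zmin) * 255 / (zmax - zmin))))
        for v in values
    ]


original = list(range(100))  # same values as np.arange(100)

# Range taken from the data: the stored values differ from the originals
minmax = rescale_to_uint8(original, min(original), max(original))

# Range taken from the uint8 dtype (0 to 255): the mapping is the identity
full = rescale_to_uint8(original, 0, 255)

print(minmax[:4], full[:4])
```

With the data-driven range, the stored numbers no longer match the array elements, which is the case where hiding z in the hover avoids confusion. With the full `uint8` range, nothing changes and the hover can show the true values.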
Right, OK, I remember this now. The current behaviour is good, but I think should be mentioned in the docs if possible please. |
doc/python/imshow.md
Outdated
````diff
 ```python
 import plotly.express as px
 from skimage import data
 img = data.astronaut()
 # Increase contrast by clipping the data range between 50 and 200
-fig = px.imshow(img, zmin=50, zmax=200)
+fig = px.imshow(img, zmin=50, zmax=200, binary_string=False)
````
we should explain why the `binary_string=False` is necessary here, and/or explain what happens if you don't do it... in this case the hover ends up different, right? We don't do it in the example immediately below, though, is that intentional?
I modified this example so that it does not use `binary_string`, which has not been introduced for imshow at this point of the tutorial, and then I added two examples about rescaling and hover in the section on imshow and `binary_string`.
Some updates:
|
OK I fixed up the CI failures due to too-modern code in the inlined stuff for Python 2.7 and 3.5 tests to pass, and I updated Plotly.js and fixed up those failures in the docs. The remaining failures are CI artifacts that should go away on @emmanuelle if you could add a changelog entry (straight onto master would be fine :) that'd be great! |
Work in progress!
This PR adds a `use_binary_string` parameter to imshow, whose default value is

I added another new parameter `contrast_rescaling` in order to reconcile the computation of `zmin` and `zmax`, which were computed differently for 2D/Heatmap and 3D/Image before (2D used the min and max of the array as Heatmap does, and 3D used 0 for the min and a max value deduced from the data type and the image range). The former behaviour corresponds to the `image` value of the newly introduced `contrast_rescaling` parameter, while `dtype` corresponds to the latter, and when the value is None it is set internally to ensure backwards compatibility with the current version.

To do