Use plotly.js base64 API to store and pass typed arrays declared by numpy, pandas, etc. #4470

Merged
merged 128 commits into from
Oct 21, 2024
1f3e605
revisit validator
archmoj Oct 25, 2023
a627c9e
pass numpy conversions as typed array spec
archmoj Oct 25, 2023
c363f3e
adjust validators
archmoj Nov 2, 2023
61aa4ea
add b64 file for tests
archmoj Jan 9, 2024
730c0dd
adjust test_dataarray_validator.py
archmoj Jan 9, 2024
7c31dc1
adjust test_pandas_series_input.py
archmoj Jan 9, 2024
d4c3162
adjust test_xarray_input.py
archmoj Jan 9, 2024
d1b4706
adjust test_figure_factory.py
archmoj Jan 9, 2024
2b33199
adjust test_imshow.py
archmoj Jan 9, 2024
f2c8d66
adjust test_px.py
archmoj Jan 9, 2024
b4d23b5
adjust test_px_functions.py
archmoj Jan 9, 2024
03aa9e0
adjust test_px_input.py
archmoj Jan 9, 2024
4514e92
adjust test_px_wide.py
archmoj Jan 9, 2024
048455e
adjust test_trendline.py
archmoj Jan 9, 2024
520e9bf
adjust test_figure_factory.py
archmoj Jan 9, 2024
c48f3ce
adjust test_utils.py
archmoj Jan 9, 2024
dcb6b91
skip test_fast_track_finite_arrays
archmoj Nov 15, 2023
9b1cbfd
skip test_violin_fig on CI
archmoj Nov 15, 2023
bd7e9dc
skip few mocks in compare pandas v1 vs v2
archmoj Dec 21, 2023
124f12d
remove clean_float which is not necessary
archmoj Dec 21, 2023
82c8199
add examples using base64
archmoj Jan 9, 2024
f200767
Update packages/python/plotly/_plotly_utils/basevalidators.py
archmoj Jan 11, 2024
5fb9ee3
Merge remote-tracking branch 'origin/master' into pass-b64_dev
archmoj Feb 29, 2024
b31dc27
Merge branch 'master' into pass-b64
archmoj Mar 15, 2024
0323215
Merge branch 'master' into pass-b64
archmoj Mar 22, 2024
99ebbbc
Merge branch 'master' into pass-b64
archmoj Jul 8, 2024
7c7ef30
Merge branch 'master' into pass-b64
archmoj Jul 24, 2024
f5df6db
also check for dtype in is_typed_array_spec function
archmoj Jul 24, 2024
f332922
remove print
archmoj Jul 24, 2024
5947221
Add performance test for b64
marthacryan Jul 26, 2024
62d9aa7
Add tests for size
marthacryan Jul 26, 2024
e5c24fe
Add test for array_ok and b64 together in IntegerValidator:
marthacryan Jul 26, 2024
8430c52
Black
marthacryan Jul 26, 2024
baeedc9
Change the time difference to be larger between b64 and raw array
marthacryan Jul 30, 2024
4f63296
Add random seed
marthacryan Jul 30, 2024
a566543
Change numpy array to python list before comparison
marthacryan Jul 30, 2024
3d63fa2
Remove unnecessary casting to np array
marthacryan Jul 30, 2024
7fbb701
specify width and height and fix logic of time comparison
marthacryan Jul 30, 2024
6e53e51
Add hard-coded margins
marthacryan Jul 30, 2024
dd1aba8
Add uint8 and float32 tests
marthacryan Jul 30, 2024
b6f9d14
Update performance margin to be a little smaller
marthacryan Jul 30, 2024
555d960
Black
marthacryan Jul 30, 2024
b301d99
Fix size performance tests and add graph object tests
marthacryan Jul 31, 2024
4323c28
Remove print statements
marthacryan Jul 31, 2024
c1e6728
Add numpy as a requirement for core tests
marthacryan Jul 31, 2024
7822635
Black
marthacryan Jul 31, 2024
58d4844
update requirements for python 3.12
marthacryan Jul 31, 2024
43de1cb
Update packages/python/plotly/plotly/tests/test_core/test_graph_objs/…
marthacryan Aug 1, 2024
1105328
Update names
marthacryan Aug 1, 2024
65f0dad
Update variables used in tests
marthacryan Aug 1, 2024
da100bc
Lower threshold for passing
marthacryan Aug 1, 2024
554f5cb
Use different version of setuptools
marthacryan Aug 1, 2024
024f3c1
Black
marthacryan Aug 1, 2024
8a051a3
Update tests to remove conversion to base64 before passing numpy arrays
marthacryan Aug 1, 2024
ece3e3d
remove setuptools from requirements
marthacryan Aug 1, 2024
0675f5b
Add setup tools install before requirements
marthacryan Aug 2, 2024
2fc29f8
Remove pin on numpy version
marthacryan Aug 2, 2024
d8924c5
Try removing the setuptools from config
marthacryan Aug 2, 2024
e1f91cd
Update performance thresholds
marthacryan Aug 2, 2024
98f2541
Parametrize functions and lower performance thresholds
marthacryan Aug 2, 2024
3702686
Code format
marthacryan Aug 2, 2024
4932cdb
Remove px tests (duplicates)
marthacryan Aug 2, 2024
ddbc3f1
Remove px tests (duplicates)
marthacryan Aug 2, 2024
0b83ebd
Add back in max_value and parameterize the count
marthacryan Aug 2, 2024
dabbcb8
Remove numpy requirement after moving performance tests back to optional
marthacryan Aug 2, 2024
6c01e6a
Use scattergl instead of scatter
marthacryan Aug 2, 2024
a056d7e
Add verbose flag to debug ci
marthacryan Aug 2, 2024
6d82e48
Only run performance tests for debugging
marthacryan Aug 2, 2024
8823a8c
Try commenting out all but one test
marthacryan Aug 2, 2024
25f6ccf
Print pio.renderers to debug
marthacryan Aug 2, 2024
b931fd5
Debug
marthacryan Aug 2, 2024
addfd47
Try rendering as png
marthacryan Aug 2, 2024
ef51cce
Add back in other tests and update renderer default
marthacryan Aug 2, 2024
3142c62
Black
marthacryan Aug 2, 2024
72e9ad0
Update failing performance threshold
marthacryan Aug 2, 2024
2d4ad05
Update failing performance threshold
marthacryan Aug 2, 2024
9d3b50d
Update thresholds
marthacryan Aug 2, 2024
63335d2
Update thresholds
marthacryan Aug 2, 2024
644992f
Merge pull request #4695 from marthacryan/add-tests-b64
marthacryan Aug 2, 2024
0f68aff
Add validator test to basetraces validator
marthacryan Aug 5, 2024
2a81d2a
Add more validator tests
marthacryan Aug 5, 2024
1c0b48e
black
marthacryan Aug 5, 2024
e500e38
Add more base64 array_ok tests
marthacryan Aug 7, 2024
c2453d3
Add other int types to integer validator tests
marthacryan Aug 7, 2024
c8d35b4
Add more integer types to the validation
marthacryan Aug 7, 2024
8861451
black
marthacryan Aug 7, 2024
8afa4d8
remove unused imports
marthacryan Aug 12, 2024
69729db
Remove unnecessary usage of numpy dtypes to prevent throwing error
marthacryan Aug 12, 2024
c52e9f4
Merge pull request #4707 from plotly/validator_tests
marthacryan Aug 12, 2024
c6390f4
Add test for geojson not converting to b64
marthacryan Aug 23, 2024
a3940ee
Simplify tests
marthacryan Aug 26, 2024
259d509
Add tests for layers and range keys
marthacryan Aug 27, 2024
d201b58
Code format
marthacryan Aug 27, 2024
f018291
Update packages/python/plotly/plotly/tests/test_optional/test_graph_o…
marthacryan Aug 28, 2024
8d7edf6
Merge branch 'master' of github.com:plotly/plotly.py into pass-b64
marthacryan Aug 28, 2024
d59ceb0
Merge branch 'pass-b64' of github.com:plotly/plotly.py into add-skipp…
marthacryan Aug 28, 2024
40166bb
Potential fix to conversion bug
marthacryan Aug 28, 2024
17b531c
Refactor logic to be clearer
marthacryan Aug 30, 2024
2fa09e5
remove todo
marthacryan Aug 30, 2024
4452868
Black
marthacryan Aug 30, 2024
02fb3a4
Merge pull request #4727 from plotly/add-skipped-key-tests
marthacryan Aug 30, 2024
a8995b7
Merge branch 'master' of github.com:plotly/plotly.py into pass-b64
marthacryan Aug 30, 2024
12ff7f3
remove failing orca tests to prevent confusion while waiting on updat…
marthacryan Aug 30, 2024
ca4340b
Remove another part of config that we're removing to prevent CI failure
marthacryan Aug 30, 2024
dd9379a
Add base64 to the changelog
marthacryan Aug 30, 2024
9362db3
Remove examples that over-complicate the usage of base64 spec
marthacryan Sep 17, 2024
6364d4e
Remove base64 documentation
marthacryan Sep 17, 2024
61e9178
Convert base64 in validate_coerce_fig_to_dict instead of validate_coerce
marthacryan Oct 7, 2024
066564e
Update logic to be recursive
marthacryan Oct 10, 2024
fb036c7
Move conversion to to_dict function
marthacryan Oct 10, 2024
4b289e9
Fix import path
marthacryan Oct 10, 2024
aabfa6e
Consolidate import statemenets
marthacryan Oct 10, 2024
de2bcb5
Merge pull request #4784 from plotly/b64-before-render
archmoj Oct 15, 2024
0d0dad2
Revert changes to validator tests
marthacryan Oct 15, 2024
9c5d112
Revert changes to tests that check data field
marthacryan Oct 15, 2024
5e50d8c
Merge with master
marthacryan Oct 15, 2024
8fff9c5
Remove performance tests
marthacryan Oct 15, 2024
bd0a2d3
Update base64 tests to reflect new approach
marthacryan Oct 15, 2024
b3ed838
Fix failing tests
marthacryan Oct 16, 2024
7bd7993
Merge with master
marthacryan Oct 16, 2024
cd8e0be
Revert changes to to_json_plotly tests
marthacryan Oct 16, 2024
f60b122
revert changes to percy compare pandas
marthacryan Oct 17, 2024
9d6b0c7
Remove unused util
marthacryan Oct 17, 2024
960adb9
Revert changes to validators
marthacryan Oct 17, 2024
12ab42a
Revert changes to circleci config
marthacryan Oct 17, 2024
59fe206
Address review
marthacryan Oct 17, 2024
60c73a8
fix doctsring
marthacryan Oct 17, 2024
f481af7
update skipped keys to include layers
marthacryan Oct 17, 2024
4 changes: 4 additions & 0 deletions CHANGELOG.md
@@ -2,6 +2,10 @@
All notable changes to this project will be documented in this file.
This project adheres to [Semantic Versioning](http://semver.org/).

### Updated

- Updated plotly.py to use base64 encoding of arrays in plotly JSON to improve performance.

## [5.24.1] - 2024-09-12

### Updated
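As a rough illustration of why a base64 payload can be cheaper than a JSON list of numbers, the sketch below compares the two text sizes for 1,000 float64 values using only the standard library. It is not a benchmark of plotly.py: the sample values are arbitrary, and `struct` is used here as a stand-in for the numpy buffer that the PR encodes.

```python
import base64
import json
import struct

values = [i / 7 for i in range(1000)]  # arbitrary sample data

# Plain JSON: every number is written out as decimal text.
as_list = json.dumps(values)

# Typed-array style: 8 raw bytes per float64, then base64 (4 chars per 3 bytes).
raw = struct.pack("<%dd" % len(values), *values)
as_spec = json.dumps(
    {"dtype": "f8", "bdata": base64.b64encode(raw).decode("ascii")}
)

print(len(as_list), len(as_spec))
```

The ratio depends on the data: full-precision floats tend to shrink noticeably, while short integers may already be compact as text.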
106 changes: 105 additions & 1 deletion packages/python/plotly/_plotly_utils/utils.py
@@ -1,11 +1,115 @@
import base64
import decimal
import json as _json
import sys
import re
from functools import reduce

from _plotly_utils.optional_imports import get_module
from _plotly_utils.basevalidators import ImageUriValidator
from _plotly_utils.basevalidators import (
    ImageUriValidator,
    copy_to_readonly_numpy_array,
    is_homogeneous_array,
)


int8min = -128
int8max = 127
int16min = -32768
int16max = 32767
int32min = -2147483648
int32max = 2147483647

uint8max = 255
uint16max = 65535
uint32max = 4294967295

plotlyjsShortTypes = {
    "int8": "i1",
    "uint8": "u1",
    "int16": "i2",
    "uint16": "u2",
    "int32": "i4",
    "uint32": "u4",
    "float32": "f4",
    "float64": "f8",
}


def to_typed_array_spec(v):
    """
    Convert numpy array to plotly.js typed array spec
    If not possible return the original value
    """
    v = copy_to_readonly_numpy_array(v)

    np = get_module("numpy", should_load=False)
    if not np or not isinstance(v, np.ndarray):
        return v

    dtype = str(v.dtype)

    # convert default Big Ints until we could support them in plotly.js
    if dtype == "int64":
        max = v.max()
        min = v.min()
        if max <= int8max and min >= int8min:
            v = v.astype("int8")
        elif max <= int16max and min >= int16min:
            v = v.astype("int16")
        elif max <= int32max and min >= int32min:
            v = v.astype("int32")
        else:
            return v

    elif dtype == "uint64":
        max = v.max()
        min = v.min()
        if max <= uint8max and min >= 0:
            v = v.astype("uint8")
        elif max <= uint16max and min >= 0:
            v = v.astype("uint16")
        elif max <= uint32max and min >= 0:
            v = v.astype("uint32")
        else:
            return v

    dtype = str(v.dtype)

    if dtype in plotlyjsShortTypes:
        arrObj = {
            "dtype": plotlyjsShortTypes[dtype],
            "bdata": base64.b64encode(v).decode("ascii"),
        }

        if v.ndim > 1:
            arrObj["shape"] = str(v.shape)[1:-1]

        return arrObj

    return v


def is_skipped_key(key):
    """
    Return whether the key is skipped for conversion to the typed array spec
    """
    skipped_keys = ["geojson", "layer", "layers", "range"]
    return any(skipped_key == key for skipped_key in skipped_keys)


def convert_to_base64(obj):
    if isinstance(obj, dict):
        for key, value in obj.items():
            if is_skipped_key(key):
                continue
            elif is_homogeneous_array(value):
                obj[key] = to_typed_array_spec(value)
            else:
                convert_to_base64(value)
    elif isinstance(obj, list) or isinstance(obj, tuple):
        for value in obj:
            convert_to_base64(value)


def cumsum(x):
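
The `dtype`/`bdata` object built by `to_typed_array_spec` can be illustrated without numpy. The following is a minimal sketch, assuming little-endian packing via the standard `struct` module as a stand-in for the array buffer that `base64.b64encode(v)` reads. The helper names `encode_ints` and `decode_spec` are hypothetical and not part of plotly.py. It mirrors the signed-integer narrowing in the diff: the narrowest of `i1`, `i2`, `i4` that holds every value is chosen, and input that does not fit in 32 bits is returned unchanged.

```python
import base64
import struct

# (short dtype name, struct format code, min, max), narrowest first.
# The short names match the plotlyjsShortTypes table in the diff.
SIGNED_TYPES = [
    ("i1", "b", -128, 127),
    ("i2", "h", -32768, 32767),
    ("i4", "i", -2147483648, 2147483647),
]


def encode_ints(values):
    """Hypothetical helper: pack a list of ints into a {dtype, bdata} dict.

    Returns the input unchanged when it is empty or when no type of
    32 bits or fewer can hold every value.
    """
    if not values:
        return values
    lo, hi = min(values), max(values)
    for short_name, code, type_min, type_max in SIGNED_TYPES:
        if type_min <= lo and hi <= type_max:
            # "<" forces little-endian byte order with no padding.
            raw = struct.pack("<%d%s" % (len(values), code), *values)
            return {
                "dtype": short_name,
                "bdata": base64.b64encode(raw).decode("ascii"),
            }
    return values


def decode_spec(spec):
    """Hypothetical inverse of encode_ints, used to check the round trip."""
    code = {name: c for name, c, _, _ in SIGNED_TYPES}[spec["dtype"]]
    raw = base64.b64decode(spec["bdata"])
    count = len(raw) // struct.calcsize("<" + code)
    return list(struct.unpack("<%d%s" % (count, code), raw))


small = encode_ints([1, 2, 3])       # fits in int8
wide = encode_ints([1, 70000])       # needs int32
too_big = encode_ints([1, 2**40])    # left as a plain list
```

Here `small` is `{"dtype": "i1", "bdata": "AQID"}`, `decode_spec(wide)` gives back `[1, 70000]`, and `too_big` is the original list, matching the `return v` fallback in the diff.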
4 changes: 4 additions & 0 deletions packages/python/plotly/plotly/basedatatypes.py
@@ -15,6 +15,7 @@
    display_string_positions,
    chomp_empty_strings,
    find_closest_string,
    convert_to_base64,
)
from _plotly_utils.exceptions import PlotlyKeyError
from .optional_imports import get_module
@@ -3310,6 +3311,9 @@ def to_dict(self):
        if frames:
            res["frames"] = frames

        # Add base64 conversion before sending to the front-end
        convert_to_base64(res)

        return res

    def to_plotly_json(self):
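
The in-place walk that `to_dict` now triggers can be sketched in isolation. This is a simplified illustration under stated assumptions, not the plotly.py implementation: `array.array` stands in for the numpy arrays that `is_homogeneous_array` detects, and `walk` and `to_spec` are hypothetical names. It shows the two behaviors the diff relies on: arrays under ordinary keys are replaced by a `dtype`/`bdata` dict, while everything beneath a skipped key such as `geojson` or `range` is left untouched.

```python
import base64
from array import array

# Same key names as is_skipped_key in the diff.
SKIPPED_KEYS = {"geojson", "layer", "layers", "range"}


def to_spec(arr):
    """Hypothetical stand-in for to_typed_array_spec, float64 only."""
    # array.tobytes() uses native byte order; little-endian is assumed here.
    return {
        "dtype": "f8",
        "bdata": base64.b64encode(arr.tobytes()).decode("ascii"),
    }


def walk(obj):
    """Replace arrays in place, skipping subtrees under SKIPPED_KEYS."""
    if isinstance(obj, dict):
        for key, value in obj.items():
            if key in SKIPPED_KEYS:
                continue  # the whole subtree stays as-is
            elif isinstance(value, array):
                obj[key] = to_spec(value)  # existing key, so dict size is unchanged
            else:
                walk(value)
    elif isinstance(obj, (list, tuple)):
        for value in obj:
            walk(value)


fig = {
    "data": [{"type": "scatter", "y": array("d", [1.0, 2.0])}],
    "layout": {"xaxis": {"range": array("d", [0.0, 1.0])}},
}
walk(fig)
```

After the call, `fig["data"][0]["y"]` is a dict with `dtype` `"f8"`, while `fig["layout"]["xaxis"]["range"]` is still the original array, which is the behavior the `range` test in this PR checks on the real figure.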
@@ -0,0 +1,84 @@
import json
from unittest import TestCase
import numpy as np
from plotly.tests.test_optional.optional_utils import NumpyTestUtilsMixin
import plotly.graph_objs as go


class TestShouldNotUseBase64InUnsupportedKeys(NumpyTestUtilsMixin, TestCase):
    def test_np_geojson(self):
        normal_coordinates = [
            [
                [-87, 35],
                [-87, 30],
                [-85, 30],
                [-85, 35],
            ]
        ]

        numpy_coordinates = np.array(normal_coordinates)

        data = [
            {
                "type": "choropleth",
                "locations": ["AL"],
                "featureidkey": "properties.id",
                "z": np.array([10]),
                "geojson": {
                    "type": "Feature",
                    "properties": {"id": "AL"},
                    "geometry": {"type": "Polygon", "coordinates": numpy_coordinates},
                },
            }
        ]

        fig = go.Figure(data=data)

        assert (
            json.loads(fig.to_json())["data"][0]["geojson"]["geometry"]["coordinates"]
            == normal_coordinates
        )

    def test_np_layers(self):
        layout = {
            "mapbox": {
                "layers": [
                    {
                        "sourcetype": "geojson",
                        "type": "line",
                        "line": {"dash": np.array([2.5, 1])},
                        "source": {
                            "type": "FeatureCollection",
                            "features": [
                                {
                                    "type": "Feature",
                                    "geometry": {
                                        "type": "LineString",
                                        "coordinates": np.array(
                                            [[0.25, 52], [0.75, 50]]
                                        ),
                                    },
                                }
                            ],
                        },
                    },
                ],
                "center": {"lon": 0.5, "lat": 51},
            },
        }
        data = [{"type": "scattermapbox"}]

        fig = go.Figure(data=data, layout=layout)

        assert (fig.layout["mapbox"]["layers"][0]["line"]["dash"] == (2.5, 1)).all()

        assert json.loads(fig.to_json())["layout"]["mapbox"]["layers"][0]["source"][
            "features"
        ][0]["geometry"]["coordinates"] == [[0.25, 52], [0.75, 50]]

    def test_np_range(self):
        layout = {"xaxis": {"range": np.array([0, 1])}}

        fig = go.Figure(data=[{"type": "scatter"}], layout=layout)

        assert json.loads(fig.to_json())["layout"]["xaxis"]["range"] == [0, 1]
@@ -25,15 +25,15 @@ def _compare_figures(go_trace, px_fig):
def test_pie_like_px():
    # Pie
    labels = ["Oxygen", "Hydrogen", "Carbon_Dioxide", "Nitrogen"]
    values = [4500, 2500, 1053, 500]
    values = np.array([4500, 2500, 1053, 500])

    fig = px.pie(names=labels, values=values)
    trace = go.Pie(labels=labels, values=values)
    _compare_figures(trace, fig)

    labels = ["Eve", "Cain", "Seth", "Enos", "Noam", "Abel", "Awan", "Enoch", "Azura"]
    parents = ["", "Eve", "Eve", "Seth", "Seth", "Eve", "Eve", "Awan", "Eve"]
    values = [10, 14, 12, 10, 2, 6, 6, 4, 4]
    values = np.array([10, 14, 12, 10, 2, 6, 6, 4, 4])
    # Sunburst
    fig = px.sunburst(names=labels, parents=parents, values=values)
    trace = go.Sunburst(labels=labels, parents=parents, values=values)
@@ -45,7 +45,7 @@ def test_pie_like_px():

    # Funnel
    x = ["A", "B", "C"]
    y = [3, 2, 1]
    y = np.array([3, 2, 1])
    fig = px.funnel(y=y, x=x)
    trace = go.Funnel(y=y, x=x)
    _compare_figures(trace, fig)
@@ -372,38 +372,6 @@ def test_invalid_encode_exception(self):
        with self.assertRaises(TypeError):
            _json.dumps({"a": {1}}, cls=utils.PlotlyJSONEncoder)

    def test_fast_track_finite_arrays(self):
Contributor Author
@marthacryan
Could we revert this change now, or should we drop the test?

Collaborator
I removed this because it was failing. Presumably it's just faster now?

        # if NaN or Infinity is found in the json dump
        # of a figure, it is decoded and re-encoded to replace these values
        # with null. This test checks that NaN and Infinity values are
        # indeed converted to null, and that the encoding of figures
        # without inf or nan is faster (because we can avoid decoding
        # and reencoding).
        z = np.random.randn(100, 100)
        x = np.arange(100.0)
        fig_1 = go.Figure(go.Heatmap(z=z, x=x))
        t1 = time()
        json_str_1 = _json.dumps(fig_1, cls=utils.PlotlyJSONEncoder)
        t2 = time()
        x[0] = np.nan
        x[1] = np.inf
        fig_2 = go.Figure(go.Heatmap(z=z, x=x))
        t3 = time()
        json_str_2 = _json.dumps(fig_2, cls=utils.PlotlyJSONEncoder)
        t4 = time()
        assert t2 - t1 < t4 - t3
        assert "null" in json_str_2
        assert "NaN" not in json_str_2
        assert "Infinity" not in json_str_2
        x = np.arange(100.0)
        fig_3 = go.Figure(go.Heatmap(z=z, x=x))
        fig_3.update_layout(title_text="Infinity")
        t5 = time()
        json_str_3 = _json.dumps(fig_3, cls=utils.PlotlyJSONEncoder)
        t6 = time()
        assert t2 - t1 < t6 - t5
        assert "Infinity" in json_str_3


class TestNumpyIntegerBaseType(TestCase):
    def test_numpy_integer_import(self):