Causes circular reference in some DataFrames #74

Open
afshin opened this issue Mar 22, 2022 · 3 comments
Labels
bug Something isn't working

Comments


afshin commented Mar 22, 2022

Some instances of pandas.DataFrame cannot be displayed in a notebook if beakerx_tabledisplay has been loaded.

Here is a pickled DataFrame that exhibits this behavior: circular.pickle.zip

Example using circular.pickle

import pandas
import pickle
frame = None
with open('circular.pickle', 'rb') as pickled:
    frame = pickle.load(pickled)
import beakerx_tabledisplay
display(frame)

Behavior (video)

circular.mov
afshin added the bug label Mar 22, 2022
Author

afshin commented Mar 22, 2022

Here is the error it generates:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
File ~/anaconda3/envs/beakerx/lib/python3.10/site-packages/IPython/core/formatters.py:921, in IPythonDisplayFormatter.__call__(self, obj)
    919 method = get_real_method(obj, self.print_method)
    920 if method is not None:
--> 921     method()
    922     return True

File ~/anaconda3/envs/beakerx/lib/python3.10/site-packages/beakerx_tabledisplay/table_display_runtim.py:22, in TableDisplayWrapper.__get__.<locals>.f()
     21 def f():
---> 22     display_html(TableDisplay(model_instance))

File ~/anaconda3/envs/beakerx/lib/python3.10/site-packages/beakerx_tabledisplay/tabledisplay.py:297, in TableDisplay.__init__(self, *args, **kwargs)
    295 super(TableDisplay, self).__init__(**kwargs)
    296 self.chart = Table(*args, **kwargs)
--> 297 self.model = self.chart.transform()
    298 self.on_msg(self.handle_msg)
    299 self.details = None

File ~/anaconda3/envs/beakerx/lib/python3.10/site-packages/beakerx_tabledisplay/tabledisplay.py:239, in Table.transform(self)
    237 def transform(self):
    238     if TableDisplay.loadingMode == "ALL":
--> 239         return super(Table, self).transform()
    240     else:
    241         start_index = self.startIndex

File ~/anaconda3/envs/beakerx/lib/python3.10/site-packages/beakerx_base/utils.py:75, in BaseObject.transform(self)
     74 def transform(self):
---> 75     model = json.dumps(self, cls=ObjectEncoder)
     76     return json.loads(model)

File ~/anaconda3/envs/beakerx/lib/python3.10/json/__init__.py:238, in dumps(obj, skipkeys, ensure_ascii, check_circular, allow_nan, cls, indent, separators, default, sort_keys, **kw)
    232 if cls is None:
    233     cls = JSONEncoder
    234 return cls(
    235     skipkeys=skipkeys, ensure_ascii=ensure_ascii,
    236     check_circular=check_circular, allow_nan=allow_nan, indent=indent,
    237     separators=separators, default=default, sort_keys=sort_keys,
--> 238     **kw).encode(obj)

File ~/anaconda3/envs/beakerx/lib/python3.10/json/encoder.py:199, in JSONEncoder.encode(self, o)
    195         return encode_basestring(o)
    196 # This doesn't pass the iterator directly to ''.join() because the
    197 # exceptions aren't as detailed.  The list call should be roughly
    198 # equivalent to the PySequence_Fast that ''.join() would do.
--> 199 chunks = self.iterencode(o, _one_shot=True)
    200 if not isinstance(chunks, (list, tuple)):
    201     chunks = list(chunks)

File ~/anaconda3/envs/beakerx/lib/python3.10/json/encoder.py:257, in JSONEncoder.iterencode(self, o, _one_shot)
    252 else:
    253     _iterencode = _make_iterencode(
    254         markers, self.default, _encoder, self.indent, floatstr,
    255         self.key_separator, self.item_separator, self.sort_keys,
    256         self.skipkeys, _one_shot)
--> 257 return _iterencode(o, 0)

ValueError: Circular reference detected

afshin changed the title from "Causes a causes circular reference in some DataFrames" to "Causes circular reference in some DataFrames" Mar 28, 2022
@fcollonval
Contributor

The root cause is actually in beakerx_base JSON custom encoder:

https://github.com/twosigma/beakerx_base/blame/master/beakerx_base/utils.py#L188

The data frame has data of type datetime.date, which is not supported by the default encoder. The custom encoder is also incorrect: default should not call itself, but should instead return a serializable object or fall back on the base class implementation.
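A minimal sketch of this failure mode, using only the standard library. It is not the beakerx_base code: SelfReturningEncoder, FallbackEncoder, and try_dump are hypothetical names, and the faulty fallback here is simplified to handing back the same object. It shows that a default which never reaches a serializable value surfaces as "Circular reference detected", while falling back on the base class gives the usual TypeError.

```python
import json
from datetime import date


class SelfReturningEncoder(json.JSONEncoder):
    """Hypothetical encoder with a faulty fallback, for illustration only."""

    def default(self, obj):
        # Returning the same unsupported object sends it back through the
        # encoder, whose circular-reference check has already recorded it.
        return obj


class FallbackEncoder(json.JSONEncoder):
    """Hypothetical encoder that defers to the base class for unknown types."""

    def default(self, obj):
        # The base implementation raises TypeError for unsupported objects.
        return super().default(obj)


def try_dump(encoder_cls):
    """Return the JSON text, or the name and message of the exception raised."""
    try:
        return json.dumps({"day": date(2022, 3, 22)}, cls=encoder_cls)
    except (TypeError, ValueError) as exc:
        return f"{type(exc).__name__}: {exc}"


broken_result = try_dump(SelfReturningEncoder)
fallback_result = try_dump(FallbackEncoder)
print(broken_result)
print(fallback_result)
```

The second error is the more useful one when debugging, since it names the type that the encoder does not handle.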

The following encoder fixes the bug:

class ObjectEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, (date, datetime)):
            return date_time_2_millis(str(obj))
        elif isinstance(obj, Enum):
            return obj.value
        elif isinstance(obj, Color):
            return obj.hex()
        elif isinstance(obj, pd.Series):
            return obj.tolist()
        elif isinstance(obj, np.ndarray):
            return obj.tolist()
        elif isinstance(obj, (np.int64, np.bool_)):
            return obj.item()
        elif hasattr(obj, "__dict__"):
            d = dict(
                (key, value)
                for key, value in inspect.getmembers(obj)
                if value is not None
                and not key == "Position"
                and not key == "colorProvider"
                and not key == "toolTipBuilder"
                and not key == "parent"
                and not key.startswith("__")
                and not inspect.isabstract(value)
                and not inspect.isbuiltin(value)
                and not inspect.isfunction(value)
                and not inspect.isgenerator(value)
                and not inspect.isgeneratorfunction(value)
                and not inspect.ismethod(value)
                and not inspect.ismethoddescriptor(value)
                and not inspect.isroutine(value)
            )
            return d

        return json.JSONEncoder.default(self, obj)

The date is then transformed to an integer. Is that the desired result?

image

@fcollonval
Contributor

After discussion, the idea is to serialize the date object to {'type': 'Date', 'timestamp': <int>} (similar to pandas.Timestamp), so that the widget can display the date nicely:

The first table shows the upcoming patch; the second shows the current behavior when the data are of type pandas.Timestamp:

image
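A rough sketch of what such a serialization could look like, for illustration only. DateAwareEncoder is a hypothetical name, not the patched ObjectEncoder, and the timestamp unit (milliseconds since the epoch, taken at midnight UTC) is an assumption; the thread only specifies an integer.

```python
import calendar
import json
from datetime import date, datetime


class DateAwareEncoder(json.JSONEncoder):
    """Hypothetical encoder; not the beakerx_base ObjectEncoder."""

    def default(self, obj):
        # datetime is a subclass of date, so exclude it explicitly here.
        if isinstance(obj, date) and not isinstance(obj, datetime):
            # Assumption: midnight UTC, expressed in milliseconds.
            millis = calendar.timegm(obj.timetuple()) * 1000
            return {"type": "Date", "timestamp": millis}
        # Anything else falls back on the base class, which raises TypeError.
        return super().default(obj)


encoded = json.dumps({"day": date(1970, 1, 2)}, cls=DateAwareEncoder)
decoded = json.loads(encoded)
print(decoded)
```

Returning a tagged dict rather than a bare integer lets the front end tell a date apart from an ordinary number, which is presumably what allows the widget to format it.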
