Error and improvement suggestion in json_packer function #778

Closed
hxawax opened this issue Apr 26, 2022 · 1 comment · Fixed by #779

hxawax commented Apr 26, 2022

Hello,
I think I found a bug in the json_packer function in session.py.
In some circumstances the function fails to encode strings containing special characters (é, à, è, ...), raising the UnicodeEncodeError shown in the traceback below.

I suggest the following fix:

  • Add the error handler errors='surrogateescape' to both encode calls in json_packer, as follows:

    def json_packer(obj):
        try:
            return json.dumps(
                obj,
                default=json_default,
                ensure_ascii=False,
                allow_nan=False,
            ).encode("utf8", errors='surrogateescape')
        except (TypeError, ValueError) as e:
            # Fallback to trying to clean the json before serializing
            packed = json.dumps(
                json_clean(obj),
                default=json_default,
                ensure_ascii=False,
                allow_nan=False,
            ).encode("utf8", errors='surrogateescape')

            warnings.warn(
                f"Message serialization failed with:\n{e}\n"
                "Supporting this message is deprecated in jupyter-client 7, please make "
                "sure your message is JSON-compliant",
                stacklevel=2,
            )

            return packed
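
To illustrate the failure mode and the effect of the proposed handler, here is a minimal sketch using only the standard library. The payload and the lone surrogate U+DCE9 are made up for illustration; the actual message from the traceback is not reproduced here.

```python
import json

# Hypothetical payload: U+DCE9 is the lone surrogate that
# bytes.decode(..., errors="surrogateescape") produces for the raw byte 0xE9.
payload = {"text": "caf\udce9"}

# With ensure_ascii=False, json.dumps leaves the surrogate in the string as-is.
dumped = json.dumps(payload, ensure_ascii=False, allow_nan=False)

# Strict UTF-8 encoding rejects lone surrogates.
try:
    dumped.encode("utf8")
    strict_ok = True
    strict_reason = None
except UnicodeEncodeError as exc:
    strict_ok = False
    strict_reason = exc.reason

# The proposed handler maps U+DC80..U+DCFF back to the raw bytes 0x80..0xFF.
packed = dumped.encode("utf8", errors="surrogateescape")
```

Here `strict_ok` ends up `False`, while the second call succeeds and yields the raw byte 0xE9 in place of the surrogate. Note that surrogateescape only covers U+DC80 to U+DCFF; other lone surrogates (for example U+D800) would still raise.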

Without the error handler, the following error is raised:

---------------------------------------------------------------------------
UnicodeEncodeError                        Traceback (most recent call last)
File ~/miniconda3/envs/cachalot/lib/python3.10/site-packages/jupyter_client/session.py:102, in json_packer(obj)
     96 try:
     97     return json.dumps(
     98         obj,
     99         default=json_default,
    100         ensure_ascii=False,
    101         allow_nan=False,
--> 102     ).encode("utf8")
    103 except (TypeError, ValueError) as e:
    104     # Fallback to trying to clean the json before serializing

UnicodeEncodeError: 'utf-8' codec can't encode characters in position 36261-36262: surrogates not allowed

During handling of the above exception, another exception occurred:

UnicodeEncodeError                        Traceback (most recent call last)
Input In [9], in <cell line: 3>()
      1 xr.set_options(display_style='html')
----> 3 ant17

File ~/miniconda3/envs/cachalot/lib/python3.10/site-packages/IPython/core/displayhook.py:268, in DisplayHook.__call__(self, result)
    266     self.write_format_data(format_dict, md_dict)
    267     self.log_output(format_dict)
--> 268 self.finish_displayhook()

File ~/miniconda3/envs/cachalot/lib/python3.10/site-packages/ipykernel/displayhook.py:89, in ZMQShellDisplayHook.finish_displayhook(self)
     87 sys.stderr.flush()
     88 if self.msg["content"]["data"]:
---> 89     self.session.send(self.pub_socket, self.msg, ident=self.topic)
     90 self.msg = None

File ~/miniconda3/envs/cachalot/lib/python3.10/site-packages/jupyter_client/session.py:842, in Session.send(self, stream, msg_or_type, content, parent, ident, buffers, track, header, metadata)
    840 if self.adapt_version:
    841     msg = adapt(msg, self.adapt_version)
--> 842 to_send = self.serialize(msg, ident)
    843 to_send.extend(buffers)
    844 longest = max([len(s) for s in to_send])

File ~/miniconda3/envs/cachalot/lib/python3.10/site-packages/jupyter_client/session.py:716, in Session.serialize(self, msg, ident)
    714     content = self.none
    715 elif isinstance(content, dict):
--> 716     content = self.pack(content)
    717 elif isinstance(content, bytes):
    718     # content is already packed, as in a relayed message
    719     pass

File ~/miniconda3/envs/cachalot/lib/python3.10/site-packages/jupyter_client/session.py:110, in json_packer(obj)
     97     return json.dumps(
     98         obj,
     99         default=json_default,
    100         ensure_ascii=False,
    101         allow_nan=False,
    102     ).encode("utf8")
    103 except (TypeError, ValueError) as e:
    104     # Fallback to trying to clean the json before serializing
    105     packed = json.dumps(
    106         json_clean(obj),
    107         default=json_default,
    108         ensure_ascii=False,
    109         allow_nan=False,
--> 110     ).encode("utf8")
    112     warnings.warn(
    113         f"Message serialization failed with:\n{e}\n"
    114         "Supporting this message is deprecated in jupyter-client 7, please make "
    115         "sure your message is JSON-compliant",
    116         stacklevel=2,
    117     )
    119     return packed

UnicodeEncodeError: 'utf-8' codec can't encode characters in position 36261-36262: surrogates not allowed
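
The positions in the message (36261-36262) index into the serialized JSON string, not into the original object, which makes the offending characters hard to spot. A small helper along these lines can show them; `find_unencodable` is a hypothetical name for illustration, not part of jupyter_client.

```python
def find_unencodable(text, context=20):
    """Return (start, end, snippet) for the first span UTF-8 cannot encode, or None."""
    try:
        text.encode("utf8")
    except UnicodeEncodeError as exc:
        # exc.start and exc.end index into the string that was being encoded.
        lo = max(exc.start - context, 0)
        snippet = text[lo:exc.end + context]
        # ascii() escapes the surrogates so the snippet is safe to print.
        return exc.start, exc.end, ascii(snippet)
    return None


result = find_unencodable("ab\udce9cd")
clean = find_unencodable("héllo")
```

To apply it to a message, pass in the string returned by `json.dumps(obj, ensure_ascii=False)` rather than the object itself.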

blink1073 (Contributor) commented

Thanks for the report @hxawax! This sounds like a reasonable compromise, after reading https://vstinner.github.io/pep-383.html. Would you like to submit a PR?
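
One way to read the "compromise" (this is an interpretation, not something stated in the thread): PEP 383's surrogateescape lets undecodable bytes round-trip through `str`, but the bytes that come out are still not valid UTF-8. A minimal sketch, with an example byte string chosen for illustration:

```python
raw = b"caf\xe9"  # Latin-1 bytes for "café"; not valid UTF-8

# Decoding with surrogateescape smuggles the bad byte through as U+DCE9.
text = raw.decode("utf8", errors="surrogateescape")

# Encoding with the same handler restores the original bytes exactly.
restored = text.encode("utf8", errors="surrogateescape")

# The trade-off: the restored bytes are still not valid UTF-8, so a strict
# decoder on the receiving side will reject them.
try:
    restored.decode("utf8")
    receiver_ok = True
except UnicodeDecodeError:
    receiver_ok = False
```

So the handler keeps serialization from raising, at the cost of possibly emitting bytes that a strict UTF-8 consumer cannot decode.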
