latin-1 / utf-8 codec can't encode/decode #21
Thanks for reporting the issue. This looks like a Python string decoding issue. I'm not familiar with this area, so I will outline the problem here. If anyone knows how to decode this properly, please let me know. You can reproduce it in the Sublime Text console (go to View/Show Console).
(1)>>> x=b'`\x81\n'
>>> x
b'`\x81\n'
>>> print(x)
b'`\x81\n'
(2)>>> x.decode('utf-8')
Traceback (most recent call last):
  File "<string>", line 1, in <module>
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x81 in position 1: invalid start byte
No, not intentional. I don't think I've changed the decoding-related code either (I have refactored it, but the logic hasn't changed). For reference, the decoding code is here: https://github.com/komsit37/sublime-q/blob/master/util.py#L11
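To make the failure in the console session above concrete, here is a minimal sketch using only the Python standard library. It is not the code in util.py; it only shows how the same three bytes behave under the two codecs, and the variable names are made up for illustration.

```python
raw = b'`\x81\n'  # the bytes from the console session above

# latin-1 maps each byte 0x00-0xFF to the code point with the same value,
# so decoding arbitrary bytes with it never raises.
as_latin1 = raw.decode('latin-1')

# Strict UTF-8 rejects 0x81 because it is not a valid start byte.
try:
    raw.decode('utf-8')
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# errors='replace' substitutes U+FFFD for the bad byte instead of raising.
as_utf8_lossy = raw.decode('utf-8', errors='replace')
```

Whether a lossy fallback like `errors='replace'` is acceptable here depends on what the plugin does with the decoded text; it is shown only as one possible option.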
Hi Komsit37, I figured this one out. The version of qPython that sublime-q currently uses encodes and decodes with "latin-1" (which only covers code points up to U+00FF) rather than UTF-8. This can be patched without too much trouble: it just requires a small change to how binary string length is calculated. However, there is another option: the newest version of qPython defaults to latin-1, but this can be overridden in qconnection using encoding = 'UTF-8'. What do you think about upgrading to the latest qPython? Either way, I can create a pull request with the update and the UTF-8 change, but because it is such a core change to the code, I suggest we test quite a bit before merging.
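As a rough illustration of why string length matters when switching codecs, the sketch below compares character count and byte count for the same text. It assumes the length issue mentioned above is the usual characters-versus-bytes mismatch, which the thread does not spell out; the sample strings are arbitrary and nothing here calls qPython.

```python
text = 'caf\u00e9'  # 4 characters, the last one is non-ASCII (U+00E9)

latin1_bytes = text.encode('latin-1')
utf8_bytes = text.encode('utf-8')

char_count = len(text)          # number of characters
latin1_len = len(latin1_bytes)  # one byte per character under latin-1
utf8_len = len(utf8_bytes)      # U+00E9 takes two bytes under UTF-8

# Characters above U+00FF cannot be encoded as latin-1 at all.
try:
    '\u20ac'.encode('latin-1')  # euro sign
    euro_ok = True
except UnicodeEncodeError:
    euro_ok = False
```

Under latin-1 the character count and the byte count always agree, so code that writes `len(text)` as the binary length works by accident; under UTF-8 the length has to be taken from the encoded bytes.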
Cool, yup, we could try upgrading qPython. I checked the diff. The upgrade shouldn't be too bad (as long as we don't need to change the numpy dependency part). Either 2.0.0 or 1.2.2 should be OK. Agreed that we would need some tests, but that also shouldn't be too bad since we don't use many data types. We mostly just decode to string.
Fixed in #28
This may be intended, just fyi:
I'm getting "Error in QSendRawCommand.sendAndUpdateStatus:"
and then either: "'latin-1' codec can't encode characters..." (on send) or "'utf-8' codec can't decode byte..."
when I try to send or receive characters above \200 and up to \371. E.g. `$"\201" fails.
I think this worked until recently; not sure if you changed the char encoding intentionally here.