Doesn't work with "narrow" Python builds #4

Open
vitorio opened this issue Sep 4, 2016 · 0 comments

vitorio commented Sep 4, 2016

A "narrow" Python build is one where unicode objects are stored internally as UTF-16 (most characters take a single 2-byte code unit, but characters beyond U+FFFF are represented by a 4-byte "surrogate pair"), whereas "wide" builds use UCS-4 (every character takes 4 bytes).
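
To check which kind of build an interpreter is, `sys.maxunicode` can be inspected. This is a minimal sketch; the helper name `is_narrow_build` is made up for illustration and is not part of base65536.

```python
import sys


def is_narrow_build():
    """Return True when unicode objects are stored as UTF-16 code units."""
    # Narrow builds report sys.maxunicode == 0xFFFF. Wide builds, and every
    # Python 3.3+ interpreter, report 0x10FFFF.
    return sys.maxunicode == 0xFFFF


print("narrow" if is_narrow_build() else "wide")
```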

The system build on Mac OS X, and the Python.org Windows and Mac builds, are "narrow" Python builds, but apparently most Linux builds are "wide." base65536 fails on "narrow" builds with:

>>> a = base65536.encode(x)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/vitorio/Library/Python/2.7/lib/python/site-packages/base65536/core.py", line 118, in encode
    stream.write(unichr(code_point))
ValueError: unichr() arg not in range(0x10000) (narrow Python build)
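
The error comes from unichr() refusing any code point above U+FFFF on a narrow build. One way around it, sketched here under the assumption that going through the UTF-32 codec is acceptable, is to pack the code point into 4 bytes and decode it. The name `code_point_to_text` is hypothetical, and this is not necessarily what any existing patch does.

```python
import struct


def code_point_to_text(code_point):
    """Hypothetical stand-in for unichr() that also accepts values above U+FFFF."""
    # Pack the code point as one little-endian 32-bit unit and decode it as
    # UTF-32. On a narrow build the decoder produces a surrogate pair.
    return struct.pack("<I", code_point).decode("utf-32-le")


print(repr(code_point_to_text(0x12077)))
```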

Replacing unichr() with a struct-based solution such as in HypothesisWorks/hypothesis@f49b829 allows encoding to work, but decoding continues to fail:

>>> a = base65536.encode(b"hello world")
>>> a
u'\u9a68\ua36c\u556f\U00012077\ua372\u1564'
>>> base65536.decode(a)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/vitorio/Library/Python/2.7/lib/python/site-packages/base65536/core.py", line 136, in decode
    'point: %d' % code_point)
ValueError: Invalid Base-65536 code point: 55304
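
The rejected value 55304 is 0xD808, which is the high surrogate of U+12077 (the fourth character of the encoded string). On a narrow build, iterating over the string yields the two halves of the surrogate pair separately, so ord() never sees the full code point. A minimal sketch of a decoder-side fix is to recombine surrogate pairs before looking code points up. The generator name `iter_code_points` is made up for illustration and is not part of base65536.

```python
def iter_code_points(text):
    """Yield full code points, joining UTF-16 surrogate pairs where present."""
    i = 0
    n = len(text)
    while i < n:
        unit = ord(text[i])
        i += 1
        # A high surrogate followed by a low surrogate encodes one code
        # point in the range U+10000..U+10FFFF.
        if 0xD800 <= unit <= 0xDBFF and i < n:
            low = ord(text[i])
            if 0xDC00 <= low <= 0xDFFF:
                unit = 0x10000 + ((unit - 0xD800) << 10) + (low - 0xDC00)
                i += 1
        yield unit


# The pair that a narrow build stores for U+12077.
print([hex(cp) for cp in iter_code_points(u"\ud808\udc77")])
```

On a wide build the loop simply yields each character's code point unchanged, so the same code path can serve both kinds of build.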

Replacing unichr(), ord() and int2byte() with the unichr(), byteord() and bytechr() helpers from https://github.com/behdad/fonttools/blob/master/Lib/fontTools/misc/py23.py#L20 fails similarly:

>>> import base65536
>>> a = base65536.encode(b'hello world')
>>> base65536.decode(a)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/vitorio/Library/Python/2.7/lib/python/site-packages/base65536/core.py", line 134, in decode
    'point: %d' % code_point)
ValueError: Invalid Base-65536 code point: 55304