This repository has been archived by the owner on Jul 7, 2023. It is now read-only.

Python3 compatibility #2

Closed
vthorsteinsson opened this issue Jun 18, 2017 · 4 comments

Comments

@vthorsteinsson
Contributor

There are some holes in the Python 3 compatibility of the Tensor2Tensor code. For instance:

In data_generators/generator_utils.py, import urllib needs to be:

import sys
if sys.version_info[0] >= 3:
  import urllib.request as urllib
else:
  import urllib

In data_generators/image.py, import cPickle needs to be:

try:
  import cPickle
except ImportError:
  import pickle as cPickle
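
A quick way to sanity-check both import shims is a sketch like the one below. It is not code from the repository: the helper names `download` and `roundtrip` are made up for illustration, and it assumes the call sites only need `urllib.urlretrieve` and `cPickle.dumps`/`cPickle.loads`.

```python
import sys

# Shim 1: expose urlretrieve under the name `urllib` on both Python 2 and 3.
if sys.version_info[0] >= 3:
    import urllib.request as urllib
else:
    import urllib

# Shim 2: use cPickle where it exists (Python 2), otherwise fall back to pickle.
try:
    import cPickle
except ImportError:
    import pickle as cPickle


def download(url, target_path):
    """Hypothetical helper: fetch `url` into `target_path` and return the path."""
    path, _headers = urllib.urlretrieve(url, target_path)
    return path


def roundtrip(obj):
    """Hypothetical helper: serialize and deserialize `obj` with the shimmed module."""
    return cPickle.loads(cPickle.dumps(obj))
```

The alias only covers names that `urllib.request` provides on Python 3; code that relies on functions which moved to `urllib.parse` would need its own handling.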

Finally, data_generators/tokenizer.py needs to be revised, as it assumes that a char ordinal is always in the range (0, 256). That is not a safe assumption in Python 3, where strings are Unicode and ordinals can be much larger. A better solution uses a set instead of array subscripts based on char ordinals. Would you like me to submit a revised version in a pull request?
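
To illustrate the set-based idea, here is a minimal sketch. It is not the actual tokenizer.py: the names `SEPARATOR_CHARS`, `is_separator` and `tokenize` are hypothetical, and treating ASCII punctuation plus whitespace as separators is an assumption. The point is that a set membership test works for any character, whereas indexing a 256-entry table with `ord(c)` raises an IndexError for ordinals of 256 and above.

```python
import string

# Assumed separator set: ASCII punctuation and whitespace.
# A frozenset lookup is safe for any character, whatever its ordinal.
SEPARATOR_CHARS = frozenset(string.punctuation + string.whitespace)


def is_separator(char):
    """Return True if `char` is in the separator set."""
    return char in SEPARATOR_CHARS


def tokenize(text):
    """Split `text` into alternating runs of separator and non-separator chars."""
    tokens = []
    start = 0
    for pos in range(1, len(text)):
        # A token boundary is wherever the separator status changes.
        if is_separator(text[pos]) != is_separator(text[pos - 1]):
            tokens.append(text[start:pos])
            start = pos
    if text:
        tokens.append(text[start:])
    return tokens
```

With this approach, characters outside the Latin-1 range simply fall through as non-separators instead of causing an out-of-range subscript.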

@rsepassi
Contributor

Thank you! Yes, please do. The chr assumption may also be in a few other places - e.g. text_encoder.py. Appreciate the help. We're on 2.7 at Google so Python 3 usage isn't as well-exercised/tested.

@plutusedge

Thanks for the info, this is helpful.

@vthorsteinsson
Contributor Author

See PR #22

@rsepassi
Contributor

PR #22 merged. Thanks!
