Unicode error #122

Closed
vmpartner opened this issue Aug 4, 2016 · 14 comments

@vmpartner

Unicode error

image

@adam-waldenberg adam-waldenberg self-assigned this Aug 5, 2016
@adam-waldenberg
Member

Hi. Welcome to the wonderful world of cryptic Python unicode error messages. This can be any number of things. My first guess would be an issue with the terminal. However, there are a few things you should check before we can narrow it down:

  1. What codeset/encoding is the terminal configured to use?
  2. Do you get the same error if you redirect output to a file?
  3. Does it behave the same in Python 3?
  4. What version of gitinspector is this?
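
A minimal sketch for gathering the answers to points 1 and 2, assuming Python 2.7 or 3. The helper name `describe_encodings` is hypothetical and not part of gitinspector; it only reports what the interpreter sees.

```python
import locale
import sys


def describe_encodings():
    """Collect the encodings Python would use for terminal output."""
    return {
        # Encoding attached to stdout; can be None on Python 2 when piped.
        "stdout": getattr(sys.stdout, "encoding", None),
        # Encoding derived from the locale settings (LANG, LC_ALL, ...).
        "preferred": locale.getpreferredencoding(False),
        # False when output is redirected to a file or a pipe.
        "isatty": bool(hasattr(sys.stdout, "isatty") and sys.stdout.isatty()),
    }


if __name__ == "__main__":
    for key, value in sorted(describe_encodings().items()):
        print("%s: %s" % (key, value))
```

Running it once in the terminal and once with output redirected to a file shows whether the two cases use different encodings.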

@vmpartner
Author

  1. It's Xshell 5 for Windows. Encoding: UTF-8
    image

@vmpartner
Author

  3. I can't install Python 3. Current version is Python 2.7.9.
  4. Latest. Git clone from yesterday.

@vmpartner
Author

  1. image

@vmpartner
Author

In the git repo we have Russian cp-1251 characters. Maybe this will be helpful.

@adam-waldenberg
Member

Hi @vmpartner. This particular error is, I think, related to the fix in issue #46. This change was added in order to handle escape characters in emails - something that can occur when a repository is imported into git from other revision control systems.

What happens if you remove that line and run it? Maybe we can just ignore it and catch the exception. This line is really only used for a particular corner case that is very uncommon, so that should be an acceptable solution.
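
The "ignore it and catch the exception" idea could look roughly like the sketch below. The actual line in gitinspector is not shown in this thread, so the function name `unescape_email`, the byte-string input, and the choice of the `unicode_escape` codec are all assumptions made for illustration.

```python
def unescape_email(raw):
    """Decode backslash escapes in an email, falling back to the raw text.

    `raw` is assumed to be a byte string as read from git output.
    """
    try:
        # Turns sequences such as \x40 into the characters they stand for.
        return raw.decode("unicode_escape")
    except UnicodeError:
        # Malformed escapes or other codec failures: keep the text as-is
        # instead of aborting the whole run.
        return raw.decode("utf-8", "replace")
```

Since the escape handling only matters for the rare imported-repository case, falling back to the unmodified text loses nothing for ordinary emails.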

@vmpartner
Author

Yes, it works now. Nice program ;)

@adam-waldenberg adam-waldenberg added this to the 0.4.5 milestone Aug 9, 2016
@adam-waldenberg
Member

Great. I will implement a fix at a later point. Thank you. Keeping it open for now.

@vassilevsky

I have successfully run gitinspector on OS X after applying this solution:

https://coderwall.com/p/-k_93g/mac-os-x-valueerror-unknown-locale-utf-8-in-python

I think it needs to be added to this project's README or FAQ. Which one is better?

@adam-waldenberg
Member

Hi @vassilevsky. Neither. As it doesn't really concern gitinspector (it affects all Python applications under OS X), I think we will eventually add a specific FAQ for OS/Python specific errors. There are some errors related to Windows that might be worth mentioning as well. For the particular error you reported, there are several old bugs discussing this, for example #109, #93, #53, #32 and #9, to name a few of them.

@kiwichris

I have hit a Unicode issue in gitinspector in changesoutput.py with the RTEMS repo (https://git.rtems.org/rtems.git). We have Unicode users in the repo and I have no locale set and locale.getpreferredencoding() is returning 'US-ASCII'. This is on FreeBSD 10.3 with Python 2.7.12. I have hacked around the problem by adding:

import codecs
import sys
sys.stdout = codecs.getwriter('UTF-8')(sys.stdout)

in gitinspector/gitinspector.py:main. This hack is taken from https://wiki.python.org/moin/PrintFails, except that it forces UTF-8 instead of looking up the preferred encoding.
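
To show what the wrapper does without touching the real `sys.stdout`, here is a small sketch that applies the same `codecs.getwriter` call to an in-memory byte stream. The helper name `utf8_writer` is hypothetical; on Python 3 the byte-oriented stream to wrap would be `sys.stdout.buffer` rather than `sys.stdout`.

```python
import codecs
import io


def utf8_writer(binary_stream):
    """Wrap a binary stream so that text written to it is encoded as UTF-8."""
    return codecs.getwriter("utf-8")(binary_stream)


# io.BytesIO stands in for the terminal's byte stream in this sketch.
buffer = io.BytesIO()
writer = utf8_writer(buffer)
writer.write(u"Привет\n")  # non-ASCII text is encoded on the way out
```

Whatever the locale reports, the bytes that reach the stream are UTF-8, which is why the hack hides the error on a terminal configured as US-ASCII.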

@adam-waldenberg
Member

@kiwichris Discussed several times previously and definitely not a bug in gitinspector. There are a few things you can do to modify the behaviour:

  1. Use a terminal with unicode encoding set up.
  2. Set PYTHONIOENCODING to UTF-8.
  3. Pipe the output to a file (defaults to using UTF-8 and does your exact code for both stdin/stdout).
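
The effect of the environment variable in option 2 can be checked with a short sketch like the following, which starts a child interpreter and asks it which encoding its stdout uses. The helper name `child_stdout_encoding` is hypothetical.

```python
import codecs
import os
import subprocess
import sys


def child_stdout_encoding(io_encoding):
    """Return the stdout encoding of a child Python started with
    PYTHONIOENCODING set to `io_encoding` (normalized codec name)."""
    env = dict(os.environ, PYTHONIOENCODING=io_encoding)
    out = subprocess.check_output(
        [sys.executable, "-c", "import sys; print(sys.stdout.encoding)"],
        env=env,
    )
    # Normalize spellings such as "UTF-8" and "utf8" to one codec name.
    return codecs.lookup(out.decode("ascii").strip()).name
```

The variable has to be set in the environment before the interpreter starts; changing it from inside a running script does not affect that script's own streams.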

Instead of forcing the output to UTF-8 (as you do in your hack), gitinspector will always try to re-encode/"convert" characters to the requested character encoding. However, US-ASCII lacks mappings for many unicode characters. In any case, it's the correct behavior.
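
A minimal illustration of the missing mappings, using only the standard `str.encode` method. The helper names are hypothetical, and gitinspector's own error handling is not shown here.

```python
def fits_in_ascii(text):
    """Return True if every character in `text` has a US-ASCII mapping."""
    try:
        text.encode("ascii")
        return True
    except UnicodeEncodeError:
        # This is the kind of failure the Python traceback reports.
        return False


def ascii_with_placeholders(text):
    """Encode to US-ASCII, substituting '?' for unmappable characters."""
    return text.encode("ascii", "replace")
```

For example, `fits_in_ascii(u"Иван")` is false, while `ascii_with_placeholders(u"Иван")` yields four question marks, so the name is lost rather than converted.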

@kiwichris

Sure, the solutions you highlight make sense. The Python error makes it look like a bug.

@adam-waldenberg
Member

Fixed with the above commit. Report any problems related to this.
