__repr__ wrong column alignment with non-ascii characters #1620
Comments
ghost assigned changhiskhan on Jul 19, 2012
wesm added a commit that referenced this issue on Jul 21, 2012
changhiskhan added a commit that referenced this issue on Jul 23, 2012
I don't think this should've been closed. The original problem in the given example still exists AFAICT.
>>> import pandas
>>> from pandas import Series, DataFrame
>>>
>>> df1 = DataFrame([["aaaa", 1], ["bbbb", 2]])
>>> df2 = DataFrame([["aaää", 1], ["bbbb", 2]])
>>> df3 = DataFrame([[u"aaää", 1], ["bbbb", 2]])
>>>
>>> # Comparison between "similar dataframes"
>>> print df1
0 1
0 aaaa 1
1 bbbb 2
>>> print
>>> print df2
0 1
0 aaää 1
1 bbbb 2
>>> print
>>> print df3
0 1
0 aaää 1
1 bbbb 2
>>> pandas.version.version
'0.8.2.dev-c99d9cd'
Though maybe this is intentional for strings?
Thinking about this a bit more, I'm thinking this maybe should be fixed, but does that imply always using unicode in to_string? Regardless, force_unicode=True fails for df2. I'll push a fix for this.
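For readers wondering why a byte string and a unicode string with the same text could align differently: one plausible explanation, which is an assumption and not something confirmed in this thread, is that padding is computed from `len()`, and `len()` of a UTF-8 encoded byte string counts bytes, not characters. The sketch below uses Python 3 `bytes` to stand in for a Python 2 `str`; `padding_needed` is a hypothetical helper for illustration, not a pandas function.

```python
# Sketch only: shows how len()-based padding differs for bytes vs. text.
# Python 3 bytes stand in for a Python 2 non-unicode str (assumption).

def padding_needed(value, width):
    """Spaces required to fill `width` columns, judging width by len(value).

    Hypothetical helper, not part of pandas.
    """
    return max(width - len(value), 0)


text = "aa\u00e4\u00e4"      # "aaää" as a unicode string
raw = text.encode("utf-8")   # the same text as UTF-8 bytes

print(len(text))  # 4 characters
print(len(raw))   # 6 bytes: each "ä" occupies two bytes in UTF-8

width = 6
print(padding_needed(text, width))                 # 2
print(padding_needed(raw, width))                  # 0, so the column looks too narrow
print(padding_needed(raw.decode("utf-8"), width))  # 2 once decoded
```

Decoding to unicode before measuring width removes the discrepancy in this sketch, which is consistent with the question above about always using unicode in to_string.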
Merged
yarikoptic added a commit to neurodebian/pandas that referenced this issue on Sep 12, 2012
Version 0.8.1

* tag 'v0.8.1': (126 commits)
  RLS: Version 0.8.1
  DOC: tweak
  DOC: set_index/reset_index examples
  DOC: doc fixes and what's new in 0.8.1, vectorized string methods
  ENH: better string element access/slicing notation close pandas-dev#1656
  DOC: minor additions to release notes for 0.8.1
  BUG: handle Yahoo! finance returning duplicate dates for prev bus day, doc fixes
  BUG: fix windows/32-bit builds
  BUG: get pandas-dev#1620 fix working on python 3
  ENH: handling of UTF-8 strings in DataFrame columns, close pandas-dev#1620
  TST: span unit test pandas-dev#1635
  TST: skip another @network test if no internet connection
  ENH/BUG: handle tz-aware datetime.datetime in to_datetime, add utc=True option to allow conversion to utc, close pandas-dev#1581
  ENH: hack to not compress single group keys, accelerate single-key and Categorical groupby operations
  BUG: fix merge bug with left joins on length-0 DataFrame, close pandas-dev#1628
  BUG: Series.interpolate bug with method='values' and datetime64[ns], close pandas-dev#1646
  BUG: properly handle None values in dict input to concat, close pandas-dev#1649
  BUG: len-0 Series min/max/describe pandas-dev#1650
  Fix describe() failure for None and empty Series.
  BUG: string date aliases now work with tz-aware time series close pandas-dev#1647
  ...
Hi,
it seems that when DataFrame, Series and maybe other objects contain non-ascii characters inside non-unicode strings, the __repr__ method does not align the column values correctly. However, this issue does not affect unicode strings. I'm using pandas '0.8.1.dev-70c3deb' on a Linux box.
Sample code:
This results in:
Thanks and regards.