Selecting multiple columns from DataFrame with duplicate column labels failure. #1943

Closed
lodagro opened this issue Sep 20, 2012 · 6 comments
Labels
Bug Indexing Related to indexing on series/frames, not to indexes themselves

@lodagro
Contributor

lodagro commented Sep 20, 2012

In [25]: df = pandas.DataFrame(np.random.randn(4,4), columns=list('AABC'))

In [26]: df
Out[26]: 
          A         A         B         C
0 -0.174905  0.332522  1.134984 -0.201270
1  1.730445  0.382556 -0.607761  1.221815
2  0.513049  0.196231 -1.746732 -0.252282
3 -0.297577 -1.000121 -0.090442 -2.129467

In [27]: df.ix[:,['A', 'B']]
Out[27]: 
          A         A         B
0 -0.174905  0.332522  1.134984
1  1.730445  0.382556 -0.607761
2  0.513049  0.196231 -1.746732
3 -0.297577 -1.000121 -0.090442

In [28]: df[['A', 'B']]
---------------------------------------------------------------------------
Exception                                 Traceback (most recent call last)
...
Exception: Reindexing only valid with uniquely valued Index objects

In [29]: df[['B', 'C']]
---------------------------------------------------------------------------
Exception                                 Traceback (most recent call last)
...
Exception: Reindexing only valid with uniquely valued Index objects
@wesm wesm closed this as completed in 2160b40 Sep 20, 2012
@wesm
Member

wesm commented Sep 20, 2012

It won't reorder, but it will at least work
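
For readers wondering what "won't reorder" means in practice, here is a minimal sketch of mask-based column selection, the approach visible in the `_getitem_array` traceback frame further down this thread (`self.take(mask.nonzero()[0], axis=1)`). It is an illustration written against current pandas, not the code from commit 2160b40, and the helper name `select_by_mask` is hypothetical.

```python
import numpy as np
import pandas as pd


def select_by_mask(frame, labels):
    """Hypothetical helper: keep every column whose label is in `labels`.

    A boolean mask is positional, so duplicate labels are not a problem,
    but the result keeps the frame's column order, not the order of `labels`.
    """
    mask = frame.columns.isin(labels)
    return frame.loc[:, mask]


df = pd.DataFrame(np.random.randn(4, 4), columns=list("AABC"))

# Ask for B first, then A: the mask ignores the requested order.
picked = select_by_mask(df, ["B", "A"])
print(list(picked.columns))  # frame order: ['A', 'A', 'B']
```

Both `A` columns come back, and `B` stays after them because the mask only records which positions to keep.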

wesm added a commit that referenced this issue Sep 25, 2012
* master: (171 commits)
  BUG: fix Cython tz_convert bug with time zones that haven't had a UTC transition in a long time. close #1946
  BUG: fix buglet
  BUG: try fixing tzlocal bug
  Minor fixes to time series doc.
  Adding DataFrame methods to API reference.
  Added Series functions to API doc.
  BUG: fix segfault in SeriesGrouper with non-contiguous index
  RLS: Version 0.9.0 Release Candidate 1
  BLD: add lib depends #1945
  BUG: missing case for assigning DataFrame via ix
  BUG: python 3.1 timedelta compat issue
  BUG: python 3 tzoffset is not hashable
  TST: adds dateutil to travis-ci install commands
  BUG: let selecting multiple columns in DataFrame.__getitem__ work when there are duplicates. close #1943
  BUG: DatetimeConverter does not handle datetime64 arrays properly
  BUG: reindex with axis=1 when setting Series to scalar location, close #1942
  BUG: fix formatting of Timestamps in to_html/IPython notebook. refactor to_html code. close #1940
  ENH: allow single str input to na_values #1944
  TST: when xlrd is not installed skip tests needing it, close #1941
  BUG: DatetimeIndex localizes twice if input is localized DatetimeIndex #1838
  ...
yarikoptic added a commit to neurodebian/pandas that referenced this issue Sep 27, 2012
Version 0.9.0 Release Candidate 1

* tag 'v0.9.0rc1': (58 commits)
  RLS: Version 0.9.0 Release Candidate 1
  BLD: add lib depends pandas-dev#1945
  BUG: missing case for assigning DataFrame via ix
  BUG: python 3.1 timedelta compat issue
  BUG: python 3 tzoffset is not hashable
  TST: adds dateutil to travis-ci install commands
  BUG: let selecting multiple columns in DataFrame.__getitem__ work when there are duplicates. close pandas-dev#1943
  BUG: DatetimeConverter does not handle datetime64 arrays properly
  BUG: reindex with axis=1 when setting Series to scalar location, close pandas-dev#1942
  BUG: fix formatting of Timestamps in to_html/IPython notebook. refactor to_html code. close pandas-dev#1940
  ENH: allow single str input to na_values pandas-dev#1944
  TST: when xlrd is not installed skip tests needing it, close pandas-dev#1941
  BUG: DatetimeIndex localizes twice if input is localized DatetimeIndex pandas-dev#1838
  BUG: align input on setting via ix pandas-dev#1630
  cython methods for group bins pandas-dev#1809
  BUG: allow non-numeric columns in groupby first/last pandas-dev#1809
  TST: skip unicode filename test if system requires encoding to ascii
  BUG: more fixedoffset occurrences pandas-dev#1928
  BUG: no zone in tzinfo pandas-dev#1838
  BUG: handle lists too in DataFrame.xs when partially selecting data from DataFrame. close pandas-dev#1796
  ...
@jankatins
Contributor

I still get this error with a long dataframe (pandas.__version__ == '0.10.0'). Renaming the column makes it work. Unfortunately I can't produce a small reproducible example: using the above code it simply works :-(

In [85]: journals = journals[["sid", "title",  "SNIP2_2009" , "SJR2_2009", "SNIP2_2010", "SJR2_2010", "SNIP2_2011", "SJR2_2011"]]
---------------------------------------------------------------------------
Exception                                 Traceback (most recent call last)
<ipython-input-85-7a986a34fc84> in <module>()
----> 1 journals = journals[["sid", "title",  "SNIP2_2009" , "SJR2_2009", "SNIP2_2010", "SJR2_2010", "SNIP2_2011", "SJR2_2011"]]

C:\portabel\Python27\lib\site-packages\pandas\core\frame.pyc in __getitem__(self, key)
   1932             if com._is_bool_indexer(key):
   1933                 key = np.asarray(key, dtype=bool)
-> 1934             return self._getitem_array(key)
   1935         elif isinstance(self.columns, MultiIndex):
   1936             return self._getitem_multilevel(key)

C:\portabel\Python27\lib\site-packages\pandas\core\frame.pyc in _getitem_array(self, key)
   1968                         raise KeyError("No column(s) named: %s" %
   1969                                        com.pprint_thing(k))
-> 1970                 return self.take(mask.nonzero()[0], axis=1)
   1971 
   1972     def _slice(self, slobj, axis=0):

C:\portabel\Python27\lib\site-packages\pandas\core\frame.pyc in take(self, indices, axis)
   2850             else:
   2851                 new_columns = self.columns.take(indices)
-> 2852                 return self.reindex(columns=new_columns)
   2853         else:
   2854             new_values = com.take_2d(self.values,

C:\portabel\Python27\lib\site-packages\pandas\core\frame.pyc in reindex(self, index, columns, method, level, fill_value, limit, copy)
   2505         if columns is not None:
   2506             frame = frame._reindex_columns(columns, copy, level,
-> 2507                                            fill_value, limit)
   2508 
   2509         if index is not None:

C:\portabel\Python27\lib\site-packages\pandas\core\frame.pyc in _reindex_columns(self, new_columns, copy, level, fill_value, limit)
   2592                          limit=None):
   2593         new_columns, indexer = self.columns.reindex(new_columns, level=level,
-> 2594                                                     limit=limit)
   2595         return self._reindex_with_indexers(None, None, new_columns, indexer,
   2596                                            copy, fill_value)

C:\portabel\Python27\lib\site-packages\pandas\core\index.pyc in reindex(self, target, method, level, limit)
    873             else:
    874                 indexer = self.get_indexer(target, method=method,
--> 875                                            limit=limit)
    876         return target, indexer
    877 

C:\portabel\Python27\lib\site-packages\pandas\core\index.pyc in get_indexer(self, target, method, limit)
    792 
    793         if not self.is_unique:
--> 794             raise Exception('Reindexing only valid with uniquely valued Index '
    795                             'objects')
    796 

Exception: Reindexing only valid with uniquely valued Index objects
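
The last frame of the traceback shows that the failure comes from `Index.get_indexer` being called on a column index that is not unique. The sketch below reproduces just that step in isolation, assuming a recent pandas; the exception class has changed across versions, so it catches `Exception` broadly. `get_indexer_non_unique` is shown only as the existing pandas method that tolerates duplicates, not as what the fix for this issue used.

```python
import pandas as pd

# Same labels as the original report: "A" appears twice.
columns = pd.Index(list("AABC"))
print(columns.is_unique)

# get_indexer needs exactly one position per label, so it refuses a
# non-unique Index.
try:
    columns.get_indexer(["A", "B"])
    raised = False
except Exception as exc:  # concrete class differs between pandas versions
    raised = True
    print(type(exc).__name__, exc)

# get_indexer_non_unique returns every matching position instead, plus the
# positions in the request that were not found.
positions, missing = columns.get_indexer_non_unique(["A", "B"])
print(positions.tolist(), missing.tolist())
```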

@wesm wesm reopened this Jan 15, 2013
@ghost ghost assigned changhiskhan Jan 19, 2013
@changhiskhan
Contributor

Looks like your data is mixed-type, but the data in the original issue is not.
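
One way to check both conditions on a frame (duplicate column labels and more than one dtype) is sketched below. The `journals` frame here is a hypothetical stand-in, since the real data was never posted, and which column was actually duplicated is an assumption made only for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the `journals` frame from the report above.
journals = pd.DataFrame(
    [[1, "Journal A", "dup", 0.5], [2, "Journal B", "dup", 0.7]],
    columns=["sid", "title", "title", "SNIP2_2009"],
)

# Labels that occur more than once in the column index.
duplicated_labels = journals.columns[journals.columns.duplicated()].tolist()

# "Mixed type" here means the columns do not all share a single dtype.
is_mixed_type = len(set(journals.dtypes)) > 1

print(duplicated_labels)
print(is_mixed_type)
```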

@changhiskhan
Contributor

@wesm I moved this to 0.10.2

Not enough time for 0.10.1

@jreback
Contributor

jreback commented Mar 21, 2013

Pushing to 0.12, when we can deal with this in the block manager; see #3012.

@jreback
Contributor

jreback commented Apr 30, 2013

This was actually fixed in 0.9.0 and there is a test for it in test_frames, so closing UOD (unless otherwise directed).
