When calling set_index on a column that contains duplicate values, the verify_integrity=True option correctly identifies the duplicates, but the check appears to run after the original columns have already been dropped when inplace=True is also passed. As a result, the key column is lost from the DataFrame even though the call raises an exception.
I believe it would be better if the original DataFrame object were modified only when the set_index operation succeeds.
Code to reproduce the problem:
In [189]: df = DataFrame({'one': [1, 1, 2], 'two': [1, 2, 3]})

In [190]: df
Out[190]:
   one  two
0    1    1
1    1    2
2    2    3

In [191]: df.set_index(['one'], inplace=True, verify_integrity=True)
---------------------------------------------------------------------------
Exception                                 Traceback (most recent call last)
/mnt/hgfs/fastdata/<ipython-input-191-e1c0e8c92f6c> in <module>()
----> 1 df.set_index(['one'], inplace=True, verify_integrity=True)

/home/tobias/code/envs/mac/local/lib/python2.7/site-packages/pandas/core/frame.pyc in set_index(self, keys, drop, append, inplace, verify_integrity)
   2328         if verify_integrity and not index.is_unique:
   2329             duplicates = index.get_duplicates()
-> 2330             raise Exception('Index has duplicate keys: %s' % duplicates)
   2331
   2332         # clear up memory usage

Exception: Index has duplicate keys: [1]

In [192]: df
Out[192]:
   two
0    1
1    2
2    3

In [202]: print sys.version
2.7.3 (default, Aug  1 2012, 05:14:39)
[GCC 4.6.3]

In [203]: print pd.version.version
0.8.1
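A possible workaround in the meantime, sketched under the assumption that the non-inplace form of set_index builds a new DataFrame and leaves the original untouched when the integrity check fails. The helper name safe_set_index is hypothetical and not part of pandas; the broad except clause is there because the exception type differs between versions (a plain Exception in the traceback above, ValueError in recent releases).

```python
import pandas as pd


def safe_set_index(df, keys):
    """Hypothetical helper (not a pandas API).

    Uses the non-inplace form of set_index, so the caller's frame is
    only replaced if the integrity check passes.
    """
    return df.set_index(keys, verify_integrity=True)


df = pd.DataFrame({'one': [1, 1, 2], 'two': [1, 2, 3]})

try:
    # Rebind df only when set_index succeeds.
    df = safe_set_index(df, ['one'])
except Exception as exc:  # ValueError in recent pandas, Exception in 0.8.1
    print("set_index failed, df left unchanged:", exc)

# Both original columns are still present after the failed call.
print(list(df.columns))
```

Because the result is assigned only after set_index returns, a failed check leaves df with both of its original columns.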