ENH: upgrade categoricals to a first class pandas type

GH3943, GH5313, GH5314, GH7444

ENH: delegate _reduction and ops from Series to the categorical to support min/max, and raise TypeError on other (numerical) ops and reductions

Add Categorical Properties to Series

Default to 'ordered' Categoricals if values are ordered

Categorical: add level assignments and reordering + changed default for ordered

Add a `Categorical.reorder_levels()` method. Change some naming in `Series`, so that the methods do not clash with established standards, and rename the other categorical methods accordingly.

Also change the default for `ordered` to True if values + levels are passed in at creation time.

Initial doc version for working with Categorical data

Categorical: add Categorical.mode() and use that in Series.mode()

Categorical: implement remove_unused_levels()

Categorical: implement value_count() for categorical series

Categorical: make Series.astype("category") work

ENH: add setitem to Categorical

BUG: assigning to levels not in level set now raises ValueError

API: disallow numpy ufuncs with categoricals

Categorical: Categorical assignment to int/obj column

ENH: add support for fillna to Categoricals

API: deprecate old style categorical constructor usage and change default

Previously it was possible to pass in precomputed labels/pointers and the corresponding levels (e.g. `Categorical([0,1,2], levels=["a","b","c"])`). This could lead to subtle errors with integer categoricals: the following could be interpreted either as "precomputed pointers and levels" or as "values and levels", and converting it back to an integer array gives a different result in each case:

`np.array(Categorical([1,2], levels=[1,2,3]))`
interpreted as pointers: `[2,3]`
interpreted as values: `[1,2]`

Up to now we would favour the old style "pointers and levels" if the values could be interpreted as such (see code for details...).

With this commit we favour the new style "values and levels" and only attempt to interpret them as "pointers and levels" if "compat=True" is passed to the constructor.

BREAKS: This will break code which uses Categoricals with "pointers and levels". A short Google search and a search on Stack Overflow revealed no such usage.

Categorical: document constructor changes and small fixes

Categorical: document that inappropriate numpy functions won't work anymore

ENH: concat support
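
The pointer/value ambiguity described above can be reproduced without pandas. The following is only an illustrative sketch in plain Python: it does not call the `Categorical` constructor, and the names `data`, `levels`, `as_pointers` and `as_values` are hypothetical, chosen here to mirror the `Categorical([1,2], levels=[1,2,3])` example.

```python
# Same inputs as the example in the commit message.
data = [1, 2]
levels = [1, 2, 3]

# Old-style reading: each entry of `data` is a zero-based pointer into `levels`.
as_pointers = [levels[i] for i in data]

# New-style reading: each entry of `data` is already a value and only has
# to be a member of `levels`.
as_values = [v for v in data if v in levels]

print("interpreted as pointers:", as_pointers)  # [2, 3]
print("interpreted as values:  ", as_values)    # [1, 2]
```

Both readings are valid for this input, which is why the constructor has to pick one of them as the default.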