Feature Request: categorical.reset_order #9190
Comments
Thanks, yeah, that's helpful. AFAICT, it still doesn't look like there's a way to just drop the ordering, though, correct? Re: Unrelated, I'm sure it was discussed ad nauseam, but I was also surprised that |
just set the cc @JanSchulz .... do you recall the exact discussion w.r.t. |
@jreback Your example above of resetting the order could maybe be added to the docs. (I couldn't find it there just now: they cover setting the order and sorting, but not resetting it.) |
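For readers landing here later, a minimal sketch of dropping and re-assigning the order on a Categorical. It assumes a pandas version that provides `as_unordered` and `reorder_categories`, which were not necessarily available when this thread was written, and it is not necessarily the example referred to above. The category labels are made up for illustration.

```python
import pandas as pd

# Ordered categorical with an explicit category order (illustrative labels).
cat = pd.Categorical(
    ["low", "high", "mid", "low"],
    categories=["low", "mid", "high"],
    ordered=True,
)

# Drop the ordering; the categories and the values stay the same.
unordered = cat.as_unordered()

# Re-assign an ordering, here with the categories listed in reverse.
reordered = unordered.reorder_categories(["high", "mid", "low"], ordered=True)
```

Both calls return a new Categorical and leave the original untouched.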
Oh, nice. I still think a method for changing the state of the object |
The default to ordered=True was due to the fact that Stata only stores numerical data, so it is always possible to order according to the numeric values, and that it was trivial to drop the ordering if needed but non-trivial to re-assign it if read in as unordered. Oops, this was only w.r.t |
In the discussion, we wanted to have an ordered categorical when the underlying data had an order, which is true in most cases (ints, strings, ... are all orderable). So

    def __init__(self, values, categories=None, ordered=None, name=None, fastpath=False,
                 levels=None):
        [...]
        # case without explicit categories
        # If the underlying data structure was sortable, and the user doesn't want to
        # "forget" this order, the categorical also is sorted/ordered
        if ordered is None:
            ordered = True
        # case with explicit categories
        # if we got categories, we can assume that the order is intended
        # if ordered is unspecified
        if ordered is None:
            ordered = True
        [...]
        self.ordered = False if ordered is None else ordered

regarding a
|
@bashtage Yeah, I just meant with Categorical in general. @JanSchulz Hmm. I think the rationale for the default should be based on what's more common in the real world. My prior is that unordered factors are much more common. R defaults to unordered unless Re: drop ordering, it would just be the same as |
If stata has |
Stata's datafile format does not explicitly allow a determination of whether a labeled variable is ordered or not - only the end user has this information. The primary reason to import as an ordered categorical is that
|
This topic is now also in #9347: should |
I was thinking about trying to add a to_unordered method, then I thought maybe a reset_order (or reorder?) with an optional drop keyword à la reset_index makes more sense. I didn't see if this was possible yet, so this is also a question. Is this possible via some other syntactic sugar? I might see if I can hack this together at some point unless someone beats me to it. My motivation for this is that I'm getting all ordered or unordered factors using read_stata. This could be "fixed" there by taking a list in addition to the boolean for converting to ordered or whatever, but I think a method like this would be generally useful, plus I never peek at data before reading it.
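One way the per-column use case could be handled after reading, sketched under assumptions: `drop_order` is a hypothetical helper, not part of pandas, and the frame below is hand-built to stand in for what read_stata might return with every labeled variable ordered. It relies on the `Series.cat.as_unordered` accessor method found in current pandas.

```python
import pandas as pd


def drop_order(df, columns):
    """Return a copy of df with the listed categorical columns made unordered.

    Hypothetical helper for illustration; not a pandas function.
    """
    out = df.copy()
    for col in columns:
        out[col] = out[col].cat.as_unordered()
    return out


# Stand-in for a frame where every categorical column came back ordered.
df = pd.DataFrame({
    "rating": pd.Categorical(
        ["bad", "good", "ok"], categories=["bad", "ok", "good"], ordered=True
    ),
    "region": pd.Categorical(
        ["north", "south", "north"], categories=["north", "south"], ordered=True
    ),
})

# Keep the order on "rating", drop it on "region".
result = drop_order(df, ["region"])
```

Passing a list of column names keeps the decision with the caller, who is the only one who knows which labeled variables are really ordinal.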