DOC: Clarify that unique() promotes dtype to 64-bit #27869

Closed

stuarteberg
Contributor

I found this behavior surprising:

In [1]: pd.Series([1,2,3], dtype=np.uint8).unique()
Out[1]: array([1, 2, 3], dtype=uint64)

... because I did not expect a different dtype in the result. This PR adds a sentence to the docs to clarify this.
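Until the behavior is settled, a caller who needs the source dtype can cast the result back explicitly. The following is a minimal sketch, not part of this PR's diff; the variable names are illustrative, and the `uint64` result is simply what the session above reports, so other pandas versions may already return `uint8`, in which case the cast is a no-op.

```python
import numpy as np
import pandas as pd

# Same kind of input as in the session above, with a few repeated values.
s = pd.Series([1, 2, 3, 2, 1], dtype=np.uint8)

uniques = s.unique()
# Reported as uint64 in the session above; this may differ by pandas version.
print(uniques.dtype)

# Cast back to the Series dtype so downstream code sees the source dtype.
# copy=False lets NumPy skip the copy when the dtype already matches.
uniques_u8 = uniques.astype(s.dtype, copy=False)
print(uniques_u8.dtype)
```

The cast is safe here because every unique value comes from the original `uint8` data, so nothing can overflow on the way back.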

  • passes black pandas
  • passes git diff upstream/master -u -- "*.py" | flake8 --diff
  • (n/a) closes #xxxx
  • (n/a) tests added / passed
  • (n/a) whatsnew entry

@stuarteberg
Contributor Author

FWIW, these changes do pass black and flake8. The CI failures appear unrelated.

@TomAugspurger
Contributor

CI failure is fixed on master.

Is this deliberate, or an implementation detail? We should document that.

@stuarteberg
Contributor Author

> Is this deliberate, or an implementation detail?

My hunch is that it's an implementation detail. For example, pd.unique() differs from np.unique() in this respect, and I don't know why pandas would go out of its way to be different. It's probably just incidental.

Heck, maybe it's even accidental (i.e. a bug). Should this line be changed as follows?

-    uniques = _reconstruct_data(uniques, dtype, original)
+    uniques = _reconstruct_data(uniques, original.dtype, original)

In case that looks right to you, I've opened a PR: #27874
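For anyone who wants to check the comparison with `np.unique()` mentioned above, here is a short sketch. It only calls the public `np.unique` and `pd.unique` functions; the input array is made up for illustration. The `pd.unique` dtype is printed without a stated expectation, since that is the very thing under discussion and may depend on the pandas version.

```python
import numpy as np
import pandas as pd

# Illustrative uint8 input with repeats and no particular order.
values = np.array([3, 1, 2, 3, 1], dtype=np.uint8)

# np.unique returns the sorted unique values and keeps the input dtype.
np_result = np.unique(values)

# pd.unique returns values in order of first appearance (no sorting).
pd_result = pd.unique(values)

print(np_result, np_result.dtype)
print(pd_result, pd_result.dtype)  # dtype is the behavior in question
```

Note that the two functions also differ in ordering, so a dtype comparison should not assume the arrays are element-wise identical.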

@jreback
Contributor

jreback commented Aug 15, 2019

superseded by #27874

@jreback closed this on Aug 15, 2019
@jreback added the Dtype Conversions, Reshaping, and Algos labels on Oct 7, 2019