-
-
Notifications
You must be signed in to change notification settings - Fork 18k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DEPR: deprecate SparseArray.values #26421
DEPR: deprecate SparseArray.values #26421
Conversation
Codecov Report
@@ Coverage Diff @@
## master #26421 +/- ##
==========================================
- Coverage 91.69% 91.68% -0.01%
==========================================
Files 174 174
Lines 50741 50743 +2
==========================================
- Hits 46529 46526 -3
- Misses 4212 4217 +5
Continue to review full report at Codecov.
|
Codecov Report
@@ Coverage Diff @@
## master #26421 +/- ##
==========================================
+ Coverage 91.74% 91.75% +<.01%
==========================================
Files 174 174
Lines 50763 50754 -9
==========================================
- Hits 46575 46567 -8
+ Misses 4188 4187 -1
Continue to review full report at Codecov.
|
@@ -2272,10 +2272,10 @@ def _cast_sparse_series_op(left, right, opname): | |||
# TODO: This should be moved to the array? | |||
if is_integer_dtype(left) and is_integer_dtype(right): | |||
# series coerces to float64 if result should have NaN/inf | |||
if opname in ('floordiv', 'mod') and (right.values == 0).any(): | |||
if opname in ('floordiv', 'mod') and (right.to_dense() == 0).any(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we not be using np.asarry
? generally rather than .to_dense()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Both are equivalent (although to_dense
actually does a bit less as it specified the dtype and asarray
does some inference (not sure for that difference though)).
cc @TomAugspurger since you are most familiar with Sparse nowadays .. (although reluctantly :-)) Removing this here also further entangles a bit the get_values / values mess, as SparseArray is still the only array with .values, and in some places we do |
+1 Looks like a few warnings still https://dev.azure.com/pandas-dev/pandas/_build/results?buildId=11542&view=logs&jobId=521b7dfd-2989-5ff8-bc8c-7481906480fa&taskId=07b8d9d4-6363-5e2d-bc2b-146a30521256&lineStart=154&lineEnd=154&colStart=109&colEnd=115 My other PR is adding
to our setup.cfg. If you make the error message something like SparseArray.values, these warnings would be elevated to errors too (not sure if we want that or not). |
Ah, I missed the apply ones. There is one (that I actually already knew about, but for now ignored) that is not that easy to solve: the json code (ujson/python/objToJSON.c) checks in C for a 'values' attribute to get the values out of dataframe / series / index etc. |
@TomAugspurger @jreback can you have a new look? I added some extra compat code in cython/c code |
@@ -28,6 +28,14 @@ cdef _get_result_array(object obj, Py_ssize_t size, Py_ssize_t cnt): | |||
return np.empty(size, dtype='O') | |||
|
|||
|
|||
cdef bint _is_sparse_array(object obj): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would this be better-suited for pandas._libs.util
? Or keep here since this is the only file using it and it's temporary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, exactly for those reasons (It's only used here, and should be removed again once we get rid of this deprecation), I would keep it here (it's not mean to be a general utility)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is not the right location not should be in util
your argument is not correct ; just because we eventually will remove it does not mean it should. it be with similar code
+1
…On Mon, May 20, 2019 at 8:18 AM Joris Van den Bossche < ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In pandas/_libs/reduction.pyx
<#26421 (comment)>:
> @@ -28,6 +28,14 @@ cdef _get_result_array(object obj, Py_ssize_t size, Py_ssize_t cnt):
return np.empty(size, dtype='O')
+cdef bint _is_sparse_array(object obj):
Yes, exactly for those reasons (It's only used here, and should be removed
again once we get rid of this deprecation), I would keep it here
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#26421?email_source=notifications&email_token=AAKAOIS6HRAYVGNAUBPWWMDPWKQJZA5CNFSM4HNKAIJ2YY3PNVWWK3TUL52HS4DFWFIHK3DMKJSXC5LFON2FEZLWNFSXPKTDN5WW2ZLOORPWSZGOBZDF2AQ#discussion_r285587187>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAKAOIXL3A3WPZHBHQW2H7TPWKQJZANCNFSM4HNKAIJQ>
.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not really sure of the urgency here @jorisvandenbossche
i have some comments - and will fully review at some point
@@ -28,6 +28,14 @@ cdef _get_result_array(object obj, Py_ssize_t size, Py_ssize_t cnt): | |||
return np.empty(size, dtype='O') | |||
|
|||
|
|||
cdef bint _is_sparse_array(object obj): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is not the right location not should be in util
your argument is not correct ; just because we eventually will remove it does not mean it should. it be with similar code
@@ -28,6 +28,14 @@ cdef _get_result_array(object obj, Py_ssize_t size, Py_ssize_t cnt): | |||
return np.empty(size, dtype='O') | |||
|
|||
|
|||
cdef bint _is_sparse_array(object obj): | |||
# TODO can be removed one SparseArray.values is removed (GH26421) | |||
if hasattr(obj, '_subtyp'): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this idiom should be getattr
Sorry, there was no urgency at all. Just thought for a moment that the review of Tom was enough, and wanted to get over with this PR. Will wait on your full review then before doing any fixup. |
@jorisvandenbossche my main comment was the Is_sparse_array needs to be in util.pyx (doesn't matter that we will eventually remove it), its in the wrong place. slight confusion between whether we recommend |
Having a
.values
attribute on SparseArray is confusing, as.values
is typically used on Series/DataFrame/Index and not on the array classes.