REGR: Fixes first_valid_index when DataFrame or Series has duplicate row index (GH21441) #21497
@@ -8970,17 +8970,17 @@ def _find_valid_index(self, how):

         if how == 'first':
             # First valid value case
-            i = is_valid.idxmax()
-            if not is_valid[i]:
-                return None
-            return i
+            i = is_valid.values[::].argmin()
[Review comment] just call this idxpos, no need for i any longer
[Reply] Thanks - done
+            idxpos = i

         elif how == 'last':
             # Last valid value case
             i = is_valid.values[::-1].argmax()
[Review comment] make this idxpos
-            if not is_valid.iat[len(self) - i - 1]:
-                return None
-            return self.index[len(self) - i - 1]
+            idxpos = len(self) - i - 1
+
+        if not is_valid.iat[idxpos]:
+            return None
+        return self.index[idxpos]

     @Appender(_shared_docs['valid_index'] % {'position': 'first',
                                              'klass': 'NDFrame'})
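To illustrate the positional lookup this change moves to, here is a minimal standalone sketch. It is not the pandas source: `find_valid_label` is a hypothetical helper, and the rule that a DataFrame row counts as valid when any column is non-null is an assumption. Both directions use `argmax`, which on a boolean array returns the position of the first `True`. Because `argmax` returns 0 when no element is `True`, the final validity check is still needed.

```python
import numpy as np
import pandas as pd


def find_valid_label(obj, how):
    """Hypothetical stand-in for _find_valid_index (illustration only).

    Works on positions, so duplicate row labels never enter the lookup.
    """
    if len(obj) == 0:
        return None

    mask = obj.notna()
    if isinstance(obj, pd.DataFrame):
        # Assumption: a row is valid if any of its columns is non-null.
        mask = mask.any(axis=1)
    values = np.asarray(mask, dtype=bool)

    if how == "first":
        # Position of the first True (0 if there is none).
        pos = int(values.argmax())
    elif how == "last":
        # Search the reversed array, then map back to a forward position.
        pos = len(values) - 1 - int(values[::-1].argmax())
    else:
        raise ValueError("how must be 'first' or 'last'")

    # argmax cannot signal "no True found", so check the position explicitly.
    if not values[pos]:
        return None
    return obj.index[pos]
```

The check at the end reads a single boolean by position. A label-based check such as `is_valid[i]` can select several rows when label `i` is duplicated, which is presumably what made the original truth test fail.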
@@ -612,6 +612,16 @@ def test_pct_change(self, periods, fill_method, limit, exp):
         else:
             tm.assert_series_equal(res, Series(exp))

+    @pytest.mark.parametrize("DF,idx,first_idx,last_idx", [
[Review comment] Some renaming suggestions for readability:
[Reply] Thanks - done
+        ({'A': [1, 2, 3]}, [1, 1, 2], 1, 2),
+        ({'A': [1, 2, 3]}, [1, 2, 2], 1, 2),
+        ({'A': [1, 2, 3, 4]}, ['d', 'd', 'd', 'd'], 'd', 'd')])
+    def test_valid_index(self, DF, idx, first_idx, last_idx):
+        # GH 21441
+        df1 = pd.DataFrame(DF, index=idx)
[Review comment] You can just call this
+        assert first_idx == df1.first_valid_index()
+        assert last_idx == df1.last_valid_index()
+

 class TestNDFrame(object):
     # tests that don't fit elsewhere
[Review comment] I think this should be :meth:`DataFrame.first_valid_index`
[Review comment] you don't need a separate sub-section here, just list the issue
'raised for a row index'
[Reply] Thanks - updated. I have left it as :meth:`first_valid_index`, as this issue affects both DataFrame and Series (though the example and title of the original issue point only to DataFrame).
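As a quick check of the point that both DataFrame and Series are affected, the sketch below calls the public methods on each type with the same duplicated row labels. The labels and values are made up for illustration, and it assumes a pandas version that includes this fix.

```python
import numpy as np
import pandas as pd

# Duplicate row labels, chosen only for illustration.
labels = [1, 1, 2, 2]
data = [np.nan, 10.0, 20.0, np.nan]

ser = pd.Series(data, index=labels)
frame = pd.DataFrame({"A": data}, index=labels)

# Collect (first, last) valid labels for each container type.
results = {
    "series": (ser.first_valid_index(), ser.last_valid_index()),
    "frame": (frame.first_valid_index(), frame.last_valid_index()),
}
print(results)
```

With the fix in place, both entries should report label 1 as the first valid index and label 2 as the last.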