-
-
Notifications
You must be signed in to change notification settings - Fork 18k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
BUG: Respect dups in reindexing CategoricalIndex #17355
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -365,18 +365,18 @@ def test_astype(self): | |
tm.assert_index_equal(result, expected) | ||
|
||
def test_reindex_base(self): | ||
|
||
# determined by cat ordering | ||
idx = self.create_index() | ||
# Determined by cat ordering. | ||
idx = CategoricalIndex(list("cab"), categories=list("cab")) | ||
expected = np.arange(len(idx), dtype=np.intp) | ||
|
||
actual = idx.get_indexer(idx) | ||
tm.assert_numpy_array_equal(expected, actual) | ||
|
||
with tm.assert_raises_regex(ValueError, 'Invalid fill method'): | ||
idx.get_indexer(idx, method='invalid') | ||
with tm.assert_raises_regex(ValueError, "Invalid fill method"): | ||
idx.get_indexer(idx, method="invalid") | ||
|
||
def test_reindexing(self): | ||
np.random.seed(123456789) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I would leave this, the fact it was random before actually catched a bug There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. You could, but "random" tests are generally not good design-wise. The fact that it happened to catch a bug doesn't change that IMO. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. why don't you add a new test for this instead (I mean just split the section that you added to a new test); I would not use random either, rather you can use the specific case that was actually causing the issue. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @gfyoung already added the specific case that was failing below without using random data There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
that might be in certain cases, but we have a ton of random tests in our test suite, and the majority has no seed set |
||
|
||
ci = self.create_index() | ||
oidx = Index(np.array(ci)) | ||
|
@@ -388,6 +388,18 @@ def test_reindexing(self): | |
actual = ci.get_indexer(finder) | ||
tm.assert_numpy_array_equal(expected, actual) | ||
|
||
# see gh-17323 | ||
# | ||
# Even when indexer is equal to the | ||
# members in the index, we should | ||
# respect duplicates instead of taking | ||
# the fast-track path. | ||
for finder in [list("aabbca"), list("aababca")]: | ||
expected = oidx.get_indexer_non_unique(finder)[0] | ||
|
||
actual = ci.get_indexer(finder) | ||
tm.assert_numpy_array_equal(expected, actual) | ||
|
||
def test_reindex_dtype(self): | ||
c = CategoricalIndex(['a', 'b', 'c', 'a']) | ||
res, indexer = c.reindex(['a', 'c']) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jorisvandenbossche : This test does exactly what you both were asking for: unique elements tested against identical indexer