Implement support for retaining Pandas index #6061
Codecov Report
Attention: Patch coverage is
Additional details and impacted files

@@             Coverage Diff              @@
##             main    #6061       +/-   ##
===========================================
- Coverage   88.68%   26.96%   -61.73%
===========================================
  Files         316      318        +2
  Lines       66072    67132     +1060
===========================================
- Hits        58598    18104    -40494
- Misses       7474    49028    +41554

Flags with carried forward coverage won't be shown.
May partially resolve #6058.
Looks good, left a few suggestions and questions for you.
Co-authored-by: Philipp Rudiger <prudiger@anaconda.com>
Okay, I'm happy with this PR personally. We really need to merge this and test it extensively in all kinds of scenarios over the next few weeks.
The Pandas support in HoloViews has always been severely hampered by the fact that we do not properly support indexes: if a user references an index column, we are forced to call .reset_index(), which is inefficient and breaks one core principle of HoloViews, namely that we simply provide thin wrappers around data. It also has significant performance implications, in both memory and speed, since indexes can provide significant speedups when indexing or performing aggregations. Supporting indexes also solves another major issue: having multiple elements provide a view onto the same DataFrame, e.g. when you have an NdOverlay of curves each visualizing a different column.

Implements #2537
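To make the motivation concrete, here is a minimal pandas-only sketch of the difference between flattening an index and reading it in place. The DataFrame, its column names and the variable names are made up for illustration; nothing here is taken from the HoloViews implementation in this PR.

```python
import pandas as pd

# Hypothetical example data: two value columns keyed by a named index "x".
df = pd.DataFrame(
    {"a": [1.0, 2.0, 3.0], "b": [4.0, 5.0, 6.0]},
    index=pd.Index([10, 20, 30], name="x"),
)

# reset_index() builds a new DataFrame in which "x" is an ordinary column,
# so a wrapper that calls it no longer holds the user's original object.
flat = df.reset_index()
is_new_object = flat is not df
flat_columns = list(flat.columns)

# Reading the index directly leaves the original DataFrame untouched.
x_values = df.index.get_level_values("x").to_numpy()

# The retained index also allows label-based lookups.
row_at_20 = df.loc[20]
```

Several views (for instance one per value column) can then share the single `df` object instead of each holding its own flattened copy.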
- __init__
- validate
- dtype
- dimension_type
- range
- iloc (check multi-index)
- values (check multi-index)
- sort
- reindex
- sample
- select
- groupby
- Code paths that use dataset.data directly do not make (now invalid) assumptions about indexes
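The last item is the easiest one to get wrong: once indexes are retained, a name may refer either to a regular column or to an index level, including a level of a MultiIndex. The sketch below shows one way such a lookup could be written in plain pandas. The helper name `dimension_values` and the sample data are hypothetical and are not part of this PR or of the HoloViews API.

```python
import numpy as np
import pandas as pd


def dimension_values(df: pd.DataFrame, name: str) -> np.ndarray:
    """Hypothetical helper: return values for `name`, whether it is a
    regular column or a (multi-)index level, without calling reset_index()."""
    if name in df.columns:
        return df[name].to_numpy()
    if name in df.index.names:
        return df.index.get_level_values(name).to_numpy()
    raise KeyError(name)


# Made-up data with a two-level index and one value column.
multi = pd.DataFrame(
    {"value": [1, 2, 3, 4]},
    index=pd.MultiIndex.from_product(
        [["a", "b"], [0, 1]], names=["group", "step"]
    ),
)

groups = dimension_values(multi, "group")
steps = dimension_values(multi, "step")
values = dimension_values(multi, "value")
```

Code that instead reads `df[name]` or `df.columns` directly would miss "group" and "step" here, which is the kind of assumption the checklist item refers to.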