Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(pivot-wider): handle the case of empty id_cols #9912

Merged
merged 1 commit into from
Aug 27, 2024

Conversation

cpcloud
Copy link
Member

@cpcloud cpcloud commented Aug 23, 2024

Fixes an issue with handling simpler cases of Table.pivot_wider where you want to take two columns and use one as the header and one as the first and only row.

@cpcloud cpcloud added this to the 9.4 milestone Aug 23, 2024
@cpcloud cpcloud added feature Features or general enhancements ux User experience related issues labels Aug 23, 2024
@cpcloud cpcloud requested review from jcrist and gforsyth August 23, 2024 22:11
@cpcloud
Copy link
Member Author

cpcloud commented Aug 23, 2024

cc @IndexSeek

@cpcloud cpcloud force-pushed the pivot-wider-empty-id-cols branch from b25bac1 to 71a3dbd Compare August 23, 2024 22:25
@cpcloud cpcloud added the ci-run-cloud Add this label to trigger a run of BigQuery, Snowflake, and Databricks backends in CI label Aug 24, 2024
@ibis-docs-bot ibis-docs-bot bot removed the ci-run-cloud Add this label to trigger a run of BigQuery, Snowflake, and Databricks backends in CI label Aug 24, 2024
@IndexSeek
Copy link
Member

Thank you for working on this! I wanted to add a bit of context to the problem I was seeing here and to help repro/explain the desired outcome.

Example data

from ibis.interactive import *

t = ibis.memtable(dict(outcome=["yes", "no"], counted=[3, 4]))
t

┏━━━━━━━━━┳━━━━━━━━━┓
┃ outcomecounted ┃
┡━━━━━━━━━╇━━━━━━━━━┩
│ stringint64   │
├─────────┼─────────┤
│ yes3 │
│ no4 │
└─────────┴─────────┘

I was hoping to "transpose" this with pivot_wider.

Failed attempt

t.pivot_wider(names_from="outcome", values_from="counted")

Expected signature: GroupedTable(table: Relation, groupings: Annotated[tuple[Value, ...], Length(at_least=1, at_most=None)], orderings: tuple = (), havings: tuple = ())

Hacky workaround

t.mutate(dummy=1).pivot_wider(names_from="outcome", values_from="counted").drop("dummy")
┏━━━━━━━┳━━━━━━━┓
┃ yes   ┃ no    ┃
┡━━━━━━━╇━━━━━━━┩
│ int64 │ int64 │
├───────┼───────┤
│     3 │     4 │
└───────┴───────┘

Relatively complex workaround

cases = {
    outcome: ibis.case()
    .when(_.outcome == outcome, _.counted)
    .end()
    for outcome in ("yes", "no")
}
t.select(**cases).agg(s.across(s.all(), _.sum()))
┏━━━━━━━┳━━━━━━━┓
┃ yes   ┃ no    ┃
┡━━━━━━━╇━━━━━━━┩
│ int64 │ int64 │
├───────┼───────┤
│     3 │     4 │
└───────┴───────┘

Copy link
Member

@jcrist jcrist left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM. If you're adding none in #9917 I guess we'll want to hold-off merging this and rebase after that one.

@cpcloud
Copy link
Member Author

cpcloud commented Aug 26, 2024

Yep, SGTM, I started the refactor after this PR, so I'll merge that one first and then this one.

Thanks for the review(s)!

@cpcloud cpcloud force-pushed the pivot-wider-empty-id-cols branch from 728b438 to 89fffc0 Compare August 27, 2024 11:18
@cpcloud cpcloud enabled auto-merge (squash) August 27, 2024 11:18
@cpcloud cpcloud merged commit 4a4bc64 into ibis-project:main Aug 27, 2024
81 checks passed
@cpcloud cpcloud deleted the pivot-wider-empty-id-cols branch August 27, 2024 12:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature Features or general enhancements ux User experience related issues
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants