add filtercols and filtercols! #2508

Open
bkamins opened this issue Nov 1, 2020 · 9 comments
Labels
non-breaking The proposed change is not breaking
Milestone

Comments

@bkamins
Member

bkamins commented Nov 1, 2020

We already have mapcols, so we could consider adding filtercols (with a view kwarg) and filtercols!, which would work like filter but would filter columns instead of rows. There is no rush to add this.

@bkamins bkamins added the non-breaking The proposed change is not breaking label Nov 1, 2020
@bkamins bkamins added this to the 1.x milestone Nov 1, 2020
@zerefwayne

@nalimilan @bkamins This looks interesting. Can I work on this?

@bkamins
Member Author

bkamins commented Nov 9, 2020

Sure, just make sure you understand the API of filter and filter! first. Also, this should be synced with #2417, as the API should be the same in both places.
The question is whether the predicate should receive a column or a "column name" => column Pair. I think it should be just a column.
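
For illustration, here is a minimal sketch of the "predicate gets just a column" option, built only from existing indexing. The name filtercols_sketch and its view keyword are hypothetical and are not part of DataFrames.jl; the actual API was not settled in this thread.

```julia
using DataFrames

# Hypothetical helper (not a DataFrames.jl function): keep the columns for
# which `pred` returns true. The predicate receives the column vector itself,
# not a "column name" => column Pair.
function filtercols_sketch(pred, df::AbstractDataFrame; view::Bool=false)
    keep = Bool[pred(col) for col in eachcol(df)]
    # `view` is shadowed by the keyword argument, so call Base.view explicitly.
    return view ? Base.view(df, :, keep) : df[:, keep]
end

df = DataFrame(a=[1, 2, 3], b=["x", "y", "z"], c=[1.5, missing, 2.5])

# Keep only the columns that contain no missing values.
no_missing = filtercols_sketch(col -> !any(ismissing, col), df)

# Keep only the columns whose element type is numeric, as a view.
numeric = filtercols_sketch(col -> eltype(col) <: Number, df; view=true)
```

With a Pair-based predicate the caller could also filter on the column name, at the cost of a more verbose predicate signature.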

@bkamins
Member Author

bkamins commented Nov 9, 2020

Ah, and filter should additionally support the copycols kwarg.

@bkamins
Member Author

bkamins commented Nov 9, 2020

And one last thing: maybe wait for the 0.22 release before opening a PR. Then you will be sure that you are working on a clean master (of course you can start the implementation now if you want; 0.22 should be tagged this week). Finally, have a look at https://github.com/JuliaData/DataFrames.jl/blob/master/CONTRIBUTING.md, as following it will speed up the review process.

@zerefwayne

I'll have a look at filter first. It will take me roughly a week to implement this, and I'll open the PR after syncing with 0.22. Thank you for the heads-up @bkamins!

@bkamins bkamins modified the milestones: 1.x, 1.5 Dec 4, 2022
@bkamins bkamins modified the milestones: 1.5, 1.6 Feb 5, 2023
@bottine

bottine commented Apr 14, 2023

I don't know if that's reasonable with respect to orthogonality and the API, but I would have liked to have mapcols!(fun, df, cols), where cols is the usual column selector, so that a function can be applied only to specific columns. For example, one could write mapcols!(uppercase, df, names(df, String)) to make string values uppercase.

@bkamins
Member Author

bkamins commented Apr 14, 2023

Now the pattern for this is:

transform!(df, names(df, String) .=> ByRow(uppercase), renamecols=false)

I understand you find mapcols! shorter or easier to reason about?

Note that you would need to write:

mapcols!(col -> uppercase.(col), df, names(df, String))

though.
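
A small self-contained run of the transform! pattern above, on made-up data (the column names and values are only for illustration):

```julia
using DataFrames

df = DataFrame(name=["ann", "bob"], city=["oslo", "rome"], age=[31, 42])

# names(df, String) selects the columns whose element type is a subtype of
# String, so the integer column `age` is left untouched.
# ByRow(uppercase) applies uppercase to each element of a selected column,
# and renamecols=false keeps the original column names.
transform!(df, names(df, String) .=> ByRow(uppercase), renamecols=false)
```

After this call the name and city columns hold uppercase strings, and the column order and names are unchanged.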

@bottine

bottine commented Apr 14, 2023

Ah, indeed, I didn't think it would be so short with transform!, you're right!
But then,

mapcols!(fun, df) == transform!(df, All() .=> fun, renamecols=false)

so the point of mapcols! is mostly that it's a bit shorter?
(Btw, let this be an opportunity to thank you for the package!)
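
A quick way to check this equivalence on a toy table is sketched below. It uses the non-mutating mapcols and transform so both results can be compared, and it assumes a DataFrames.jl version in which All() can be broadcast with .=> (1.3 or later, as far as I know). The helper double is made up for the example.

```julia
using DataFrames

df = DataFrame(x=[1, 2, 3], y=[10, 20, 30])

# Illustrative whole-column function: takes a column, returns a new column.
double(col) = 2 .* col

via_mapcols = mapcols(double, df)
via_transform = transform(df, All() .=> double, renamecols=false)

# Both approaches should produce the same columns under the same names.
same = via_mapcols == via_transform
```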

@bkamins
Member Author

bkamins commented Apr 14, 2023

Yes, the original point is that mapcols was introduced much earlier in the development process of DataFrames.jl than transform.

@bkamins bkamins modified the milestones: 1.6, 1.7 Jul 10, 2023
@bkamins bkamins modified the milestones: 1.7, 1.x Sep 14, 2024
3 participants