filter performance #3460

Closed
sprig opened this issue Aug 29, 2024 · 2 comments

sprig (Contributor) commented Aug 29, 2024

Hello,

Thanks for your work on this package!

I happened to come upon your comment in the Julia forums regarding the performance of filter on whole data frames vs. column selectors:
https://discourse.julialang.org/t/adding-multiple-new-columns-to-dataframe/50997/14

Having read the documentation for filter in the installed versions, on the docs site, and now also the docstrings directly in the code, I see no mention of this fact other than perhaps indirectly (i.e. one can conclude it from the fact that rows, rather than columns, are passed to the function).

I assume this is still correct, due to e.g. type stability as well as the cost of unpacking rows versus accessing the column vectors directly? Could you please confirm? And do you think it would be prudent to state this explicitly?

bkamins (Member) commented Aug 29, 2024

Yes, the reason is that DataFrameRow is not type stable. This info can be added to a docstring. Would you be willing to make a PR? (I can do it instead.)
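
For illustration, here is a minimal sketch of the two forms of filter being compared in this thread. The data frame contents and variable names are made up for the example and do not come from the issue; no timings are shown, since they depend on the data and the DataFrames.jl version.

```julia
using DataFrames

# Hypothetical sample data, only for illustration.
df = DataFrame(x = [1, -2, 3, -4], y = ["a", "b", "c", "d"])

# Row form: the predicate receives a DataFrameRow. Its column types are
# not known to the compiler, so every `row.x` access is dynamically typed.
by_row = filter(row -> row.x > 0, df)

# Column-selector form: the predicate receives the values of :x directly,
# so it can be specialized on the column's concrete element type.
by_col = filter(:x => x -> x > 0, df)

# Several columns are passed as positional arguments in the same way.
by_cols = filter([:x, :y] => (x, y) -> x > 0 && y != "c", df)
```

Both forms select the same rows; the difference is only in how the predicate is called, which is why the column-selector form is usually the better choice when performance matters.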

@bkamins bkamins added the doc label Aug 29, 2024
@bkamins bkamins added this to the 1.7 milestone Aug 29, 2024
sprig (Contributor, Author) commented Aug 30, 2024

Sure!

@bkamins bkamins closed this as completed in 97bbb40 Sep 7, 2024