Pairwise correlation #759
names_series = cols |> Shared.from_list(:string) |> Shared.create_series()

from_series([{column_name, names_series} | correlations])
end
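For readers without the full diff, the result shape built above (a names column followed by one column of correlations per input column) can be sketched in plain Elixir. This is only an illustration, not Explorer's backend code: the module `PairwiseSketch`, its functions, and the use of Pearson correlation are assumptions made for the example, and it assumes no input column is constant.

```elixir
defmodule PairwiseSketch do
  # Hypothetical helper: Pearson correlation of two equal-length number lists.
  # Assumes neither list is constant (otherwise the division below fails).
  def pearson(xs, ys) do
    n = length(xs)
    mean_x = Enum.sum(xs) / n
    mean_y = Enum.sum(ys) / n

    {cov, var_x, var_y} =
      xs
      |> Enum.zip(ys)
      |> Enum.reduce({0.0, 0.0, 0.0}, fn {x, y}, {cov, vx, vy} ->
        dx = x - mean_x
        dy = y - mean_y
        {cov + dx * dy, vx + dx * dx, vy + dy * dy}
      end)

    cov / :math.sqrt(var_x * var_y)
  end

  # Takes an ordered list of {name, values} pairs and returns an ordered list
  # of columns: the names column first, then one correlation column per input.
  def matrix(columns, column_name \\ "names") do
    names = Enum.map(columns, fn {name, _values} -> name end)

    correlations =
      for {name, values} <- columns do
        {name, Enum.map(columns, fn {_other, other_values} -> pearson(values, other_values) end)}
      end

    [{column_name, names} | correlations]
  end
end

columns = [{"a", [1, 2, 3]}, {"b", [2, 4, 6]}, {"c", [3, 2, 1]}]
result = PairwiseSketch.matrix(columns)
```

The last expression mirrors the `[{column_name, names_series} | correlations]` construction in the diff: the head of the list is the names column and the tail holds the correlation columns.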
Do we need it in the backend? I think the implementation within Explorer.DataFrame was better, no? Or is this faster in any way?
@josevalim we did it in the backend because I wanted to raise for the lazy version, since we cannot implement it for lazy frames right now. Do you think it's worth reverting anyway?
@philss maybe we can implement it with lazy by implementing it with a mutate + select/discard?
This way it works with lazy and within Explorer.DataFrame as well! Something like:
mutate_with(df, fn ldf ->
Enum.map(columns, ...)
end)
|> select(existing_columns -- new_columns, :discard)
WDYT?
Oh, I'm going to try.
@josevalim I think this won't work the way we want, because mutate_with expects only lazy_series/expressions as values, and we are trying to create it with a list of lazy series. On the other hand, we could try creating a column for each pair and pivoting the results later. But again, pivoting does not work with lazy frames.
Maybe there is another way to reshape this DF, but I don't know yet.
I'm going to investigate more tomorrow :)
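To make the "one value per pair, then pivot" idea concrete, here is a small eager sketch in plain Elixir. It is not Explorer code and does not address the lazy-frame limitation mentioned above: the module `PivotSketch`, the `pivot/3` function, and the map-of-pairs input format are all hypothetical, chosen only to show the reshaping step.

```elixir
defmodule PivotSketch do
  # `pairs` maps {row_name, col_name} to a precomputed correlation value.
  # Returns one map per row name, keyed by the names column plus each column name.
  def pivot(pairs, names, column_name \\ "names") do
    for row <- names do
      Enum.reduce(names, %{column_name => row}, fn col, acc ->
        # Map.fetch!/2 raises if a pair is missing, so gaps are not silently hidden.
        Map.put(acc, col, Map.fetch!(pairs, {row, col}))
      end)
    end
  end
end

pairs = %{
  {"a", "a"} => 1.0,
  {"a", "b"} => 0.5,
  {"b", "a"} => 0.5,
  {"b", "b"} => 1.0
}

rows = PivotSketch.pivot(pairs, ["a", "b"])
```

Each resulting row corresponds to one line of the correlation matrix, which is the wide shape a pivot over per-pair values would need to produce.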
Hrm, now that you mention it, I think you are right. Feel free to ship it. :)
Actually, don't ship it yet. I would like Chris' approval on the API first. :D
I like the API. Matches R's |
The idea is to make clear that this won't work yet for lazy frames. Co-authored-by: Cristine Guadelupe <cristineguadelupe@me.com>
4febd4d to a20e098
@cigrainger makes sense! Thanks! We added a note in the function description, just like we have with the |
Looks great!
No description provided.