Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simplify implementation of Delta #378

Merged
merged 4 commits into from
Oct 8, 2021
Merged

Simplify implementation of Delta #378

merged 4 commits into from
Oct 8, 2021

Conversation

devmotion
Copy link
Member

@devmotion devmotion commented Oct 8, 2021

Addresses #370 (comment) and similar comments in the PR (let's see if it breaks anything and what additional fixes are needed 😄).

In general, I think we should stop implementing internal, unexported and undocumented functions from Distances if it does not provide any clear performance benefit (e.g. in DotProduct). Nowadays it should be sufficient to just define (::Dist)(x, y) for the desired types of x and y. This could be added to this PR or done separately.

(Edit: At some point, we should probably make the long-proposed move and roll our own binary ops system with a Distances fallback and do not incorrectly declare these operations as metrics)

@@ -7,7 +7,7 @@ end
pairwise(d::PreMetric, X::AbstractVector) = pairwise(d, X, X)

function pairwise!(out::AbstractMatrix, d::PreMetric, X::AbstractVector, Y::AbstractVector)
return broadcast!(d, out, X, Y')
return broadcast!(d, out, X, permutedims(Y))
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests revealed this bug (caused the test errors for vectors of vectors with WhiteKernel): ' is recursive but here we just want to permute the outer dimension, hence one has to use permutedims. In the definition of pairwise above permutedims is already used.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we still need this stuff actually? It's now been implemented in Distances.jl no?
https://github.com/JuliaStats/Distances.jl/blob/a43d76b23874c10d8ab44909f7d4816ba88ef7db/src/generic.jl#L169

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe. As mentioned above, we shouldn't use Distances at all for (some of) our binary operations. I think a major overhaul of the pairwise system should be left for a different PR though since I assume it might break stuff and require more changes. One reason to be a bit careful and conservative is that Distances does not own Distances.pairwise but StatsBase (even though it is defined in StatsAPI).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. Let's leave it as it is for now. I'll open an issue to discuss this more generally.

@theogf theogf mentioned this pull request Oct 8, 2021
@devmotion devmotion merged commit f78f028 into master Oct 8, 2021
@devmotion devmotion deleted the dw/delta branch October 8, 2021 14:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants