Matrixkernel convenience functions and related performance improvements #363
base: master
@@ -80,6 +80,13 @@ type enables specialised implementations of e.g. [`kernelmatrix`](@ref) for

To find out more about the background, read this [review of kernels for vector-valued functions](https://arxiv.org/pdf/1106.6251.pdf).

If you are interested in the matrix-kernel interpretation, KernelFunctions provides a convenience function that computes the resulting kernel matrix for a pair of inputs directly.
```@docs
matrixkernel
```
<!-- Add when exporting IsotopicOutputs -->
<!-- One way to look at this is that applying `matrixkernel` pairwise to a list of inputs results in a block matrix, which when flattened is the same as the `kernelmatrix` computed when providing the same list of inputs as `MOInputIsotopicByOutputs`. -->

Review comment: "flattened" = "reshape from [N,N] array of [Q,Q] arrays to [NQ, NQ] array"? Would it be worth being more explicit here?
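A plain-array sketch of this block-matrix view may help (no KernelFunctions types involved; `kappa`, `B`, and `matkernel` below are illustrative stand-ins for a separable matrix kernel, not the package API):

```julia
using LinearAlgebra

# Hypothetical per-pair matrix kernel: an output covariance B scaled by a
# scalar Gaussian kernel kappa (illustrative, not the KernelFunctions API).
kappa(x, y) = exp(-sum(abs2, x - y) / 2)
B = [2.0 0.5; 0.5 1.0]              # Q x Q output covariance
matkernel(x, y) = B .* kappa(x, y)  # Q x Q block for one input pair

xs = [randn(3) for _ in 1:4]        # N = 4 inputs

# N x N array of Q x Q blocks ...
blocks = [matkernel(x, y) for x in xs, y in xs]

# ... flattened into an NQ x NQ matrix, block (i, j) at rows (i-1)Q+1:iQ
N, Q = length(xs), size(B, 1)
K = Matrix{Float64}(undef, N * Q, N * Q)
for i in 1:N, j in 1:N
    K[(i - 1) * Q .+ (1:Q), (j - 1) * Q .+ (1:Q)] = blocks[i, j]
end

# For a separable kernel this flattening is exactly a Kronecker product
Kfeatures = [kappa(x, y) for x in xs, y in xs]
@assert K ≈ kron(Kfeatures, B)
```

Here the output index varies fastest within each block row, i.e. the layout the Kronecker identity above produces.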
## Generic Utilities

KernelFunctions also provides miscellaneous utility functions.
@@ -43,6 +43,8 @@ end
_fbm(modX, modY, modXY, h) = (modX^h + modY^h - modXY^h) / 2

_mod(x::AbstractVector{<:Real}) = abs2.(x)
_mod(x::AbstractVector{<:AbstractVector{<:Real}}) = sum.(abs2, x)
# The two lines above could be combined into the second (dispatching on general
# AbstractVectors), but this is (somewhat) more performant.
_mod(x::ColVecs) = vec(sum(abs2, x.X; dims=1))
_mod(x::RowVecs) = vec(sum(abs2, x.X; dims=2))

Review comment: What is this new line needed for?
Author: Many other kernels work on arrays of arrays, but the FBM kernel errors as it could not find a method for `_mod`.
Review comment: That'd be great! Sounds like something that should be included in the test suite (maybe for all kernels?).
Author: Apologies for the delay, the error appears when doing
```julia
k = FBMKernel()
xs = [rand(5) for _ in 1:4]
kernelmatrix(k, xs)
```
Review comment: I know it's very tempting to just keep things together in a single PR from the author's point of view, but I do highly recommend splitting it up: then it's easier for different people to review the different (smaller) PRs, and it's easier to say "ok, I'll spend ten minutes reviewing a ten-line PR" instead of thinking "oh, when will I ever find enough time to review a several-hundred-line PR", and so on!
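As a quick sanity check of the `_mod` equivalences, the vector-of-vectors method and the matrix-columns computation agree (plain arrays only; `ColVecs` wraps a matrix whose columns are the inputs):

```julia
X = randn(5, 4)                       # 4 inputs of dimension 5, stored as columns
xs = [X[:, j] for j in 1:size(X, 2)]  # the same inputs as a vector of vectors

mod_vecs = sum.(abs2, xs)             # AbstractVector{<:AbstractVector} method
mod_cols = vec(sum(abs2, X; dims=1))  # ColVecs-style method

@assert mod_vecs ≈ mod_cols
```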
@@ -37,6 +37,49 @@ function (κ::LinearMixingModelKernel)((x, px)::Tuple{Any,Int}, (y, py)::Tuple{Any,Int})
    return sum(κ.H[i, px] * κ.K[i](x, y) * κ.H[i, py] for i in 1:length(κ.K))
end

## current implementation
# julia> @benchmark KernelFunctions.kernelmatrix(k, x1IO, x2IO)
# BenchmarkTools.Trial: 3478 samples with 1 evaluation.
#  Range (min … max):  1.362 ms … 5.498 ms    ┊ GC (min … max): 0.00% … 72.47%
#  Time  (median):     1.396 ms               ┊ GC (median):    0.00%
#  Time  (mean ± σ):   1.435 ms ± 358.577 μs  ┊ GC (mean ± σ):  2.28% ± 6.70%
#
#   ▂▆█▇▄▂ ▂▁ ▁
#  ███████▆██▅▅▁▄▁▁▁▁▁▁▁▁▄▁▁▁▁▁▁▁▁▁▃▁▁▁▁▁▁▁▁▁▁▃▁▁▁▁▁▁▁▁▁▁▁▁▁▁▃ █
#  1.36 ms      Histogram: log(frequency) by time        2.1 ms <
#
#  Memory estimate: 410.73 KiB, allocs estimate: 23090.

## proposed improvement
# julia> @benchmark KernelFunctions.kernelmatrix2(k, x1IO, x2IO)
# BenchmarkTools.Trial: 10000 samples with 1 evaluation.
#  Range (min … max):  16.871 μs … 3.440 ms    ┊ GC (min … max): 0.00% … 97.80%
#  Time  (median):     18.625 μs               ┊ GC (median):    0.00%
#  Time  (mean ± σ):   24.734 μs ± 129.308 μs  ┊ GC (mean ± σ):  20.63% ± 3.92%
#
#   ▄▆███▇▆▅▄▄▂▂ ▁ ▁                                            ▂
#  ████████████████▇▆▄▅▅▄▅▅▅▅▅▆▃▄▅▃▂▂▃▃▄▆▇▇████████▅▆▅▃▅▄▆▅▆▆▆▇ █
#  16.9 μs      Histogram: log(frequency) by time       36.4 μs <
#
#  Memory estimate: 84.56 KiB, allocs estimate: 338.

function kernelmatrix2(k::LinearMixingModelKernel, X, Y)
    K = [kernelmatrix(ki, X.x, Y.x) for ki in k.K]
    L = size(k.H, 2)
    return reduce(
        hcat, [reduce(vcat, [sum(k.H[:, i] .* (K .* k.H[:, j])) for i in 1:L]) for j in 1:L]
    )
end

Review comment (on lines +66 to +72): For ease of reviewing, it would be great if you could move the performance improvements into a separate PR!
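To see what `kernelmatrix2` assembles, here is a plain-array sketch (`H` and `Ks` are made-up stand-ins for `k.H` and the per-latent-kernel matrices, not the package API): block (i, j) of the result is `sum_l H[l, i] * H[l, j] * K_l`.

```julia
Llat, out_dim, N = 3, 2, 4
H = randn(Llat, out_dim)            # stands in for k.H
Ks = [randn(N, N) for _ in 1:Llat]  # stands in for [kernelmatrix(ki, X.x, Y.x) for ki in k.K]

# Same reduce/hcat/vcat assembly as in kernelmatrix2
full = reduce(
    hcat,
    [reduce(vcat, [sum(H[:, i] .* (Ks .* H[:, j])) for i in 1:out_dim]) for j in 1:out_dim],
)

# Block (i, j) computed directly from the definition
i, j = 2, 1
block = sum(H[l, i] * H[l, j] * Ks[l] for l in 1:Llat)
@assert full[(i - 1) * N .+ (1:N), (j - 1) * N .+ (1:N)] ≈ block
```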
# function matrixkernel(k::LinearMixingModelKernel, x, y)
#     return matrixkernel(k, x, y, size(k.H, 2))
# end

function matrixkernel(k::LinearMixingModelKernel, x, y)
    K = [ki(x, y) for ki in k.K]
    return k.H' * (K .* k.H)
end

function Base.show(io::IO, k::LinearMixingModelKernel)
    return print(io, "Linear Mixing Model Multi-Output Kernel")
end
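The `matrixkernel` one-liner relies on the identity `(H' * (K .* H))[p, q] == sum_i H[i, p] * K_i * H[i, q]`, matching the elementwise definition at the top of the file. A check with made-up numbers (`H` and `Ks` stand in for `k.H` and the scalar kernel values `[ki(x, y) for ki in k.K]`):

```julia
L, out_dim = 3, 2
H = randn(L, out_dim)  # stands in for k.H
Ks = randn(L)          # stands in for [ki(x, y) for ki in k.K]

M = H' * (Ks .* H)     # the matrixkernel expression

# Elementwise reference, as in the (κ::LinearMixingModelKernel)(...) definition
Mref = [sum(H[i, p] * Ks[i] * H[i, q] for i in 1:L) for p in 1:out_dim, q in 1:out_dim]
@assert M ≈ Mref
```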
@@ -4,3 +4,31 @@
Abstract type for kernels with multiple outputs.
"""
abstract type MOKernel <: Kernel end

"""
    matrixkernel(k::MOKernel, x, y[, out_dim])

Convenience function to compute the matrix kernel for two inputs `x` and `y`.
The `out_dim` argument is only required for kernels without a specialised
implementation, such as the `IndependentMOKernel`, to indicate the number of outputs.
"""
function matrixkernel(k::MOKernel, x, y, out_dim)
    @assert size(x) == size(y)
    xMO = MOInputIsotopicByFeatures([x], out_dim)
    yMO = MOInputIsotopicByFeatures([y], out_dim)
    return kernelmatrix(k, xMO, yMO)
end

function matrixkernel(k::MOKernel, x, y)
    return throw(
        ArgumentError(
            "This kernel does not have a specific matrixkernel implementation, you can call `matrixkernel(k, x, y, out_dim)`",
        ),
    )
end

function _kernelmatrix_kron_helper(::MOInputIsotopicByFeatures, Kfeatures, Koutputs)
    return kron(Kfeatures, Koutputs)
end

function _kernelmatrix_kron_helper(::MOInputIsotopicByOutputs, Kfeatures, Koutputs)
    return kron(Koutputs, Kfeatures)
end

Review comment (on lines +28 to +34): should not be part of this PR?
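The two helpers differ only in Kronecker ordering; the resulting matrices contain the same entries under a perfect-shuffle permutation of rows and columns. A small sketch with made-up 2x2 matrices:

```julia
using LinearAlgebra

Kfeatures = [1.0 0.5; 0.5 1.0]  # N = 2 inputs
Koutputs  = [2.0 0.0; 0.0 3.0]  # Q = 2 outputs

K_byfeatures = kron(Kfeatures, Koutputs)  # output index varies fastest
K_byoutputs  = kron(Koutputs, Kfeatures)  # input index varies fastest

# One layout is a row/column permutation of the other
P = [1, 3, 2, 4]  # interleaving permutation for N = Q = 2
@assert K_byoutputs == K_byfeatures[P, P]
```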
@@ -33,7 +33,20 @@ function (κ::LatentFactorMOKernel)((x, px)::Tuple{Any,Int}, (y, py)::Tuple{Any,Int})
    return cov_f + κ.e((x, px), (y, py))
end

# function matrixkernel(k::LatentFactorMOKernel, x, y)
#     return matrixkernel(k, x, y, size(k.A, 1))
# end

Review comment (on lines +36 to +39): suggested removing this commented-out code, but it would be great to check the implementation for correctness by comparing against this!

function matrixkernel(k::LatentFactorMOKernel, x, y)
    W = [col * col' for col in eachcol(k.A)]
    h = [gi(x, y) for gi in k.g]
    w_h = sum(Wi * Hi for (Wi, Hi) in zip(W, h))
    return w_h + matrixkernel(k.e, x, y)
end

function kernelmatrix(
    k::LatentFactorMOKernel, x::IsotopicMOInputsUnion, y::IsotopicMOInputsUnion
)
    x.out_dim == y.out_dim || error("`x` and `y` should have the same output dimension")
    x.out_dim == size(k.A, 1) ||
        error("Kernel not compatible with the given multi-output inputs")

@@ -45,7 +58,7 @@ function kernelmatrix(k::LatentFactorMOKernel, x::MOInput, y::MOInput)
    H = [gi.(x.x, permutedims(y.x)) for gi in k.g]

    # Weighted latent kernel matrix ((N*out_dim) x (N*out_dim))
    W_H = sum(_kernelmatrix_kron_helper(x, Hi, Wi) for (Wi, Hi) in zip(W, H))

Review comment: Will this work if …

    return W_H .+ kernelmatrix(k.e, x, y)
end
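The SLFM-style `matrixkernel` above sums rank-one mixing matrices `A[:, q] * A[:, q]'` weighted by the scalar latent kernel values, which collapses to `A * Diagonal(h) * A'`. A plain-array check (`A` and `h` are made-up stand-ins for `k.A` and `[gi(x, y) for gi in k.g]`):

```julia
using LinearAlgebra

out_dim, Q = 3, 2
A = randn(out_dim, Q)  # stands in for k.A
h = randn(Q)           # stands in for the latent kernel values gi(x, y)

W = [col * col' for col in eachcol(A)]        # rank-one mixing matrices
w_h = sum(Wi * hi for (Wi, hi) in zip(W, h))  # weighted sum, as in matrixkernel

# Equivalent closed form
@assert w_h ≈ A * Diagonal(h) * A'
```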
@@ -15,6 +15,8 @@
    @test repr(k) == "Fractional Brownian Motion Kernel (h = $(h))"
    test_ADs(FBMKernel; ADs=[:ReverseDiff])

    # ToDo: needs tests for _mod

    # Tests failing for ForwardDiff and Zygote@0.6.
    # Related to: https://github.com/FluxML/Zygote.jl/issues/1036
    f(x, y) = x^y

Review comment: not specifically for `_mod` …
@@ -0,0 +1,14 @@
@testset "mokernel" begin
    struct TestMOKernel <: MOKernel end
    @test_throws ArgumentError matrixkernel(TestMOKernel(), rand(3), rand(3))

    out_dim = 3
    A = rand(out_dim, out_dim)
    A = A * A'
    k = IntrinsicCoregionMOKernel(GaussianKernel(), A)

    in_dim = 4
    x = rand(in_dim)
    y = rand(in_dim)
    @test matrixkernel(k, x, y, 3) ≈ matrixkernel(k, x, y)
end
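The test compares the generic fallback (which wraps each input as a length-1 isotopic MO input) against the specialised method. For a separable kernel the generic path reduces to a Kronecker product with a 1x1 feature matrix, which is just the per-pair block; a plain-array sketch of why the two agree (`kappa` and `B` are illustrative stand-ins, not the package API):

```julia
using LinearAlgebra

kappa(x, y) = exp(-sum(abs2, x - y) / 2)  # scalar kernel (illustrative)
B = [2.0 0.5; 0.5 1.0]                    # output covariance (illustrative)
x, y = randn(4), randn(4)

generic = kron(fill(kappa(x, y), 1, 1), B)  # length-1 kernelmatrix path
direct  = B .* kappa(x, y)                  # specialised per-pair block

@assert generic ≈ direct
```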