Integrating friedrich into linfa #1
Comments
I agree with the shim proposition (as it would keep I have actually tried both Both kinds of benchmarks are high on my todo list, I will start on them once I am fully happy with the internals and API (I am still wondering whether the current kernel trait and builder pattern can be improved). At the moment I have confirmed that it works as expected, but not that it reproduces results from reference implementations. I do think tests are needed, but they are low priority for the moment (I want to confirm API stability before that). (I should probably have a clean, full and official TODO list somewhere.) |
I am actually happy to work on the shim myself, to get familiar with the crate. I'll spend some time on it over the Christmas break and I'll pull you in for a review when I have something decent ready 👍 My main issue with |
Great! I am currently working on adding I can understand that, |
The first provides much better ergonomics to the user of the crate (they can pass in views, mutable views, owned arrays, etc.).
We are aligned on this, not top of the list for me either. |
I have added (but not tested) support for The next step on my list (which should be done within a week) is to reproduce the classical Mauna Loa example. As it has been implemented in scikit-learn, it might give you a reference point to compare both implementations. |
Ok, I have started to write the shim, which means I looked closer at the whole crate, and I am struggling to understand some design choices. Let's start from my biggest source of confusion right now - what do you exactly mean as |
Is there any reference (book/article) I can look at for the training algorithm you are using? |
This article is quite good at explaining what a Gaussian process is and where the prior comes into play: "Gaussian processes are not so fancy" (see the section "The Bayesian prior away from data"). In short, the kernel function gives a similarity metric between elements of the input space, while the prior gives a default value that is returned in the absence of information (in particular, a model with a stationary kernel returns zero when the similarity to the training samples falls to 0, which is probably not what the user would expect). The prior can be any regression model. While a good prior (close to the target function) makes learning faster (it is useful for things such as transfer learning), I have seen some papers omit it from the definition of the Gaussian process, since it is trivial to add on top of the equation. Something like a linear prior can be folded into the kernel but, to the best of my knowledge, that is not true for all types of priors, whereas it is easy to include an explicit prior term in the equations used. (Don't hesitate to tell me if I am not clear.) |
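To make the role of the prior mean concrete, here is a minimal sketch of the textbook posterior mean with an explicit prior term, m(x*) + k(x*, X) (K + σ²I)⁻¹ (y − m(X)). It is not friedrich's implementation: the function names (`rbf`, `solve`, `posterior_mean`), the scalar inputs and the plain Gaussian elimination are all assumptions chosen for brevity.

```rust
/// Squared-exponential kernel on scalars (a stationary kernel).
fn rbf(a: f64, b: f64, length_scale: f64) -> f64 {
    let d = (a - b) / length_scale;
    (-0.5 * d * d).exp()
}

/// Solves `a * x = b` by Gaussian elimination with partial pivoting.
/// Hypothetical helper: a real implementation would use a Cholesky factorisation.
fn solve(mut a: Vec<Vec<f64>>, mut b: Vec<f64>) -> Vec<f64> {
    let n = b.len();
    for col in 0..n {
        // Pick the row with the largest entry in this column as pivot.
        let pivot = (col..n)
            .max_by(|&i, &j| a[i][col].abs().partial_cmp(&a[j][col].abs()).unwrap())
            .unwrap();
        a.swap(col, pivot);
        b.swap(col, pivot);
        for row in (col + 1)..n {
            let factor = a[row][col] / a[col][col];
            for k in col..n {
                let pivot_value = a[col][k];
                a[row][k] -= factor * pivot_value;
            }
            let pivot_rhs = b[col];
            b[row] -= factor * pivot_rhs;
        }
    }
    // Back substitution on the upper-triangular system.
    let mut x = vec![0.0; n];
    for row in (0..n).rev() {
        let tail: f64 = ((row + 1)..n).map(|k| a[row][k] * x[k]).sum();
        x[row] = (b[row] - tail) / a[row][row];
    }
    x
}

/// Posterior mean at `x_star` with an explicit prior mean function.
/// `noise` is the standard deviation of the observation noise.
fn posterior_mean(
    xs: &[f64],
    ys: &[f64],
    x_star: f64,
    prior: impl Fn(f64) -> f64,
    noise: f64,
    length_scale: f64,
) -> f64 {
    let n = xs.len();
    // Gram matrix K + noise^2 * I.
    let gram: Vec<Vec<f64>> = (0..n)
        .map(|i| {
            (0..n)
                .map(|j| {
                    let jitter = if i == j { noise * noise } else { 0.0 };
                    rbf(xs[i], xs[j], length_scale) + jitter
                })
                .collect()
        })
        .collect();
    // The kernel only models what the prior leaves unexplained.
    let residuals: Vec<f64> = xs.iter().zip(ys).map(|(&x, &y)| y - prior(x)).collect();
    let alpha = solve(gram, residuals);
    let correction: f64 = xs
        .iter()
        .zip(&alpha)
        .map(|(&x, &a)| rbf(x_star, x, length_scale) * a)
        .sum();
    prior(x_star) + correction
}
```

Far from the training inputs every kernel value vanishes, so the correction term is zero and the prediction falls back to `prior(x_star)`. With a zero prior that fallback is 0, which is the behaviour described above.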
The training of the prior or of the kernel? For the prior, the training is up to each particular prior implementation (linear regression for a linear prior, etc.). For the kernel, I believe I put some links in the sources. In short, I use the ADAM optimizer (a form of gradient descent) to maximize the log-likelihood of the model given the training data (which has some nice regularising properties). Does that answer your question? |
Reading the documentation for scikit-learn's implementation, it seems that they set the prior to either 0 or the mean of the data but do not allow for alternatives:
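As a rough illustration of "ADAM maximizing a log-likelihood", the sketch below fits a single hyperparameter on a deliberately degenerate model: zero-mean observations with a pure white-noise kernel, so the only parameter is the noise variance. The function names, the default Adam constants (0.9, 0.999, 1e-8) and the log-variance parametrisation are assumptions for this example, not friedrich's actual training code.

```rust
/// Gradient of the Gaussian log-likelihood with respect to
/// theta = ln(variance), for zero-mean i.i.d. observations:
/// dL/dtheta = -n/2 + sum(y^2) / (2 * exp(theta)).
fn log_likelihood_gradient(ys: &[f64], theta: f64) -> f64 {
    let n = ys.len() as f64;
    let sum_sq: f64 = ys.iter().map(|y| y * y).sum();
    -0.5 * n + 0.5 * sum_sq / theta.exp()
}

/// Adam-based gradient ascent on the log-likelihood.
/// Returns the fitted noise variance.
fn fit_noise_variance(ys: &[f64], steps: usize, learning_rate: f64) -> f64 {
    // Usual Adam defaults (assumed here).
    let (beta1, beta2, epsilon) = (0.9_f64, 0.999_f64, 1e-8_f64);
    // Optimise in log-space so the variance stays positive.
    let mut theta = 0.0_f64;
    let (mut m, mut v) = (0.0_f64, 0.0_f64);
    for t in 1..=steps {
        let g = log_likelihood_gradient(ys, theta);
        // Exponential moving averages of the gradient and its square.
        m = beta1 * m + (1.0 - beta1) * g;
        v = beta2 * v + (1.0 - beta2) * g * g;
        // Bias correction for the zero-initialised averages.
        let m_hat = m / (1.0 - beta1.powi(t as i32));
        let v_hat = v / (1.0 - beta2.powi(t as i32));
        // Plus sign: we ascend, since the goal is to maximise.
        theta += learning_rate * m_hat / (v_hat.sqrt() + epsilon);
    }
    theta.exp()
}
```

For this toy model the maximum is known in closed form (the mean of the squared observations), which makes it a convenient sanity check. With a real kernel the gradient would instead come from the log marginal likelihood of the full Gram matrix.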
(note that when I say prior, I always mean prior mean) |
Ok, let me rephrase then. is while |
if |
Perfect. I have started to draft the shim - you can follow my progress here: https://github.com/LukeMathWalker/linfa/blob/gp/linfa-gaussian-processes/src/hyperparameters.rs |
Great! I took a quick look at your code, here are two remarks: I see that you put a You set noise to (on my side, I did not have as much time as expected but the example should be coming shortly) |
I figured that out later when doing the setters, but I didn't go back to edit the comment - I'll fix it up 👍
I did it on purpose and I know it's broken right now 😁 |
Making it a % of the std is not too hard (computing the std is cheap compared to the training), but I believe that using the std is only good for getting an idea of the magnitude of the noise, whereas when you have an exact number given by an expert, it's usually an absolute value (though it might be a moot point, as the fit will change the amplitude of the noise anyway). I am curious about your PR nevertheless. |
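The two ways of specifying the noise discussed here could coexist behind one small type. The sketch below is purely illustrative: `NoiseSpec`, `resolve` and `std_dev` are hypothetical names that belong to neither friedrich nor linfa, and the population standard deviation is an arbitrary choice.

```rust
/// Population standard deviation of the training outputs.
fn std_dev(ys: &[f64]) -> f64 {
    let n = ys.len() as f64;
    let mean = ys.iter().sum::<f64>() / n;
    let variance = ys.iter().map(|y| (y - mean).powi(2)).sum::<f64>() / n;
    variance.sqrt()
}

/// Hypothetical way of letting the user choose how the noise is given.
enum NoiseSpec {
    /// An absolute noise amplitude, e.g. a value supplied by an expert.
    Absolute(f64),
    /// A fraction of the standard deviation of the training outputs.
    RelativeToStd(f64),
}

impl NoiseSpec {
    /// Turns the specification into an absolute noise amplitude.
    fn resolve(&self, training_outputs: &[f64]) -> f64 {
        match self {
            NoiseSpec::Absolute(value) => *value,
            NoiseSpec::RelativeToStd(fraction) => fraction * std_dev(training_outputs),
        }
    }
}
```

Either variant only provides a starting value, since fitting the hyperparameters will rescale the noise afterwards.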
Yeah, when I say |
Have been following this with great interest. What is the current state of the integration of GPs into linfa? |
Hi @wegreenall! Currently there is a shim in the linfa repository, but the work has not been integrated in the main branch and, to my knowledge, might be incomplete. Your best bet is to check with @LukeMathWalker to determine what is missing for integration in linfa. Don't hesitate to ask me any questions to help with the integration work. Longer term, friedrich uses nalgebra while linfa focuses on ndarray, both of which are large dependencies, so it might be interesting to start working on a new version of friedrich that would not depend on nalgebra (however, the integration is quite deep, so that would represent a lot of work). |
I have started to review friedrich with the aim of integrating it into linfa.
I imagine the integration using a "shim" crate in the linfa repository, named linfa-gaussian-processes, which re-exports all or part of friedrich's public API. This ensures that linfa is in control of naming conventions, interfaces and whatever else is required to give the overall linfa ecosystem a homogeneous feel, while allowing you to retain ownership of friedrich and evolve it as you see fit.
The alternative would be to move friedrich inside linfa as linfa-gaussian-processes, but that's a much more substantial step, so I don't think that's what we want to go for (at least for now).
In terms of requirements, apart from some minor improvements here and there that I should be able to submit as PRs against friedrich over the Christmas holidays, I'd like to understand the following:
nalgebra over ndarray?
I'd like to standardize on ndarray, at least in the initial phase, to avoid having overly generic interfaces (e.g. generic parameters for inputs and outputs).
Is it because you needed ndarray-linalg and that requires a non-Rust dependency?