Gaussian processes via gpytorch #782
Merged
BenjaminBossan merged 77 commits into master from feature/gaussian-processes-via-gpytorch on Oct 9, 2021
Changes from 70 commits
Commits (77)
72d227b
[WIP] First working implementation of GPs
df0413e
Minor changes
108b434
Don't reinitialize uninitialized net bc set_params
a880987
Move cb_params update, improve comment
ab5f973
Update notebook with a warning on init likelihood
e572d65
Remove unnecessary imports
c6e92a5
Simplify initialize_* methods
15d6fa8
Add first few unit tests
0b19538
Add unit test for set_params on uninitialized net
d6e1bca
Print only when verbose
c87c932
Remove init code related to likelihood
0cda3ef
Further clean up of set_params re-initialization
cff5edb
Add more tests for re-initialization logic
6c50ef7
Rework logic of creating custom modules/optimizers
f02738a
Add battery of tests for custom modules/optimizers
435fc75
Implement changes to make tests pass
52bdefa
[WIP] Update CHANGES
eddd5aa
Merge branch 'changed/refactor-init-more-consistency-custom-modules' …
c471c78
Simplify implementation based on refactoring
be4c035
[WIP] Document an edge case not covered yet
f20982d
Remove _PYTORCH_COMPONENTS global
5480b8f
Update documentation reflecting the changes
7454e85
All optimizers perform updates automatically
e1d3c2f
Address reviewer comments
5510ef0
Merge branch 'changed/refactor-init-more-consistency-custom-modules' …
2eff978
Further updates based on new skorch refactoring
1bec3d3
Fix corner case with pre-initialized modules
ee23283
Merge branch 'changed/refactor-init-more-consistency-custom-modules' …
968da8c
Activate test about initialization message
0889eea
Extend test coverage, fix a typo
c6fb0aa
Custom modules are set to train/eval mode
b6fb645
Merge branch 'changed/refactor-init-more-consistency-custom-modules' …
66b2e80
Update docs about train/eval mode
9633b9e
Update notebook
d14c1e2
Move tests around, add comment about multioutput
1547600
Complete docstrings
44069bc
Complete entries in CHANGES.md
4cf1c26
Merge branch 'changed/refactor-init-more-consistency-custom-modules' …
fcc0c06
Complete docs, docstrings, fix linting
d90f6e6
check_is_fitted also checks for likelihood_
7784014
Fix a bug when likelihood/module are initialized
317efde
Update notebook
6a7c6b1
Update README
8cad813
Add documentation rst files
f6e8647
Merge branch 'master' into changed/refactor-init-more-consistency-cus…
BenjaminBossan 4ca941d
Reviewer comment: Consider virtual params
1a33aec
Reviewer comment: Docs: No need to return self
bb4e573
Reviewer comment: Docs: explain NeuralNet.predict
d787d4a
Reviewer comment: Docs: When not calling super
a61a4c7
Reviewer comment: get_all_learnable_params
64b3380
Reviewer comment: facilitate module initialization
eee2922
Merge branch 'changed/refactor-init-more-consistency-custom-modules' …
8955975
Merge branch 'master' into feature/gpytorch-integration-copy
58f1e58
Fix a bug that led to double-registration
e22f59d
Merge branch 'bugfix/module-double-registration-after-clone' into fea…
307ebf7
Increment gpytorch minimum version
59d5da7
[WIP] Try to fix some tests
b8c061a
Fix duplicate parameter bug
969218b
Merge branch 'master' into feature/gaussian-processes-via-gpytorch
BenjaminBossan 5c9396e
Revert changes in test_net.py
5fce890
Merge branch 'feature/gaussian-processes-via-gpytorch' of https://git…
df6c6be
Bump gpytorch version to 1.5
3795c19
Fix failing test caused by distribution shape
b3c5908
For testing, exclude Python 3.6, PyTorch 1.7.1
e97dc14
For testing, exclude Python 3.8 PyTorch 1.7.1
d85597f
Skip gpytorch tests for pytorch 1.7.1
02d1b6f
Modify pytorch version check
8befe3a
Reviewer comment: Use pytest.mark.skipif
BenjaminBossan cfa993e
Comment out code that is not currently needed
e3cd39b
Improve documentation example for GPs
663ee3d
Reviewer comments: some improvements to notebook
454e898
Use set_train_data for exact GPs
e793f93
Address reviewer comments by Jacob Gardner
9670c9c
Address reviewer comments by Immanuel Bayer
c1a1ace
Fix typo in README
cdc82a8
Add entry about GPs to CHANGES
777914a
Merge branch 'master' into feature/gaussian-processes-via-gpytorch
BenjaminBossan

@@ -0,0 +1,5 @@
skorch.probabilistic
====================

.. automodule:: skorch.probabilistic
   :members:

@@ -11,6 +11,7 @@ skorch
   helper
   history
   net
   probabilistic
   regressor
   scoring
   toy

@@ -0,0 +1,196 @@
==================
Gaussian Processes
==================

skorch integrates with GPyTorch_ to make it easy to train Gaussian Process (GP)
models. This section assumes familiarity with how Gaussian Processes work;
please refer to other resources if you first want to learn about them.

GPyTorch adopts many patterns from PyTorch, thus making it easy to pick up for
seasoned PyTorch users. Similarly, the skorch GPyTorch integration should look
familiar to seasoned skorch users. However, GPs are a different beast than the
more common, non-probabilistic machine learning techniques, so it is important
to understand the basic concepts before using them in practice.

Installation
------------

In addition to the normal skorch dependencies and PyTorch, you need to install
GPyTorch as well. It is not a hard dependency of skorch, since most users are
probably not interested in using skorch for GPs. To install GPyTorch, use
either pip or conda:

.. code:: bash

    # using pip
    pip install -U gpytorch
    # using conda
    conda install gpytorch -c gpytorch

Examples
--------

Exact Gaussian Processes
^^^^^^^^^^^^^^^^^^^^^^^^

As with GPyTorch itself, skorch supports both exact and approximate Gaussian
Process regression. For exact GPs, use
:class:`~skorch.probabilistic.ExactGPRegressor`. The likelihood has to be a
:class:`~gpytorch.likelihoods.GaussianLikelihood` and the criterion an
:class:`~gpytorch.mlls.ExactMarginalLogLikelihood`, but those are the defaults
and thus don't need to be specified. The module needs to be an
:class:`~gpytorch.models.ExactGP`. For this example, we use a simple RBF
kernel.

.. code:: python

    import gpytorch
    from skorch.probabilistic import ExactGPRegressor

    class RbfModule(gpytorch.models.ExactGP):
        def __init__(self, X_train, y_train, likelihood, noise_init=None):
            super().__init__(X_train, y_train, likelihood)
            # constant mean function and RBF covariance kernel
            self.mean_module = gpytorch.means.ConstantMean()
            self.covar_module = gpytorch.kernels.RBFKernel()

        def forward(self, x):
            mean_x = self.mean_module(x)
            covar_x = self.covar_module(x)
            return gpytorch.distributions.MultivariateNormal(mean_x, covar_x)

    gpr = ExactGPRegressor(
        RbfModule,
        module__X_train=X_train,
        module__y_train=y_train,
    )

    gpr.fit(X_train, y_train)
    y_pred = gpr.predict(X_test)

As you can see, this almost looks like a normal skorch regressor with a normal
PyTorch module. We can fit as usual with the ``fit`` method and predict with
the ``predict`` method.

Inside the module, we determine the mean using a mean function (just a constant
in this case) and the covariance matrix using the RBF kernel function. You
should already be familiar with mean and kernel functions. Given the mean and
covariance matrix, we assume that the output distribution is a multivariate
normal distribution, since exact GPs rely on this assumption. We could send the
``x`` through an MLP for `Deep Kernel Learning
<https://docs.gpytorch.ai/en/stable/examples/06_PyTorch_NN_Integration_DKL/index.html>`_,
which we left out of the example above to keep it simple; a minimal sketch of
the idea follows.
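
The following is just a sketch of what such a Deep Kernel Learning module could
look like; the MLP architecture is made up for illustration, and the linked
GPyTorch tutorial shows a more complete version:

.. code:: python

    import torch

    class DKLModule(gpytorch.models.ExactGP):
        def __init__(self, X_train, y_train, likelihood):
            super().__init__(X_train, y_train, likelihood)
            # small MLP mapping the inputs to a learned, low-dimensional feature space
            self.feature_extractor = torch.nn.Sequential(
                torch.nn.Linear(X_train.shape[-1], 32),
                torch.nn.ReLU(),
                torch.nn.Linear(32, 2),
            )
            self.mean_module = gpytorch.means.ConstantMean()
            self.covar_module = gpytorch.kernels.ScaleKernel(gpytorch.kernels.RBFKernel())

        def forward(self, x):
            # apply the GP on top of the extracted features instead of the raw inputs
            z = self.feature_extractor(x)
            mean_z = self.mean_module(z)
            covar_z = self.covar_module(z)
            return gpytorch.distributions.MultivariateNormal(mean_z, covar_z)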

One major difference to usual deep learning models is that we actually predict
a distribution, not just a point estimate. That means that if we choose an
appropriate model that fits the data well, we can express the **uncertainty**
of the model:

.. code:: python

    y_pred, y_std = gpr.predict(X, return_std=True)
    lower_conf_region = y_pred - y_std
    upper_conf_region = y_pred + y_std

Here we not only returned the mean of the prediction, ``y_pred``, but also its
standard deviation, ``y_std``. This tells us how uncertain the model is about
its prediction. E.g., it could be the case that the model is fairly certain
when *interpolating* between data points but uncertain about *extrapolating*.
This is impossible to know when models only learn point predictions.
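
For illustration, the predicted mean and its uncertainty band could be plotted
like this; this is only a sketch, assuming 1-dimensional inputs and that
matplotlib is installed:

.. code:: python

    import matplotlib.pyplot as plt

    y_pred, y_std = gpr.predict(X_test, return_std=True)

    plt.plot(X_test.ravel(), y_pred, label="predicted mean")
    # shade the region within one standard deviation of the predicted mean
    plt.fill_between(
        X_test.ravel(), y_pred - y_std, y_pred + y_std, alpha=0.3, label="+/- 1 std",
    )
    plt.legend()
    plt.show()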

To obtain the confidence region, you can also use the ``confidence_region``
method:

.. code:: python

    # 1 standard deviation
    lower, upper = gpr.confidence_region(X, sigmas=1)

    # 2 standard deviations, the default
    lower, upper = gpr.confidence_region(X, sigmas=2)

Furthermore, a GP allows you to sample from the distribution even *before
fitting* it. The GP needs to be initialized, however:

.. code:: python

    gpr = ExactGPRegressor(...)
    gpr.initialize()
    samples = gpr.sample(X, n_samples=100)

By visualizing the samples and comparing them to the true underlying
distribution of the target, you can get a feel for whether the model you built
is capable of generating the distribution of the target. If fitting takes a
long time, it is therefore recommended to check the samples first; otherwise
you may waste a lot of time trying to fit a model that is incapable of
generating the true distribution.
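
A minimal sketch of such a check, assuming 1-dimensional inputs, matplotlib,
and that ``sample`` returns an array of shape ``(n_samples, n_points)``:

.. code:: python

    import matplotlib.pyplot as plt

    # using the initialized, but not yet fitted, regressor from above
    samples = gpr.sample(X, n_samples=100)

    for sample in samples:
        # each sample is one function drawn from the GP prior
        plt.plot(X.ravel(), sample, color="gray", alpha=0.1)
    plt.scatter(X.ravel(), y, marker="x", label="true targets")
    plt.legend()
    plt.show()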

Approximate Gaussian Processes
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

In some situations, fitting an exact GP might be infeasible, e.g. because the
distribution is not Gaussian or because you want to perform stochastic
optimization with mini-batches. For these cases, GPyTorch provides facilities
to train variational and approximate GPs. The module should inherit from
:class:`~gpytorch.models.ApproximateGP` and should define a *variational
strategy*. On the skorch side of things, use
:class:`~skorch.probabilistic.GPRegressor`.

.. code:: python

    import gpytorch
    from gpytorch.models import ApproximateGP
    from gpytorch.variational import CholeskyVariationalDistribution
    from gpytorch.variational import VariationalStrategy
    from skorch.probabilistic import GPRegressor

    class VariationalModule(ApproximateGP):
        def __init__(self, inducing_points):
            variational_distribution = CholeskyVariationalDistribution(inducing_points.size(0))
            variational_strategy = VariationalStrategy(
                self, inducing_points, variational_distribution, learn_inducing_locations=True,
            )
            super().__init__(variational_strategy)
            self.mean_module = gpytorch.means.ConstantMean()
            self.covar_module = gpytorch.kernels.ScaleKernel(gpytorch.kernels.RBFKernel())

        def forward(self, x):
            mean_x = self.mean_module(x)
            covar_x = self.covar_module(x)
            return gpytorch.distributions.MultivariateNormal(mean_x, covar_x)

    X, y = get_data(...)
    # use the first 100 samples as inducing points
    X_inducing = X[:100]
    X_train, y_train = X[100:], y[100:]
    num_training_samples = len(X_train)

    gpr = GPRegressor(
        VariationalModule,
        module__inducing_points=X_inducing,
        criterion__num_data=num_training_samples,
    )

    gpr.fit(X_train, y_train)
    y_pred = gpr.predict(X_train)

As you can see, the variational strategy requires us to use inducing points. We
split off 100 of our training samples to use as inducing points, assuming that
they are representative of the whole distribution. Apart from this, there is
basically no difference to using exact GP regression.
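
Since approximate GPs support stochastic optimization, you can train on
mini-batches using the usual skorch training parameters. A sketch (the concrete
values here are made up for illustration):

.. code:: python

    gpr = GPRegressor(
        VariationalModule,
        module__inducing_points=X_inducing,
        criterion__num_data=num_training_samples,
        batch_size=64,   # mini-batch training, a main benefit of approximate GPs
        max_epochs=20,
        lr=0.01,
    )
    gpr.fit(X_train, y_train)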

Finally, skorch also provides :class:`~skorch.probabilistic.GPBinaryClassifier`
for binary classification with GPs. It uses a Bernoulli likelihood by default.
However, using GPs for classification is not very common; GPs are most commonly
used for regression tasks where data points have a known relationship to each
other (e.g. in time series forecasts).

Multiclass classification is not currently provided, but you can use
:class:`~skorch.probabilistic.GPBinaryClassifier` in conjunction with
:class:`~sklearn.multiclass.OneVsRestClassifier` to achieve the same result, as
sketched below.
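
A sketch of that combination, reusing the variational module from above and
assuming hypothetical multiclass targets ``y_train_multiclass``:

.. code:: python

    from sklearn.multiclass import OneVsRestClassifier
    from skorch.probabilistic import GPBinaryClassifier

    gpbc = GPBinaryClassifier(
        VariationalModule,
        module__inducing_points=X_inducing,
        criterion__num_data=num_training_samples,
    )
    # fit one binary GP classifier per class
    clf = OneVsRestClassifier(gpbc)
    clf.fit(X_train, y_train_multiclass)
    y_pred = clf.predict(X_train)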

Further examples
----------------

To see all of this in action, we provide a notebook that shows how to use
skorch with GPs on real world data: `Gaussian Processes notebook
<https://nbviewer.jupyter.org/github/skorch-dev/skorch/blob/master/notebooks/Gaussian_Processes.ipynb>`_.

.. _GPyTorch: https://gpytorch.ai/
The jupyter notebook you sent me currently highlights
`gpytorch.settings.fast_pred_samples` when calling `sample`. That setting won't
actually do anything unless you are using KISS-GP. A setting that definitely
will make a perf difference is wrapping predict in
`gpytorch.settings.fast_pred_var()`, though, assuming
`gpytorch.settings.skip_posterior_variances()` isn't also on (see my comment
about that below).

I didn't know that, thanks for clarifying. I will remove the usage of
`gpytorch.settings.fast_pred_samples` in the notebook.
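
For reference, a sketch of what wrapping prediction in the suggested setting
could look like, using the regressor from the docs above
(`gpytorch.settings.fast_pred_var` is a context manager in GPyTorch):

```python
import gpytorch

# use fast predictive variances (LOVE) when computing the posterior
with gpytorch.settings.fast_pred_var():
    y_pred, y_std = gpr.predict(X_test, return_std=True)
```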