Gaussian processes via gpytorch #782
Including a notebook to demonstrate. Open TODOs:
- [ ] adding a bunch of tests (similar to the tests for NeuralNetClassifier et al.)
- [ ] adding most docstrings
- [ ] adding a section to the documentation
- [ ] adding a mention to the README
- [ ] update CHANGES
- [ ] multiclass classification never learns (outputs all NaNs)
- [ ] verifying with the GPyTorch devs whether the assumptions made here are correct (hopefully they have time to respond)
Previously, when a parameter on, say, the module was changed via set_params (e.g. net.set_params(module__hidden_units=123)), set_params would always trigger (re-)initialization of the module. However, when the net was not initialized in the first place, this is unnecessary: it is sufficient to set the new attribute and wait for the net to be initialized later. Fortunately, this change doesn't seem to have any further impact, i.e. we didn't implicitly rely on this behavior anywhere. The only exceptions are 2 tests in test_cli.py, but those can easily be adjusted and this shouldn't have any user impact.
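A minimal sketch of the behavior described above (the module and parameter names are placeholders, not code from this PR):

```python
import torch
import torch.nn as nn
from skorch import NeuralNetRegressor

class MyModule(nn.Module):
    # placeholder module with a configurable hidden layer size
    def __init__(self, hidden_units=10):
        super().__init__()
        self.dense = nn.Linear(20, hidden_units)
        self.out = nn.Linear(hidden_units, 1)

    def forward(self, X):
        return self.out(torch.relu(self.dense(X)))

net = NeuralNetRegressor(MyModule, module__hidden_units=64)

# before this change, set_params would trigger (re-)initialization here;
# now it only records the new value because the net is not initialized yet
net.set_params(module__hidden_units=123)

net.initialize()  # the module is built once, with hidden_units=123
```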
These methods had started to become complicated because they did the following:
1. Check if there is anything to initialize at all
2. Print a message about the reason for a potential re-initialization
3. Move to the device

That made it quite difficult to override them without forgetting some aspect. With this change, there are now corresponding _initialize_* methods that are called by net.initialize() and net.set_params. These new methods take care of the points above and call the initialize_* methods inside. Now we can more easily make sure that the user can override initialize_* without anything important being forgotten.
They are not passing yet.
Add a test to check that set_params doesn't initialize the net if it's not yet initialized at that time.
There were two instances of printing regardless of verbosity.
Is not relevant at the moment.
Removed code for states that could not be reached because of virtual params. This simplifies the logic considerably.
Check optimizer-related messages for an initialized net with set_params applied on module.
This is partly WIP because there is more to come, even though this change per se is already an improvement on the status quo.

So far, the logic for creating custom modules or optimizers was separate from the logic that created the default module, criterion, and optimizer. E.g., the "prefixes_" attribute was prefilled with 'module_', 'criterion_' and 'optimizer_'. This made dealing with custom modules/optimizers (e.g. creating a second module called 'mymodule_') more difficult, because the logic for treating those was completely disjoint from the logic of how the default modules/optimizer were treated.

This change removes most of the "special status" of module/criterion/optimizer, so the logic for treating them is now the same as for any custom module. For instance, they are no longer pre-registered but are only registered later, during their initialize_* methods.

The way this is implemented is to move the registration to the respective initialize_* methods. This is because during __init__, we don't actually know yet whether we are dealing with a module or an optimizer (the argument passed for 'module' can, for instance, be a function, so we cannot type check). But during 'initialize', when the actual instances are created, we can check whether we are dealing with an nn.Module or an optim.Optimizer. If we are, we register them.

So overall, the logic and role of 'initialize' have changed. Users will be expected to set custom modules/optimizers during their respective 'initialize_*' methods from now on (stricter checks and doc updates will be added). This allows us to no longer rely on the name to infer the function (remember that previously, a custom module needed to contain the substring 'module', which was an ugly restriction).

As more of a side effect of these changes, the '_check_kwargs' call was moved to 'initialize' as well, since we cannot really check for faulty kwargs as long as we don't know which modules and optimizers will be registered.
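A hedged sketch of what this means for users, based on the commit message (the class and attribute names are illustrative, not from the PR):

```python
import torch.nn as nn
from skorch import NeuralNet

class MyNet(NeuralNet):
    def initialize_module(self):
        super().initialize_module()
        # an nn.Module set during initialize_* is detected by its type and
        # registered; the attribute name no longer needs to contain 'module'
        self.embedder_ = nn.Linear(10, 10)
        return self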
These are only the tests, which will currently fail, hence WIP. Right now, there is a big hole in the treatment of custom modules/optimizers that distinguishes them from the assumed ones ('module', 'criterion', 'optimizer'). This battery of unit tests covers behaviors that will fail but really shouldn't:
- custom module parameters should be passed to the optimizer
- set_params on a custom module should trigger re-initialization of criterion and optimizer
- set_params on a custom criterion should trigger re-initialization of optimizer
- custom modules and criteria are not automatically moved to cuda
…into feature/gpytorch-integration-copy
Since custom components are no longer matched by name, this became obsolete.
Before this, only the default "optimizer_" was used and all others were being ignored. With this change, "zero_grad" and "step" are called on all optimizers automatically.
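A self-contained illustration of the behavior being fixed here (this is not skorch's actual internal code): with two optimizers present, one training step should drive both of them.

```python
import torch

# two separate modules, each with its own optimizer
m1, m2 = torch.nn.Linear(2, 2), torch.nn.Linear(2, 2)
optimizers = [
    torch.optim.SGD(m1.parameters(), lr=0.1),
    torch.optim.SGD(m2.parameters(), lr=0.1),
]

X = torch.ones(1, 2)
loss = (m1(X) + m2(X)).pow(2).sum()

# zero_grad and step are applied to every optimizer, not just the default one
for opt in optimizers:
    opt.zero_grad()
loss.backward()
for opt in optimizers:
    opt.step()
```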
- Update example in docstring of get_learnable_params
- Add comments about unused init contexts
- Add deprecation for triggered_directly argument

Also: improved docstring for named_parameters
…into feature/gpytorch-integration-copy
Merge in the latest branch of the refactoring that elevates custom modules to first-class citizens. Moreover, provide an ExactGPRegressor class directly, instead of requiring the user to implement it. That way, we avoid the minor annoyance of having to write your own initialize_module. Consequently, the notebook could be simplified and the futureattr workaround could be removed. Furthermore, add a lot of tests, most notably tests for GPRegressor (not exact) and GPBinaryClassifier. Added some tests for grid search as well, which revealed a few issues that were addressed.
This case had not been covered yet: when the module/criterion is already initialized and none of its parameters changed, initialize_module/criterion was not called. However, what if a custom module/criterion does need to be initialized? In that case, not calling initialize_module/criterion is bad. With this fix, this bad behavior no longer occurs. Tests were added to cover this.

In order to achieve this change, we unfortunately had to push down the check whether module/criterion is already initialized from _initialize_module/criterion to initialize_module/criterion. There was no other way of checking this, since at first we cannot know which attributes are modules/criteria. For the user, this means a little more work if they want to implement initialize_module/criterion absolutely correctly. However, that's not so bad, because it is only important if the user wants to work with pre-initialized modules/criteria and with custom modules/criteria, which should happen very rarely in practice. And even if it does, the user can just copy the default skorch code and will end up with a correct implementation.
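For reference, a hedged sketch of what "copying the default skorch code" could look like after this change (it assumes the helpers get_params_for and initialized_instance behave as this refactoring describes):

```python
from skorch import NeuralNet

class MyNet(NeuralNet):
    def initialize_module(self):
        # collect module__* params and only (re-)build the module if needed;
        # initialized_instance is assumed to perform the "already
        # initialized and unchanged" check pushed down by this commit
        kwargs = self.get_params_for('module')
        module = self.initialized_instance(self.module, kwargs)
        self.module_ = module
        return self
```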
…into feature/gpytorch-integration-copy
With the upstream changes on the refactoring of custom module treatment, this test now passes. It had to be adjusted a bit because BernoulliLikelihood didn't support some of the given parameters.
Extend tests regarding saving, loading, pickling error.
Depending on the kind of distribution, the shape has to be determined differently.
Now handles '+something' suffix, e.g. '1.7.1+cu102'
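The idea, as a minimal sketch (the actual parsing code in the PR may differ):

```python
# strip a local version suffix such as '+cu102' before parsing the version
version = "1.7.1+cu102"
base = version.split('+')[0]
assert base == "1.7.1"
```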
Note: GPyTorch + PyTorch 1.7.1 don't work together, so I skip GPyTorch tests if this PyTorch version is detected. Officially, PyTorch >= 1.8.1 is supported by GPyTorch, but PyTorch 1.5.1 and 1.6.0 also work, thus I left those tests in.
Initial look at the docs/API.
docs/user/probabilistic.rst

```python
    module__y=y,
)

gpr.fit(X, y)
```
Passing X and y into ExactGPRegressor.__init__ and into fit is a little strange. Alternatives where we delay initializing the module until fit is called feel too hacky.
I had started out in a similar direction but it really didn't work well. The current solution is not the most beautiful, but it serves its purpose and uses facilities that are already present. But I'm open to suggestions.
skorch's API is not flexible enough to use the X, y to initialize the module. As long as we check that module__X is X in fit, I am okay with this workaround.

BTW, another hack would be to do something like nn.LazyLinear: https://github.com/pytorch/pytorch/blob/e711b5ce6c4982960c293b005da37a6d91253bd7/torch/nn/modules/lazy.py#L52 (I do not recommend it)
> As long as we check that module__X is X in fit, I am okay with this workaround.

GPyTorch already checks for this equality (unless debug mode is explicitly switched off). I believe this is enough and skorch doesn't need to do the same work.

> BTW, another hack would be to do something like nn.LazyLinear: https://github.com/pytorch/pytorch/blob/e711b5ce6c4982960c293b005da37a6d91253bd7/torch/nn/modules/lazy.py#L52 (I do not recommend it)

I agree that we shouldn't use it (for now), given that it's still experimental. Maybe GPyTorch will adopt it so that we don't have to.
docs/user/probabilistic.rst

```python
gpr = ExactGPRegressor(
    RbfModule,
    module__X=X,
    module__y=y,
```
I see that there is a default likelihood that I assume routes into __init__. I think at this point we can use train_inputs and train_targets and route them into ExactGPRegressor. This would be more in line with ExactGP itself.
Do you mean that X and y should be renamed to train_inputs and train_targets?
I think just module__X_train and module__y_train would work. This is to make it clear that we are using training data.
👍
```python
if loss.dim() != 0:
    loss = loss.mean()
```
Is getting the mean handled by the Criterion?
The gpytorch losses don't always return a scalar by default, as e.g. in this example, so I added this conditional to take care of this. Ideally, it wouldn't be necessary.
```python
    return latent_pred


class BaseProbabilisticTests:
```
My mind is blown that using fixtures + setting class attributes in child classes works.
I was doing an experiment here. Do you think this overall improves the code or not? It does reduce code duplication but it may be less obvious what is going on.
It's mostly okay with me. Looking at this more closely, it all makes sense and it's a reasonable way to "not repeat yourself". I can see other approaches, but I don't see them as a drastic improvement over what we have now.
skorch/utils.py

```python
if (
    likelihood
    and GPYTORCH_INSTALLED
    and isinstance(likelihood, gpytorch.likelihoods.SoftmaxLikelihood)
```
The code using SoftmaxLikelihood is commented out, so this code path is never reached?
Indeed, good catch. I commented this code out (adding a comment), as well as the code for _transpose.
Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>
This code is only required in case we add multiclass GP regression.
Be explicit about X and y being X_train and y_train.
Ran through the notebook and left some comments at https://app.reviewnb.com/skorch-dev/skorch/blob/feature/gaussian-processes-via-gpytorch/notebooks/Gaussian_Processes.ipynb/ — let me know if you can see them.
@thomasjpfan Thanks for your comments, I could read them. Hopefully you can read my replies as well. I adjusted the notebook according to your comments and wrote replies.
This way, we can avoid having to set X and y during the module's __init__, making the usage of exact GPs easier overall.
Mostly looks good to me! Left a few comments, primarily about the prediction and sampling methods.
```python
class _GPRegressorPredictMixin:
    """Mixin class that provides a predict method for GP regressors."""
    def predict(self, X, return_std=False, return_cov=False):
```
When return_std is False, consider turning the gpytorch.settings.skip_posterior_variances setting on -- this will avoid doing the math for the posterior variances altogether, which will save a great deal of compute.
Good point, I will add this.
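A hedged sketch of the suggestion (it assumes gpr is a fitted GP regressor exposing the predict method shown above; the actual change in the PR may be wired differently):

```python
import gpytorch

def predict_mean_only(gpr, X):
    # skip computing the posterior variances when only the mean is needed
    with gpytorch.settings.skip_posterior_variances():
        return gpr.predict(X)
```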
skorch/probabilistic.py

```python
        """
        self.check_is_fitted()
        samples = [p.sample(torch.Size([n_samples])) for p in self.forward_iter(X)]
```
Obviously not 100% familiar with how skorch's internals work, but assuming p is a MultivariateNormal, is there any reason to call sample instead of rsample here? Outright enforcing that sampling must be done in a torch.no_grad() context seems unnecessary at first glance, and a common use case for GPs looks something like: x -> GP -> p(y|D) -> samples of y -> average some function of y over the samples -> differentiate the average with respect to x.
The main reason I used sample was that rsample is not supported for all distributions, e.g. Bernoulli. I didn't think of the possibility of using the sample gradients. I'll switch to rsample if the used distribution supports it and fall back on sample otherwise.
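A minimal sketch of that fallback logic (the actual implementation in the PR may look different):

```python
import torch

def draw_samples(dist, n_samples):
    shape = torch.Size([n_samples])
    # rsample is differentiable but not supported by all distributions
    # (e.g. Bernoulli); fall back to sample in that case
    if dist.has_rsample:
        return dist.rsample(shape)
    return dist.sample(shape)
```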
```python
        nonlin = self._get_predict_nonlinearity()
        y_preds, y_stds = [], []
        for yi in self.forward_iter(X, training=False):
```
One thing that GPyTorch does is cache a couple of things that don't depend on the test points the first time we make predictions, so that the next time we make predictions we don't have O(n^3) complexity. These caches get deleted in some cases, like if the model gets put back in training mode -- I assume that forward_iter wouldn't do any of these things though.
It should be pretty easy to test:

```python
gpr = ...
gpr.fit(...)
with gpytorch.settings.fast_pred_var():
    y_pred, y_std = gpr.predict(test_x, return_std=True)
    # test_x could change here.
    y_pred, y_std = gpr.predict(test_x, return_std=True)  # This call should be significantly faster.
```
Indeed, when using forward_iter, there is no risk of the model being set to training again. Therefore, I'll add the fast_pred_var context for it. When testing, I didn't find any noticeable speed difference. Perhaps that is because I'm mostly using toy models on small data in the notebook?
```python
)

gpr.fit(X_train, y_train)
y_pred = gpr.predict(X_train)
```
The Jupyter notebook you sent me currently highlights gpytorch.settings.fast_pred_samples when calling sample. That setting won't actually do anything unless you are using KISS-GP. A setting that definitely will make a perf difference is wrapping predict in gpytorch.settings.fast_pred_var(), though, assuming gpytorch.settings.skip_posterior_variances() isn't also on (see my comment about that below).
I didn't know that, thanks for clarifying. I will remove the usage of gpytorch.settings.fast_pred_samples in the notebook.
Since jacobrgardner already reviewed as an authority on GPyTorch, I looked at the integration from the perspective of a potential user. This is what I came up with. When I was reading the tutorial, I was wondering in which situations I should use skorch's GPyTorch integration.

Good use cases:

Better use GPyTorch directly:

This would then naturally lead to section titles in the tutorial notebook such as "Scaling GPs" and "Deploying GPyTorch Models with skorch". The material is already there, but this would have helped me to jump right into the parts most relevant for me. Below are a few nitpicking comments on the tutorial text.
Sparse GPs can be used to scale to larger data. Not all approaches are based on variational inference.
This is a bit tricky since GPs are non-parametric methods and "learning" is usually associated with estimating primary parameters and not hyper-parameters as with GPs.
Again the learning thing ;), I would say the distribution is modeled using kernel engineering and the distribution parameters are fitted.
That's all, I think the PR is in good shape. Thanks for making GPs more easily accessible. :)
* Use fast_pred_var in forward_iter
* Use rsample (where possible) when sampling
* Use skip_posterior_variances when predicting without returning the standard deviations
* Don't use fast_pred_samples in the notebook since there is no point there

Accompanying these fixes, I made testing more rigorous by checking that certain values are not all identical. This is because with some of the introduced optimizations, GPyTorch will return (what I assume to be) dummy values for certain operations, which we don't want to use.
Thank you @ibayer for your feedback.
I couldn't quite manage to rework the sections to fit the titles you suggested. Still, as you said, it's important to clarify when and when not to use GPyTorch with skorch. To this end, I added a section at the start of the GP documentation in skorch that discusses this topic. The notebook references the docs too, so that interested readers can look up that information.
The part you cite talks about using batches and not about scaling per se. For batching, use of sparse GPs would not help and the statement is correct as is ;) But I get what you mean, I'll change the wording to no longer give the impression that variational GPs are the only possibility for scaling.
Thank you, you are right. I'll change the wording to not mislead the reader.
* Add discussion of when (and when not) to use GPyTorch + skorch
* Reword to no longer give the impression that variational GPs are the only scaling method
* Reword sentences that refer to "learning", as this can be misleading
@thomasjpfan @ottonemo @jacobrgardner @ibayer Thank you all for your helpful reviews. Hopefully I could address all of your concerns. My proposal is to give one more week for any issues with this PR to be raised. If none are, I'll assume it is ready to go, merge, and prepare the next skorch release. For that release, I would frame this feature as a "beta" with possible breaking changes in the future in case something is just plain wrong. That way, we are not shackled by possible mistakes I made here. Please let me know if you have any objections.
Sounds good to me.
Thanks again everyone for your comments and suggestions. As announced, I merged this PR after waiting one week. A new skorch release will follow soon (I'll aim for tomorrow). |
We are happy to announce the new skorch 0.11 release:

Two basic but very useful features have been added to our collection of callbacks. First, by setting `load_best=True` on the [`Checkpoint` callback](https://skorch.readthedocs.io/en/latest/callbacks.html#skorch.callbacks.Checkpoint), the snapshot of the network with the best score will be loaded automatically when training ends. Second, we added a callback [`InputShapeSetter`](https://skorch.readthedocs.io/en/latest/callbacks.html#skorch.callbacks.InputShapeSetter) that automatically adjusts your input layer to have the size of your input data (useful e.g. when that size is not known beforehand).

When it comes to integrations, the [`MlflowLogger`](https://skorch.readthedocs.io/en/latest/callbacks.html#skorch.callbacks.MlflowLogger) now allows automatic logging to [MLflow](https://mlflow.org/). Thanks to a contributor, some regressions in `net.history` have been fixed, and it even runs faster now.

On top of that, skorch now offers a new module, `skorch.probabilistic`. It contains new classes to work with **Gaussian Processes** using the familiar skorch API. This is made possible by the fantastic [GPyTorch](https://github.com/cornellius-gp/gpytorch) library, which skorch uses for this. So if you want to get started with Gaussian Processes in skorch, check out the [documentation](https://skorch.readthedocs.io/en/latest/user/probabilistic.html) and this [notebook](https://nbviewer.org/github/skorch-dev/skorch/blob/master/notebooks/Gaussian_Processes.ipynb). Since we're still learning, it's possible that we will change the API in the future, so please be aware of that.

Moreover, we introduced some changes to make skorch more customizable. First of all, we changed the signature of some methods so that they no longer assume the dataset to always return exactly 2 values. This way, it's easier to work with custom datasets that return e.g. 3 values. Normal users should not notice any difference, but if you often create custom nets, take a look at the [migration guide](https://skorch.readthedocs.io/en/latest/user/FAQ.html#migration-from-0-10-to-0-11).

And finally, we made a change to how custom modules, criteria, and optimizers are handled. They are now "first class citizens" in skorch land, which means: if you add a second module to your custom net, it is treated exactly the same as the normal module. E.g., skorch takes care of moving it to CUDA if needed and of switching it to train or eval mode. This way, customizing your network architectures with skorch is easier than ever. Check the [docs](https://skorch.readthedocs.io/en/latest/user/customization.html#initialization-and-custom-modules) for more details.

Since these are some big changes, it's possible that you encounter issues. If that's the case, please check our [issue](https://github.com/skorch-dev/skorch/issues) page or create a new one.

As always, this release was made possible by outside contributors.
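As a hedged illustration of the `load_best` option mentioned above (the module is a placeholder; see the docs for the full `Checkpoint` API):

```python
import torch.nn as nn
from skorch import NeuralNetClassifier
from skorch.callbacks import Checkpoint

class MyModule(nn.Module):
    # placeholder classifier module
    def __init__(self):
        super().__init__()
        self.lin = nn.Linear(20, 2)

    def forward(self, X):
        return nn.functional.softmax(self.lin(X), dim=-1)

net = NeuralNetClassifier(
    MyModule,
    # restore the best-scoring checkpoint automatically when training ends
    callbacks=[Checkpoint(load_best=True)],
)
```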
Many thanks to:

- Autumnii
- Cebtenzzre
- Charles Cabergs
- Immanuel Bayer
- Jake Gardner
- Matthias Pfenninger
- Prabhat Kumar Sahu

Find below the list of all changes:

Added

- Added `load_best` attribute to `Checkpoint` callback to automatically load the state of the best result at the end of training
- Added a `get_all_learnable_params` method to retrieve the named parameters of all PyTorch modules defined on the net, including of criteria if applicable
- Added `MlflowLogger` callback for logging to Mlflow (#769)
- Added `InputShapeSetter` callback for automatically setting the input dimension of the PyTorch module
- Added a new module to support Gaussian Processes through [GPyTorch](https://gpytorch.ai/). To learn more about it, read the [GP documentation](https://skorch.readthedocs.io/en/latest/user/probabilistic.html) or take a look at the [GP notebook](https://nbviewer.jupyter.org/github/skorch-dev/skorch/blob/master/notebooks/Gaussian_Processes.ipynb). This feature is experimental, i.e. the API could be changed in the future in a backwards incompatible way (#782)

Changed

- Changed the signature of `validation_step`, `train_step_single`, `train_step`, `evaluation_step`, `on_batch_begin`, and `on_batch_end` such that instead of receiving `X` and `y`, they receive the whole batch; this makes it easier to deal with datasets that don't strictly return an `(X, y)` tuple, which is true for quite a few PyTorch datasets; please refer to the [migration guide](https://skorch.readthedocs.io/en/latest/user/FAQ.html#migration-from-0-10-to-0-11) if you encounter problems (#699)
- Checking of arguments to `NeuralNet` now happens during `.initialize()`, not during `__init__`, to avoid raising false positives for yet unknown module or optimizer attributes
- Modules, criteria, and optimizers that are added to a net by the user are now first class: skorch takes care of setting train/eval mode, moving to the indicated device, and updating all learnable parameters during training (check the [docs](https://skorch.readthedocs.io/en/latest/user/customization.html#initialization-and-custom-modules) for more details, #751)
- `CVSplit` is renamed to `ValidSplit` to avoid confusion (#752)

Fixed

- Fixed a few bugs in the `net.history` implementation (#776)
- Fixed a bug in `TrainEndCheckpoint` that prevented it from being unpickled (#773)
This is a feature to add support for Gaussian Processes (GPs) via integration with gpytorch.
Similar to "vanilla" PyTorch, the idea here is that skorch allows the user to focus on what's important (implementing the mean function and kernel function) and not to bother with stuff like the training loop, callbacks, etc. This is probably best illustrated in the accompanying notebook.
GPs are primarily for regression, hence those are the main focus here. Traditionally, there are "exact" solutions and approximations. skorch will provide an ExactGPRegressor and a GPRegressor for those two use cases.

On top of that, a GPBinaryClassifier is offered, though I suspect it to be rarely used. I couldn't get the GPClassifier for multiclass to work; the code is therefore commented out.

The API is mostly the same as for the normal skorch estimators. There are some additions to make working with GPs easier:
- the predict method takes a return_std argument to return the standard deviation as well (as in sklearn's GaussianProcessRegressor; return_cov is not supported)
- a sample method to sample from the model
- a confidence_region method to get the confidence region (see the usage sketch below)
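A hedged usage sketch of these additions; the RbfModule definition follows the pattern used elsewhere in this PR, and the exact method signatures may differ:

```python
import gpytorch
import numpy as np
from skorch.probabilistic import ExactGPRegressor

class RbfModule(gpytorch.models.ExactGP):
    """Simple GP module with a constant mean and an RBF kernel."""
    def __init__(self, likelihood):
        # training data are set by skorch later, hence None here
        super().__init__(None, None, likelihood)
        self.mean_module = gpytorch.means.ConstantMean()
        self.covar_module = gpytorch.kernels.RBFKernel()

    def forward(self, x):
        mean_x = self.mean_module(x)
        covar_x = self.covar_module(x)
        return gpytorch.distributions.MultivariateNormal(mean_x, covar_x)

X = np.random.randn(50, 1).astype('float32')
y = np.sin(X).ravel()

gpr = ExactGPRegressor(RbfModule)
gpr.fit(X, y)

y_pred, y_std = gpr.predict(X, return_std=True)  # mean and standard deviation
samples = gpr.sample(X, n_samples=100)           # samples from the model
lower, upper = gpr.confidence_region(X)          # confidence region bounds
```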
Before merging this, I would like this to be reviewed by some experts on GPs, since I consider myself to know very little about them. Therefore, I will try to find some external help. Especially the following points should be clarified:
This PR already includes the bugfix in #781