
Update gpytorch and gpflow benchmark #19

Open
wants to merge 1 commit into master

Conversation

gpleiss

@gpleiss commented Oct 16, 2020

Here are a few updates to the GPyTorch and GPFlow benchmark code that I believe will make for a fairer comparison for both packages:

  1. Use whitened variational inference - the whitening operation is known to dramatically accelerate the convergence of variational optimization without adding any computational complexity. (Importantly, GPyTorch's UnwhitenedVariationalStrategy is very old code that uses outdated linear algebra. We haven't updated it because we wouldn't normally recommend using it at all.)

  2. For the GPyTorch data loader, set num_workers=0. A larger num_workers is (counterintuitively) slower for non-image datasets, since the extra worker processes add synchronization and serialization overhead. (A sketch of both changes follows this list.)
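
For concreteness, here is a minimal sketch of both changes, assuming a recent GPyTorch release; the model, kernel, and data below are placeholders rather than the benchmark code itself:

```python
import torch
import gpytorch
from torch.utils.data import DataLoader, TensorDataset


class SVGPModel(gpytorch.models.ApproximateGP):
    def __init__(self, inducing_points):
        # Full-covariance q(u); a mean-field distribution would also work here.
        variational_distribution = gpytorch.variational.CholeskyVariationalDistribution(
            inducing_points.size(0)
        )
        # VariationalStrategy is the whitened strategy;
        # UnwhitenedVariationalStrategy is the legacy unwhitened code path.
        variational_strategy = gpytorch.variational.VariationalStrategy(
            self, inducing_points, variational_distribution, learn_inducing_locations=True
        )
        super().__init__(variational_strategy)
        self.mean_module = gpytorch.means.ConstantMean()
        self.covar_module = gpytorch.kernels.ScaleKernel(gpytorch.kernels.RBFKernel())

    def forward(self, x):
        return gpytorch.distributions.MultivariateNormal(
            self.mean_module(x), self.covar_module(x)
        )


# num_workers=0 keeps data loading in the main process, which avoids
# worker-process synchronization overhead on non-image minibatches.
train_x, train_y = torch.randn(10_000, 90), torch.randn(10_000)  # placeholder data
train_loader = DataLoader(
    TensorDataset(train_x, train_y), batch_size=1024, shuffle=True, num_workers=0
)
```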

You are comparing GPyTorch/GPFlow SVGP models against Falkon's kernel ridge regression, which essentially pits the cost of large kernel-matrix operations against the convergence rate of SGD. Whitening is a well-known technique that improves SGD's convergence rate for SVGP (without any additional complexity).

cc @jacobrgardner. @jameshensman, @alexggmatthews: am I missing anything on the GPFlow end?

@Giodiro
Contributor

Giodiro commented Oct 16, 2020

Hi @gpleiss
I have one question regarding the use of whitened vs unwhitened:
When running with natural gradients and 3k inducing points, I'm seeing a 3x slowdown with the whitened strategy. Do you have any idea what could be the cause?

@gpleiss
Author

gpleiss commented Oct 16, 2020

> When running with natural gradients and 3k inducing points, I'm seeing a 3x slowdown with the whitened strategy. Do you have any idea what could be the cause?

Sorry - what is the 3x slowdown in comparison to?

@Giodiro
Contributor

Giodiro commented Oct 16, 2020

I'll give you the whole list of settings I'm trying:

  • diagonal variational distribution, whitened strategy -> 70s per epoch
  • full variational distribution, whitened strategy -> 80s per epoch
  • diagonal variational distribution, UNwhitened strategy, (fast computations set to True) -> 40s per epoch
  • diagonal variational distribution, UNwhitened strategy, (fast computations set to False) -> 25s per epoch
  • full variational distribution, UNwhitened strategy, (fast computations set to False) -> 25s per epoch
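
For reference, these options map roughly onto GPyTorch objects as follows (a hedged sketch assuming a recent GPyTorch release; the actual benchmark script may differ):

```python
import gpytorch

M = 3000  # number of inducing points

# "diagonal" variational distribution -> mean-field q(u) with a diagonal covariance
diag_dist = gpytorch.variational.MeanFieldVariationalDistribution(M)
# "full" variational distribution -> Cholesky-parameterized full covariance
full_dist = gpytorch.variational.CholeskyVariationalDistribution(M)

# whitened strategy   -> gpytorch.variational.VariationalStrategy(...)
# UNwhitened strategy -> gpytorch.variational.UnwhitenedVariationalStrategy(...)

# "fast computations" toggles GPyTorch's iterative linear algebra; turning it off
# falls back to exact Cholesky-based computations:
with gpytorch.settings.fast_computations(
    covar_root_decomposition=False, log_prob=False, solves=False
):
    pass  # build the model and run the training epoch here
```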

@gpleiss
Author

gpleiss commented Oct 16, 2020

Which benchmark script are you running to get these results?

@Giodiro
Contributor

Giodiro commented Oct 16, 2020

benchmark_millionsongs.sh from the benchmark branch. To change the variational distribution, switch the VAR variable between "full" and "diag". If you set NATGRAD_LR to 0, you'll be doing SGD with Adam instead of natural gradients. If you actually want to run the script, you'll also need to change some hardcoded paths in the datasets.py file (check line 181). If needed, I can share the data (it's the standard Million Song Dataset: https://archive.ics.uci.edu/ml/datasets/yearpredictionmsd).
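
To illustrate what the NATGRAD_LR toggle changes on the GPyTorch side, here is a hedged, self-contained sketch (placeholder data and model, not the benchmark script itself):

```python
import torch
import gpytorch


class SVGP(gpytorch.models.ApproximateGP):
    def __init__(self, inducing_points):
        # A natural-gradient-friendly parameterization of q(u).
        dist = gpytorch.variational.NaturalVariationalDistribution(inducing_points.size(0))
        strat = gpytorch.variational.VariationalStrategy(self, inducing_points, dist)
        super().__init__(strat)
        self.mean_module = gpytorch.means.ConstantMean()
        self.covar_module = gpytorch.kernels.ScaleKernel(gpytorch.kernels.RBFKernel())

    def forward(self, x):
        return gpytorch.distributions.MultivariateNormal(
            self.mean_module(x), self.covar_module(x)
        )


train_x, train_y = torch.randn(5_000, 90), torch.randn(5_000)  # placeholder data
model = SVGP(train_x[:3_000].clone())

# NATGRAD_LR > 0: natural-gradient steps (NGD) for the variational parameters,
# Adam for the kernel/likelihood hyperparameters.
ngd_opt = gpytorch.optim.NGD(model.variational_parameters(), num_data=train_y.size(0), lr=0.1)
adam_opt = torch.optim.Adam(model.hyperparameters(), lr=0.01)

# NATGRAD_LR = 0: skip the NGD step and train every parameter with Adam.
adam_only = torch.optim.Adam(model.parameters(), lr=0.01)
```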

@gpleiss
Author

gpleiss commented Oct 16, 2020

Ah, I understand the discrepancy now. In VariationalStrategy we cast the Cholesky call to double precision, which is not something we do in UnwhitenedVariationalStrategy (again, the unwhitened strategy is old code that hasn't been updated in a while).

With M = 1000 or 2000, this extra precision usually doesn't take much time. At M = 4000 you start to see a difference, but the double precision is usually necessary in later epochs to prevent numerical errors.
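
For readers unfamiliar with the pattern, the general idea (an illustrative sketch, not GPyTorch's actual code path) looks like this:

```python
import torch


def stable_cholesky(K: torch.Tensor, jitter: float = 1e-6) -> torch.Tensor:
    """Factorize in float64 for numerical stability, then cast back.

    The double-precision factorization costs noticeably more as M grows
    (e.g. M = 4000), but it guards against Cholesky failures in later epochs.
    """
    K64 = K.double()
    eye = torch.eye(K64.size(-1), dtype=K64.dtype, device=K64.device)
    L = torch.linalg.cholesky(K64 + jitter * eye)
    return L.to(K.dtype)
```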
