Best practices with GPs


GPyTorch

To work with exact GPs, turn off GPyTorch's fast approximate algorithms:


import gpytorch

# Disable the fast approximate routines globally (these are private
# settings flags; each one defaults to True).
gpytorch.settings._fast_log_prob._default = False
gpytorch.settings._fast_solves._default = False
gpytorch.settings._fast_covar_root_decomposition._default = False
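
The same switches are also exposed through GPyTorch's public fast_computations context manager, which avoids touching private attributes; a minimal scoped sketch (the body is a placeholder):

import gpytorch

# Equivalent, per-call form using the documented context manager:
with gpytorch.settings.fast_computations(covar_root_decomposition=False,
                                         log_prob=False, solves=False):
    ...  # run exact GP fitting / prediction here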

How to optimize hyperparameters of GPs

Maximum likelihood or MAP

  1. Sample a large number of hyperparameter candidates from a plausible range or distribution (not necessarily the prior over hyperparameters); the number of candidates can be 10 to 10k depending on the cost. Evaluate the GP log posterior (log marginal likelihood + log prior over hyperparameters) at each candidate and keep the best one (or the few best).
  2. Starting from the best candidate(s), optimize with L-BFGS (see the sketch after this list).
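
A minimal sketch of this recipe in GPyTorch. The model, toy data, hyperpriors, and the plausible search ranges below are illustrative assumptions, not prescriptions from this page:

import copy, math
import torch
import gpytorch

class ExactGPModel(gpytorch.models.ExactGP):
    def __init__(self, train_x, train_y, likelihood):
        super().__init__(train_x, train_y, likelihood)
        self.mean_module = gpytorch.means.ConstantMean()
        # Hyperpriors (assumed here) make the objective a log posterior
        # (MAP) rather than a plain log marginal likelihood.
        self.covar_module = gpytorch.kernels.ScaleKernel(
            gpytorch.kernels.RBFKernel(
                lengthscale_prior=gpytorch.priors.GammaPrior(3.0, 6.0)),
            outputscale_prior=gpytorch.priors.GammaPrior(2.0, 0.15))

    def forward(self, x):
        return gpytorch.distributions.MultivariateNormal(
            self.mean_module(x), self.covar_module(x))

# Toy data (illustrative).
train_x = torch.linspace(0, 1, 50)
train_y = torch.sin(2 * math.pi * train_x) + 0.1 * torch.randn(50)

likelihood = gpytorch.likelihoods.GaussianLikelihood()
model = ExactGPModel(train_x, train_y, likelihood)
# ExactMarginalLogLikelihood adds the log hyperpriors, so this evaluates
# the (unnormalized) GP log posterior from step 1.
mll = gpytorch.mlls.ExactMarginalLogLikelihood(likelihood, model)
model.train(); likelihood.train()

# Step 1: random search over plausible (log-uniform) ranges.
best_score, best_state = -float("inf"), None
for _ in range(200):
    model.covar_module.base_kernel.lengthscale = 10.0 ** torch.empty(1).uniform_(-2, 1).item()
    model.covar_module.outputscale = 10.0 ** torch.empty(1).uniform_(-1, 1).item()
    likelihood.noise = 10.0 ** torch.empty(1).uniform_(-3, 0).item()
    with torch.no_grad():
        score = mll(model(train_x), train_y).item()
    if score > best_score:
        best_score, best_state = score, copy.deepcopy(model.state_dict())

# Step 2: refine the best candidate with L-BFGS.
model.load_state_dict(best_state)
optimizer = torch.optim.LBFGS(model.parameters(), max_iter=100)

def closure():
    optimizer.zero_grad()
    loss = -mll(model(train_x), train_y)
    loss.backward()
    return loss

optimizer.step(closure)

Note that model.parameters() already includes the likelihood's noise parameter, since ExactGP registers the likelihood as a submodule.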

MCMC sampling

  1. As in step 1 above, sample a large number of hyperparameter candidates from a plausible range or distribution (not necessarily the prior over hyperparameters), with 10 to 10k candidates depending on the cost; evaluate the GP log posterior (log marginal likelihood + log prior over hyperparameters) and keep the best one (or the few best).
  2. Given the best candidate(s), run local optimization of the log posterior from there with an early termination rule (e.g., terminate after ~100 steps, or set a very large tolerance for BFGS); don't optimize until the end! Ideally we want to start sampling from near the mode, but not at the mode.
  3. Start the sampler (typically, slice sampling) from the end point of the optimization and run it for a few iterations (see the sketch after this list).
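
A minimal sketch of steps 2 and 3, assuming a generic log_post(theta) over a hyperparameter vector (a hypothetical stand-in for the GP log posterior above) and a basic coordinate-wise slice sampler with stepping out (Neal, 2003):

import numpy as np
from scipy.optimize import minimize

def log_post(theta):
    # Stand-in log posterior; substitute the GP log marginal
    # likelihood + log hyperprior here.
    return -0.5 * np.sum((theta - 1.0) ** 2)

# Step 2: short local optimization from the best random candidate,
# terminated early (maxiter) so we end up near, but not at, the mode.
theta0 = np.zeros(3)  # best candidate from the random-search step
res = minimize(lambda t: -log_post(t), theta0, method="L-BFGS-B",
               options={"maxiter": 100})

# Step 3: coordinate-wise slice sampling with stepping out,
# started from the optimizer's end point.
def slice_sample(theta, logp, n_samples, w=1.0, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    theta = theta.copy()
    samples = []
    for _ in range(n_samples):
        for d in range(theta.size):
            # Draw the slice level under the current density.
            log_y = logp(theta) + np.log(rng.uniform())
            def logp_at(x):
                t = theta.copy(); t[d] = x
                return logp(t)
            # Step out a bracket [lo, hi] that contains the slice.
            lo = theta[d] - w * rng.uniform()
            hi = lo + w
            while logp_at(lo) > log_y:
                lo -= w
            while logp_at(hi) > log_y:
                hi += w
            # Shrink the bracket until a point inside the slice is found.
            while True:
                x = rng.uniform(lo, hi)
                if logp_at(x) > log_y:
                    theta[d] = x
                    break
                if x < theta[d]:
                    lo = x
                else:
                    hi = x
        samples.append(theta.copy())
    return np.array(samples)

samples = slice_sample(res.x, log_post, n_samples=200)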