Best practices with GPs
GPyTorch
To work with exact GPs, turn off GPyTorch's fast approximate algorithms:

import gpytorch

# Disable the fast approximate routines so that exact computations are used
gpytorch.settings._fast_log_prob._default = False
gpytorch.settings._fast_solves._default = False
gpytorch.settings._fast_covar_root_decomposition._default = False
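Alternatively, the same flags can be toggled through the public fast_computations interface rather than the private _default attributes; a minimal sketch, assuming the context-manager API of recent GPyTorch versions:

import gpytorch

# Disable all fast approximations within the scope of the context manager
with gpytorch.settings.fast_computations(covar_root_decomposition=False, log_prob=False, solves=False):
    ...  # fit / evaluate the exact GP here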
How to optimize hyperparameters of GPs
Maximum likelihood or MAP
- Sample a large number of hyperparameter candidates from a plausible range or distribution (not necessarily the prior over the hyperparameters); depending on the cost, the number of candidates can range from 10 to 10k. Evaluate the GP log posterior (log marginal likelihood + log prior over the hyperparameters) for each candidate and keep the best one (or the few best ones).
- Starting from the best candidate(s), refine with local optimization using L-BFGS (see the sketch after this list).
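A minimal sketch of this multi-start ML/MAP procedure in GPyTorch. The toy 1-D data, the model definition, and the log-uniform sampling ranges are illustrative assumptions, not recommendations:

import torch
import gpytorch

# Toy 1-D regression data (illustrative)
train_x = torch.linspace(0, 1, 50)
train_y = torch.sin(6 * train_x) + 0.1 * torch.randn(50)

class ExactGPModel(gpytorch.models.ExactGP):
    def __init__(self, train_x, train_y, likelihood):
        super().__init__(train_x, train_y, likelihood)
        self.mean_module = gpytorch.means.ConstantMean()
        self.covar_module = gpytorch.kernels.ScaleKernel(gpytorch.kernels.RBFKernel())

    def forward(self, x):
        return gpytorch.distributions.MultivariateNormal(
            self.mean_module(x), self.covar_module(x))

likelihood = gpytorch.likelihoods.GaussianLikelihood()
model = ExactGPModel(train_x, train_y, likelihood)
mll = gpytorch.mlls.ExactMarginalLogLikelihood(likelihood, model)
model.train()
likelihood.train()

def log_objective():
    # Log marginal likelihood; if hyperpriors are registered on the model,
    # GPyTorch adds their log density to this objective automatically.
    return mll(model(train_x), train_y)

def loguniform(lo, hi):
    # Draw one sample log-uniformly on [lo, hi] (illustrative choice)
    return float(lo * (hi / lo) ** torch.rand(()))

# Step 1: evaluate many random candidates and keep the best one
best_val, best_state = -float("inf"), None
for _ in range(100):
    model.covar_module.base_kernel.lengthscale = loguniform(0.1, 10.0)
    model.covar_module.outputscale = loguniform(0.1, 10.0)
    model.likelihood.noise = loguniform(1e-3, 0.1)
    with torch.no_grad():
        val = log_objective().item()
    if val > best_val:
        best_val = val
        best_state = {k: v.clone() for k, v in model.state_dict().items()}

# Step 2: refine the best candidate with L-BFGS
model.load_state_dict(best_state)
optimizer = torch.optim.LBFGS(model.parameters(), max_iter=100)

def closure():
    optimizer.zero_grad()
    loss = -log_objective()
    loss.backward()
    return loss

optimizer.step(closure)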
MCMC sampling
- As in the ML/MAP case, sample a large number of hyperparameter candidates from a plausible range or distribution, evaluate the GP log posterior for each, and keep the best one (or the few best ones).
- Given the best candidate(s), run local optimization of the log posterior from there with an early-termination rule (e.g., terminate after ~100 steps, or set a very large tolerance for BFGS). Don't optimize until the end: ideally we want to start sampling from near the mode, but not at the mode.
- Start the sampler (typically, slice sampling) from the end point of the optimization and run it for a few iterations (a sketch follows below).
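A minimal sketch of the last two steps, assuming a generic log_posterior(theta) over an unconstrained hyperparameter vector. The slice sampler is a basic coordinate-wise stepping-out/shrinkage implementation in the style of Neal (2003); all names, ranges, and settings are illustrative:

import numpy as np
from scipy.optimize import minimize

def log_posterior(theta):
    # Placeholder: log marginal likelihood + log hyperprior as a function of
    # the unconstrained (e.g., log-transformed) hyperparameter vector theta.
    return -0.5 * np.sum(theta ** 2)

# Step 2: early-terminated local optimization; maxiter caps L-BFGS at ~100
# steps so we end up near, but not exactly at, the mode.
theta0 = np.zeros(3)  # best candidate from the random-initialization step
res = minimize(lambda t: -log_posterior(t), theta0,
               method="L-BFGS-B", options={"maxiter": 100})

# Step 3: coordinate-wise slice sampling with stepping-out and shrinkage
def slice_sample(logpdf, x0, n_samples, w=1.0, max_steps=50, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x0, dtype=float).copy()
    logp = logpdf(x)
    samples = np.empty((n_samples, x.size))
    for i in range(n_samples):
        for d in range(x.size):
            log_y = logp + np.log(rng.uniform())  # slice level under the density
            # Step out: expand the interval until both ends fall below the slice
            left = x[d] - w * rng.uniform()
            right = left + w
            xl, xr = x.copy(), x.copy()
            for _ in range(max_steps):
                xl[d] = left
                if logpdf(xl) <= log_y:
                    break
                left -= w
            for _ in range(max_steps):
                xr[d] = right
                if logpdf(xr) <= log_y:
                    break
                right += w
            # Shrink: sample uniformly in the interval, shrinking on rejection
            xp = x.copy()
            while True:
                xp[d] = rng.uniform(left, right)
                logp_prop = logpdf(xp)
                if logp_prop > log_y:
                    x[d], logp = xp[d], logp_prop
                    break
                if xp[d] < x[d]:
                    left = xp[d]
                else:
                    right = xp[d]
        samples[i] = x
    return samples

# Run a short chain starting from the early-stopped optimization result
samples = slice_sample(log_posterior, res.x, n_samples=200)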