Bayesian theory with applications, spring 2010
Bayesian theory with applications, lecture diary and course information
Lecturer
Scope
5+3 cu. The additional 3 credits are gained by completing a project task.
Type
Advanced studies. Bayesian theory is currently applied throughout the whole spectrum of scientific modeling, and it is also a very important tool in a multitude of technological and engineering fields. The aims of the course are to decipher the Bayesian machinery (how and why it works) and to gain a detailed understanding of an array of its applications.
Prerequisites
Probability calculus, calculus, and linear algebra are important prerequisites. Stochastic processes and computational statistics are useful, but not obligatory.
Lectures
See the main page.
Lecture diary (tentative schedule only).
Week 11: Recent article in the NY Times about Bayesian statistics, course introduction, introduction to the subjective and epistemic perspective on probability (see the Stanford Encyclopedia entry on probability), Bayes' theorem, dynamic revision of uncertainty using Bayes' theorem; see the example on perception and sensory integration, which is demonstrated live in this BBC clip, the Search & Rescue game software, and the Search and Rescue demo case as a pdf (note also that there is real Bayesian search & rescue software in use by coast guards, see here for a SAROPS demo), plus the related sequential Monte Carlo computation. Revision of uncertainty and predictions for a 'cigar-box sampling problem'; usefulness of systematic use of prior information in the context of infant mortality and SIDS (see this article by Gilbert et al. 2005). Use of Bayesian statistics to locate a missing plane: the example of Air France flight 447 and its relevance to the MH370 search. This paper in Statistical Science discusses the AF447 case using SAROPS, and there are many other success stories listed here (a special Bayes issue of Statistical Science; the papers are also available on arxiv.org). Discussion of Bayesian inference and likelihood ratio calculations in forensics: for DNA evidence see these excellent slides (1), (2), (3) and (4) from Richard Gill's homepage; for gunshot residue analysis, see this paper and these slides. This paper discusses more complicated evidence calculation in cases of DNA mixtures, and this review by David Balding discusses several challenges in DNA-based evidence.
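The dynamic revision of uncertainty described above can be made concrete in a few lines of code. Below is a minimal sketch in Python of sequential Bayesian updating for a two-box sampling problem of the 'cigar-box' type; the box compositions and the data sequence are invented for illustration and are not taken from the lecture materials.

```python
# Sequential Bayesian updating for a 'two cigar boxes' problem.
# Box A holds 70% red / 30% blue tickets, box B holds 40% red / 60% blue
# (made-up compositions). We draw tickets with replacement from one box,
# not knowing which, and revise P(box) after each draw with Bayes' theorem.

likelihood = {"A": {"red": 0.7, "blue": 0.3},
              "B": {"red": 0.4, "blue": 0.6}}
posterior = {"A": 0.5, "B": 0.5}  # uniform prior over the two boxes

for draw in ["red", "red", "blue", "red"]:  # an example data sequence
    # unnormalized posterior: current prior times likelihood of the new draw
    unnorm = {box: posterior[box] * likelihood[box][draw] for box in posterior}
    total = sum(unnorm.values())
    posterior = {box: p / total for box, p in unnorm.items()}
    # posterior predictive probability that the next draw is red
    pred_red = sum(posterior[box] * likelihood[box]["red"] for box in posterior)
    print(f"after {draw}: P(A)={posterior['A']:.3f}, P(next red)={pred_red:.3f}")
```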
Week 12: Exchangeability, de Finetti's representation theorem, subjective probability modeling, prior and posterior predictive distributions, illustrations with probabilistic classification of documents. SpamAssassin is the most widely used spam protection system with a Bayesian filter; Cormack and Lynam at U Waterloo present a nice study on the efficiency of SpamAssassin and other filters. Instead of modeling the presence/absence of words, it is also possible to use data compression on text sequences for spam filtering. In a more general setting, this recent paper discusses predictive classification and exchangeability; see also its sequel paper and these illustrative slides that summarize the behavior of predictive inference under various classification circumstances. A February 2015 paper about predictive classifiers based on graphical models illustrates several improvements over the previous approaches. A useful approach to calculating integrals in Bayesian inference via 'visual pattern recognition' is explained in this document. A catalogue of conjugate prior distributions is here. Gu, L. has provided these useful notes on the Dirichlet distribution and its relatives, which give a concise recapitulation of some of the central formulas around the Dirichlet distribution.
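To make the prior and posterior predictive distributions concrete, the following minimal sketch works through the conjugate Beta-Binomial case for an exchangeable 0/1 sequence; the hyperparameters and data are arbitrary choices for illustration.

```python
# Conjugate Beta-Binomial updating: theta ~ Beta(a, b), x_i | theta ~ Bernoulli(theta).
# For exchangeable 0/1 data, the posterior is Beta(a + s, b + n - s) with s = sum(x),
# and the posterior predictive is P(x_new = 1 | data) = (a + s) / (a + b + n).

a, b = 1.0, 1.0            # Beta(1, 1), i.e. a uniform prior on theta
data = [1, 0, 1, 1, 1, 0]  # an invented exchangeable 0/1 sequence
n, s = len(data), sum(data)

prior_pred = a / (a + b)                 # prior predictive P(x = 1)
post_a, post_b = a + s, b + n - s        # conjugate posterior parameters
post_pred = post_a / (post_a + post_b)   # posterior predictive P(x_new = 1)

print(f"prior predictive:     P(x=1) = {prior_pred:.3f}")
print(f"posterior: Beta({post_a:.0f}, {post_b:.0f})")
print(f"posterior predictive: P(x=1) = {post_pred:.3f}")
```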
Week 13: Marginal and conditional independence, DAGs, graphs for representations of hierarchical models (see also this introductory article by M. Jordan), choosing prior distributions. Vanilla introduction to hierarchical models as a case study on kidney cancers from the book by Gelman et al.
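The difference between marginal and conditional independence can be checked numerically. The sketch below simulates a common-cause DAG Z -> X, Z -> Y with unit regression coefficients (an arbitrary toy choice), so that conditioning on Z amounts to subtracting it out.

```python
# In the DAG Z -> X, Z -> Y, X and Y are conditionally independent given Z
# but marginally dependent. We verify this numerically via correlations.
import numpy as np

rng = np.random.default_rng(0)
z = rng.normal(size=100_000)
x = z + rng.normal(size=100_000)  # X depends on the common cause Z
y = z + rng.normal(size=100_000)  # Y depends on the common cause Z

print("marginal corr(X, Y):  ", round(np.corrcoef(x, y)[0, 1], 3))  # ~0.5
# the coefficient on Z is 1 by construction, so 'conditioning on Z'
# reduces to removing Z from each variable, leaving the independent noises
resid_x, resid_y = x - z, y - z
print("partial corr given Z: ", round(np.corrcoef(resid_x, resid_y)[0, 1], 3))  # ~0
```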
Week 14: Kidney cancer story continued (with simulation in the classroom); for a more realistic example of Bayesian smoothing of disease rates, see the excellent slides of Aki Vehtari. Bayesian inference procedures in practice, with an illustrative case study on IQ estimation (with simulation in the classroom). Choosing priors continued; these slides and this article illustrate the impact of different prior/model choices for clustering data of genomic aberrations observed in cancers.
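In the spirit of the kidney cancer case study, the following sketch shows how a conjugate Beta-Binomial model shrinks noisy raw rates from small counties toward the prior mean while leaving large counties almost untouched; all counts and hyperparameters here are invented for illustration.

```python
# Beta-Binomial shrinkage: raw rates y/n in small counties are very noisy;
# the posterior mean (a + y) / (a + b + n) pulls them toward the prior mean.
a, b = 2.5, 497.5  # invented Beta prior centered on an overall rate of 0.005

counties = [("tiny",    1,    35),   # (label, cases y, population n) -- invented
            ("small",   0,   120),
            ("medium",  9,  2000),
            ("large", 460, 90000)]

for label, y, n in counties:
    raw = y / n
    post_mean = (a + y) / (a + b + n)  # conjugate Beta posterior mean
    print(f"{label:>6}: raw rate {raw:.4f} -> posterior mean {post_mean:.4f}")
```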
Week 15: More about priors, hierarchical models, partial exchangeability. We do several experiments related to specifying subjective probability intervals. In one of them, participants collectively examine a glass jar with Euro coins (actually plastic pearls) and settle on a prior for the 'Number of Euros in the jar' problem, see this document. For a discussion about scoring probabilistic forecasts using the Brier score, see this article. For a discussion about combining expert information using probabilities, see this article. Problems of getting reliable statements from experts are discussed here and here. Advanced hierarchical modeling: a solid frozen vanilla cracker example of a hierarchical model and a summary of it (to appreciate the concept of genetic drift you may wish to watch this simple animation). An example of Bayesian meta-analysis from the biostatistics book of George Woodworth. Examples of hierarchical models in fisheries management.
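For reference, the Brier score of a sequence of probabilistic forecasts of binary events is simply the mean squared difference between the stated probabilities and the realized outcomes. A minimal sketch with invented forecasts:

```python
# Brier score for binary events: mean of (p_i - o_i)^2, where p_i is the stated
# probability and o_i in {0, 1} the outcome. Lower is better; a constant
# forecast of 0.5 earns exactly 0.25 regardless of the outcomes.

def brier(probs, outcomes):
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)

outcomes = [1, 0, 1, 1, 0]                         # invented event outcomes
print(brier([0.9, 0.2, 0.8, 0.7, 0.1], outcomes))  # sharp and calibrated: 0.038
print(brier([0.5] * 5, outcomes))                  # uninformative: 0.25
```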
Week 17-18: Model selection issues, the fair-coin and star-tree paradoxes, a review of information-theoretic criteria for model selection, the Bayes factor (see this paper by Kass & Raftery, 1995), Occam's razor (see a demo), sampling from two cigar boxes and dynamically updating model uncertainty (classroom simulation), discussion of choosing priors by formal rules (see this article in particular), and the asymptotic behavior of model selection procedures (see this proof of asymptotic consistency for the discrete case). Model selection under improper priors with fractional marginal likelihood (see the course slides and these articles: paper1, paper2, paper3). What is wrong with Bayes factors or posterior probabilities when the null hypothesis must NOT be favored? For an answer see paper1 and paper2. Bayesian model averaging, see this article; asymptotic behavior of Bayesian inference; see also the free book by David MacKay, which contains chapters on Bayesian modeling and in particular a very nice discussion of the Occam's razor principle. About ABC (approximate Bayesian computation) inference, see this introduction. Finite mixture models and the EM algorithm from the HMM book by prof. Timo Koski at KTH. A nice tutorial on Bayesian non-parametric models is available here; see also these slides on mixture models by Christopher Bishop.
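The automatic Occam's razor effect of the Bayes factor can be seen in the simplest possible setting: comparing a fair-coin model against a coin with a uniform prior on its bias, where both marginal likelihoods are available in closed form. The data in this sketch are invented for illustration.

```python
# Bayes factor for M1: theta = 0.5 (fair coin) vs M2: theta ~ Uniform(0, 1).
# For an observed sequence with s heads in n tosses the marginal likelihoods are
#   p(data | M1) = 0.5**n
#   p(data | M2) = integral of theta**s * (1-theta)**(n-s) dtheta
#                = s! (n - s)! / (n + 1)!          (a Beta integral, exact)
from math import factorial

def bayes_factor_fair_vs_uniform(s, n):
    m1 = 0.5 ** n
    m2 = factorial(s) * factorial(n - s) / factorial(n + 1)
    return m1 / m2

for s, n in [(5, 10), (8, 10), (50, 100), (80, 100)]:
    bf = bayes_factor_fair_vs_uniform(s, n)
    print(f"{s}/{n} heads: BF(fair : uniform) = {bf:.3f}")
# Balanced data favor the simpler fair-coin model (BF > 1, Occam's razor);
# clearly skewed data favor the richer model with a free bias parameter.
```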
Exams
Written exam and assignments (weekly assignments downloadable from the main page).
See the course page for current year for the exam date. Participants are allowed to bring all the lecture and assignment materials with them to the exam.
A list of possible topics for a larger assignment task is available here; choose one project freely from the list. Reports on the larger assignment can be produced by working alone or in pairs. If you decide to do the project jointly with another participant, return only a single report with the names of both participants. The reports should be returned within three months of the written exam date. By completing both the written exam and a larger assignment, participants will gain 8 credits for the course. If you wish to suggest your own topic for a larger assignment, contact the lecturer.
Bibliography
Lecture slides are available here. In addition, several classroom demonstrations and various case-study materials are used. Examples of useful books on Bayesian theory and modeling are Bernardo & Smith (1994), O'Hagan (1994), Schervish (1995), and Gelman et al. (2004).