Computational statistics, spring 2010 (Laskennallinen tilastotiede)

Last modified by ppkoisti@helsinki_fi on 2024/03/27 10:02

Lecturer

Petri Koistinen

Scope

8 cu.

Type

Advanced studies.

News

  • The grading of the second course exam is on display on the bulletin board of the department.
  • Good news: The general level of the course was high. Everybody who attended both course exams will pass, and there are going to be lots of 4s and 5s. The probability mass concentrating on the lower marks goes (virtually) to those students who censored themselves by not finishing the course.
  • Bad news: You will get your credits only after your practical work has been accepted.
  • There will be a summer exam on Thu 12 Aug at 10-14 (for those who did not take both of the course exams). You have to register for that exam in the department office. The exam covers all of the lecture material except Sections 6.4, 10.5, 10.6, 10.7 and Chapter 11.

Prerequisites

  • Basic skills in linear algebra.
  • Basic skills in multivariate differential calculus (partial derivatives and multiple integrals).
  • Basic skills in calculating with multivariate joint and conditional distributions from probability theory (but measure theory is not needed).
  • Some previous exposure to Bayesian inference would be helpful.

Lectures

Weeks 3-9 and 11-18 Mon 12-14, Fri 12-14 in room B120.

No lectures or exercises on Fri Feb 26.

Easter holiday 1-7 April.

Second course exam, room CK112, Mon 3 May at 12-14.

  • Domain of the exam: exercise sessions 5-9 and chapters 6, 7, 8 and 10, with the following exceptions. We skip sections 6.4, 7.4.4, 7.4.6, 8.5, 10.5, 10.6 and 10.7. Chapters 9 and 11 are not required for this exam.
  • When you read for the exam: don't try to memorize formulas, try to understand the ideas and how they are applied (but now you need to remember the formula for the M-H ratio r).
  • Two hours, four problems. I will try to test your understanding of some of the central ideas. The computations required should be easier than in the exercises.

First course exam, Mon 1 Mar

  • Domain of the exam: exercise sessions 1-4 and chapters 1-5 except Sec. 3.3.2. You don't need to memorize the details of the ratio of uniforms method. You can also skip Sec. 5.4.4.
  • Exam problems.

Contents of lectures

  • Week 17: Mon: Sec. 11.3, 11.4, 11.5, 11.6, 11.9; Fri: ergodicity and CLTs; recapitulation of Ch. 6-10 for the course exam.
  • Week 16: Mon: Sec. 10.3, 10.4 and 10.7; Fri: rest of Ch. 10 and beginning of Ch. 11.
  • Week 15: Mon: rest of Ch. 9, Fri: to end of Sec. 10.2.
  • Week 14: No lecture on Mon (Easter); Fri: Sec. 8.4, (8.5 was skipped), 9.1.
  • Week 13: Lecture on Mon: Sec. 8.2 and 8.3; discussion of practical work. No program on Fri (Easter).
  • Week 12: Lectures on Sec. 7.4 - Sec. 8.1.
  • Week 11: Lectures on Ch. 6 - Sec. 7.3, but we skip Sec. 6.4 this year. On Fri we discussed results of the first course exam.
  • Week 9: Mon: exam; Fri: no program.
  • Week 8: Mon: Sec. 5.4 (quickly) and recap of the material for the first course exam; Fri: no program.
  • Week 7: to the end of Ch. 5, but we skipped Sec. 5.4.
  • Week 6: from the beginning of Ch. 4 to the middle of Sec. 4.5.1 (we also had a look at Sec. 2.8).
  • Week 5: to the end of Ch. 3, but we skipped polar coordinates (Sec. 3.3.2) and you do not need to study it for the exam.
  • Week 4: to the end of Sec. 3.3.1, but we skipped Sec. 2.8.
  • Week 3: from Ch. 1 to the middle of Sec. 2.6.

Exercises

Fri 10-12 in room B120. No exercises in the first week of period III.

You will get additional points, which will be added to your points from the two course exams, according to the formula

max(0, floor((n - 2)/5))

where n is the number of marked exercises.
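
As a quick check of the formula, here is a minimal sketch in Python. The function name bonus_points is only an illustration and does not come from the course material. Integer floor division gives the same result as floor((n - 2)/5).

```python
def bonus_points(n: int) -> int:
    """Extra points for n marked exercises: max(0, floor((n - 2) / 5))."""
    # Python's // is floor division, so (n - 2) // 5 equals floor((n - 2) / 5),
    # also when n - 2 is negative.
    return max(0, (n - 2) // 5)


if __name__ == "__main__":
    # Print the bonus for a few hypothetical exercise counts.
    for n in (0, 6, 7, 12, 22):
        print(n, bonus_points(n))
```

Under this formula the first extra point comes at 7 marked exercises, and each further 5 marked exercises add one more point.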

Description

This course gives an overview of certain computational methods which are useful especially in Bayesian statistics. Topics include

  • Review of probability and Bayesian inference.
  • Methods for generating independent samples from distributions.
  • Classical Monte Carlo integration and importance sampling.
  • Approximating the posterior distribution using numerical quadrature or Laplace expansion.
  • MCMC methods: Gibbs and Metropolis-Hastings sampling.
  • Auxiliary variable methods in MCMC.
  • EM algorithm.
  • Multi-model inference.
  • MCMC theory.

Exams

Two course exams, one at the end of each of periods III and IV. Alternatively, a separate exam.

You will get extra points by solving enough exercises, provided you take part in the two course exams.

Practical work

In order to get the credits, you also need to pass the compulsory practical work (harjoitustyö). The work should be done in groups of two or three. The aim is to implement the Metropolis-Hastings algorithm or the Gibbs sampler for some simple bivariate problem, where you can visualize your results. The result of this practical work should be a short report (not much text, 3-4 pages would be the ideal length) which includes

  • all the needed formulas (e.g., likelihood; prior; full conditionals, if these are needed)
  • a description of the algorithm (the code does not need to be included)
  • graphical and numerical summaries of the posterior
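
To give a rough idea of the kind of sampler meant here, the following is a minimal sketch of a random-walk Metropolis-Hastings sampler for a bivariate target, written in Python using only the standard library. It is not taken from the lecture notes or from any of the project proposals. The target (a standard bivariate normal with correlation rho = 0.8), the function names, the step size and the seed are all assumptions chosen for illustration. Because the Gaussian random-walk proposal is symmetric, the M-H ratio r reduces to the ratio of target densities at the proposed and current points; the notation in the lecture notes may differ.

```python
import math
import random


def log_target(x, y, rho=0.8):
    """Log density, up to an additive constant, of a standard bivariate
    normal with correlation rho (a hypothetical stand-in for a posterior)."""
    return -(x * x - 2.0 * rho * x * y + y * y) / (2.0 * (1.0 - rho * rho))


def random_walk_metropolis(log_density, start, n_iter, step=1.0, seed=0):
    """Random-walk Metropolis-Hastings in two dimensions.

    Returns the list of draws (one pair per iteration) and the
    fraction of proposals that were accepted.
    """
    rng = random.Random(seed)
    x, y = start
    current = log_density(x, y)
    draws = []
    accepted = 0
    for _ in range(n_iter):
        # Symmetric proposal: add independent Gaussian noise to each coordinate.
        x_new = x + step * rng.gauss(0.0, 1.0)
        y_new = y + step * rng.gauss(0.0, 1.0)
        proposed = log_density(x_new, y_new)
        # Accept with probability min(1, r), where log r = proposed - current.
        # 1 - rng.random() lies in (0, 1], so the logarithm is always defined.
        if math.log(1.0 - rng.random()) < proposed - current:
            x, y, current = x_new, y_new, proposed
            accepted += 1
        draws.append((x, y))
    return draws, accepted / n_iter


if __name__ == "__main__":
    draws, rate = random_walk_metropolis(log_target, (0.0, 0.0), 20000)
    kept = draws[2000:]  # discard an (arbitrary) burn-in period
    mean_x = sum(p[0] for p in kept) / len(kept)
    mean_y = sum(p[1] for p in kept) / len(kept)
    print("acceptance rate:", rate)
    print("posterior mean estimates:", mean_x, mean_y)
```

In an actual project, log_target would be replaced by the log of the unnormalized posterior (log likelihood plus log prior), and the draws would be summarized with scatter plots, histograms and posterior means or quantiles.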

Project proposals:

  • (pdf) Gibbs sampler for allele frequencies in the ABO blood type system.
  • (pdf) Inference from grouped normal data.
  • (pdf) Analysis of a bioassay experiment using the logistic link.
  • (pdf) Analysis of a bioassay experiment using the t-link.
  • (pdf) Trying different Metropolis-Hastings samplers for gamma parameters.
  • (pdf) Analyzing space shuttle Challenger data; data file challenger.dat.

Deadline:

  • You should return your work by the end of September, 2010. However, I hope many of you are able to return the work even earlier. (Don't spend very much time on polishing the text; concentrate on getting the formulas into shape.)

Examples

  • B.txt Appendix B of the lecture notes: examples of MCMC samplers coded in R.
  • probit-gibbs.txt Gibbs sampler for probit regression (Sec. 8.4).

Lecture notes

  • Part 1: Chapters 1-5 and Appendix A.
  • Part 2: Chapters 6-9.
  • Revised ch. 7: A revised version of Ch. 7. There are corrections in Sections 7.4.4, 7.4.5, and 7.6; pages 94, 95 and 99.
  • Part 3: Chapters 10 and 11.
  • Revised ch. 10: A revised version of Ch. 10. There are many minor changes, and the explanation of how one gets BIC by using Laplace has been improved considerably.

Registration

Did you forget to register? What to do.