# Computational statistics, spring 2011

### Lecturer

### News

There is going to be an exam on Thu 11 Aug (the general examination day of the department). You should register for the exam at the department office.

### Scope

8 cu.

### Type

Advanced studies

### Prerequisites

- Basic skills in linear algebra
- Basic skills in multivariate differential calculus (partial derivatives and multiple integrals)
- Basic skills in calculating with multivariate probability distributions (joint and conditional distributions)
- Some previous exposure to Bayesian statistics would be helpful

### Lectures

Periods III and IV: Mon 12-14 and Fri 12-14, room B120

Easter holiday

### Contents of lectures

- Week 4: we started the course on Friday and went through Ch. 1 of the lecture notes.
- Week 5: Ch. 2 (except Section 2.8).
- Week 6: Ch. 3 (but this year we skip Sections 3.3.2 and 3.3.3). We will return to Sec. 3.8 on Monday.
- Week 7: Sec. 3.8 and Sections 4.1 - 4.6.1.
- Week 8: Sections 4.6.2, 4.6.4, and 5.1 - 5.7.
- Week 11: recapitulation for the exam on Monday; exam on Friday (no exercises).
- Week 12: Sections 6.1, 6.2, 6.3 and 7.1. This year we skip Sec. 6.4. Results of first course exam on Friday.
- Week 13: Sec. 7.2 - 7.7.
- Week 14: Sec. 7.7, 8.1 - 8.3 and Sec. 8.4 up to eq. (8.5).
- Week 15: rest of Sec. 8.4; we skipped Sec. 8.5. Ch. 9.
- Week 16: Sec. 10.1 - 10.4.

### Description

This course gives an overview of computational methods that are especially useful in Bayesian statistics (some of the methods are also widely used in frequentist inference).

- Review of probability and Bayesian inference.
- Methods for generating independent samples from distributions.
- Classical Monte Carlo integration and importance sampling.
- Approximating the posterior distribution using numerical quadrature or Laplace expansion.
- MCMC methods: Gibbs and Metropolis-Hastings sampling.
- Auxiliary variable methods in MCMC.
- EM algorithm.
- Multi-model inference.
- MCMC theory.

### Exams

Two course exams, one at the end of each of the periods III and IV. Alternatively, a separate exam.

- General advice for the two course exams: You should bring a pencil and an eraser to the exam. You will be provided with blank paper. Additionally, you are allowed to (but need not) bring a calculator and a lightweight (less than half a kilogram) book of mathematical tables/formulas (for Finns: MAOL taulukot).
- The first course exam was held on Fri 18 March at 12-14 in the room B120. Its area: Chapters 1-5 (skip Sections 3.3.2, 3.3.3 and 4.6.3) and the exercises from sessions 1-4. Exam problems and suggested solutions.
- The second course exam was held on Fri 13 May at 12-14 in the room B120. Exam area: Chapter 6 - Section 10.4 (skip Sections 6.4, 7.4.4, 7.4.6, 8.5). No questions on the EM algorithm (ch. 9). No questions about the DIC (Sec. 10.4). Exam problems and suggested solutions.

If you want to take a separate exam, it is easiest to arrange it on one of the general examination dates of the department. When you are ready to take the exam, send me a message well in advance. The area is Chapter 1 - Section 10.4 (skip Sections 3.3.2, 3.3.3, 4.6.3, 6.4, 7.4.4, 7.4.6, 8.5). (I might make a question about the EM algorithm.)

### Lecture notes

- Part 1: Chapters 1-5 and Appendices A and B.
- There is a bug in the pseudocode of Example 3.2 on p. 41. The correct procedure is: in step 2 generate Z from N_d(0, Sigma); in step 3 set X = mu + Z / sqrt(Y).
- Part 2: Chapters 6-11.
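
The corrected steps for Example 3.2 can be sketched in Python with NumPy. The example itself is not reproduced on this page, so the sketch assumes it is the usual scale-mixture construction of the multivariate t distribution, in which step 1 draws Y from a gamma distribution with shape nu/2 and rate nu/2. That step 1, the function name `sample_mvt`, and the parameter `nu` are assumptions for illustration and are not taken from the lecture notes.

```python
import numpy as np

def sample_mvt(mu, Sigma, nu, size, rng):
    """Draw `size` vectors X = mu + Z / sqrt(Y) (hypothetical helper)."""
    mu = np.asarray(mu, dtype=float)
    d = mu.shape[0]
    # Step 1 (ASSUMPTION, not stated on this page):
    # Y ~ Gamma(shape = nu/2, rate = nu/2); NumPy uses scale = 1/rate.
    y = rng.gamma(shape=nu / 2.0, scale=2.0 / nu, size=size)
    # Step 2 (corrected): Z ~ N_d(0, Sigma).
    z = rng.multivariate_normal(np.zeros(d), Sigma, size=size)
    # Step 3 (corrected): X = mu + Z / sqrt(Y), one Y per sample.
    return mu + z / np.sqrt(y)[:, None]

rng = np.random.default_rng(0)
mu = np.array([1.0, -2.0])
Sigma = np.array([[2.0, 0.5], [0.5, 1.0]])
samples = sample_mvt(mu, Sigma, nu=10, size=20000, rng=rng)
print(samples.mean(axis=0))  # should be close to mu when nu > 1
```

Note that the mean vector mu is added only after the scaling by 1/sqrt(Y), so that Y affects the spread of X but not its location.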

### Exercises

Fri 10-12, room B120. No exercises in the first week of period III or period IV.

You will get additional points, which will be added to your points from the two course exams, according to the formula

max(0, floor((n - 2)/5))

where n is the number of marked exercises (out of a maximum of 45). If you mark an exercise as solved, then you should be able to discuss your solution on the blackboard. (Your solution does not need to be complete or correct for that.)
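
As a quick check of the formula, here is a small Python sketch; the function name `bonus_points` is chosen only for this illustration.

```python
def bonus_points(n_marked):
    """Additional points for n_marked exercises: max(0, floor((n - 2) / 5))."""
    if not 0 <= n_marked <= 45:
        raise ValueError("n_marked must be between 0 and 45")
    # Integer floor division gives floor((n - 2) / 5), also for negative values.
    return max(0, (n_marked - 2) // 5)

for n in (0, 6, 7, 12, 45):
    print(n, bonus_points(n))
```

So the first additional point requires 7 marked exercises, and marking all 45 gives 8 additional points.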

### Practical work

In order to get the credits, you also need to pass the compulsory practical work (harjoitustyö). It would be ideal to do the work in groups of two or three. The aim is to implement the Metropolis-Hastings algorithm or the Gibbs sampler in some simple bivariate problem, where you can visualize your results. The result of this practical work should be a short report (not much text, 3-4 pages would be the ideal length) which includes

- all the needed formulas (e.g., likelihood; prior; full conditionals, if these are needed)
- a description of the algorithm (the code does not need to be included)
- graphical and numerical summaries of the posterior
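
To give an idea of the scale of the task, here is a minimal sketch of a random-walk Metropolis sampler (a special case of Metropolis-Hastings with a symmetric proposal) in Python with NumPy. The target is a stand-in: a correlated bivariate normal plays the role of the posterior, and the names `log_target` and `random_walk_metropolis`, the step size, and the burn-in length are all assumptions for illustration. In an actual project `log_target` would be the log of likelihood times prior for the chosen problem.

```python
import numpy as np

def log_target(theta):
    """Unnormalized log density of a stand-in target (bivariate normal, rho = 0.8)."""
    rho = 0.8
    x, y = theta
    return -(x * x - 2.0 * rho * x * y + y * y) / (2.0 * (1.0 - rho * rho))

def random_walk_metropolis(log_density, theta0, n_iter, step, rng):
    """Return the chain (n_iter x d array) and the observed acceptance rate."""
    theta = np.asarray(theta0, dtype=float)
    current = log_density(theta)
    chain = np.empty((n_iter, theta.size))
    accepted = 0
    for t in range(n_iter):
        # Symmetric Gaussian proposal centered at the current state.
        proposal = theta + step * rng.standard_normal(theta.size)
        candidate = log_density(proposal)
        # Accept with probability min(1, target(proposal) / target(theta)).
        if np.log(rng.uniform()) < candidate - current:
            theta, current = proposal, candidate
            accepted += 1
        chain[t] = theta  # on rejection the current state is repeated
    return chain, accepted / n_iter

rng = np.random.default_rng(1)
chain, acc_rate = random_walk_metropolis(log_target, [3.0, -3.0], 20000, 1.0, rng)
kept = chain[2000:]  # discard an (assumed) burn-in period
post_mean = kept.mean(axis=0)
post_corr = np.corrcoef(kept.T)[0, 1]
print(acc_rate, post_mean, post_corr)
```

The numerical summaries here are the mean and the correlation of the retained draws; a scatter plot of `kept` and trace plots of each coordinate would be natural graphical summaries.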

Project proposals (each group selects one of these or invents its own topic):

- (pdf) Gibbs sampler for allele frequencies in the ABO blood type system.
- (pdf) Inference from grouped normal data.
- (pdf) Analysis of a bioassay experiment.
- (pdf) Three different Metropolis-Hastings samplers for gamma parameters.
- (pdf) Analyzing space shuttle Challenger data; data file challenger.dat.

Deadline:

- You should return your work by the end of September, 2011. However, I hope many of you are able to return the work earlier. (Don't spend very much time on polishing the text; concentrate on getting the formulas and results into shape.)

### Useful links

- Wikipedia: List of probability distributions.
- Wikipedia: English pronunciation of Greek letters.
- R: R project homepage; download R from CRAN.
- BUGS: OpenBUGS, WinBUGS.

### Registration

Did you forget to register? What to do.