Probability

The goal of this assignment is to apply the model development and inference tools from class to Gaussian data. Working through the particulars of implementation will help you develop your final project. Hand in both your code and a written report.

Mixture modelling

A set of observations was generated according to the model

π ∼ Dir(1, 1, 1)

z_i ∼ Cat(π),  i = 1, …, n

μ_k ∼ MVN(0, 10 I_2),  k = 1, …, K

x_i ∼ MVN(μ_{z_i}, I_2),  i = 1, …, n

where Dir is a Dirichlet distribution, Cat is a categorical distribution, MVN is a multivariate normal distribution, I_2 is the 2 × 2 identity matrix, K = 3, and n ∈ {250, 1000, 5000} (three distinct simulations).
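Hint: it can help to sanity-check your inference code on data you simulated yourself before running it on the provided files. Here is a minimal NumPy sketch of the generative process above (the function name and seed are illustrative, not part of the assignment):

```python
import numpy as np

def simulate(n, K=3, d=2, seed=0):
    """Draw one dataset from the generative model above."""
    rng = np.random.default_rng(seed)
    pi = rng.dirichlet(np.ones(K))                    # pi ~ Dir(1, 1, 1)
    z = rng.choice(K, size=n, p=pi)                   # z_i ~ Cat(pi)
    mu = rng.multivariate_normal(
        np.zeros(d), 10 * np.eye(d), size=K)          # mu_k ~ MVN(0, 10 I_2)
    X = mu[z] + rng.standard_normal((n, d))           # x_i ~ MVN(mu_{z_i}, I_2)
    return X, z, mu, pi
```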

Part 1a

  1. We gave you the generative model. Write out the other two ways to specify a probabilistic model, namely a plate diagram and the joint probability distribution. Given which model variables, if any, are two observed random variables x_i and x_j conditionally independent?
  2. Implement a Gibbs sampler for the aforementioned model. Please document your derivation (a minimal sketch of one possible sampler appears after the hints below).

Hint: You may want to keep track of p(x, z, μ, π), or a similar quantity, to test for convergence.

Hint: You will almost surely want to work in log space when dealing with small probabilities.
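To make both hints concrete, here is one possible shape for a Gibbs sweep over the full conditionals of this model, with the log joint tracked in log space. This is a minimal NumPy/SciPy sketch, not a reference solution: you are still expected to derive and document the conditionals yourself, and the function names, iteration count, and seed are illustrative.

```python
import numpy as np
from scipy.special import logsumexp

def log_joint(X, z, pi, mu):
    # Unnormalized log p(x, z, mu, pi); additive constants dropped
    # (the Dir(1, 1, 1) prior on pi is flat on the simplex).
    lp = -(mu ** 2).sum() / 20.0               # mu_k ~ MVN(0, 10 I_2)
    lp += np.log(pi)[z].sum()                  # z_i ~ Cat(pi)
    lp += -0.5 * ((X - mu[z]) ** 2).sum()      # x_i ~ MVN(mu_{z_i}, I_2)
    return lp

def gibbs(X, K=3, iters=1000, seed=0):
    rng = np.random.default_rng(seed)
    n, d = X.shape
    z = rng.integers(K, size=n)                # random initial assignments
    pi = np.full(K, 1.0 / K)
    mu = rng.normal(scale=np.sqrt(10.0), size=(K, d))
    trace = []
    for _ in range(iters):
        # z_i | pi, mu, x_i: p(k) propto pi_k N(x_i | mu_k, I_2), in log space
        logp = np.log(pi)[None, :] \
            - 0.5 * ((X[:, None, :] - mu[None, :, :]) ** 2).sum(-1)
        p = np.exp(logp - logsumexp(logp, axis=1, keepdims=True))
        z = np.array([rng.choice(K, p=row) for row in p])
        # pi | z ~ Dir(1 + n_1, ..., 1 + n_K)
        counts = np.bincount(z, minlength=K)
        pi = rng.dirichlet(1.0 + counts)
        # mu_k | z, x ~ MVN(m_k, s_k I_2) with s_k = 1 / (1/10 + n_k)
        for k in range(K):
            s_k = 1.0 / (0.1 + counts[k])
            m_k = s_k * X[z == k].sum(axis=0)
            mu[k] = rng.normal(m_k, np.sqrt(s_k))
        trace.append(log_joint(X, z, pi, mu))   # for convergence diagnostics
    return z, pi, mu, trace
```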


Part 1b

  1. Implement mean-field variational inference for the aforementioned model. Commenting your code makes it easier to grant partial credit. Document your derivation (a minimal sketch of the coordinate-ascent updates appears after this list).
  2. Apply your code to the simulated data provided on the course website (hw1_250.txt, hw1_1000.txt, hw1_5000.txt). How did you assess convergence for each inference algorithm?
  3. Evaluate your methods using any tools or metrics you deem appropriate (e.g., figures, clustering metrics, runtime comparisons). Interpret your parameter estimates. Did both inference methods converge on similar parameter estimates? Why or why not?
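For the mean-field step, here is a minimal sketch of one possible coordinate-ascent (CAVI) loop for this model, assuming a fully factorized q(z) q(π) q(μ) and the model's fixed hyperparameters. Again, you must document your own derivation; the names and the crude convergence check are illustrative.

```python
import numpy as np
from scipy.special import digamma, logsumexp

def cavi(X, K=3, iters=200, tol=1e-8, seed=0):
    """Mean-field VI with q(z) q(pi) q(mu): q(pi) = Dir(alpha),
    q(mu_k) = MVN(m_k, s_k I_2), q(z_i) = Cat(r_i)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    r = rng.dirichlet(np.ones(K), size=n)       # responsibilities q(z_i = k)
    for _ in range(iters):
        Nk = r.sum(axis=0)                      # soft counts
        alpha = 1.0 + Nk                        # q(pi) = Dir(alpha)
        s = 1.0 / (0.1 + Nk)                    # posterior variance; 0.1 = prior precision
        m = s[:, None] * (r.T @ X)              # posterior means of mu_k
        # q(z): log r_ik = E[log pi_k] - 0.5 E||x_i - mu_k||^2 + const
        Elogpi = digamma(alpha) - digamma(alpha.sum())
        sq = ((X[:, None, :] - m[None, :, :]) ** 2).sum(-1) + d * s[None, :]
        logr = Elogpi[None, :] - 0.5 * sq
        logr -= logsumexp(logr, axis=1, keepdims=True)
        r_new = np.exp(logr)
        if np.abs(r_new - r).max() < tol:       # crude check; the ELBO is the
            r = r_new                           # principled convergence criterion
            break
        r = r_new
    return r, alpha, m, s
```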

Hint: You may use external libraries for evaluation. For instance, scikit-learn has a number of off-the-shelf options for evaluating clustering methods.
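For example, one possible set of checks, assuming hard assignments z_gibbs and z_vi extracted from each method (the variable names are illustrative):

```python
from sklearn.metrics import adjusted_rand_score, silhouette_score

def compare_clusterings(X, z_gibbs, z_vi):
    """Compare the two methods' hard cluster assignments.

    ARI is invariant to label permutations, so label switching between
    runs and methods is harmless; silhouette needs no reference labels.
    """
    print("ARI(Gibbs, VI):     ", adjusted_rand_score(z_gibbs, z_vi))
    print("silhouette (Gibbs): ", silhouette_score(X, z_gibbs))
    print("silhouette (VI):    ", silhouette_score(X, z_vi))
```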