 #### 联系方式

• QQ：99515681
• 邮箱：99515681@qq.com
• 工作时间：8:00-23:00
• 微信：codehelp #### 您当前位置：首页 >> Algorithm 算法作业Algorithm 算法作业

###### 日期：2019-11-08 09:34

Lab 3: Markov chain Monte Carlo

Nial Friel

1 November 2019

Aim of this lab

In the lab we will focus on implementing MCMC for an example which we have explored in class. The

objective will be understand the role that the proposal variance plays in determining the efficiency of the

Markov chain.

A brief re-cap of MCMC

Recall that MCMC is a framework for simulating from a target distribution π(θ) by constructing a Markov

chain with whose stationary distribution is π(θ). Then, if we run the chain for long enough, simulated values

from the chain can be treated as a (dependent) sample from the target distribution and used as a basis for

summarising important features of π.

Under certain regularity conditions, the Markov chain sample path mimics a random sample from π. Given

realisations {θt : t = 0, 1, . . . } from such a chain, typical asymptotic results include the distributional

convergence of the realisations, ie

θt → π(θ)

both in distribution and as t → ∞. Also the consistency of the ergodic average, for any scalar functional φ,

So this leads to the question of how one can generate such a Markov chain, in general. As we’ve seen in

lectures, the Metropolis-Hastings algorithm provides a general means to do this.

Metropolis-Hastings algorithm

Recall that at iteration t of the Metropolis-Hastings algorithm one simulates a proposed state, φ from some

proposal distribution which depends on the current state, θt. We denote this proposal density by q(θt, φ).

This proposed state is then accepted with probability denoted by α(θt, φ) (as described below) and the next

state of the Markov chain becomes θt+1 = φ, otherwise, θt+1 = θt.

We algorithmically describe this algorithm as follows:

Step 1. Given the current state, θt, generate a proposed new state, φ, from the distribution q(θt, φ).

Step 2. Calculate.

Step 3. With probability α(θt, φ), set θt+1 = φ, else set θt+1 = θt.

1

Here we write a generic function to implement the Metroplis-Hastings algorithm.

metropolis.hastings <- function(f, # the target distribution

g, # the proposal distribution

random.g, # a sample from the proposal distribution

x0, # initial value for chain, in R it is x

sigma, # proposal st. dev.

chain.size=1e5){ # chain size

x <- c(x0, rep(NA,chain.size-1)) # initialize chain

for(i in 2:chain.size) {

y <- random.g(x[i-1],sigma) # generate Y from g(.|xt) using random.g

#alpha <- min(1, f(y)*g(x[i-1],y,sigma)/(f(x[i-1])*g(y,x[i-1],sigma)))

alpha <- min(1, f(y)*g(y,x[i-1],sigma)/(f(x[i-1])*g(x[i-1],y,sigma)))

if( runif(1)<alpha){x[i] <- y}

else{x[i] <- x[i-1]}}x}

A first (simple) example

Here we will re-visit a simple example which we explored in lectures where we wish to sample from a N(0, 1)

distribution using another normal distribution as a proposal distribution. Recall that the objective was

explore the effect of the variance of the proposal distribution on the efficiency of the resulting Markov chain.

To implement this using our generic code we do the following, where we first use a poor choice of proposal

variance (σ2 = 202):sigma=20

f <- function(x) dnorm(x,0,1)

random.g <- function(x,sigma) rnorm(1,x,sigma) # q(x,y)

g <- function(x,y,sigma) dnorm(y,x,sigma)

x0 <- 4

chain.size <- 1e4

x <- metropolis.hastings(f,g,random.g,x0,sigma,chain.size)

We can then explore the output of this chain as follows:

par(mfrow=c(1,2))

plot(x, xlab="x", ylab="f(x)", main=paste("Trace plot of x[t], sigma=", sigma),col="red", type="l")

xmin = min(x[(0.2*chain.size) : chain.size])

xmax = max(x[(0.2*chain.size) : chain.size])

xs.lower = min(-4,xmin)

2

xs.upper = max(4,xmax)

xs <- seq(xs.lower,xs.upper,len=1e3)

hist(x[(0.2*chain.size) : chain.size],50, xlim=c(xs.lower,xs.upper),col="blue",xlab="x", main="Metropolis-Hastings",freq=FALSE)

lines(xs,f(xs),col="red", lwd=1)

0 2000 6000 10000

Trace plot of x[t], sigma= 20

Metropolis?Hastings

We see that the chain apears not to have mixed very well (as evidenced from the trace plot). Also, the

histrogram does not yield an excellent summary of the target density (plotted in red on the right hand plot).

Now we use a much smaller proposal variance (σ2 = 0.22) and explore how this has effected the Markov chain.

sigma=0.2

f <- function(x) dnorm(x,0,1)

random.g <- function(x,sigma) rnorm(1,x,sigma) # q(x,y)

g <- function(x,y,sigma) dnorm(y,x,sigma)

x0 <- 4

chain.size <- 1e4

x <- metropolis.hastings(f,g,random.g,x0,sigma,chain.size)

We can then explore the output of this chain as follows:

par(mfrow=c(1,2))

plot(x, xlab="x", ylab="f(x)", main=paste("Trace plot of x[t], sigma=", sigma),col="red", type="l")

xmin = min(x[(0.2*chain.size) : chain.size])

xmax = max(x[(0.2*chain.size) : chain.size])

xs.lower = min(-4,xmin)

xs.upper = max(4,xmax)

xs <- seq(xs.lower,xs.upper,len=1e3)

3

hist(x[(0.2*chain.size) : chain.size],50, xlim=c(xs.lower,xs.upper),col="blue",xlab="x", main="Metropolis-Hastings",freq=FALSE)

lines(xs,f(xs),col="red", lwd=1)

0 2000 6000 10000

Trace plot of x[t], sigma= 0.2

Metropolis?Hastings

Again, the trace plot indicates that the chain has not mixed very well, although the histogram of the Markov

chain trace is in much better agreement with the target density.

Finally, we consider the case where the proposal variance is set to σ2 = 32.

sigma=3

f <- function(x) dnorm(x,0,1)

random.g <- function(x,sigma) rnorm(1,x,sigma) # q(x,y)

g <- function(x,y,sigma) dnorm(y,x,sigma)

x0 <- 4

chain.size <- 1e4

x <- metropolis.hastings(f,g,random.g,x0,sigma,chain.size)

We can then explore the output of this chain as follows:

par(mfrow=c(1,2))

plot(x, xlab="x", ylab="f(x)", main=paste("Trace plot of x[t], sigma=", sigma),col="red", type="l")

xmin = min(x[(0.2*chain.size) : chain.size])

xmax = max(x[(0.2*chain.size) : chain.size])

xs.lower = min(-4,xmin)

xs.upper = max(4,xmax)

xs <- seq(xs.lower,xs.upper,len=1e3)

hist(x[(0.2*chain.size) : chain.size],50, xlim=c(xs.lower,xs.upper),col="blue",xlab="x", main="Metropolis-Hastings",freq=FALSE)

lines(xs,f(xs),col="red", lwd=1)

Trace plot of x[t], sigma= 3

Now the output, in terms of the trace plot and histogram suggests that the algorithm is mixing well and that

the output from the Metropolis-Hastings algorithm provides a good summary of the target distribution.

5

Exercises

1. Consider again the problem explored in the lab.

? Modify the code to monitor the rate at which the proposed values are accepted, for each of the three

scenarios above (σ = 20, 0.2 or 3). Explain how this analysis can help to decide how to tune the proposal

variance. [15 marks]

? For scenario, estimate the probability that X > 2, where X ～ N(0, 1) using the output from the

Metropolis-Hastings algorithm. Provide an approximate standard error in each case. (Note that you

can compare your estimates to the true value 0.02275) [10 marks]

2. Consider sampling from Beta(2.5, 4.5) distribution using the an independence sampler with a uniform

proposal distribution. Provide R code to do this. Produce output from your code to illustrate the