an example of gibbs sampler (with r code)

Consider the latent variable formulation of the binary probit model:
$$
z = beta x + epsilon
$$
and $y = 0$ if $z < 0$.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56

rm(list = ls())
library(MASS)

binaryprobitGibbs <- function(, y, A, betabar, mcmc.step, burnin){

nvar <- ncol(X)
beta.chain <- matrix(nrow = mcmc.step, ncol = nvar)
beta.chain[1,] <- rep(0, nvar)
sigma <- rep(1, nrow(X))
covar <- chol2inv(chol(crossprod(X, X) + A))
Abetabar <- crossprod(A, betabar)
max.number <- 100
a <- ifelse(y == 0,-max.number, 0)
b <- ifelse(y == 0, 0, max.number)

# draw z
draw.z <- function(mu, sigma, a, b){
# truncated normal distribution
phi.a <- pnorm(a, mean = mu, sd = sigma)
phi.b <- pnorm(b, mean = mu, sd = sigma)
z <- mu + sigma*qnorm(runif(length(mu)) * (phi.b - phi.a) + phi.a)
return(z)
}

# draw beta
draw.beta <- function(covar, , z, Abetabar){
beta.tilde <- covar %*% (crossprod(X, z) + Abetabar)
return(mvrnorm(n = 1, beta.tilde, covar))
}

# main interation loop
for (i in 2:mcmc.step){
mu <- X %*% beta.chain[i - 1,]
z <- draw.z(mu, sigma, a, b)
beta.chain[i,] <- draw.beta(covar, X, z, Abetabar)
}
return(beta.chain[(burnin + 1):mcmc.step,])
}

# a simulated example
library(magrittr)
set.seed(123)
n.sample <- 1000
X <- c(rep(1, n.sample), runif(n = 2 * n.sample, min = 0, max = 20)) %>% matrix(ncol = 3)
beta.true <- c(4, 6, -5)
z.true <- X %*% beta.true + rnorm(n.sample)
y <- ifelse(z.true < 0, 0, 1)
A <- .01 * diag(ncol(X))
betabar <- rep(0, ncol(X))
mcmc.step <- 2e5
burnin <- .25 * mcmc.step

# test the algorithm
beta.chain <- binaryprobitGibbs(X, y, A, betabar, mcmc.step, burnin)
apply(beta.chain, 2, mean) # show the posterior mean

[1] 3.76 4.57 -3.87

1
2
3
4
posterior.out <- data.frame(intercept = beta.chain[,1], beta1 = beta.chain[,2], beta2 = beta.chain[,3])
library(ggplot2)
# show the posterior distribution and the ACF (autocorrelation function)
ggplot(data = posterior.out, aes(intercept)) + geom_histogram(breaks = seq(1, 6, by = .01))

Posterior density of the intercept

1
2
# show the autocorrelation
acf(beta.chain[,1])

Autocorrelation

Since the Gibbs sampler with data augmentation always results in high autocorrelation, we may need more iterations to ensure convergence. We find that estimated parameters $hat{beta} = (3.76, 4.57, -3.87)$ is not that close to the true values $beta = (4, 6, -5)$. In consequence, we need to check the performance of Gibbs sampler before reporting the results.