bayesian-learning-with-R/BruceCampbell_ST540_HW_5.Rmd at master · NCSU-STATS/bayesian-learning-with-R · GitHub

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
---
title: "Applied Bayesian Analysis : NCSU ST 540"
subtitle: "Homework 5"
author: "Bruce Campbell"
fontsize: 11pt
output: pdf_document
bibliography: BruceCampbell_ST540_HW_1.bib
---

---
```{r setup, include=FALSE,echo=FALSE}
rm(list = ls())
setwd("C:/E/brucebcampbell-git/bayesian-learning-with-R")
knitr::opts_chunk$set(echo = TRUE)
knitr::opts_chunk$set(dev = 'pdf')
knitr::opts_chunk$set(cache=TRUE)
knitr::opts_chunk$set(tidy=TRUE)
knitr::opts_chunk$set(prompt=FALSE)
knitr::opts_chunk$set(fig.height=5)
knitr::opts_chunk$set(fig.width=7)
knitr::opts_chunk$set(warning=FALSE)
knitr::opts_chunk$set(message=FALSE)
knitr::opts_knit$set(root.dir = ".")
knitr::opts_chunk$set(tidy.opts=list(width.cutoff=38),tidy=TRUE)
library(latex2exp)
library(pander)
library(ggplot2)
library(GGally)
```

let $Y_i$ be the number of concussions from team $i = 1, ..., 32$. The model is

$$Y_i|\lambda_i \sim Poisson(\lambda_i)$$

and the prior is

$$\lambda_i|\theta \sim Gamma(1, \theta)$$
where
$$\theta \sim Gamma(0.1, 0.1)$$.

### Derive the full conditional distribution of $\lambda_1$

Assuming independence of $Y_i$ we can write the full joint distribution as

$$P(Y_1, ... , Y_n, \lambda_1,...\lambda_n,\theta) \propto \prod\limits_{i=1}^{n} P(Y_i|\lambda_i) P(\lambda_i | \theta) P(\theta)   $$
Now
$$ P(\lambda_1 | Y_1, ... , Y_n, \lambda_2,...\lambda_n,\theta)\propto P(Y_1|\lambda_1) P(\lambda_1 | \theta) P(\theta)\;\;\;\prod\limits_{i=2}^{n} P(Y_i|\lambda_i) P(\lambda_i | \theta)   $$
Putting the expressions for the densities in here and dropping the product terms on the right hand side unrelated to $\lambda_1$ we have that

$$P(\lambda_1 | Y_1, ... , Y_n, \lambda_2,...\lambda_n,\theta)\propto  \frac{\lambda_1^{y_1}}{y_1!}e^{-\lambda_1} \;\;\;\theta e^{- \theta \lambda_1 } \;\;\;  \frac{(0.1)^{0.1}}{\Gamma(0.1)} \theta^{0.1-1} e^{-0.1 \;\theta} \propto \lambda_1^{(y_1+1) -1} e^{-\lambda_1 (1+\theta)}$$
The last experssion we recognise as the kernel of a $Gamma(y_1+1,1+\theta)$ distribution.

### Derive the full conditional distribution of $\theta$

From above, we have that

$$ P(\theta | Y_1, ... , Y_n, \lambda_1,...\lambda_n)\propto P(\theta)\;\;\;\prod\limits_{i=1}^{n} P(\lambda_i | \theta)   $$
and putting the expression for the densities in we have

$$ P(\theta | Y_1, ... , Y_n, \lambda_1,...\lambda_n)\propto \prod\limits_{i=1}^{n} \theta e^{- \theta \lambda_i } \;\;\;  \frac{(0.1)^{0.1}}{\Gamma(0.1)} \theta^{0.1-1} e^{-0.1 \;\theta} \propto \;\; \theta^{n-.9} \; e^{\theta (0.1+\sum \lambda_i)}$$
We recognise this expression as the kernel of a $Gamma(n+0.1,0.1+ \sum \lambda_i)$ density.

### Write Gibbs sampling code to draw samples from the joint distribution of $(\lambda_1, ..., \lambda_{32}, \theta)$.


```{r}
df <- read.csv("ConcussionsByTeamAndYear.csv",header = TRUE)
df$sum <- df$X2012+df$X2013
n <- nrow(df)
M <- 10000 #Burn in
N <- 10000 #Number of draws

posterior.sample <- matrix(nrow = N,ncol = n+1)
#Set the initial values to the means of the respective distributions

theta.0 <- 1 #Mean of Gamma(.1,.1)

lambda.0 <- matrix(data=rep(1,n),n,1) # Set to mean of Gamma(1,theta.0)


#Initialize outside loop
theta.t <- 1
lambda.t <- matrix(data=rep(1,n),n,1)

for(j in  1: M+N)
{
  theta.t <-rgamma(1,shape = n+0.1, rate = 0.1+sum(lambda.t))
  for(i in 1:n)
  {
    y.i <- df$sum[i]
    lambda.t[i] <- rgamma(1,y.i+1,1+theta.t)
  }
  if(j>M)
  {
    sample <-  c(lambda.t,theta.t)
    posterior.sample[j-M,]<- sample
  }
}

```

### Show trace plots of the samples for $\lambda_1$ and $\theta$.

####Trace plot for $\lambda_1$

```{r}
plot(posterior.sample[,1], main = TeX("$\\lambda_1$"),pch="*")
abline (h = mean(posterior.sample[,1]),col='red')
abline (h = mean(df$sum[1]),col='blue')
```


####Trace plot for $\theta$
```{r}
plot(posterior.sample[,33], main = TeX("$\\theta$"),pch="*")
```


(5) Plot the estimated posterior mean of the $\lambda_i$ versus $Y_i$ and comment on whether the code is
returning reasonable estimates. Turn in your solution on one piece of paper. Also, turn in MCMC code on a second piece of paper
stapled to the solution.

```{r}
means.lambda <- as.list(colMeans(posterior.sample))
means.lambda[[33]] <- NULL
plot(df$sum,means.lambda)
abline(a = 0,b = 1, col ='red')
```

We see good alignment between the values $Y_i$ and $\lambda_i$ - remember $Y \sim Pisson(\lambda) \implies E[Y]=\lambda$

### Test - Let's validate with Jags


```{r}
library(rjags)
library(coda)
a      <- .1
b      <- .1
model_string <- "model{

  for(i in 1:n)
  {
    lambda[i] ~ dgamma(1,theta)
    # Likelihood
    Y[i] ~ dpois(lambda[i])
  }

  # Prior
  theta ~ dgamma(a, b)
}"
Y <- df$sum
model.concuss <- jags.model(textConnection(model_string), data = list(Y=Y,n=n,a=a,b=b))
update(model.concuss, 10000, progress.bar="none"); # Burnin for 10000 samples

samp.theta <- coda.samples(model.concuss, variable.names=c("theta"),
        n.iter=20000, progress.bar="none")

summary(samp.theta)
plot(samp.theta)


samp.lambda1 <- coda.samples(model.concuss, variable.names=c("lambda[1]"),
        n.iter=20000, progress.bar="none")

summary(samp.lambda1)
plot(samp.lambda1)
mean(unlist(samp.lambda1))

```