A continuous-time Markov process (CTMP) is similar to a discrete-time
Markov chain, except that transitions between states can occur at any
time and are not restricted to discrete time points. For a CTMP, if we
know the state of the stochastic process at a specific time point, then
what will happen in the future is independent of what has happened in
the past.
In the continuous-time analog of our weather model, being wet or dry is
now a property of every specific point in time and not restricted to a
daily summary.
Definition
We will define a continuous-time Markov process using a discrete-time
Markov chain called the “jump chain” and a set of time points called the
“jump times”.
Let $Y = \left\{Y_k \in \Sigma : k = 0, 1, \ldots\right\}$ be a
discrete-time Markov chain with initial state $Y_0$ (the jump chain).
Let $R = \left[r_{ij}\right]$ be the transition matrix for the jump
chain such that $r_{ii} = 0 \; \forall i$. This means that in a single
jump, the jump chain must move from one state to a different state.
Let $H = \left\{H_k \in [0,\infty) : k = 1, 2, \ldots\right\}$ be the
sequence of holding times associated with the jump chain. $H_k$
represents how long the chain was in state $Y_{k-1}$ before jumping to
state $Y_{k}$. If $Y_{k-1} = y$, then $H_k$ has an exponential
distribution with rate $v_y$.
Using an exponential distribution for holding times is important for
creating a Markov process because it is the only continuous-time
distribution that is memoryless.
Let $J_k \in [0, \infty)$ be the time when the k-th jump occurred (the
jump times) such that $J_0 = 0$ and
$$J_k = \sum_{i=1}^k H_i$$
Let $X = \left\{X(t) \in \Sigma : t \in [0,\infty)\right\}$ be a
stochastic process where

$$X(t) = Y_k \quad \text{for } J_k \le t < J_{k+1}$$

That is, $X(t)$ holds the value of the jump chain between consecutive
jump times.
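To make the definition concrete, here is a minimal simulation sketch in Python (using NumPy). The two-state dry/wet chain, its rates, and the helper name `simulate_ctmp` are hypothetical choices, not from the text: it draws holding times $H_k$ from exponential distributions and steps the jump chain $Y$ according to $R$.

```python
# Minimal sketch: simulate a CTMP from its jump chain and holding times.
# The two-state "dry/wet" model and its rates are hypothetical choices.
import numpy as np

rng = np.random.default_rng(1)

R = np.array([[0.0, 1.0],   # jump-chain transition matrix (r_ii = 0)
              [1.0, 0.0]])
v = np.array([0.5, 2.0])    # holding-time rates v_y for each state

def simulate_ctmp(y0, t_max):
    """Return jump times J_k and jump-chain states Y_k up to t_max."""
    t, y = 0.0, y0
    times, states = [0.0], [y0]
    while True:
        t += rng.exponential(1.0 / v[y])   # H_k ~ Exponential(v_y)
        if t >= t_max:
            break
        y = rng.choice(len(R), p=R[y])     # next state from the jump chain
        times.append(t)                    # J_k = sum of holding times
        states.append(y)
    return np.array(times), np.array(states)

J, Y = simulate_ctmp(y0=0, t_max=10.0)
# X(t) = Y_k for J_k <= t < J_{k+1}
print(list(zip(J.round(2), Y)))
```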
Markov property. Let $X(t)$ be a continuous-time Markov process. Then
conditional on $X(t) = i$, $X(t+r)$ is independent of $X(s)$ when
$r \ge 0$ and $s < t$.
Proof. Suppose $X(t) = i$, where $J_m \le t < J_{m+1}$ and
$Y_m = i$.
Let $\tilde{X}(r) = X(t+r)$. The jump chain $\tilde{Y}_0$,
$\tilde{Y}_1$, etc. of $\tilde{X}(r)$ is given by
$\tilde{Y}_n = Y_{m+n}$ and the holding times $\tilde{H}_1$,
$\tilde{H}_2$, etc. by

$$\tilde{H}_1 = H_{m+1} - (t - J_m), \qquad \tilde{H}_n = H_{m+n} \;\text{ for } n \ge 2$$
Since $Y$ is a Markov chain whose future depends only on $Y_0$,
$\tilde{Y}$ is also a Markov chain whose future depends only on
$\tilde{Y}_0$. Since $\tilde{Y}_0 = Y_m = X(t)$, conditioning on the
event $\{X(t) = i\}$ makes $\tilde{Y}$ a Markov chain independent of
$Y_0$ to $Y_{m}$.
Recall that holding times are independent and exponentially distributed,
i.e. $H_k$ has an exponential distribution with rate $v_y$ where
$y = Y_{k-1}$. First note that by the memoryless property of
$H_{m+1}$, $\tilde{H}_1$ is exponentially distributed with rate
$v_y$ where $y = \tilde{Y}_0$, and second that
$\tilde{H}_{n : n \ge 2}$ is exponentially distributed with rate $v_y$
where $y = \tilde{Y}_{n-1}$. Next, conditioning on $\{X(t) = i\}$ and
on $\tilde{Y}_1$, $\tilde{Y}_2$, etc., the holding times
$\tilde{H}_1$, $\tilde{H}_2$, etc. are independent of $H_1$ to
$H_m$ and $Y_0$ to $Y_{m}$.
Therefore, conditional on $\{X(t) = i\}$, $\tilde{Y}$ is independent
of all $X(s)$ where $s < t$. $\blacksquare$
Transition Probability Matrices
It can also be shown that this process is homogeneous, such that

$$\Pr(X(t+s) = j \mid X(s) = i) = \Pr(X(t) = j \mid X(0) = i) = p_{ij}(t)$$

for all $s, t \ge 0$. We collect these probabilities into the
transition probability matrix $P(t) = \left[p_{ij}(t)\right]$.
We can construct a single matrix that describes the instantaneous
behavior of a Markov process based on the jump chain and holding time
processes. Let $Q = \left[q_{ij}\right]$ be an $n \times n$ matrix where
$q_{ij : i \ne j}$ are non-negative and
$q_{ii} = - \sum_{j : j \ne i} q_{ij}$ are negative, such that

$$q_{ij} = v_i r_{ij} \;\text{ for } i \ne j, \qquad q_{ii} = -v_i$$

Note that the sum of each row is zero. $Q$ is the transition rate
matrix and describes the instantaneous behavior of a Markov process. It
represents how quickly transitions from $i$ to $j$ happen (for
$i \ne j$) and how long the process stays in individual states (for
$i = j$).
Justification
Consider a very small length of time, $h$, such that the chance that
two or more jumps occur in the process is negligible. Given
$X(0) = i$, the probability of no jump is $e^{-v_ih}$ and the
probability of one jump is approximately $1 - e^{-v_ih}$. Expanding to
first order in $h$ gives $p_{ii}(h) \approx 1 + q_{ii}h$ and
$p_{ij}(h) \approx v_i h \, r_{ij} = q_{ij}h$, i.e.
$P(h) \approx I + Qh$. By homogeneity,
$P(t+h) = P(t)P(h) \approx P(t)(I + Qh)$, and rearranging gives
$\left[P(t+h) - P(t)\right]/h \approx P(t)Q$.
Therefore, $P(t)$ is the solution to the ordinary differential
equation $P^\prime(t) = P(t) Q$ with the initial condition
$P(0) = I$. The solution to this is well known and uses the matrix
exponential function:

$$P(t) = e^{Qt} = \sum_{k=0}^\infty \frac{(Qt)^k}{k!}$$
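As a sketch (assuming NumPy and SciPy, with a hypothetical two-state rate matrix), the matrix exponential is available as `scipy.linalg.expm`:

```python
# Sketch: compute P(t) = exp(Qt) with SciPy's matrix exponential.
# The 2x2 rate matrix below is a hypothetical dry/wet example.
import numpy as np
from scipy.linalg import expm

Q = np.array([[-0.5, 0.5],
              [ 2.0, -2.0]])   # rows sum to zero

P = expm(Q * 1.0)              # P(t) at t = 1.0
print(P)
print(P.sum(axis=1))           # each row sums to 1
```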
Alternatively, consider a time length of $t$, subdivided into $m$
independent sections of length $t/m$. Since the subdivisions are
independent, the total transition matrix is the product of the
transition matrices of the individual subdivisions:

$$P(t) = \left[P\!\left(\frac{t}{m}\right)\right]^m \approx \left(I + \frac{Qt}{m}\right)^m \xrightarrow{\,m \to \infty\,} e^{Qt}$$
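A quick numerical check of this limit, under the same hypothetical rate matrix:

```python
# Numerical check of the subdivision argument: (I + Q t/m)^m -> exp(Qt).
import numpy as np
from scipy.linalg import expm

Q = np.array([[-0.5, 0.5],
              [ 2.0, -2.0]])
t, m = 1.0, 10_000
approx = np.linalg.matrix_power(np.eye(2) + Q * t / m, m)
print(np.max(np.abs(approx - expm(Q * t))))  # small, and shrinks as m grows
```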
Uniformization
The jump times of a Markov process do not, in general, form a Poisson
process because the rates of the holding times can vary depending on the
state of the process. However, we can construct a Poisson process, which
has a single homogeneous rate, that still produces a stochastic process
identical to the original Markov process. This is called
“uniformization”.
First we will choose a rate, $v$, that is at least as large as the
largest transition rate of the original process: $v \ge \max(v_i)$. We
will assume that there exists a Poisson process that affects stochastic
process $X$ with rate $v$. The probability that $X$ experiences
$k$ steps during a time span of length $t$ follows a Poisson
distribution:
$$\Pr(K = k; t) = \frac{(vt)^k e^{-vt}}{k!}$$
And the holding times between steps follow an exponential distribution:
$$f(t) = v e^{-vt}$$
Note that unlike our previous Markov chains, the rate of this Poisson
process is independent of $X(t)$.
Let
$\breve{Y} = \left\{\breve{Y}_k \in \Sigma : k = 0, 1, \ldots\right\}$
be a discrete-time Markov chain with initial state $\breve{Y}_0$ and
transition matrix
$$\breve{R} = \left[\breve{r}_{ij}\right] = I + \frac{Q}{v}$$
Since the Poisson process is independent of $X(t)$, we have introduced
virtual events into the transition matrix such that when a step occurs
there is a probability that $\breve{Y}$ will not change state.
Lemma. This uniformized process is identical to a Markov process using
transition rate matrix $Q$.
Proof. The number of steps that $\breve{Y}$ stays in the same state
follows a geometric distribution. If $\breve{Y}_n = i$ and $k$ is
the number of steps that $\breve{Y}$ stays unchanged, then

$$\Pr(K = k) = \breve{r}_{ii}^{\,k} \left(1 - \breve{r}_{ii}\right), \qquad \breve{r}_{ii} = 1 - \frac{v_i}{v}$$

A geometric number of exponential steps, each with rate $v$, is itself
exponentially distributed with rate $v(1 - \breve{r}_{ii}) = v_i$, so
the holding times match those of the original process. And when a real
(non-virtual) jump does occur, the probability of moving to state
$j \ne i$ is
$\breve{r}_{ij}/(1 - \breve{r}_{ii}) = (q_{ij}/v)/(v_i/v) = q_{ij}/v_i = r_{ij}$,
matching the original jump chain. $\blacksquare$
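The lemma can also be checked numerically: conditioning on the number of Poisson steps gives $P(t) = \sum_k \Pr(K = k; t)\, \breve{R}^k$, a Poisson mixture of powers of $\breve{R}$. A sketch with a truncated sum (hypothetical rates; any $v \ge \max_i v_i$ works):

```python
# Sketch verifying uniformization: P(t) equals a Poisson-weighted sum of
# powers of the uniformized jump matrix R_breve = I + Q/v (truncated).
import numpy as np
from scipy.linalg import expm
from scipy.stats import poisson

Q = np.array([[-0.5, 0.5],
              [ 2.0, -2.0]])
v = 2.5                               # any v >= max_i v_i works
R_breve = np.eye(2) + Q / v           # stochastic matrix with virtual events
t = 1.0

P_unif = sum(poisson.pmf(k, v * t) * np.linalg.matrix_power(R_breve, k)
             for k in range(200))
print(np.max(np.abs(P_unif - expm(Q * t))))   # agrees to numerical precision
```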
Stationary Distribution
Markov chains can have a unique stationary probability vector that is
also the limiting probability vector. (This is true for the Markov
models we use.) Let $\pi$ be the stationary probability (row) vector.
Then the following properties hold:

$$\pi P(t) = \pi \;\;\forall t \ge 0, \qquad \lim_{t \to \infty} p_{ij}(t) = \pi_j \;\;\forall i$$
This has two implications. One, if $X(t)$’s current marginal
probability is $\pi$, then the marginal probability will not change.
Two, if it is not $\pi$, then it will approach $\pi$ asymptotically.
Let $s > 0$. Then for all $s$,

$$\pi P(s) = \pi \implies \frac{d}{ds}\left[\pi P(s)\right] = 0 \implies \pi P(s) Q = 0 \implies \pi Q = 0$$

where the last step follows because $P(s) = e^{Qs}$ is invertible.
Therefore, the stationary distribution $\pi$ can be found by solving
the system of linear equations given by $\pi Q = 0$ and
$\sum_i \pi_i = 1$. Furthermore, consider that
$Q = \mathrm{diag}(\{v_i\}) (R - I)$ where $R$ is the jump matrix.
Let $w = [w_i]$ where $w_i = \pi_i v_i$, then
$$\begin{align}
\pi Q = 0 \implies & \pi \, \mathrm{diag}(\{v_i\}) (R - I) = 0\\\
\implies & w (R - I) = 0\\\
\implies & w R = w\\\
\end{align}$$
Therefore, the following three statements are equivalent
$$\begin{align}
\pi P(t) = & \pi\\\
\pi Q = & 0\\\
w R = & w\\\
\end{align}$$
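This equivalence suggests a direct way to compute $\pi$: solve $\pi Q = 0$ together with $\sum_i \pi_i = 1$ as one linear system. A minimal sketch, using the hypothetical two-state rates from above:

```python
# Sketch: solve pi Q = 0 with sum(pi) = 1 as a linear system.
import numpy as np

Q = np.array([[-0.5, 0.5],
              [ 2.0, -2.0]])

# Transpose pi Q = 0 to Q^T pi^T = 0, then append the normalization row.
A = np.vstack([Q.T, np.ones(2)])
b = np.array([0.0, 0.0, 1.0])
pi, *_ = np.linalg.lstsq(A, b, rcond=None)
print(pi)                 # [0.8, 0.2] for this example
print(pi @ Q)             # ~0
```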
Reversibility
A Markov process is considered time-reversible if for all $i$, $j$,
and $t \ge 0$,

$$\pi_i \, p_{ij}(t) = \pi_j \, p_{ji}(t)$$
Therefore, a continuous-time Markov process will be time-reversible if
$\pi_i q_{ij} = \pi_j q_{ji}$ for all $i$ and $j$.
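A sketch of this detailed-balance check, reusing the hypothetical example matrix and its $\pi$ (note that any two-state chain passes it):

```python
# Sketch: check detailed balance pi_i q_ij = pi_j q_ji for all i, j.
import numpy as np

Q = np.array([[-0.5, 0.5],
              [ 2.0, -2.0]])
pi = np.array([0.8, 0.2])

flux = pi[:, None] * Q                 # flux[i, j] = pi_i q_ij
print(np.allclose(flux, flux.T))       # True: this chain is reversible
```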
Application to Mutation
We can use a continuous-time Markov process to represent a mutation
process if we let each element of $\Sigma$ represent a different
allele and each jump represent a mutation. A few matrices will be
important in our calculations: the transition rate matrix $Q$; the
diagonal matrix of mutation rates,
$V = \mathrm{diag}(v_1, v_2, \ldots)$; and the scaled jump matrix
$S = Q + V = VR$.
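As a sketch of these matrices for a concrete mutation model, here is a Jukes-Cantor-style construction; the four-allele model and the rate $\mu$ are hypothetical choices, not from the text:

```python
# Sketch: Q, V, and S = Q + V for a Jukes-Cantor-style 4-allele mutation
# model (hypothetical rate mu); S = V R holds the off-diagonal jump rates.
import numpy as np

mu = 1.0
n = 4
Q = mu * (np.ones((n, n)) - n * np.eye(n)) / (n - 1)   # JC69-style rates

V = np.diag(-np.diag(Q))       # V = diag(v_1, ..., v_n), with v_i = -q_ii
S = Q + V                      # zero diagonal, off-diagonals of Q
R = S / np.diag(V)[:, None]    # recover the jump matrix R (rows sum to 1)
print(V, S, R, sep="\n\n")
```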
Expected Number of Mutations
The number of mutations that occur in a mutation process is the length
of the jump chain. Let $N(X, t)$ be the number of mutations that have
occurred in a Markov process by time $t$. Then, the expected number of
mutations is
$$E[N(X,t)] = \sum_{k=0}^\infty k \Pr(N(X, t) = k)$$
The event $\{N(X, t) = k\}$ can be expressed in terms of the jump
chain:
$$\{N(X, t) = k\} = \{J_k \le t < J_{k+1} \} = \{J_n \le J_k \le t < J_{k+1} \} = \{J_n \le t \} \cap \{J_k \le t < J_{k+1} \}$$
where $1 \le n \le k$. Using this, we will rewrite the expectation as

$$E[N(X,t)] = \sum_{n=1}^\infty \Pr(N(X,t) \ge n) = \sum_{n=1}^\infty \Pr(J_n \le t) = \int_0^t f(s)\, ds$$

where $f(s) = \sum_{n=1}^\infty f_n(s)$ is the probability density
function of there being a mutation at time $s$, and $f_n(s)$ is the
probability density function of the $n$-th jump time $J_n$. We can
calculate $f(s)$ through matrix multiplication and summing over the
result. Let $M(s, t)$ be a matrix such that the $i$-$j$-th element
represents the probability density of a mutation happening at time $s$
and $X(t) = j$ given that $X(0) = i$:

$$M(s, t) = P(s)\, S\, P(t - s)$$

At stationarity, summing over the result gives
$f(s) = \pi M(s, t) \mathbf{1} = \pi S \mathbf{1} = \sum_i \pi_i v_i$,
since $\pi P(s) = \pi$ and $P(t-s)\,\mathbf{1} = \mathbf{1}$.
Therefore,
$$E[N(X,t)] = \int_0^t f(s)\;ds = t \sum_i \pi_i v_i$$
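A Monte Carlo sketch of this identity, reusing the hypothetical two-state model and starting the chain from $\pi$:

```python
# Sketch: Monte Carlo check that E[N(X, t)] = t * sum_i pi_i v_i when the
# chain starts from its stationary distribution (two-state example above).
import numpy as np

rng = np.random.default_rng(7)
R = np.array([[0.0, 1.0], [1.0, 0.0]])
v = np.array([0.5, 2.0])
pi = np.array([0.8, 0.2])
t_max = 5.0

def n_jumps(y0):
    t, y, n = 0.0, y0, 0
    while True:
        t += rng.exponential(1.0 / v[y])   # holding time in state y
        if t >= t_max:
            return n
        y = rng.choice(2, p=R[y])          # jump to the next state
        n += 1

counts = [n_jumps(rng.choice(2, p=pi)) for _ in range(20_000)]
print(np.mean(counts), t_max * pi @ v)     # both ~4.0
```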
In phylogenetics and molecular evolution, we typically normalize the
Q-matrix such that $\sum_i \pi_i v_i = 1$, so that $t$ measures the
expected number of mutations. This is known as “substitution time”.
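A sketch of the normalization step (hypothetical rates): divide $Q$ by $\sum_i \pi_i v_i$ so that one unit of $t$ corresponds to one expected mutation.

```python
# Sketch: rescale Q so that sum_i pi_i v_i = 1, making t count expected
# mutations (substitution time). Uses the hypothetical 2-state example.
import numpy as np

Q = np.array([[-0.5, 0.5],
              [ 2.0, -2.0]])
pi = np.array([0.8, 0.2])

rate = pi @ -np.diag(Q)        # sum_i pi_i v_i, mutations per unit time
Q_norm = Q / rate
print(pi @ -np.diag(Q_norm))   # 1.0
```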
Probability of No Mutations
Let
$p^{(0)}_{ij}(t) = \Pr(X(t) = j \text{ and no mutations} | X(0) = i)$
and $P^{(0)}(t) = \left[p^{(0)}_{ij}(t)\right]$. Obviously, if
$i \ne j$, then there has been at least one mutation and
$p^{(0)}_{ij}(t) = 0$. If $i = j$, then we can calculate the
probability using the probability density function of $H_1$:

$$p^{(0)}_{ii}(t) = \Pr(H_1 > t) = e^{-v_i t}$$

Therefore, $P^{(0)}(t) = e^{-Vt}$.
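Since $P^{(0)}(t) = e^{-Vt}$ is diagonal, it can be computed directly; a small sketch with hypothetical rates:

```python
# Sketch: P0(t) = exp(-V t) is diagonal with entries exp(-v_i t).
import numpy as np
from scipy.linalg import expm

V = np.diag([0.5, 2.0])        # hypothetical mutation rates
t = 1.0
P0 = expm(-V * t)              # equals diag(exp(-v_i t))
print(np.diag(P0), np.exp(-np.diag(V) * t))
```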
Define $P^{(1)}(t)$ similarly to $P^{(0)}(t)$ as the joint
probability of $X(t)$ and exactly one mutation given $X(0)$. Recall
that $S = Q + V$ is the matrix containing the off-diagonal elements of
$Q$.
$$\begin{align}
P^{(1)}(t) = & \int_0^t e^{-Vs} S e^{-V(t-s)} ds\\\
= & \int_0^t e^{-V(t-s)} S e^{-Vs} ds\\\
= & e^{-Vt} \int_0^t e^{Vs} S e^{-Vs} ds\\\
\end{align}$$
Therefore, $p^{(1)}_{ii}(t) = 0$, and if $i \ne j$,

$$p^{(1)}_{ij}(t) = \int_0^t e^{-v_i s}\, s_{ij}\, e^{-v_j (t-s)}\, ds = s_{ij}\, \frac{e^{-v_j t} - e^{-v_i t}}{v_i - v_j}$$

when $v_i \ne v_j$; if $v_i = v_j$, the integral reduces to
$p^{(1)}_{ij}(t) = s_{ij}\, t\, e^{-v_i t}$.
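A sketch checking this closed form against numerical quadrature of the integrand, with hypothetical values for $v_i$, $v_j$, and $s_{ij}$:

```python
# Sketch: closed form for p1_ij(t) (v_i != v_j) checked against quadrature
# of the integrand exp(-v_i s) * s_ij * exp(-v_j (t - s)).
import numpy as np
from scipy.integrate import quad

v_i, v_j, s_ij, t = 0.5, 2.0, 0.5, 1.0

closed = s_ij * (np.exp(-v_j * t) - np.exp(-v_i * t)) / (v_i - v_j)
numeric, _ = quad(lambda s: np.exp(-v_i * s) * s_ij * np.exp(-v_j * (t - s)),
                  0.0, t)
print(closed, numeric)          # agree
```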