Introduction

11.3.1 Introduction

So far, we have discussed discrete-time Markov chains in which the chain jumps from the current state to the next state after one unit time. That is, the time that the chain spends in each state is a positive integer. It is equal to $1$ if the state does not have a self-transition ($p_{ii}=0$), or it is a $Geometric(1-p_{ii})$ random variable if $p_{ii}>0$. Here, we would like to discuss continuous-time Markov chains where the time spent in each state is a continuous random variable.

More specifically, we will consider a random process $\{X(t), t \in [0, \infty) \}$. Again, we assume that we have a countable state space $S \subset \{0,1,2,\cdots\}$. If $X(0)=i$, then $X(t)$ stays in state $i$ for a random amount of time, say $T_1$, where $T_1$ is a continuous random variable. At time $T_1$, the process jumps to a new state $j$ and will spend a random amount of time $T_2$ in that state, and so on. As it will be clear shortly, the random variables $T_1$, $T_2$, $\cdots$ have exponential distribution. The probability of going from state $i$ to state $j$ is shown by $p_{ij}$.

Similar to discrete-time Markov chains, we would like to have the Markov property, i.e., conditioned on the current value of $X(t)$, the past and the future values of the process must be independent. We can express the Markov property as follows: for all $0 \leq t_1 \lt t_2 \lt \cdots \lt t_n \lt t_{n+1}$, we must have \begin{align*} P\big(X(t_{n+1})=j|X(t_{n})=i, X(t_{n-1})=i_{n-1}, \cdots, X(t_1)=i_1\big)= P\big(X(t_{n+1})=j|X(t_{n})=i\big). \end{align*}

In particular, suppose that at time $t$, we know that $X(t)=i$. To make any prediction about the future, it should not matter how long the process has been in state $i$. Thus, the time that the process spends in each state must have a "memoryless" property. As it has been discussed in previous chapters (e.g, Chapter 4), the exponential distribution is the only continuous distribution with that property. Thus, the time that a continuous-time Markov chain spends in state $i$ (called the holding time) will have $Exponential(\lambda_i)$ distribution, where $\lambda_i$ is a nonnegative real number. We further assume that the $\lambda_i$'s are bounded, i.e., there exists a real number $M \lt \infty$ such that $\lambda_i \lt M$, for all $i \in S$.

Thus, a continuous Markov chain has two components. First, we have a discrete-time Markov chain, called the jump chain or the the embedded Markov chain, that gives us the transition probabilities $p_{ij}$. Second, for each state we have a holding time parameter $\lambda_i$ that controls the amount of time spent in each state.

Note that if $i$ is not an absorbing state, we can assume that $i$ does not have a self-transition, i.e., $p_{ii}=0$. The reason is that if we go from state $i$ to state $i$, it's as if we never left that state. On the other hand, if $i$ is an absorbing state, we have $p_{ii}=1$, and $p_{ij}=0$, for all $i \neq j$. In this case, we have $\lambda_i=0$, which means that the chain will spend an infinite amount of time in the absorbing state $i$.

Continuous-Time Markov Chains
A continuous-time Markov chain $X(t)$ is defined by two components: a jump chain, and a set of holding time parameters $\lambda_i$. The jump chain consists of a countable set of states $S \subset \{0,1,2,\cdots\}$ along with transition probabilities $p_{ij}$. We assume $p_{ii}=0$, for all non-absorbing states $i \in S$. We assume

if $X(t)=i$, the time until the state changes has $Exponential(\lambda_i)$ distribution;
if $X(t)=i$, the next state will be $j$ with probability $p_{ij}$.

The process satisfies the Markov property. That is, for all $0 \leq t_1 \lt t_2 \lt \cdots \lt t_n \lt t_{n+1}$, we have \begin{align*} &P\bigg(X(t_{n+1})=j\bigg{|} X(t_{n})=i, X(t_{n-1})=i_{n-1}, \cdots, X(t_1)=i_1\bigg)\\ &\hspace{75pt} = P\big(X(t_{n+1})=j \big{|} X(t_{n})=i\big). \end{align*}

Let's define the transition probability $P_{ij}(t)$ as \begin{align*} P_{ij}(t)&=P(X(t+s)=j | X(s)=i)\\ &=P(X(t)=j|X(0)=i), \quad \textrm{ for all }s,t \in [0, \infty). \end{align*} We can then define the transition matrix, $P(t)$. Assuming the states are $1$, $2$, $\cdots$, $r$, then the state transition matrix for any $t \geq 0$ is given by \begin{equation} \nonumber P(t) = \begin{bmatrix} p_{11}(t) & p_{12}(t) & ... & p_{1r}(t) \\%[5pt] p_{21}(t) & p_{22}(t) & ... & p_{2r}(t) \\%[5pt] . & . & . & .\\[-10pt] . & . & . & . \\[-10pt] . & . & . & . \\[5pt] p_{r1}(t) & p_{r2}(t) & ... & p_{rr}(t) \end{bmatrix}. \end{equation} Let's look at an example.

Example
Consider a continuous Markov chain with two states $S=\{0, 1\}$. Assume the holding time parameters are given by $\lambda_0=\lambda_1=\lambda>0$. That is, the time that the chain spends in each state before going to the other state has an $Exponential(\lambda)$ distribution.

Draw the state diagram of the embedded (jump) chain.
Find the transition matrix $P(t)$. Hint: You might want to use the following identities \begin{align*} \sinh (x)&= \frac{e^x-e^{-x}}{2}= \sum_{n=0}^\infty \frac{x^{2n+1}}{(2n+1)!},\\ \cosh (x)&= \frac{e^x+e^{-x}}{2} = \sum_{n=0}^\infty \frac{x^{2n}}{(2n)!}. \end{align*}

Solution
- 1. There are two states in the chain and none of them are absorbing (since $\lambda_i > 0$). Since we do not allow self-transitions, the jump chain must have the following transition matrix: \begin{equation} \nonumber P = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}. \end{equation} The state transition diagram of the jump chain is shown in Figure 11.22.
  2. This Markov chain has a simple structure. Let's find $P_{00}(t)$. By definition \begin{align*} P_{00}(t)=P(X(t)=0|X(0)=0), \quad \textrm{ for all }t \in [0, \infty). \end{align*} Assuming that $X(0)=0$, $X(t)$ will be $0$ if and only if we have an even number of transitions in the time interval $[0,t]$. The time between each transition is an $Exponential(\lambda)$ random variable. Thus, the transitions occur according to a Poisson process with parameter $\lambda$. We have \begin{align*} P_{00}(t)&=P(X(t)=0|X(0)=0)\\ &=P\big(\textrm{an even number of arrivals in }[0,t]\big)\\ &=\sum_{n=0}^\infty e^{-\lambda t} \frac{(\lambda t)^{2n}}{(2n)!}\\ &=e^{-\lambda t} \sum_{n=0}^\infty \frac{(\lambda t)^{2n}}{(2n)!}\\ &= e^{-\lambda t} \left[\frac{e^{\lambda t}+e^{-\lambda t}}{2}\right] & (\textrm{by the hint})\\ &=\frac{1}{2}+\frac{1}{2}e^{-2\lambda t}. \end{align*} Next, we obtain \begin{align*} P_{01}(t)&=1-P_{00}(t)\\ &=\frac{1}{2}-\frac{1}{2}e^{-2\lambda t}. \end{align*} Finally, because of the symmetry in the problem, we have \begin{align*} P_{11}(t)&=P_{00}(t)\\ &=\frac{1}{2}+\frac{1}{2}e^{-2\lambda t}, \end{align*} \begin{align*} P_{10}(t)&=P_{01}(t)\\ &=\frac{1}{2}-\frac{1}{2}e^{-2\lambda t}. \end{align*} Thus, the transition matrix for any $t \geq 0$ is given by \begin{equation} \nonumber P(t) = \begin{bmatrix} \frac{1}{2}+\frac{1}{2}e^{-2\lambda t} & \frac{1}{2}-\frac{1}{2}e^{-2\lambda t} \\[5pt] \frac{1}{2}-\frac{1}{2}e^{-2\lambda t} & \frac{1}{2}+\frac{1}{2}e^{-2\lambda t} \\[5pt] \end{bmatrix}. \end{equation}

If we let $t$ go to infinity in the above example, the transition matrix becomes \begin{equation} \nonumber \lim_{t\rightarrow \infty} P(t) = \begin{bmatrix} \frac{1}{2} & \frac{1}{2} \\[5pt] \frac{1}{2} & \frac{1}{2} \\[5pt] \end{bmatrix}, \end{equation} which means that for $i=0, 1$, we have \begin{align*} &\lim_{t \rightarrow \infty} P(X(t)=0 |X(0)=i)=\frac{1}{2},\\ &\lim_{t \rightarrow \infty} P(X(t)=1 |X(0)=i)=\frac{1}{2}. \end{align*} In fact, $\pi=[\frac{1}{2}, \frac{1}{2}]$ is a limiting distribution of this chain. We will discuss limiting distributions shortly. Before doing that let's talk about some properties of the transition matrices. First, similar to the discrete-time analysis, the rows of the transition matrix must sum to $1$: \begin{align*} \sum_{j \in S} p_{ij}(t)=1,\quad \textrm{ for all } t \geq 0. \end{align*} Next, we note that by definition \begin{align*} P_{ii}(0)&=P(X(0)=i|X(0)=i)\\ &=1, \quad \textrm{ for all } i. \end{align*} \begin{align*} P_{ij}(0)&=P(X(0)=j|X(0)=i)\\ &=0, \quad \textrm{ for all } i \neq j. \end{align*} We conclude that $P(0)$ is equal to the identity matrix, $P(0)=I$. Finally, we can obtain the Chapman-Kolmogorov equation by applying the law of total probability and by using the Markov property: \begin{align*} P_{ij}(s+t)&=P(X(s+t)=j|X(0)=i)\\ &=\sum_{k \in S} P(X(s)=k|X(0)=i) P(X(s+t)=j|X(s)=k)\\ &=\sum_{k \in S} P_{ik}(s) P_{kj}(t), \quad \textrm{ for all }s,t \geq 0. \end{align*} The above equation can be written in the matrix form as follows \begin{align*} P(s+t)=P(s)P(t), \quad \textrm{ for all }s,t \geq 0. \end{align*}

Transition Matrix
For a continuous-time Markov chain, we define the transition matrix$P(t)$. The $(i,j)$th entry of the transition matrix is given by \begin{align*} P_{ij}(t)&=P(X(t)=j|X(0)=i). \end{align*} The transition matrix satisfies the following properties:

$P(0)$ is equal to the identity matrix, $P(0)=I$;
the rows of the transition matrix must sum to $1$, \begin{align*} \sum_{j \in S} p_{ij}(t)=1,\quad \textrm{ for all } t \geq 0; \end{align*}
for all $s,t \geq 0$, we have \begin{align*} P(s+t)=P(s)P(t). \end{align*}

← previous

The print version of the book is available on Amazon.

Practical uncertainty: Useful Ideas in Decision-Making, Risk, Randomness, & AI