10.1.4 Stationary Processes

We can classify random processes based on many different criteria. One of the important questions that we can ask about a random process is whether it is a stationary process. Intuitively, a random process $\big\{X(t), t \in J \big\}$ is stationary if its statistical properties do not change with time. For example, for a stationary process, $X(t)$ and $X(t+\Delta)$ have the same probability distributions. In particular, we have \begin{align}%\label{} F_{X(t)}(x)=F_{X(t+\Delta)}(x), \quad \textrm{ for all }t, t+\Delta \in J. \end{align} More generally, for a stationary process, the joint distribution of $X(t_1)$ and $X(t_2)$ is the same as the joint distribution of $X(t_1+\Delta)$ and $X(t_2+\Delta)$. For example, if you have a stationary process $X(t)$, then \begin{align}%\label{} P\Bigg(\Big(X(t_1),X(t_2)\Big) \in A\Bigg)= P\Bigg(\Big(X(t_1+\Delta),X(t_2+\Delta)\Big) \in A\Bigg), \end{align} for any set $A \subseteq \mathbb{R}^2$. In sum, a random process is stationary if a time shift does not change its statistical properties. Here is a formal definition of stationarity for continuous-time processes.
A continuous-time random process $\big\{X(t), t \in \mathbb{R}\big\}$ is strict-sense stationary or simply stationary if, for all $t_1,t_2,\cdots, t_r \in \mathbb{R}$ and all $\Delta \in \mathbb{R}$, the joint CDF of \begin{align}%\label{} X(t_1), X(t_2), \cdots, X(t_r) \end{align} is the same as the joint CDF of \begin{align}%\label{} X(t_1+\Delta), X(t_2+\Delta), \cdots, X(t_r+\Delta). \end{align} That is, for all real numbers $x_1, x_2,\cdots, x_r$, we have \begin{align}%\label{} F_{X(t_1) X(t_2) \cdots X(t_r)}(x_1,x_2,\cdots, x_r)= F_{X(t_1+\Delta) X(t_2+\Delta) \cdots X(t_r+\Delta)}(x_1,x_2,\cdots, x_r). \end{align}
We can provide a similar definition for discrete-time processes.
A discrete-time random process $\big\{X(n), n \in \mathbb{Z}\big\}$ is strict-sense stationary or simply stationary if, for all $n_1,n_2,\cdots, n_r \in \mathbb{Z}$ and all $D \in \mathbb{Z}$, the joint CDF of \begin{align}%\label{} X(n_1), X(n_2), \cdots, X(n_r) \end{align} is the same as the joint CDF of \begin{align}%\label{} X(n_1+D), X(n_2+D), \cdots, X(n_r+D). \end{align} That is, for all real numbers $x_1, x_2,\cdots, x_r$, we have \begin{align}%\label{} F_{X(n_1) X(n_2) \cdots X(n_r)}(x_1,x_2,\cdots, x_r)= F_{X(n_1+D) X(n_2+D) \cdots X(n_r+D)}(x_1,x_2,\cdots, x_r). \end{align}


Example
Consider the discrete-time random process $\big\{X(n), n \in \mathbb{Z}\big\}$, in which the $X(n)$'s are i.i.d. with CDF $F_{X(n)}(x)=F(x)$. Show that this is a (strict-sense) stationary process.
  • Solution
    • Intuitively, since the $X(n)$'s are i.i.d., we expect that as time evolves the probabilistic behavior of the process does not change. Therefore, this must be a stationary process. To show this rigorously, we can argue as follows. For all real numbers $x_1, x_2,\cdots, x_r$ and all distinct integers $n_1$, $n_2$,$\cdots$, $n_r$, we have \begin{align}%\label{} &F_{X(n_1) X(n_2) \cdots X(n_r)}(x_1,x_2,\cdots, x_r)\\ & \quad =F_{X(n_1)}(x_1) F_{X(n_2)}(x_2) \cdots F_{X(n_r)}(x_r) \quad \textrm{ (since the $X(n_i)$'s are independent)}\\ & \quad =F(x_1)F(x_2) \cdots F(x_r) \quad \textrm{ (since $F_{X(n_i)}(x)=F(x)$)}. \end{align} We also have \begin{align}%\label{} &F_{X(n_1+D) X(n_2+D)\cdots X(n_r+D)}(x_1,x_2,\cdots, x_r)\\ & \quad =F_{X(n_1+D)}(x_1) F_{X(n_2+D)}(x_2) \cdots F_{X(n_r+D)}(x_r) \quad \textrm{ (since the $X(n_i+D)$'s are independent)}\\ & \quad =F(x_1)F(x_2) \cdots F(x_r) \quad \textrm{ (since $F_{X(n_i+D)}(x)=F(x)$)}. \end{align} Since the two joint CDFs are equal for every choice of the $n_i$'s and $D$, we conclude that $X(n)$ is a strict-sense stationary process.
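The shift invariance in this example can also be seen numerically. The following sketch (an illustration, not part of the proof) assumes standard normal marginals for $F$ and checks that a joint probability involving $(X(2), X(5))$ is essentially unchanged after shifting both indices by $D=7$.

```python
import numpy as np

# Monte Carlo sanity check: for an i.i.d. process X(n), here with assumed
# standard normal marginals, the empirical joint probability for (X(2), X(5))
# should match that for (X(2+D), X(5+D)) for any shift D.
rng = np.random.default_rng(0)
num_paths, path_len, D = 200_000, 20, 7
X = rng.standard_normal((num_paths, path_len))  # each row is one realization

# Estimate P(X(n1) <= 0.5, X(n2) <= -0.3) before and after the shift.
a, b = 0.5, -0.3
p_orig = np.mean((X[:, 2] <= a) & (X[:, 5] <= b))
p_shift = np.mean((X[:, 2 + D] <= a) & (X[:, 5 + D] <= b))

print(round(p_orig, 3), round(p_shift, 3))  # the two estimates agree closely
```

By independence, both probabilities equal $\Phi(0.5)\,\Phi(-0.3) \approx 0.264$, and the two Monte Carlo estimates match up to sampling noise.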


In practice, it is desirable if a random process $X(t)$ is stationary. In particular, if a process is stationary, then its analysis is usually simpler as the probabilistic properties do not change by time. For example, suppose that you need to do forecasting about the future of a process $X(t)$. If you know the process is stationary, you can observe the past, which will normally give you a lot of information about how the process will behave in the future.

However, it turns out that many real-life processes are not strict-sense stationary. Even if a process is strict-sense stationary, it might be difficult to prove it. Fortunately, it is often enough to show a "weaker" form of stationarity than the one defined above.

Weak-Sense Stationary Processes:

Here, we define one of the most common forms of stationarity that is widely used in practice. A random process is called weak-sense stationary or wide-sense stationary (WSS) if its mean function and its correlation function do not change by shifts in time. More precisely, $X(t)$ is WSS if, for all $t_1,t_2\in \mathbb{R}$ and all $\Delta \in \mathbb{R}$,
  1. $E[X(t_1)]=E[X(t_2)]$,
  2. $E[X(t_1)X(t_2)]=E[X(t_1+\Delta)X(t_2+\Delta)]$.
Note that the first condition states that the mean function $\mu_X(t)$ is not a function of time, $t$, thus we can write $\mu_X(t)=\mu_X$. The second condition states that the correlation function $R_X(t_1,t_2)$ is only a function of $\tau=t_1-t_2$, and not $t_1$ and $t_2$ individually. Thus, we can write $R_X(t_1,t_2)=R_X(t_1-t_2)=R_X(\tau)$. Therefore, we can provide the following definition.
A continuous-time random process $\big\{X(t), t \in \mathbb{R}\big\}$ is weak-sense stationary or wide-sense stationary (WSS) if
  1. $\mu_X(t)=\mu_X$, for all $t\in \mathbb{R}$,
  2. $R_X(t_1,t_2)=R_X(t_1-t_2)$, for all $t_1,t_2 \in \mathbb{R}$.
We can provide a similar definition for discrete-time WSS processes.
A discrete-time random process $\big\{X(n), n \in \mathbb{Z}\big\}$ is weak-sense stationary or wide-sense stationary (WSS) if
  1. $\mu_X(n)=\mu_X$, for all $n\in \mathbb{Z}$,
  2. $R_X(n_1,n_2)=R_X(n_1-n_2)$, for all $n_1,n_2 \in \mathbb{Z}$.


Example
Consider the random process $\big\{X(t), t \in \mathbb{R}\big\}$ defined as \begin{align}%\label{} X(t)=\cos (t+U), \end{align} where $U \sim Uniform(0,2\pi)$. Show that $X(t)$ is a WSS process.
  • Solution
    • We need to check two conditions:
      1. $\mu_X(t)=\mu_X$, for all $t\in \mathbb{R}$, and
      2. $R_X(t_1,t_2)=R_X(t_1-t_2)$, for all $t_1,t_2 \in \mathbb{R}$.
      We have \begin{align}%\label{} \mu_X(t) &=E[X(t)]\\ &=E[\cos(t+U)]\\ &=\int_{0}^{2\pi} \cos(t+u) \frac{1}{2\pi} \; du \\ &=0, \quad \textrm{ for all $t\in \mathbb{R}$.} \end{align} We can also find $R_X(t_1,t_2)$ as follows \begin{align}%\label{} R_X(t_1,t_2) &=E[X(t_1)X(t_2)]\\ &=E[\cos(t_1+U) \cos(t_2+U)]\\ &=E\left[\frac{1}{2} \cos(t_1+t_2+2U)+\frac{1}{2} \cos(t_1-t_2) \right]\\ &=E\left[\frac{1}{2} \cos(t_1+t_2+2U)\right]+E\left[\frac{1}{2} \cos(t_1-t_2) \right]\\ &=\frac{1}{2}\int_{0}^{2\pi} \cos(t_1+t_2+2u) \frac{1}{2\pi} \; du + \frac{1}{2} \cos(t_1-t_2)\\ &=0+ \frac{1}{2} \cos(t_1-t_2)\\ &=\frac{1}{2} \cos(t_1-t_2), \quad \textrm{ for all $t_1,t_2 \in \mathbb{R}$.} \end{align} As we see, both conditions are satisfied, thus $X(t)$ is a WSS process.
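The two WSS conditions we just verified analytically can also be checked by simulation. The following sketch draws many samples of $U$ and estimates the mean at one time point and the correlation at two pairs of times that share the same difference $t_1-t_2$.

```python
import numpy as np

# Monte Carlo check of the two WSS conditions for X(t) = cos(t + U),
# U ~ Uniform(0, 2*pi): the mean should be ~0 for every t, and
# E[X(t1)X(t2)] should be ~cos(t1 - t2)/2, independent of t1, t2 individually.
rng = np.random.default_rng(1)
U = rng.uniform(0.0, 2.0 * np.pi, size=1_000_000)

def X(t):
    return np.cos(t + U)  # one sample of X(t) per realization of U

mean_at_3 = X(3.0).mean()            # should be close to 0
R_1_2 = np.mean(X(1.0) * X(2.0))     # t1 - t2 = -1
R_5_6 = np.mean(X(5.0) * X(6.0))     # same difference t1 - t2 = -1
print(round(mean_at_3, 3), round(R_1_2, 3), round(R_5_6, 3))
```

Both correlation estimates should be close to $\frac{1}{2}\cos(1) \approx 0.27$, matching the derivation above.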


Since for WSS random processes, $R_X(t_1,t_2)=R_X(t_1-t_2)$, we usually denote the correlation function by $R_X(\tau)$, where $\tau=t_1-t_2$. Thus, for a WSS process, we can write \begin{align}\label{eq:RxWSS} R_X(\tau)=E[X(t)X(t-\tau)]=E[X(t+\tau)X(t)] \hspace{30pt} (10.1) \end{align} As we will see in Section 10.2, $R_X(\tau)$ is a very useful tool when we do frequency domain analysis. Here, we would like to study some properties of $R_X(\tau)$ for WSS signals. Let $\big\{X(t), t \in \mathbb{R}\big\}$ be a WSS process with correlation function $R_X(\tau)$. Then, we can write \begin{align}%\label{} R_X(0)=E[X(t)^2]. \end{align} The quantity $E[X(t)^2]$ is called the expected (average) power in $X(t)$ at time $t$. For a WSS process, the expected power is not a function of time. Since $X(t)^2 \geq 0$, we conclude that $R_X(0)\geq 0$.
\begin{align}%\label{} R_X(0)=E[X(t)^2] \geq 0 \end{align}
Next, let's consider $R_X(-\tau)$. We have \begin{align}%\label{} R_X(-\tau) &=E[X(t)X(t+\tau)] \quad \big(\textrm{by definition (Equation 10.1)}\big) \\ &=E[X(t+\tau)X(t)] \\ &=R_X(\tau) \quad \textrm{(Equation 10.1)} \end{align} Thus, we conclude that $R_X(\tau)$ is an even function.
\begin{align}%\label{} R_X(\tau)=R_X(-\tau), \quad \textrm{for all } \tau \in \mathbb{R}. \end{align}
Finally, we would like to show that $R_X(\tau)$ takes its maximum value at $\tau=0$. That is, $X(t)$ and $X(t+\tau)$ have the highest correlation when $\tau=0$.
\begin{align}%\label{} |R_X(\tau)|\leq R_X(0), \quad \textrm{for all } \tau \in \mathbb{R}. \end{align}
The proof can be done using the Cauchy-Schwarz inequality: For any two random variables $X$ and $Y$, we have \begin{align} |EXY| \leq \sqrt{E[X^2] E[Y^2]}, \end{align} where equality holds if and only if $X=\alpha Y$ for some constant $\alpha \in \mathbb{R}$. Now, if we choose $X=X(t)$ and $Y=X(t-\tau)$, we obtain \begin{align}%\label{} |E[X(t)X(t-\tau)]| &\leq \sqrt{E[X(t)^2] E[X(t-\tau)^2]}\\ &=\sqrt{R_X(0)R_X(0)}\\ &=R_X(0). \end{align} Therefore, we conclude that $|R_X(\tau)|\leq R_X(0)$. Considering these properties, Figure 10.4 shows some possible shapes for $R_X(\tau)$.
Figure 10.4 - Some possible shapes for $R_X(\tau)$.
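To illustrate the three properties numerically, the following sketch estimates $R_X(\tau)$ on a grid of $\tau$ values for the WSS process $X(t)=\cos(t+U)$ from the earlier example, for which $R_X(\tau)=\frac{1}{2}\cos \tau$, and checks nonnegativity at zero, evenness, and the maximum at $\tau=0$.

```python
import numpy as np

# Numerical illustration of the three properties of R_X(tau), using the WSS
# process X(t) = cos(t + U), whose correlation function is cos(tau)/2.
rng = np.random.default_rng(2)
U = rng.uniform(0.0, 2.0 * np.pi, size=500_000)
taus = np.arange(-100, 101) / 10.0    # tau grid from -10 to 10 in steps of 0.1
# By wide-sense stationarity, the choice of t is irrelevant; take t = 0 in
# R_X(tau) = E[X(t)X(t - tau)].
R = np.array([np.mean(np.cos(U) * np.cos(U - tau)) for tau in taus])

R0 = R[100]  # estimate of R_X(0)
print(bool(R0 >= 0))                              # R_X(0) >= 0
print(bool(np.allclose(R, R[::-1], atol=0.02)))   # even: R_X(tau) = R_X(-tau)
print(bool(np.all(np.abs(R) <= R0 + 0.02)))       # |R_X(tau)| <= R_X(0)
```

Here $R_X(0) \approx \frac{1}{2}$, the expected power of $X(t)$, and all three checks hold up to Monte Carlo noise.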

Jointly Wide-Sense Stationary Processes:

We often work with multiple random processes, so we extend the concept of wide-sense stationarity to more than one process. More specifically, we can talk about jointly wide-sense stationary processes.
Two random processes $\big\{X(t), t \in \mathbb{R}\big\}$ and $\big\{Y(t), t \in \mathbb{R}\big\}$ are said to be jointly wide-sense stationary if
  1. $X(t)$ and $Y(t)$ are each wide-sense stationary.
  2. $R_{XY}(t_1,t_2)=R_{XY}(t_1-t_2)$.


Example
Let $X(t)$ and $Y(t)$ be two jointly WSS random processes. Consider the random process $Z(t)$ defined as \begin{align*} Z(t) &=X(t)+Y(t). \end{align*} Show that $Z(t)$ is WSS.
  • Solution
    • Since $X(t)$ and $Y(t)$ are jointly WSS, we conclude
      1. $\mu_X(t)=\mu_X$, $\mu_Y(t)=\mu_Y$,
      2. $R_X(t_1,t_2)=R_X(t_1-t_2)$, $R_Y(t_1,t_2)=R_Y(t_1-t_2)$,
      3. $R_{XY}(t_1,t_2)=R_{XY}(t_1-t_2)$.
      Therefore, we have \begin{align*}%\label{} \mu_Z(t)&= E[X(t)+Y(t)]\\ &=E[X(t)]+E[Y(t)]\\ &=\mu_X+\mu_Y. \end{align*} \begin{align*}%\label{} R_Z(t_1,t_2)&= E\left[\big(X(t_1)+Y(t_1)\big) \big(X(t_2)+Y(t_2)\big) \right]\\ &=E[X(t_1)X(t_2)]+E[X(t_1)Y(t_2)]+E[Y(t_1)X(t_2)]+E[Y(t_1)Y(t_2)]\\ &=R_X(t_1-t_2)+R_{XY}(t_1-t_2)+R_{YX}(t_1-t_2)+R_{Y}(t_1-t_2). \end{align*} Since $\mu_Z(t)$ does not depend on $t$ and $R_Z(t_1,t_2)$ is only a function of $t_1-t_2$ (note that $R_{YX}(t_1,t_2)=R_{XY}(t_2,t_1)=R_{XY}(t_2-t_1)$ is also a function of $t_1-t_2$), we conclude that $Z(t)$ is WSS.
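As a numerical sanity check of this example, we can take a concrete pair of jointly WSS processes, say $X(t)=\cos(t+U)$ and $Y(t)=\sin(t+U)$ with a shared $U \sim Uniform(0,2\pi)$ (our choice for illustration, not from the text), and verify that $Z(t)=X(t)+Y(t)$ has a constant mean and a correlation depending only on $t_1-t_2$.

```python
import numpy as np

# Check that Z(t) = X(t) + Y(t) is WSS for the jointly WSS pair
# X(t) = cos(t + U), Y(t) = sin(t + U), with a shared U ~ Uniform(0, 2*pi).
rng = np.random.default_rng(3)
U = rng.uniform(0.0, 2.0 * np.pi, size=1_000_000)

def Z(t):
    return np.cos(t + U) + np.sin(t + U)

mu_1, mu_4 = Z(1.0).mean(), Z(4.0).mean()   # both should be ~0
R_0_1 = np.mean(Z(0.0) * Z(1.0))            # t1 - t2 = -1
R_3_4 = np.mean(Z(3.0) * Z(4.0))            # same difference t1 - t2 = -1
print(round(mu_1, 3), round(mu_4, 3), round(R_0_1, 3), round(R_3_4, 3))
```

For this particular pair one can show $R_Z(\tau)=\cos \tau$, so both correlation estimates should be near $\cos(1) \approx 0.54$.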


Cyclostationary Processes:

Some practical random processes have a periodic structure. That is, the statistical properties are repeated every $T$ units of time (e.g., every $T$ seconds). In other words, the random variables \begin{align}%\label{} X(t_1), X(t_2), \cdots, X(t_r) \end{align} have the same joint CDF as the random variables \begin{align}%\label{} X(t_1+T), X(t_2+T), \cdots, X(t_r+T). \end{align} Such random processes are called cyclostationary. For example, consider the random process $\big\{X(t), t \in \mathbb{R}\big\}$ defined as \begin{align}%\label{} X(t)=A \cos (\omega t), \end{align} where $A$ is a random variable. Here, we have \begin{align}%\label{} X\left(t+\frac{2 \pi}{\omega}\right)&=A \cos (\omega t+2 \pi)\\ &=A \cos (\omega t)=X(t). \end{align} We conclude $X(t)$ is in fact a periodic signal with period $T=\frac{2 \pi}{\omega}$. Therefore, the statistical properties of $X(t)$ do not change by shifting the time by $T$ units, so $X(t)$ is a cyclostationary random process with period $T=\frac{2 \pi}{\omega}$. Similarly, we can define wide-sense cyclostationary random processes.
A continuous-time random process $\big\{X(t), t \in \mathbb{R}\big\}$ is cyclostationary if there exists a positive real number $T$ such that, for all $t_1,t_2,\cdots, t_r \in \mathbb{R}$, the joint CDF of \begin{align}%\label{} X(t_1), X(t_2), \cdots, X(t_r) \end{align} is the same as the joint CDF of \begin{align}%\label{} X(t_1+T), X(t_2+T), \cdots, X(t_r+T). \end{align}


A continuous-time random process $\big\{X(t), t \in \mathbb{R}\big\}$ is weak-sense cyclostationary or wide-sense cyclostationary if there exists a positive real number $T$ such that
  1. $\mu_X(t+T)=\mu_X(t)$, for all $t\in \mathbb{R}$;
  2. $R_X(t_1+T,t_2+T)=R_X(t_1,t_2)$, for all $t_1,t_2 \in \mathbb{R}$.
Similarly, you can define cyclostationary discrete-time processes. For example, a discrete-time random process $\big\{X(n), n \in \mathbb{Z}\big\}$ is wide-sense cyclostationary if there exists $M \in \mathbb{N}$ such that
  1. $\mu_X(n+M)=\mu_X(n)$, for all $n\in \mathbb{Z}$;
  2. $R_X(n_1+M,n_2+M)=R_X(n_1,n_2)$, for all $n_1,n_2 \in \mathbb{Z}$.
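A quick simulation of the continuous-time example $X(t)=A\cos(\omega t)$ illustrates wide-sense cyclostationarity; the $Normal(2,1)$ amplitude distribution below is an arbitrary choice for illustration.

```python
import numpy as np

# Check that X(t) = A*cos(w*t), with a random amplitude A (assumed here to be
# Normal(2, 1)), is wide-sense cyclostationary with period T = 2*pi/w:
# shifting both time arguments by T leaves the mean and correlation unchanged.
rng = np.random.default_rng(4)
A = rng.normal(2.0, 1.0, size=500_000)
w = 3.0
T = 2.0 * np.pi / w

def X(t):
    return A * np.cos(w * t)

mu_gap = abs(X(0.7).mean() - X(0.7 + T).mean())
R_gap = abs(np.mean(X(0.2) * X(1.1)) - np.mean(X(0.2 + T) * X(1.1 + T)))
print(bool(mu_gap < 1e-9), bool(R_gap < 1e-9))
```

Because each sample path satisfies $X(t+T)=X(t)$ exactly, the shifted and unshifted statistics agree up to floating-point rounding; note that $\mu_X(t)=E[A]\cos(\omega t)$ still varies with $t$, so this process is cyclostationary but not WSS in general.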

Derivatives and Integrals of Random Processes:

Many real-life systems are described by differential equations. To analyze such systems when randomness is involved, we often need to differentiate or integrate the random processes that are present in the system. You have seen concepts such as continuity, differentiability, and integrability in calculus for deterministic signals (deterministic functions). Here, we need to extend those concepts to random processes. Without going much into mathematical technicalities, here we would like to provide some guidelines on how to deal with derivatives and integrals of random processes.

Let $X(t)$ be a continuous-time random process. We say that $X(t)$ is mean-square continuous at time $t$ if \begin{align*} \lim_{\delta\rightarrow 0} E\bigg[\big|X(t+\delta)-X(t)\big|^2\bigg]=0. \end{align*} Note that mean-square continuity does not mean that every possible realization of $X(t)$ is a continuous function. It roughly means that the difference $X(t+\delta)-X(t)$ is small on average.



Example
The Poisson process is discussed in detail in Chapter 11. If $X(t)$ is a Poisson process with intensity $\lambda$, then for all $t>s \geq 0$, we have \begin{align*} X(t)-X(s) \sim Poisson\big(\lambda (t-s)\big). \end{align*} Show that $X(t)$ is mean-square continuous at any time $t \geq 0$.
  • Solution
    • We have \begin{align*} X(t+\delta)-X(t) \sim Poisson\big(\lambda \delta\big). \end{align*} Thus, since the second moment of a $Poisson(\mu)$ random variable is $\mu+\mu^2$, \begin{align*} \lim_{\delta\rightarrow 0} E[|X(t+\delta)-X(t)|^2]&= \lim_{\delta\rightarrow 0} \big[\lambda \delta+(\lambda \delta)^2\big]\\ &=0. \end{align*}
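A short simulation agrees with this limit: sampling the increment $X(t+\delta)-X(t) \sim Poisson(\lambda \delta)$ for shrinking $\delta$, the estimated second moment tracks $\lambda\delta+(\lambda\delta)^2$ and shrinks toward zero.

```python
import numpy as np

# Monte Carlo check of mean-square continuity for a Poisson process:
# the increment X(t+delta) - X(t) is Poisson(lam*delta), so its second
# moment lam*delta + (lam*delta)^2 vanishes as delta -> 0.
rng = np.random.default_rng(5)
lam = 2.0
second_moments = []
for delta in [1.0, 0.1, 0.01]:
    inc = rng.poisson(lam * delta, size=500_000).astype(float)
    second_moments.append(np.mean(inc**2))
print([round(m, 3) for m in second_moments])  # shrinks toward 0 with delta
```

For $\lambda=2$ the analytic values are $6$, $0.24$, and $0.0204$, which the estimates reproduce up to sampling noise.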


It is worth noting that there are jumps in a Poisson process; however, those jumps are not very "dense" in time, so the random process is still continuous in the mean-square sense. Figure 10.5 shows a possible realization of a Poisson process.
Figure 10.5 - A possible sample function of a Poisson process.
We can similarly talk about mean-square differentiability and mean-square integrability. If $X(t)$ is a random process, the derivative of $X(t)$, \begin{align}%\label{} Y(t)=\frac{d}{dt} X(t), \end{align} is also a random process. For nice and smooth processes, the derivative can be obtained in a natural way. For example, if you have a random process defined as \begin{align}%\label{} X(t)=A+Bt+Ct^2, \quad \textrm{ for all }t \in [0,\infty), \end{align} where $A$, $B$, and $C$ are random variables, then the derivative of $X(t)$ can be written as \begin{align}%\label{} X'(t)=B+2Ct, \quad \textrm{ for all }t \in [0,\infty). \end{align} Assuming some mild regularity conditions are satisfied, we can offer the following guidelines. A key point to note is that differentiation and integration are linear operations. This, for example, means that you can often interchange integration and expectation. More specifically, you can write \begin{align}%\label{} E\left[\int_{0}^{t} X(u)du\right]=\int_{0}^{t}E[X(u)]du. \end{align} Similarly, if the derivative of $X(t)$ is well-defined, we can write \begin{align}%\label{} E\left[\frac{d}{dt} X(t)\right]= \frac{d}{dt} E[X(t)]. \end{align}
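As an illustration of interchanging expectation and integration, the following sketch uses the polynomial process $X(t)=A+Bt+Ct^2$ above with arbitrarily chosen normal coefficients, and compares integrating each sample path first against integrating the mean function; analytically, both sides equal $E[A]+E[B]/2+E[C]/3$.

```python
import numpy as np

# Numerical check of E[int_0^1 X(u)du] = int_0^1 E[X(u)]du for the process
# X(t) = A + B*t + C*t^2, with (assumed) independent normal coefficients.
rng = np.random.default_rng(6)
n, N = 50_000, 200
A = rng.normal(1.0, 1.0, n)
B = rng.normal(2.0, 1.0, n)
C = rng.normal(3.0, 1.0, n)

u = (np.arange(N) + 0.5) / N                  # midpoint grid on [0, 1]
paths = A[:, None] + B[:, None] * u + C[:, None] * u**2
lhs = paths.mean(axis=1).mean()               # integrate each path, then average
rhs = (A.mean() + B.mean() * u + C.mean() * u**2).mean()  # average, then integrate
print(round(lhs, 2), round(rhs, 2))  # both ≈ E[A] + E[B]/2 + E[C]/3 = 3
```

The two sides agree exactly up to rounding because finite averages commute, and both converge to the analytic value $1 + 2/2 + 3/3 = 3$ as the sample size grows.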

Example
Consider a random process $X(t)$ and its derivative, $X'(t)=\frac{d}{dt} X(t)$. Assuming that the derivatives are well-defined, show that \begin{align}%\label{} R_{XX'}(t_1,t_2)=\frac{\partial}{\partial t_2} R_X(t_1,t_2). \end{align}
  • Solution
    • We have \begin{align*}%\label{} R_{XX'}(t_1,t_2)&=E[X(t_1)X'(t_2)]\\ &=E\left[X(t_1)\frac{d}{dt_2} X(t_2)\right]\\ &=E\left[\frac{\partial}{\partial t_2} \bigg( X(t_1)X(t_2) \bigg) \right]\\ &=\frac{\partial}{\partial t_2} E\big[ X(t_1)X(t_2) \big]\\ &=\frac{\partial}{\partial t_2} R_X(t_1,t_2). \end{align*}
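We can check this identity numerically for the polynomial process $X(t)=A+Bt+Ct^2$, taking $A$, $B$, $C$ to be i.i.d. standard normal (an assumption for illustration). Then $X'(t)=B+2Ct$, $R_X(t_1,t_2)=1+t_1 t_2+t_1^2 t_2^2$, and $\frac{\partial}{\partial t_2}R_X(t_1,t_2)=t_1+2t_1^2 t_2$, which the Monte Carlo estimate of $R_{XX'}$ should match.

```python
import numpy as np

# Numerical check of R_XX'(t1,t2) = dR_X/dt2 for X(t) = A + B*t + C*t^2
# with (assumed) i.i.d. standard normal A, B, C, so X'(t) = B + 2*C*t and
# R_X(t1,t2) = 1 + t1*t2 + t1^2*t2^2  =>  dR_X/dt2 = t1 + 2*t1^2*t2.
rng = np.random.default_rng(7)
n = 1_000_000
A, B, C = rng.standard_normal((3, n))

t1, t2 = 1.5, 0.7
X_t1 = A + B * t1 + C * t1**2
Xp_t2 = B + 2 * C * t2
R_cross = np.mean(X_t1 * Xp_t2)      # Monte Carlo estimate of R_XX'(t1, t2)
partial = t1 + 2 * t1**2 * t2        # analytic partial derivative of R_X
print(round(R_cross, 2), round(partial, 2))
```

For $t_1=1.5$, $t_2=0.7$, both quantities come out near $4.65$, consistent with the derivation above.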



