3.2.4 Variance
Consider two random variables $X$ and $Y$ with the following PMFs.
$$ \label{eq:Xvar}
\nonumber P_X(x) = \left\{
\begin{array}{l l}
0.5 & \quad \text{for } x=-100\\
0.5 & \quad \text{for } x=100\\
0 & \quad \text{otherwise}
\end{array} \right.
\hspace{10pt} (3.3)
$$
$$ \label{eq:Yvar}
\nonumber P_Y(y) = \left\{
\begin{array}{l l}
1 & \quad \text{for } y=0\\
0 & \quad \text{otherwise}
\end{array} \right.
\hspace{20pt} (3.4)
$$
Note that $EX=EY=0$. Although both random variables have the same mean value, their distributions
are completely different. $Y$ is always equal to its mean of $0$, while $X$ is either $-100$ or $100$,
quite far from its mean value. The variance is a measure of how spread out the distribution of
a random variable is. Here, the variance of $Y$ is quite small since its distribution is concentrated at
a single value, while the variance of $X$ will be larger since its distribution is more spread out.
By definition, the variance of $X$ is the average value of $(X-\mu_X)^2$. Since $(X-\mu_X)^2 \geq 0$, the variance is always larger than or equal to zero. A large value of the variance means that $(X-\mu_X)^2$ is often large, so $X$ often takes values far from its mean. This means that the distribution is very spread out. On the other hand, a low variance means that the distribution is concentrated around its average.
Note that if we did not square the difference between $X$ and its mean, the result would be $0$. That is, $$E[X-\mu_X]=EX-E[\mu_X]=\mu_X-\mu_X=0.$$ $X$ is sometimes below its average and sometimes above its average. Thus, $X-\mu_X$ is sometimes negative and sometimes positive, but on average it is zero.
To compute $\textrm{Var}(X)=E\big[ (X-\mu_X)^2\big]$, note that we need to find the expected value of $g(X)=(X-\mu_X)^2$,
so we can use LOTUS. In particular, we can write
$$\textrm{Var}(X)=E\big[ (X-\mu_X)^2\big]=\sum_{x_k \in R_X} (x_k-\mu_X)^2 P_X(x_k).$$
For example, for $X$ and $Y$ defined in Equations 3.3 and 3.4, we have
$$\textrm{Var}(X)=(-100-0)^2(0.5)+(100-0)^2(0.5)=10,000$$
$$\textrm{Var}(Y)=(0-0)^2(1)=0.$$
As we expect, $X$ has a very large variance while Var$(Y)=0$.
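The LOTUS sum above can be checked numerically. The following is a minimal sketch, assuming a PMF is stored as a Python dictionary mapping each value to its probability; the helper name `variance` is only illustrative.

```python
def variance(pmf):
    """Var(X) = sum of (x - mu)^2 * P(x) over the range of X.

    `pmf` is assumed to be a dict {value: probability}.
    """
    mu = sum(x * p for x, p in pmf.items())  # mean EX
    return sum((x - mu) ** 2 * p for x, p in pmf.items())

pmf_x = {-100: 0.5, 100: 0.5}  # X takes -100 or 100 with equal probability
pmf_y = {0: 1.0}               # Y is always 0

var_x = variance(pmf_x)
var_y = variance(pmf_y)
print(var_x, var_y)  # 10000.0 0.0
```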
Note that Var$(X)$ has a different unit than $X$. For example, if $X$ is measured in $meters$ then Var$(X)$ is in $meters^2$. To solve this issue, we define another measure, called the standard deviation, usually shown as $\sigma_X$, which is simply the square root of variance.
The standard deviation of $X$ has the same unit as $X$. For $X$ and $Y$ defined in Equations 3.3 and 3.4, we have
$$\sigma_X=\sqrt{10,000}=100,$$
$$\sigma_Y=\sqrt{0}=0.$$
Here is a useful formula for computing the variance: $$\hspace{70pt} \textrm{Var}(X)=E\big[X^2\big]-\big[EX\big]^2 \hspace{70pt} (3.5)$$
To prove it note that
\begin{align}%\label{}
\nonumber \textrm{Var}(X) &= E\big[ (X-\mu_X)^2\big]\\
\nonumber &= E \big[ X^2-2 \mu_X X + \mu_X^2 \big]\\
\nonumber &= E\big[X^2\big]-2E\big[\mu_X X\big]+E\big[\mu_X^2\big] &\textrm{ by linearity of expectation.}
\end{align}
Note that for a given random variable $X$, $\mu_X$ is just a constant real number. Thus,
$E\big[\mu_X X\big]=\mu_X E[X]=\mu_X^2$, and $E\big[\mu_X^2 \big]=\mu_X^2$, so we have
\begin{align}%\label{}
\nonumber\textrm{Var}(X) &= E\big[X^2\big]-2\mu_X^2+\mu_X^2\\
\nonumber &= E\big[X^2\big]-\mu_X^2.
\end{align}
Equation 3.5 is usually easier to work with compared to $\textrm{Var}(X)=E\big[ (X-\mu_X)^2\big]$.
To use this equation, we can find $E[X^2]=EX^2$ using LOTUS
$$E X^2=\sum_{x_k \in R_X} x_k^2 P_X(x_k),$$
and then subtract $\mu_X^2$ to obtain the variance.
Example
I roll a fair die and let $X$ be the resulting number. Find $EX$, Var$(X)$, and $\sigma_X$.
 Solution
 We have $R_X=\{1,2,3,4,5,6\}$ and $P_X(k)=\frac{1}{6}$ for $k=1,2,...,6$. Thus, we have $$EX=1 \cdot \frac{1}{6}+ 2 \cdot \frac{1}{6}+ 3 \cdot \frac{1}{6}+ 4 \cdot \frac{1}{6}+ 5 \cdot \frac{1}{6}+ 6 \cdot \frac{1}{6}=\frac{7}{2};$$ $$EX^2=1 \cdot \frac{1}{6}+ 4\cdot \frac{1}{6}+ 9\cdot \frac{1}{6}+ 16 \cdot \frac{1}{6}+ 25\cdot \frac{1}{6}+ 36 \cdot \frac{1}{6}=\frac{91}{6}.$$ Thus $$\textrm{Var}(X)=E\big[X^2\big]-\big(EX\big)^2=\frac{91}{6}-\left(\frac{7}{2}\right)^2=\frac{91}{6}-\frac{49}{4}\approx 2.92,$$ $$\sigma_X= \sqrt {\textrm{Var}(X)}\approx \sqrt{2.92} \approx 1.71.$$
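The die computation can be reproduced with exact arithmetic. This is a sketch under the assumption that the PMF is stored as a dictionary of `Fraction` values; the variable names are illustrative. It evaluates the variance both as $E[X^2]-(EX)^2$ and directly as $E[(X-\mu_X)^2]$, so the two can be compared.

```python
from fractions import Fraction
from math import sqrt

# PMF of a fair die: P_X(k) = 1/6 for k = 1, ..., 6
pmf = {k: Fraction(1, 6) for k in range(1, 7)}

mean = sum(k * p for k, p in pmf.items())               # EX
second_moment = sum(k**2 * p for k, p in pmf.items())   # E[X^2], by LOTUS

var_shortcut = second_moment - mean**2                  # E[X^2] - (EX)^2
var_definition = sum((k - mean) ** 2 * p for k, p in pmf.items())  # E[(X - mu)^2]

sd = sqrt(var_shortcut)  # standard deviation as a float

print(mean, second_moment, var_shortcut)  # 7/2 91/6 35/12
```

Both routes give the exact value $\frac{35}{12}$, which rounds to the $2.92$ found above.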
Note that variance is not a linear operator. In particular, we have the following theorem.
For a random variable $X$ and real numbers $a$ and $b$, $$\hspace{70pt} \textrm{Var}(aX+b)=a^2 \textrm{Var}(X) \hspace{70pt} (3.6)$$
Proof
If $Y=aX+b$, $EY=aEX+b$. Thus,
\begin{align}%\label{}
\nonumber \textrm{Var} (Y) &= E[ (Y-EY)^2 ]\\
\nonumber &= E[ (aX+b-aEX-b)^2 ]\\
\nonumber &= E[a^2(X-\mu_X)^2]\\
\nonumber &= a^2 E[(X-\mu_X)^2]\\
\nonumber &= a^2 \textrm{Var}(X).
\end{align}
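As a quick numerical check of $\textrm{Var}(aX+b)=a^2\textrm{Var}(X)$, the sketch below transforms the PMF of a fair die and compares the two sides. The helper names `variance` and `affine` and the constants $a=-2$, $b=5$ are assumptions chosen only for illustration.

```python
from fractions import Fraction

def variance(pmf):
    """Var(X) = sum of (x - mu)^2 * P(x) for a PMF given as {value: probability}."""
    mu = sum(x * p for x, p in pmf.items())
    return sum((x - mu) ** 2 * p for x, p in pmf.items())

def affine(pmf, a, b):
    """PMF of Y = a*X + b. Assumes a != 0, so distinct x map to distinct y."""
    return {a * x + b: p for x, p in pmf.items()}

die = {k: Fraction(1, 6) for k in range(1, 7)}
a, b = -2, 5  # arbitrary illustrative constants

var_x = variance(die)                # 35/12
var_y = variance(affine(die, a, b))  # variance of Y = aX + b

print(var_y, a**2 * var_x)  # 35/3 35/3
```

The shift $b$ has no effect on the result, and the negative sign of $a$ disappears once it is squared.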
From Equation 3.6, we conclude that, for standard deviation, $\textrm{SD}(aX+b)=|a|\textrm{SD}(X)$. We mentioned that variance is NOT a linear operation. But there is a very important case in which variance behaves like a linear operation, and that is when we look at the sum of independent random variables.
If $X_1, X_2,\cdots ,X_n$ are independent random variables and $X=X_1+X_2+\cdots+X_n$, then $$\hspace{70pt} \textrm{Var}(X)=\textrm{Var}(X_1)+\textrm{Var}(X_2)+\cdots+\textrm{Var}(X_n) \hspace{70pt} (3.7)$$
We will prove this theorem in Chapter 6, but for now we can look at an example to see how we can use it.
Example
If $X \sim Binomial(n,p)$ find Var$(X)$.
 Solution

We know that we can write a $Binomial(n,p)$ random variable as the sum of $n$ independent
$Bernoulli(p)$ random variables, i.e., $X=X_1+X_2+\cdots+X_n$. Thus, we conclude
$$\textrm{Var}(X)=\textrm{Var}(X_1)+\textrm{Var}(X_2)+\cdots+\textrm{Var}(X_n).$$
If $X_i \sim Bernoulli(p)$, then its variance is
$$\textrm{Var}(X_i)=E[X_i^2]-(EX_i)^2=1^2 \cdot p+0^2 \cdot (1-p)-p^2=p(1-p).$$
Thus,
$$\textrm{Var}(X)=p(1-p)+p(1-p)+\cdots+p(1-p)=np(1-p).$$
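The result $\textrm{Var}(X)=np(1-p)$ can also be checked without the independence argument, by computing $E[X^2]-(EX)^2$ straight from the binomial PMF. The sketch below does this with exact fractions; the helper name `binomial_pmf` and the values $n=10$, $p=\frac{3}{10}$ are assumptions for illustration only.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(n, p):
    """PMF of Binomial(n, p): P(X = k) = C(n, k) p^k (1 - p)^(n - k)."""
    return {k: comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)}

n, p = 10, Fraction(3, 10)  # illustrative parameters
pmf = binomial_pmf(n, p)

mean = sum(k * q for k, q in pmf.items())             # EX
var = sum(k**2 * q for k, q in pmf.items()) - mean**2  # E[X^2] - (EX)^2

print(mean, var, n * p * (1 - p))  # 3 21/10 21/10
```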