1.4.1 Independence

Let $A$ be the event that it rains tomorrow, and suppose that $P(A)=\frac{1}{3}$. Also suppose that I toss a fair coin; let $B$ be the event that it lands heads up. We have $P(B)=\frac{1}{2}$.

Now I ask you, what is $P(A|B)$? What is your guess? You probably guessed that $P(A|B)=P(A)=\frac{1}{3}$. You are right! The result of my coin toss has nothing to do with tomorrow's weather. Thus, whether or not $B$ occurs, the probability of $A$ should not change. This is an example of two independent events: two events are independent if one does not convey any information about the other. Let us now give a formal definition of independence.

Two events $A$ and $B$ are independent if and only if $P(A \cap B)=P(A)P(B)$.

Now, let's first reconcile this definition with what we mentioned earlier, $P(A|B)=P(A)$. If two events are independent, then $P(A \cap B)=P(A)P(B)$, so

$$P(A|B) = \frac{P(A \cap B)}{P(B)} = \frac{P(A)P(B)}{P(B)} = P(A).$$

Thus, if two events $A$ and $B$ are independent and $P(B)\neq 0$, then $P(A|B)=P(A)$. To summarize, we can say "independence means we can multiply the probabilities of events to obtain the probability of their intersection," or equivalently, "independence means that the conditional probability of one event given the other is the same as its original (prior) probability."

Sometimes the independence of two events is quite clear because the two events seem not to have any physical interaction with each other (such as the two events discussed above). At other times, it is not as clear and we need to check if they satisfy the independence condition. Let's look at an example.



Example

I pick a random number from $\{1,2,3,\cdots,10\}$, and call it $N$. Suppose that all outcomes are equally likely. Let $A$ be the event that $N$ is less than $7$, and let $B$ be the event that $N$ is an even number. Are $A$ and $B$ independent?

  • Solution
    • We have $A=\{1,2,3,4,5,6\}$, $B=\{2,4,6,8,10\}$, and $A\cap B=\{2,4,6\}$. Then $$P(A) =0.6,$$ $$P(B) =0.5,$$ $$P(A \cap B)=0.3.$$ Therefore, $P(A \cap B)=0.3=0.6 \times 0.5=P(A)P(B)$, so $A$ and $B$ are independent. This means that knowing that $B$ has occurred does not change our belief about the probability of $A$. In this problem the two events are about the same random number, but they are still independent because they satisfy the definition. (The short enumeration sketch below double-checks these numbers.)
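As a quick sanity check, here is a minimal Python sketch (illustrative code, not from the text) that enumerates the sample space and verifies the numbers above exactly.

```python
from fractions import Fraction

# Exhaustively check independence for the example above.
# Sample space: {1, ..., 10}, all outcomes equally likely.
omega = set(range(1, 11))
A = {n for n in omega if n < 7}        # N is less than 7
B = {n for n in omega if n % 2 == 0}   # N is an even number

def P(event):
    """Probability of an event under equally likely outcomes."""
    return Fraction(len(event), len(omega))

print(P(A), P(B), P(A & B))            # 3/5 1/2 3/10
print(P(A & B) == P(A) * P(B))         # True -> A and B are independent
```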



The definition of independence can be extended to the case of three or more events.

Three events $A$, $B$, and $C$ are independent if all of the following conditions hold: $$P(A \cap B)=P(A)P(B),$$ $$P(A \cap C)=P(A)P(C),$$ $$P(B \cap C)=P(B)P(C),$$ $$P(A \cap B \cap C)=P(A)P(B)P(C).$$

Note that all four of the stated conditions must hold for three events to be independent. In particular, it is possible for the three pairwise conditions to hold while the fourth fails (a concrete example is sketched below). In general, for $n$ events $A_1, A_2,\cdots,A_n$ to be independent we must have $$P(A_i \cap A_j)=P(A_i)P(A_j), \textrm{ for all distinct } i,j \in \{1,2,\cdots,n\};$$ $$P(A_i \cap A_j \cap A_k)=P(A_i)P(A_j)P(A_k), \textrm{ for all distinct } i,j,k \in \{1,2,\cdots,n\};$$ $$\vdots$$ $$P(A_1 \cap A_2 \cap \cdots \cap A_n)=P(A_1)P(A_2)P(A_3) \cdots P(A_n).$$
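To see this, consider the classic two-coin example: toss two fair coins and let $A$ be "the first toss is heads," $B$ "the second toss is heads," and $C$ "the two tosses agree." The Python sketch below (illustrative, not from the text) verifies that these events are pairwise independent, yet $P(A \cap B \cap C) \neq P(A)P(B)P(C)$.

```python
from fractions import Fraction
from itertools import product

# Two fair coin tosses: four equally likely outcomes.
omega = list(product("HT", repeat=2))
P = lambda event: Fraction(len(event), len(omega))

A = {w for w in omega if w[0] == "H"}      # first toss is heads
B = {w for w in omega if w[1] == "H"}      # second toss is heads
C = {w for w in omega if w[0] == w[1]}     # the two tosses agree

print(P(A & B) == P(A) * P(B))             # True  (1/4 = 1/2 * 1/2)
print(P(A & C) == P(A) * P(C))             # True
print(P(B & C) == P(B) * P(C))             # True
print(P(A & B & C) == P(A) * P(B) * P(C))  # False (1/4 != 1/8)
```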

This might look like a difficult definition, but we can often argue that events are independent in a much easier way, by looking at how the random experiment is performed. A simple example of independent events arises when you toss a coin repeatedly: the results of any subset of the coin tosses have no impact on the others.



Example

I toss a fair coin repeatedly until I observe the first tails, at which point I stop. Let $X$ be the total number of coin tosses. Find $P(X=5)$.

  • Solution
    • Here, the outcome of the random experiment is the number $X$. The goal is to find $P(X=5)$. But what does $X=5$ mean? It means that the first $4$ coin tosses result in heads and the fifth toss results in tails. Thus the problem reduces to finding the probability of the sequence $HHHHT$ when tossing a coin five times. Note that $HHHHT$ is shorthand for the event "(The first coin toss results in heads) and (The second coin toss results in heads) and (The third coin toss results in heads) and (The fourth coin toss results in heads) and (The fifth coin toss results in tails)." Since the coin tosses are independent, we can write

      $$P(HHHHT) = P(H)P(H)P(H)P(H)P(T) = \frac{1}{2}\cdot\frac{1}{2}\cdot\frac{1}{2}\cdot\frac{1}{2}\cdot\frac{1}{2} = \frac{1}{32}.$$



Discussion: Some people find the problem easier to understand if you look at it in the following way. Imagine that I never stop tossing the coin, so the outcome of the experiment is an infinite sequence of heads and tails. The value of $X$ is simply a function of the beginning of that sequence, up to and including the first tails. Viewed this way, you do not need to worry about the stopping rule. For this problem the distinction makes little difference conceptually, but for similar, more involved problems this way of thinking can be beneficial.
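If you prefer a computational check, the following Monte Carlo sketch in Python (illustrative code, not from the text) simulates the experiment and estimates $P(X=5)$, which should come out close to $\frac{1}{32}=0.03125$.

```python
import random

random.seed(0)  # reproducible runs

def tosses_until_first_tails():
    """Toss a fair coin until the first tails; return the total number of tosses."""
    count = 0
    while True:
        count += 1
        if random.random() < 0.5:  # tails with probability 1/2
            return count

trials = 10**6
hits = sum(tosses_until_first_tails() == 5 for _ in range(trials))
print(hits / trials)  # close to 1/32 = 0.03125
```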



We have seen that two events $A$ and $B$ are independent if $P(A \cap B)=P(A)P(B)$. In the next two results, we examine what independence can tell us about other set operations such as complements and unions.

Lemma

If $A$ and $B$ are independent events, then

  • $A$ and $B^c$ are independent;
  • $A^c$ and $B$ are independent;
  • $A^c$ and $B^c$ are independent.

Proof

We prove the first statement; the other two follow from it immediately. We have

$$\begin{aligned}
P(A \cap B^c) &= P(A-B) \\
&= P(A)-P(A \cap B) \\
&= P(A)-P(A)P(B) \hspace{20pt} \textrm{(since $A$ and $B$ are independent)} \\
&= P(A)\big(1-P(B)\big) \\
&= P(A)P(B^c).
\end{aligned}$$

Thus, $A$ and $B^c$ are independent.
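As a concrete check of the lemma, we can reuse the number-picking example from earlier, where $A$ and $B$ were shown to be independent; the lemma then says that $A$ and $B^c$ must be independent as well. A minimal Python sketch (ours, for illustration):

```python
from fractions import Fraction

# A = "N < 7", B = "N is even" on the sample space {1, ..., 10}.
omega = set(range(1, 11))
A = {1, 2, 3, 4, 5, 6}
B = {2, 4, 6, 8, 10}
Bc = omega - B                       # complement of B

P = lambda event: Fraction(len(event), len(omega))

print(P(A & Bc))                     # 3/10
print(P(A) * P(Bc))                  # 3/10 -> A and B^c are independent too
```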



Sometimes we are interested in the probability of the union of several independent events $A_1, A_2,\cdots,A_n$. For independent events, we know how to find the probability of intersection easily, but not the union. It is helpful in these cases to use De Morgan's Law: $$A_1 \cup A_2 \cup\cdots\cup A_n=(A_1^c \cap A_2^c \cap\cdots\cap A_n^c)^c$$ Thus we can write

$$\begin{aligned}
P\big(A_1 \cup A_2 \cup\cdots\cup A_n\big) &= 1-P\big(A_1^c \cap A_2^c \cap\cdots\cap A_n^c\big) \\
&= 1-P(A_1^c)P(A_2^c) \cdots P(A_n^c) \hspace{20pt} \textrm{(since the $A_i^c$'s are independent, by the lemma above)} \\
&= 1-\big(1-P(A_1)\big)\big(1-P(A_2)\big)\cdots\big(1-P(A_n)\big).
\end{aligned}$$

If $A_1, A_2,\cdots,A_n$ are independent then $$P\big(A_1 \cup A_2 \cup\cdots\cup A_n\big)=1-\big(1-P(A_1)\big)\big(1-P(A_2)\big)\cdots\big(1-P(A_n)\big).$$
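In code, the boxed formula is essentially a one-liner. The sketch below (the function name and sample probabilities are ours, for illustration) computes the probability of the union of independent events from their individual probabilities.

```python
import math

def prob_union_independent(probs):
    """P(A_1 u ... u A_n) for independent events, via 1 - prod(1 - p_i)."""
    return 1 - math.prod(1 - p for p in probs)

# Three independent events, each with probability 1/2:
print(prob_union_independent([0.5, 0.5, 0.5]))  # 0.875 = 1 - (1/2)**3
```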




Example

Suppose that the probability of being killed in a single flight is $p_c=\frac{1}{4 \times 10^6}$ based on available statistics. Assume that different flights are independent. If a businessman takes $20$ flights per year, what is the probability that he is killed in a plane crash within the next $20$ years? (Let's assume that he will not die because of another reason within the next $20$ years.)

  • Solution
    • The total number of flights that he will take during the next $20$ years is $N=20 \times 20=400$. Let $p_s$ be the probability that he survives a given single flight; then $$p_s=1-p_c.$$ Since the flights are independent, the probability that he survives all $N=400$ flights is $$P(\textrm{Survive $N$ flights})=p_s \times p_s \times \cdots \times p_s=p_s^N=(1-p_c)^N.$$ Let $A$ be the event that the businessman is killed in a plane crash within the next $20$ years. Then $$P(A)=1-(1-p_c)^N \approx 9.9995 \times 10^{-5}\approx \frac{1}{10000}.$$ (The short computation below reproduces this number.)
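Here is the corresponding computation in Python (illustrative):

```python
# P(A) = 1 - (1 - p_c)^N for the flight example.
p_c = 1 / (4 * 10**6)    # probability of being killed on a single flight
N = 20 * 20              # 400 flights over the next 20 years

print(1 - (1 - p_c)**N)  # about 9.9995e-05, i.e., roughly 1/10000
```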



Warning! One common mistake is to confuse independence with disjointness. These are completely different concepts. When two events $A$ and $B$ are disjoint, it means that if one of them occurs, the other one cannot occur, i.e., $A\cap B=\emptyset$. Thus, the occurrence of $A$ gives a lot of information about $B$ (namely, that $B$ has not occurred), which means the two events cannot be independent unless one of them has probability zero. Let's make this precise.



Lemma

Consider two events $A$ and $B$, with $P(A)\neq 0$ and $P(B)\neq 0$. If $A$ and $B$ are disjoint, then they are not independent.

Proof

Since $A$ and $B$ are disjoint, we have $P(A \cap B)=0$. On the other hand, $P(A)\neq 0$ and $P(B)\neq 0$ imply $P(A)P(B)>0$, so $$P(A \cap B) = 0 \neq P(A)P(B).$$ Thus, $A$ and $B$ are not independent. $\quad \square$



Table 1.1 summarizes the two concepts of disjointness and independence.

Concept       Meaning                                        Formulas
Disjoint      $A$ and $B$ cannot occur at the same time      $A \cap B=\emptyset$,  $P(A \cup B)=P(A)+P(B)$
Independent   $A$ does not give any information about $B$    $P(A|B)=P(A)$,  $P(B|A)=P(B)$,  $P(A \cap B)=P(A)P(B)$

Table 1.1: Differences between disjointness and independence.



Example (A similar problem is given in [6])

Two basketball players play a game in which they alternately shoot a basketball at a hoop. The first one to make a basket wins the game. On each shot, Player 1 (the one who shoots first) has probability $p_1$ of success, while Player 2 has probability $p_2$ of success (assume $0 < p_1,p_2 < 1$). The shots are assumed to be independent.

  1. Find $P(W_1)$, the probability that Player 1 wins the game.
  2. For what values of $p_1$ and $p_2$ is this a fair game, i.e., each player has a $50$ percent chance of winning the game?
  • Solution
    • In this game, the event $W_1$ can happen in many different ways. We calculate the probability of each of these ways and then add them up to obtain the total probability of winning. In particular, Player 1 may win on her first shot, or her second shot, and so on. Define $A_i$ as the event that Player 1 wins on her $i$th shot. What is the probability of $A_i$? The event $A_i$ occurs if Player 1 is unsuccessful on her first $i-1$ shots and successful on her $i$th shot, while Player 2 is unsuccessful on her first $i-1$ shots. Since different shots are independent, we obtain $$P(A_1) = p_1,$$ $$P(A_2) =(1-p_1)(1-p_2)p_1,$$ $$P(A_3) =(1-p_1)(1-p_2)(1-p_1)(1-p_2)p_1,$$ $$\cdots$$ $$P(A_k) =\big[(1-p_1)(1-p_2)\big]^{k-1}p_1,$$ $$\cdots$$ Note that $A_1, A_2, A_3,\cdots$ are disjoint events, because if one of them occurs the other ones cannot occur. The event that Player 1 wins is the union of the $A_i$'s, and since the $A_i$'s are disjoint, we have

      $$\begin{aligned}
      P(W_1) &= P(A_1 \cup A_2 \cup A_3 \cup \cdots) \\
      &= P(A_1)+P(A_2)+P(A_3)+\cdots \\
      &= p_1+(1-p_1)(1-p_2)p_1+\big[(1-p_1)(1-p_2)\big]^{2}p_1+\cdots \\
      &= p_1 \Big[1+(1-p_1)(1-p_2)+\big[(1-p_1)(1-p_2)\big]^{2}+\cdots\Big].
      \end{aligned}$$

      Note that since $0 < p_1,p_2 < 1$, for $x=(1-p_1)(1-p_2)$ we have $0 < x < 1$. Thus, using the geometric series formula $\sum_{k=0}^{\infty} ax^k= \frac{a}{1-x}$ for $|x| < 1$, we obtain $$P(W_1)=\frac{p_1}{1-(1-p_1)(1-p_2)}=\frac{p_1}{p_1+p_2-p_1p_2}.$$ It is always a good idea to look at limiting cases to check our answer. For example, if we plug in $p_1=0, p_2\neq 0$, we obtain $P(W_1)=0$, which is what we expect. Similarly, if we let $p_2=0, p_1\neq 0$, we obtain $P(W_1)=1$, which again makes sense.

      Now, to make this a fair game (in the sense that $P(W_1)=0.5$), we need $$P(W_1)=\frac{p_1}{p_1+p_2-p_1p_2}=0.5,$$ which gives $$p_1 =\frac{p_2}{1+p_2}.$$ Note that this means $p_1 < p_2$, which makes sense intuitively: since Player 1 has the advantage of shooting first, she must have a smaller success probability for the whole game to be fair. (A short simulation sketch below double-checks the formula for $P(W_1)$.)
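As a final check, here is a small Monte Carlo sketch in Python (illustrative code and parameter values, not from the text) that simulates the game and compares the empirical winning frequency with the formula for $P(W_1)$.

```python
import random

random.seed(1)  # reproducible runs

def player1_wins(p1, p2):
    """Simulate one game; Player 1 shoots first, players alternate."""
    while True:
        if random.random() < p1:   # Player 1's shot
            return True
        if random.random() < p2:   # Player 2's shot
            return False

p1, p2 = 0.3, 0.5
trials = 10**5
estimate = sum(player1_wins(p1, p2) for _ in range(trials)) / trials
print(estimate)                    # Monte Carlo estimate
print(p1 / (p1 + p2 - p1 * p2))    # exact value, about 0.4615
```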



