9.1.8 Bayesian Hypothesis Testing

Suppose that we need to decide between two hypotheses $H_0$ and $H_1$. In the Bayesian setting, we assume that we know the prior probabilities of $H_0$ and $H_1$. That is, we know $P(H_0)=p_0$ and $P(H_1)=p_1$, where $p_0+p_1=1$. We observe the random variable (or random vector) $Y$. We know the distribution of $Y$ under the two hypotheses, i.e., we know \begin{align} f_{Y}(y|H_0), \quad \textrm{and} \quad f_{Y}(y|H_1). \end{align} Using Bayes' rule, we can obtain the posterior probabilities of $H_0$ and $H_1$: \begin{align} P(H_0|Y=y)&=\frac{f_{Y}(y|H_0)P(H_0)}{f_Y(y)}, \\ P(H_1|Y=y)&=\frac{f_{Y}(y|H_1)P(H_1)}{f_Y(y)}. \end{align} One way to decide between $H_0$ and $H_1$ is to compare $P(H_0|Y=y)$ and $P(H_1|Y=y)$ and accept the hypothesis with the higher posterior probability. This is the idea behind the maximum a posteriori (MAP) test. Since the MAP test always chooses the hypothesis with the higher posterior probability, it is relatively easy to show that it minimizes the average error probability.

To be more specific, according to the MAP test, we choose $H_0$ if and only if

\begin{align} P(H_0|Y=y) \geq P(H_1|Y=y). \end{align} In other words, we choose $H_0$ if and only if \begin{align} f_{Y}(y|H_0)P(H_0) \geq f_{Y}(y|H_1)P(H_1). \end{align} Note that, as always, we use the PMF instead of the PDF if $Y$ is a discrete random variable. We can generalize the MAP test to the case where we have more than two hypotheses: in that case, we again choose the hypothesis with the highest posterior probability.
MAP Hypothesis Test

Choose the hypothesis with the highest posterior probability, $P(H_i|Y=y)$. Equivalently, choose hypothesis $H_i$ with the highest $f_{Y}(y|H_i)P(H_i)$.

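To make the MAP rule concrete, here is a minimal numerical sketch in Python (assuming NumPy and SciPy are available; the two-hypothesis Gaussian setup and the priors below are illustrative choices, not part of the text). It scores each hypothesis by $f_{Y}(y|H_i)P(H_i)$ and picks the largest, which also covers the case of more than two hypotheses.

```python
import numpy as np
from scipy.stats import norm

def map_decide(y, priors, likelihoods):
    # Score each hypothesis by f_Y(y|H_i) * P(H_i) and pick the largest;
    # dividing by f_Y(y) would not change the argmax.
    scores = [lik(y) * prior for lik, prior in zip(likelihoods, priors)]
    return int(np.argmax(scores))

# Illustrative setup: Y|H0 ~ N(0,1), Y|H1 ~ N(2,1), priors 0.7 and 0.3.
priors = [0.7, 0.3]
likelihoods = [lambda y: norm.pdf(y, loc=0, scale=1),
               lambda y: norm.pdf(y, loc=2, scale=1)]

print(map_decide(0.5, priors, likelihoods))  # 0: H0 has the higher posterior
print(map_decide(1.8, priors, likelihoods))  # 1: H1 has the higher posterior
```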

Example 9.10
Suppose that the random variable $X$ is transmitted over a communication channel. Assume that the received signal is given by \begin{align} Y=X+W, \end{align} where $W \sim N(0,\sigma^2)$ is independent of $X$. Suppose that $X=1$ with probability $p$, and $X=-1$ with probability $1-p$. The goal is to decide between $X=1$ and $X=-1$ by observing the random variable $Y$. Find the MAP test for this problem.
  • Solution
    • Here, we have two hypotheses:

      $\quad$ $H_0$: $X=1$,

      $\quad$ $H_1$: $X=-1$.

      Under $H_0$, $Y=1+W$, so $Y|H_0 \; \sim \; N(1, \sigma^2)$. Therefore, \begin{align} f_{Y}(y|H_0)=\frac{1}{ \sigma\sqrt{2 \pi}} e^{-\frac{(y-1)^2}{2\sigma^2}}. \end{align} Under $H_1$, $Y=-1+W$, so $Y|H_1 \; \sim \; N(-1, \sigma^2)$. Therefore, \begin{align} f_{Y}(y|H_1)=\frac{1}{ \sigma\sqrt{2 \pi}} e^{-\frac{(y+1)^2}{2\sigma^2}}. \end{align} Thus, we choose $H_0$ if and only if \begin{align} \frac{1}{ \sigma\sqrt{2 \pi}} e^{-\frac{(y-1)^2}{2\sigma^2}} P(H_0) \geq \frac{1}{ \sigma\sqrt{2 \pi}} e^{-\frac{(y+1)^2}{2\sigma^2}} P(H_1). \end{align} We have $P(H_0)=p$ and $P(H_1)=1-p$. Canceling the common factor $\frac{1}{\sigma\sqrt{2 \pi}}$, dividing both sides by the right-hand exponential, and noting that $(y+1)^2-(y-1)^2=4y$, we choose $H_0$ if and only if \begin{align} \exp \left (\frac{2y}{\sigma^2} \right) \geq \frac{1-p}{p}. \end{align} Taking logarithms, we choose $H_0$ if and only if \begin{align} y \geq \frac{\sigma^2}{2} \ln \left(\frac{1-p}{p}\right). \end{align} This threshold is checked by simulation below.

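The threshold derived in Example 9.10 is easy to check with a quick Monte Carlo sketch (the values of $p$ and $\sigma$ below are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)
p, sigma, n = 0.6, 1.0, 100_000                 # illustrative parameters

x = rng.choice([1, -1], size=n, p=[p, 1 - p])   # transmitted symbol X
y = x + rng.normal(0.0, sigma, size=n)          # received signal Y = X + W

c = (sigma**2 / 2) * np.log((1 - p) / p)        # MAP threshold
x_hat = np.where(y >= c, 1, -1)                 # choose H0 (X = 1) iff y >= c

print("empirical error rate:", np.mean(x_hat != x))
```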

Note that the average error probability for a hypothesis test can be written as \begin{align} P_e =P( \textrm{choose }H_1 | H_0) P(H_0)+ P( \textrm{choose }H_0 | H_1) P(H_1). \hspace{30pt} (9.6) \end{align} As we mentioned earlier, the MAP test achieves the minimum possible average error probability.

Example 9.11
Find the average error probability in Example 9.10.
  • Solution
    • In Example 9.10, we arrived at the following decision rule: we choose $H_0$ if and only if \begin{align} y \geq c, \end{align} where \begin{align} c=\frac{\sigma^2}{2} \ln \left(\frac{1-p}{p}\right). \end{align} Since $Y|H_0 \; \sim \; N(1, \sigma^2)$, \begin{align} P( \textrm{choose }H_1 | H_0)&=P(Y \lt c|H_0)\\ &=\Phi\left(\frac{c-1}{\sigma} \right)\\ &=\Phi\left(\frac{\sigma}{2} \ln \left(\frac{1-p}{p}\right)-\frac{1}{\sigma}\right). \end{align} Since $Y|H_1 \; \sim \; N(-1, \sigma^2)$, \begin{align} P( \textrm{choose }H_0 | H_1)&=P(Y \geq c|H_1)\\ &=1-\Phi\left(\frac{c+1}{\sigma} \right)\\ &=1-\Phi\left(\frac{\sigma}{2} \ln \left(\frac{1-p}{p}\right)+\frac{1}{\sigma}\right). \end{align} Figure 9.4 shows the two error probabilities for this example. Therefore, the average error probability is given by \begin{align} P_e &=P( \textrm{choose }H_1 | H_0) P(H_0)+ P( \textrm{choose }H_0 | H_1) P(H_1)\\ &=p \cdot \Phi\left(\frac{\sigma}{2} \ln \left(\frac{1-p}{p}\right)-\frac{1}{\sigma}\right)+(1-p) \cdot \left[ 1-\Phi\left(\frac{\sigma}{2} \ln \left(\frac{1-p}{p}\right)+\frac{1}{\sigma}\right)\right]. \end{align} This expression is evaluated numerically below.

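The closed-form expression for $P_e$ is straightforward to evaluate numerically. Here is a small sketch (using SciPy's `norm.cdf` for $\Phi$, with the same illustrative $p$ and $\sigma$ as in the simulation above):

```python
import numpy as np
from scipy.stats import norm

def avg_error_prob(p, sigma):
    # P_e = p * Phi(a - 1/sigma) + (1 - p) * [1 - Phi(a + 1/sigma)],
    # where a = (sigma / 2) * ln((1 - p) / p).
    a = (sigma / 2) * np.log((1 - p) / p)
    return p * norm.cdf(a - 1 / sigma) + (1 - p) * (1 - norm.cdf(a + 1 / sigma))

print("analytic P_e:", avg_error_prob(p=0.6, sigma=1.0))
# For these parameters, this should closely match the empirical error
# rate printed by the Monte Carlo simulation above.
```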

Figure 9.4 - Error probabilities for Example 9.10 and Example 9.11

Minimum Cost Hypothesis Test

Suppose that you are building a sensor network to detect fires in a forest. Based on the information collected by the sensors, the system needs to decide between two opposing hypotheses:

$\quad$ $H_0$: There is no fire,

$\quad$ $H_1$: There is a fire.

There are two possible types of errors that we can make: we might accept $H_0$ while $H_1$ is true, or we might accept $H_1$ while $H_0$ is true. Note that the costs associated with these two errors are not the same. In other words, if there is a fire and we miss it, we make a costlier error. To address situations like this, we associate a cost with each error type:

$\quad$ $C_{10}$: The cost of choosing $H_1$, given that $H_0$ is true.

$\quad$ $C_{01}$: The cost of choosing $H_0$, given that $H_1$ is true.

Then, the average cost can be written as \begin{align} C =C_{10} P( \textrm{choose }H_1 | H_0) P(H_0)+ C_{01} P( \textrm{choose }H_0 | H_1) P(H_1). \end{align} The goal of minimum cost hypothesis testing is to minimize the above expression. Luckily, this can be done easily. Note that we can rewrite the average cost as \begin{align} C = P( \textrm{choose }H_1 | H_0) \cdot [P(H_0) C_{10}]+ P( \textrm{choose }H_0 | H_1) \cdot [P(H_1) C_{01}]. \end{align} The above expression is very similar to the average error probability of the MAP test (Equation 9.6). The only difference is that we have $P(H_0) C_{10}$ instead of $P(H_0)$, and we have $P(H_1) C_{01}$ instead of $P(H_1)$. Therefore, we can use a decision rule similar to the MAP decision rule. More specifically, we choose $H_0$ if and only if \begin{align} f_{Y}(y|H_0)P(H_0) C_{10} \geq f_{Y}(y|H_1)P(H_1)C_{01}. \hspace{30pt} (9.7) \end{align} Here is another way to interpret the above decision rule. If we divide both sides of Equation 9.7 by $f_Y(y)$ and apply Bayes' rule, we conclude the following: we choose $H_0$ if and only if \begin{align} P(H_0|y) C_{10} \geq P(H_1|y) C_{01}. \end{align} Note that $P(H_0|y) C_{10}$ is the expected cost of accepting $H_1$. We call this the posterior risk of accepting $H_1$. Similarly, $P(H_1|y) C_{01}$ is the posterior risk (expected cost) of accepting $H_0$. Therefore, we can summarize the minimum cost test as follows: we accept the hypothesis with the lowest posterior risk.
Minimum Cost Hypothesis Test

Assuming the following costs

$\quad$ $C_{10}$: The cost of choosing $H_1$, given that $H_0$ is true.

$\quad$ $C_{01}$: The cost of choosing $H_0$, given that $H_1$ is true.

We choose $H_0$ if and only if \begin{align} \frac{f_{Y}(y|H_0)}{f_{Y}(y|H_1)} \geq \frac{P(H_1)C_{01}}{P(H_0) C_{10}}. \end{align} Equivalently, we choose $H_0$ if and only if \begin{align} P(H_0|y) C_{10} \geq P(H_1|y) C_{01}. \end{align}

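As a sketch of the minimum cost test in code (Equation 9.7), reusing the Gaussian channel of Example 9.10 with equal priors and an assumed cost ratio of 10; all parameter values here are illustrative:

```python
from scipy.stats import norm

def min_cost_decide(y, prior0, lik0, lik1, c10, c01):
    # Choose H0 iff f_Y(y|H0) P(H0) C10 >= f_Y(y|H1) P(H1) C01  (Equation 9.7).
    return 0 if lik0(y) * prior0 * c10 >= lik1(y) * (1 - prior0) * c01 else 1

lik0 = lambda y: norm.pdf(y, loc=1.0)    # Y|H0 ~ N(1, 1)
lik1 = lambda y: norm.pdf(y, loc=-1.0)   # Y|H1 ~ N(-1, 1)

print(min_cost_decide(0.0, prior0=0.5, lik0=lik0, lik1=lik1, c10=1.0, c01=10.0))
# -> 1: at y = 0 the likelihoods are equal, but the costly miss (C01 = 10 C10)
#       pushes the decision toward H1.
```

Note that with $C_{10}=C_{01}$ the rule reduces to the MAP test.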

Example
A surveillance system is in charge of detecting intruders to a facility. There are two hypotheses to choose from:

$\quad$ $H_0$: No intruder is present.

$\quad$ $H_1$: There is an intruder.

The system sends an alarm message if it accepts $H_1$. Suppose that after processing the data, we obtain $P(H_1|y)=0.05$. Also, assume that the cost of missing an intruder is $10$ times the cost of a false alarm. Should the system send an alarm message (accept $H_1$)?
  • Solution
    • First note that \begin{align} P(H_0|y)=1-P(H_1|y)=0.95. \end{align} The posterior risk of accepting $H_1$ is \begin{align} P(H_0|y) C_{10} =0.95 C_{10}. \end{align} We have $C_{01}=10 C_{10}$, so the posterior risk of accepting $H_0$ is \begin{align} P(H_1|y) C_{01} &=(0.05) (10 C_{10})\\ &=0.5 C_{10}. \end{align} Since $P(H_0|y) C_{10} \geq P(H_1|y) C_{01}$, we accept $H_0$, so no alarm message needs to be sent. This comparison is verified in the short sketch below.

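The comparison of posterior risks in this example takes only a few lines to verify; here is a sketch, with all costs expressed in units of $C_{10}$:

```python
p_h1 = 0.05                  # P(H1 | y), from the problem statement
p_h0 = 1 - p_h1              # P(H0 | y) = 0.95
c10 = 1.0                    # cost of a false alarm (taken as the unit)
c01 = 10 * c10               # missing an intruder costs 10x a false alarm

risk_accept_h1 = p_h0 * c10  # posterior risk of accepting H1: 0.95
risk_accept_h0 = p_h1 * c01  # posterior risk of accepting H0: 0.50
print("accept H0" if risk_accept_h1 >= risk_accept_h0 else "accept H1")
```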


