8.4.2 General Setting and Definitions

Example 8.22 provided a basic introduction to hypothesis testing. Here, we would like to provide a general setting for hypothesis testing problems and formally define the associated terminology. Although there are several new terms such as null hypothesis, type I error, and significance level, there are not many new concepts or tools here. Thus, after going through a few examples, the concepts should become clear.

Suppose that $\theta$ is an unknown parameter. A hypothesis is a statement such as $\theta=1$, $\theta>1.3$, $\theta \neq 0.5$, etc. In hypothesis testing problems, we need to decide between two contradictory hypotheses. More precisely, let $S$ be the set of possible values for $\theta$. Suppose that we can partition $S$ into two disjoint sets $S_0$ and $S_1$. Let $H_0$ be the hypothesis that $\theta \in S_0$, and let $H_1$ be the hypothesis that $\theta \in S_1$.



$\quad$ $H_0$ (the null hypothesis): $\theta \in S_0$.

$\quad$ $H_1$ (the alternative hypothesis): $\theta \in S_1$.

In Example 8.22, $S=[0,1]$, $S_0=\{\frac{1}{2} \}$, and $S_1=[0,1]-\{\frac{1}{2} \}$. Here, $H_0$ is an example of a simple hypothesis because $S_0$ contains only one value of $\theta$. On the other hand, $H_1$ is an example of a composite hypothesis, since $S_1$ contains more than one element. It is often the case that the null hypothesis is chosen to be a simple hypothesis.

Often, to decide between $H_0$ and $H_1$, we look at a function of the observed data. For instance, in Example 8.22, we looked at the random variable $Y$, defined as

\begin{align} Y=\frac{X-n\theta_0}{\sqrt{n\theta_0(1-\theta_0)}}, \end{align} where $X$ was the total number of heads. Here, $X$ is a function of the observed data (sequence of heads and tails), and thus $Y$ is a function of the observed data. We call $Y$ a statistic.
Definition. Let $X_1$, $X_2$, $\cdots$, $X_n$ be a random sample of interest. A statistic is a real-valued function of the data. For example, the sample mean, defined as \begin{align} W(X_1,X_2, \cdots,X_n)=\frac{X_1+X_2+\cdots+X_n}{n}, \end{align} is a statistic. A test statistic is a statistic based on which we build our test.
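For concreteness, here is a minimal Python sketch that computes both of these statistics from a simulated sequence of coin tosses in the setting of Example 8.22; the sample size $n=100$, the random seed, and the use of NumPy are choices made only for this illustration.

```python
# A minimal sketch (not from the text): computing statistics from simulated coin tosses.
# The sample size n and the simulated data are assumptions made for illustration only.
import numpy as np

rng = np.random.default_rng(0)
n = 100
theta0 = 0.5                                 # value of theta under H0 in Example 8.22
tosses = rng.binomial(1, theta0, size=n)     # 1 = heads, 0 = tails

sample_mean = tosses.mean()                  # the statistic W(X_1, ..., X_n)
X = tosses.sum()                             # total number of heads
Y = (X - n * theta0) / np.sqrt(n * theta0 * (1 - theta0))   # the test statistic of Example 8.22

print(sample_mean, Y)
```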

To decide whether to choose $H_0$ or $H_1$, we choose a test statistic, $W=W(X_1,X_2, \cdots,X_n)$. We then define the set $A \subset \mathbb{R}$ as the set of possible values of $W$ for which we would accept $H_0$. The set $A$ is called the acceptance region, while the set $R=\mathbb{R}-A$ is called the rejection region. In Example 8.22, the acceptance region was found to be the set $A=[-1.96, 1.96]$, and the set $R=(-\infty,-1.96) \cup (1.96, \infty)$ was the rejection region.

There are two possible errors that we can make. We define a type I error as the event that we reject $H_0$ when $H_0$ is true. Note that the probability of type I error in general depends on the true value of $\theta$. More specifically,

\begin{align} P(\textrm{type I error} \; | \; \theta )&=P(\textrm{Reject }H_0 \; | \; \theta)\\ &=P(W \in R \; | \; \theta), \quad \textrm{ for }\theta \in S_0. \end{align} If the probability of type I error satisfies \begin{align} P(\textrm{type I error} \; | \; \theta) \le \alpha, \quad \textrm{ for all }\theta \in S_0, \end{align} then we say that the test has significance level $\alpha$, or simply that the test is a level $\alpha$ test. Note that it is often the case that the null hypothesis is a simple hypothesis, so $S_0$ has only one element (as in Example 8.22). The second possible error that we can make is to accept $H_0$ when $H_0$ is false. This is called the type II error. Since the alternative hypothesis, $H_1$, is usually a composite hypothesis (it includes more than one value of $\theta$), the probability of type II error is usually a function of $\theta$. The probability of type II error is usually denoted by $\beta$: \begin{align} \beta(\theta)=P(\textrm{Accept }H_0 \; | \; \theta), \quad \textrm{ for }\theta \in S_1. \end{align} We now go through an example to practice the above concepts.
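Before the example, the following Monte Carlo sketch illustrates these definitions numerically: it estimates the probability of type I error for the test of Example 8.22, which rejects $H_0$ when $|Y| > 1.96$. The sample size $n=100$ and the number of simulated experiments are assumptions made only for this sketch.

```python
# Monte Carlo sketch (an illustration, not part of the text): estimate P(type I error)
# for the test of Example 8.22, which rejects H0 when |Y| > 1.96.
# The sample size n and the number of simulated experiments are arbitrary choices here.
import numpy as np

rng = np.random.default_rng(1)
n, theta0, trials = 100, 0.5, 200_000

X = rng.binomial(n, theta0, size=trials)     # number of heads in each experiment, under H0
Y = (X - n * theta0) / np.sqrt(n * theta0 * (1 - theta0))
type_I_rate = np.mean(np.abs(Y) > 1.96)

print(type_I_rate)   # approximately 0.05 (not exact: the test uses a normal approximation to a discrete X)
```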

Example 8.23
Consider a radar system that uses radio waves to detect aircraft. The system receives a signal and, based on the received signal, it needs to decide whether an aircraft is present or not. Let $X$ be the received signal. Suppose that we know

$X = W$, $\quad$ $\quad$ if no aircraft is present.

$X = 1+W$, $\quad$ $\quad$ if an aircraft is present,

where $W \sim N(0, \sigma^2=\frac{1}{9})$. Thus, we can write $X=\theta+W$, where $\theta=0$ if there is no aircraft, and $\theta=1$ if there is an aircraft. Suppose that we define $H_0$ and $H_1$ as follows:

$\quad$ $H_0$ (null hypothesis): No aircraft is present.

$\quad$ $H_1$ (alternative hypothesis): An aircraft is present.

  1. Write the null hypothesis, $H_0$, and the alternative hypothesis, $H_1$, in terms of possible values of $\theta$.
  2. Design a level $0.05$ test ($\alpha=0.05$) to decide between $H_0$ and $H_1$.
  3. Find the probability of type II error, $\beta$, for the above test. Note that this is the probability of missing a present aircraft.
  4. If we observe $X=0.6$, is there enough evidence to reject $H_0$ at significance level $\alpha=0.01$?
  5. If we would like the probability of missing a present aircraft to be less than $5 \%$, what is the smallest significance level that we can achieve?
  • Solution
      1. The null hypothesis corresponds to $\theta=0$ and the alternative hypothesis corresponds to $\theta=1$. Thus, we can write

        $\quad$ $H_0$ (null hypothesis): No aircraft is present: $\theta=0$.

        $\quad$ $H_1$ (alternative hypothesis): An aircraft is present: $\theta=1$.

        Note that here both hypotheses are simple.

      2. To decide between $H_0$ and $H_1$, we look at the observed data. Here, the situation is relatively simple: the observed data is just the random variable $X$. Under $H_0$, $X \sim N(0, \frac{1}{9})$, and under $H_1$, $X \sim N(1, \frac{1}{9})$. Thus, we can suggest the following test: we choose a threshold $c$. If the observed value of $X$ is less than $c$, we choose $H_0$ (i.e., $\theta=EX=0$). If the observed value of $X$ is larger than $c$, we choose $H_1$ (i.e., $\theta=EX=1$). To choose $c$, we use the required $\alpha$: \begin{align} P(\textrm{type I error}) &= P(\textrm{Reject }H_0 \; | \; H_0) \\ &= P(X > c \; | \; H_0)\\ &= P(W>c)\\ &=1-\Phi(3c) \quad \big(\textrm{since, under }H_0, X \sim N\big(0, \frac{1}{9}\big)\big). \end{align} Letting $P(\textrm{type I error})=\alpha$, we obtain \begin{align} c = \frac{1}{3} \Phi^{-1}(1-\alpha). \end{align} Letting $\alpha=0.05$, we obtain \begin{align} c = \frac{1}{3} \Phi^{-1}(0.95) =0.548. \end{align} (The numerical values in this and the following parts can be reproduced with the short script shown after this solution.)

      3. Note that, here, the alternative hypothesis is a simple hypothesis. That is, it includes only one value of $\theta$ (i.e., $\theta=1$). Thus, we can write \begin{align} \beta&=P(\textrm{type II error}) = P(\textrm{accept }H_0 \; | \; H_1) \\ &= P(X \lt c \; | \; H_1)\\ &= P(1+W \lt c)\\ &= P(W \lt c-1)\\ &=\Phi(3(c-1)). \end{align} Since $c=0.548$, we obtain $\beta=0.088$.

      4. In part (2), we obtained \begin{align} c = \frac{1}{3} \Phi^{-1}(1-\alpha). \end{align} For $\alpha=0.01$, we have $c=\frac{1}{3} \Phi^{-1}(0.99)=0.775$, which is larger than $0.6$. Thus, we cannot reject $H_0$ at significance level $\alpha=0.01$.

      5. In part (3), we obtained \begin{align} \beta=\Phi(3(c-1)). \end{align} Setting $\beta=0.05$, we obtain \begin{align} c&=1+\frac{1}{3} \Phi^{-1}(\beta)\\ &=1+\frac{1}{3} \Phi^{-1}(0.05)\\ &=0.452. \end{align} Since $\beta=\Phi(3(c-1))$ is an increasing function of $c$, we need $c \leq 0.452$ to obtain $\beta \leq 0.05$. On the other hand, $\alpha=1-\Phi(3c)$ is a decreasing function of $c$, so the smallest achievable $\alpha$ corresponds to the largest allowed threshold, $c=0.452$. Therefore, \begin{align} P(\textrm{type I error})&=1-\Phi(3c)\\ &=1-\Phi(3 \times 0.452)\\ &=0.0875, \end{align} which means that the smallest significance level that we can achieve is $\alpha=0.0875$.
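As mentioned in part (2), the numerical values in parts (2) through (5) can be checked with a few lines of Python; the sketch below uses scipy.stats.norm for $\Phi$ and $\Phi^{-1}$, which is a convenience choice rather than anything required by the solution.

```python
# A short numerical check of parts (2)-(5) of Example 8.23 (sigma = 1/3 as in the text).
from scipy.stats import norm

sigma = 1 / 3

# Part (2): threshold for a level-0.05 test, c = (1/3) * Phi^{-1}(1 - alpha)
alpha = 0.05
c = sigma * norm.ppf(1 - alpha)
print(c)                               # ~ 0.548

# Part (3): probability of type II error, beta = Phi(3(c - 1))
beta = norm.cdf((c - 1) / sigma)
print(beta)                            # ~ 0.088

# Part (4): threshold at alpha = 0.01; the observation X = 0.6 falls below it, so H0 is not rejected
c_01 = sigma * norm.ppf(0.99)
print(c_01, 0.6 < c_01)                # ~ 0.775, True

# Part (5): largest threshold with beta <= 0.05, and the corresponding (smallest) alpha
c_beta = 1 + sigma * norm.ppf(0.05)    # ~ 0.452
alpha_min = 1 - norm.cdf(c_beta / sigma)
print(c_beta, alpha_min)               # ~ 0.452, ~ 0.088
```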


Trade-off Between $\alpha$ and $\beta$: Since $\alpha$ and $\beta$ indicate error probabilities, we would ideally like both of them to be small. However, there is in fact a trade-off between $\alpha$ and $\beta$: if we decrease the probability of type I error ($\alpha$), then the probability of type II error ($\beta$) increases, and vice versa. To see this, we can look at our analysis in Example 8.23. In that example, we found \begin{align} \alpha &=1-\Phi(3c),\\ \beta &=\Phi(3(c-1)). \end{align} Note that $\Phi(x)$ is an increasing function. If we make $c$ larger, $\alpha$ becomes smaller and $\beta$ becomes larger. On the other hand, if we make $c$ smaller, $\alpha$ becomes larger and $\beta$ becomes smaller. Figure 8.10 shows the type I and type II error probabilities for Example 8.23.
Figure 8.10 - Type I and type II errors in Example 8.23.
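To make the trade-off concrete, the short sketch below evaluates $\alpha=1-\Phi(3c)$ and $\beta=\Phi(3(c-1))$ at a few threshold values; the particular values of $c$ are chosen arbitrarily for illustration.

```python
# Sketch of the alpha-beta trade-off in Example 8.23 (threshold values chosen arbitrarily).
from scipy.stats import norm

for c in [0.3, 0.45, 0.55, 0.7, 0.9]:
    alpha = 1 - norm.cdf(3 * c)       # P(type I error) as a function of the threshold c
    beta = norm.cdf(3 * (c - 1))      # P(type II error) as a function of c
    print(f"c = {c:.2f}:  alpha = {alpha:.3f},  beta = {beta:.3f}")
```

As $c$ increases, the printed values of $\alpha$ decrease while those of $\beta$ increase, which is exactly the trade-off described above.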

