Lab 11

Inferential Statistics (II)

This week, let’s delve into hypothesis testing for the population mean. For our discussion, let’s suppose our interest lies in the population mean height, denoted as \(\mu\), of all Lehigh female students. Throughout this lab, let us presume that the assumptions for the hypothesis tests are satisfied.

Hypotheses Set-Up

First, let’s say we are interested in determining whether the population average height of female students exceeds 64.5 inches, and we aim to statistically confirm this claim at a significance level of 0.1. As mentioned, the population average (or mean) is denoted by the Greek letter \(\mu\). Thus, we express our claim as the alternative hypothesis: \(\mu > 64.5\). The null hypothesis, being the opposite, is \(\mu \le 64.5\): \[ H_0:\; \mu \le 64.5 \;\;\text{vs.}\;\;H_a:\; \mu > 64.5. \] Note that the reference value, 64.5, is generally denoted as \(\mu_0\). This test is referred to as a right-tailed test because we aim to show that the target parameter \(\mu\) lies to the right of the reference value \(\mu_0\).

Sample Data

Because the target parameter, the population mean, is unknown, we need to collect a sample for inference. For this purpose, let’s use the Height variable from a previous class survey data set.
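As a minimal sketch of how the sample might be extracted, assume the survey is stored in a data frame called `survey` with columns `Gender` and `Height`. Both the names and the tiny data frame below are hypothetical placeholders, not the actual class data.

```r
# Hypothetical stand-in for the class survey; the real data frame and
# its column names may differ.
survey <- data.frame(
  Gender = c("Female", "Male", "Female", "Female", "Male"),
  Height = c(64, 70, 66.5, NA, 72)
)

# Keep the heights of the female respondents, then drop missing values.
X <- survey$Height[survey$Gender == "Female"]
X <- X[!is.na(X)]
X
```

Dropping missing values before computing any summary keeps the sample size \(n\) consistent with the observations that actually enter the mean and standard deviation.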

Choice of Test

Now, let us explore the height data vector \(X\) and record the necessary components: the sample size \(n\), the sample mean \(\bar{X}\), and the sample standard deviation \(s\).
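The components needed for the test statistic could be recorded along the following lines. The vector `X` here is simulated purely for illustration, and the object names `x.bar` and `s` are assumptions rather than names taken from the lab.

```r
# Simulated heights (inches), for illustration only; not the survey data.
set.seed(1)
X <- rnorm(40, mean = 65, sd = 3)

n     <- length(X)  # sample size
x.bar <- mean(X)    # sample mean
s     <- sd(X)      # sample standard deviation

c(n = n, x.bar = x.bar, s = s)
```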

Here, the sample size is large (\(n\ge 30\)), so we can use the \(z\) test statistic: \[ Z^*=\frac{\bar{X}-\mu_0}{\sigma/\sqrt{n}} \] Because we are working with a large sample, the unknown population standard deviation \(\sigma\) is replaced by its sample counterpart \(s\), which we calculated earlier.
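The formula translates directly into R. The summary values below are hypothetical round numbers chosen so the arithmetic is easy to follow; they are not the survey results.

```r
# Hypothetical summary values, for illustration only.
n     <- 36    # sample size (large, n >= 30)
x.bar <- 65    # sample mean
s     <- 3     # sample standard deviation, standing in for sigma
mu0   <- 64.5  # reference value from the hypotheses

# z test statistic: (sample mean - mu0) / standard error
z.star <- (x.bar - mu0) / (s / sqrt(n))
z.star
```

With these numbers the standard error is \(3/\sqrt{36}=0.5\), so the statistic is \(0.5/0.5=1\).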

p-value

Once we have the test statistic, the next step is to calculate the corresponding p-value.
In a z-test, the p-value is essentially one of the tail-probabilities from the standard normal distribution. Since this test is a right-tailed test, we calculate the right-tail probability. We could use a \(z\)-table; however, in R, the function \(\texttt{\color{brown}{pnorm()}}\) is used to compute the left-tail probability, so \[ \text{p-value}=P[Z>z^*]=1-P[Z\le z^*]\stackrel{\textsf{R}}{=}\texttt{\color{brown}{1-pnorm(z.star)}}. \]
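A short sketch of the right-tail computation, using a hypothetical test statistic of 1 rather than the value from the survey. The `lower.tail = FALSE` argument of `pnorm()` returns the right-tail probability directly and gives the same result as subtracting from 1.

```r
z.star <- 1  # hypothetical test statistic, for illustration only

# Right-tail probability as 1 minus the left-tail probability.
p.value <- 1 - pnorm(z.star)

# Equivalent: ask pnorm() for the upper tail directly.
p.alt <- pnorm(z.star, lower.tail = FALSE)

c(p.value = p.value, p.alt = p.alt)
```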

Note that a small p-value indicates strong evidence against the null hypothesis in favor of the alternative hypothesis, allowing us to reject the null hypothesis. But how small is small enough? This is determined by comparing the p-value to the significance level \(\alpha\): we reject \(H_0\) when the p-value is at most \(\alpha\) (here, 0.1), and fail to reject \(H_0\) otherwise.
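The comparison can be written as a one-line decision rule. The p-value below is a hypothetical number used only to show the mechanics, and the object name `decision` is an assumption.

```r
alpha   <- 0.1     # significance level from the hypotheses set-up
p.value <- 0.1587  # hypothetical p-value, for illustration only

# Reject H0 when the p-value does not exceed the significance level.
decision <- if (p.value <= alpha) "Reject H0" else "Fail to reject H0"
decision
```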

Conclusion and Interpretation

In this case, we fail to reject the null hypothesis. The interpretation should be stated in the context of the problem:
“Given the significance level 0.1, there is not enough evidence that the population average height of all Lehigh female students is greater than 64.5 inches.”

Small Sample Case

Now, let us consider a slightly different population parameter: the population average height of all Lehigh female students whose favorite professional sports league is the NBA. For this target parameter, we wish to perform exactly the same right-tailed test. Let us explore the data.

Note that the sample size of the new data set is small (\(n<30\)). In this case, we have to use the \(t\)-test. (Let us presume the population distribution is normal to satisfy the assumption.) \[ T^*=\frac{\bar{X}-\mu_0}{s/\sqrt{n}} \]
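The \(t\) statistic has the same form as the \(z\) statistic, with \(s\) in the denominator. The summary values below are hypothetical and chosen only to keep the arithmetic simple.

```r
# Hypothetical summary values for a small sample, for illustration only.
n     <- 16    # sample size (small, n < 30)
x.bar <- 66    # sample mean
s     <- 3     # sample standard deviation
mu0   <- 64.5  # reference value from the hypotheses

# t test statistic: (sample mean - mu0) / estimated standard error
t.star <- (x.bar - mu0) / (s / sqrt(n))
t.star
```

Here the estimated standard error is \(3/\sqrt{16}=0.75\), so the statistic is \(1.5/0.75=2\).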

For the p-value calculation, we now need to compute one of the tail probabilities of the t-distribution with the corresponding \(n-1\) degrees of freedom. The t-table we use in class does not provide the exact probability, but the R function \(\texttt{\color{brown}{pt()}}\) can be used for the precise computation. Since this is still a right-tailed test: \[ \text{p-value}=P[T>t^*]=1-P[T\le t^*]\stackrel{\textsf{R}}{=}\texttt{\color{brown}{1-pt(t.star,n-1)}}. \]
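The sketch below computes the right-tailed p-value by hand and cross-checks it against R's built-in `t.test()` with `alternative = "greater"`. The height vector is a hypothetical small sample, not the actual subgroup data.

```r
# Hypothetical small sample of heights (inches); not the survey data.
X   <- c(66, 67.5, 64, 68, 65.5, 69, 66.5, 63.5, 67, 68.5)
mu0 <- 64.5

n      <- length(X)
t.star <- (mean(X) - mu0) / (sd(X) / sqrt(n))

# Right-tail probability from the t-distribution with n - 1 degrees of freedom.
p.value <- 1 - pt(t.star, n - 1)

# Cross-check: t.test() with alternative = "greater" runs the same
# right-tailed one-sample test.
check <- t.test(X, mu = mu0, alternative = "greater")
all.equal(p.value, check$p.value)
```

Computing the statistic by hand mirrors the steps in the lab, while `t.test()` is a convenient way to confirm the result.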

Note that we can reject the null hypothesis in this subgroup analysis. Therefore,
“Given the significance level 0.1, there is enough evidence that the population average height of all Lehigh female students whose favorite professional sports league is the NBA is greater than 64.5 inches.”

Lab Questions

Let us consider the population average height \(\mu\) of all Lehigh male students. In particular, we are interested in whether the average height of the male students is less than 70 inches and wish to statistically confirm this claim with a 0.1 significance level.

  1. Choose the right form of the null and alternative hypotheses.
     1. \(H_0:\; \mu \le 70 \;\;\text{vs.}\;\;H_a:\; \mu > 70.\)
     2. \(H_0:\; \mu \ge 70 \;\;\text{vs.}\;\;H_a:\; \mu < 70.\)
     3. \(H_0:\; \mu < 70 \;\;\text{vs.}\;\;H_a:\; \mu \ge 70.\)
     4. \(H_0:\; \mu > 70 \;\;\text{vs.}\;\;H_a:\; \mu \le 70.\)
  2. Subset the relevant sample data and record the necessary components.
     What would be your choice of test? (Determine it solely based on the sample size.)
     1. z-test
     2. t-test
  3. Compute the test statistic and calculate the p-value.
  4. Make a conclusion and choose the correct interpretation.
     1. Reject \(H_0\). There is enough evidence that the population average height of all Lehigh male students is less than 70.
     2. Reject \(H_0\). There is not enough evidence that the population average height of all Lehigh male students is less than 70.
     3. Fail to reject \(H_0\). There is enough evidence that the population average height of all Lehigh male students is less than 70.
     4. Fail to reject \(H_0\). There is not enough evidence that the population average height of all Lehigh male students is less than 70.

Click HERE to submit your answers.