Day 31
Math 216: Statistical Thinking
Review: Mean vs. Proportion
- Previously studied: Confidence intervals (C.I.) and hypothesis tests for population means.
- Non-parametric tests were focused on population medians.
Transition to Proportions
- Now focusing on population proportion (e.g., “What percentage of voters favor candidate A?”)
- The approach parallels inference for means: the sample proportion \(\hat{p} = x/n\) is a mean of 0/1 outcomes, so the Central Limit Theorem applies and we work with the sampling distribution of \(\hat{p}\).
Properties of \(\hat{p}\)
- Mean: The expected value (mean) of \(\hat{p}\) is equal to the true population proportion \(p\). \[
E(\hat{p}) = p
\]
- Standard Deviation: The exact value is \(\sigma_{\hat{p}} = \sqrt{p(1-p)/n}\). Since \(p\) is unknown, for large samples it is approximated by \[
\sigma_{\hat{p}} \approx \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}
\]
- Distribution: Becomes approximately normal with large sample sizes.
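The first two properties can be checked numerically. The sketch below assumes the number of successes follows a Binomial(\(n\), \(p\)) distribution and computes the exact mean and standard deviation of \(\hat{p}\) from its probability mass function. The function name `phat_mean_and_sd` and the values \(n = 100\), \(p = 0.3\) are illustrative choices, not part of the course material.

```python
from math import comb, sqrt


def phat_mean_and_sd(n, p):
    """Exact mean and SD of p-hat = X/n, assuming X ~ Binomial(n, p)."""
    # Binomial probabilities P(X = k) for k = 0, ..., n
    pmf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
    mean = sum((k / n) * w for k, w in enumerate(pmf))
    var = sum((k / n - mean) ** 2 * w for k, w in enumerate(pmf))
    return mean, sqrt(var)


n, p = 100, 0.3  # illustrative values only
mean, sd = phat_mean_and_sd(n, p)
formula_sd = sqrt(p * (1 - p) / n)
print(mean, sd, formula_sd)  # mean matches p; sd matches sqrt(p(1-p)/n)
```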
Large-Sample Hypothesis Testing
- Test Statistic: Normal (z) statistic for \(H_0: p = p_0\): \[
z_c = \frac{(\hat{p} - p_0)}{\sqrt{\frac{p_0(1-p_0)}{n}}}
\]
- Decision Rule: Reject \(H_0\) if p-value \(< \alpha\), or if \(z_c\) falls into the rejection region.
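As a minimal sketch of the test statistic and p-value, the code below uses only the Python standard library (`statistics.NormalDist` for the standard normal distribution). The function name `proportion_z_test`, the `alternative` labels, and the data (60 successes in 100 trials, \(p_0 = 0.5\)) are hypothetical, chosen for illustration.

```python
from math import sqrt
from statistics import NormalDist


def proportion_z_test(successes, n, p0, alternative="two-sided"):
    """Large-sample z test of H0: p = p0; returns (z_c, p_value)."""
    p_hat = successes / n
    # The standard error uses p0, because the statistic is computed under H0
    z_c = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    std_normal = NormalDist()
    if alternative == "less":
        p_value = std_normal.cdf(z_c)
    elif alternative == "greater":
        p_value = 1 - std_normal.cdf(z_c)
    elif alternative == "two-sided":
        p_value = 2 * (1 - std_normal.cdf(abs(z_c)))
    else:
        raise ValueError("alternative must be 'less', 'greater', or 'two-sided'")
    return z_c, p_value


# Hypothetical data: 60 of 100 sampled voters favor candidate A; H_a: p > 0.5
z_c, p_value = proportion_z_test(successes=60, n=100, p0=0.5, alternative="greater")
alpha = 0.05
print(z_c, p_value, p_value < alpha)
```

With these made-up numbers \(z_c\) is about 2, so the p-value falls below \(\alpha = 0.05\) and \(H_0\) would be rejected.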
Hypothesis Test Types
- One-Tailed Test:
  - \(H_a: p < p_0\) (lower-tailed) or \(H_a: p > p_0\) (upper-tailed)
  - Rejection region: \(z_c < -z_{\alpha}\) for the lower-tailed test; \(z_c > z_{\alpha}\) for the upper-tailed test
- Two-Tailed Test:
  - \(H_a: p \neq p_0\)
  - Rejection region: \(z_c < -z_{\alpha/2}\) or \(z_c > z_{\alpha/2}\), i.e. \(|z_c| > z_{\alpha/2}\)
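The rejection regions above can be sketched in code by looking up the critical value with `statistics.NormalDist().inv_cdf`. The function name `in_rejection_region` and the `alternative` labels are assumptions made for this example.

```python
from statistics import NormalDist


def in_rejection_region(z_c, alpha, alternative):
    """Return True if z_c falls in the rejection region at level alpha."""
    std_normal = NormalDist()
    if alternative == "less":        # H_a: p < p0
        return z_c < -std_normal.inv_cdf(1 - alpha)
    if alternative == "greater":     # H_a: p > p0
        return z_c > std_normal.inv_cdf(1 - alpha)
    if alternative == "two-sided":   # H_a: p != p0
        return abs(z_c) > std_normal.inv_cdf(1 - alpha / 2)
    raise ValueError("alternative must be 'less', 'greater', or 'two-sided'")


# The same statistic can lead to different decisions:
# z_0.05 is about 1.645, while z_0.025 is about 1.96.
print(in_rejection_region(1.8, 0.05, "greater"))    # True
print(in_rejection_region(1.8, 0.05, "two-sided"))  # False
```

The example shows why the alternative hypothesis should be fixed before looking at the data: a statistic of 1.8 rejects in the upper-tailed test but not in the two-tailed test at the same \(\alpha\).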
Conditions for Testing
- Random sample from the population.
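- Large enough sample size. For a test, the check uses the null value: typically \(np_0 \geq 15\) and \(n(1-p_0) \geq 15\). For a confidence interval, use \(n\hat{p} \geq 15\) and \(n\hat{q} \geq 15\), where \(\hat{q} = 1 - \hat{p}\).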
Confidence Interval for \(p\)
- Large-sample confidence interval: \[
\hat{p} \pm z_{\alpha/2} \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}
\]
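- Valid when \(n\) is large enough for the normal approximation to hold, i.e. when both the number of successes \(n\hat{p}\) and the number of failures \(n(1-\hat{p})\) are at least 15.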
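A minimal sketch of the large-sample interval, using only the standard library. The function name `proportion_ci` and the data (60 successes in 100 trials) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(successes, n, confidence=0.95):
    """Large-sample (Wald) confidence interval for a population proportion."""
    p_hat = successes / n
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # critical value z_{alpha/2}
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# Hypothetical data: 60 successes in a random sample of 100
low, high = proportion_ci(successes=60, n=100, confidence=0.95)
print(round(low, 3), round(high, 3))  # roughly 0.504 and 0.696
```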
Sample Size Determination
- To estimate \(p\) to within a chosen sampling error (SE, the half-width of the interval) at confidence level \(1-\alpha\), the required sample size is \[
n = \frac{(z_{\alpha/2})^2 \hat{p}(1-\hat{p})}{SE^2}
\]
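- Before sampling, \(\hat{p}\) is a planning value (e.g., from a prior study). When none is available, the conservative choice \(\hat{p} = 0.5\) maximizes \(\hat{p}(1-\hat{p})\) and therefore gives the largest required \(n\). Round \(n\) up to the next whole number.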
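The sample-size formula can be sketched as follows. The function name `required_sample_size`, its parameter names, and the target sampling error of 0.03 are illustrative assumptions. The default `p_guess=0.5` corresponds to the conservative choice.

```python
from math import ceil
from statistics import NormalDist


def required_sample_size(se, confidence=0.95, p_guess=0.5):
    """Smallest n whose large-sample margin of error is at most `se`."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # z_{alpha/2}
    # Round up: a fractional observation is not possible
    return ceil(z**2 * p_guess * (1 - p_guess) / se**2)


# Conservative planning (no prior estimate of p), SE = 0.03, 95% confidence
n_conservative = required_sample_size(se=0.03)
# With a hypothetical prior guess of p = 0.2 the requirement is smaller
n_with_guess = required_sample_size(se=0.03, p_guess=0.2)
print(n_conservative, n_with_guess)  # 1068 683
```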