Day 28

Math 216: Statistical Thinking

Bastola

Effectiveness of two training programs: Paired Data

Pair  Method A  Method B
   1        85        83
   2        88        89
   3        90        87
   4        92        84
   5        91        92
   6        89        90
   7        93        85
   8        95        91
   9        96        98
  10        97        94
  11        98       100
  12        99       101
  13       100        99
  14       101       111
  15       102       111
  16       103       106
  17       104       109
  18       105       103
  19       106       111
  20       107       114

Comparing Two Population Means: Paired Difference

  • Objective: Assess the effectiveness of two training programs using paired observations
  • Key Insight: Analyze differences within pairs (\(d_i = A_i - B_i\)) to control for individual variability
  • Design: Repeated measures from same participants or matched pairs

Conceptual Foundation

  • Dependency: Observations are naturally linked (same subject, matched characteristics)
  • Advantage: Reduces variability by eliminating between-subject differences
  • Data Structure: Focuses on pairwise differences rather than raw scores
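
To illustrate the variance-reduction point above, the following sketch compares the standard error of the mean difference when the pairing is used against the standard error obtained if the two columns were (incorrectly) treated as independent samples. The variable names `se_paired` and `se_unpaired` are illustrative choices, not part of the original slides.

```r
# Scores from the training-program table
methodA <- c(85, 88, 90, 92, 91, 89, 93, 95, 96, 97, 98,
             99, 100, 101, 102, 103, 104, 105, 106, 107)
methodB <- c(83, 89, 87, 84, 92, 90, 85, 91, 98, 94, 100,
             101, 99, 111, 111, 106, 109, 103, 111, 114)
n <- length(methodA)

# Standard error using the pairing: based on within-pair differences
se_paired <- sd(methodA - methodB) / sqrt(n)

# Standard error ignoring the pairing (two independent samples, Welch form)
se_unpaired <- sqrt(var(methodA) / n + var(methodB) / n)

c(paired = se_paired, unpaired = se_unpaired)
```

On these data the paired standard error is the smaller of the two, because scores within a pair are positively correlated and that shared variation cancels in the differences.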

Hypothesis Framework

Let \(\mu_d = \mu_A - \mu_B\) denote the population mean difference:

  • Null Hypothesis: \(H_0: \mu_d = 0\)
    (No difference between methods)
  • One-Sided Alternative: \(H_a: \mu_d > 0\)
    (Method A produces higher scores than Method B)
  • Two-Sided Alternative: \(H_a: \mu_d \neq 0\)
    (The methods differ in either direction)
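
As a rough sketch of how these hypotheses map onto R, the `alternative` argument of `t.test()` selects the one-sided or two-sided form. The object names `one_sided` and `two_sided` are illustrative.

```r
methodA <- c(85, 88, 90, 92, 91, 89, 93, 95, 96, 97, 98,
             99, 100, 101, 102, 103, 104, 105, 106, 107)
methodB <- c(83, 89, 87, 84, 92, 90, 85, 91, 98, 94, 100,
             101, 99, 111, 111, 106, 109, 103, 111, 114)

# H_a: mu_d > 0 (Method A higher than Method B)
one_sided <- t.test(methodA, methodB, paired = TRUE, alternative = "greater")

# H_a: mu_d != 0 (any difference)
two_sided <- t.test(methodA, methodB, paired = TRUE, alternative = "two.sided")

# Same t statistic, different p-values
c(one_sided = one_sided$p.value, two_sided = two_sided$p.value)
```

Both calls use the same test statistic; the two-sided p-value is twice the smaller tail probability.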

General Testing Procedure

  1. Calculate Differences: \(d_i = A_i - B_i\) for each pair

  2. Compute Summary Statistics:

    • Mean difference: \(\bar{d} = \frac{\sum d_i}{n}\)
    • Standard deviation: \(s_d = \sqrt{\frac{\sum (d_i - \bar{d})^2}{n-1}}\)
  3. Check Conditions:

    • Normality of differences (QQ-plot, or a formal test such as Shapiro-Wilk or Anderson-Darling)
    • Random sampling/assignment, with pairs independent of one another
  4. Select Test Statistic:

    \[t = \frac{\bar{d} - \mu_{d0}}{s_d/\sqrt{n}} \quad \text{with } df = n-1\]

    Where \(\mu_{d0}\) is the hypothesized mean difference (0 under \(H_0\))

  5. Make Decision:

    • Compare p-value to \(\alpha\) (typically 0.05)
    • Interpret confidence interval for \(\mu_d\)
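
The steps above can be sketched by hand in R for the one-sided alternative \(H_a: \mu_d > 0\). This is a minimal illustration under the assumption that \(\mu_{d0} = 0\); the names `d_bar`, `s_d`, `t_stat`, and `p_value` are illustrative.

```r
methodA <- c(85, 88, 90, 92, 91, 89, 93, 95, 96, 97, 98,
             99, 100, 101, 102, 103, 104, 105, 106, 107)
methodB <- c(83, 89, 87, 84, 92, 90, 85, 91, 98, 94, 100,
             101, 99, 111, 111, 106, 109, 103, 111, 114)

# Step 1: differences within each pair
d <- methodA - methodB
n <- length(d)

# Step 2: summary statistics
d_bar <- mean(d)   # mean difference
s_d   <- sd(d)     # standard deviation of the differences (n - 1 denominator)

# Step 4: test statistic under H_0: mu_d = 0, with n - 1 degrees of freedom
t_stat <- (d_bar - 0) / (s_d / sqrt(n))

# Step 5: one-sided p-value for H_a: mu_d > 0 (upper tail)
p_value <- pt(t_stat, df = n - 1, lower.tail = FALSE)

c(d_bar = d_bar, s_d = s_d, t = t_stat, p = p_value)
```

These hand-computed values should agree with what `t.test(methodA, methodB, paired = TRUE, alternative = "greater")` reports.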

Interpretation Guidance

  • Significant Result: Reject \(H_0\) if p-value < \(\alpha\)
    • Example wording with illustrative numbers: “Evidence suggests Method A outperforms Method B (t(19) = 2.15, one-sided p = 0.022)”
  • Nonsignificant Result: Fail to reject \(H_0\)
    • “No statistically significant difference detected” (this is not evidence that the methods are equivalent)
  • Always Report:
    • Effect size (mean difference)
    • Confidence interval
    • Practical significance
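
One possible way to assemble the reported quantities is to pull them from the object that `t.test()` returns. The sketch below is illustrative; the `report` string format is an assumption about how one might word the result, not a required convention.

```r
methodA <- c(85, 88, 90, 92, 91, 89, 93, 95, 96, 97, 98,
             99, 100, 101, 102, 103, 104, 105, 106, 107)
methodB <- c(83, 89, 87, 84, 92, 90, 85, 91, 98, 94, 100,
             101, 99, 111, 111, 106, 109, 103, 111, 114)

res <- t.test(methodA, methodB, paired = TRUE, alternative = "greater")

# estimate = mean difference, parameter = df, statistic = t, p.value = p-value
report <- sprintf("mean difference = %.2f, t(%d) = %.2f, p = %.3f",
                  res$estimate, as.integer(res$parameter),
                  res$statistic, res$p.value)
report

# The confidence interval is stored separately
res$conf.int
```

Whether a mean difference of this size matters in practice is a judgment about the training programs themselves, not something the test output can decide.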

Selecting Appropriate Statistical Tests

  • When the differences are approximately normal: Apply the paired \(t\)-test.
  • When the differences are clearly non-normal (especially with small \(n\)): Use a non-parametric method that does not assume normality, such as the Wilcoxon signed-rank test.
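
One common non-parametric option for paired data is the Wilcoxon signed-rank test, available in base R as `wilcox.test()`. The slides do not name a specific method, so treat this choice as an assumption. Because the differences contain ties, `exact = FALSE` requests the normal approximation explicitly.

```r
methodA <- c(85, 88, 90, 92, 91, 89, 93, 95, 96, 97, 98,
             99, 100, 101, 102, 103, 104, 105, 106, 107)
methodB <- c(83, 89, 87, 84, 92, 90, 85, 91, 98, 94, 100,
             101, 99, 111, 111, 106, 109, 103, 111, 114)

# Wilcoxon signed-rank test on the paired differences, one-sided (A greater than B)
w <- wilcox.test(methodA, methodB, paired = TRUE,
                 alternative = "greater", exact = FALSE)
w
```

The test ranks the absolute differences and compares the rank sums of positive and negative differences, so it relies on symmetry of the differences rather than normality.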

Connection to Confidence Intervals

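A \(100(1-\alpha)\%\) CI for \(\mu_d\) (95% when \(\alpha = 0.05\)) is constructed as:

\[\bar{d} \pm t^*_{\alpha/2} \frac{s_d}{\sqrt{n}}\]

where \(t^*_{\alpha/2}\) is the upper \(\alpha/2\) critical value of the \(t\)-distribution with \(n-1\) degrees of freedom.

  • Interpretation: “We are 95% confident the true mean difference lies between X and Y”
  • Decision Rule: If the two-sided 95% CI excludes 0, reject \(H_0\) in favor of \(H_a: \mu_d \neq 0\) at \(\alpha=0.05\)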
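
As a minimal sketch, the interval formula can be evaluated directly and checked against the interval that `t.test()` reports for a two-sided test. The names `t_star`, `margin`, `ci_manual`, and `ci_builtin` are illustrative.

```r
methodA <- c(85, 88, 90, 92, 91, 89, 93, 95, 96, 97, 98,
             99, 100, 101, 102, 103, 104, 105, 106, 107)
methodB <- c(83, 89, 87, 84, 92, 90, 85, 91, 98, 94, 100,
             101, 99, 111, 111, 106, 109, 103, 111, 114)

d <- methodA - methodB
n <- length(d)
alpha <- 0.05

# Critical value t* with n - 1 degrees of freedom
t_star <- qt(1 - alpha / 2, df = n - 1)

# Margin of error: t* times the standard error of the mean difference
margin <- t_star * sd(d) / sqrt(n)
ci_manual <- c(lower = mean(d) - margin, upper = mean(d) + margin)

# Same interval from the built-in two-sided test
ci_builtin <- t.test(d, conf.level = 1 - alpha)$conf.int

ci_manual
```

For these data the interval is roughly (-3.15, 1.45). It contains 0, so a two-sided test at \(\alpha = 0.05\) would not reject \(H_0\).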

Diagnostic Plots and R Code

# Load the plotting and data-frame packages used below
library(ggplot2)
library(tibble)

# Define the scores for Method A and Method B
methodA <- c(85, 88, 90, 92, 91, 89, 93, 95, 96, 97, 98,
             99, 100, 101, 102, 103, 104, 105, 106, 107)
methodB <- c(83, 89, 87, 84, 92, 90, 85, 91, 98, 94, 100,
             101, 99, 111, 111, 106, 109, 103, 111, 114)

# Calculate within-pair differences (A minus B)
differences <- methodA - methodB

# Generate a QQ plot for the normality check
qq_norm <- ggplot(data = tibble(differences), aes(sample = differences)) +
  stat_qq() + stat_qq_line() +
  ggtitle("QQ Plot of Differences")

# Generate a histogram for the normality check
histogram <- ggplot(data = tibble(differences), aes(x = differences)) +
  geom_histogram(bins = 10, color = "maroon", fill = "gold") +
  ggtitle("Histogram of Differences")

# Display the plots (assigning a ggplot object does not draw it)
qq_norm
histogram

Preliminary Tests in R

# Perform the Anderson-Darling test for normality
library(nortest)
ad.test(differences)

    Anderson-Darling normality test

data:  differences
A = 0.2269, p-value = 0.787

# p = 0.787 > 0.05: no evidence against normality of the differences

# Calculate the standard deviation of the differences (s_d)
sd(differences)
[1] 4.92336

# Two-sided 95% critical value from the t-distribution with n - 1 = 19 df
qt(0.975, df = 20 - 1)
[1] 2.093024
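
If the `nortest` package is not installed, the Shapiro-Wilk test mentioned in the testing procedure is available in base R. The sketch below is an alternative check, not the one used in these slides; the name `sw` is illustrative.

```r
methodA <- c(85, 88, 90, 92, 91, 89, 93, 95, 96, 97, 98,
             99, 100, 101, 102, 103, 104, 105, 106, 107)
methodB <- c(83, 89, 87, 84, 92, 90, 85, 91, 98, 94, 100,
             101, 99, 111, 111, 106, 109, 103, 111, 114)

d <- methodA - methodB

# Shapiro-Wilk normality test on the differences (base R, stats package)
sw <- shapiro.test(d)
sw
```

As with the Anderson-Darling test, a large p-value means the data give no evidence against normality; it does not prove the differences are normal.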

t.test for paired samples

# Perform the paired t-test
t.test(methodA, methodB, paired = TRUE, alternative = "greater")

    Paired t-test

data:  methodA and methodB
t = -0.7721, df = 19, p-value = 0.7752
alternative hypothesis: true mean difference is greater than 0
95 percent confidence interval:
 -2.753597       Inf
sample estimates:
mean difference 
          -0.85 

# Alternative 1: one-sample t-test on the differences via the formula interface
t.test(differences ~ 1, alternative = "greater", data = tibble(differences))

    One Sample t-test

data:  differences
t = -0.7721, df = 19, p-value = 0.7752
alternative hypothesis: true mean is greater than 0
95 percent confidence interval:
 -2.753597       Inf
sample estimates:
mean of x 
    -0.85 

# Alternative 2: one-sample t-test on the vector of differences
t.test(differences, alternative = "greater")

    One Sample t-test

data:  differences
t = -0.7721, df = 19, p-value = 0.7752
alternative hypothesis: true mean is greater than 0
95 percent confidence interval:
 -2.753597       Inf
sample estimates:
mean of x 
    -0.85
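
As a check on the output above, the test statistic can be reproduced by hand from the summary values already computed (\(\bar{d} = -0.85\), \(s_d = 4.92336\), \(n = 20\)):

\[t = \frac{\bar{d} - \mu_{d0}}{s_d/\sqrt{n}} = \frac{-0.85 - 0}{4.92336/\sqrt{20}} \approx \frac{-0.85}{1.1009} \approx -0.772 \quad \text{with } df = 19\]

Since the p-value of 0.7752 is well above \(\alpha = 0.05\), we fail to reject \(H_0\): these data provide no evidence that Method A produces higher scores than Method B. All three calls give identical results because a paired \(t\)-test is a one-sample \(t\)-test applied to the differences.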