Day 28

Math 216: Statistical Thinking

Bastola

Effectiveness of two training programs: Paired Data

Pair	Method A	Method B
1	85	83
2	88	89
3	90	87
4	92	84
5	91	92
6	89	90
7	93	85
8	95	91
9	96	98
10	97	94
11	98	100
12	99	101
13	100	99
14	101	111
15	102	111
16	103	106
17	104	109
18	105	103
19	106	111
20	107	114

Comparing Two Population Means: Paired Difference

Objective: Assess the effectiveness of two training programs using paired observations
Key Insight: Analyze differences within pairs (\(d_i = A_i - B_i\)) to control for individual variability
Design: Repeated measures from same participants or matched pairs

Conceptual Foundation

Dependency: Observations are naturally linked (same subject, matched characteristics)
Advantage: Reduces variability by eliminating between-subject differences
Data Structure: Focuses on pairwise differences rather than raw scores
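
As a rough illustration of the variability reduction described above, the sketch below compares the standard error used in the paired analysis with the one an unpaired (Welch-style) analysis would use, on the Method A and Method B scores from the table. The variable names (`se_paired`, `se_unpaired`) are illustrative and not part of the original slides.

```r
# Scores from the table above
methodA <- c(85, 88, 90, 92, 91, 89, 93, 95, 96, 97, 98,
             99, 100, 101, 102, 103, 104, 105, 106, 107)
methodB <- c(83, 89, 87, 84, 92, 90, 85, 91, 98, 94, 100,
             101, 99, 111, 111, 106, 109, 103, 111, 114)

n <- length(methodA)
d <- methodA - methodB                      # within-pair differences

# Standard error of the mean difference when pairing is used
se_paired <- sd(d) / sqrt(n)

# Standard error if the two columns were (wrongly) treated as independent samples
se_unpaired <- sqrt(var(methodA) / n + var(methodB) / n)

c(paired = se_paired, unpaired = se_unpaired)
```

Because the two columns are strongly positively correlated, the paired standard error comes out smaller than the unpaired one for these data.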

Hypothesis Framework

Let \(\mu_d = \mu_A - \mu_B\) denote the population mean difference:

Null Hypothesis: \(H_0: \mu_d = 0\)
(No difference between methods)
Alternative Hypothesis: \(H_a: \mu_d > 0\)
(Method A produces higher scores than Method B)
Two-Sided Alternative: \(H_a: \mu_d \neq 0\) if testing for any difference
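
A small sketch of how the choice of alternative hypothesis changes the p-value, assuming the Method A and Method B scores from the table. It relies only on the `alternative` argument of base R's `t.test`; the helper vector `alternatives` is an illustrative name.

```r
methodA <- c(85, 88, 90, 92, 91, 89, 93, 95, 96, 97, 98,
             99, 100, 101, 102, 103, 104, 105, 106, 107)
methodB <- c(83, 89, 87, 84, 92, 90, 85, 91, 98, 94, 100,
             101, 99, 111, 111, 106, 109, 103, 111, 114)
d <- methodA - methodB

# One p-value per alternative: mu_d > 0, mu_d < 0, mu_d != 0
alternatives <- c("greater", "less", "two.sided")
p_values <- sapply(alternatives, function(alt) {
  t.test(d, mu = 0, alternative = alt)$p.value
})

round(p_values, 4)
```

The two one-sided p-values sum to 1, and the two-sided p-value is twice the smaller of the two.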

General Testing Procedure

Calculate Differences: \(d_i = A_i - B_i\) for each pair
Compute Summary Statistics:
- Mean difference: \(\bar{d} = \frac{\sum d_i}{n}\)
- Standard deviation: \(s_d = \sqrt{\frac{\sum (d_i - \bar{d})^2}{n-1}}\)
Check Conditions:
- Normality of differences (QQ-plot or Shapiro-Wilk)
- Random sampling/assignment
Select Test Statistic:

\[t = \frac{\bar{d} - \mu_{d0}}{s_d/\sqrt{n}} \quad \text{with } df = n-1\]

Where \(\mu_{d0}\) is the hypothesized mean difference (0 under \(H_0\))
Make Decision:
- Compare p-value to \(\alpha\) (typically 0.05)
- Interpret confidence interval for \(\mu_d\)
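
The steps above can be sketched by hand in base R, assuming the scores from the table and the one-sided alternative \(H_a: \mu_d > 0\). Names such as `d_bar`, `s_d`, and `t_stat` are chosen here to mirror the formulas and are not from the original code.

```r
methodA <- c(85, 88, 90, 92, 91, 89, 93, 95, 96, 97, 98,
             99, 100, 101, 102, 103, 104, 105, 106, 107)
methodB <- c(83, 89, 87, 84, 92, 90, 85, 91, 98, 94, 100,
             101, 99, 111, 111, 106, 109, 103, 111, 114)

d <- methodA - methodB            # step 1: differences d_i = A_i - B_i
n <- length(d)

d_bar <- mean(d)                  # step 2: mean difference
s_d   <- sd(d)                    #         standard deviation (n - 1 denominator)

mu_d0  <- 0                       # hypothesized mean difference under H0
t_stat <- (d_bar - mu_d0) / (s_d / sqrt(n))   # step 4: test statistic, df = n - 1

# Upper-tail p-value for H_a: mu_d > 0
p_value <- pt(t_stat, df = n - 1, lower.tail = FALSE)

round(c(d_bar = d_bar, s_d = s_d, t = t_stat, p = p_value), 4)
```

These hand-computed values should agree with what `t.test(..., paired = TRUE, alternative = "greater")` reports for the same data.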

Interpretation Guidance

Significant Result: Reject \(H_0\) if p-value < \(\alpha\)
- Example wording: “Evidence suggests Method A outperforms Method B (t(19) = 2.15, p = 0.022)”
Nonsignificant Result: Fail to reject \(H_0\)
- “No statistically significant difference detected”
Always Report:
- Effect size (mean difference)
- Confidence interval
- Practical significance
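
One possible way to assemble such a report from a fitted test, assuming the scores from the table and \(\alpha = 0.05\). The fields `estimate`, `statistic`, `parameter`, and `p.value` are standard components of the object returned by `t.test`; the names `decision` and `report` are illustrative.

```r
methodA <- c(85, 88, 90, 92, 91, 89, 93, 95, 96, 97, 98,
             99, 100, 101, 102, 103, 104, 105, 106, 107)
methodB <- c(83, 89, 87, 84, 92, 90, 85, 91, 98, 94, 100,
             101, 99, 111, 111, 106, 109, 103, 111, 114)

res   <- t.test(methodA, methodB, paired = TRUE, alternative = "greater")
alpha <- 0.05

# Decision rule: reject H0 only when the p-value falls below alpha
decision <- if (res$p.value < alpha) "reject H0" else "fail to reject H0"

# One-line summary: effect size (mean difference), test statistic, p-value, decision
report <- sprintf("mean difference = %.2f, t(%d) = %.2f, p = %.3f: %s",
                  res$estimate, as.integer(res$parameter),
                  res$statistic, res$p.value, decision)
print(report)

# The interval to report alongside the summary
res$conf.int
```

For these data the p-value exceeds 0.05, so the summary ends in "fail to reject H0".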

Selecting Appropriate Statistical Tests

For Approximately Normal Differences: Apply the paired \(t\)-test.
For Non-Normal Differences: Use a non-parametric method that does not assume normality, such as the Wilcoxon signed-rank test.
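
One common non-parametric choice for paired data is the Wilcoxon signed-rank test, available in base R as `wilcox.test`. The sketch below applies it to the scores from the table; `exact = FALSE` is set here as an assumption because the differences contain tied absolute values, for which an exact p-value is not available.

```r
methodA <- c(85, 88, 90, 92, 91, 89, 93, 95, 96, 97, 98,
             99, 100, 101, 102, 103, 104, 105, 106, 107)
methodB <- c(83, 89, 87, 84, 92, 90, 85, 91, 98, 94, 100,
             101, 99, 111, 111, 106, 109, 103, 111, 114)

# Signed-rank test on the paired differences, same one-sided alternative
# as the paired t-test; uses the normal approximation because of ties
res_w <- wilcox.test(methodA, methodB, paired = TRUE,
                     alternative = "greater", exact = FALSE)

res_w$statistic   # V: sum of the ranks of the positive differences
res_w$p.value
```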

Connection to Confidence Intervals

A 95% CI for \(\mu_d\) is constructed as:

\[\bar{d} \pm t^*_{\alpha/2} \frac{s_d}{\sqrt{n}}\]

Interpretation: “We are 95% confident the true mean difference lies between [X, Y]”
Decision Rule: If the two-sided 95% CI excludes 0, reject \(H_0\) in favor of \(H_a: \mu_d \neq 0\) at \(\alpha=0.05\)
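
A minimal sketch of the interval formula above, computed by hand and checked against `t.test`, assuming the scores from the table. The names `t_star`, `margin`, and `ci` are illustrative.

```r
methodA <- c(85, 88, 90, 92, 91, 89, 93, 95, 96, 97, 98,
             99, 100, 101, 102, 103, 104, 105, 106, 107)
methodB <- c(83, 89, 87, 84, 92, 90, 85, 91, 98, 94, 100,
             101, 99, 111, 111, 106, 109, 103, 111, 114)
d <- methodA - methodB
n <- length(d)

t_star <- qt(0.975, df = n - 1)          # critical value for a two-sided 95% CI
margin <- t_star * sd(d) / sqrt(n)       # margin of error
ci     <- mean(d) + c(-1, 1) * margin    # d_bar -/+ margin

ci

# The default (two-sided) t.test on the differences gives the same interval
t.test(d)$conf.int
```

Here the interval contains 0, which is consistent with failing to reject \(H_0\) in a two-sided test at \(\alpha = 0.05\).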

Diagnostic Plots and R Code

# Load the plotting and data-frame packages used below
library(ggplot2)
library(tibble)

# Define the scores for Method A and Method B
methodA <- c(85, 88, 90, 92, 91, 89, 93, 95, 96, 97, 98,
             99, 100, 101, 102, 103, 104, 105, 106, 107)
methodB <- c(83, 89, 87, 84, 92, 90, 85, 91, 98, 94, 100,
             101, 99, 111, 111, 106, 109, 103, 111, 114)

# Calculate differences (A minus B, matching d_i = A_i - B_i)
differences <- methodA - methodB

# Generate a QQ plot for normality check
qq_norm <- ggplot(data = tibble(differences), aes(sample = differences)) +
  stat_qq() + stat_qq_line() +
  ggtitle("QQ Plot of Differences")

# Generate a histogram for normality check
histogram <- ggplot(data = tibble(differences), aes(x = differences)) +
  geom_histogram(bins = 10, color = "maroon", fill = "gold") +
  ggtitle("Histogram of Differences")

# Display both plots
qq_norm
histogram

Preliminary Tests in R

# Perform the Anderson-Darling test for normality
library(nortest)
ad.test(differences)


    Anderson-Darling normality test

data:  differences
A = 0.2269, p-value = 0.787
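
The Shapiro-Wilk test mentioned under the conditions is available in base R as `shapiro.test`, so it can be run without an extra package. A minimal sketch on the same differences (the name `sw` is illustrative):

```r
methodA <- c(85, 88, 90, 92, 91, 89, 93, 95, 96, 97, 98,
             99, 100, 101, 102, 103, 104, 105, 106, 107)
methodB <- c(83, 89, 87, 84, 92, 90, 85, 91, 98, 94, 100,
             101, 99, 111, 111, 106, 109, 103, 111, 114)
differences <- methodA - methodB

# H0: the differences come from a normal distribution
sw <- shapiro.test(differences)

sw$statistic   # W statistic
sw$p.value     # a large p-value gives no evidence against normality
```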

# Calculate standard deviation of differences
sd(differences) # s_d

[1] 4.92336

# Calculate the critical value t* for a two-sided 95% interval (df = n - 1)
qt(0.975, df = 20 - 1) # critical value

[1] 2.093024

`t.test` for paired samples

# Perform the paired t-test
t.test(methodA, methodB, paired = TRUE, alternative = "greater")


    Paired t-test

data:  methodA and methodB
t = -0.7721, df = 19, p-value = 0.7752
alternative hypothesis: true mean difference is greater than 0
95 percent confidence interval:
 -2.753597       Inf
sample estimates:
mean difference 
          -0.85
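
Because the test is one-sided (`alternative = "greater"`), the reported interval is a one-sided 95% lower confidence bound, which is why its upper end is `Inf`. The sketch below reproduces that bound by hand using the 0.95 quantile rather than 0.975; `t_star_one` and `lower` are illustrative names.

```r
methodA <- c(85, 88, 90, 92, 91, 89, 93, 95, 96, 97, 98,
             99, 100, 101, 102, 103, 104, 105, 106, 107)
methodB <- c(83, 89, 87, 84, 92, 90, 85, 91, 98, 94, 100,
             101, 99, 111, 111, 106, 109, 103, 111, 114)
d <- methodA - methodB
n <- length(d)

t_star_one <- qt(0.95, df = n - 1)                   # one-sided critical value
lower <- mean(d) - t_star_one * sd(d) / sqrt(n)      # lower confidence bound

c(lower = lower, upper = Inf)
```

The lower bound is below 0, so the one-sided interval does not rule out \(\mu_d \le 0\), in line with the large p-value.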

t.test(differences~1, alternative = "greater", data = tibble(differences)) # alternate 1


    One Sample t-test

data:  differences
t = -0.7721, df = 19, p-value = 0.7752
alternative hypothesis: true mean is greater than 0
95 percent confidence interval:
 -2.753597       Inf
sample estimates:
mean of x 
    -0.85

t.test(differences, alternative = "greater") # alternate 2


    One Sample t-test

data:  differences
t = -0.7721, df = 19, p-value = 0.7752
alternative hypothesis: true mean is greater than 0
95 percent confidence interval:
 -2.753597       Inf
sample estimates:
mean of x 
    -0.85
