Lesson 21: Statistical Investigation Lab

In honor of Thayer Time

Lesson Administration

Calendar

Day 1

Day 2

SIL 1

Exploration Exercise 7.2

EE7.2 due 0700 23 Oct

SIL 2

  • Today is SIL 2

WPR 2

  • Next Lesson

Resources

To Help Prepare

Exploration Exercise 2.3
Exploration Exercise 3.2
Board Problems on each lesson, 11-20
Coursewide review to be published in the coming days

For WPR

  • 8x11 Note Sheet written by you front and back
  • Course Guide
  • Calculator
  • R/RStudio
  • Pen/Pencil
  • Positive Can Do Attitude
  • Water / Caffeine
  • No AI, no outside internet, no buddies, no website

Math 1 vs EECS

7-0

Running Review

Review: \(z\)-Tests for One Proportion

For all cases:

\(H_0:\ \pi = \pi_0\)

\[ z = \frac{\hat{p} - \pi_0}{\sqrt{\frac{\pi_0\,(1-\pi_0)}{n}}} \]

Alternative Hypothesis | Formula for \(p\)-value | R Code
\(H_A:\ \pi > \pi_0\) | \(p = 1 - \Phi(z)\) | p_val <- 1 - pnorm(z_stat)
\(H_A:\ \pi < \pi_0\) | \(p = \Phi(z)\) | p_val <- pnorm(z_stat)
\(H_A:\ \pi \neq \pi_0\) | \(p = 2 \cdot (1 - \Phi(|z|))\) | p_val <- 2 * (1 - pnorm(abs(z_stat)))

Where:

  • \(\hat{p} = R/n\) (sample proportion)
  • \(\pi_0\) = hypothesized proportion under \(H_0\)
  • \(\Phi(\cdot)\) = cumulative distribution function (CDF) of the standard normal distribution

Confidence Interval for \(\pi\) (one proportion)

\[ \hat{p} \;\pm\; z_{\,1-\alpha/2}\,\sqrt{\frac{\hat{p}\,(1-\hat{p})}{n}} \]

I am \(100(1 - \alpha)\%\) confident that the true population proportion \(\pi\) lies between \([\text{lower bound}, \text{upper bound}]\).
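The one-proportion recipe above can be wrapped in a small function. This is only a sketch: `one_prop_z` and its argument names are hypothetical, the counts in the usage line are made up, and it assumes the test's standard error is computed under \(H_0\) (using \(\pi_0\)), as in the smartphone solution later in this lesson.

```r
# Hypothetical helper (not part of the lesson): one-proportion z-test and CI from counts
one_prop_z <- function(x, n, pi0,
                       alternative = c("two.sided", "greater", "less"),
                       conf_level = 0.95) {
  alternative <- match.arg(alternative)
  p_hat  <- x / n
  z_stat <- (p_hat - pi0) / sqrt(pi0 * (1 - pi0) / n)    # SE under H0
  p_val  <- switch(alternative,
    greater   = 1 - pnorm(z_stat),
    less      = pnorm(z_stat),
    two.sided = 2 * (1 - pnorm(abs(z_stat))))
  alpha <- 1 - conf_level
  ME <- qnorm(1 - alpha / 2) * sqrt(p_hat * (1 - p_hat) / n)  # CI uses p_hat in the SE
  list(p_hat = p_hat, z = z_stat, p_value = p_val, ci = c(p_hat - ME, p_hat + ME))
}

# Made-up example: 56 successes in 100 trials, H0: pi = 0.5
res <- one_prop_z(x = 56, n = 100, pi0 = 0.5)
res$z
res$p_value
res$ci
```

Changing `alternative` to `"greater"` or `"less"` switches to the corresponding one-sided row of the table.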


Review: \(t\)-Tests for One Mean

For all cases:

\(H_0:\ \mu = \mu_0\)

\[ t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} \]

Alternative Hypothesis | Formula for \(p\)-value | R Code
\(H_A:\ \mu > \mu_0\) | \(p = 1 - F_{t,df}(t)\) | p_val <- 1 - pt(t_stat, df)
\(H_A:\ \mu < \mu_0\) | \(p = F_{t,df}(t)\) | p_val <- pt(t_stat, df)
\(H_A:\ \mu \neq \mu_0\) | \(p = 2 \cdot (1 - F_{t,df}(|t|))\) | p_val <- 2 * (1 - pt(abs(t_stat), df))

Where:

  • \(\bar{x}\) = sample mean
  • \(\mu_0\) = hypothesized mean under \(H_0\)
  • \(s\) = sample standard deviation
  • \(n\) = sample size
  • \(df = n - 1\) (degrees of freedom)
  • \(F_{t,df}(\cdot)\) = CDF of Student’s \(t\) distribution with \(df\) degrees of freedom

Confidence Interval for \(\mu\) (one mean)

\[ \bar{x} \;\pm\; t_{\,1-\alpha/2,\;df}\,\frac{s}{\sqrt{n}}, \qquad df = n-1 \]

I am \(100(1 - \alpha)\%\) confident that the true population mean \((\mu)\) lies between \([\text{lower bound}, \text{upper bound}]\).
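When raw data are available, the formulas above can be checked against R's built-in `t.test()`. The sketch below uses a made-up data vector and a made-up null value; only the formulas come from this review.

```r
# Made-up measurements and null value, for illustration only
x   <- c(5.1, 4.8, 6.0, 5.5, 5.9, 4.7, 5.3, 6.2)
mu0 <- 5

n      <- length(x)
df     <- n - 1
t_stat <- (mean(x) - mu0) / (sd(x) / sqrt(n))
p_val  <- 1 - pt(t_stat, df)                      # H_A: mu > mu0

# 95% confidence interval from the formula
ci_manual <- mean(x) + c(-1, 1) * qt(0.975, df) * sd(x) / sqrt(n)

# Built-in equivalents
builtin_test <- t.test(x, mu = mu0, alternative = "greater")
builtin_ci   <- t.test(x, mu = mu0)$conf.int      # two-sided call gives the usual CI

c(manual = t_stat, builtin = unname(builtin_test$statistic))
c(manual = p_val,  builtin = builtin_test$p.value)
rbind(manual = ci_manual, builtin = as.numeric(builtin_ci))
```

Note that a one-sided `t.test()` call reports a one-sided confidence bound, which is why the interval is taken from a separate two-sided call.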


Review: \(z\)-Tests for Two Proportions

For all cases:

\(H_0:\ \pi_1 - \pi_2 = 0\)

\[ z \;=\; \frac{(\hat{p}_1 - \hat{p}_2) - (\pi_1 - \pi_2)}{\sqrt{\hat{p}(1-\hat{p})\left(\tfrac{1}{n_1} + \tfrac{1}{n_2}\right)}} \]

Where the pooled proportion is

\[ \hat{p} \;=\; \frac{x_1 + x_2}{n_1 + n_2}. \]

Alternative Hypothesis | Formula for \(p\)-value | R Code
\(H_A:\ \pi_1 - \pi_2 > 0\) | \(p = 1 - \Phi(z)\) | p_val <- 1 - pnorm(z_stat)
\(H_A:\ \pi_1 - \pi_2 < 0\) | \(p = \Phi(z)\) | p_val <- pnorm(z_stat)
\(H_A:\ \pi_1 - \pi_2 \neq 0\) | \(p = 2 \cdot (1 - \Phi(|z|))\) | p_val <- 2 * (1 - pnorm(abs(z_stat)))

Where:

  • \(\hat{p}_1 = x_1/n_1\) (sample proportion in group 1)
  • \(\hat{p}_2 = x_2/n_2\) (sample proportion in group 2)
  • \(\pi_1, \pi_2\) = true population proportions in groups 1 and 2 (assumed equal under \(H_0\))
  • \(\Phi(\cdot)\) = cumulative distribution function (CDF) of the standard normal distribution

Confidence Interval for \(\pi_1 - \pi_2\) (unpooled SE)

\[ (\hat{p}_1 - \hat{p}_2) \;\pm\; z_{\,1-\alpha/2}\, \sqrt{\tfrac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \tfrac{\hat{p}_2(1-\hat{p}_2)}{n_2}} \]

I am \(100(1 - \alpha)\%\) confident that the true difference in population proportions \((\pi_1 - \pi_2)\) lies between \([\text{lower bound}, \text{upper bound}]\).

Review: \(t\)-Tests for Two Means

For all cases:

\(H_0:\ \mu_1 - \mu_2 = 0\)

\[ t = \frac{(\bar{x}_1 - \bar{x}_2) - 0}{\sqrt{\tfrac{s_1^2}{n_1} + \tfrac{s_2^2}{n_2}}} \]

Alternative Hypothesis | Formula for \(p\)-value | R Code
\(H_A:\ \mu_1 - \mu_2 > 0\) | \(p = 1 - F_{t,df}(t)\) | p_val <- 1 - pt(t_stat, df)
\(H_A:\ \mu_1 - \mu_2 < 0\) | \(p = F_{t,df}(t)\) | p_val <- pt(t_stat, df)
\(H_A:\ \mu_1 - \mu_2 \neq 0\) | \(p = 2 \cdot (1 - F_{t,df}(|t|))\) | p_val <- 2 * (1 - pt(abs(t_stat), df))

Where:

  • \(\bar{x}_1,\ \bar{x}_2\) = sample means in groups 1 and 2
  • \(s_1,\ s_2\) = sample standard deviations
  • \(n_1,\ n_2\) = sample sizes
  • \(df = n_1 + n_2 - 2\)
  • \(F_{t,df}(\cdot)\) = CDF of Student’s \(t\) distribution with \(df\) degrees of freedom

Confidence Interval for \(\mu_1 - \mu_2\)

\[ (\bar{x}_1 - \bar{x}_2) \;\pm\; t_{\,1-\alpha/2,\;df}\, \sqrt{\tfrac{s_1^2}{n_1} + \tfrac{s_2^2}{n_2}}, \qquad df = n_1+n_2-2 \]

I am \(100(1 - \alpha)\%\) confident that the true difference in population means \((\mu_1 - \mu_2)\) lies between \([\text{lower bound}, \text{upper bound}]\).

Review: Paired \(t\)-Tests

For all cases:

\(H_0:\ \mu_d = 0\)
(where \(d = \text{Before – After}\) is the difference within each pair)

\[ t = \frac{\bar{d} - 0}{s_d / \sqrt{n}} \]

Alternative Hypothesis | Formula for \(p\)-value | R Code
\(H_A:\ \mu_d > 0\) | \(p = 1 - F_{t,df}(t)\) | p_val <- 1 - pt(t_stat, df)
\(H_A:\ \mu_d < 0\) | \(p = F_{t,df}(t)\) | p_val <- pt(t_stat, df)
\(H_A:\ \mu_d \neq 0\) | \(p = 2 \cdot (1 - F_{t,df}(|t|))\) | p_val <- 2 * (1 - pt(abs(t_stat), df))

Where:

  • \(\bar{d}\) = mean of the paired differences
  • \(s_d\) = standard deviation of the paired differences
  • \(n\) = number of pairs
  • \(df = n - 1\) (degrees of freedom)
  • \(F_{t,df}(\cdot)\) = CDF of Student’s \(t\) distribution with \(df\) degrees of freedom

Confidence Interval for \(\mu_d\) (paired mean difference)

\[ \bar{d} \;\pm\; t_{\,1-\alpha/2,\;df}\,\frac{s_d}{\sqrt{n}}, \qquad df = n-1 \]

I am \(100(1 - \alpha)\%\) confident that the true mean difference in the population \((\mu_d)\) lies between \([\text{lower bound}, \text{upper bound}]\).
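A paired \(t\)-test is a one-sample \(t\)-test on the within-pair differences. The sketch below illustrates this with made-up before/after values for six hypothetical participants; it compares the formula above with two equivalent `t.test()` calls.

```r
# Made-up before/after measurements for 6 hypothetical participants
before <- c(72, 80, 68, 75, 79, 83)
after  <- c(70, 74, 69, 71, 73, 80)

d  <- before - after                 # differences, Before - After
n  <- length(d)
df <- n - 1

t_stat <- (mean(d) - 0) / (sd(d) / sqrt(n))
p_val  <- 1 - pt(t_stat, df)         # H_A: mu_d > 0

# Two equivalent built-in calls
paired_test <- t.test(before, after, paired = TRUE, alternative = "greater")
diff_test   <- t.test(d, mu = 0, alternative = "greater")

c(manual     = t_stat,
  paired     = unname(paired_test$statistic),
  one_sample = unname(diff_test$statistic))
```

All three statistics agree, which is why the validity conditions for a paired test are checked on the differences rather than on the two original columns.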


Interpreting the \(p\)-value

  • Rejecting \(H_0\)
    > Since the \(p\)-value is less than \(\alpha\) (e.g., \(0.05\)), we reject the null hypothesis.
    > We conclude that there is sufficient evidence to suggest that [state the alternative claim in context].

  • Failing to Reject \(H_0\)
    > Since the \(p\)-value is greater than \(\alpha\) (e.g., \(0.05\)), we fail to reject the null hypothesis.
    > We conclude that there is not sufficient evidence to suggest that [state the alternative claim in context].

  • Strength of evidence: Smaller \(p\) means stronger evidence against \(H_0\).
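The decision rule above is a single comparison. As a minimal sketch, the hypothetical helper `decide()` below returns the first half of the template sentence; the two p-values passed to it are the ones reported in the sodium and training-program solutions later in this lesson.

```r
# Hypothetical helper: turn a p-value into the decision wording used above
decide <- function(p_value, alpha = 0.05) {
  if (p_value < alpha) "Reject H0" else "Fail to reject H0"
}

decide(0.011)   # sodium scenario p-value
decide(0.178)   # training-program scenario p-value
```

The context sentence ("there is sufficient evidence to suggest that ...") still has to be written by hand for each problem.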


Generalization / Causation

  • Generalization: We can generalize results to a larger population if the data come from a random and representative sample of that population.
  • Causation: We can claim causation if participants are randomly assigned to treatments in an experiment. Without random assignment, we can only conclude association, not causation.

\[ \begin{array}{|c|c|c|} \hline & \text{Randomly Sampled} & \text{Not Randomly Sampled} \\ \hline \textbf{Randomly Assigned} & \begin{array}{c} \text{Generalize: Yes} \\ \text{Causation: Yes} \end{array} & \begin{array}{c} \text{Generalize: No} \\ \text{Causation: Yes} \end{array} \\ \hline \textbf{Not Randomly Assigned} & \begin{array}{c} \text{Generalize: Yes} \\ \text{Causation: No} \end{array} & \begin{array}{c} \text{Generalize: No} \\ \text{Causation: No} \end{array} \\ \hline \end{array} \]


Parameters vs Statistics

  • Parameters vs. Statistics: A parameter is a fixed (but usually unknown) numerical value describing a population (e.g., \(\mu\), \(\sigma\), \(\pi\)). A statistic is a numerical value computed from a sample (e.g., \(\bar{x}\), \(s\), \(\hat{p}\)).
    • Parameters = target (what we want to know).
    • Statistics = evidence (what we can actually measure).
    • We use statistics to estimate parameters, and because different samples give different statistics, we capture this variability with confidence intervals.
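A short simulation can make the parameter/statistic distinction concrete. Everything below is invented for illustration: the "population" is simulated, and the seed, population size, and sample size are arbitrary choices.

```r
set.seed(21)                                       # arbitrary seed, for reproducibility

# Simulated "population" of 10,000 values (an assumption, not real data)
population <- rnorm(10000, mean = 50, sd = 10)
mu <- mean(population)                             # parameter: one fixed value

# Five different random samples of size 30 give five different statistics
xbars <- replicate(5, mean(sample(population, size = 30)))

mu
xbars
```

The parameter `mu` never changes, while each entry of `xbars` is a different estimate of it. That sample-to-sample variability is what a confidence interval is built to capture.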

Validity Conditions

  1. Proportions
    • Independence / Randomness: data come from a random sample (or are representative of the population). For two-sample tests, groups are independent.
    • Success–Failure Condition: each sample must have at least 10 successes and at least 10 failures:
      • \(n\hat{p} \geq 10\) (observed successes)
      • \(n(1 - \hat{p}) \geq 10\) (observed failures)
  2. Means
    • Independence / Randomness: data come from a random sample (or are representative of the population). For two-sample tests, groups (or paired differences) are independent.
    • Sample Size / Shape:
      • If \(n > 20\): the \(t\)-procedures are reasonable because the sample mean is approximately normally distributed.
      • If \(n \leq 20\): the data themselves should be roughly symmetric, with no extreme skew or outliers.
      • For paired tests: check the distribution of the differences.
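The numeric parts of these conditions are easy to check in R. The helper names below are hypothetical, and the thresholds (10 and 20) are the ones stated above; the counts come from the smartphone and training-program scenarios later in this lesson. Independence and randomness still have to be judged from how the data were collected.

```r
# Hypothetical helpers for the numeric validity checks
success_failure_ok <- function(x, n, threshold = 10) {
  successes <- x
  failures  <- n - x
  successes >= threshold & failures >= threshold
}

sample_size_ok <- function(n, cutoff = 20) {
  n > cutoff
}

success_failure_ok(105, 150)                # one proportion: 105 of 150
success_failure_ok(c(48, 38), c(60, 55))    # two proportions, checked per group
sample_size_ok(c(40, 25))                   # sample sizes from two of the mean scenarios
```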

Margin of Error

It is the half-width of the confidence interval — the distance from the sample statistic to either endpoint of the interval.

So a confidence interval can always be written as:

\[ \text{Sample Statistic} \; \pm \; \text{Margin of Error}. \]

For example, for a population mean the confidence interval is:

\[ \bar{x} \; \pm \; t_{1-\alpha/2, \; df} \times \frac{s}{\sqrt{n}}. \]

The margin of error is:

\[ t_{1-\alpha/2, \; df} \times \frac{s}{\sqrt{n}} \]

For a population proportion the confidence interval is:

\[ \hat{p} \; \pm \; z_{1-\alpha/2} \times \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}. \]

The margin of error is:

\[ z_{1-\alpha/2} \times \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} \]
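Because \(n\) sits under a square root, the margin of error shrinks slowly as the sample grows. The sketch below holds \(\hat{p} = 0.70\) fixed (the value from the smartphone scenario) and compares the scenario's \(n = 150\) with two hypothetical larger samples.

```r
p_hat  <- 0.70                      # sample proportion, held fixed for illustration
n      <- c(150, 600, 2400)         # 150 is from the scenario; 600 and 2400 are hypothetical
z_star <- qnorm(0.975)              # multiplier for 95% confidence

ME <- z_star * sqrt(p_hat * (1 - p_hat) / n)

data.frame(n = n, margin_of_error = round(ME, 3))
```

Each fourfold increase in \(n\) cuts the margin of error in half, so halving the width of an interval requires roughly four times as much data.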


Class Review

Group Activity

Component | 1-Sample Mean | 2-Sample Mean | Paired Mean | 1-Sample Prop | 2-Sample Prop
Null (H0)
Alt >
Alt <
Alt ≠
Test Stat
p-value >
p-value <
p-value ≠
CI Formula

Scenarios

  1. A nutritionist wonders if teenagers consume more sodium than the recommended 2,300 mg per day.
    She collects data from 40 teens across several high schools in her state. The teens self-reported their daily intake, averaging 2,450 mg per day with a standard deviation of 400 mg. Many of the students were also athletes, which could influence dietary choices.

  2. Nationwide surveys suggest about 62% of adults own a smartphone.
    In a phone survey of 150 adults from one midsized city, 105 reported owning a smartphone. The city has a relatively young population compared to the national average, which could affect the results.

  3. A school district wants to know whether male and female students spend different amounts of time on homework each week.
    • In a sample of 35 male students, the average reported time was 12.4 hours with a standard deviation of 4.2 hours.
    • In a sample of 40 female students, the average was 14.1 hours with a standard deviation of 3.9 hours.
    All students were enrolled in honors-level courses, which may influence homework expectations.

  4. A company is comparing two training programs to see if one produces higher certification pass rates.
    • Of the 60 employees enrolled in Program A, 48 passed the exam.
    • Of the 55 employees in Program B, 38 passed the exam.
    Supervisors assigned employees to the training programs based on their work schedules, and employees in Program A tended to have more prior experience with the material.

  5. A researcher measures the resting heart rates of 25 participants before and after an 8-week aerobic training program.
    On average, heart rates were 5.2 beats per minute lower after training, with the differences across participants having a standard deviation of 7.5 bpm. Some participants also reported starting new diets at the same time as the training program.

Questions

  1. What is the research question?

  2. Are the variables categorical or quantitative? What type of test are you completing?

  3. What is the response variable and what is the explanatory variable in this study?

  4. Is there a potential confounding variable? If so, what might it be?

  5. Describe the parameter of interest in context of the question.

  6. State the null and alternative hypotheses in both symbols and words.

  7. List the appropriate summary statistic(s) by name, symbol, and value.

  8. State the appropriate validity conditions and whether they are met.

  9. Report the standardized statistic, the p-value, and a confidence interval.

  10. Based on an α = 0.05 significance level, do you reject or fail to reject the null? Provide your evidence.

  11. Can the results of this study be generalized to a larger population? Why or why not?

  12. Can we assume a causal relationship from these results? Why or why not?

Solutions

A nutritionist wonders if teenagers consume more sodium than the recommended 2,300 mg per day.
She collects data from 40 teens across several high schools in her state. The teens self-reported their daily intake, averaging 2,450 mg per day with a standard deviation of 400 mg. Many of the students were also athletes, which could influence dietary choices.

  1. Research Question
    Do teenagers consume more sodium on average than the recommended 2,300 mg per day?

  2. Variables & Test
    Quantitative (sodium intake). One-sample t-test for a mean.

  3. Response & Explanatory
    Response = sodium intake (mg/day).
    Explanatory = comparison to recommended guideline.

  4. Confounding Variable
    Athletic status (athletes may consume more sodium).

  5. Parameter of Interest
    µ = true mean sodium intake of all teenagers.

  6. Hypotheses
    H₀: µ = 2300
    Hₐ: µ > 2300

  7. Summary Statistics
    n = 40, x̄ = 2450, s = 400.

  8. Validity Conditions
    Sample size \(n = 40\) is larger than 20, so the mean intake can be reasonably modeled with a t-test.

  9. Test Statistic & CI

Test statistic:

\[ t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} \]

xbar <- 2450
mu0 <- 2300
s <- 400
n <- 40

t_stat <- (xbar - mu0) / (s / sqrt(n))
t_stat

p-value (right-tailed):

df <- n - 1
p_value <- 1 - pt(t_stat, df)
p_value

95% CI:

alpha <- 0.05
t_star <- qt(1 - alpha/2, df)

ME <- t_star * s / sqrt(n)
lower <- xbar - ME
upper <- xbar + ME
c(lower, upper)

Result: \(t \approx 2.37\), \(p \approx 0.011\).
I am 95% confident the true mean sodium intake is between 2322 and 2578 mg/day.

  10. Decision at α = 0.05
    Reject H₀. Since the p-value (≈ 0.011) is less than α = 0.05, there is sufficient evidence to suggest that the true mean sodium intake of teenagers is greater than 2300 mg per day. In context, teens tend to consume more sodium than recommended.

  11. Generalization
    Sample from one state → cannot generalize to all U.S. teens.

  12. Causality
    Observational design → cannot infer causality.


Nationwide surveys suggest about 62% of adults own a smartphone.
In a phone survey of 150 adults from one midsized city, 105 reported owning a smartphone. The city has a relatively young population compared to the national average, which could affect the results.

  1. Research Question
    Is the proportion of smartphone ownership in this city different from 62%?

  2. Variables & Test
    Categorical (own smartphone or not). One-sample z-test for a proportion.

  3. Response & Explanatory
    Response = smartphone ownership (yes/no).
    Explanatory = national benchmark proportion (\(p_0 = 0.62\)).

  4. Confounding Variable
    Age distribution of the city (younger population may have higher ownership).

  5. Parameter of Interest
    \(p =\) true proportion of adults in this city who own a smartphone.

  6. Hypotheses
    \(H_0: p = 0.62\)
    \(H_a: p \ne 0.62\)

  7. Summary Statistics
    \(n = 150,\; x = 105,\; \hat{p} = 105/150 = 0.70\)

  8. Validity Conditions
    Success–failure condition holds: \(n\hat{p} = 105 \geq 10\) and \(n(1-\hat{p}) = 45 \geq 10\). Random/representative sample assumed.

  9. Test Statistic, p-value, and 95% CI

Formulas
- Test statistic: \(z = \dfrac{\hat{p} - p_0}{\sqrt{p_0(1-p_0)/n}}\)
- Two-sided p-value: \(2 \,[1-\Phi(|z|)]\)
- 95% CI: \(\hat{p} \pm z_{1-\alpha/2} \sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}\) with \(z_{0.975} \approx 1.96\)

R code (template)

# Given data
n <- 150
x <- 105
p.hat <- x / n
p0 <- 0.62

# Test statistic & p-value (two-sided)
se0 <- sqrt(p0*(1 - p0)/n)
z <- (p.hat - p0) / se0
p.value <- 2 * (1 - pnorm(abs(z)))

z; p.value
# 95% confidence interval for p
alpha <- 0.05
z.star <- qnorm(1 - alpha/2)
se.hat <- sqrt(p.hat*(1 - p.hat)/n)
ME <- z.star * se.hat
lower <- p.hat - ME
upper <- p.hat + ME
c(lower, upper)

Numerical results
\(z \approx 2.02\), two-sided \(p \approx 0.044\).
95% CI: I am 95% confident that the true proportion of adults in this city who own a smartphone is between 0.627 and 0.773.

  10. Decision at \(\alpha = 0.05\)
    Reject \(H_0\). Since the p-value (\(\approx 0.044\)) is less than \(\alpha\), there is sufficient evidence to suggest that the true proportion is not equal to 0.62. In context, smartphone ownership in this city likely differs from the national rate.

  11. Generalization
    This is one midsized city; generalization to all U.S. adults is limited without broader sampling.

  12. Causality
    This is a survey (no random assignment), so we cannot infer a causal explanation for differences in ownership.


A school district wants to know whether male and female students spend different amounts of time on homework each week.
- 35 male students: \(\bar{x}_1 = 12.4\) hours, \(s_1 = 4.2\)
- 40 female students: \(\bar{x}_2 = 14.1\) hours, \(s_2 = 3.9\)
All students were in honors-level courses, which may influence homework expectations.

  1. Research Question
    Do male and female students differ in average weekly homework hours?

  2. Variables & Test
    Quantitative response (hours), categorical explanatory (gender). Two-sample t-test for means.

  3. Response & Explanatory
    Response = hours of homework.
    Explanatory = gender (male vs female).

  4. Confounding Variable
    Honors course enrollment.

  5. Parameter of Interest
    \(\mu_2 - \mu_1 =\) difference in true mean hours (female − male).

  6. Hypotheses
    \(H_0: \mu_2 - \mu_1 = 0\)
    \(H_a: \mu_2 - \mu_1 \ne 0\)

  7. Summary Statistics
    \(n_1 = 35, \; \bar{x}_1 = 12.4, \; s_1 = 4.2\)
    \(n_2 = 40, \; \bar{x}_2 = 14.1, \; s_2 = 3.9\)

  8. Validity Conditions
    Both groups have \(n > 20\), so approximate normality is reasonable.

  9. Test Statistic, p-value, and 95% CI

Formulas
- Pooled variance:
\[ s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} \]
- Standard error:
\[ SE = \sqrt{s_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)} \]
- Test statistic:
\[ t = \frac{(\bar{x}_2 - \bar{x}_1) - 0}{SE} \]
- Degrees of freedom:
\[ df = n_1 + n_2 - 2 \]
- Confidence interval:
\[ (\bar{x}_2 - \bar{x}_1) \;\pm\; t_{1-\alpha/2, df} \times SE \]

R code (template)

# Given data
n1 <- 35; xbar1 <- 12.4; s1 <- 4.2
n2 <- 40; xbar2 <- 14.1; s2 <- 3.9

# Pooled variance and standard deviation
sp2 <- ((n1 - 1)*s1^2 + (n2 - 1)*s2^2) / (n1 + n2 - 2)
sp <- sqrt(sp2)

# Standard error
SE <- sp * sqrt(1/n1 + 1/n2)

# Difference in means
diff <- xbar2 - xbar1

# Test statistic and df
t_stat <- diff / SE
df <- n1 + n2 - 2

# Two-sided p-value
p_value <- 2 * (1 - pt(abs(t_stat), df))

# 95% CI
alpha <- 0.05
t_star <- qt(1 - alpha/2, df)
lower <- diff - t_star * SE
upper <- diff + t_star * SE

t_stat; p_value; c(lower, upper)

Numerical results
\(t \approx 1.82\) (female − male), \(p \approx 0.073\), with \(df = 73\).
95% CI: I am 95% confident that the true difference in mean homework hours (female − male) is between −0.2 and 3.6 hours.
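The review formula for two means uses the unpooled standard error \(\sqrt{s_1^2/n_1 + s_2^2/n_2}\), while the template above pools the two variances. The following side-by-side sketch uses the same summary statistics to show how much that choice matters here; the variable names are illustrative.

```r
# Summary statistics from the homework scenario
n1 <- 35; xbar1 <- 12.4; s1 <- 4.2   # male students
n2 <- 40; xbar2 <- 14.1; s2 <- 3.9   # female students

diff_means <- xbar2 - xbar1          # female - male

# Pooled standard error (used in the template above)
sp2       <- ((n1 - 1) * s1^2 + (n2 - 1) * s2^2) / (n1 + n2 - 2)
se_pooled <- sqrt(sp2 * (1 / n1 + 1 / n2))

# Unpooled standard error (used in the review formula)
se_unpooled <- sqrt(s1^2 / n1 + s2^2 / n2)

t_pooled   <- diff_means / se_pooled
t_unpooled <- diff_means / se_unpooled

round(c(t_pooled = t_pooled, t_unpooled = t_unpooled), 3)
```

With sample standard deviations this similar, the two statistics differ only in the second decimal place, so the decision at \(\alpha = 0.05\) is the same either way.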

  10. Decision at \(\alpha = 0.05\)
    Fail to reject \(H_0\). Since the p-value is greater than \(\alpha\), there is not sufficient evidence to suggest that the true difference in mean homework hours is different from 0. In context, there may be no real difference in homework time between male and female students.

  11. Generalization
    Sample only from honors students in a limited number of schools → generalization is limited.

  12. Causality
    Observational data; cannot infer causality.


A company is comparing two training programs to see if one produces higher certification pass rates.
- Program A: 60 employees, 48 passed.
- Program B: 55 employees, 38 passed.
Supervisors assigned employees to programs based on schedules; Program A employees had more prior experience.

  1. Research Question
    Do pass rates differ between Program A and Program B?

  2. Variables & Test
    Categorical response (pass/fail), categorical explanatory (program). Two-sample \(z\)-test for proportions.

  3. Response & Explanatory
    Response = pass/fail.
    Explanatory = training program (A vs B).

  4. Confounding Variable
    Prior experience with the certification material.

  5. Parameter of Interest
    \(p_1 - p_2 =\) difference in true pass rates (Program A \(-\) Program B).

  6. Hypotheses
    \(H_0: p_1 - p_2 = 0\)
    \(H_a: p_1 - p_2 \ne 0\)

  7. Summary Statistics
    \(n_1 = 60,\; x_1 = 48,\; \hat p_1 = 0.80\)
    \(n_2 = 55,\; x_2 = 38,\; \hat p_2 \approx 0.6909\)
    Observed difference: \(\hat p_1 - \hat p_2 \approx 0.1091\).

  8. Validity Conditions
    Each group has at least 10 successes and 10 failures:
    • Program A: successes \(=48\), failures \(=12\)
    • Program B: successes \(=38\), failures \(=17\)
    Conditions met; groups treated as independent.

  9. Test Statistic, p-value, and 95% CI

Formulas
- Pooled proportion (for the test): \(\hat p_{pool} = \dfrac{x_1 + x_2}{n_1 + n_2}\)
- Test statistic:
\[ z = \frac{(\hat p_1 - \hat p_2) - 0}{\sqrt{\hat p_{pool}(1-\hat p_{pool})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} \]
- Two-sided p-value: \(2\,[1-\Phi(|z|)]\)
- 95% CI (unpooled SE):
\[ (\hat p_1 - \hat p_2) \;\pm\; z_{1-\alpha/2}\,\sqrt{\frac{\hat p_1(1-\hat p_1)}{n_1} + \frac{\hat p_2(1-\hat p_2)}{n_2}} \]

R code (template)

# Data
n1 <- 60; x1 <- 48; p1.hat <- x1 / n1
n2 <- 55; x2 <- 38; p2.hat <- x2 / n2

# Pooled for H0
p.pool <- (x1 + x2) / (n1 + n2)
SE0 <- sqrt(p.pool * (1 - p.pool) * (1/n1 + 1/n2))

# Test statistic & two-sided p-value
z <- (p1.hat - p2.hat) / SE0
p.value <- 2 * (1 - pnorm(abs(z)))

z; p.value
# 95% CI for (p1 - p2) using unpooled SE
alpha <- 0.05
z.star <- qnorm(1 - alpha/2)
SE.hat <- sqrt(p1.hat*(1 - p1.hat)/n1 + p2.hat*(1 - p2.hat)/n2)
ME <- z.star * SE.hat
lower <- (p1.hat - p2.hat) - ME
upper <- (p1.hat - p2.hat) + ME

c(lower, upper)

Numerical results
\(z \approx 1.35\), two-sided \(p \approx 0.178\).
95% CI: I am 95% confident that the true difference in pass rates (Program A \(-\) Program B) is somewhere between \(-0.050\) and \(0.268\).
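As a cross-check, base R's `prop.test()` with `correct = FALSE` reports a chi-squared statistic equal to \(z^2\) for the pooled test, along with the same two-sided p-value and the unpooled interval. The sketch below recomputes \(z\) from the scenario counts and compares; the variable names are illustrative.

```r
# Counts from the training-program scenario
n1 <- 60; x1 <- 48    # Program A
n2 <- 55; x2 <- 38    # Program B

p1_hat <- x1 / n1
p2_hat <- x2 / n2
p_pool <- (x1 + x2) / (n1 + n2)

# Hand computation: pooled SE for the test, unpooled SE for the interval
z  <- (p1_hat - p2_hat) / sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
ME <- qnorm(0.975) * sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
ci_manual <- (p1_hat - p2_hat) + c(-1, 1) * ME

# Built-in test without the continuity correction
check <- prop.test(x = c(x1, x2), n = c(n1, n2), correct = FALSE)

c(z_squared = z^2, chi_squared = unname(check$statistic))
c(manual = 2 * (1 - pnorm(abs(z))), builtin = check$p.value)
rbind(manual = ci_manual, builtin = as.numeric(check$conf.int))
```

Leaving `correct` at its default applies a continuity correction, so the built-in output would no longer match the hand formulas exactly.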

  10. Decision at \(\alpha = 0.05\)
    Fail to reject \(H_0\). Since the p-value (\(\approx 0.178\)) is greater than \(\alpha\), there is not sufficient evidence to suggest that the pass rates of Program A and Program B differ. In context, the programs may perform similarly based on this sample.

  11. Generalization
    This is one company’s workforce; generalization beyond similar employees is limited.

  12. Causality
    Assignment was not randomized (schedules/prior experience differ), so we cannot make a causal claim that one program causes higher pass rates.


A researcher measures the resting heart rates of 25 participants before and after an 8-week aerobic training program.
Mean decrease (Before − After) \(=\;5.2\) bpm, standard deviation of differences \(s_d = 7.5\) bpm. Some participants also started new diets.

  1. Research Question
    Does the training program reduce average resting heart rate?

  2. Variables & Test
    Quantitative paired differences (before/after on the same people). Paired t-test for a mean difference.

  3. Response & Explanatory
    Response = resting heart rate.
    Explanatory = time (before vs after program) on the same participants.

  4. Confounding Variable
    Diet changes begun during the program.

  5. Parameter of Interest
    \(\mu_d =\) true mean difference in resting heart rate (Before − After).

  6. Hypotheses
    \(H_0: \mu_d = 0\)
    \(H_a: \mu_d > 0\) (a positive mean difference indicates a decrease after training)

  7. Summary Statistics
    \(n = 25,\;\; \bar{d} = 5.2,\;\; s_d = 7.5\)

  8. Validity Conditions
    Pairs are matched by person; differences are the unit of analysis. With \(n=25\,(>20)\) the paired t-procedure is reasonable; we assume the distribution of differences is approximately normal and pairs are independent of each other.

  9. Test Statistic, p-value, and 95% CI

Formulas
- Test statistic:
\[ t = \frac{\bar{d} - 0}{s_d/\sqrt{n}} \]
- Right-tailed p-value: \(1 - F_{t,df}(t)\) with \(df = n - 1\)
- 95% CI for \(\mu_d\):
\[ \bar{d} \;\pm\; t_{1-\alpha/2,\,df}\;\frac{s_d}{\sqrt{n}}, \quad df=n-1 \]

R code (template)

# Given data
n <- 25
dbar <- 5.2
sd_d <- 7.5
mu0 <- 0

# Test statistic & right-tailed p-value
t_stat <- (dbar - mu0) / (sd_d / sqrt(n))
df <- n - 1
p_value <- 1 - pt(t_stat, df)

t_stat; p_value
# 95% CI for mean difference
alpha <- 0.05
t_star <- qt(1 - alpha/2, df)
ME <- t_star * sd_d / sqrt(n)
lower <- dbar - ME
upper <- dbar + ME
c(lower, upper)  # I am 95% confident the true mean decrease is between lower and upper

Numerical results
\(t \approx 3.47\), right-tailed \(p \approx 0.001\).
95% CI: I am 95% confident that the true mean decrease in resting heart rate is somewhere between \(2.1\) and \(8.3\) bpm.

  10. Decision at \(\alpha = 0.05\)
    Reject \(H_0\). Since the p-value (\(\approx 0.001\)) is less than \(\alpha\), there is sufficient evidence to suggest that the true mean decrease in resting heart rate is greater than \(0\). In context, resting heart rates tended to be lower after the training program.

  11. Generalization
    The 25 participants come from a single program and are not described as a random sample; generalization to all adults is limited without broader sampling.

  12. Causality
    No random assignment and concurrent diet changes; we cannot make a causal claim that the program alone caused the reduction.


Before you leave

Today:

  • Any questions for me?

Upcoming Graded Events

  • WPR 2: Lesson 22