Lesson 15: Confidence Interval (Quantitative)

Lesson Administration

Calendar

Day 1

Day 2

Impact of Government Shutdown

G1: No Change
G1: No Change
I1: Possibly meet in TH120, but for right now stay put
J2: No Change

Milestone 4

Lesson 17
Milestone 4
- With partner
- Write 1-2 paragraphs per article summarizing the article's topic, with a takeaway on how it informs your project.
- Make updates from Milestone 3 feedback.
- Fill out Annex B for my comments on Milestone 3.
- Turn in EVERYTHING in your working write up.
- Keep your binder up-to-date, but I don’t want to see it.

SIL 1

Lesson 17
25 Points!
Like a WPR. What does that mean?!
Read ahead

Exploration Exercise 2.3

⏰ Due 0700 on **Lesson 18**
Let's take a look at it
- Day 1: Tuesday, 14 Oct 2025
- Day 2: Wednesday, 15 Oct 2025

Cal

Things didn’t start well…

But Cal slowly got his act together.

Then he was the first bracket pitcher on the next day against the 2 Seed…

Got the dub:

Lost in the semis vs the eventual champion - not a bad tournament

Reese

We’ll talk about her next lesson!

Review: \(z\)-Tests for One Proportion

For all cases:

\[ z = \frac{\hat{p} - \pi_0}{\sqrt{\frac{\pi_0 (1 - \pi_0)}{n}}} \]

Alternative Hypothesis	Formula for \(p\)-value	R Code
\(H_A: \pi > \pi_0\)	\(p = 1 - \Phi(z)\)	`p_val <- 1 - pnorm(z_stat)`
\(H_A: \pi < \pi_0\)	\(p = \Phi(z)\)	`p_val <- pnorm(z_stat)`
\(H_A: \pi \neq \pi_0\)	\(p = 2 \cdot (1 - \Phi(|z|))\)	`p_val <- 2 * (1 - pnorm(abs(z_stat)))`

Where:

\(\hat{p} = R/n\) (sample proportion: \(R\) successes out of \(n\) observations)
\(\pi_0\) = hypothesized proportion under \(H_0\)
\(\Phi(\cdot)\) = cumulative distribution function (CDF) of the standard normal distribution.
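
As a minimal sketch of the table above, the R code below computes the \(z\) statistic and all three \(p\)-values in one pass. The counts are borrowed from Board Problem 3 (\(x = 102\), \(n = 240\)); the null value `pi0 = 0.5` is a hypothetical choice made only for illustration.

```r
# Counts from Board Problem 3; pi0 is a hypothetical null value (assumption)
x   <- 102
n   <- 240
pi0 <- 0.5

phat   <- x / n                                      # sample proportion
z_stat <- (phat - pi0) / sqrt(pi0 * (1 - pi0) / n)   # standardized statistic

p_greater <- 1 - pnorm(z_stat)                # H_A: pi > pi0
p_less    <- pnorm(z_stat)                    # H_A: pi < pi0
p_two     <- 2 * (1 - pnorm(abs(z_stat)))     # H_A: pi != pi0

c(z = z_stat, greater = p_greater, less = p_less, two_sided = p_two)
```

Note that the two one-sided \(p\)-values always sum to 1, and the two-sided \(p\)-value is twice the smaller of them.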

Review: \(t\)-Tests for One Mean

For all cases:

\[ t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} \]

Alternative Hypothesis	Formula for \(p\)-value	R Code
\(H_A: \mu > \mu_0\)	\(p = 1 - F_{t_{df}}(t)\)	`p_val <- 1 - pt(t_stat, df)`
\(H_A: \mu < \mu_0\)	\(p = F_{t_{df}}(t)\)	`p_val <- pt(t_stat, df)`
\(H_A: \mu \neq \mu_0\)	\(p = 2 \cdot (1 - F_{t_{df}}(|t|))\)	`p_val <- 2 * (1 - pt(abs(t_stat), df))`

Where:

\(\bar{x}\) = sample mean
\(\mu_0\) = hypothesized mean under \(H_0\)
\(s\) = sample standard deviation
\(n\) = sample size
\(df = n - 1\) = degrees of freedom
\(F_{t_{df}}(\cdot)\) = cumulative distribution function (CDF) of the Student’s \(t\) distribution with \(df\) degrees of freedom
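
The same pattern works for a mean. This sketch uses the summary statistics from Board Problem 1 (\(\bar{x} = 13.8\), \(s = 1.4\), \(n = 20\)); the null value `mu0 = 14.5` is hypothetical and chosen only to have something to test against.

```r
# Summary statistics from Board Problem 1; mu0 is a hypothetical null value
xbar <- 13.8
s    <- 1.4
n    <- 20
mu0  <- 14.5

df     <- n - 1
t_stat <- (xbar - mu0) / (s / sqrt(n))    # standardized statistic

p_greater <- 1 - pt(t_stat, df)               # H_A: mu > mu0
p_less    <- pt(t_stat, df)                   # H_A: mu < mu0
p_two     <- 2 * (1 - pt(abs(t_stat), df))    # H_A: mu != mu0

c(t = t_stat, greater = p_greater, less = p_less, two_sided = p_two)
```

With the raw data in a numeric vector `x`, base R's `t.test(x, mu = mu0)` carries out the same test directly.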

Interpreting the \(p\)-value

Rejecting \(H_0\)
> Since the \(p\)-value is less than \(\alpha\) (e.g., \(0.05\)), we reject the null hypothesis.
> We conclude that there is sufficient evidence to suggest that [state the alternative claim in context].
Failing to Reject \(H_0\)
> Since the \(p\)-value is greater than \(\alpha\) (e.g., \(0.05\)), we fail to reject the null hypothesis.
> We conclude that there is not sufficient evidence to suggest that [state the alternative claim in context].
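
The decision rule behind both templates can be sketched as a tiny helper. The function name `decide` is hypothetical, and treating a \(p\)-value exactly equal to \(\alpha\) as "fail to reject" is an assumption, since the templates above only cover the strictly-less and strictly-greater cases.

```r
# Hypothetical helper: turn a p-value into the wording used in the templates
decide <- function(p_val, alpha = 0.05) {
  if (p_val < alpha) "reject H0" else "fail to reject H0"
}

decide(0.01)   # p-value below alpha
decide(0.20)   # p-value above alpha
```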

Other Notes

What impacts does altering different values have?

Generalization: We can generalize results to a larger population if the sample is random and representative of that population. Convenience samples don’t justify broad claims.
Causation: We can claim causation only if the study design is a randomized experiment. Observational studies can show associations, but not cause-and-effect.
Parameters vs. Statistics: A parameter is a fixed (but usually unknown) numerical value describing a population (e.g., \(\mu\), \(\sigma\), \(\pi\)). A statistic is a numerical value computed from a sample (e.g., \(\bar{x}\), \(s\), \(\hat{p}\)).
- Parameters = target (what we want to know).
- Statistics = evidence (what we can actually measure).
- We use statistics to estimate parameters, and because different samples give different statistics, we capture this variability with confidence intervals.

Quantity	Population (Parameter)	Sample (Statistic)
Center (mean)	\(\mu\)	\(\bar{x}\)
Spread (SD)	\(\sigma\)	\(s\)
Proportion “success”	\(\pi\)	\(\hat{p}\)
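
The point that different samples give different statistics can be seen in a quick simulation. This is only a sketch: the population values (`mu = 14`, `sigma = 1.5`), the sample size, and the seed are all made up for illustration.

```r
set.seed(15)      # arbitrary seed so the run is reproducible
mu    <- 14       # hypothetical population mean (parameter)
sigma <- 1.5      # hypothetical population SD (parameter)
n     <- 25       # sample size

# Draw 5 independent samples and compute the statistics for each one
stats <- t(replicate(5, {
  x <- rnorm(n, mean = mu, sd = sigma)
  c(xbar = mean(x), s = sd(x))
}))
stats             # one row per sample: the parameters never change, the statistics do
```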

Confidence Intervals on Means

Suppose we take a random sample of \(n\) cadets and measure a quantitative outcome (e.g., 2-mile run times). We want to estimate the true population mean \(\mu\).

We start with the one-sample \(t\) statistic:

\[ t \;=\; \frac{\bar{x} - \mu_0}{s/\sqrt{n}}. \]

At significance level \(\alpha = 0.05\), we reject \(H_0\) whenever \(t\) falls into the rejection region (the critical tail(s)) of the \(t\) distribution with \(df=n-1\).

Where on this plot would we reject?

Where are these lines?

So, the left tail contains \(0.025\) of the distribution, and the right tail contains \(0.025\).

How do we find where \(0.025\) of the distribution is on each side?

We want to solve: \[ 0.025 \;=\; \int_{-\infty}^{x} f_{t,\,df}(z)\,dz, \] where \(f_{t,\,df}\) is the density of the Student’s \(t\) distribution with \(df=n-1\).

When we solve for \(x\), this is the quantile function (inverse CDF).
This integral does not have a simple closed form — that’s why we have R.

df <- 24  # n = 25 observations, so df = n - 1 = 24
qt(0.025, df = df)

[1] -2.063899

That’s the left cutoff. And the right cutoff:

qt(0.975, df = df)

[1] 2.063899

So our critical values are \(-t^\star\) and \(+t^\star\) (here, with \(df=24\), approximately \(\pm 2.064\)).
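
To see that `qt()` really is the inverse of the CDF `pt()`, the sketch below pushes the cutoff back through `pt()` and also solves the integral equation numerically with `uniroot()`. The search interval `c(-10, 0)` is an assumption that happens to bracket the answer for this \(df\).

```r
df <- 24

t_left <- qt(0.025, df)     # left cutoff from the quantile function
pt(t_left, df)              # plugging it back into the CDF recovers 0.025

# Solve pt(x, df) = 0.025 numerically, without using qt()
root <- uniroot(function(x) pt(x, df) - 0.025,
                interval = c(-10, 0), tol = 1e-10)$root
c(from_qt = t_left, from_uniroot = root)

# The t distribution is symmetric, so the right cutoff is the mirror image
qt(0.975, df) + t_left      # essentially zero
```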

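At the boundary of the rejection region, the test statistic equals a critical value: \[ \pm t^\star \;=\; \frac{\bar{x} - \mu_0}{s/\sqrt{n}}. \]

For estimation, we don’t know the true \(\mu\), and there isn’t a hypothesized \(\mu_0\) to plug in.
So what do we do? We fail to reject \(H_0\) whenever \(-t^\star \le t \le t^\star\), so we solve the boundary equation for \(\mu_0\) to find the set of \(\mu_0\) values we would not reject: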

\[ \bar{x} - \mu_0 \;=\; \pm\, t^\star \cdot \frac{s}{\sqrt{n}} \]

With a little algebra: \[ \begin{align*} \bar{x} - \mu_0 \;&=\; \pm\, t^\star \cdot \frac{s}{\sqrt{n}} \\[6pt] -\mu_0 \;&=\; -\bar{x} \;\pm\; t^\star \cdot \frac{s}{\sqrt{n}} \\[6pt] \mu_0 \;&=\; \bar{x} \;\pm\; t^\star \cdot \frac{s}{\sqrt{n}} \end{align*} \]

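So the general 95% confidence interval for a mean is: \[ \bar{x} \;\pm\; t^\star \cdot \frac{s}{\sqrt{n}}, \] with \(t^\star\) taken from the \(t\) distribution with \(df=n-1\). For any other confidence level \(1-\alpha\), only \(t^\star\) changes: it is the \(1-\alpha/2\) quantile of that distribution.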

(Equivalently: the set of \(\mu_0\) values we would not reject at \(\alpha=0.05\).)
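
A minimal sketch of this interval and of the "set of values we would not reject" idea. The helper names `ci_mean` and `p_two_sided` are hypothetical, and the summary statistics (\(\bar{x} = 14\), \(s = 1.5\), \(n = 25\)) are made up so that \(df = 24\) matches the cutoffs above.

```r
# Confidence interval for a mean from summary statistics.
# ci_mean() is a hypothetical helper written for these notes, not a built-in.
ci_mean <- function(xbar, s, n, conf = 0.95) {
  alpha  <- 1 - conf
  t_star <- qt(1 - alpha / 2, df = n - 1)   # positive critical value
  moe    <- t_star * s / sqrt(n)            # margin of error
  c(lower = xbar - moe, upper = xbar + moe)
}

# Two-sided p-value for H0: mu = mu0 (same formula as the t-test review)
p_two_sided <- function(mu0, xbar, s, n) {
  t_stat <- (xbar - mu0) / (s / sqrt(n))
  2 * (1 - pt(abs(t_stat), df = n - 1))
}

# Hypothetical summary statistics (assumed for illustration)
xbar <- 14; s <- 1.5; n <- 25
ci <- ci_mean(xbar, s, n, conf = 0.95)
ci

# At an endpoint the p-value equals alpha = 0.05 (up to floating-point error);
# inside the interval it is larger, outside it is smaller.
p_two_sided(ci[["lower"]],       xbar, s, n)
p_two_sided(xbar,                xbar, s, n)
p_two_sided(ci[["lower"]] - 0.1, xbar, s, n)
```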

Board Problems

Board Problem 1

A random sample of \(n=20\) cadets recorded their 2-mile run times. The sample mean was \(\bar{x} = 13.8\) minutes with a sample standard deviation of \(s = 1.4\) minutes. Assume run times are approximately normal.

Construct a 90% confidence interval for the true mean run time.
Construct a 95% confidence interval for the true mean run time.
Construct a 99% confidence interval for the true mean run time.

Solution

Use the one-sample mean CI: \[ \bar{x}\;\pm\; t^\star \cdot \frac{s}{\sqrt{n}}, \quad df=n-1. \]

Here \(n=20 \Rightarrow df=19\), \(\bar{x}=13.8\), \(s=1.4\).

Compute the \(t^\star\) cutoffs in R. Passing a lower-tail probability to `qt()` returns \(-t^\star\), so use the absolute value:

df <- 19
qt(0.05, df)  # 90% CI

[1] -1.729133

qt(0.025, df)  # 95% CI

[1] -2.093024

qt(0.005, df)  # 99% CI

[1] -2.860935

90% CI: \(13.8 \pm (1.729)\left(\tfrac{1.4}{\sqrt{20}}\right) = 13.8 \pm 0.541 \;\Rightarrow\; [\,13.259,\;14.341\,]\)
95% CI: \(13.8 \pm (2.093)\left(\tfrac{1.4}{\sqrt{20}}\right) = 13.8 \pm 0.655 \;\Rightarrow\; [\,13.145,\;14.455\,]\)
99% CI: \(13.8 \pm (2.861)\left(\tfrac{1.4}{\sqrt{20}}\right) = 13.8 \pm 0.896 \;\Rightarrow\; [\,12.904,\;14.696\,]\)
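
The three intervals can also be computed in one vectorized pass. This is a sketch rather than the required method; it asks `qt()` for the upper-tail quantile so that \(t^\star\) comes out positive.

```r
# Board Problem 1 summary statistics
xbar <- 13.8
s    <- 1.4
n    <- 20

conf   <- c(0.90, 0.95, 0.99)
t_star <- qt(1 - (1 - conf) / 2, df = n - 1)   # positive cutoffs
moe    <- t_star * s / sqrt(n)                 # margins of error

round(data.frame(conf, t_star, moe,
                 lower = xbar - moe, upper = xbar + moe), 3)
```

Higher confidence requires a larger \(t^\star\), so the interval widens as the level goes from 90% to 99%.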

Board Problem 2

A sample of \(n=15\) cadets reported hours of sleep the night before a training exercise. The sample mean was \(\bar{x} = 6.2\) hours with a sample standard deviation of \(s = 1.1\) hours. Assume sleep hours are approximately normal.

Construct a 90% confidence interval for the true mean hours of sleep.
Construct a 95% confidence interval for the true mean hours of sleep.
Construct a 99% confidence interval for the true mean hours of sleep.

Solution

Use the one-sample mean CI: \[ \bar{x}\;\pm\; t^\star \cdot \frac{s}{\sqrt{n}}, \quad df=n-1. \]

Here \(n=15 \Rightarrow df=14\), \(\bar{x}=6.2\), \(s=1.1\).

Compute the \(t^\star\) cutoffs in R. Passing a lower-tail probability to `qt()` returns \(-t^\star\), so use the absolute value:

df <- 14
qt(0.05,  df)  # 90% CI

[1] -1.76131

qt(0.025, df)  # 95% CI

[1] -2.144787

qt(0.005, df)  # 99% CI

[1] -2.976843

90% CI: \(6.2 \pm (1.761)\left(\tfrac{1.1}{\sqrt{15}}\right) = 6.2 \pm 0.500 \;\Rightarrow\; [\,5.700,\;6.700\,]\)
95% CI: \(6.2 \pm (2.145)\left(\tfrac{1.1}{\sqrt{15}}\right) = 6.2 \pm 0.609 \;\Rightarrow\; [\,5.591,\;6.809\,]\)
99% CI: \(6.2 \pm (2.977)\left(\tfrac{1.1}{\sqrt{15}}\right) = 6.2 \pm 0.846 \;\Rightarrow\; [\,5.354,\;7.046\,]\)

Board Problem 3

In a random sample of \(n=240\) respondents, \(x=102\) reported using public transit at least once per week.

Construct a 90% confidence interval for the true proportion who use public transit weekly.
Construct a 95% confidence interval for the true proportion.
Construct a 99% confidence interval for the true proportion.

Solution

Use the confidence interval for a proportion: \[ \hat{p}\;\pm\; z^\star \sqrt{\frac{\hat{p}\,(1-\hat{p})}{n}},\qquad \hat{p}=\frac{x}{n}. \]

Here, \(\hat{p}=\dfrac{102}{240}=0.425\) and \(n=240\).

Compute the \(z^\star\) cutoffs in R:

n <- 240; x <- 102
phat <- x/n
z90 <- qnorm(0.95)   # for 90% CI
z95 <- qnorm(0.975)  # for 95% CI
z99 <- qnorm(0.995)  # for 99% CI
z90; z95; z99

[1] 1.644854

[1] 1.959964

[1] 2.575829

90% CI: \[ 0.425 \;\pm\; (1.645)\sqrt{\frac{0.425(1-0.425)}{240}} \;=\; 0.425 \pm 0.052 \;\Rightarrow\; [\,0.373,\;0.477\,]. \]
95% CI: \[ 0.425 \;\pm\; (1.960)\sqrt{\frac{0.425(1-0.425)}{240}} \;=\; 0.425 \pm 0.063 \;\Rightarrow\; [\,0.362,\;0.488\,]. \]
99% CI: \[ 0.425 \;\pm\; (2.576)\sqrt{\frac{0.425(1-0.425)}{240}} \;=\; 0.425 \pm 0.082 \;\Rightarrow\; [\,0.343,\;0.507\,]. \]
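
For the proportion, the same vectorized pattern applies with `qnorm()` in place of `qt()`. A sketch of the arithmetic above:

```r
# Board Problem 3 counts
x <- 102
n <- 240

phat   <- x / n
conf   <- c(0.90, 0.95, 0.99)
z_star <- qnorm(1 - (1 - conf) / 2)        # positive cutoffs
se     <- sqrt(phat * (1 - phat) / n)      # standard error of phat
moe    <- z_star * se                      # margins of error

round(data.frame(conf, z_star, moe,
                 lower = phat - moe, upper = phat + moe), 3)
```

Base R's `prop.test()` builds its interval with a different (score-based) method, so its output will be close to these numbers but not identical.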

Board Problem 4

A random sample of \(n=30\) cadets was asked how many push-ups they could complete in two minutes. The sample mean was \(\bar{x} = 58.5\) with a sample standard deviation of \(s = 9.3\). Assume counts are approximately normal.

Construct a 90% confidence interval for the true mean number of push-ups.
Construct a 95% confidence interval for the true mean number of push-ups.
Construct a 99% confidence interval for the true mean number of push-ups.

Solution

Use the one-sample mean CI: \[ \bar{x}\;\pm\; t^\star \cdot \frac{s}{\sqrt{n}}, \quad df=n-1. \]

Here \(n=30 \Rightarrow df=29\), \(\bar{x}=58.5\), \(s=9.3\).

Compute the \(t^\star\) cutoffs in R. Passing a lower-tail probability to `qt()` returns \(-t^\star\), so use the absolute value:

df <- 29
qt(0.05, df)  # 90% CI

[1] -1.699127

qt(0.025, df)  # 95% CI

[1] -2.04523

qt(0.005, df)  # 99% CI

[1] -2.756386

90% CI: \(58.5 \pm (1.699)\left(\tfrac{9.3}{\sqrt{30}}\right) = 58.5 \pm 2.885 \;\Rightarrow\; [\,55.615,\;61.385\,]\)
95% CI: \(58.5 \pm (2.045)\left(\tfrac{9.3}{\sqrt{30}}\right) = 58.5 \pm 3.472 \;\Rightarrow\; [\,55.028,\;61.972\,]\)
99% CI: \(58.5 \pm (2.756)\left(\tfrac{9.3}{\sqrt{30}}\right) = 58.5 \pm 4.680 \;\Rightarrow\; [\,53.820,\;63.180\,]\)

Before you leave

Today:

Upcoming Graded Events