Confidence interval for the mean:
x̄ ± z*(σ/√n) (known σ)
x̄ ± t*(s/√n) (unknown σ, use t-distribution)
A 95% CI means: if we repeated the sampling many times, about 95% of the resulting intervals would contain the true parameter. This is the frequentist interpretation. The margin of error shrinks as n grows, at a rate proportional to 1/√n.
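The t-interval formula above can be sketched in a few lines. This is a minimal illustration with made-up data; it uses scipy only for the critical t value.

```python
import math
from scipy import stats

def t_confidence_interval(xs, confidence=0.95):
    """CI for the mean with unknown sigma: x̄ ± t*(s/√n)."""
    n = len(xs)
    mean = sum(xs) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in xs) / (n - 1))  # sample std dev
    t_star = stats.t.ppf((1 + confidence) / 2, df=n - 1)       # critical t, df = n-1
    moe = t_star * s / math.sqrt(n)                            # margin of error
    return mean - moe, mean + moe

lo, hi = t_confidence_interval([12.1, 11.8, 12.4, 12.0, 11.9, 12.3])
print(round(lo, 3), round(hi, 3))
```

Doubling the confidence level's coverage requirement widens t*, while growing n shrinks s/√n, matching the 1/√n behavior noted above.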
Hypothesis Testing
The framework:
State hypotheses: H₀ (null) vs. Hₐ (alternative)
Choose α: Significance level (usually 0.05)
Compute test statistic: z = (x̄ − μ₀)/(σ/√n)
Find p-value: P(observing data at least this extreme | H₀ is true)
Decision: If p-value < α, reject H₀
Example: One-sample z-test
Claim: μ = 500. Sample: n = 36, x̄ = 515, σ = 60
z = (515 − 500)/(60/√36) = 15/10 = 1.5
Two-sided p-value ≈ 0.134 > 0.05 → fail to reject H₀
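The worked example can be checked directly. This sketch uses only the standard library, computing the normal CDF via the error function (Φ(x) = ½(1 + erf(x/√2))).

```python
import math

# One-sample z-test for the example above: H0: mu = 500
n, xbar, mu0, sigma = 36, 515, 500, 60
z = (xbar - mu0) / (sigma / math.sqrt(n))                      # = 1.5
phi = 0.5 * (1 + math.erf(abs(z) / math.sqrt(2)))              # standard normal CDF
p_two_sided = 2 * (1 - phi)                                    # ≈ 0.134
print(z, round(p_two_sided, 3))
```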
Type I & II Errors
Type I (α): Rejecting H₀ when it's true (false positive)
Type II (β): Failing to reject H₀ when it's false (false negative)
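The meaning of α can be checked by simulation: when H₀ is actually true, a level-0.05 test should reject about 5% of the time. A small Monte Carlo sketch, reusing the example's (hypothetical) parameters:

```python
import random

random.seed(0)
mu0, sigma, n = 500, 60, 36
z_crit = 1.96                      # two-sided critical value at alpha = 0.05
trials, rejections = 20000, 0
for _ in range(trials):
    sample = [random.gauss(mu0, sigma) for _ in range(n)]  # H0 is true here
    xbar = sum(sample) / n
    z = (xbar - mu0) / (sigma / n ** 0.5)
    if abs(z) > z_crit:
        rejections += 1            # a Type I error
print(rejections / trials)         # should be close to 0.05
```

Simulating with a true mean different from μ₀ would instead estimate β, the Type II error rate.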
Regression
Regression finds the line of best fit by minimizing the sum of squared residuals (a calculus optimization: set the partial derivatives to zero). For multiple predictors, matrix algebra gives the solution: b = (XᵀX)⁻¹Xᵀy.
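The normal-equation solution can be sketched with numpy on made-up data. Solving the linear system is numerically preferable to forming the explicit inverse, but is algebraically the same b = (XᵀX)⁻¹Xᵀy.

```python
import numpy as np

# First column of ones gives the intercept; second column is the predictor.
X = np.column_stack([np.ones(5), np.array([1.0, 2.0, 3.0, 4.0, 5.0])])
y = np.array([2.1, 4.0, 6.2, 7.9, 10.1])

# Normal equations: (XᵀX) b = Xᵀy
b = np.linalg.solve(X.T @ X, X.T @ y)
print(b)  # [intercept, slope] ≈ [0.09, 1.99]
```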
ANOVA & Chi-Square Tests
ANOVA (Analysis of Variance) tests whether means differ across groups — it generalizes the t-test. The F-statistic = MS_between / MS_within. Chi-Square tests independence in contingency tables and goodness-of-fit for categorical data.
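Both tests are available in scipy.stats; a brief sketch on made-up data (three small groups for the ANOVA, a 2×2 contingency table for the chi-square):

```python
from scipy import stats

# One-way ANOVA: do the three groups share a common mean?
g1, g2, g3 = [4, 5, 6, 5], [7, 8, 6, 7], [5, 6, 5, 6]
f_stat, p_anova = stats.f_oneway(g1, g2, g3)   # F = MS_between / MS_within

# Chi-square test of independence on a contingency table
table = [[30, 10], [20, 40]]
chi2, p_chi2, dof, expected = stats.chi2_contingency(table)

print(round(f_stat, 2), round(p_anova, 3))
print(round(chi2, 2), round(p_chi2, 4), dof)
```

Note that for a 2×2 table, `chi2_contingency` applies Yates' continuity correction by default.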
Modern statistics increasingly uses computational methods: bootstrapping, permutation tests, and Bayesian approaches. These still rely on the probability and descriptive foundations covered earlier, but add computational power to handle complex real-world data.
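As one computational example, a percentile bootstrap CI for the mean resamples the data with replacement and reads off quantiles of the resampled means, with no distributional formula required. A minimal sketch on made-up data:

```python
import random

random.seed(1)
data = [12.1, 11.8, 12.4, 12.0, 11.9, 12.3, 12.2, 11.7]

boot_means = []
for _ in range(10000):
    resample = [random.choice(data) for _ in data]   # sample with replacement
    boot_means.append(sum(resample) / len(resample))

boot_means.sort()
lo, hi = boot_means[249], boot_means[9749]           # 2.5th and 97.5th percentiles
print(round(lo, 3), round(hi, 3))
```

The same resampling idea, applied to group labels instead of values, gives permutation tests.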