Statistical Inference

Make evidence-based conclusions about populations using sample data.

Point & Interval Estimation

Confidence interval for the mean:
x̄ ± z*(σ/√n)  (known σ)
x̄ ± t*(s/√n)  (unknown σ, use t-distribution)

A 95% CI means: if we repeated the sampling many times, about 95% of the intervals would contain the true parameter. This is the frequentist interpretation, grounded in long-run probability. The margin of error shrinks in proportion to 1/√n, so quadrupling the sample size halves the interval width.
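As a minimal sketch, the known-σ z-interval above can be computed with Python's standard library; the sample values (x̄ = 100, σ = 15, n = 25) are made up for illustration.

```python
import math
from statistics import NormalDist

def z_confidence_interval(xbar, sigma, n, confidence=0.95):
    # Critical value z* from the standard normal, e.g. ~1.96 for 95%
    z = NormalDist().inv_cdf((1 + confidence) / 2)
    margin = z * sigma / math.sqrt(n)       # z*(σ/√n)
    return (xbar - margin, xbar + margin)

# Illustrative sample: x̄ = 100, σ = 15, n = 25
lo, hi = z_confidence_interval(xbar=100, sigma=15, n=25)
```

For unknown σ, the same structure applies with t* from the t-distribution with n − 1 degrees of freedom in place of z*.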

Hypothesis Testing

The framework:

  1. State hypotheses: H₀ (null) vs. Hₐ (alternative)
  2. Choose α: Significance level (usually 0.05)
  3. Compute test statistic: z = (x̄ − μ₀)/(σ/√n)
  4. Find p-value: P(observing data this extreme | H₀ is true)
  5. Decision: If p-value < α, reject H₀

Example: One-sample z-test

Claim: μ = 500. Sample: n = 36, x̄ = 515, σ = 60

z = (515 − 500)/(60/√36) = 15/10 = 1.5

Two-sided p-value ≈ 0.134 > 0.05 → fail to reject H₀
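The worked example can be reproduced with a few lines of standard-library Python:

```python
import math
from statistics import NormalDist

# One-sample z-test from the example: μ₀ = 500, n = 36, x̄ = 515, σ = 60
mu0, n, xbar, sigma = 500, 36, 515, 60

z = (xbar - mu0) / (sigma / math.sqrt(n))   # test statistic: 1.5
p_value = 2 * (1 - NormalDist().cdf(z))     # two-sided p-value ≈ 0.134
reject = p_value < 0.05                      # False: fail to reject H₀
```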

Type I & II Errors

A Type I error is rejecting H₀ when it is true (probability α); a Type II error is failing to reject H₀ when it is false (probability β). Power = 1 − β, the chance of detecting a real effect. Increasing sample size increases power without inflating α. These trade-offs are fundamental to experimental design.
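A sketch of that trade-off for a two-sided z-test: power rises with n while α stays fixed. The effect size (0.5 standard deviations) and sample sizes below are illustrative assumptions.

```python
import math
from statistics import NormalDist

def z_test_power(effect_size, n, alpha=0.05):
    # effect_size = (μ₁ − μ₀)/σ, in standard-deviation units
    nd = NormalDist()
    z_crit = nd.inv_cdf(1 - alpha / 2)      # rejection threshold
    shift = effect_size * math.sqrt(n)      # mean of z under the alternative
    # Probability the statistic lands in either rejection region
    return (1 - nd.cdf(z_crit - shift)) + nd.cdf(-z_crit - shift)

p_small = z_test_power(0.5, n=20)   # modest power
p_large = z_test_power(0.5, n=80)   # near-certain detection
```

Here α is held at 0.05 in both calls; only n changes, and power climbs from roughly 0.6 toward 1.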

Regression Analysis

Simple linear regression: ŷ = b₀ + b₁x
b₁ = Σ(xᵢ − x̄)(yᵢ − ȳ)/Σ(xᵢ − x̄)²
b₀ = ȳ − b₁x̄
R² = 1 − SS_res/SS_tot

Regression finds the line of best fit using calculus optimization (minimizing the sum of squared residuals). For multiple predictors, matrix algebra gives the solution: b = (XᵀX)⁻¹Xᵀy.
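The simple-regression formulas above translate directly into code; the four data points are made up for illustration.

```python
def simple_linear_regression(xs, ys):
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    # b₁ = Σ(xᵢ − x̄)(yᵢ − ȳ) / Σ(xᵢ − x̄)²
    b1 = (sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
          / sum((x - xbar) ** 2 for x in xs))
    # b₀ = ȳ − b₁x̄
    b0 = ybar - b1 * xbar
    # R² = 1 − SS_res/SS_tot
    ss_res = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - ybar) ** 2 for y in ys)
    return b0, b1, 1 - ss_res / ss_tot

# Illustrative data: roughly y ≈ 2x
b0, b1, r2 = simple_linear_regression([1, 2, 3, 4], [2.1, 4.0, 6.2, 7.9])
```

For multiple predictors, the same least-squares criterion leads to the matrix solution b = (XᵀX)⁻¹Xᵀy quoted above.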

ANOVA & Chi-Square Tests

ANOVA (Analysis of Variance) tests whether means differ across groups, generalizing the two-sample t-test to three or more groups. The F-statistic = MS_between / MS_within; a large F indicates between-group variation exceeds within-group noise. Chi-Square tests independence in contingency tables and goodness-of-fit for categorical data.
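A sketch of the one-way ANOVA F-statistic built directly from the MS_between / MS_within definition; the three groups below are illustrative.

```python
def one_way_anova_F(groups):
    k = len(groups)                          # number of groups
    N = sum(len(g) for g in groups)          # total observations
    grand = sum(sum(g) for g in groups) / N  # grand mean
    # Between-group sum of squares: group sizes times squared mean deviations
    ss_between = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups)
    # Within-group sum of squares: deviations from each group's own mean
    ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
    ms_between = ss_between / (k - 1)        # df_between = k − 1
    ms_within = ss_within / (N - k)          # df_within = N − k
    return ms_between / ms_within

# Illustrative groups; the third clearly differs from the first two
F = one_way_anova_F([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
```

In practice the F-statistic is compared against an F(k − 1, N − k) distribution to get a p-value.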

Modern statistics increasingly uses computational methods: bootstrapping, permutation tests, and Bayesian approaches. These still rely on the probability and descriptive foundations covered earlier, but add computational power to handle complex real-world data.
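A percentile-bootstrap CI for the mean illustrates the computational approach: resample the data with replacement instead of assuming a sampling distribution. The dataset, the 1000-resample count, and the fixed seed are illustrative choices.

```python
import random

def bootstrap_ci(data, n_resamples=1000, confidence=0.95, seed=0):
    rng = random.Random(seed)  # fixed seed for reproducibility
    # Mean of each resample, drawn with replacement from the data
    means = sorted(
        sum(rng.choices(data, k=len(data))) / len(data)
        for _ in range(n_resamples)
    )
    # Percentile interval: cut off (1 − confidence)/2 in each tail
    lo_idx = int((1 - confidence) / 2 * n_resamples)
    hi_idx = int((1 + confidence) / 2 * n_resamples) - 1
    return means[lo_idx], means[hi_idx]

# Illustrative data with sample mean 14.5
lo, hi = bootstrap_ci([12, 15, 14, 10, 18, 20, 11, 16])
```

No normality assumption is needed; the bootstrap distribution of resample means stands in for the theoretical sampling distribution.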