Statistical Inference

Make evidence-based conclusions about populations using sample data.

Point & Interval Estimation

Confidence interval for the mean:
x̄ ± z*(σ/√n)  (known σ)
x̄ ± t*(s/√n)  (unknown σ, use t-distribution)

A 95% CI means: if we repeated the sampling many times, about 95% of the intervals would contain the true parameter. This frequentist interpretation connects to probability theory. The margin of error, z*(σ/√n), shrinks in proportion to 1/√n as the sample size grows.
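The known-σ interval can be computed directly. A minimal sketch (the function name and the n = 36, x̄ = 515, σ = 60 inputs are illustrative, borrowed from the z-test example later in these notes):

```python
import math

def confidence_interval(xbar, sigma, n, z_star=1.96):
    """z-based CI for the mean when sigma is known (z* = 1.96 for 95%)."""
    margin = z_star * sigma / math.sqrt(n)
    return (xbar - margin, xbar + margin)

lo, hi = confidence_interval(xbar=515, sigma=60, n=36)
print(round(lo, 1), round(hi, 1))  # 495.4 534.6
```

With unknown σ, the same shape applies but s replaces σ and t* replaces z*, with n − 1 degrees of freedom.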

Hypothesis Testing

The framework:

  1. State hypotheses: H₀ (null) vs. Hₐ (alternative)
  2. Choose α: Significance level (usually 0.05)
  3. Compute test statistic: z = (x̄ − μ₀)/(σ/√n)
  4. Find p-value: P(observing data this extreme | H₀ is true)
  5. Decision: If p-value < α, reject H₀

Example: One-sample z-test

Claim: μ = 500. Sample: n = 36, x̄ = 515, σ = 60

z = (515 − 500)/(60/√36) = 15/10 = 1.5

Two-sided p-value ≈ 0.134 > 0.05 → fail to reject H₀
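The five steps of the framework, applied to this example, can be sketched in a few lines (the function name is illustrative; `statistics.NormalDist` supplies the standard normal CDF):

```python
from statistics import NormalDist

def one_sample_z_test(xbar, mu0, sigma, n, alpha=0.05):
    """Two-sided one-sample z-test: returns (z, p_value, reject_H0)."""
    z = (xbar - mu0) / (sigma / n ** 0.5)       # step 3: test statistic
    p = 2 * (1 - NormalDist().cdf(abs(z)))      # step 4: two-sided p-value
    return z, p, p < alpha                      # step 5: decision

z, p, reject = one_sample_z_test(xbar=515, mu0=500, sigma=60, n=36)
print(z, round(p, 3), reject)  # 1.5 0.134 False
```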

Type I & II Errors

  • Type I (α): Rejecting H₀ when it's true (false positive)
  • Type II (β): Failing to reject H₀ when it's false (false negative)
  • Power = 1 − β: Probability of correctly rejecting a false H₀

Increasing sample size increases power without inflating α. These trade-offs are fundamental to experimental design.
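To make the power/sample-size relationship concrete, here is a sketch of the power of a two-sided one-sample z-test when the true mean is μ₁ (function name illustrative; the example values reuse the z-test above with μ₁ = 515):

```python
import math
from statistics import NormalDist

def power_z_test(mu0, mu1, sigma, n, alpha=0.05):
    """Power of a two-sided one-sample z-test if the true mean is mu1."""
    nd = NormalDist()
    z_crit = nd.inv_cdf(1 - alpha / 2)                 # rejection cutoff
    delta = abs(mu1 - mu0) / (sigma / math.sqrt(n))    # standardized shift
    # Probability the test statistic lands in either rejection region
    return nd.cdf(delta - z_crit) + nd.cdf(-delta - z_crit)

print(round(power_z_test(500, 515, 60, 36), 3))   # ~0.323
print(round(power_z_test(500, 515, 60, 144), 3))  # ~0.851 — larger n, more power
```

Quadrupling n doubles the standardized shift δ, which is why the power jumps while α stays fixed at 0.05.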

Regression Analysis

Simple linear regression: ŷ = b₀ + b₁x
b₁ = Σ(xᵢ − x̄)(yᵢ − ȳ)/Σ(xᵢ − x̄)²
b₀ = ȳ − b₁x̄
R² = 1 − SS_res/SS_tot

Regression finds the line of best fit using calculus optimization (minimizing the sum of squared residuals). For multiple predictors, matrix algebra gives the solution: b = (XᵀX)⁻¹Xᵀy.
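The simple-regression formulas above translate directly into code. A self-contained sketch (function name illustrative):

```python
def simple_linear_regression(xs, ys):
    """Least-squares slope b1, intercept b0, and R^2 from the formulas above."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    b1 = (sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
          / sum((x - xbar) ** 2 for x in xs))
    b0 = ybar - b1 * xbar
    ss_res = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - ybar) ** 2 for y in ys)
    return b0, b1, 1 - ss_res / ss_tot

b0, b1, r2 = simple_linear_regression([1, 2, 3, 4], [2, 4, 6, 8])
print(b0, b1, r2)  # perfectly linear data: b0 = 0, b1 = 2, R^2 = 1
```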

ANOVA & Chi-Square Tests

ANOVA (Analysis of Variance) tests whether means differ across groups — it generalizes the t-test. The F-statistic = MS_between / MS_within. Chi-Square tests independence in contingency tables and goodness-of-fit for categorical data.
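The F-statistic defined above can be computed by hand for a one-way layout. A minimal sketch (function name and sample data illustrative):

```python
def one_way_anova_f(groups):
    """F = MS_between / MS_within for a list of samples, one per group."""
    k = len(groups)                                  # number of groups
    n = sum(len(g) for g in groups)                  # total observations
    grand = sum(sum(g) for g in groups) / n          # grand mean
    ss_between = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups)
    ss_within = sum(sum((x - sum(g) / len(g)) ** 2 for x in g) for g in groups)
    ms_between = ss_between / (k - 1)                # df_between = k - 1
    ms_within = ss_within / (n - k)                  # df_within = n - k
    return ms_between / ms_within

print(one_way_anova_f([[1, 2], [3, 4]]))  # 8.0
```

A large F means the variation between group means dwarfs the variation within groups; the p-value then comes from the F-distribution with (k − 1, n − k) degrees of freedom.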

Modern statistics increasingly uses computational methods: bootstrapping, permutation tests, and Bayesian approaches. These still rely on the probability and descriptive foundations covered earlier, but add computational power to handle complex real-world data.
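As one example of these computational methods, a percentile bootstrap CI resamples the data with replacement instead of assuming a sampling distribution. A minimal sketch (function name, data, and the fixed seed are illustrative):

```python
import random

def bootstrap_ci(data, stat, n_boot=10_000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for any statistic of the data."""
    rng = random.Random(seed)  # fixed seed for reproducibility
    # Resample with replacement, compute the statistic each time, sort
    reps = sorted(stat([rng.choice(data) for _ in data])
                  for _ in range(n_boot))
    lo = reps[int((alpha / 2) * n_boot)]
    hi = reps[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

data = [2, 4, 4, 4, 5, 5, 7, 9]
print(bootstrap_ci(data, lambda s: sum(s) / len(s)))  # CI for the mean
```

No normality assumption is needed; the same function works for medians, trimmed means, or any other statistic.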