Statistical Inference

Make evidence-based conclusions about populations using sample data.


Point & Interval Estimation

Confidence interval for the mean:
x̄ ± z*(σ/√n)  (known σ)
x̄ ± t*(s/√n)  (unknown σ, use t-distribution)

A 95% CI means: if we repeated the sampling many times, about 95% of the resulting intervals would contain the true parameter. This is a frequentist statement about the procedure, not about any single interval. The margin of error shrinks in proportion to 1/√n, so quadrupling the sample size halves the interval width.
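A minimal sketch of the known-σ interval in Python, reusing the numbers from the z-test example in this section (n = 36, x̄ = 515, σ = 60):

```python
import math

def z_confidence_interval(xbar, sigma, n, z_star=1.96):
    # x̄ ± z*(σ/√n): 95% CI for the mean when σ is known
    moe = z_star * sigma / math.sqrt(n)
    return (xbar - moe, xbar + moe)

lo, hi = z_confidence_interval(xbar=515, sigma=60, n=36)
# margin of error = 1.96 * 60/6 = 19.6, so the interval is (495.4, 534.6)
```

With unknown σ, replace `z_star` with the t critical value for n − 1 degrees of freedom and `sigma` with the sample standard deviation s.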

Hypothesis Testing

The framework:

  1. State hypotheses: H₀ (null) vs. Hₐ (alternative)
  2. Choose α: Significance level (usually 0.05)
  3. Compute test statistic: z = (x̄ − μ₀)/(σ/√n)
  4. Find p-value: P(observing data this extreme | H₀ is true)
  5. Decision: If p-value < α, reject H₀; otherwise fail to reject H₀

Example: One-sample z-test

Claim: μ = 500. Sample: n = 36, x̄ = 515, σ = 60

z = (515 − 500)/(60/√36) = 15/10 = 1.5

p-value ≈ 0.134 > 0.05 → fail to reject H₀
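The worked example can be reproduced in a few lines of Python, using the normal CDF written via `math.erf` (two-sided p-value assumed, consistent with p ≈ 0.134):

```python
import math

def one_sample_z_test(xbar, mu0, sigma, n):
    # z = (x̄ − μ₀)/(σ/√n), two-sided p-value from the standard normal CDF
    z = (xbar - mu0) / (sigma / math.sqrt(n))
    phi = 0.5 * (1 + math.erf(abs(z) / math.sqrt(2)))  # P(Z ≤ |z|)
    p_value = 2 * (1 - phi)
    return z, p_value

z, p = one_sample_z_test(xbar=515, mu0=500, sigma=60, n=36)
# z = 1.5, p ≈ 0.134 → fail to reject H₀ at α = 0.05
```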


Type I & II Errors

  • Type I (α): Rejecting H₀ when it's true (false positive)
  • Type II (β): Failing to reject H₀ when it's false (false negative)
  • Power = 1 − β: Probability of correctly rejecting a false H₀

Increasing sample size increases power without inflating α. These trade-offs are fundamental to experimental design.
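The effect of sample size on power can be checked numerically. A sketch for a one-sided z-test, assuming α = 0.05 (critical value ≈ 1.645) and reusing μ₀ = 500, true mean 515, σ = 60 from the example above:

```python
import math

def normal_cdf(x):
    # standard normal CDF via the error function
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def power_one_sided_z(mu0, mu_true, sigma, n, z_alpha=1.645):
    # Power = P(reject H₀ | true mean = mu_true)
    # = Φ(effect·√n/σ − z_α) for a one-sided upper-tail test
    effect = (mu_true - mu0) * math.sqrt(n) / sigma
    return normal_cdf(effect - z_alpha)

power_36 = power_one_sided_z(500, 515, 60, n=36)    # ≈ 0.44
power_100 = power_one_sided_z(500, 515, 60, n=100)  # ≈ 0.80
```

Note that α stays fixed at 0.05 in both cases; only β shrinks as n grows.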

Regression Analysis

Simple linear regression: ŷ = b₀ + b₁x
b₁ = Σ(xᵢ − x̄)(yᵢ − ȳ)/Σ(xᵢ − x̄)²
b₀ = ȳ − b₁x̄
R² = 1 − SS_res/SS_tot

Regression finds the line of best fit by minimizing the sum of squared residuals (an optimization solved with calculus). For multiple predictors, the normal equations give the solution in matrix form: b = (XᵀX)⁻¹Xᵀy.
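The least-squares formulas above translate directly into code. A self-contained sketch with invented toy data (points lying exactly on y = 2x + 1, so R² should be 1):

```python
def simple_linear_regression(xs, ys):
    # b₁ = Σ(xᵢ − x̄)(yᵢ − ȳ) / Σ(xᵢ − x̄)²,  b₀ = ȳ − b₁x̄
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    sxx = sum((x - xbar) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    # R² = 1 − SS_res/SS_tot
    ss_res = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - ybar) ** 2 for y in ys)
    r2 = 1 - ss_res / ss_tot
    return b0, b1, r2

b0, b1, r2 = simple_linear_regression([1, 2, 3, 4], [3, 5, 7, 9])
# b0 = 1.0, b1 = 2.0, r2 = 1.0 (perfect linear fit)
```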

ANOVA & Chi-Square Tests

ANOVA (Analysis of Variance) tests whether means differ across groups — it generalizes the t-test. The F-statistic = MS_between / MS_within. Chi-Square tests independence in contingency tables and goodness-of-fit for categorical data.
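The F-statistic can be computed from first principles. A sketch of one-way ANOVA with invented toy groups (three groups of three observations):

```python
def one_way_anova_f(groups):
    # F = MS_between / MS_within across k groups, n total observations
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups
    )
    ss_within = sum(
        sum((x - sum(g) / len(g)) ** 2 for x in g) for g in groups
    )
    ms_between = ss_between / (k - 1)   # df_between = k − 1
    ms_within = ss_within / (n - k)     # df_within = n − k
    return ms_between / ms_within

f_stat = one_way_anova_f([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
# group means 2, 3, 4; grand mean 3 → SS_between = 6, SS_within = 6 → F = 3.0
```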

Modern statistics increasingly uses computational methods: bootstrapping, permutation tests, and Bayesian approaches. These still rely on the probability and descriptive foundations covered earlier, but add computational power to handle complex real-world data.
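A minimal sketch of one such computational method, the percentile bootstrap CI for the mean; the data values and resample count are made up for illustration:

```python
import random
import statistics

def bootstrap_ci_mean(data, n_resamples=2000, alpha=0.05, seed=0):
    # Percentile bootstrap: resample with replacement, take empirical
    # quantiles of the resampled means as the interval endpoints
    rng = random.Random(seed)  # fixed seed for reproducibility
    means = sorted(
        statistics.fmean(rng.choices(data, k=len(data)))
        for _ in range(n_resamples)
    )
    lo = means[int((alpha / 2) * n_resamples)]
    hi = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lo, hi

data = [4.1, 5.3, 4.8, 5.0, 4.6, 5.2, 4.9, 5.1]
ci_lo, ci_hi = bootstrap_ci_mean(data)
```

Unlike the z- and t-intervals earlier, this makes no normality assumption; it substitutes resampling for a distributional formula.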