Statistical Inference

Make evidence-based conclusions about populations using sample data.

Point & Interval Estimation

Confidence interval for the mean:
x̄ ± z*(σ/√n)  (known σ)
x̄ ± t*(s/√n)  (unknown σ, use t-distribution)

A 95% CI means: if we repeated the sampling many times, about 95% of the intervals would contain the true parameter. This is the frequentist interpretation, grounded in probability theory. The margin of error shrinks as n grows, because the standard error σ/√n tends to zero.
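As a minimal sketch of the known-σ formula above (the function name and sample numbers are illustrative; only the standard library is used):

```python
from math import sqrt
from statistics import NormalDist

def z_confidence_interval(xbar, sigma, n, conf=0.95):
    """CI for the mean with known sigma: xbar ± z* · sigma/√n."""
    z_star = NormalDist().inv_cdf((1 + conf) / 2)  # ≈ 1.96 for 95%
    moe = z_star * sigma / sqrt(n)                 # margin of error
    return xbar - moe, xbar + moe

lo, hi = z_confidence_interval(xbar=515, sigma=60, n=36, conf=0.95)
print(f"95% CI: ({lo:.1f}, {hi:.1f})")  # → 95% CI: (495.4, 534.6)
```

For unknown σ, the same structure applies with `s` in place of `sigma` and a t critical value (e.g. from `scipy.stats.t.ppf`) in place of `z_star`.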

Hypothesis Testing

The framework:

  1. State hypotheses: H₀ (null) vs. Hₐ (alternative)
  2. Select a test: choose a test statistic appropriate to the data and check its assumptions
  3. Choose α: Significance level (usually 0.05)
  4. Compute test statistic: z = (x̄ − μ₀)/(σ/√n)
  5. Find p-value: P(observing data this extreme | H₀ is true)
  6. Decision: If p-value < α, reject H₀

Example: One-sample z-test

Hypotheses: H₀: μ = 500 vs. Hₐ: μ ≠ 500. Sample: n = 36, x̄ = 515, σ = 60

z = (515 − 500)/(60/√36) = 15/10 = 1.5

p-value ≈ 0.134 (two-tailed) > 0.05 → fail to reject H₀
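The worked example above can be reproduced directly (standard library only; variable names are illustrative):

```python
from math import sqrt
from statistics import NormalDist

# One-sample, two-tailed z-test for H0: mu = 500, matching the example above.
n, xbar, sigma, mu0 = 36, 515, 60, 500

z = (xbar - mu0) / (sigma / sqrt(n))          # (515 - 500)/10 = 1.5
p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # two-tailed p-value

print(f"z = {z:.2f}, p = {p_value:.3f}")      # → z = 1.50, p = 0.134
```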

Type I & II Errors

  • Type I (α): Rejecting H₀ when it's true (false positive)
  • Type II (β): Failing to reject H₀ when it's false (false negative)
  • Power = 1 − β: Probability of correctly rejecting a false H₀

For a fixed sample size, lowering α (fewer false positives) raises β (more false negatives). Increasing the sample size increases power without inflating α. These trade-offs are fundamental to experimental design.

Regression Analysis

Simple linear regression: ŷ = b₀ + b₁x
b₁ = Σ(xᵢ − x̄)(yᵢ − ȳ)/Σ(xᵢ − x̄)²
b₀ = ȳ − b₁x̄
R² = 1 − SS_res/SS_tot

Regression finds the line of best fit using calculus optimization (minimizing the sum of squared residuals). For multiple predictors, matrix algebra gives the solution: b = (XᵀX)⁻¹Xᵀy.
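The formulas for b₀, b₁, and R² translate directly into code. A minimal sketch (function name is illustrative; the toy data is chosen so the fit is exact):

```python
def simple_linear_regression(xs, ys):
    """Least-squares fit yhat = b0 + b1*x, plus R^2, from the formulas above."""
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    b1 = (sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
          / sum((x - xbar) ** 2 for x in xs))
    b0 = ybar - b1 * xbar
    ss_res = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - ybar) ** 2 for y in ys)
    return b0, b1, 1 - ss_res / ss_tot

# Perfectly linear data y = 1 + 2x should give R^2 = 1.
b0, b1, r2 = simple_linear_regression([1, 2, 3, 4], [3, 5, 7, 9])
print(b0, b1, r2)  # → 1.0 2.0 1.0
```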

ANOVA & Chi-Square Tests

ANOVA (Analysis of Variance) tests whether means differ across groups — it generalizes the t-test. The F-statistic = MS_between / MS_within. Chi-Square tests independence in contingency tables and goodness-of-fit for categorical data.
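The F-statistic above can be computed by hand for a one-way layout. A sketch (function name and toy data are illustrative):

```python
def one_way_anova_f(groups):
    """F = MS_between / MS_within for a list of groups of observations."""
    k = len(groups)                          # number of groups
    n = sum(len(g) for g in groups)          # total observations
    grand = sum(sum(g) for g in groups) / n  # grand mean
    ss_between = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups)
    ss_within = sum(sum((x - sum(g) / len(g)) ** 2 for x in g) for g in groups)
    ms_between = ss_between / (k - 1)        # between-groups mean square
    ms_within = ss_within / (n - k)          # within-groups mean square
    return ms_between / ms_within

# Third group's mean is far from the others, so F is large.
f = one_way_anova_f([[1, 2, 3], [2, 3, 4], [6, 7, 8]])
print(f)  # → 21.0
```

In practice the F value is compared against the F-distribution with (k − 1, n − k) degrees of freedom (e.g. via `scipy.stats.f_oneway`, which returns both F and the p-value).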

Modern statistics increasingly uses computational methods: bootstrapping, permutation tests, and Bayesian approaches. These still rely on the probability and descriptive foundations covered earlier, but add computational power to handle complex real-world data.
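As one example of these computational methods, a percentile bootstrap CI needs only resampling with replacement (function name, replication count, and sample data are illustrative):

```python
import random
from statistics import mean

def bootstrap_ci(data, stat=mean, conf=0.95, reps=5000, seed=0):
    """Percentile bootstrap CI: resample with replacement, take quantiles."""
    rng = random.Random(seed)
    stats = sorted(stat(rng.choices(data, k=len(data))) for _ in range(reps))
    lo_i = int((1 - conf) / 2 * reps)        # lower-tail index
    hi_i = int((1 - (1 - conf) / 2) * reps) - 1  # upper-tail index
    return stats[lo_i], stats[hi_i]

sample = [12, 15, 9, 20, 14, 11, 17, 13, 16, 10]
lo, hi = bootstrap_ci(sample)
print(f"bootstrap 95% CI for the mean: ({lo:.1f}, {hi:.1f})")
```

Unlike the z- and t-intervals above, this makes no normality assumption; the sampling distribution is approximated empirically.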