
Probability & Distributions

Model uncertainty with probability and understand how random variables behave.


Probability Basics

P(A) = favorable outcomes / total outcomes
0 ≤ P(A) ≤ 1
P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
P(A') = 1 − P(A)

Probability quantifies uncertainty. These rules follow from set theory: unions, intersections, and complements. Counting techniques from combinatorics (permutations and combinations) are essential for computing probabilities in finite sample spaces with equally likely outcomes.
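
The rules above can be checked directly on a small finite sample space. The sketch below uses a hypothetical fair-die example with exact fractions; the event names A and B are illustrative choices, not from the notes.

```python
from fractions import Fraction

# Sample space: one roll of a fair six-sided die (hypothetical example)
omega = set(range(1, 7))
A = {2, 4, 6}   # roll is even
B = {4, 5, 6}   # roll is greater than 3

def P(event):
    # Equally likely outcomes: favorable / total, kept exact with Fraction
    return Fraction(len(event), len(omega))

# Inclusion-exclusion: P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
assert P(A | B) == P(A) + P(B) - P(A & B)

# Complement rule: P(A') = 1 − P(A)
assert P(omega - A) == 1 - P(A)

print(P(A | B))  # 2/3
```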

Conditional Probability & Bayes' Theorem

P(A|B) = P(A ∩ B)/P(B)
Independent events: P(A ∩ B) = P(A)·P(B)
Bayes: P(A|B) = P(B|A)·P(A)/P(B)

Bayes' Theorem is the backbone of Bayesian inference and machine learning: it lets us update beliefs with new evidence. The law of total probability connects it to a partition {Aᵢ} of the sample space: P(B) = Σ P(B|Aᵢ)·P(Aᵢ).
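
A minimal sketch of Bayes' theorem combined with the law of total probability. The two-urn scenario in the usage example is hypothetical, chosen only to give concrete numbers.

```python
def total_probability(likelihoods, priors):
    """P(B) = Σ P(B|Aᵢ)·P(Aᵢ) over a partition {Aᵢ}."""
    return sum(l * p for l, p in zip(likelihoods, priors))

def bayes(likelihood_i, prior_i, likelihoods, priors):
    """P(Aᵢ|B) = P(B|Aᵢ)·P(Aᵢ) / P(B)."""
    return likelihood_i * prior_i / total_probability(likelihoods, priors)

# Hypothetical example: urn 1 (prior 0.5) holds 30% red balls,
# urn 2 (prior 0.5) holds 60% red. A red ball is drawn.
posterior_urn1 = bayes(0.3, 0.5, [0.3, 0.6], [0.5, 0.5])
print(round(posterior_urn1, 3))  # 0.333
```

Note that the posterior drops below the 0.5 prior because urn 2 makes the red draw more likely.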

Example: Disease Testing

Disease prevalence: 1%. Test sensitivity: 99%. False positive rate: 5%.


P(Disease | Positive) = (0.99×0.01)/(0.99×0.01 + 0.05×0.99) ≈ 0.0099/0.0594 ≈ 16.7%

Even with a good test, a positive result is only 16.7% likely to be a true positive when the disease is rare!
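
The calculation above is short enough to verify in a few lines, using the prevalence, sensitivity, and false positive rate given in the example.

```python
prevalence = 0.01           # P(Disease)
sensitivity = 0.99          # P(Positive | Disease)
false_positive_rate = 0.05  # P(Positive | No disease)

# Total probability: P(Positive) over the partition {Disease, No disease}
p_positive = (sensitivity * prevalence
              + false_positive_rate * (1 - prevalence))

# Bayes: P(Disease | Positive)
p_disease_given_positive = sensitivity * prevalence / p_positive

print(round(p_positive, 4))                 # 0.0594
print(round(p_disease_given_positive, 3))   # 0.167
```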

Discrete Distributions

  • Bernoulli: Single trial, p = success. E[X] = p, Var = p(1−p)
  • Binomial(n, p): P(X = k) = C(n,k)·pᵏ(1−p)ⁿ⁻ᵏ — k successes in n independent trials; the C(n,k) coefficients come from the binomial theorem
  • Poisson(λ): P(X = k) = e^(−λ)·λᵏ/k! — models counts of rare events per interval
  • Geometric(p): P(X = k) = (1−p)ᵏ⁻¹·p — number of trials until the first success; its tail decays geometrically, the discrete analogue of the exponential distribution
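
The PMF formulas above can be implemented directly with the standard library; the parameter values in the checks are arbitrary illustrations.

```python
from math import comb, exp, factorial

def binomial_pmf(k, n, p):
    # C(n,k)·p^k·(1−p)^(n−k)
    return comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, lam):
    # e^(−λ)·λ^k / k!
    return exp(-lam) * lam**k / factorial(k)

def geometric_pmf(k, p):
    # (1−p)^(k−1)·p, where k ≥ 1 is the trial of the first success
    return (1 - p)**(k - 1) * p

# The binomial PMF sums to 1 over its full support {0, …, n}
assert abs(sum(binomial_pmf(k, 10, 0.3) for k in range(11)) - 1) < 1e-12

# And its mean matches E[X] = n·p
mean = sum(k * binomial_pmf(k, 10, 0.3) for k in range(11))
print(round(mean, 6))  # 3.0
```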

Continuous Distributions

Normal: f(x) = (1/σ√(2π))·e^(−(x−μ)²/(2σ²))
Z-score: z = (x − μ)/σ
Standard normal: μ = 0, σ = 1

For continuous distributions, probabilities are areas under the density curve, so computing them requires integration. The normal (Gaussian) distribution is the most important: it arises whenever many small independent effects add up, from measurement error to biological variation.

  • Uniform(a, b): f(x) = 1/(b−a) — constant density
  • Exponential(λ): f(x) = λe^(−λx) — time between events
  • t-distribution: Used in hypothesis testing with small samples
  • Chi-squared: Sum of squared standard normals — used in goodness-of-fit tests
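
The normal density and z-score can be coded from the formulas above; `math.erf` gives the standard normal CDF without any external library. The 68% check is the familiar one-standard-deviation rule.

```python
from math import sqrt, pi, exp, erf

def normal_pdf(x, mu=0.0, sigma=1.0):
    # f(x) = (1/(σ√(2π)))·e^(−(x−μ)²/(2σ²))
    return exp(-(x - mu)**2 / (2 * sigma**2)) / (sigma * sqrt(2 * pi))

def normal_cdf(x, mu=0.0, sigma=1.0):
    z = (x - mu) / sigma                  # z-score: standardize first
    return 0.5 * (1 + erf(z / sqrt(2)))   # standard normal CDF via erf

# Area within one standard deviation of the mean ≈ 68.3%
within_one_sigma = normal_cdf(1) - normal_cdf(-1)
print(round(within_one_sigma, 3))  # 0.683
```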

Central Limit Theorem

Central Limit Theorem: Regardless of the population distribution (provided it has finite variance), the sample mean X̄ is approximately distributed N(μ, σ²/n) for large n. This is why the normal distribution is so important, and why statistical inference works. The convergence concept mirrors limits in calculus.
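
A quick simulation makes the CLT concrete. The choice of a Uniform(0, 1) population (μ = 0.5, σ² = 1/12) and the sample sizes are illustrative assumptions, not from the notes.

```python
import random

random.seed(0)  # fixed seed so the run is reproducible

# Population: Uniform(0, 1), so μ = 0.5 and σ² = 1/12
n, trials = 50, 2000

# Draw many samples of size n and record each sample mean
means = [sum(random.random() for _ in range(n)) / n for _ in range(trials)]

grand_mean = sum(means) / trials
var_of_means = sum((m - grand_mean)**2 for m in means) / trials

print(round(grand_mean, 2))       # close to μ = 0.5
print(round(var_of_means * n, 3)) # close to σ² = 1/12 ≈ 0.083
```

Even though each individual draw is uniform (flat, not bell-shaped), a histogram of `means` would already look distinctly normal at n = 50, with variance shrinking as σ²/n.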