Descriptive Statistics

Summarize, visualize, and understand data before diving into inference.

Measures of Center

Mean: x̄ = (Σxᵢ)/n
Median: middle value when sorted
Mode: most frequent value

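The mean uses every value in the data, so a single extreme value can pull it far from the bulk of the observations; it is sensitive to outliers. The median depends only on the middle of the sorted data, so it is more robust. For symmetric, unimodal distributions, mean ≈ median ≈ mode.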
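To make the outlier sensitivity concrete, here is a minimal sketch using Python's standard `statistics` module. The helper name `center`, the added outlier value of 100, and the `sizes` list are illustrative assumptions, not part of the original notes.

```python
import statistics

# Base data set, plus a copy with one extreme value appended (hypothetical).
data = [4, 7, 8, 10, 11]
with_outlier = data + [100]

def center(values):
    """Return the mean and median of a list of numbers."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
    }

print(center(data))          # mean 8, median 8
print(center(with_outlier))  # mean ~23.33, median 9

# The mode needs repeated values; this list is purely illustrative.
sizes = [7, 8, 8, 9, 8, 10]
print(statistics.mode(sizes))  # 8
```

One added value moves the mean from 8 to about 23.3, while the median only moves from 8 to 9.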

Measures of Spread

Variance: σ² = Σ(xᵢ − x̄)²/n (population form; the sample variance s² divides by n − 1 instead of n)
Standard deviation: σ = √(σ²)
Range: max − min
IQR: Q3 − Q1

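Variance measures the average squared deviation from the mean. Standard deviation has the same units as the data, which makes it the most widely used measure of spread. It connects to the normal distribution via the 68-95-99.7 rule: for normally distributed data, about 68% of values lie within 1 SD of the mean, about 95% within 2 SDs, and about 99.7% within 3 SDs.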
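The 68-95-99.7 percentages can be checked numerically. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper name `coverage` is an assumption made for illustration.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
std_normal = NormalDist(mu=0.0, sigma=1.0)

def coverage(k):
    """Probability that a normal value lies within k SDs of its mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD: {coverage(k):.4f}")
# within 1 SD: 0.6827
# within 2 SD: 0.9545
# within 3 SD: 0.9973
```

Because the rule is stated in units of standard deviations, the same percentages hold for any normal distribution, whatever its mean and SD.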

Example: Data: 4, 7, 8, 10, 11

Mean = 40/5 = 8

Deviations: −4, −1, 0, 2, 3. Squared: 16, 1, 0, 4, 9

Variance = 30/5 = 6. SD = √6 ≈ 2.45
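The worked example can be reproduced step by step in Python. This is a sketch using only the standard library; the variable names are illustrative. It also shows the sample variance (dividing by n − 1) for comparison, and an IQR computed with the default "exclusive" method of `statistics.quantiles`. Other quartile conventions can give a different IQR for the same data.

```python
import statistics

data = [4, 7, 8, 10, 11]

mean = statistics.fmean(data)            # 8.0
deviations = [x - mean for x in data]    # [-4.0, -1.0, 0.0, 2.0, 3.0]
squared = [d ** 2 for d in deviations]   # [16.0, 1.0, 0.0, 4.0, 9.0]

pop_var = sum(squared) / len(data)       # 30 / 5 = 6.0 (divide by n)
pop_sd = pop_var ** 0.5                  # sqrt(6), about 2.449

sample_var = statistics.variance(data)   # 30 / 4 = 7.5 (divide by n - 1)
data_range = max(data) - min(data)       # 11 - 4 = 7

# Quartiles depend on the interpolation convention; this uses the default.
q1, q2, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1

print(pop_var, round(pop_sd, 2), sample_var, data_range, iqr)
```

The hand-computed population variance agrees with `statistics.pvariance(data)`, which is a quick way to check the arithmetic.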

Data Visualization

  • Histograms: Show distribution shape and frequency (the area of the bars over an interval represents the share of data in it, which connects to integration of a density curve)
  • Box plots: Display median, quartiles, and outliers
  • Scatter plots: Show relationships between two variables → regression
  • Bar/pie charts: Compare categorical data
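As one way to see what a box plot displays, the sketch below computes the five-number summary and flags outliers numerically, without a plotting library. The 1.5 × IQR fence is a common convention assumed here, not something stated in these notes. The function name `box_plot_summary` and the data set, including the value 40, are hypothetical.

```python
import statistics

def box_plot_summary(values, fence_factor=1.5):
    """Five-number summary plus points outside the IQR fences.

    fence_factor=1.5 is the usual box plot convention (an assumption here).
    """
    ordered = sorted(values)
    q1, median, q3 = statistics.quantiles(ordered, n=4)
    iqr = q3 - q1
    lower_fence = q1 - fence_factor * iqr
    upper_fence = q3 + fence_factor * iqr
    outliers = [x for x in ordered if x < lower_fence or x > upper_fence]
    return {
        "min": ordered[0],
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": ordered[-1],
        "outliers": outliers,
    }

sample = [4, 7, 8, 10, 11, 40]   # hypothetical data with one extreme value
summary = box_plot_summary(sample)
for name, value in summary.items():
    print(f"{name}: {value}")
```

Plotting libraries may use a different quartile convention, so the box edges they draw can differ slightly from these numbers.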

Distribution Shape

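Skewness describes asymmetry: a right-skewed distribution has a long right tail and typically mean > median; a left-skewed distribution has a long left tail and typically mean < median. Kurtosis describes tail heaviness. The normal distribution has skewness 0 and kurtosis 3; "excess kurtosis" is defined as kurtosis − 3, so it equals 0 for the normal distribution.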
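A minimal sketch of moment-based skewness and kurtosis follows, assuming the population definitions m₃/m₂^1.5 and m₄/m₂², where mₖ is the k-th central moment. Statistical packages often apply small-sample corrections, so their output can differ. The function name `shape` and both data sets are illustrative.

```python
import statistics

def shape(values):
    """Return (skewness, kurtosis, excess kurtosis) from central moments."""
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((x - mean) ** 2 for x in values) / n   # variance (divide by n)
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    return skewness, kurtosis, kurtosis - 3

symmetric = [1, 2, 3, 4, 5]            # skewness is 0
right_skewed = [1, 2, 2, 3, 3, 4, 20]  # long right tail: mean 5, median 3

for name, values in (("symmetric", symmetric), ("right-skewed", right_skewed)):
    skewness, kurtosis, excess = shape(values)
    print(f"{name}: skewness={skewness:.3f}, excess kurtosis={excess:.3f}")
```

In the right-skewed list the mean (5) exceeds the median (3) and the skewness is positive, matching the description above.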

Descriptive statistics is the first step of any data analysis. Before fitting regression models or running hypothesis tests, always visualize and summarize your data. As the saying goes: "Plot your data."