Descriptive Statistics

Summarize, visualize, and understand data before diving into inference.

Measures of Center

Mean: x̄ = (Σxᵢ)/n
Median: middle value when sorted
Mode: most frequent value

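The mean uses every value in its calculation, so a single extreme value can pull it far from the bulk of the data. The median depends only on the order of the values, which makes it more robust to outliers. For symmetric, single-peaked distributions, mean ≈ median ≈ mode.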
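
The outlier sensitivity described above can be checked with a minimal sketch using Python's standard `statistics` module. The data values, including the outlier 100, are made up for illustration.

```python
from statistics import mean, median, mode

# Hypothetical data set, chosen only to illustrate the three measures.
data = [2, 3, 3, 4, 5]
# Same data with one extreme value appended.
with_outlier = data + [100]

print("without outlier:", mean(data), median(data), mode(data))
print("with outlier:   ", mean(with_outlier), median(with_outlier), mode(with_outlier))
```

In this sketch the mean moves from 3.4 to 19.5 when the outlier is added, while the median only moves from 3 to 3.5 and the mode stays at 3.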

Measures of Spread

Variance: σ² = Σ(xᵢ − x̄)²/n (population form; the sample variance divides by n − 1 instead)
Standard deviation: σ = √(σ²)
Range: max − min
IQR: Q3 − Q1

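Variance measures the average squared deviation from the mean, so its units are the square of the data's units. Standard deviation has the same units as the data, which makes it the most widely used measure of spread. For approximately normal data, about 68%, 95%, and 99.7% of values fall within 1, 2, and 3 standard deviations of the mean (the 68-95-99.7 rule).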
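
The 68-95-99.7 percentages can be recovered from the normal cumulative distribution function. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper name `within` is hypothetical and introduced only for this illustration.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
z = NormalDist()

def within(k):
    """Probability that a normal value lies within k standard deviations of its mean."""
    return z.cdf(k) - z.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD: {within(k):.4f}")
```

The three printed probabilities round to roughly 0.68, 0.95, and 0.997, which is where the rule's name comes from.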

Example: Data: 4, 7, 8, 10, 11

Mean = 40/5 = 8

Deviations: −4, −1, 0, 2, 3. Squared: 16, 1, 0, 4, 9 (sum = 30)

Variance = 30/5 = 6. SD = √6 ≈ 2.45
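
The worked example can be reproduced with the standard `statistics` module, as a quick check rather than a definitive recipe. Note that `pvariance` and `pstdev` divide by n, as in the formula above, while `variance` divides by n − 1. Quartile conventions differ between textbooks and libraries, so the IQR computed here reflects only the default method of `statistics.quantiles`.

```python
from statistics import pvariance, pstdev, variance, quantiles

data = [4, 7, 8, 10, 11]

pop_var = pvariance(data)      # population variance: divides by n
pop_sd = pstdev(data)          # population standard deviation
sample_var = variance(data)    # sample variance: divides by n - 1
data_range = max(data) - min(data)

# Quartiles using the library's default ("exclusive") method;
# other conventions can give slightly different values.
q1, q2, q3 = quantiles(data, n=4)
iqr = q3 - q1

print("population variance:", pop_var)
print("population SD:", round(pop_sd, 2))
print("sample variance:", sample_var)
print("range:", data_range)
print("IQR:", iqr)
```

The population variance and standard deviation match the hand calculation (6 and about 2.45). The sample variance is larger, 30/4 = 7.5, because of the smaller divisor.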

Data Visualization

  • Histograms: Show distribution shape and frequency (the area of each bar represents the frequency in that bin, which connects to integration as the area under a density curve)
  • Box plots: Display median, quartiles, and outliers
  • Scatter plots: Show relationships between two variables → regression
  • Bar/pie charts: Compare categorical data
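
The quantities a box plot displays can be computed directly. The sketch below is one possible approach: the data are made up, the function name `five_number_summary` is hypothetical, and the 1.5 × IQR fence for flagging outliers is a common convention assumed here, not something specified above.

```python
from statistics import quantiles

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the library's default quartile method."""
    ordered = sorted(values)
    q1, med, q3 = quantiles(ordered, n=4)
    return ordered[0], q1, med, q3, ordered[-1]

# Hypothetical data with one unusually large value.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 40]

low, q1, med, q3, high = five_number_summary(data)
iqr = q3 - q1

# Assumed convention: points beyond 1.5 * IQR from the quartiles are flagged as outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in data if x < lower_fence or x > upper_fence]

print("five-number summary:", (low, q1, med, q3, high))
print("outliers:", outliers)
```

With these values only 40 falls outside the fences, so a box plot drawn under this convention would show it as a separate point beyond the whisker.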

Distribution Shape

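Skewness describes asymmetry. A right-skewed distribution has a long right tail and typically mean > median; a left-skewed distribution has a long left tail and typically mean < median. Kurtosis describes tail heaviness. The normal distribution has skewness 0 and kurtosis 3; "excess kurtosis" is defined as kurtosis − 3, so it is 0 for the normal distribution.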
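
Skewness and excess kurtosis can be sketched as the third and fourth standardized moments. The version below uses the population (divide-by-n) form; statistical libraries often apply small-sample corrections, so their results may differ slightly. The function names and data sets are hypothetical and used only for illustration.

```python
from statistics import mean, median

def standardized_moment(data, k):
    """k-th central moment divided by the k-th power of the population SD."""
    n = len(data)
    m = mean(data)
    var = sum((x - m) ** 2 for x in data) / n
    return (sum((x - m) ** k for x in data) / n) / var ** (k / 2)

def skewness(data):
    return standardized_moment(data, 3)

def excess_kurtosis(data):
    # Subtract 3 so that a normal distribution would score 0.
    return standardized_moment(data, 4) - 3

symmetric = [1, 2, 3, 4, 5]            # made-up symmetric data
right_skewed = [1, 1, 2, 2, 3, 10]     # made-up data with a long right tail

print("symmetric skewness:", skewness(symmetric))
print("right-skewed skewness:", round(skewness(right_skewed), 3))
print("right-skewed mean vs median:", round(mean(right_skewed), 3), median(right_skewed))
print("symmetric excess kurtosis:", round(excess_kurtosis(symmetric), 3))
```

The symmetric data give a skewness of 0, while the right-skewed data give a positive skewness and a mean above the median, consistent with the description above.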

Descriptive statistics is the first step of any data analysis. Before fitting regression models or running hypothesis tests, always visualize and summarize your data. As the saying goes: "Plot your data."