Descriptive Statistics

Summarize, visualize, and understand data before diving into inference.

Measures of Center

Mean: x̄ = (Σxᵢ)/n
Median: middle value when sorted (for an even number of values, the average of the two middle values)
Mode: most frequent value

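The mean uses every value in its calculation, so a single extreme value can shift it substantially. The median depends only on the middle of the sorted data, which makes it more robust to outliers. For symmetric, single-peaked distributions, mean ≈ median ≈ mode.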
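
The outlier sensitivity described above can be checked with a small sketch using Python's standard statistics module. The data set `scores` and the added outlier are hypothetical, chosen only for illustration.

```python
from statistics import mean, median, mode

# Hypothetical data, chosen only to illustrate the three measures of center.
scores = [2, 3, 3, 4, 5]

print(mean(scores))    # 3.4
print(median(scores))  # 3
print(mode(scores))    # 3

# Add one extreme value and compare how far each measure moves.
with_outlier = scores + [100]

print(mean(with_outlier))    # 19.5 (pulled strongly toward the outlier)
print(median(with_outlier))  # 3.5  (barely changes)
```

The mean jumps from 3.4 to 19.5 while the median moves only from 3 to 3.5, which is the sense in which the median is the more robust measure.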

Measures of Spread

Variance: σ² = Σ(xᵢ − x̄)²/n (population form; the sample variance s² divides by n − 1 instead)
Standard deviation: σ = √(σ²)
Range: max − min
IQR: Q3 − Q1

Variance measures the average squared deviation from the mean. Standard deviation has the same units as the data, which makes it the most widely used measure of spread. For approximately normal data, the 68-95-99.7 rule applies: about 68%, 95%, and 99.7% of values fall within 1, 2, and 3 standard deviations of the mean.
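
The 68-95-99.7 percentages can be reproduced from the normal distribution itself. The following is a minimal sketch using `statistics.NormalDist` from the Python standard library; the helper name `coverage` is an illustrative choice, not a library function.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0.0, sigma=1.0)


def coverage(k):
    """Probability that a normal value lies within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} SD: {coverage(k):.4f}")
# within 1 SD: 0.6827
# within 2 SD: 0.9545
# within 3 SD: 0.9973
```

Because every normal distribution is a shifted and rescaled standard normal, the same proportions hold for any mean and standard deviation.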

Example: data 4, 7, 8, 10, 11

Mean = 40/5 = 8

Deviations: −4, −1, 0, 2, 3. Squared: 16, 1, 0, 4, 9 (sum = 30)

Variance = 30/5 = 6. SD = √6 ≈ 2.45
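
The worked example can be verified step by step in Python. This sketch follows the divide-by-n calculation used above and compares it with the standard library's `pvariance` and `pstdev`. The sample variance and the quartiles are included for comparison only. Quartile conventions differ between tools, so the IQR shown here assumes the default method of `statistics.quantiles`.

```python
from statistics import fmean, pstdev, pvariance, quantiles, variance

data = [4, 7, 8, 10, 11]

# Step-by-step calculation, mirroring the worked example.
mean_value = fmean(data)                           # 8.0
deviations = [x - mean_value for x in data]        # [-4.0, -1.0, 0.0, 2.0, 3.0]
squared = [d ** 2 for d in deviations]             # [16.0, 1.0, 0.0, 4.0, 9.0]
population_variance = sum(squared) / len(data)     # 30 / 5 = 6.0
population_sd = population_variance ** 0.5         # about 2.45

# The standard library gives the same population values.
print(pvariance(data), round(pstdev(data), 2))     # 6 2.45

# Dividing by n - 1 instead gives the sample variance: 30 / 4 = 7.5.
sample_variance = variance(data)

# Range and IQR. Quartiles use the default 'exclusive' method (an assumption;
# other software may report slightly different quartiles).
data_range = max(data) - min(data)                 # 11 - 4 = 7
q1, q2, q3 = quantiles(data, n=4)
iqr = q3 - q1
```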

Data Visualization

  • Histograms: Show distribution shape and frequency (the area of each bar is proportional to the frequency in its bin, which connects to integration of a density curve)
  • Box plots: Display median, quartiles, and outliers
  • Scatter plots: Show relationships between two variables → regression
  • Bar/pie charts: Compare categorical data
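
To make the idea behind a histogram concrete, here is a minimal sketch that bins values and prints a text histogram using only the standard library. The function `bin_counts`, the data in `sample`, and the bin width of 5 are all hypothetical choices for illustration. A plotting library would normally draw the chart itself.

```python
from collections import Counter


def bin_counts(values, bin_width, start):
    """Count values in half-open bins [lo, lo + bin_width), keyed by each bin's lower edge."""
    counts = Counter()
    for v in values:
        index = int((v - start) // bin_width)
        counts[start + index * bin_width] += 1
    return dict(sorted(counts.items()))


# Hypothetical data, chosen only for illustration.
sample = [4, 7, 8, 10, 11, 12, 12, 13, 15, 21]
counts = bin_counts(sample, bin_width=5, start=0)

# One row per non-empty bin; the bar length is the frequency.
for lo, count in counts.items():
    print(f"{lo:>2} to {lo + 5:<2} | {'#' * count}")
```

Bins that contain no values are simply omitted in this sketch. A real histogram would show them as gaps, which can matter when judging shape.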

Distribution Shape

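Skewness describes asymmetry: a right-skewed distribution has a long right tail and typically mean > median, while a left-skewed distribution has a long left tail and typically mean < median. Kurtosis describes tail heaviness. The normal distribution has skewness 0 and kurtosis 3; "excess kurtosis" is defined as kurtosis − 3, so it equals 0 for the normal distribution.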
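
Skewness and kurtosis can be computed from central moments. The sketch below uses the moment-based population definitions (third and fourth central moments divided by powers of the second). This is one common convention, and statistical packages may apply sample-size corrections that give slightly different numbers. The function names and both data sets are hypothetical.

```python
from statistics import fmean, median


def central_moment(values, order):
    """Average of (x - mean) ** order over all values."""
    mu = fmean(values)
    return fmean([(x - mu) ** order for x in values])


def skewness(values):
    """Third central moment divided by the variance to the power 3/2."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def kurtosis(values):
    """Fourth central moment divided by the squared variance (equals 3 for a normal distribution)."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2


# Hypothetical data sets, chosen only for illustration.
symmetric = [1, 2, 2, 3, 3, 3, 4, 4, 5]
right_skewed = [1, 1, 2, 2, 3, 10]

print(skewness(symmetric))                          # 0.0 for perfectly symmetric data
print(skewness(right_skewed) > 0)                   # True: long right tail
print(fmean(right_skewed) > median(right_skewed))   # True: mean > median
print(kurtosis(symmetric) - 3)                      # excess kurtosis
```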

Descriptive statistics is the first step of any data analysis. Before fitting regression models or running hypothesis tests, always visualize and summarize your data. As the saying goes: "Plot your data."