Chapter 23 — Foundations

Probability & Statistics Foundations

The math under every model and test: distributions, the Central Limit Theorem, confidence intervals, bootstrapping, and what a p-value actually means.

You don't need heavy math to be a great analyst — but you do need these intuitions. They explain why the methods in the rest of the handbook work, and stop you trusting numbers that lie.
23.1 Populations, samples & sampling error

You almost never measure the whole population — you measure a sample and infer. The gap between your sample estimate and the truth is sampling error, and it shrinks as sample size grows.

core idea
Population (unknown truth: mean μ, proportion p)
        │  draw a random sample
        ▼
Sample statistic (x̄, p̂)  ── estimates ──►  Population parameter
        │
        └── uncertainty quantified by the standard error
            SE = s / √n   (smaller as n grows)
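The shrinking standard error can be seen directly in a small simulation. This is a minimal sketch, not a prescribed method: the population (normal, mean 100, sd 15), the seed, and the helper name `standard_error` are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen arbitrarily for reproducibility

def standard_error(sample):
    """SE of the mean: sample standard deviation (ddof=1) divided by sqrt(n)."""
    sample = np.asarray(sample, dtype=float)
    return sample.std(ddof=1) / np.sqrt(sample.size)

# Hypothetical population: normal with mean 100 and sd 15
se_by_n = {}
for n in (25, 100, 400, 1600):
    sample = rng.normal(loc=100, scale=15, size=n)
    se_by_n[n] = standard_error(sample)
    print(f"n={n:5d}  mean={sample.mean():7.2f}  SE={se_by_n[n]:.3f}")
```

Because SE scales with 1/√n, quadrupling the sample size only halves the standard error.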
23.2 Distributions you must recognise
| Distribution | Shape | Models | Example |
|---|---|---|---|
| Normal (Gaussian) | Symmetric bell | Sums/averages of many effects | Heights, measurement error |
| Binomial | Discrete counts | k successes in n trials | Conversions out of N visitors |
| Poisson | Right-skewed counts | Rare events per interval | Support tickets per hour |
| Log-normal | Right-skewed positive | Multiplicative growth | Income, prices, session time |
| Uniform | Flat | Equal-likelihood values | Random IDs, dice |
| Exponential | Decaying | Time between events | Time to next purchase |
Before any mean-based method, plot a histogram. If a column is right-skewed (for example, log-normal), summarise it with the median rather than the mean, and consider a log transform before modelling.
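The effect of skew on the mean, and of a log transform on skew, can be checked numerically. The sketch below assumes a hypothetical log-normal column `x` (standing in for something like session time); `sample_skewness` is an illustrative helper, not a library function.

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical right-skewed column, e.g. session time in seconds
x = rng.lognormal(mean=3.0, sigma=1.0, size=10_000)

def sample_skewness(values):
    """Third standardised moment: near 0 for symmetric data, > 0 for a right tail."""
    values = np.asarray(values, dtype=float)
    z = (values - values.mean()) / values.std()
    return float(np.mean(z ** 3))

# The long right tail pulls the mean well above the median
print(f"mean={x.mean():.1f}  median={np.median(x):.1f}")
print(f"skew raw={sample_skewness(x):.2f}  skew log={sample_skewness(np.log(x)):.2f}")

# Histogram counts as numbers (np.histogram avoids needing a plotting library)
counts, edges = np.histogram(np.log(x), bins=10)
print(counts)
```

After the log transform the skewness is close to zero, which is what makes mean-based methods reasonable on the transformed scale.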
23.3 The Central Limit Theorem (CLT)

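The single most useful theorem in applied statistics: for independent observations from a population with finite variance, the distribution of the sample mean is approximately normal for large n, whatever the shape of the underlying data. This is why t-tests and confidence intervals for means work reasonably well even on non-normal data, although heavily skewed data needs a larger n before the approximation is good.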

python
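# Even a skewed population yields a normal-looking distribution of sample means
import numpy as np
pop = np.random.exponential(scale=2.0, size=1_000_000)   # very skewed, true mean = 2.0
# 5,000 samples of n = 50 (drawn with replacement), keeping each sample's mean
means = [np.random.choice(pop, 50).mean() for _ in range(5000)]
# Expect a mean near 2.0 and a spread near sigma/sqrt(n) = 2.0/sqrt(50) ≈ 0.28
print(np.mean(means), np.std(means))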
23.4 Confidence intervals — report ranges, not points

A 95% confidence interval means: if you repeated the study many times, ~95% of such intervals would contain the true value. Always prefer "12.4% ± 1.1%" to a bare "12.4%".

python
from scipy import stats
import numpy as np
data = np.array([...])                     # placeholder: substitute your own numeric observations
mean = data.mean()
se = stats.sem(data)                       # standard error
ci = stats.t.interval(0.95, len(data)-1, loc=mean, scale=se)
print(f"mean={mean:.2f}  95% CI={ci}")
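The "repeat the study many times" reading of a confidence interval can be checked by simulation. The following is a sketch under assumed settings: a normal population with a known mean of 5.0, samples of 40, and 2,000 repeated studies, all hypothetical values picked for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
true_mean, n, trials = 5.0, 40, 2000   # hypothetical values chosen for illustration

covered = 0
for _ in range(trials):
    sample = rng.normal(loc=true_mean, scale=2.0, size=n)
    lo, hi = stats.t.interval(0.95, n - 1, loc=sample.mean(), scale=stats.sem(sample))
    covered += int(lo <= true_mean <= hi)   # did this interval capture the truth?

coverage = covered / trials
print(f"coverage = {coverage:.3f}")   # should land close to 0.95
```

Any single interval either contains the true mean or it does not; the 95% describes the long-run hit rate of the procedure.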
23.5 Bootstrapping — CIs for anything

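When a metric has no neat formula (median, a ratio, an AUC), resample with replacement thousands of times and read the percentiles. No parametric distribution assumptions are needed, but the sample must be representative of the population and the rows independent.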

bootstrap flow
Original sample (n rows)
   │  repeat B = 10,000 times
   ▼
Resample n rows WITH replacement  ──►  compute statistic
   │
   └── collect B statistics ──► 2.5th & 97.5th percentile = 95% CI
python
import numpy as np
# `data` is the 1-D NumPy array of observations from section 23.4
boot = [np.median(np.random.choice(data, len(data), replace=True))
        for _ in range(10_000)]
lo, hi = np.percentile(boot, [2.5, 97.5])
print(f"median 95% CI: [{lo:.2f}, {hi:.2f}]")
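The same recipe extends to a ratio metric, where whole rows must be resampled together so numerator and denominator stay paired. This is a hedged sketch: `bootstrap_ci`, `revenue_per_session`, and the simulated revenue/sessions table are hypothetical names and data invented for the example.

```python
import numpy as np

def bootstrap_ci(rows, statistic, n_boot=5_000, level=0.95, seed=0):
    """Percentile bootstrap: resample row indices with replacement, recompute the statistic."""
    rng = np.random.default_rng(seed)
    rows = np.asarray(rows)
    n = rows.shape[0]
    boot_stats = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)      # n row indices, drawn WITH replacement
        boot_stats[b] = statistic(rows[idx])
    tail = (1 - level) / 2 * 100
    lo, hi = np.percentile(boot_stats, [tail, 100 - tail])
    return lo, hi

# Hypothetical data: one row per user, columns = (revenue, sessions)
rng = np.random.default_rng(7)
sessions = rng.poisson(lam=4, size=500) + 1
revenue = sessions * rng.gamma(shape=2.0, scale=1.5, size=500)
table = np.column_stack([revenue, sessions])

def revenue_per_session(t):
    """Ratio of totals: total revenue divided by total sessions."""
    return t[:, 0].sum() / t[:, 1].sum()

point = revenue_per_session(table)
lo, hi = bootstrap_ci(table, revenue_per_session)
print(f"revenue/session = {point:.2f}  95% CI = [{lo:.2f}, {hi:.2f}]")
```

Resampling row indices, rather than each column separately, preserves the relationship between revenue and sessions within each user.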
23.6 What a p-value really is (and isn't)
It IS
The probability of seeing data at least this extreme if the null hypothesis were true. Small p = data is surprising under "no effect".
It is NOT
The probability that the hypothesis is true. Not the size or importance of the effect. Not a confidence level: p = 0.04 does not mean "96% sure".
Statistical significance ≠ practical significance. With huge n, a meaningless 0.1% lift can be "significant". Always pair the p-value with an effect size and a confidence interval.
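The gap between statistical and practical significance is easy to reproduce. The sketch below assumes two hypothetical groups of one million observations each with a true lift of 0.1%; the group means, the sd of 10, and the normal-approximation CI are illustrative choices.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n = 1_000_000                                   # hypothetical, deliberately huge sample per group
a = rng.normal(loc=100.0, scale=10.0, size=n)
b = rng.normal(loc=100.1, scale=10.0, size=n)   # true lift of 0.1%

t_stat, p_value = stats.ttest_ind(b, a)

# Cohen's d: difference in means over the pooled standard deviation
pooled_sd = np.sqrt((a.var(ddof=1) + b.var(ddof=1)) / 2)
cohens_d = (b.mean() - a.mean()) / pooled_sd

# 95% CI for the difference in means (normal approximation, reasonable at this n)
diff = b.mean() - a.mean()
se_diff = np.sqrt(a.var(ddof=1) / n + b.var(ddof=1) / n)
ci = (diff - 1.96 * se_diff, diff + 1.96 * se_diff)

print(f"p = {p_value:.2e}  d = {cohens_d:.4f}  diff 95% CI = [{ci[0]:.3f}, {ci[1]:.3f}]")
```

The p-value is tiny, yet the effect size is about one hundredth of a standard deviation, which is the number a decision should rest on.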
23.7 Type I / Type II errors & power
| | Reality: no effect | Reality: real effect |
|---|---|---|
| Test says effect | Type I error (α, false positive) | Correct (power = 1−β) |
| Test says nothing | Correct | Type II error (β, false negative) |
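Both error rates in the table can be estimated by simulating many experiments. This is a sketch under assumed settings: α = 0.05, 50 observations per group, 2,000 simulated experiments, and a hypothetical real effect of 0.5 standard deviations.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(11)
alpha, n, trials = 0.05, 50, 2000    # hypothetical settings

def rejection_rate(true_shift):
    """Share of simulated experiments where a two-sample t-test gives p < alpha."""
    control = rng.normal(0.0, 1.0, size=(trials, n))
    treated = rng.normal(true_shift, 1.0, size=(trials, n))
    p = stats.ttest_ind(treated, control, axis=1).pvalue   # one test per row
    return float(np.mean(p < alpha))

type_i_rate = rejection_rate(0.0)   # no real effect: every rejection is a false positive
power = rejection_rate(0.5)         # real effect of 0.5 sd: every rejection is correct
type_ii_rate = 1 - power

print(f"Type I rate = {type_i_rate:.3f}  power = {power:.3f}  Type II rate = {type_ii_rate:.3f}")
```

The Type I rate should sit near α by construction, while power depends on the effect size and n, which is why n has to be planned.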

Professional recommendation

Report: Estimate + 95% CI + effect size
Skewed metric: Bootstrap the CI
Significance: α = 0.05, but pre-register it
Power target: 80%+ (plan n up front)
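Planning n up front can be sketched with the standard normal-approximation formula for comparing two means, n per group ≈ 2·((z₁₋α/₂ + z_power)·σ/δ)². The helper `n_per_group` and its parameter names are hypothetical, and the formula assumes equal group sizes, a common standard deviation σ, and a two-sided test; a t-based calculation gives slightly larger numbers.

```python
import math
from scipy import stats

def n_per_group(delta, sigma, alpha=0.05, power=0.80):
    """Approximate n per group to detect a mean difference `delta` (two-sided, normal approximation)."""
    z_alpha = stats.norm.ppf(1 - alpha / 2)   # critical value for the significance level
    z_power = stats.norm.ppf(power)           # quantile matching the desired power
    return math.ceil(2 * ((z_alpha + z_power) * sigma / delta) ** 2)

# Hypothetical planning inputs: detect a shift of 0.5 sd, then of 0.25 sd
print(n_per_group(delta=0.5, sigma=1.0))
print(n_per_group(delta=0.25, sigma=1.0))
```

Halving the detectable difference roughly quadruples the required sample, so the minimum effect worth detecting should be agreed before data collection.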
Common mistakes to avoid
Quick cheatsheet
stats.sem(x) -> standard error of the mean
stats.t.interval(0.95, n-1, loc, scale) -> confidence interval
np.percentile(boot, [2.5, 97.5]) -> bootstrap CI
stats.norm / binom / poisson -> distribution objects
effect size + CI -> always report alongside p
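As a brief illustration of the distribution objects listed in the cheatsheet, the sketch below queries `scipy.stats` for a few probabilities. The scenarios (a 10% conversion rate, 5 tickets per hour) are hypothetical numbers echoing the examples in section 23.2.

```python
from scipy import stats

# Binomial: probability of exactly 12 conversions out of 100 visitors at a 10% rate
p_exact = stats.binom.pmf(12, n=100, p=0.10)

# Poisson: probability of more than 8 tickets in an hour when the average is 5
p_busy = stats.poisson.sf(8, mu=5)   # sf = 1 - cdf, i.e. P(X > 8)

# Normal: central 95% range of a standard normal
lo, hi = stats.norm.interval(0.95)

print(f"P(X = 12) = {p_exact:.4f}")
print(f"P(tickets > 8) = {p_busy:.4f}")
print(f"95% range = [{lo:.2f}, {hi:.2f}]")
```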