Chapter 16 — Stat Tests

Statistical Test Selection Guide

Don't memorise tests; learn to choose them. Start from the question you want answered, and the trees below point to a suitable test, its assumptions, and its non-parametric backup.

16.0 What do you want to know?
master decision tree
What is your question?
│
├── Compare means / averages
│   ├── 1 group vs known value ───────► One-sample t-test
│   ├── 2 groups
│   │   ├── Independent ──────────────► Independent t-test
│   │   └── Paired (before/after) ────► Paired t-test
│   └── 3+ groups ───────────────────► ANOVA
│
├── Compare categories / proportions
│   └── Counts in a table ───────────► Chi-square test
│
└── Measure a relationship
    ├── Linear, numeric ─────────────► Pearson correlation
    └── Monotonic / ranked ──────────► Spearman correlation
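
The tree above can also be read as a lookup from (question, situation) to a test name. The sketch below is one hypothetical way to encode it; `DECISION_TREE` and `choose_test` are illustrative names and are not part of SciPy.

```python
# Hypothetical encoding of the master decision tree: (question, situation) -> test.
DECISION_TREE = {
    ("compare means", "1 group vs known value"): "One-sample t-test",
    ("compare means", "2 independent groups"): "Independent t-test",
    ("compare means", "2 paired groups"): "Paired t-test",
    ("compare means", "3+ groups"): "ANOVA",
    ("compare categories", "counts in a table"): "Chi-square test",
    ("measure a relationship", "linear, numeric"): "Pearson correlation",
    ("measure a relationship", "monotonic / ranked"): "Spearman correlation",
}


def choose_test(question, situation):
    """Return the test named by the tree, or raise if no branch matches."""
    key = (question.strip().lower(), situation.strip().lower())
    if key not in DECISION_TREE:
        raise ValueError(f"No branch in the tree for {key!r}")
    return DECISION_TREE[key]


print(choose_test("Compare means", "2 paired groups"))  # Paired t-test
```

Keeping the tree as data rather than nested `if` statements makes it easy to extend with new branches.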
16.1 Parametric vs non-parametric backup
assumptions check
Is the data roughly normal AND the sample large enough?
│
├── YES → use parametric test
│   ├── 2 groups ──► t-test
│   └── 3+ groups ─► ANOVA
│
└── NO (skewed, ordinal, small n) → use non-parametric
    ├── 2 groups ──► Mann-Whitney U
    └── 3+ groups ─► Kruskal-Wallis
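
A minimal sketch of the assumptions check above for the two-group case. It assumes SciPy's Shapiro-Wilk test as the normality check and at least 30 observations per group as "large enough"; both are common conventions rather than rules stated in this guide, and `compare_two_groups` is a hypothetical helper name.

```python
import numpy as np
from scipy import stats


def compare_two_groups(a, b, alpha=0.05, min_n=30):
    """Pick Welch's t-test or Mann-Whitney U for two independent groups."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)

    # Assumption: "large enough" means at least min_n observations per group.
    large_enough = min(len(a), len(b)) >= min_n
    # Shapiro-Wilk: a p-value above alpha means no evidence against normality.
    looks_normal = all(stats.shapiro(x)[1] > alpha for x in (a, b))

    if large_enough and looks_normal:
        result = stats.ttest_ind(a, b, equal_var=False)
        return "Welch t-test", result.statistic, result.pvalue

    result = stats.mannwhitneyu(a, b, alternative="two-sided")
    return "Mann-Whitney U", result.statistic, result.pvalue


# Toy data: evenly spaced normal quantiles stand in for a well-behaved sample,
# and their exponential stands in for a heavily skewed metric.
quantiles = stats.norm.ppf((np.arange(1, 41) - 0.5) / 40)
skewed = np.exp(2 * quantiles)

print(compare_two_groups(quantiles, quantiles + 0.5)[0])
print(compare_two_groups(skewed, skewed * 1.5)[0])
```

A normality test is only one input: with very large samples it flags trivial deviations, so a histogram or Q-Q plot is a useful cross-check.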
16.2 Full selection table
Goal                   │ Data                 │ Test             │ Non-parametric backup
───────────────────────┼──────────────────────┼──────────────────┼─────────────────────────
Mean vs a fixed value  │ 1 numeric group      │ ttest_1samp      │ Wilcoxon signed-rank
Compare 2 group means  │ 2 independent groups │ ttest_ind        │ Mann-Whitney U
Before vs after        │ 2 paired groups      │ ttest_rel        │ Wilcoxon signed-rank
Compare 3+ group means │ 3+ groups            │ f_oneway (ANOVA) │ Kruskal-Wallis
Category association   │ 2 categorical vars   │ chi2_contingency │ Fisher's exact (small n)
Linear relationship    │ 2 numeric vars       │ pearsonr         │ Spearman
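
The category-association row can be sketched as follows: run `chi2_contingency`, and switch to Fisher's exact test when a 2×2 table has small expected counts. The cutoff of 5 for expected counts is a widely used rule of thumb assumed here, `association_test` is a hypothetical helper, and the counts are made up for illustration.

```python
import numpy as np
from scipy import stats


def association_test(table, min_expected=5):
    """Chi-square test of independence, with Fisher's exact for sparse 2x2 tables."""
    table = np.asarray(table)
    chi2, p, dof, expected = stats.chi2_contingency(table)

    # Assumption: any expected count below min_expected counts as "small n".
    if table.shape == (2, 2) and (expected < min_expected).any():
        _, p_exact = stats.fisher_exact(table)
        return "Fisher's exact", p_exact
    return "Chi-square", p


# Rows: two hypothetical customer segments; columns: converted / did not convert.
print(association_test([[30, 10], [10, 30]]))  # large counts: chi-square
print(association_test([[3, 1], [1, 3]]))      # sparse table: Fisher's exact
```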
16.3 Reading the result
p-value < 0.05 (the conventional significance level) → reject the null hypothesis: the effect is statistically significant. p ≥ 0.05 → not enough evidence to reject it, which is not the same as proof that there is no effect. Always report an effect size too (Cohen's d, Cramér's V, or r): a significant result with a tiny effect rarely matters to the business.
python
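from scipy import stats

# Assumes `df` is a pandas DataFrame loaded elsewhere, with a 'variant'
# column ('A' or 'B') and a numeric 'conversion' column.

# Two independent groups: does the pricing change affect conversion?
group_a = df.loc[df['variant'] == 'A', 'conversion']
group_b = df.loc[df['variant'] == 'B', 'conversion']

# equal_var=False runs Welch's t-test, which does not assume equal variances.
t, p = stats.ttest_ind(group_a, group_b, equal_var=False)
print(f"t={t:.3f}  p={p:.4f}")
print("Significant" if p < 0.05 else "Not significant")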
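
The t-test above reports significance but not effect size. As a sketch, Cohen's d for two independent groups can be computed from the pooled standard deviation. `cohens_d` is a hypothetical helper written out by hand here, and the numbers are toy values.

```python
import numpy as np


def cohens_d(a, b):
    """Cohen's d: difference in means divided by the pooled standard deviation."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n_a, n_b = len(a), len(b)

    # Pooled variance weights each sample variance by its degrees of freedom.
    pooled_var = ((n_a - 1) * a.var(ddof=1) + (n_b - 1) * b.var(ddof=1)) / (n_a + n_b - 2)
    return (a.mean() - b.mean()) / np.sqrt(pooled_var)


# Means differ by 1 and both sample standard deviations are 2, so d = 0.5.
print(cohens_d([2, 4, 6], [1, 3, 5]))  # 0.5
```

Cohen's rough benchmarks treat 0.2 as a small effect, 0.5 as medium, and 0.8 as large; whether an effect matters still depends on the business context.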

Professional recommendation

A/B test        │ Two-proportion z / t-test
3+ variants     │ ANOVA + post-hoc Tukey
Survey / Likert │ Spearman + Chi-square
Skewed metrics  │ Mann-Whitney U
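
For the A/B test row, a two-proportion z-test can be written directly from its formula. This is a sketch using a pooled standard error and a two-sided p-value; `two_proportion_ztest` is a hypothetical helper, and the conversion counts are invented for illustration.

```python
import math

from scipy import stats


def two_proportion_ztest(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for the difference between two conversion rates."""
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b

    # Under the null hypothesis both variants share one pooled conversion rate.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))

    z = (rate_a - rate_b) / std_err
    p = 2 * stats.norm.sf(abs(z))  # two-sided tail probability
    return z, p


# Hypothetical experiment: 120 of 1000 visitors convert on A, 80 of 1000 on B.
z, p = two_proportion_ztest(120, 1000, 80, 1000)
print(f"z={z:.3f}  p={p:.4f}")
```

The normal approximation behind this test is only reasonable when each variant has a fair number of both conversions and non-conversions.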
16.4 Common mistakes
Common mistakes to avoid
Quick cheatsheet
stats.ttest_ind() -> Compare 2 independent group means
stats.f_oneway() -> ANOVA — compare 3+ groups
stats.chi2_contingency() -> Association between categories
stats.pearsonr() -> Linear correlation + p-value
stats.mannwhitneyu() -> Non-parametric 2-group test
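
The three-or-more-group calls take each group as a separate argument. A short sketch with toy data, showing `f_oneway` next to its non-parametric backup `kruskal`; the group names and values are made up for illustration.

```python
from scipy import stats

# Hypothetical metric measured under three plans.
plan_a = [1, 2, 3, 4, 5]
plan_b = [2, 3, 4, 5, 6]
plan_c = [10, 11, 12, 13, 14]

# One-way ANOVA: are all group means equal?
f_stat, p_anova = stats.f_oneway(plan_a, plan_b, plan_c)

# Kruskal-Wallis: rank-based alternative when normality is doubtful.
h_stat, p_kw = stats.kruskal(plan_a, plan_b, plan_c)

print(f"ANOVA:          F={f_stat:.2f}  p={p_anova:.6f}")
print(f"Kruskal-Wallis: H={h_stat:.2f}  p={p_kw:.4f}")
```

Both tests only say that at least one group differs; a post-hoc comparison is still needed to find out which one.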