Chapter 11 — Reporting
Reporting & Communication
Export results, format outputs, and communicate your findings clearly to stakeholders.
11.0 Reporting structure (what to include)
- Context and goal: State the business question, period, scope, and why this analysis matters.
- Data quality summary: Mention missingness, exclusions, cleaning rules, and any bias risks.
- Core findings: Share 3-5 key insights with numbers, not only visuals.
- Method transparency: Explain the transformations, statistics, tests, and models used and why they were selected.
- Limitations: Clearly state what you cannot claim from this dataset.
- Actionable recommendations: Tie each recommendation to evidence and expected impact.
| Report element | Use when | Why | Skip when |
|---|---|---|---|
| Executive summary | Any stakeholder audience | Enables fast decisions without technical detail | Never skip |
| Technical appendix | Data/ML teams review results | Ensures reproducibility and trust | Skip for non-technical one-page brief |
| Raw table dump | Audit/compliance request | Traceability | Skip in presentation deck; move to appendix |
| Method comparison | You tested multiple approaches | Shows why final method was chosen | Skip only if one method was clearly mandated |
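The data quality summary listed above can be backed by a small per-column table. The following is a minimal sketch, not a prescribed method: the function name `data_quality_summary` and the example data are hypothetical and only illustrate one way to count missing values with pandas.

```python
import pandas as pd


def data_quality_summary(df: pd.DataFrame) -> pd.DataFrame:
    """Per-column missingness table for the data quality section of a report."""
    summary = pd.DataFrame({
        "dtype": df.dtypes.astype(str),
        "missing": df.isnull().sum(),
        "missing_pct": (df.isnull().mean() * 100).round(1),
        "unique": df.nunique(),
    })
    # Worst columns first, so the report leads with the biggest gaps
    return summary.sort_values("missing_pct", ascending=False)


# Hypothetical example data (not from a real dataset)
example = pd.DataFrame({
    "region": ["North", "South", None, "East"],
    "sales": [120.0, None, None, 95.5],
})
quality = data_quality_summary(example)
print(quality)
```

The resulting table can be pasted into the technical appendix, while the executive summary only needs the one or two columns with the largest gaps.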
11.1 Export tables and results
python
import matplotlib.pyplot as plt
import pandas as pd

# Assumes df and summary are existing DataFrames and the output/ folder exists

# Export to CSV
df.to_csv('output/results.csv', index=False)

# Export multiple sheets to Excel
with pd.ExcelWriter('output/report.xlsx', engine='openpyxl') as writer:
    df.to_excel(writer, sheet_name='Data', index=False)
    df.describe().to_excel(writer, sheet_name='Summary')
    summary.to_excel(writer, sheet_name='Analysis', index=False)

# Save a chart as high-resolution image
plt.savefig('output/chart.png', dpi=150, bbox_inches='tight')
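After exporting, it can be worth reading the file back to confirm that nothing was lost. The sketch below assumes a small hypothetical `results` table and writes to a temporary folder instead of `output/`. It also creates the folder first, because `to_csv` does not create missing directories.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical results table standing in for df
results = pd.DataFrame({"department": ["A", "B"], "salary": [52000.0, 61000.5]})

with tempfile.TemporaryDirectory() as tmp:
    out_dir = Path(tmp) / "output"
    out_dir.mkdir(parents=True, exist_ok=True)  # to_csv does not create folders
    csv_path = out_dir / "results.csv"
    results.to_csv(csv_path, index=False)

    # Read the file back while it still exists
    reloaded = pd.read_csv(csv_path)

# Confirm the export kept the same shape and column order
same_shape = reloaded.shape == results.shape
same_columns = list(reloaded.columns) == list(results.columns)
print(same_shape, same_columns)
```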
11.2 Format DataFrame for display
python
# Style table in Jupyter
(df.style
    .highlight_max(subset=['sales'], color='#d4edda')
    .highlight_min(subset=['sales'], color='#f8d7da')
    .format({
        'sales': '${:,.0f}',
        'growth_pct': '{:.1%}',
        'date': '{:%Y-%m-%d}'
    })
    .background_gradient(subset=['score'], cmap='RdYlGn')
    .set_caption('Monthly Sales Report')
)
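The styling above only renders in a notebook. For plain-text output such as an email or a console log, one option is to apply the same format strings directly to the columns. This is a sketch under assumptions: the `report` table and its values are hypothetical and reuse only the `sales` and `growth_pct` column names.

```python
import pandas as pd

# Hypothetical data using the same column names as the styled table
report = pd.DataFrame({
    "sales": [1234567.0, 98765.4],
    "growth_pct": [0.125, -0.031],
})

# Apply the same format strings, producing plain strings instead of a Styler
formatted = pd.DataFrame({
    "sales": report["sales"].map("${:,.0f}".format),
    "growth_pct": report["growth_pct"].map("{:.1%}".format),
})
print(formatted.to_string(index=False))
```

The formatted columns are strings, so keep the numeric `report` table for any further calculation.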
11.3 Create a summary dashboard function
python
def quick_report(df, numeric_col, category_col=None):
    """Quick summary report for any numeric column."""
    print(f"=== Report: {numeric_col} ===")
    print(f"Count: {df[numeric_col].count():,}")
    print(f"Mean: {df[numeric_col].mean():,.2f}")
    print(f"Median: {df[numeric_col].median():,.2f}")
    print(f"Std Dev: {df[numeric_col].std():,.2f}")
    print(f"Min/Max: {df[numeric_col].min():,.2f} / {df[numeric_col].max():,.2f}")
    if category_col:
        print(f"\nBy {category_col}:")
        print(df.groupby(category_col)[numeric_col].mean().sort_values(ascending=False))

quick_report(df, 'salary', 'department')
11.4 Analytics project checklist
- Define the business question before touching any data
- Understand the data source: who collected it, how, and when
- Run EDA and document all findings and anomalies
- Save df_raw = df.copy() before any cleaning
- Document every cleaning decision with a comment explaining why
- Validate your findings: do they make sense in real life?
- Use at least 2 different methods to confirm important conclusions
- Present findings visually — stakeholders read charts, not tables
- State limitations and data quality issues in your report
- Save cleaned data + notebook for full reproducibility
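Two checklist items, keeping `df_raw` and documenting every cleaning decision, can be combined in a few lines. The sketch below is illustrative only: the data, the `cleaning_log` list, and the reasons given for each step are hypothetical.

```python
import pandas as pd

# Hypothetical raw data with one duplicate row and one missing amount
df = pd.DataFrame({
    "customer": ["a", "b", "b", "c"],
    "amount": [10.0, 25.0, 25.0, None],
})

df_raw = df.copy()   # untouched snapshot, as the checklist recommends
cleaning_log = []    # hypothetical helper: one entry per cleaning decision

# Decision 1: remove exact duplicate rows
before = len(df)
df = df.drop_duplicates()
cleaning_log.append({"step": "drop_duplicates",
                     "rows_removed": before - len(df),
                     "why": "exact duplicates inflate totals"})

# Decision 2: remove rows without an amount
before = len(df)
df = df.dropna(subset=["amount"])
cleaning_log.append({"step": "dropna(amount)",
                     "rows_removed": before - len(df),
                     "why": "amount is required for every metric"})

log_table = pd.DataFrame(cleaning_log)
print(log_table)
print(f"Raw rows: {len(df_raw)}, clean rows: {len(df)}")
```

A log like this can go straight into the data quality summary of the report.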
11.5 Full library reference
| Library | Purpose | Key functions |
|---|---|---|
| pandas | Data manipulation | read_csv, groupby, merge, pivot_table |
| numpy | Math & arrays | array, mean, std, log1p, where |
| matplotlib | Base plotting | plot, subplots, savefig |
| seaborn | Statistical charts | histplot, boxplot, heatmap, pairplot |
| plotly | Interactive charts | scatter, bar, line, choropleth |
| scipy | Statistical tests | ttest_ind, chi2_contingency, normaltest |
| scikit-learn | Machine learning | fit, predict, cross_val_score, GridSearchCV |
| openpyxl | Excel files | Used by pandas to read/write xlsx |
Great analysts ask "why?" at every step. Numbers tell a story — your job is to find it and communicate it clearly.
Dashboard & business storytelling
Build your final output in Power BI, Tableau, or Excel depending on stakeholder tooling. Export a clean dataset and always answer three questions: what happened, why it happened, and what we should do next.
Common mistakes to avoid
- Skipping business context before running technical steps
- Not writing assumptions and limitations explicitly
- Treating one metric as the full story
Quick cheatsheet
- df.info() -> Structure and non-null counts
- df.describe() -> Numeric summary statistics
- df.isnull().sum() -> Missing-value counts by column
- df.groupby() -> Segmented aggregation
- pd.merge() -> Join multiple datasets
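The cheatsheet calls can be tried together on a toy dataset. The sketch below uses two hypothetical tables, `orders` and `customers`, invented purely for illustration.

```python
import pandas as pd

# Hypothetical tables for illustration only
orders = pd.DataFrame({
    "customer_id": [1, 2, 2, 3],
    "amount": [20.0, 35.0, None, 50.0],
})
customers = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "segment": ["retail", "retail", "wholesale"],
})

orders.info()                    # structure and non-null counts
print(orders.describe())         # numeric summary statistics
missing = orders.isnull().sum()  # missing-value counts by column

# Join the two datasets, keeping every order
merged = pd.merge(orders, customers, on="customer_id", how="left")

# Segmented aggregation: total amount per segment (missing values are skipped)
by_segment = merged.groupby("segment")["amount"].sum()
print(missing, by_segment, sep="\n")
```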