Chapter 11 — Reporting

Reporting & Communication

Export results, format outputs, and communicate your findings clearly to stakeholders.

11.0 Reporting structure (what to include)

Context and goal: State the business question, period, scope, and why this analysis matters.
Data quality summary: Mention missingness, exclusions, cleaning rules, and any bias risks.
Core findings: Share 3-5 key insights with numbers, not only visuals.
Method transparency: Explain the transformations, statistics, tests, or models used and why they were selected.
Limitations: Clearly state what you cannot claim from this dataset.
Actionable recommendations: Tie each recommendation to evidence and expected impact.
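
The data quality item above can be backed by a small table of missing values per column. The following is a minimal sketch, not a fixed recipe: the helper name `data_quality_summary` and the toy data are hypothetical and only illustrate the idea.

```python
import pandas as pd


def data_quality_summary(df: pd.DataFrame) -> pd.DataFrame:
    """Return one row per column: dtype, missing count, and missing percentage."""
    summary = pd.DataFrame({
        "dtype": df.dtypes.astype(str),
        "missing": df.isnull().sum(),
        "missing_pct": (df.isnull().mean() * 100).round(1),
    })
    # Columns with the most missing values come first.
    return summary.sort_values("missing", ascending=False)


# Hypothetical toy data, for illustration only.
toy = pd.DataFrame({
    "region": ["North", "South", None, "East"],
    "sales": [120.0, None, None, 95.5],
    "units": [3, 5, 2, 4],
})
quality = data_quality_summary(toy)
print(quality)
```

A table like this can go straight into the technical appendix, with one or two headline figures quoted in the main report.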

Report element	Use when	Why	Skip when
Executive summary	Any stakeholder audience	Gives fast decisions without technical detail	Never skip
Technical appendix	Data/ML teams review results	Ensures reproducibility and trust	Skip for non-technical one-page brief
Raw table dump	Audit/compliance request	Traceability	Skip in presentation deck; move to appendix
Method comparison	You tested multiple approaches	Shows why final method was chosen	Skip only if one method was clearly mandated

11.1 Export tables and results

python

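import os

import matplotlib.pyplot as plt
import pandas as pd

# Make sure the output folder exists before writing to it
os.makedirs('output', exist_ok=True)

# Export to CSV
df.to_csv('output/results.csv', index=False)

# Export multiple sheets to Excel
# (`summary` is a results DataFrame you have already built)
with pd.ExcelWriter('output/report.xlsx', engine='openpyxl') as writer:
    df.to_excel(writer, sheet_name='Data', index=False)
    df.describe().to_excel(writer, sheet_name='Summary')
    summary.to_excel(writer, sheet_name='Analysis', index=False)

# Save the current chart as a high-resolution image
plt.savefig('output/chart.png', dpi=150, bbox_inches='tight')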
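
After exporting, it can be worth reading the file back to confirm that nothing was lost on the way out. The sketch below assumes a hypothetical `results` table and writes to a temporary folder instead of `output/` so it can run anywhere.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical results table, for illustration only.
results = pd.DataFrame({
    "department": ["Sales", "Ops", "HR"],
    "headcount": [12, 7, 3],
})

# A temporary folder stands in for the 'output/' directory used above.
with tempfile.TemporaryDirectory() as tmp:
    out_path = Path(tmp) / "results.csv"
    results.to_csv(out_path, index=False)

    # Read the file back so it can be compared with what was written.
    reloaded = pd.read_csv(out_path)

same_shape = reloaded.shape == results.shape
same_columns = list(reloaded.columns) == list(results.columns)
print(f"Shape matches: {same_shape}; column names match: {same_columns}")
```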

11.2 Format DataFrame for display

python

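# Style table in Jupyter
# Assumes 'date' is a datetime column and 'growth_pct' holds fractions (0.12 -> 12.0%);
# background_gradient needs matplotlib installed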
(df.style
  .highlight_max(subset=['sales'], color='#d4edda')
  .highlight_min(subset=['sales'], color='#f8d7da')
  .format({
      'sales': '${:,.0f}',
      'growth_pct': '{:.1%}',
      'date': '{:%Y-%m-%d}'
  })
  .background_gradient(subset=['score'], cmap='RdYlGn')
  .set_caption('Monthly Sales Report')
)

11.3 Create a summary dashboard function

python

def quick_report(df, numeric_col, category_col=None):
    """Quick summary report for any numeric column."""
    print(f"=== Report: {numeric_col} ===")
    print(f"Count:    {df[numeric_col].count():,}")
    print(f"Mean:     {df[numeric_col].mean():,.2f}")
    print(f"Median:   {df[numeric_col].median():,.2f}")
    print(f"Std Dev:  {df[numeric_col].std():,.2f}")
    print(f"Min/Max:  {df[numeric_col].min():,.2f} / {df[numeric_col].max():,.2f}")

    if category_col:
        print(f"\nBy {category_col}:")
        print(df.groupby(category_col)[numeric_col].mean().sort_values(ascending=False))

quick_report(df, 'salary', 'department')
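
`quick_report` prints its results, which is convenient in a notebook but hard to export. One possible variation, sketched here with a hypothetical helper `summary_table` and toy data, returns the per-category figures as a DataFrame so they can be written to CSV or Excel.

```python
import pandas as pd


def summary_table(df: pd.DataFrame, numeric_col: str, category_col: str) -> pd.DataFrame:
    """Per-category count, mean, and median of numeric_col, sorted by mean."""
    return (
        df.groupby(category_col)[numeric_col]
        .agg(["count", "mean", "median"])
        .sort_values("mean", ascending=False)
        .reset_index()
    )


# Hypothetical toy data, for illustration only.
staff = pd.DataFrame({
    "department": ["Eng", "Eng", "Ops", "Ops", "HR"],
    "salary": [100.0, 120.0, 80.0, 90.0, 70.0],
})
by_department = summary_table(staff, "salary", "department")
print(by_department)
```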

11.4 Analytics project checklist

Define the business question before touching any data
Understand the data source: who collected it, how, and when
Run EDA and document all findings and anomalies
Save df_raw = df.copy() before any cleaning
Document every cleaning decision with a comment explaining why
Validate your findings: do they make sense in real life?
Use at least 2 different methods to confirm important conclusions
Present key findings visually and back each chart with its numbers; most stakeholders absorb a chart faster than a table
State limitations and data quality issues in your report
Save cleaned data + notebook for full reproducibility
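
Two of the checklist items, keeping a raw copy and documenting every cleaning decision, could be combined along these lines. This is only a sketch: the data, the `log_step` helper, and the cleaning rules themselves are assumptions chosen for illustration.

```python
import pandas as pd

# Hypothetical raw data, for illustration only.
df = pd.DataFrame({
    "customer_id": [1, 2, 2, 3, 4],
    "amount": [50.0, 20.0, 20.0, None, -5.0],
})

df_raw = df.copy()   # untouched copy, kept for comparison
cleaning_log = []    # one entry per cleaning decision


def log_step(reason: str, before: int, after: int) -> None:
    """Record why a step was applied and how many rows it removed."""
    cleaning_log.append({"reason": reason, "rows_removed": before - after})


n = len(df)
df = df.drop_duplicates()
log_step("Exact duplicate rows would double-count transactions", n, len(df))

n = len(df)
df = df.dropna(subset=["amount"])
log_step("Rows without an amount cannot be aggregated", n, len(df))

n = len(df)
df = df[df["amount"] >= 0]
log_step("Negative amounts are treated as entry errors (assumption)", n, len(df))

print(pd.DataFrame(cleaning_log))
print(f"Rows: {len(df_raw)} raw -> {len(df)} cleaned")
```

The resulting log doubles as the "cleaning rules" part of the data quality summary in the report.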

11.5 Full library reference

Library	Purpose	Key functions
`pandas`	Data manipulation	read_csv, groupby, merge, pivot_table
`numpy`	Math & arrays	array, mean, std, log1p, where
`matplotlib`	Base plotting	plot, subplots, savefig
`seaborn`	Statistical charts	histplot, boxplot, heatmap, pairplot
`plotly`	Interactive charts	scatter, bar, line, choropleth
`scipy`	Statistical tests	ttest_ind, chi2_contingency, normaltest
`scikit-learn`	Machine learning	fit, predict, cross_val_score, GridSearchCV
`openpyxl`	Excel files	Used by pandas to read/write xlsx

Great analysts ask "why?" at every step. Numbers tell a story — your job is to find it and communicate it clearly.

Dashboard & business storytelling

Build your final output in Power BI, Tableau, or Excel depending on stakeholder tooling. Export a clean dataset and always answer three questions: what happened, why it happened, and what we should do next.
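
What counts as a "clean dataset" depends on the tool, but consistent column names and unambiguous dates tend to import smoothly into BI tools. The sketch below is one way to do that, under the assumption that snake_case names and ISO dates suit your stakeholders' tooling; the `report` table is hypothetical, and the CSV is written to an in-memory buffer rather than a file.

```python
import io

import pandas as pd

# Hypothetical report table, for illustration only.
report = pd.DataFrame({
    "Order Date": pd.to_datetime(["2024-01-05", "2024-02-10"]),
    "Total Sales ($)": [1200.5, 980.0],
})

clean = report.copy()
# Lower-case the names and collapse anything non-alphanumeric into underscores.
clean.columns = (
    clean.columns.str.strip()
    .str.lower()
    .str.replace(r"[^a-z0-9]+", "_", regex=True)
    .str.strip("_")
)

# Write dates as YYYY-MM-DD so they parse the same way in any locale.
buffer = io.StringIO()
clean.to_csv(buffer, index=False, date_format="%Y-%m-%d")
csv_text = buffer.getvalue()
print(csv_text)
```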

Common mistakes to avoid

Skipping business context before running technical steps
Not writing assumptions and limitations explicitly
Treating one metric as the full story

Quick cheatsheet

df.info() -> Structure and non-null counts

df.describe() -> Numeric summary statistics

df.isnull().sum() -> Missing-value counts by column

df.groupby() -> Segmented aggregation

pd.merge() -> Join multiple datasets
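
The cheatsheet calls can be tried together on a small example. The `orders` and `customers` tables below are hypothetical; the point is only to show what each call returns.

```python
import pandas as pd

# Hypothetical toy tables, for illustration only.
orders = pd.DataFrame({
    "order_id": [1, 2, 3, 4],
    "customer_id": [10, 10, 20, 30],
    "amount": [25.0, None, 40.0, 15.0],
})
customers = pd.DataFrame({
    "customer_id": [10, 20],
    "segment": ["retail", "wholesale"],
})

orders.info()                    # structure and non-null counts
print(orders.describe())         # numeric summary statistics
missing = orders.isnull().sum()  # missing-value counts by column
print(missing)

# A left join keeps every order, even when the customer is unknown.
merged = pd.merge(orders, customers, on="customer_id", how="left")

# Segmented aggregation; rows with a missing segment are dropped by default.
by_segment = merged.groupby("segment")["amount"].sum()
print(by_segment)
```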