Chapter 08 — Visualization
Data Visualization
Create clear and insightful charts. Choose the right chart for the right question.
8.0 Visualization decision guide
| Goal | Best chart | Why | Avoid / skip |
|---|---|---|---|
| Show one numeric distribution | Histogram + KDE | Reveals skew, modes, tails | Skip pie chart for numeric distributions |
| Compare groups on a numeric value | Boxplot/violin + strip | Shows median + spread + outliers | Avoid bar charts of the mean alone, which hide the spread |
| Compare category totals | Bar chart | Best for discrete comparisons | Avoid stacked bars with too many segments |
| Show trend over time | Line chart | Natural for temporal continuity | Skip if x-axis is unordered categories |
| Explore two numeric variables | Scatter/regplot | Shows pattern, clusters, outliers | Avoid a plain scatter under heavy overplotting; add alpha or binning |
Before publishing a chart: label units, start bars at zero, limit colors, and add a one-sentence interpretation. A chart without context is easy to misread.
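The checklist above can be sketched in matplotlib roughly as follows. The department names, headcounts, and caption text are made-up values for illustration, not data from this chapter.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; omit this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data, for illustration only
headcount = pd.Series(
    {"Engineering": 42, "Sales": 31, "Support": 18, "Finance": 9},
    name="employees",
)

fig, ax = plt.subplots(figsize=(7, 4))
ax.bar(headcount.index, headcount.values, color="#4a90d9")  # a single color
ax.set_ylim(bottom=0)                   # bars start at zero
ax.set_ylabel("Employees (headcount)")  # axis label states the unit
ax.set_title("Employees per Department")

# One-sentence interpretation placed under the chart
fig.text(0.5, 0.01, "Engineering is the largest department in this sample.",
         ha="center", fontsize=9)
fig.tight_layout(rect=(0, 0.05, 1, 1))  # leave room for the caption
```

In a script, `fig.savefig(...)` or `plt.show()` would follow; the caption could equally live in the surrounding report text.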
8.1 Distribution charts
```python
import matplotlib.pyplot as plt
import seaborn as sns

# Assumes df is a DataFrame with 'salary' and 'department' columns
fig, axes = plt.subplots(1, 2, figsize=(12, 4))

# Histogram with density curve
sns.histplot(df['salary'], bins=30, kde=True, ax=axes[0])
axes[0].set_title('Salary Distribution')

# Boxplot: shows median, IQR, and outliers
sns.boxplot(x='department', y='salary', data=df, ax=axes[1])
axes[1].tick_params(axis='x', rotation=45)

plt.tight_layout()
plt.show()
```
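The decision guide also suggests a violin plot with a strip overlay for group comparisons. A minimal sketch, assuming a synthetic `demo` DataFrame (the salaries are randomly generated, and the USD unit is an assumption made only for the axis label):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; omit this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic data, for illustration only
rng = np.random.default_rng(0)
demo = pd.DataFrame({
    "department": np.repeat(["Engineering", "Sales", "Support"], 40),
    "salary": np.concatenate([
        rng.normal(95_000, 12_000, 40),
        rng.normal(70_000, 15_000, 40),
        rng.normal(55_000, 8_000, 40),
    ]),
})

fig, ax = plt.subplots(figsize=(8, 4))
# Violin shows the shape of each group's distribution
sns.violinplot(x="department", y="salary", data=demo,
               inner=None, color="lightgray", ax=ax)
# Strip overlay shows the individual observations
sns.stripplot(x="department", y="salary", data=demo,
              size=3, alpha=0.6, color="black", ax=ax)
ax.set_ylabel("Salary (USD)")
ax.set_title("Salary by Department")
fig.tight_layout()
```

Drawing the raw points on top makes small groups visible, which a violin alone can hide.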
8.2 Relationship charts
```python
import matplotlib.pyplot as plt
import seaborn as sns

# Scatter plot with color grouping
sns.scatterplot(x='age', y='salary', hue='department', data=df, alpha=0.7)
plt.title('Age vs Salary by Department')
plt.show()

# Regression line
sns.regplot(x='age', y='salary', data=df, scatter_kws={'alpha': 0.5})
plt.show()

# Pairplot: all numeric columns vs each other, colored by 'category'
cols = ['age', 'salary', 'experience', 'score']
sns.pairplot(df[cols + ['category']], hue='category', diag_kind='kde')
plt.show()
```
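When a scatter plot has thousands of points, the decision guide's advice to use alpha or binning can look something like the sketch below. The data is synthetic, and the `gridsize` and `alpha` values are arbitrary starting points rather than recommendations.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; omit this line in a notebook
import matplotlib.pyplot as plt
import numpy as np

# Synthetic data, for illustration only
rng = np.random.default_rng(42)
n = 5000
age = rng.normal(40, 10, n)
salary = 30_000 + 1_200 * age + rng.normal(0, 15_000, n)

fig, axes = plt.subplots(1, 2, figsize=(12, 4))

# Option 1: transparency, so dense regions appear darker
axes[0].scatter(age, salary, s=8, alpha=0.15)
axes[0].set_title("Scatter with alpha=0.15")

# Option 2: hexagonal binning, where color encodes points per bin
hb = axes[1].hexbin(age, salary, gridsize=30, cmap="Blues", mincnt=1)
fig.colorbar(hb, ax=axes[1], label="Points per bin")
axes[1].set_title("Hexbin")

for ax in axes:
    ax.set_xlabel("Age (years)")
    ax.set_ylabel("Salary")
fig.tight_layout()
```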
8.3 Categorical charts
```python
import matplotlib.pyplot as plt

# Horizontal bar chart (good for many categories)
df['department'].value_counts().sort_values().plot(
    kind='barh', figsize=(8, 5), color='#4a90d9'
)
plt.title('Employees per Department')
plt.tight_layout()
plt.show()

# Grouped bar chart
df.groupby(['year', 'product'])['sales'].sum().unstack().plot(
    kind='bar', figsize=(10, 5), edgecolor='white'
)
plt.xticks(rotation=0)
plt.legend(title='Product')
plt.show()
```
8.4 Time series charts
```python
import matplotlib.pyplot as plt

# Basic line chart (assumes 'date' is already a datetime column)
df_ts = df.set_index('date').sort_index()
df_ts['sales'].plot(figsize=(12, 4), color='#2196F3', linewidth=1.5)
plt.title('Sales Over Time')
plt.ylabel('Sales')
plt.grid(alpha=0.3)
plt.tight_layout()
plt.show()

# With rolling average: window=7 counts rows, so it spans 7 days only for daily data
df_ts['rolling_7d'] = df_ts['sales'].rolling(window=7).mean()
df_ts[['sales', 'rolling_7d']].plot(figsize=(12, 4))
plt.show()
```
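The rolling average above uses `window=7`, which counts rows rather than calendar days. If dates are missing or irregular, a time-based window may be closer to what is intended. A small sketch with made-up values, using a 3-row versus 3-day window so the difference is easy to check by hand:

```python
import pandas as pd

# Hypothetical series with a one-week gap before the last observation
dates = pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-03", "2024-01-10"])
sales = pd.Series([10.0, 20.0, 30.0, 40.0], index=dates)

# Row-based window: averages the last 3 rows, whatever their dates are
by_rows = sales.rolling(window=3).mean()

# Time-based window: averages only rows within the last 3 calendar days
by_time = sales.rolling("3D").mean()

print(by_rows.tolist())  # [nan, nan, 20.0, 30.0]
print(by_time.tolist())  # [10.0, 15.0, 20.0, 40.0]
```

On 2024-01-10 the row-based mean still mixes in values from a week earlier, while the time-based mean uses only that day's value. Time-based windows require a sorted `DatetimeIndex`.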
8.5 Interactive charts with Plotly
```python
import plotly.express as px

# Interactive scatter
fig = px.scatter(df, x='age', y='salary', color='department',
                 size='experience', hover_name='name',
                 title='Salary by Age and Department')
fig.show()

# Interactive bar
fig = px.bar(df, x='month', y='revenue', color='product',
             barmode='group', title='Monthly Revenue by Product')
fig.show()

# Interactive line (time series)
fig = px.line(df, x='date', y='sales', color='region')
fig.show()
```
| Chart type | Use when... |
|---|---|
| Histogram | Show distribution of one numeric column |
| Boxplot | Compare distributions across groups, spot outliers |
| Bar chart | Compare counts or totals across categories |
| Scatter plot | Show relationship between two numeric columns |
| Line chart | Show change over time |
| Heatmap | Show correlations or 2D frequency tables |
| Pie chart | Show proportions (max 5-6 slices) |
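The table lists heatmaps for correlations, which the sections above do not demonstrate. One possible sketch with seaborn, using synthetic columns that are correlated by construction:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; omit this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic data, for illustration only
rng = np.random.default_rng(1)
age = rng.normal(40, 10, 200)
demo = pd.DataFrame({
    "age": age,
    "experience": age - 22 + rng.normal(0, 3, 200),
    "salary": 30_000 + 1_200 * age + rng.normal(0, 10_000, 200),
})

corr = demo.corr()  # pairwise Pearson correlations

fig, ax = plt.subplots(figsize=(5, 4))
# Fixing vmin/vmax to [-1, 1] keeps the color scale centered on zero
sns.heatmap(corr, annot=True, fmt=".2f", cmap="coolwarm",
            vmin=-1, vmax=1, ax=ax)
ax.set_title("Correlation matrix")
fig.tight_layout()
```

Pinning the color range to [-1, 1] makes heatmaps comparable across datasets; without it, the palette stretches to whatever range the data happens to cover.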
Common mistakes to avoid
- Using pie charts for high-cardinality categories
- Using the wrong chart type for the question (for example, a line chart over unordered categories)
- Publishing charts without axis units or context
Quick cheatsheet
- df.info() -> Structure and non-null counts
- df.describe() -> Numeric summary statistics
- df.isnull().sum() -> Missing-value counts by column
- df.groupby() -> Segmented aggregation
- pd.merge() -> Join multiple datasets