3.4.5 Visualization Best Practices

Chart selection decision tree

Section focus

When many beginners learn visualization, they focus on:

whether the colors look nice
whether the style looks flashy enough

But the really important question is:

Does this chart help people understand faster, or does it make them easier to mislead?

So the most important things in this section are not “beautification,” but:

choosing the right chart
reducing distractions
communicating honestly

Learning objectives

Master strategies for choosing chart types
Understand color principles and color-blind-friendly design
Understand the concept of “data-ink ratio”
Identify and avoid common misleading visualization techniques

First, build a map

It’s easier to understand visualization best practices in the order “communication goal -> chart choice -> reduce distractions -> prevent misleading”:

So what this section really wants to solve is:

why making a chart does not automatically make it clear
why many bad charts are not a code problem, but a communication-logic problem

Why learn best practices?

With the same data, good visualization lets people understand it in one second, while bad visualization makes people more and more confused, or even misleads them.

A better beginner-friendly analogy

You can think of a chart as:

a very short presentation

A good chart is like a clear presentation:

you immediately know the key point

A bad chart is like:

too much talk
too much decoration
but after hearing it, you still do not know the key point

So the most important thing in best practices is not “looking nice,” but:

whether the information is easier to understand

Chart type selection guide

Core principle: first ask yourself, “What do I want to show?”

Detailed chart selection table

My goal	Recommended chart	Not recommended	Explanation
Show time trend	Line chart	Pie chart	Line charts are naturally suited to continuous change
Compare several categories	Bar chart	Pie chart (more than 6 categories)	Bar heights are easy to compare at a glance
See the relationship between two variables	Scatter plot	Line chart	Scatter plots show distribution and trends more directly
Check data distribution	Histogram / box plot	Line chart	Histograms show frequency distribution
Show proportion	Pie chart (few categories) / stacked bar chart	3D pie chart	3D pie charts distort area
Compare distribution differences	Box plot / violin plot	Only mean-value bar chart	Mean bars lose distribution information
Show a correlation matrix	Heatmap	Table	Color coding is more intuitive than numbers
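The table above can be read as a simple lookup from communication goal to chart type. Here is a minimal sketch; the helper name `recommend_chart` and the goal keywords are illustrative assumptions, not part of any plotting library:

```python
# Hypothetical lookup that mirrors the chart selection table above.
# The goal keywords and the function name are illustrative assumptions.
CHART_FOR_GOAL = {
    "trend": "line chart",
    "comparison": "bar chart",
    "relationship": "scatter plot",
    "distribution": "histogram or box plot",
    "proportion": "pie chart (few categories) or stacked bar chart",
    "distribution comparison": "box plot or violin plot",
    "correlation matrix": "heatmap",
}


def recommend_chart(goal):
    """Return the chart type the table recommends for a communication goal."""
    key = goal.strip().lower()
    if key not in CHART_FOR_GOAL:
        known = ", ".join(sorted(CHART_FOR_GOAL))
        raise ValueError(f"Unknown goal {goal!r}; expected one of: {known}")
    return CHART_FOR_GOAL[key]


print(recommend_chart("trend"))         # line chart
print(recommend_chart("Relationship"))  # scatter plot
```

The point of the sketch is the direction of the lookup: the goal comes first and the chart type follows from it.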

Five questions worth remembering before you start drawing

Before you actually start making a chart, ask yourself:

Am I showing trend, comparison, distribution, or relationship?
Is this chart for my own exploration, or for reporting to others?
What should the reader notice first?
Which decorations can actually be removed?
Could this chart mislead people?

These 5 questions are more important than many plotting library parameters.

Color principles

Three uses of color

Use	Scenario	Example
Categorical (qualitative color)	Distinguish different categories	Male/female in different colors
Sequential (continuous color)	Represent magnitude	Population density from light blue to dark blue
Diverging (two-sided color)	Data with a center value	Correlation coefficients from -1 to 1
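The three uses in this table amount to two questions about the data: is it categorical, and if not, does it have a meaningful center? A small sketch of that decision follows. The function name `palette_kind` is hypothetical, and the palette names are only examples taken from this section:

```python
# Hypothetical helper: map two facts about the data to a palette family.
# The example palette names are ones mentioned in this section.
EXAMPLE_PALETTES = {
    "qualitative": "colorblind",  # seaborn palette for categories
    "sequential": "Blues",        # light-to-dark colormap for magnitude
    "diverging": "RdBu_r",        # two-sided colormap around a center
}


def palette_kind(is_categorical, has_meaningful_center=False):
    """Return 'qualitative', 'sequential', or 'diverging'."""
    if is_categorical:
        return "qualitative"
    if has_meaningful_center:
        return "diverging"
    return "sequential"


# Correlation coefficients are numeric and centered on 0.
kind = palette_kind(is_categorical=False, has_meaningful_center=True)
print(kind, "->", EXAMPLE_PALETTES[kind])  # diverging -> RdBu_r
```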

Recommended color palettes

import matplotlib.pyplot as plt
import seaborn as sns

# === Categorical colors (distinguish categories)===
# Matplotlib default palette (Tab10)
# Colors: blue, orange, green, red, purple, brown, pink, gray, yellow-green, cyan
# Suitable for: up to 10 categories

# Seaborn palettes
sns.color_palette("Set2")       # Soft tones
sns.color_palette("colorblind") # Color-blind-friendly!

# === Sequential colors (show magnitude)===
# From light to dark
# "Blues", "Greens", "Reds", "YlOrRd" (yellow-orange-red)

# === Diverging colors (has a center value)===
# "RdBu_r" (blue-white-red, from low to high): often used for correlation coefficients
# "RdYlGn" (red-yellow-green): shows good/bad, but is hard to read with red-green color blindness

Color-blind-friendly design

Roughly 8% of men (and about 0.5% of women) have some form of color vision deficiency; red-green color blindness is the most common type.

Combinations to avoid:

Avoid	Why	Alternative
Red + green	People with red-green color blindness cannot distinguish them	Blue + orange
Relying only on color	Color-blind readers cannot see the difference	Color + shape / line style

Recommended practice:

# Use a color-blind-friendly palette
sns.set_palette("colorblind")

# Or use different marker shapes + colors
markers = ["o", "s", "^", "D"]  # Circle, square, triangle, diamond
linestyles = ["-", "--", ":", "-."]  # Solid, dashed, dotted, dash-dot

# Example: dual encoding with color + line style
fig, ax = plt.subplots()
for i, (style, marker) in enumerate(zip(linestyles, markers)):
    ax.plot(range(10), [x + i*2 for x in range(10)],
            linestyle=style, marker=marker, label=f"Series {i+1}")
ax.legend()
plt.show()

Dual encoding

Do not rely on color alone to convey information. Use shape, line style, labels, and patterns together to help distinguish series, so the chart can still be understood even when printed in black and white.
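One rough way to test the "printed in black and white" case is to compare the gray level of each color. The sketch below uses the common luma approximation (0.299 R + 0.587 G + 0.114 B). The function names and the `min_gap` threshold of 40 are assumptions chosen for illustration, and this is only a proxy for grayscale printing, not a simulation of color vision deficiency:

```python
def hex_to_rgb(color):
    """Convert '#RRGGBB' to an (r, g, b) tuple of ints in 0..255."""
    color = color.lstrip("#")
    return tuple(int(color[i:i + 2], 16) for i in (0, 2, 4))


def gray_level(color):
    """Approximate gray level (0..255) using the common luma weights."""
    r, g, b = hex_to_rgb(color)
    return 0.299 * r + 0.587 * g + 0.114 * b


def distinguishable_in_gray(color_a, color_b, min_gap=40.0):
    """True if the two colors differ by at least min_gap gray levels.

    min_gap is an arbitrary illustrative threshold, not a standard.
    """
    return abs(gray_level(color_a) - gray_level(color_b)) >= min_gap


# Pure red vs. CSS green: nearly the same gray level.
print(distinguishable_in_gray("#FF0000", "#008000"))  # False
# Blue vs. orange: clearly different gray levels.
print(distinguishable_in_gray("#0000FF", "#FFA500"))  # True
```

When a pair fails this kind of check, adding a second encoding such as line style or marker shape keeps the series distinguishable.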

Data-ink ratio

What is the data-ink ratio?

This concept comes from visualization master Edward Tufte:

Data-ink ratio = ink used to display data / all ink used in the chart

Simply put: remove all unnecessary decoration so that every drop of ink serves the data.
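The ratio itself is simple arithmetic. In the sketch below the "ink" amounts are made-up units used only to show how removing decoration raises the ratio; in practice ink is not measured this way, and the function name is hypothetical:

```python
def data_ink_ratio(data_ink, total_ink):
    """Data-ink ratio = ink used to display data / all ink used in the chart."""
    if total_ink <= 0:
        raise ValueError("total_ink must be positive")
    if not 0 <= data_ink <= total_ink:
        raise ValueError("data_ink must be between 0 and total_ink")
    return data_ink / total_ink


# Hypothetical numbers: the bars themselves use 30 units of ink in both charts.
decorated = data_ink_ratio(data_ink=30, total_ink=100)  # heavy grid, hatching, background
clean = data_ink_ratio(data_ink=30, total_ink=40)       # decoration removed

print(f"decorated: {decorated:.2f}")  # decorated: 0.30
print(f"clean:     {clean:.2f}")      # clean:     0.75
```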

Reduce “chart junk”

Practical comparison

Bad practice (low data-ink ratio):

# ❌ Over-decorated
fig, ax = plt.subplots(figsize=(8, 5))
values = [25, 40, 30, 55, 45]
categories = ["A", "B", "C", "D", "E"]

ax.bar(categories, values, color="skyblue", edgecolor="navy", linewidth=2, hatch="//")
ax.set_facecolor("#f0f0f0")          # Background color
ax.grid(True, linewidth=2, alpha=1)  # Heavy grid
ax.set_title("Sales Data", fontsize=20, fontweight="bold",
             fontstyle="italic", color="red")
# Too much decoration makes the data less prominent
plt.show()

Good practice (high data-ink ratio):

# ✅ Clean and clear
fig, ax = plt.subplots(figsize=(8, 5))

bars = ax.bar(categories, values, color="#4CAF50", width=0.6)

# Put values directly on the bars (reduces dependence on the axis)
for bar, val in zip(bars, values):
    ax.text(bar.get_x() + bar.get_width()/2, bar.get_height() + 1,
            str(val), ha="center", fontsize=12)

ax.set_title("Sales Data", fontsize=14)
ax.spines["top"].set_visible(False)     # Remove top border
ax.spines["right"].set_visible(False)   # Remove right border
ax.set_ylabel("Sales")

plt.tight_layout()
plt.show()

Simplification checklist

Element	Keep?	Reason
Top/right borders	Remove	No information value
Dense grid lines	Remove or soften	Use `alpha=0.2`
3D effects	Remove	Distort proportions
Data labels	As needed	Sometimes more intuitive than the axis
Legend	As needed	Not needed when there is only one series
Background color	White background	Least distracting

A minimal clean template that beginners can remember first

If you are not sure how to make a chart look more “professional,” you can start with these defaults:

White background
Remove top and right borders
Make grid lines lighter
Let the title say only the key point
Use fewer exaggerated colors

Charts made this way are usually already clearer than many “carefully decorated” versions.

Common misleading visualization techniques

Misleading 1: Truncating the Y-axis

The sales of three products are 98, 99, and 100 — almost the same. But if the Y-axis does not start at 0:

Truncated Y-axis (starting from 97) — looks like a huge difference:

Product C looks 3 times taller than Product A! But in reality it is only 2% higher.

Y-axis starts at 0 — true proportions:

When the axis starts at 0, the three bars are almost the same height — that is the true picture of the data.

Key takeaway

Not starting the Y-axis at 0 can make tiny differences look huge. This kind of misleading technique is very common in news media, so learn to spot it!
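The exaggeration can be quantified. Tufte describes a "lie factor": the size of the effect shown in the graphic divided by the size of the effect in the data. The following sketch applies that idea to two bars; the function names are illustrative, and it assumes both values lie above the axis baseline and differ from each other:

```python
def relative_change(first, second):
    """Relative change from first to second, e.g. 0.02 for a 2% increase."""
    return (second - first) / first


def lie_factor(first, second, baseline):
    """Effect shown by the drawn bar heights divided by the effect in the data."""
    if first <= baseline or second <= baseline:
        raise ValueError("both values must be above the axis baseline")
    if first == second:
        raise ValueError("values must differ to measure an effect")
    shown = relative_change(first - baseline, second - baseline)
    actual = relative_change(first, second)
    return shown / actual


# Product A = 98, Product C = 100
print(round(lie_factor(98, 100, baseline=97), 1))  # 98.0 (drawn 3x taller vs. about 2% in the data)
print(round(lie_factor(98, 100, baseline=0), 1))   # 1.0  (honest)
```

A value near 1 means the drawn difference matches the difference in the data.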

Code comparison:

import matplotlib.pyplot as plt

fig, axes = plt.subplots(1, 2, figsize=(12, 4))

data = [98, 99, 100]
labels = ["Product A", "Product B", "Product C"]

# ❌ Misleading: truncated Y-axis
axes[0].bar(labels, data, color="#F44336")
axes[0].set_ylim(97, 101)  # Y-axis starts at 97!
axes[0].set_title("Misleading: truncated Y-axis (starts at 97)")  # No emoji: many default fonts lack the glyph
axes[0].set_ylabel("Sales")

# ✅ Correct: start at 0
axes[1].bar(labels, data, color="#4CAF50")
axes[1].set_ylim(0, 110)
axes[1].set_title("Correct: Y-axis starts at 0")  # No emoji: many default fonts lack the glyph
axes[1].set_ylabel("Sales")

plt.tight_layout()
plt.show()

When can you avoid starting at 0?

For line charts that focus on how a value changes over time, you can truncate the Y-axis, because readers judge the line’s slope and movement rather than its height above zero. Bar charts are different: the length of each bar represents the quantity, so their axis must start at 0.
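This rule can be written as a small helper that proposes Y-axis limits by chart type. It is a sketch under stated assumptions: the name `suggest_ylim` and the 10% padding are hypothetical choices, and the bar branch assumes non-negative values:

```python
def suggest_ylim(values, chart_type, padding=0.1):
    """Suggest (bottom, top) Y-axis limits for a 'bar' or 'line' chart."""
    low, high = min(values), max(values)
    if chart_type == "bar":
        if low < 0:
            raise ValueError("this sketch only handles non-negative bar values")
        # Bar length encodes quantity, so the axis must start at 0.
        return (0, high * (1 + padding))
    if chart_type == "line":
        # Readers follow the line's movement, so zooming in is acceptable.
        span = (high - low) or 1.0
        return (low - padding * span, high + padding * span)
    raise ValueError(f"unsupported chart type: {chart_type!r}")


data = [98, 99, 100]
print(suggest_ylim(data, "bar"))   # bottom is 0, top is about 110
print(suggest_ylim(data, "line"))  # roughly (97.8, 100.2)
```

With Matplotlib the result could be applied as `ax.set_ylim(*suggest_ylim(data, "bar"))`.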

Misleading 2: 3D pie charts

3D pie charts make the part closer to the viewer look larger:

# ❌ 3D pie chart
# Matplotlib does not actually have a real 3D pie chart, but this shows the idea:
# slices in front appear larger due to perspective, and slices in back appear smaller
# causing readers to judge proportions incorrectly

# ✅ Correct approach: use a 2D pie chart or a bar chart instead
fig, axes = plt.subplots(1, 2, figsize=(12, 5))

labels = ["Python", "Java", "JS", "C++"]
sizes = [35, 25, 25, 15]

axes[0].pie(sizes, labels=labels, autopct="%1.0f%%")
axes[0].set_title("2D Pie Chart (clear)")

axes[1].barh(labels, sizes, color=["#4CAF50", "#2196F3", "#FFC107", "#FF5722"])
axes[1].set_xlabel("Proportion (%)")
axes[1].set_title("Bar Chart (more precise comparison)")

plt.tight_layout()
plt.show()

Misleading 3: Dual Y-axis trap

# ❌ Dual Y-axes may suggest a correlation that does not really exist
fig, ax1 = plt.subplots(figsize=(8, 5))

months = range(1, 13)
temp = [5, 7, 12, 18, 23, 28, 30, 29, 24, 17, 10, 6]
ice_cream = [20, 25, 35, 50, 70, 90, 95, 88, 60, 40, 22, 18]

ax1.plot(months, temp, "r-", label="Temperature")
ax1.set_ylabel("Temperature (°C)", color="r")

ax2 = ax1.twinx()
ax2.plot(months, ice_cream, "b-", label="Ice cream sales")
ax2.set_ylabel("Ice cream sales", color="b")

ax1.set_title("Temperature vs Ice Cream Sales")
plt.show()

# There is indeed a correlation here, but the scale of dual Y-axes can be adjusted arbitrarily
# making the two lines look perfectly aligned or completely unrelated
# A scatter plot is a more honest way to show correlation
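Alongside a scatter plot, a correlation coefficient gives a number that does not depend on how either axis is scaled. The sketch below computes Pearson's r in plain Python for the temperature and ice cream data above; `pearson_r` is an illustrative helper, not a library function:

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


temp = [5, 7, 12, 18, 23, 28, 30, 29, 24, 17, 10, 6]
ice_cream = [20, 25, 35, 50, 70, 90, 95, 88, 60, 40, 22, 18]

r = pearson_r(temp, ice_cream)
# Rescaling one series (like stretching a Y-axis) leaves r unchanged.
r_rescaled = pearson_r(temp, [v * 10 + 5 for v in ice_cream])

print(f"r = {r:.2f}, after rescaling = {r_rescaled:.2f}")
```

Unlike the apparent alignment of two lines on twin axes, this value cannot be changed by adjusting axis limits.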

Misleading 4: Incorrect area/size mapping

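# ❌ Using diameter instead of area to represent quantity
# If A = 100 and B = 200,
# doubling the diameter makes the area 4 times larger, so readers think B is 4 times A

# ✅ Correct approach: map values to area
import numpy as np

values = np.array([100, 200, 300])
# Area should be proportional to the value, so radius is proportional to sqrt(value).
# In Matplotlib, the `s` argument of ax.scatter is already the marker area (points^2),
# so passing value-proportional sizes gives the correct mapping.
sizes = values * 2        # Area proportional to value; use as ax.scatter(..., s=sizes)
radii = np.sqrt(values)   # Only needed where an API expects a radius or diameter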
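The arithmetic behind this rule can be checked directly. In this sketch the helper `radius_for_value` and its `area_per_unit` parameter are hypothetical names; the point is that doubling the value should multiply the radius by the square root of 2, not by 2:

```python
import math


def radius_for_value(value, area_per_unit=1.0):
    """Radius of a circle whose area equals value * area_per_unit."""
    if value < 0:
        raise ValueError("value must be non-negative")
    return math.sqrt(value * area_per_unit / math.pi)


r_a = radius_for_value(100)
r_b = radius_for_value(200)

radius_ratio = r_b / r_a
area_ratio = (math.pi * r_b ** 2) / (math.pi * r_a ** 2)

print(f"radius ratio: {radius_ratio:.3f}")  # radius ratio: 1.414
print(f"area ratio:   {area_ratio:.3f}")    # area ratio:   2.000
```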

Summary of common misleading techniques

Misleading method	Why it misleads	Correct approach
Truncated Y-axis	Exaggerates differences	Bar charts start at 0
3D pie chart	Distorts area proportions	Use 2D pie charts or bar charts
Dual Y-axis	Allows visual correlation to be manipulated	Use scatter plots or separate charts
Incorrect area usage	Distorts size perception	Make area proportional to the value
Selective display	Hides unfavorable data	Show complete data
Color manipulation	Bright colors draw attention to minor data	Use neutral colors as the main palette

Full checklist

After finishing a chart, check it with this list:

☐ Is the chart type appropriate? (line vs. bar vs. scatter...)
☐ Does the title clearly express what the chart is saying?
☐ Do the axes have labels and units?
☐ Is the Y-axis start value reasonable? (bar charts should start at 0)
☐ Is the legend necessary and clear?
☐ Are the colors color-blind-friendly?
☐ Have unnecessary decorations been removed? (3D effects, fancy backgrounds)
☐ Do the data labels help understanding?
☐ Is the font large enough? (Can others read it clearly?)
☐ Is the data presentation honest? (No misleading tricks?)
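A few items on this list can be checked mechanically. The sketch below is a hypothetical helper, `audit_bar_chart`, that only calls `get_title()`, `get_ylabel()`, and `get_ylim()`, which exist on Matplotlib `Axes` objects. The `StandInAxes` class is an assumption added purely so the example runs without Matplotlib installed:

```python
def audit_bar_chart(ax):
    """Return checklist warnings for an Axes-like object holding a bar chart."""
    warnings = []
    if not ax.get_title().strip():
        warnings.append("Title is empty")
    if not ax.get_ylabel().strip():
        warnings.append("Y-axis has no label or unit")
    bottom, _top = ax.get_ylim()
    if bottom > 0:
        warnings.append(f"Y-axis starts at {bottom:g}, not 0")
    return warnings


class StandInAxes:
    """Tiny stand-in for a Matplotlib Axes (illustration only)."""

    def __init__(self, title, ylabel, ylim):
        self._title, self._ylabel, self._ylim = title, ylabel, ylim

    def get_title(self):
        return self._title

    def get_ylabel(self):
        return self._ylabel

    def get_ylim(self):
        return self._ylim


bad = StandInAxes("SALES DATA!!!", "", (30, 70))
good = StandInAxes("2024 Sales of Each Product", "Sales", (0, 110))

print(audit_bar_chart(bad))   # ['Y-axis has no label or unit', 'Y-axis starts at 30, not 0']
print(audit_bar_chart(good))  # []
```

Items such as "is the chart type appropriate" or "is the presentation honest" still need human judgment; a check like this only catches the mechanical ones.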

The safest default order for your first reporting chart

A more stable order is usually:

Choose the right chart first
Then make the title and axes clear
Then remove extra decoration
Finally, check whether anything is misleading

This is less likely to go off track than worrying about colors and shadows from the very beginning.

From “usable” to “easy to use” templates

Minimal professional template

import matplotlib.pyplot as plt
import numpy as np

def professional_style(ax):
    """One-click professional styling"""
    ax.spines["top"].set_visible(False)
    ax.spines["right"].set_visible(False)
    ax.grid(True, axis="y", alpha=0.2, linestyle="--")
    ax.tick_params(labelsize=10)

# Use the template
fig, ax = plt.subplots(figsize=(8, 5))

categories = ["Product A", "Product B", "Product C", "Product D", "Product E"]
values = [42, 38, 55, 29, 47]

bars = ax.bar(categories, values, color="#1976D2", width=0.6)

# Data labels (plain values, in the same unit as the Y-axis label)
for bar, val in zip(bars, values):
    ax.text(bar.get_x() + bar.get_width()/2, bar.get_height() + 1,
            str(val), ha="center", fontsize=11)

ax.set_title("2024 Sales of Each Product", fontsize=14, pad=15)
ax.set_ylabel("Sales (ten thousand yuan)")
ax.set_ylim(0, max(values) * 1.15)

professional_style(ax)
plt.tight_layout()
plt.show()

Seaborn minimal template

import seaborn as sns
import matplotlib.pyplot as plt

# Global settings
sns.set_theme(
    style="ticks",                         # Minimal tick style
    palette="colorblind",                   # Color-blind-friendly
    rc={
        "figure.figsize": (8, 5),
        "axes.spines.top": False,           # Remove top border
        "axes.spines.right": False,         # Remove right border
        "font.size": 11,
    }
)

# All later sns/plt charts will use this style

Summary

Three core sentences:

Choose the right chart — let the data decide the chart type, not the other way around
Less is more — remove everything that does not serve the data
Present honestly — do not exaggerate, hide, or mislead

What you should take away from this section

The core of best practices is not “prettier,” but “clearer and more honest”
Choosing the right chart matters much more than adding lots of styling
The biggest danger in visualization is not a lack of flash, but leading people to focus on the wrong thing

Hands-on exercises

Exercise 1: Improve an “ugly chart”

# The chart below has many problems. Please transform it into a professional version:
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
data = [45, 52, 38, 67, 41]
labels = ["Beijing", "Shanghai", "Guangzhou", "Shenzhen", "Hangzhou"]

ax.bar(labels, data, color=["red", "green", "blue", "yellow", "purple"],
       edgecolor="black", linewidth=3, hatch="xxx")
ax.set_ylim(30, 70)  # Truncated Y-axis!
ax.set_facecolor("#cccccc")
ax.grid(True, linewidth=3)
ax.set_title("SALES DATA!!!", fontsize=24, color="red")
plt.show()

# Problem list:
# 1. Y-axis does not start at 0 (misleading)
# 2. Colors are flashy and inconsistent
# 3. Background color is distracting
# 4. Grid is too thick
# 5. Title is unclear
# 6. Borders are unnecessary
# Please fix them one by one!

Exercise 2: Chart selection practice

Please choose the appropriate chart type for each scenario and explain why:

Show the annual revenue change of a company from 2018 to 2024
Compare the average housing prices of 5 cities
Analyze the relationship between advertising spending and sales
Check the age distribution of 1,000 employees
Show the employee proportion of each department in a company (4 departments)
Compare the distribution differences among three groups of experimental data

Exercise 3: Color-blind-friendly redesign

# Transform the following chart into a color-blind-friendly version
# Requirement: use a color-blind-safe palette + dual encoding with different line styles/markers
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
x = range(10)
ax.plot(x, [i**1.5 for i in x], color="red", label="Model A")
ax.plot(x, [i**1.3 for i in x], color="green", label="Model B")
ax.plot(x, [i**1.1 for i in x], color="red", alpha=0.5, label="Model C")  # Too similar to A!
ax.legend()
plt.show()
