12 Matplotlib and Data Visualization#
Goal#
Learn how to create professional plots and visualizations using Matplotlib. Data visualization is crucial for understanding results and communicating findings.
Prerequisites#
1. Introduction#
Matplotlib is Python’s primary library for creating static plots and visualizations. It’s widely used in scientific research for publication-quality figures.
In this tutorial, you’ll learn to create:
Line plots
Scatter plots
Histograms
Bar plots
Subplots (multiple plots in one figure)
Customized plots with labels, legends, and styling
2. Installation#
Matplotlib ships with the full Anaconda distribution. Miniconda and plain Python installations do not include it, so install it explicitly with:
pip install matplotlib
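To confirm that the installation worked, a quick check along these lines can help. It only prints the installed version, which will differ from system to system.

```python
import matplotlib

# Print the installed Matplotlib version; an ImportError here means the install failed
print(matplotlib.__version__)
```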
3. Basic Plotting#
3.1 Simple Line Plot#
import matplotlib.pyplot as plt
# Data
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]
# Create plot
plt.plot(x, y)
plt.xlabel("X-axis Label")
plt.ylabel("Y-axis Label")
plt.title("Simple Line Plot")
plt.show()
3.2 Scatter Plot#
import matplotlib.pyplot as plt
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 6]
plt.scatter(x, y)
plt.xlabel("X")
plt.ylabel("Y")
plt.title("Scatter Plot")
plt.show()
3.3 Histogram#
import matplotlib.pyplot as plt
import numpy as np
data = np.random.normal(100, 15, 1000) # 1000 values, mean=100, std=15
plt.hist(data, bins=30)
plt.xlabel("Value")
plt.ylabel("Frequency")
plt.title("Distribution of Data")
plt.show()
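The histogram above draws new random values on every run. As a minimal sketch, assuming you want repeatable output, the variant below uses a seeded NumPy generator (the seed value 42 is arbitrary) and keeps the bin counts and bin edges that `plt.hist` returns so they can be inspected.

```python
import matplotlib.pyplot as plt
import numpy as np

# Seeded generator so the same "random" data is drawn on every run (seed is arbitrary)
rng = np.random.default_rng(seed=42)
data = rng.normal(100, 15, 1000)  # 1000 values, mean=100, std=15

# hist returns the bar heights, the bin edges, and the drawn patches
counts, bin_edges, patches = plt.hist(data, bins=30)
plt.xlabel("Value")
plt.ylabel("Frequency")
plt.title("Distribution of Data")

print(int(counts.sum()))  # → 1000 (every value falls into some bin)
print(len(bin_edges))     # → 31 (30 bins have 31 edges)
# Call plt.show() here to display the figure
```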
3.4 Bar Plot#
import matplotlib.pyplot as plt
categories = ["Physics", "Chemistry", "Biology"]
values = [85, 78, 92]
plt.bar(categories, values)
plt.ylabel("Score")
plt.title("Subject Scores")
plt.show()
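Bar plots are often easier to read when each bar carries its value. The sketch below reuses the same data and adds the labels with `plt.bar_label`, which assumes Matplotlib 3.4 or newer.

```python
import matplotlib.pyplot as plt

categories = ["Physics", "Chemistry", "Biology"]
values = [85, 78, 92]

bars = plt.bar(categories, values)
# Write each bar's height just above it (bar_label needs Matplotlib 3.4 or newer)
value_labels = plt.bar_label(bars, padding=3)
plt.ylabel("Score")
plt.title("Subject Scores")
# Call plt.show() here to display the figure
```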
4. Customizing Plots#
4.1 Colors, Markers, and Line Styles#
import matplotlib.pyplot as plt
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
# Customize appearance
plt.plot(x, y,
    color='red', # or 'r', '#FF0000'
    marker='o', # o, s, ^, *, +
    linestyle='--', # '-', '--', '-.', ':'
    linewidth=2,
    markersize=8,
    label='y = x²'
)
plt.legend() # Show label
plt.show()
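For quick plots, color, line style, and marker can also be combined into a single format string. The following sketch should produce the same red dashed line with circle markers as the keyword version.

```python
import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]

# 'r--o' = red color, dashed line, circle markers
(line,) = plt.plot(x, y, 'r--o', label='y = x²')
plt.legend()
# Call plt.show() here to display the figure
```

Keyword arguments are more verbose but easier to read when several properties are set at once.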
4.2 Axis Limits and Grid#
import matplotlib.pyplot as plt
x = [1, 2, 3, 4]
y = [1, 4, 2, 3]
plt.plot(x, y)
plt.xlim(0, 5) # Set x-axis limits
plt.ylim(0, 5) # Set y-axis limits
plt.grid(True, alpha=0.3) # Add grid
plt.show()
4.3 Labels and Text#
import matplotlib.pyplot as plt
plt.figure(figsize=(10, 6)) # Set figure size (width, height in inches)
plt.plot([1, 2, 3], [1, 4, 9])
plt.title("Quadratic Function", fontsize=16)
plt.xlabel("X", fontsize=12)
plt.ylabel("Y", fontsize=12)
# Add text at specific location
plt.text(2.5, 7, "Important Point", fontsize=10)
plt.show()
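When the text should point at a specific data point, `plt.annotate` can draw an arrow from the label to that point. This is a small sketch; the coordinates are chosen only for illustration.

```python
import matplotlib.pyplot as plt

plt.figure(figsize=(10, 6))
plt.plot([1, 2, 3], [1, 4, 9], marker='o')
# xy is the data point being marked; xytext is where the label text sits
note = plt.annotate("Important Point", xy=(2, 4), xytext=(1.2, 7),
                    arrowprops=dict(arrowstyle="->"), fontsize=10)
# Call plt.show() here to display the figure
```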
5. Multiple Plots (Subplots)#
5.1 Create Subplots#
import matplotlib.pyplot as plt
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 5)) # 1 row, 2 columns
# First subplot
ax1.plot([1, 2, 3], [1, 4, 9])
ax1.set_title("Plot 1")
ax1.set_xlabel("X")
ax1.set_ylabel("Y")
# Second subplot
ax2.scatter([1, 2, 3], [3, 2, 1])
ax2.set_title("Plot 2")
ax2.set_xlabel("X")
ax2.set_ylabel("Y")
plt.tight_layout() # Adjust spacing
plt.show()
5.2 2x2 Subplots#
import matplotlib.pyplot as plt
fig, axes = plt.subplots(2, 2, figsize=(10, 8))
# axes is a 2x2 array; axes.flat iterates over all four subplots
for ax in axes.flat:
    ax.plot([1, 2, 3], [1, 2, 3])
    ax.set_xlabel("X")
    ax.set_ylabel("Y")
plt.tight_layout()
plt.show()
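The loop above draws the same plot in every panel. To place a different plot in each panel, the axes array can be indexed as `axes[row, col]`. The sketch below uses made-up data and an arbitrary seed.

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(seed=0)  # arbitrary seed for repeatable data
x = np.linspace(0, 10, 50)

fig, axes = plt.subplots(2, 2, figsize=(10, 8))
axes[0, 0].plot(x, x**2)
axes[0, 0].set_title("Line")
axes[0, 1].scatter(x, rng.normal(size=50))
axes[0, 1].set_title("Scatter")
axes[1, 0].hist(rng.normal(size=500), bins=20)
axes[1, 0].set_title("Histogram")
axes[1, 1].bar(["A", "B", "C"], [3, 5, 2])
axes[1, 1].set_title("Bar")
fig.tight_layout()  # Adjust spacing so titles and labels do not overlap
# Call plt.show() here to display the figure
```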
6. Working with Data#
With Pandas#
import matplotlib.pyplot as plt
import pandas as pd
df = pd.DataFrame({
    "Month": ["Jan", "Feb", "Mar", "Apr"],
    "Sales": [100, 150, 120, 200]
})
plt.bar(df["Month"], df["Sales"])
plt.ylabel("Sales")
plt.title("Monthly Sales")
plt.show()
With NumPy#
import numpy as np
import matplotlib.pyplot as plt
x = np.linspace(0, 2*np.pi, 100) # 100 points from 0 to 2π
y = np.sin(x)
plt.plot(x, y)
plt.title("Sine Wave")
plt.show()
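NumPy arrays make it easy to compare several functions on the same axes. As a sketch, the example below plots sine and cosine together and uses a legend to tell them apart.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 2*np.pi, 100)  # 100 points from 0 to 2π

plt.figure()
plt.plot(x, np.sin(x), label="sin(x)")
plt.plot(x, np.cos(x), linestyle="--", label="cos(x)")
plt.xlabel("x (radians)")
plt.ylabel("Amplitude")
plt.title("Sine and Cosine")
legend = plt.legend()
# Call plt.show() here to display the figure
```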
7. Saving Figures#
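import matplotlib.pyplot as plt
plt.plot([1, 2, 3], [1, 4, 9])
plt.title("My Plot")
# Save before calling plt.show(): once the window is closed, the figure
# may be discarded and a later savefig can write a blank file
plt.savefig("my_plot.png", dpi=300, bbox_inches='tight')
# Also save as PDF for publications
plt.savefig("my_plot.pdf", bbox_inches='tight')
plt.show()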
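Keeping a reference to the figure object makes saving independent of whichever figure is currently active. The sketch below writes the same figure as PNG and PDF. It uses a temporary directory only so that it leaves no files in your project; in practice you would substitute your own output path.

```python
import tempfile
from pathlib import Path

import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([1, 2, 3], [1, 4, 9])
ax.set_title("My Plot")

# Temporary output directory (an assumption for this sketch; use your own path)
out_dir = Path(tempfile.mkdtemp())
saved = []
for ext in ("png", "pdf"):
    path = out_dir / f"my_plot.{ext}"
    # The file format is inferred from the extension
    fig.savefig(path, dpi=300, bbox_inches="tight")
    saved.append(path)

for path in saved:
    print(path.name, path.stat().st_size > 0)
```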
8. Common Plot Types#
Box Plot (for distributions)#
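import matplotlib.pyplot as plt
data1 = [1, 2, 3, 4, 5]
data2 = [2, 4, 6, 8, 10]
plt.boxplot([data1, data2])
# Boxes sit at x = 1, 2, ...; labeling via xticks works across versions
# (the labels= keyword of boxplot was renamed tick_labels in Matplotlib 3.9)
plt.xticks([1, 2], ['Dataset 1', 'Dataset 2'])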
plt.ylabel("Values")
plt.show()
Heatmap (with imshow; no seaborn required)#
import matplotlib.pyplot as plt
import numpy as np
data = np.random.random((5, 5))
plt.imshow(data, cmap='viridis') # cmap = colormap
plt.colorbar()
plt.title("Heatmap")
plt.show()
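For small grids, printing each value inside its cell can make a heatmap easier to read. This is a minimal sketch with a fixed 3x3 array; the 0.5 threshold for switching text color is an arbitrary choice for readability.

```python
import matplotlib.pyplot as plt
import numpy as np

data = np.arange(9).reshape(3, 3) / 8  # fixed 3x3 grid with values from 0 to 1

fig, ax = plt.subplots()
image = ax.imshow(data, cmap='viridis')
colorbar = fig.colorbar(image, ax=ax)
colorbar.set_label("Value")

for row in range(data.shape[0]):
    for col in range(data.shape[1]):
        value = data[row, col]
        # imshow puts the column index on x and the row index on y
        ax.text(col, row, f"{value:.2f}", ha="center", va="center",
                color="white" if value < 0.5 else "black")
ax.set_title("Annotated Heatmap")
# Call plt.show() here to display the figure
```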
9. Style and Aesthetics#
import matplotlib.pyplot as plt
# Use a different style (the 'seaborn-v0_8-*' names need Matplotlib 3.6 or newer)
plt.style.use('seaborn-v0_8-darkgrid') # Other: 'ggplot', 'bmh', etc.
x = [1, 2, 3]
y = [1, 4, 9]
plt.plot(x, y, marker='o', linewidth=2)
plt.show()
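Style names vary between Matplotlib versions, so it can help to list the ones available on your system. The sketch below also shows `plt.style.context`, which applies a style to a single block of code instead of changing it globally.

```python
import matplotlib.pyplot as plt

# Names of the styles installed with your Matplotlib version
print(sorted(plt.style.available))

# Apply 'ggplot' only inside this block; the global style is left unchanged
with plt.style.context('ggplot'):
    fig, ax = plt.subplots()
    ax.plot([1, 2, 3], [1, 4, 9], marker='o', linewidth=2)
    ax.set_title("ggplot style")
# Call plt.show() here to display the figure
```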
10. Pro Tips#
Always label axes - viewers need to understand what they’re looking at
Use appropriate plot types - bar plot for categories, scatter for correlations, histogram for distributions
Keep it simple - too many colors/elements make plots hard to read
Use high DPI when saving - dpi=300 is good for publications
Add units to labels - e.g., “Temperature (°C)” not just “Temperature”
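As a small illustration of these tips, the sketch below labels both axes with units and keeps the plot to a single series. The readings are made up for the example, and the output file name in the final comment is hypothetical.

```python
import matplotlib.pyplot as plt

# Made-up readings, for illustration only
time_s = [0, 10, 20, 30, 40]
temperature_c = [20.0, 22.5, 25.1, 27.4, 30.2]

fig, ax = plt.subplots(figsize=(6, 4))
ax.plot(time_s, temperature_c, marker='o', color='tab:red')
ax.set_xlabel("Time (s)")            # units in the label
ax.set_ylabel("Temperature (°C)")    # units in the label
ax.set_title("Sample Heating Curve")
fig.tight_layout()
# To save at publication quality: fig.savefig("heating_curve.png", dpi=300)
```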
Resources#
Matplotlib Gallery - examples of different plot types
Seaborn - higher-level plotting on top of Matplotlib
Next Steps#
Now that you can visualize data, let’s learn about advanced statistical analysis: 13 SciPy and Fitting.