Data Visualization with Matplotlib


Introduction to Matplotlib

Matplotlib is Python's foundational plotting library. It matters for machine learning because visualization helps you explore your data, debug models, and communicate results.

Why Visualization for ML?

1. Exploratory Data Analysis: Understand distributions, relationships, and outliers
2. Feature Engineering: Visualize feature correlations and transformations
3. Model Evaluation: Plot learning curves, confusion matrices, ROC curves
4. Communication: Present findings to stakeholders

Installation

pip install matplotlib
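A quick way to confirm the installation worked is to import the package and print its version. The snippet below is a minimal check, not part of the original lesson; the backend it reports depends on your environment.

```python
import matplotlib

# Print the installed version to confirm the package is importable
print(matplotlib.__version__)

# Report which rendering backend Matplotlib selected for this environment
print(matplotlib.get_backend())
```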

Basic Import Convention

import matplotlib.pyplot as plt
import numpy as np

# Your first plot
x = np.linspace(0, 10, 100)
y = np.sin(x)

plt.plot(x, y)
plt.title('Sine Wave')
plt.xlabel('x')
plt.ylabel('sin(x)')
plt.show()
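In a script or on a machine without a display, plt.show() may not open a window. One common alternative is to write the figure to a file with plt.savefig. The sketch below assumes the non-interactive "Agg" backend and a hypothetical output file name in the system temp directory; neither comes from the lesson itself.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window needed
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 100)
y = np.sin(x)

plt.plot(x, y)
plt.title('Sine Wave')
plt.xlabel('x')
plt.ylabel('sin(x)')

# Hypothetical output location; any writable path works
out_path = os.path.join(tempfile.gettempdir(), "sine_wave.png")
plt.savefig(out_path, dpi=150)  # dpi controls the image resolution
plt.close()                     # release the figure once it is saved
```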

Two Interfaces

Matplotlib offers two ways to create plots:

1. Pyplot interface (quick and simple)

plt.plot(x, y)
plt.title('Simple Plot')
plt.show()

2. Object-Oriented interface (more control, recommended for complex plots)

fig, ax = plt.subplots()
ax.plot(x, y)
ax.set_title('OO Plot')
plt.show()
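The two interfaces are not separate systems: pyplot functions act on an implicit "current" Axes, while the object-oriented style names that Axes explicitly. The following sketch illustrates this by creating an Axes, drawing through pyplot, and checking that the Axes object received the line and title. It assumes the "Agg" backend so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 100)
y = np.sin(x)

fig, ax = plt.subplots()

# pyplot calls draw on whichever Axes is currently active
plt.plot(x, y)
plt.title('Simple Plot')

# plt.gca() returns the active Axes, which here is the one we created
same_axes = plt.gca() is ax
title_on_ax = ax.get_title()   # title set through pyplot, read from the Axes
n_lines = len(ax.lines)        # number of line artists drawn on the Axes

print(same_axes, title_on_ax, n_lines)
plt.close(fig)
```

Because both styles end up on the same objects, they can be mixed, but keeping an explicit reference such as ax makes it clearer which subplot each call affects once a figure has more than one.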

Figure and Axes

# Create a figure with a 2x2 grid of subplots
fig, axes = plt.subplots(nrows=2, ncols=2, figsize=(10, 8))

# Access individual subplots by [row, column]
axes[0, 0].plot(x, y)
axes[0, 0].set_title('Subplot 1')

axes[0, 1].scatter(x, y)
axes[0, 1].set_title('Subplot 2')

axes[1, 0].bar([1, 2, 3], [4, 5, 6])
axes[1, 0].set_title('Subplot 3')

axes[1, 1].hist(np.random.randn(1000), bins=30)
axes[1, 1].set_title('Subplot 4')

plt.tight_layout()  # Prevent overlapping labels and titles
plt.show()
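With nrows=2 and ncols=2, plt.subplots returns the Axes objects as a 2x2 NumPy array, which is why they are indexed as axes[row, column]. When every subplot gets similar treatment, iterating over axes.flat can be shorter than indexing each one. The sketch below illustrates that pattern under the assumption of the "Agg" backend; the loop is an addition for illustration, not the lesson's own code.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 100)

fig, axes = plt.subplots(nrows=2, ncols=2, figsize=(10, 8))
print(axes.shape)  # one entry per subplot, arranged as rows x columns

# axes.flat walks the grid row by row: [0, 0], [0, 1], [1, 0], [1, 1]
for i, ax in enumerate(axes.flat, start=1):
    ax.plot(x, np.sin(i * x))      # a different frequency in each panel
    ax.set_title(f'Subplot {i}')

fig.tight_layout()  # adjust spacing so titles and tick labels do not collide

titles = [ax.get_title() for ax in axes.flat]
plt.close(fig)
```

Note that with a single row or a single column the returned array is one-dimensional, and with one subplot plt.subplots returns a single Axes rather than an array.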