Module 4: Persistent Homology and Persistence Diagrams¶
Introduction¶
Essential Concepts¶
Topological Data Analysis (TDA) is a framework for extracting meaningful shape information from data. Unlike traditional statistical methods that focus on distances and distributions, TDA reveals the topological structure underlying complex datasets.
Key Ideas:¶
- Topological Data Analysis (TDA): A methodology for analyzing the shape of data using tools from algebraic topology
- Point cloud data: A collection of points in some metric space (e.g., $\mathbb{R}^n$), often obtained from:
- Sensor measurements
- High-dimensional data embeddings
- Sampling from manifolds
- Scientific observations
Central Question: How do we extract topological features (connected components, loops, voids) from discrete point clouds?
Example: Point Cloud Data¶
Let's generate a simple point cloud sampled from a circle with some noise.
# Import necessary libraries
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.patches import Circle
from scipy.spatial.distance import cdist
# Set random seed for reproducibility
np.random.seed(42)
# Generate points sampled from a noisy circle
n_points = 50
theta = np.linspace(0, 2*np.pi, n_points, endpoint=False)
radius = 1.0
noise_level = 0.1
# Add noise to the circle
x = radius * np.cos(theta) + np.random.normal(0, noise_level, n_points)
y = radius * np.sin(theta) + np.random.normal(0, noise_level, n_points)
points = np.column_stack([x, y])
# Visualize the point cloud
plt.figure(figsize=(8, 8))
plt.scatter(x, y, s=50, alpha=0.6)
plt.axis('equal')
plt.title('Point Cloud Sampled from a Noisy Circle', fontsize=14)
plt.xlabel('x')
plt.ylabel('y')
plt.grid(True, alpha=0.3)
plt.show()
print(f"Generated {n_points} points in 2D space")
Generated 50 points in 2D space
1. Filtrations (Core Idea)¶
Definition¶
A filtration is a nested sequence of topological spaces:
$$ \emptyset = X_0 \subseteq X_1 \subseteq X_2 \subseteq \cdots \subseteq X_n $$
Each space $X_i$ is indexed by a scale parameter (often denoted $\epsilon$ or $r$) that controls the granularity at which we examine the shape of the data.
Why Filtrations?¶
- Point clouds themselves have trivial topology (just discrete points)
- We need to "thicken" the points to reveal structure
- Different scales reveal different features
- Filtrations provide a systematic way to explore all scales
Most Common Filtrations¶
1. Vietoris–Rips Filtration¶
Definition: At scale $r$, connect points $p_i$ and $p_j$ with an edge if $d(p_i, p_j) \leq r$. Add higher-dimensional simplices for all cliques.
- Pros: Easy to compute, only requires pairwise distances
- Cons: May include simplices that don't correspond to actual geometric balls
- Most widely used in practice
Mathematical notation: $$ \text{Rips}_r(X) = \{\sigma \subseteq X : d(p_i, p_j) \leq r \text{ for all } p_i, p_j \in \sigma\} $$
# Visualize Vietoris-Rips filtration at different scales
def plot_rips_filtration(points, radii):
    fig, axes = plt.subplots(1, len(radii), figsize=(5*len(radii), 5))
    if len(radii) == 1:
        axes = [axes]
    for idx, r in enumerate(radii):
        ax = axes[idx]
        # Plot points
        ax.scatter(points[:, 0], points[:, 1], s=100, c='red', zorder=3)
        # Plot edges
        distances = cdist(points, points)
        for i in range(len(points)):
            for j in range(i+1, len(points)):
                if distances[i, j] <= r:
                    ax.plot([points[i, 0], points[j, 0]],
                            [points[i, 1], points[j, 1]],
                            'b-', alpha=0.3, linewidth=1)
        ax.set_aspect('equal')
        ax.set_title(f'Rips Filtration at r = {r:.2f}', fontsize=12)
        ax.grid(True, alpha=0.3)
    plt.tight_layout()
    plt.show()

# Use a smaller subset for clearer visualization
sample_points = points[::5]  # Take every 5th point
radii = [0.3, 0.5, 0.8]
plot_rips_filtration(sample_points, radii)
2. Čech Filtration¶
Definition: At scale $r$, a simplex is included if the intersection of balls of radius $r$ centered at its vertices is non-empty.
$$ \text{Čech}_r(X) = \left\{\sigma \subseteq X : \bigcap_{p \in \sigma} B(p, r) \neq \emptyset\right\} $$
- Pros: Theoretically correct (captures true nerve of covering)
- Cons: Computationally expensive (requires checking higher-order intersections)
- Used when theoretical guarantees are essential
Relationship (with the edge convention $d(p_i, p_j) \leq r$ used above): $\text{Rips}_r(X) \subseteq \text{Čech}_r(X) \subseteq \text{Rips}_{2r}(X)$
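The interleaving can be checked numerically (a sketch, not part of the original notebook): for three points in the plane, the triangle enters the Rips complex once $r$ reaches its diameter and the Čech complex once $r$ reaches the radius of its minimum enclosing ball, and the two thresholds satisfy $R \leq \text{diam} \leq 2R$. The helper `meb_radius_3` is mine.

```python
import numpy as np
from itertools import combinations

def meb_radius_3(pts):
    """Radius of the minimum enclosing ball of three points in the plane."""
    d01 = np.linalg.norm(pts[0] - pts[1])
    d12 = np.linalg.norm(pts[1] - pts[2])
    d02 = np.linalg.norm(pts[0] - pts[2])
    s = sorted([d01, d12, d02])
    # Obtuse (or right) triangle: the ball is spanned by the longest side
    if s[2]**2 >= s[0]**2 + s[1]**2:
        return s[2] / 2
    # Acute triangle: circumradius R = abc / (4 * area)
    v1, v2 = pts[1] - pts[0], pts[2] - pts[0]
    area = 0.5 * abs(v1[0] * v2[1] - v1[1] * v2[0])
    return d01 * d12 * d02 / (4 * area)

pts = np.array([[0.0, 0.0], [1.0, 0.0], [0.5, 0.8]])
diam = max(np.linalg.norm(p - q) for p, q in combinations(pts, 2))
R = meb_radius_3(pts)
# Cech threshold R, Rips threshold diam; interleaving gives R <= diam <= 2R
print(R, diam)
assert R <= diam <= 2 * R
```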
# Visualize Čech filtration concept with ball intersections
def plot_cech_concept():
    fig, axes = plt.subplots(1, 2, figsize=(14, 6))
    # Example points
    pts = np.array([[0, 0], [1, 0], [0.5, 0.8]])
    for idx, r in enumerate([0.4, 0.6]):
        ax = axes[idx]
        # Draw circles
        for i, pt in enumerate(pts):
            circle = Circle(pt, r, fill=False, edgecolor='blue', linewidth=2)
            ax.add_patch(circle)
            circle_fill = Circle(pt, r, alpha=0.1, facecolor='blue')
            ax.add_patch(circle_fill)
            ax.scatter(pt[0], pt[1], s=100, c='red', zorder=3)
        # Draw edges if balls intersect
        for i in range(len(pts)):
            for j in range(i+1, len(pts)):
                if np.linalg.norm(pts[i] - pts[j]) <= 2*r:
                    ax.plot([pts[i, 0], pts[j, 0]],
                            [pts[i, 1], pts[j, 1]],
                            'k-', linewidth=2, alpha=0.5)
        ax.set_xlim(-1, 2)
        ax.set_ylim(-1, 2)
        ax.set_aspect('equal')
        ax.set_title(f'Čech Complex at r = {r:.1f}', fontsize=12)
        ax.grid(True, alpha=0.3)
    plt.tight_layout()
    plt.show()

plot_cech_concept()
3. Alpha Filtration¶
Definition: Based on the Delaunay triangulation and Voronoi diagram. A simplex is included at scale $r$ if there exists an empty ball of radius at most $r$ whose boundary passes through its vertices (empty meaning the ball contains no other data points).
- Pros: Efficient in low dimensions (2D, 3D)
- Cons: Scales poorly to high dimensions
- Provides a subset of Čech complex with same homology
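The efficiency claim can be made concrete (an illustrative sketch, not from the original notebook): every alpha-complex simplex is a Delaunay simplex, so in 2D the complex never exceeds the linear-sized Delaunay triangulation, whereas the Rips/Čech scaffold can use all $O(n^2)$ edges.

```python
import numpy as np
from scipy.spatial import Delaunay

rng = np.random.default_rng(0)
pts = rng.uniform(size=(200, 2))
tri = Delaunay(pts)
# Collect the Delaunay edges; every alpha-complex simplex at any scale r
# is one of these simplices, so the complex stays linear-sized in 2D
edges = {tuple(sorted((s[i], s[j])))
         for s in tri.simplices for i in range(3) for j in range(i + 1, 3)}
print(len(tri.simplices), len(edges))
```

Compare `len(edges)` with the $\binom{200}{2} = 19900$ edges a Rips complex may have to consider.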
4. Sublevel-Set Filtration¶
Definition: Given a function $f: X \to \mathbb{R}$, the sublevel set at height $a$ is:
$$ X_a = \{x \in X : f(x) \leq a\} $$
- Use cases: Analyzing scalar fields, images, height functions
- Common in applications involving functions on manifolds
- Example: elevation data, density functions, image pixel intensities
# Visualize sublevel-set filtration on a 2D function
def plot_sublevel_filtration():
    # Create a 2D function (e.g., sum of Gaussians)
    x = np.linspace(-3, 3, 100)
    y = np.linspace(-3, 3, 100)
    X, Y = np.meshgrid(x, y)
    # Function: two Gaussian peaks
    Z = (np.exp(-((X-1)**2 + (Y-1)**2)) +
         np.exp(-((X+1)**2 + (Y+1)**2)))
    fig, axes = plt.subplots(1, 3, figsize=(18, 5))
    # Original function
    im = axes[0].contourf(X, Y, Z, levels=20, cmap='viridis')
    axes[0].set_title('Function f(x,y)', fontsize=12)
    axes[0].set_aspect('equal')
    plt.colorbar(im, ax=axes[0])
    # Sublevel sets at different heights
    for idx, threshold in enumerate([0.3, 0.7]):
        axes[idx+1].contourf(X, Y, Z, levels=[0, threshold], colors=['lightblue'], alpha=0.6)
        axes[idx+1].contour(X, Y, Z, levels=[threshold], colors='blue', linewidths=2)
        axes[idx+1].contourf(X, Y, Z, levels=20, cmap='viridis', alpha=0.3)
        axes[idx+1].set_title(f'Sublevel Set: f(x,y) ≤ {threshold}', fontsize=12)
        axes[idx+1].set_aspect('equal')
    plt.tight_layout()
    plt.show()

plot_sublevel_filtration()
Comparison of Filtrations¶
| Filtration | Computational Cost | Theoretical Accuracy | Typical Use Case |
|---|---|---|---|
| Vietoris–Rips | Low | Approximation | General point clouds |
| Čech | High | Exact (nerve theorem) | Theoretical analysis |
| Alpha | Medium (low dim) | Exact (in low dim) | 2D/3D geometric data |
| Sublevel-set | Depends on domain | Exact | Scalar fields, images |
2. Topological Features Across Scales¶
Birth and Death of Features¶
As we increase the scale parameter in a filtration, topological features appear and disappear:
- Birth: The scale at which a feature first appears
- Death: The scale at which the feature is filled, merged, or destroyed
Types of Features¶
Persistent homology tracks features in different dimensions:
0-Dimensional Features: Connected Components¶
- Birth: A component is born when a point is added
- Death: A component dies when it merges with an older component
- Tracks connectivity of the space
$$H_0 \text{ counts connected components}$$
# Visualize birth and death of connected components
def visualize_components():
    # Create 3 clusters of points
    np.random.seed(123)
    cluster1 = np.random.randn(8, 2) * 0.2 + np.array([0, 0])
    cluster2 = np.random.randn(8, 2) * 0.2 + np.array([2, 0])
    cluster3 = np.random.randn(8, 2) * 0.2 + np.array([1, 2])
    points = np.vstack([cluster1, cluster2, cluster3])
    fig, axes = plt.subplots(1, 4, figsize=(20, 5))
    radii = [0.0, 0.5, 1.6, 2]
    for idx, r in enumerate(radii):
        ax = axes[idx]
        ax.scatter(points[:, 0], points[:, 1], s=100, c='red', zorder=3)
        if r > 0:
            distances = cdist(points, points)
            for i in range(len(points)):
                for j in range(i+1, len(points)):
                    if distances[i, j] <= r:
                        ax.plot([points[i, 0], points[j, 0]],
                                [points[i, 1], points[j, 1]],
                                'b-', alpha=0.3, linewidth=1.5)
        ax.set_aspect('equal')
        ax.set_title(f'r = {r:.1f}', fontsize=14)
        ax.grid(True, alpha=0.3)
        # Component counts hardcoded for these radii
        # (the actual computation requires Union-Find)
        if r == 0.0:
            n_components = len(points)
        elif r == 0.5:
            n_components = 3
        elif r == 1.6:
            n_components = 2
        else:
            n_components = 1
        ax.text(0.05, 0.95, f'Components: {n_components}',
                transform=ax.transAxes, fontsize=12,
                verticalalignment='top',
                bbox=dict(boxstyle='round', facecolor='wheat', alpha=0.5))
    plt.suptitle('Birth and Death of Connected Components (H₀)', fontsize=16, y=1.02)
    plt.tight_layout()
    plt.show()

visualize_components()
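The component counts in the plot above are hardcoded; the real computation runs union-find over the edges of the Rips 1-skeleton. A minimal sketch (the function name `count_components` is mine, not from the notebook):

```python
import numpy as np
from scipy.spatial.distance import cdist

def count_components(points, r):
    """Count connected components of the Rips 1-skeleton at scale r
    using union-find with path halving."""
    parent = list(range(len(points)))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i
    d = cdist(points, points)
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if d[i, j] <= r:
                parent[find(i)] = find(j)  # union the two components
    return len({find(i) for i in range(len(points))})

pts = np.array([[0, 0], [0.1, 0], [2, 0], [2.1, 0]])
print(count_components(pts, 0.5))  # 2: two tight pairs, far apart
print(count_components(pts, 3.0))  # 1: everything merges
```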
1-Dimensional Features: Loops and Holes¶
- Birth: A loop is born when a cycle is formed
- Death: A loop dies when it is filled by a 2-dimensional simplex
- Captures circular structure in data
$$H_1 \text{ counts loops/holes}$$
Example: Points sampled from a circle create a 1-dimensional hole
# Visualize birth and death of a loop
def visualize_loop():
    # Generate circle points
    n = 12
    theta = np.linspace(0, 2*np.pi, n, endpoint=False)
    circle_points = np.column_stack([np.cos(theta), np.sin(theta)])
    fig, axes = plt.subplots(1, 4, figsize=(15, 5))
    radii = [0.5, 0.8, 1.8, 2]
    for idx, r in enumerate(radii):
        ax = axes[idx]
        ax.scatter(circle_points[:, 0], circle_points[:, 1], s=100, c='red', zorder=3)
        distances = cdist(circle_points, circle_points)
        for i in range(len(circle_points)):
            for j in range(i+1, len(circle_points)):
                if distances[i, j] <= r:
                    ax.plot([circle_points[i, 0], circle_points[j, 0]],
                            [circle_points[i, 1], circle_points[j, 1]],
                            'b-', alpha=0.3, linewidth=1.5)
        ax.set_xlim(-1.5, 1.5)
        ax.set_ylim(-1.5, 1.5)
        ax.set_aspect('equal')
        ax.grid(True, alpha=0.3)
        if idx == 0:
            title = f'r = {r:.1f}\nNo loop yet'
        elif idx == 1:
            title = f'r = {r:.1f}\nLoop is BORN (H₁)'
        elif idx == 2:  # third panel (the comparison was mistakenly against the radius 1.8)
            title = f'r = {r:.1f}\nLoop is still alive (H₁)'
        else:
            title = f'r = {r:.1f}\nLoop is FILLED (dies)'
        ax.set_title(title, fontsize=12)
    plt.suptitle('Birth and Death of a 1-Dimensional Loop', fontsize=16, y=1.02)
    plt.tight_layout()
    plt.show()

visualize_loop()
2-Dimensional Features: Voids¶
- Birth: A void is born when a hollow 3D cavity is formed
- Death: A void dies when it is filled by a 3-dimensional simplex
- Captures volumetric structure
$$H_2 \text{ counts voids/cavities}$$
Example: Points sampled from the surface of a sphere create a 2-dimensional void
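The notebook has no plot for this case, but generating a test cloud is easy; computing $H_2$ itself requires a TDA library such as `ripser` or `gudhi`, which is outside this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
# Sample points uniformly on the unit sphere by normalizing Gaussian vectors
pts = rng.normal(size=(200, 3))
pts /= np.linalg.norm(pts, axis=1, keepdims=True)
# All points lie exactly on the sphere surface; a Rips complex built on them
# at a moderate scale would carry one persistent H2 class (the enclosed void)
assert np.allclose(np.linalg.norm(pts, axis=1), 1.0)
```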
Key Insight¶
By tracking when features are born and when they die across the filtration, we obtain a multiscale signature of the topological structure of our data. Features that persist for a long time (long lifespan) are typically considered signal, while short-lived features are often treated as noise.
This leads naturally to the concept of persistence, which we'll explore in the next sections.
Signal vs. Noise Heuristic¶
A fundamental heuristic (widely used in practice):
- Long-lived features (large persistence) → meaningful structure, signal
- Short-lived features (small persistence) → noise, artifacts
Example: Points sampled from a noisy circle create a prominent 1-dimensional loop that persists across many scales (signal), while small noise-induced loops appear and disappear quickly (noise).
# Example: Generate noisy circle data
n_points = 50
theta = np.linspace(0, 2*np.pi, n_points, endpoint=False)
radius = 1.0
noise_level = 0.15
x = radius * np.cos(theta) + np.random.normal(0, noise_level, n_points)
y = radius * np.sin(theta) + np.random.normal(0, noise_level, n_points)
circle_points = np.column_stack([x, y])
# Visualize the evolution of topology
fig, axes = plt.subplots(1, 4, figsize=(20, 5))
radii = [0.2, 0.35, 0.5, 0.8]
for idx, r in enumerate(radii):
    ax = axes[idx]
    ax.scatter(x, y, s=80, c='red', zorder=3, alpha=0.7)
    # Draw edges for Rips complex
    distances = cdist(circle_points, circle_points)
    for i in range(len(circle_points)):
        for j in range(i+1, len(circle_points)):
            if distances[i, j] <= r:
                ax.plot([circle_points[i, 0], circle_points[j, 0]],
                        [circle_points[i, 1], circle_points[j, 1]],
                        'b-', alpha=0.2, linewidth=1)
    ax.set_aspect('equal')
    ax.set_title(f'Scale r = {r:.2f}', fontsize=13)
    ax.grid(True, alpha=0.3)
    ax.set_xlim(-1.5, 1.5)
    ax.set_ylim(-1.5, 1.5)
    # Add annotation
    if idx == 0:
        ax.text(0.5, -1.3, 'Disconnected\ncomponents', ha='center', fontsize=10)
    elif idx == 1:
        ax.text(0.5, -1.3, 'Components\nmerging', ha='center', fontsize=10)
    elif idx == 2:
        ax.text(0.5, -1.3, 'Loop BORN\n(H₁ feature)', ha='center', fontsize=10,
                bbox=dict(boxstyle='round', facecolor='yellow', alpha=0.5))
    else:
        ax.text(0.5, -1.3, 'Loop persists\n(signal)', ha='center', fontsize=10)
plt.suptitle('Persistent Homology: Tracking Features Across Scales', fontsize=16, y=1.02)
plt.tight_layout()
plt.show()
print("Key observation: The main loop (H₁) persists from r≈0.5 to large r → SIGNAL")
print("Small loops appear and disappear quickly → NOISE")
Key observation: The main loop (H₁) persists from r≈0.5 to large r → SIGNAL
Small loops appear and disappear quickly → NOISE
3. Persistent Homology¶
Formal Definition: Persistence Module¶
Given a filtration: $$ \emptyset = X_0 \subseteq X_1 \subseteq X_2 \subseteq \cdots \subseteq X_n $$
The persistence module is a sequence of homology groups with homomorphisms: $$ H_k(X_0) \xrightarrow{f_0} H_k(X_1) \xrightarrow{f_1} H_k(X_2) \xrightarrow{f_2} \cdots \xrightarrow{f_{n-1}} H_k(X_n) $$
where:
- $H_k(X_i)$ is the $k$-th homology group of the space $X_i$ (taken with coefficients in some field, typically $\mathbb{Z}_2$)
- $f_i: H_k(X_i) \to H_k(X_{i+1})$ are homomorphisms induced by inclusion $X_i \subseteq X_{i+1}$
Structure Theorem¶
By the Structure Theorem for persistence modules, any persistence module can be decomposed as: $$ M \cong \bigoplus_i I_{[b_i, d_i)} $$
where $I_{[b, d)}$ is an interval module (non-zero on $[b, d)$, zero outside).
Each interval $[b_i, d_i)$ corresponds to a topological feature born at $b_i$ and dying at $d_i$.
4. Persistence Diagrams & Barcodes¶
The results of persistent homology are visualized using two equivalent representations:
4.1 Barcodes¶
A barcode is a collection of intervals $[birth, death)$ visualized as horizontal bars.
Structure:¶
- Each horizontal bar represents one topological feature
- Left endpoint = birth time
- Right endpoint = death time
- Length of bar = persistence (death - birth)
Interpretation:¶
- Long bars → significant features (signal)
- Short bars → insignificant features (noise)
# Example: Create a barcode visualization
def plot_barcode(ax, intervals, title, dimension):
    # Sort by birth time
    intervals_sorted = sorted(intervals, key=lambda x: x[0])
    for idx, (birth, death) in enumerate(intervals_sorted):
        if death == np.inf:
            # Essential feature - draw arrow
            ax.arrow(birth, idx, 2.5, 0, head_width=0.15, head_length=0.1,
                     fc='red', ec='red', linewidth=2)
            ax.plot([birth, birth + 2.5], [idx, idx], 'r-', linewidth=3)
        else:
            # Regular feature
            persistence = death - birth
            color = 'darkblue' if persistence > 0.3 else 'lightgray'
            linewidth = 3 if persistence > 0.3 else 1.5
            ax.plot([birth, death], [idx, idx], color=color, linewidth=linewidth)
    ax.set_xlabel('Filtration Parameter (scale)', fontsize=12)
    ax.set_ylabel('Features', fontsize=12)
    ax.set_title(f'{title}\n{dimension}', fontsize=14, fontweight='bold')
    ax.grid(True, alpha=0.3, axis='x')
    ax.set_ylim(-1, len(intervals))
    # Add legend
    from matplotlib.lines import Line2D
    legend_elements = [
        Line2D([0], [0], color='darkblue', linewidth=3, label='Signal (long-lived)'),
        Line2D([0], [0], color='lightgray', linewidth=1.5, label='Noise (short-lived)'),
        Line2D([0], [0], color='red', linewidth=3, label='Essential (death = ∞)')
    ]
    ax.legend(handles=legend_elements, loc='upper right')

# Example data: H₀ (connected components)
h0_intervals = [
    (0.0, np.inf),  # Essential: main component
    (0.0, 0.25),    # Components merge
    (0.0, 0.30),
    (0.0, 0.35),
    (0.0, 0.40),
    (0.0, 0.45),
    (0.0, 0.50),
]
# Example data: H₁ (loops)
h1_intervals = [
    (0.48, 1.80),  # Main loop (SIGNAL)
    (0.25, 0.35),  # Noise
    (0.30, 0.38),  # Noise
    (0.52, 0.61),  # Noise
    (0.45, 0.53),  # Noise
    (0.70, 0.82),  # Small loop
]
# Create subplots with two columns
fig, axes = plt.subplots(1, 2, figsize=(12, 6))
# Plot H₀ barcode on the first axis (left)
plot_barcode(axes[0], h0_intervals, 'Barcode', 'H₀ (Connected Components)')
# Plot H₁ barcode on the second axis (right)
plot_barcode(axes[1], h1_intervals, 'Barcode', 'H₁ (Loops/Holes)')
# Adjust layout for better spacing
plt.tight_layout()
plt.show()
4.2 Persistence Diagrams¶
A persistence diagram is a multiset of points $(b, d)$ in the plane.
Structure:¶
- Each point $(b, d)$ represents a feature born at time $b$ and dying at time $d$
- The $x$-axis represents birth times
- The $y$-axis represents death times
- All points lie above or on the diagonal $y = x$ (since death ≥ birth)
The Diagonal: The line $y = x$
- Represents features with zero persistence (birth = death)
- Points near the diagonal → noise
- Distance from diagonal = persistence
Essential Features: Points with $death = \infty$
- Features that never die in the filtration
- Often represent true topological structure
# Create persistence diagram visualization
def plot_persistence_diagram(ax, points, title, dimension):
    # Separate finite and infinite death times
    finite_points = [(b, d) for b, d in points if d != np.inf]
    essential_points = [(b, d) for b, d in points if d == np.inf]
    # Plot diagonal
    max_val = max([d for b, d in finite_points] + [b for b, d in finite_points]) if finite_points else 2.0
    diag_line = np.linspace(0, max_val * 1.1, 100)
    ax.plot(diag_line, diag_line, 'k--', linewidth=2, label='Diagonal (y=x)', alpha=0.5)
    # Plot finite points
    if finite_points:
        births = [b for b, d in finite_points]
        deaths = [d for b, d in finite_points]
        persistences = [d - b for b, d in finite_points]
        # Color by persistence
        colors = ['red' if p > 0.3 else 'lightblue' for p in persistences]
        sizes = [200 if p > 0.3 else 50 for p in persistences]
        for b, d, c, s in zip(births, deaths, colors, sizes):
            ax.scatter(b, d, c=c, s=s, alpha=0.7, edgecolors='black', linewidth=1)
    # Plot essential points (at top of diagram)
    if essential_points:
        essential_births = [b for b, d in essential_points]
        essential_y = [max_val * 1.05] * len(essential_births)  # Place near top
        ax.scatter(essential_births, essential_y, c='darkred', s=300,
                   marker='^', label='Essential (death = ∞)',
                   edgecolors='black', linewidth=2)
    ax.set_xlabel('Birth', fontsize=13)
    ax.set_ylabel('Death', fontsize=13)
    ax.set_title(f'{title}\n{dimension}', fontsize=14, fontweight='bold')
    ax.grid(True, alpha=0.3)
    ax.set_aspect('equal')
    # Add legend
    from matplotlib.lines import Line2D
    legend_elements = [
        Line2D([0], [0], marker='o', color='w', markerfacecolor='red',
               markersize=12, label='Signal (far from diagonal)'),
        Line2D([0], [0], marker='o', color='w', markerfacecolor='lightblue',
               markersize=8, label='Noise (near diagonal)'),
        Line2D([0], [0], color='k', linestyle='--', linewidth=2, label='Diagonal (y=x)')
    ]
    if essential_points:
        legend_elements.append(
            Line2D([0], [0], marker='^', color='w', markerfacecolor='darkred',
                   markersize=12, label='Essential (death = ∞)')
        )
    ax.legend(handles=legend_elements, loc='lower right', fontsize=10)

# Create subplots with two columns
fig, axes = plt.subplots(1, 2, figsize=(12, 6))
# Plot H₀ persistence diagram on the first axis (left)
plot_persistence_diagram(axes[0], h0_intervals, 'Persistence Diagram', 'H₀ (Connected Components)')
# Plot H₁ persistence diagram on the second axis (right)
plot_persistence_diagram(axes[1], h1_intervals, 'Persistence Diagram', 'H₁ (Loops/Holes)')
# Adjust layout for better spacing
plt.tight_layout()
plt.show()
Interpreting Persistence Diagrams¶
Distance from diagonal = persistence = death - birth
For a point $(b, d)$:
- Perpendicular distance to diagonal: $\frac{d - b}{\sqrt{2}}$
- Larger distance → more significant feature
Key observations:
- Points clustered near diagonal → noise
- Points far from diagonal → topological signal
- Essential features (triangles at top) → persistent structure
- Empty regions → no features with those birth/death times
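A quick numerical check of the distance-to-diagonal formula, using the "main loop" point from the example data above:

```python
import numpy as np

b, d = 0.48, 1.80                 # birth/death of the main H1 loop in the example
p = np.array([b, d])
foot = np.array([(b + d) / 2, (b + d) / 2])   # closest point on the line y = x
dist = np.linalg.norm(p - foot)
# Perpendicular distance to the diagonal equals (d - b) / sqrt(2)
assert np.isclose(dist, (d - b) / np.sqrt(2))
print(dist)
```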
Equivalence of Representations¶
Barcodes and persistence diagrams contain the same information:
- Barcode $[b, d)$ ↔ Diagram point $(b, d)$
- Choice depends on application
- Barcodes: Better for visualizing temporal evolution
- Diagrams: Better for computing distances between datasets
# Side-by-side comparison
def compare_representations(intervals, dimension):
    fig = plt.figure(figsize=(18, 6))
    # Barcode (left)
    ax1 = plt.subplot(1, 2, 1)
    intervals_sorted = sorted([i for i in intervals if i[1] != np.inf], key=lambda x: x[0])
    for idx, (birth, death) in enumerate(intervals_sorted):
        persistence = death - birth
        color = 'darkblue' if persistence > 0.3 else 'lightgray'
        linewidth = 3 if persistence > 0.3 else 1.5
        ax1.plot([birth, death], [idx, idx], color=color, linewidth=linewidth)
    ax1.set_xlabel('Filtration Parameter', fontsize=12)
    ax1.set_ylabel('Features', fontsize=12)
    ax1.set_title(f'Barcode - {dimension}', fontsize=13, fontweight='bold')
    ax1.grid(True, alpha=0.3, axis='x')
    # Persistence diagram (right)
    ax2 = plt.subplot(1, 2, 2)
    finite_points = [(b, d) for b, d in intervals if d != np.inf]
    if finite_points:
        births = [b for b, d in finite_points]
        deaths = [d for b, d in finite_points]
        persistences = [d - b for b, d in finite_points]
        max_val = max(deaths + births)
        diag_line = np.linspace(0, max_val * 1.1, 100)
        ax2.plot(diag_line, diag_line, 'k--', linewidth=2, alpha=0.5)
        colors = ['red' if p > 0.3 else 'lightblue' for p in persistences]
        sizes = [200 if p > 0.3 else 50 for p in persistences]
        for b, d, c, s in zip(births, deaths, colors, sizes):
            ax2.scatter(b, d, c=c, s=s, alpha=0.7, edgecolors='black', linewidth=1)
    ax2.set_xlabel('Birth', fontsize=12)
    ax2.set_ylabel('Death', fontsize=12)
    ax2.set_title(f'Persistence Diagram - {dimension}', fontsize=13, fontweight='bold')
    ax2.grid(True, alpha=0.3)
    ax2.set_aspect('equal')
    plt.suptitle('Two Equivalent Representations of the Same Topological Information',
                 fontsize=15, fontweight='bold', y=1.02)
    plt.tight_layout()
    plt.show()

compare_representations(h1_intervals, 'H₁')
5.2 Bottleneck Distance¶
The bottleneck distance is the standard metric for comparing persistence diagrams.
Definition¶
Given two persistence diagrams $D_1$ and $D_2$, the bottleneck distance is:
$$d_B(D_1, D_2) = \inf_{\gamma} \sup_{p \in D_1} \|p - \gamma(p)\|_\infty$$
where:
- $\gamma$ ranges over all bijections between $D_1 \cup \Delta$ and $D_2 \cup \Delta$
- $\Delta$ is the diagonal (points with birth = death)
- $\|\cdot\|_\infty$ is the supremum norm: $\|(b_1, d_1) - (b_2, d_2)\|_\infty = \max\{|b_1 - b_2|, |d_1 - d_2|\}$
Intuition¶
The bottleneck distance finds the best matching between points in two diagrams, where:
- Each point in $D_1$ is matched to exactly one point in $D_2$ or to the diagonal
- The distance is determined by the worst (maximum) matching cost
- Points can be matched to the diagonal, representing creation or destruction of features
Example: Computing Bottleneck Distance¶
Consider two simple persistence diagrams:
- $D_1 = \{(1, 4), (2, 5)\}$
- $D_2 = \{(1.2, 4.1), (2.1, 4.9)\}$
Matching 1: $(1,4) \leftrightarrow (1.2, 4.1)$ and $(2,5) \leftrightarrow (2.1, 4.9)$
$$\text{Cost}_1 = \max\{0.2, 0.1\} = 0.2$$ $$\text{Cost}_2 = \max\{0.1, 0.1\} = 0.1$$ $$d_B = \max\{0.2, 0.1\} = 0.2$$
This matching is optimal here, so $d_B(D_1, D_2) = 0.2$.
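Optimality can be verified by brute force over all partial matchings, which is feasible only for tiny diagrams (the helper name `bottleneck_brute` is mine, not from the notebook):

```python
import numpy as np
from itertools import combinations, permutations

def bottleneck_brute(D1, D2):
    """Exhaustive bottleneck distance for tiny diagrams: each point is
    matched either to a point of the other diagram or to the diagonal."""
    cost = lambda p, q: max(abs(p[0] - q[0]), abs(p[1] - q[1]))
    diag = lambda p: (p[1] - p[0]) / 2   # sup-norm distance to the diagonal
    best = np.inf
    # choose k points of D1 to match into D2; the rest go to the diagonal
    for k in range(min(len(D1), len(D2)) + 1):
        for idx1 in combinations(range(len(D1)), k):
            for idx2 in permutations(range(len(D2)), k):
                c = max([cost(D1[i], D2[j]) for i, j in zip(idx1, idx2)] +
                        [diag(D1[i]) for i in range(len(D1)) if i not in idx1] +
                        [diag(D2[j]) for j in range(len(D2)) if j not in idx2] +
                        [0.0])
                best = min(best, c)
    return best

D1 = [(1, 4), (2, 5)]
D2 = [(1.2, 4.1), (2.1, 4.9)]
print(bottleneck_brute(D1, D2))  # ≈ 0.2, the matching worked out above
```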
import numpy as np
import matplotlib.pyplot as plt
# Visualize two persistence diagrams
D1 = np.array([[1, 4], [2, 5]])
D2 = np.array([[1.2, 4.1], [2.1, 4.9]])
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 5))
# Plot D1
ax1.scatter(D1[:, 0], D1[:, 1], s=100, c='blue', marker='o', label='D1')
ax1.plot([0, 6], [0, 6], 'k--', alpha=0.3, label='Diagonal')
ax1.set_xlabel('Birth', fontsize=12)
ax1.set_ylabel('Death', fontsize=12)
ax1.set_title('Persistence Diagram D1', fontsize=14)
ax1.legend()
ax1.grid(True, alpha=0.3)
ax1.set_aspect('equal')
# Plot D2
ax2.scatter(D2[:, 0], D2[:, 1], s=100, c='red', marker='s', label='D2')
ax2.plot([0, 6], [0, 6], 'k--', alpha=0.3, label='Diagonal')
ax2.set_xlabel('Birth', fontsize=12)
ax2.set_ylabel('Death', fontsize=12)
ax2.set_title('Persistence Diagram D2', fontsize=14)
ax2.legend()
ax2.grid(True, alpha=0.3)
ax2.set_aspect('equal')
plt.tight_layout()
plt.show()
5.3 Wasserstein Distance¶
The Wasserstein distance (also called Earth Mover's Distance) is a generalization of the bottleneck distance.
Definition¶
The $p$-Wasserstein distance between persistence diagrams $D_1$ and $D_2$ is:
$$W_p(D_1, D_2) = \left(\inf_{\gamma} \sum_{x \in D_1 \cup \Delta} \|x - \gamma(x)\|_\infty^p\right)^{1/p}$$
where $p \geq 1$ and $\gamma$ ranges over all bijections as before.
Special Cases¶
- $p = 1$: Total cost of optimal matching
- $p = 2$: Commonly used in machine learning applications
- $p \to \infty$: Recovers the bottleneck distance
$$d_B(D_1, D_2) = \lim_{p \to \infty} W_p(D_1, D_2)$$
Example: Wasserstein vs Bottleneck¶
Consider:
- $D_1 = \{(1, 3), (2, 4), (3, 5)\}$
- $D_2 = \{(1.1, 3.1), (2.1, 4.1), (3.1, 5.1)\}$
With optimal matching:
Bottleneck: $d_B = 0.1$ (maximum individual cost)
1-Wasserstein: $W_1 = 0.1 + 0.1 + 0.1 = 0.3$ (sum of costs)
2-Wasserstein: $W_2 = \sqrt{0.1^2 + 0.1^2 + 0.1^2} = \sqrt{0.03} \approx 0.173$
# Demonstrate difference between Bottleneck and Wasserstein
def compute_wasserstein_p(D1, D2, p):
    """Simple computation assuming the optimal matching is the identity"""
    if len(D1) != len(D2):
        raise ValueError("Diagrams must have same size for this example")
    costs = np.max(np.abs(D1 - D2), axis=1)  # L-infinity norm per matched pair
    if p == np.inf:
        return np.max(costs)
    else:
        return np.power(np.sum(np.power(costs, p)), 1/p)

# Example diagrams
D1 = np.array([[1, 3], [2, 4], [3, 5]])
D2 = np.array([[1.1, 3.1], [2.1, 4.1], [3.1, 5.1]])
print("Distance comparisons:")
print(f"Bottleneck (p=∞): {compute_wasserstein_p(D1, D2, np.inf):.4f}")
print(f"1-Wasserstein: {compute_wasserstein_p(D1, D2, 1):.4f}")
print(f"2-Wasserstein: {compute_wasserstein_p(D1, D2, 2):.4f}")
print(f"5-Wasserstein: {compute_wasserstein_p(D1, D2, 5):.4f}")
Distance comparisons:
Bottleneck (p=∞): 0.1000
1-Wasserstein: 0.3000
2-Wasserstein: 0.1732
5-Wasserstein: 0.1246
5.4 Stability Theorem¶
The stability theorem is the key theoretical justification for using persistent homology in data analysis.
Statement¶
Let $f, g: X \to \mathbb{R}$ be two continuous functions on a topological space $X$. Let $D_f$ and $D_g$ be their persistence diagrams obtained from sublevel-sets filtrations given by $f$ and $g$. Then:
$$d_B(D_f, D_g) \leq \|f - g\|_\infty$$
where $\|f - g\|_\infty = \sup_{x \in X} |f(x) - g(x)|$ is the uniform distance between functions.
Interpretation¶
This theorem tells us that:
- Small perturbations in the input function lead to small changes in the persistence diagram
- The bottleneck distance between diagrams is bounded by the maximum pointwise difference in functions
- Persistent homology is robust to noise in the data
Practical Implications¶
Noise tolerance: If we add noise $\epsilon$ to our data, features in the persistence diagram can move by at most $\epsilon$.
Feature reliability: Features that persist much longer than the noise level are reliable.
Sampling: When approximating a continuous space with a finite sample, the persistence diagram changes predictably.
Example: Noisy Circle¶
Consider a circle sampled with noise:
- Clean circle: Shows one persistent $H_1$ feature (the loop)
- Noisy circle: The main loop persists, but many small features appear and die quickly
- Stability: The main feature moves slightly, but remains distinguishable
# Demonstrate stability with noisy data
def generate_circle(n_points, radius=1, noise_level=0):
    """Generate points on a circle with optional noise"""
    theta = np.linspace(0, 2*np.pi, n_points, endpoint=False)
    x = radius * np.cos(theta)
    y = radius * np.sin(theta)
    if noise_level > 0:
        x += np.random.normal(0, noise_level, n_points)
        y += np.random.normal(0, noise_level, n_points)
    return np.column_stack([x, y])

# Generate clean and noisy circles
np.random.seed(42)
circle_clean = generate_circle(50, radius=1, noise_level=0)
circle_noisy = generate_circle(50, radius=1, noise_level=0.1)
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 5))
# Plot clean circle
ax1.scatter(circle_clean[:, 0], circle_clean[:, 1], s=50, c='blue', alpha=0.6)
ax1.set_xlabel('x', fontsize=12)
ax1.set_ylabel('y', fontsize=12)
ax1.set_title('Clean Circle', fontsize=14)
ax1.set_aspect('equal')
ax1.grid(True, alpha=0.3)
# Plot noisy circle
ax2.scatter(circle_noisy[:, 0], circle_noisy[:, 1], s=50, c='red', alpha=0.6)
ax2.set_xlabel('x', fontsize=12)
ax2.set_ylabel('y', fontsize=12)
ax2.set_title('Noisy Circle (noise = 0.1)', fontsize=14)
ax2.set_aspect('equal')
ax2.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()
print("Expected behavior:")
print("- Both should show one persistent H1 feature (the loop)")
print("- Noisy version will have many short-lived features")
print("- Main loop feature persists and is stable (moves by ≤ noise level)")
Expected behavior:
- Both should show one persistent H1 feature (the loop)
- Noisy version will have many short-lived features
- Main loop feature persists and is stable (moves by ≤ noise level)
Extended Stability Results¶
The stability theorem extends to:
- Wasserstein distances: Similar bounds hold for $W_p$ distances
- Different filtrations: Rips, Čech, alpha complexes all satisfy stability
- Distance functions: The distance function $d_A(x) = \inf_{a \in A} d(x, a)$ satisfies stability under Hausdorff perturbations
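The distance-function statement can be checked numerically (a sketch with synthetic sets; the helper name `dist_to_set` is mine): perturbing $A$ within Hausdorff distance $\delta$ changes $d_A$ by at most $\delta$ in the sup norm.

```python
import numpy as np
from scipy.spatial.distance import cdist

def dist_to_set(X, A):
    """Distance function d_A(x) = min_{a in A} d(x, a), evaluated at rows of X."""
    return cdist(X, A).min(axis=1)

A = np.array([[0.0, 0.0], [1.0, 0.0]])
B = A + 0.05                      # small perturbation of A
X = np.random.default_rng(1).uniform(-1, 2, size=(500, 2))
# Hausdorff distance between the two finite sets
hausdorff = max(cdist(A, B).min(axis=1).max(), cdist(B, A).min(axis=1).max())
# Stability of distance functions: ||d_A - d_B||_inf <= d_H(A, B)
assert np.abs(dist_to_set(X, A) - dist_to_set(X, B)).max() <= hausdorff + 1e-12
```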
The Boundary Matrix¶
For a filtered simplicial complex $K_1 \subseteq K_2 \subseteq \cdots \subseteq K_n$, we consider simplices ordered by their filtration time.
The boundary matrix $\partial$ has both rows and columns indexed by the simplices (in filtration order), with entry 1 when the row simplex is a codimension-1 face of the column simplex.
For simplicity, all arithmetic is mod 2 (we take coefficients in $\mathbb{Z}_2$).
Matrix Reduction Algorithm¶
Goal: Reduce the boundary matrix to a form where we can read off birth-death pairs.
Algorithm (Standard Reduction):
- Order simplices by filtration value (ties broken arbitrarily, except that faces must precede cofaces)
- Initialize: $R = \partial$ (boundary matrix)
- For each column $j$ from left to right:
  - While column $j$ is non-zero and has the same lowest 1 as some previous column $i < j$: add column $i$ to column $j$ (mod 2)
  - Record the lowest 1 position of column $j$ (if any remains)
Result: In the reduced matrix $R$, no two non-zero columns have their lowest 1 in the same row.
Reading Birth-Death Pairs¶
After reduction:
- Birth: A simplex $\sigma$ creates a feature if its column in the reduced matrix $R$ is zero
- Death: A simplex $\tau$ kills the feature created by $\sigma$ if $\tau$'s column (after reduction) has its lowest 1 in $\sigma$'s row
Example: Computing $H_0$ and $H_1$ for a Triangle¶
Consider building a triangle with filtration:
- Time 0: vertices $\{a, b, c\}$
- Time 1: edges $\{ab, bc, ca\}$
- Time 2: triangle $\{abc\}$
Step 1: List simplices in filtration order
| Index | Simplex | Dimension | Filtration time |
|---|---|---|---|
| 0 | $a$ | 0 | 0 |
| 1 | $b$ | 0 | 0 |
| 2 | $c$ | 0 | 0 |
| 3 | $ab$ | 1 | 1 |
| 4 | $bc$ | 1 | 1 |
| 5 | $ca$ | 1 | 1 |
| 6 | $abc$ | 2 | 2 |
Step 2: Construct boundary matrix
Columns = simplices, Rows = faces (using mod 2 arithmetic)
$$\partial = \begin{pmatrix} & a & b & c & ab & bc & ca & abc \\ a & 0 & 0 & 0 & 1 & 0 & 1 & 0 \\ b & 0 & 0 & 0 & 1 & 1 & 0 & 0 \\ c & 0 & 0 & 0 & 0 & 1 & 1 & 0 \\ ab & 0 & 0 & 0 & 0 & 0 & 0 & 1 \\ bc & 0 & 0 & 0 & 0 & 0 & 0 & 1 \\ ca & 0 & 0 & 0 & 0 & 0 & 0 & 1 \end{pmatrix}$$
The first three columns (vertices) are empty since vertices have no boundary.
Step 3: Reduce the matrix
Process columns left to right:
- Columns 0,1,2 (vertices): already empty, no action needed
- Column 3 ($ab$): boundary $(1,1,0)$ on rows $(a,b,c)$; lowest 1 at row $b$ (index 1)
- Column 4 ($bc$): boundary $(0,1,1)$; lowest 1 at row $c$ (index 2)
- Column 5 ($ca$): boundary $(1,0,1)$; lowest 1 at row $c$ (index 2), same as column 4: CONFLICT!
  - Add column 4 to column 5: $(1,0,1) + (0,1,1) = (1,1,0)$
  - New lowest 1 at row $b$ (index 1), same as column 3: CONFLICT!
  - Add column 3 to column 5: $(1,1,0) + (1,1,0) = (0,0,0)$
  - Column 5 is now zero: adding the edge $ca$ creates a 1-cycle
After processing the edges, the reduced edge columns are:
$$\begin{pmatrix} & ab & bc & ca \\ a & 1 & 0 & 0 \\ b & 1 & 1 & 0 \\ c & 0 & 1 & 0 \end{pmatrix}$$
- Column 6 ($abc$): boundary $(1,1,1)$ on rows $(ab, bc, ca)$; its lowest 1 is at row $ca$ (index 5), which no earlier column owns, so no reduction is needed
Step 4: Extract persistence pairs
For $H_0$ (connected components):
- Simplex $a$ (index 0): column empty → birth at time 0
- Simplex $b$ (index 1): column empty → birth at time 0
- Simplex $c$ (index 2): column empty → birth at time 0
- Simplex $ab$ (index 3): lowest 1 at row $b$ → kills $b$ at time 1
- Simplex $bc$ (index 4): lowest 1 at row $c$ → kills $c$ at time 1
Persistence pairs for $H_0$:
- $(b, ab)$: birth 0, death 1 → (0, 1)
- $(c, bc)$: birth 0, death 1 → (0, 1)
- $a$ is never killed → (0, ∞) (essential feature)
For $H_1$ (loops):
- Edge $ca$ (index 5): its reduced column is zero, so adding $ca$ creates a 1-cycle (the loop $ab + bc + ca$), born at time 1
- Triangle $abc$ (index 6): its column has lowest 1 at row $ca$, so $abc$ fills the loop created by $ca$ at time 2
- Persistence pair for $H_1$: $(ca, abc)$, i.e. birth 1, death 2 → (1, 2)
Summary of Algorithm¶
- Build boundary matrix from filtered complex
- Reduce matrix using column operations (mod 2)
- Extract pairs:
- Empty columns → birth of feature
- Lowest 1 in row $i$ of column $j$ → simplex $j$ kills feature born at simplex $i$
- Persistence diagram: plot (birth, death) pairs
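The steps above can be sketched in a few lines of NumPy and checked against the triangle example (a minimal illustration of the standard reduction over $\mathbb{Z}_2$, not an optimized implementation):

```python
import numpy as np

def low(col):
    """Row index of the lowest 1 in a column, or -1 if the column is zero."""
    nz = np.flatnonzero(col)
    return int(nz[-1]) if len(nz) else -1

def reduce_and_pair(D, times):
    """Standard column reduction over Z/2 followed by pair extraction.
    D: (n x n) 0/1 boundary matrix, simplices ordered by filtration time.
    Returns (finite (birth, death) pairs, birth times of essential features)."""
    R = D.copy() % 2
    owner = {}                                  # lowest-1 row -> owning column
    for j in range(R.shape[1]):
        while low(R[:, j]) != -1 and low(R[:, j]) in owner:
            R[:, j] = (R[:, j] + R[:, owner[low(R[:, j])]]) % 2
        if low(R[:, j]) != -1:
            owner[low(R[:, j])] = j
    pairs, paired = [], set()
    for j in range(R.shape[1]):
        i = low(R[:, j])
        if i != -1:
            pairs.append((times[i], times[j]))
            paired.update((i, j))
    # zero columns that are never killed correspond to essential features
    essential = [times[j] for j in range(R.shape[1]) if j not in paired]
    return pairs, essential

# Triangle example: simplices a, b, c, ab, bc, ca, abc in filtration order
D = np.zeros((7, 7), dtype=int)
D[[0, 1], 3] = 1      # boundary of ab
D[[1, 2], 4] = 1      # boundary of bc
D[[0, 2], 5] = 1      # boundary of ca
D[[3, 4, 5], 6] = 1   # boundary of abc
pairs, essential = reduce_and_pair(D, times=[0, 0, 0, 1, 1, 1, 2])
print(pairs)      # [(0, 1), (0, 1), (1, 2)]: two H0 pairs and the H1 pair
print(essential)  # [0]: the surviving connected component
```

The output reproduces the worked example: two $H_0$ pairs (0, 1), one essential component born at 0, and the $H_1$ pair (1, 2).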
6.2 Computational Complexity¶
Bottlenecks¶
The complexity of persistent homology computation is dominated by:
- Filtration size: Number of simplices in the complex
- Matrix reduction: Can be $O(n^3)$ in worst case, where $n$ is the number of simplices
Rips Complex Growth¶
For a point cloud with $N$ points, the Vietoris-Rips complex at dimension $d$ can have:
$$\text{Number of simplices up to dimension } d \sim O\left(N^{d+1}\right)$$
This growth, polynomial in $N$ but exponential in $d$, makes computation expensive for:
- Large point clouds ($N > 1000$)
- High dimensions ($d > 3$)
- Dense point clouds
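The bound above is easy to tabulate; a small helper (illustrative only) shows how quickly it grows with $N$:

```python
from math import comb

def rips_simplex_bound(N, d):
    """Upper bound on the number of simplices of dimension <= d in a
    Vietoris-Rips complex on N points: every (k+1)-subset of points
    appears once the scale parameter is large enough."""
    return sum(comb(N, k + 1) for k in range(d + 1))

print(rips_simplex_bound(50, 2))    # 50 + 1225 + 19600 = 20875
print(rips_simplex_bound(1000, 2))  # 166667500: over 166 million simplices
```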
Practical Strategies¶
- Subsample large point clouds
- Limit maximum dimension (usually $d \leq 2$ suffices)
- Bound filtration parameter (maximum edge length)
- Use sparse representations
- Apply cohomology algorithms (dual computation)
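The first strategy, subsampling, is often done greedily so the sample covers the cloud evenly. A minimal farthest-point (maxmin) sketch, assuming only NumPy:

```python
import numpy as np

def maxmin_subsample(points, k, seed=0):
    """Greedy farthest-point subsampling: starting from a random point,
    repeatedly add the point farthest from the current sample. The result
    covers the cloud more evenly than uniform random subsampling."""
    points = np.asarray(points, dtype=float)
    rng = np.random.default_rng(seed)
    chosen = [int(rng.integers(len(points)))]
    dist = np.linalg.norm(points - points[chosen[0]], axis=1)
    for _ in range(k - 1):
        nxt = int(np.argmax(dist))          # farthest point from the sample
        chosen.append(nxt)
        dist = np.minimum(dist, np.linalg.norm(points - points[nxt], axis=1))
    return points[chosen]

# Reduce a 500-point cloud to 20 well-spread landmarks
cloud = np.random.default_rng(1).normal(size=(500, 2))
landmarks = maxmin_subsample(cloud, 20)
print(landmarks.shape)  # (20, 2)
```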
7. Interpretation & Usage¶
Signal vs. Noise Separation¶
Key Principle: Persistent features = signal, short-lived features = noise
Persistence as a Filter:
- Long persistence → topological feature survives across many scales → likely real structure
- Short persistence → feature appears and disappears quickly → likely noise
Quantitative threshold: $$\text{Keep feature } (b, d) \text{ if } |d - b| > \tau$$
where $\tau$ is a chosen persistence threshold.
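Applied to a diagram stored as an array of (birth, death) rows, the thresholding rule is a one-liner (a sketch; the right $\tau$ is data-dependent):

```python
import numpy as np

def filter_diagram(diagram, tau):
    """Keep only the (birth, death) points whose persistence exceeds tau."""
    diagram = np.asarray(diagram, dtype=float)
    return diagram[diagram[:, 1] - diagram[:, 0] > tau]

noisy = [[0.0, 1.0], [0.2, 0.25], [0.5, 0.52]]
print(filter_diagram(noisy, 0.1))  # only the long-lived feature (0, 1) survives
```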
Topological Features for Data Analysis¶
1. Vectorization of Persistence Diagrams
Since machine learning algorithms typically require fixed-dimensional feature vectors, we convert persistence diagrams:
A. Persistence Statistics: $$\text{features} = \{\max(p_i), \text{mean}(p_i), \sum p_i^2, \ldots\}$$ where $p_i = d_i - b_i$ is persistence.
B. Persistence Images: Convert diagram to a 2D grid with Gaussian weighting: $$\rho(x,y) = \sum_{(b_i,d_i) \in D} w(d_i - b_i) \cdot \mathcal{N}((x,y) | (b_i, d_i), \Sigma)$$
where $w(p) = p$ weights by persistence.
C. Persistence Landscapes: Piecewise-linear functions capturing diagram structure: $$\lambda_k(t) = k\text{-th largest value of } \min(t - b_i, d_i - t)_+$$ where $(x)_+ = \max(x, 0)$.
D. Betti Curves: $$\beta_k(t) = \text{rank}(H_k(K_t))$$ counts features alive at scale $t$.
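Of the four vectorizations, the Betti curve is the simplest to implement directly from a diagram (a minimal sketch; real pipelines would typically use a library such as `gudhi`):

```python
import numpy as np

def betti_curve(diagram, ts):
    """Number of (birth, death) intervals alive at each scale t
    (birth <= t < death), evaluated on a grid ts."""
    diagram = np.asarray(diagram, dtype=float)
    return np.array([int(np.sum((diagram[:, 0] <= t) & (t < diagram[:, 1])))
                     for t in ts])

# Two intervals: one born at 0, one at 1; both eventually die
print(betti_curve([[0, 2], [1, 3]], [0.5, 1.5, 2.5]))  # [1 2 1]
```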
2. Feature Extraction Pipeline
Raw Data → Filtration → Persistence Diagram → Vectorization → ML Algorithm
↓ ↓ ↓ ↓ ↓
Points Simplicial Birth-Death Feature Vector Classification/
Complex Pairs (fixed-dim) Regression
Practical Use Cases¶
Example 1: Texture Classification
Given images of different materials:
- Compute sublevel set filtration on pixel intensities
- Extract persistence diagrams for $H_0$ and $H_1$
- Features distinguish:
- $H_0$: blob-like structures (bright regions)
- $H_1$: holes and voids (dark regions)
- Train classifier on persistence features
Result: Topological features capture texture patterns independent of exact pixel positions.
Example 2: Time Series Analysis
For a periodic signal (e.g., heartbeat ECG):
- Takens embedding: Convert time series to point cloud $$\mathbf{x}_i = (s_i, s_{i+\tau}, s_{i+2\tau}, \ldots, s_{i+(d-1)\tau})$$
- Build Rips complex on embedded points
- Compute persistence:
- Periodic signals → prominent $H_1$ feature (closed loop in embedding)
- Noise → many short-lived features
- Detect anomalies: Changes in $H_1$ persistence indicate irregularities
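The Takens embedding itself is a short sliding-window construction; a pure-NumPy sketch (`dim` and `delay` are illustrative parameter names, not from a library):

```python
import numpy as np

def takens_embedding(signal, dim, delay):
    """Delay embedding: x_i = (s_i, s_{i+delay}, ..., s_{i+(dim-1)*delay}).
    A periodic signal traces out a closed loop in the embedded space,
    which shows up as a prominent H1 feature."""
    signal = np.asarray(signal, dtype=float)
    n = len(signal) - (dim - 1) * delay      # number of embedded points
    return np.stack([signal[i:i + n] for i in range(0, dim * delay, delay)],
                    axis=1)

# A sine wave embedded in 2D traces out (roughly) a circle
s = np.sin(np.linspace(0, 4 * np.pi, 100))
cloud = takens_embedding(s, dim=2, delay=12)
print(cloud.shape)  # (88, 2)
```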
Summary: Why Persistent Homology Works¶
Theoretical guarantees:
- Stability theorem: Small data perturbations → small diagram changes
- Completeness: Captures all homological features across all scales
- Well-defined metrics: Bottleneck and Wasserstein distances
Practical advantages:
- Noise filtering: Persistence distinguishes structure from noise
- Feature extraction: Provides interpretable topological descriptors
- Generalization: Works across diverse data types and applications
- Robustness: Inherently stable to small perturbations
The persistence paradigm:
"Features that persist across scales are more likely to represent genuine structure than noise."
This simple principle, backed by rigorous mathematical theory, makes persistent homology a powerful tool for modern data analysis.
Comparison: Traditional vs. Topological Features¶
| Aspect | Traditional Features | Topological Features |
|---|---|---|
| Invariance | Often coordinate-dependent | Coordinate-free |
| Noise | Sensitive to outliers | Stable by design |
| Scale | Single scale or fixed scales | Multi-scale automatically |
| Interpretation | Domain-specific | Geometric/topological |
| Computation | Often fast | Can be expensive |
| Applicability | Dataset-specific | General purpose |
Best practice: Combine both types of features for optimal performance!
Further Reading¶
Foundational papers:
- Edelsbrunner et al. (2002): "Topological Persistence and Simplification"
- Carlsson (2009): "Topology and Data" (survey paper)
- Cohen-Steiner et al. (2007): "Stability of Persistence Diagrams"
Programming tasks¶
Task 1: Persistence Diagrams for Shape Analysis of Point Clouds¶
Write a Python program that uses persistence diagrams to analyze the shape of point clouds in space.
Background:¶
Persistent homology is a method from topological data analysis that tracks topological features (connected components, holes, voids) across multiple scales. A persistence diagram visualizes these features by plotting their birth and death times during a filtration process. Features that persist longer are considered more significant to the underlying shape.
Input:¶
- The first line contains an integer $n$, the number of points.
- The next line contains a single character: `2` for 2D points or `3` for 3D points.
- The last $n$ lines each contain the coordinates of a point (space-separated floats).
Output:¶
- First, print the number of persistent features found in dimension 0 (connected components).
- Then, print the number of persistent features found in dimension 1 (loops/holes).
- For each dimension (0, then 1), print the persistence diagram: each line contains two floats (birth time, death time) for one feature.
- Finally, print a single line with either `CONNECTED`, `CLUSTERED`, or `HAS_HOLES`, based on the analysis of the persistence diagram.
Notes:¶
- Use Rips or Alpha filtration to build the filtered simplicial complex.
- Features with persistence (death - birth) below a threshold can be considered noise.
- You may use libraries such as `gudhi` or `ripser` for computing persistence diagrams.
- For classification: `CONNECTED` if there is one dominant H₀ feature; `CLUSTERED` if there are multiple significant H₀ features; `HAS_HOLES` if there are significant H₁ features.
Task 2: Computing Bottleneck Distance Between Persistence Diagrams¶
Write a Python program that computes the bottleneck distance between two persistence diagrams.
Background:¶
The bottleneck distance between two persistence diagrams is a measure of similarity between topological spaces. It is defined as the infimum over all bijections between the diagrams (including diagonal points) of the maximum distance between matched points. The distance between two points $(b_1, d_1)$ and $(b_2, d_2)$ is typically measured using the $L_\infty$ norm: $\max(|b_1 - b_2|, |d_1 - d_2|)$.
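For very small diagrams, the definition can be checked directly by brute force over all matchings (an illustrative sketch only, since it has factorial complexity; real implementations such as `gudhi`'s built-in routine use efficient matching algorithms):

```python
from itertools import permutations

def _diag(p):
    """Nearest diagonal point to p in the L-infinity metric."""
    m = (p[0] + p[1]) / 2.0
    return (m, m)

def bottleneck_brute(A, B):
    """Bottleneck distance between tiny diagrams A and B, each a list of
    (birth, death) pairs. Each diagram is augmented with the diagonal
    projections of the other's points; matching diagonal to diagonal is free."""
    A_aug = list(A) + [_diag(q) for q in B]
    B_aug = list(B) + [_diag(p) for p in A]
    n = len(A_aug)
    best = float('inf')
    for perm in permutations(range(n)):
        cost = 0.0
        for i, j in enumerate(perm):
            if i >= len(A) and j >= len(B):   # diagonal matched to diagonal
                continue
            p, q = A_aug[i], B_aug[j]
            cost = max(cost, max(abs(p[0] - q[0]), abs(p[1] - q[1])))
        best = min(best, cost)
    return best

# Matching a single feature to the diagonal costs half its persistence
print(bottleneck_brute([(0.0, 1.0)], []))  # 0.5
```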
Input:¶
- The first line contains an integer $k$, the number of points in the first persistence diagram.
- The next $k$ lines each contain two floats: birth time and death time of a feature.
- The next line contains an integer $m$, the number of points in the second persistence diagram.
- The next $m$ lines each contain two floats: birth time and death time of a feature.
Output:¶
- Print a single float: the bottleneck distance between the two persistence diagrams, rounded to 6 decimal places.
Notes:¶
- Points can be matched to the diagonal (birth = death) with cost equal to half the persistence.
- Consider using `scipy.spatial.distance` or similar tools for distance computation.
- Libraries like `gudhi` provide built-in bottleneck distance functions.
Task 3: Topological Analysis of Images Using Persistence Diagrams¶
Write a Python program that performs topological analysis of a grayscale image using persistence diagrams.
Background:¶
Images can be analyzed topologically by treating pixel intensities as a height function. By applying sublevel set filtration (considering pixels below increasing intensity thresholds), we can track how connected components merge and how holes appear and disappear. This reveals the topological structure of the image.
Input:¶
- The first line contains two integers $h$ and $w$: the height and width of the image.
- The next $h$ lines each contain $w$ space-separated integers (0-255): the pixel intensities in row-major order.
Output:¶
- First, print the number of significant features in dimension 0 (connected components).
- Then, print the number of significant features in dimension 1 (holes).
- For each dimension (0, then 1), print the persistence pairs: each line contains two integers (birth intensity, death intensity).
- Finally, print a summary line describing the dominant topological features found.
Notes:¶
- Use sublevel set filtration: consider the cubical complex formed by pixels.
- A feature is "significant" if its persistence exceeds a threshold (e.g., 10% of the intensity range).
- You may use libraries such as `gudhi` (cubical complex) for computing persistence.
- Consider 4-connectivity or 8-connectivity for neighboring pixels.
- The summary should describe features like "SIMPLE" (few features), "COMPLEX" (many features), or "CONTAINS_HOLES" (significant H₁ features).