Visualizing High-Dimensional Data

Why is ML Hard?

This demo shows data points in a 20-dimensional space, projected down to 2D. The original data has 3 clusters that are easily separable in high-dimensional space (a sketch of one way to generate such data follows the list below), but:

  • Information is lost when projecting from high dimensions to 2D
  • Different projection methods preserve different aspects of the data
  • The "curse of dimensionality" makes distance measures less meaningful
  • Visualizing high-dimensional relationships becomes extremely difficult
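
As a minimal sketch of the setup described above, assuming scikit-learn's make_blobs and illustrative parameters (300 points, cluster spread of 2.0); the demo's actual generator may differ:

```python
import numpy as np
from sklearn.datasets import make_blobs

# Assumed parameters for illustration: 300 points, 20 features, 3 clusters.
# A larger cluster_std makes the clusters harder to separate after projection.
X, labels = make_blobs(
    n_samples=300,
    n_features=20,
    centers=3,
    cluster_std=2.0,
    random_state=0,
)

print(X.shape)              # (300, 20) -- 300 points in 20-dimensional space
print(np.bincount(labels))  # roughly 100 points per cluster
```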

PCA finds the directions of maximum variance in the data and projects it onto a lower-dimensional subspace.
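A minimal PCA sketch using scikit-learn, assuming data like the X generated above; the resulting 2D coordinates are what a scatter plot in this demo would show:

```python
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA

# Recreate the assumed 20-dimensional, 3-cluster data from the sketch above.
X, labels = make_blobs(n_samples=300, n_features=20, centers=3,
                       cluster_std=2.0, random_state=0)

# Fit PCA and keep the two directions of maximum variance.
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X)

print(X_2d.shape)                     # (300, 2)
print(pca.explained_variance_ratio_)  # fraction of variance each axis retains
```

The explained-variance ratios give a rough sense of how much of the original structure survives the projection.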

What This Demonstrates:

  1. Dimensionality Reduction Trade-offs: Each method (PCA, t-SNE, UMAP) makes different compromises when reducing dimensions
  2. Information Loss: Notice how clusters that are separate in high-dimensional space might overlap in 2D (a rough way to quantify this is sketched after this list)
  3. Feature Importance: In real ML problems, determining which dimensions (features) matter most is challenging
  4. Visualization Limits: Humans can only visualize 2D/3D, but ML models work in much higher dimensions
  5. The Reality: Real-world ML often deals with thousands or millions of dimensions, making this problem far more complex
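
One way to make the information-loss point concrete is to compare cluster separation before and after projection. The following is a sketch under the same assumed data as above, using scikit-learn's PCA and t-SNE and the silhouette score as a simple separation measure; all parameters are illustrative:

```python
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE
from sklearn.metrics import silhouette_score

# Assumed data: 300 points, 20 dimensions, 3 clusters (illustrative parameters).
X, labels = make_blobs(n_samples=300, n_features=20, centers=3,
                       cluster_std=2.0, random_state=0)

# Two different 2D embeddings of the same data.
X_pca = PCA(n_components=2).fit_transform(X)
X_tsne = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(X)

# The silhouette score measures how well the known clusters are separated:
# higher means cleaner separation. Comparing the original space with each
# projection gives a rough, quantitative picture of what is lost in 2D.
for name, data in [("20-D original", X), ("PCA 2-D", X_pca), ("t-SNE 2-D", X_tsne)]:
    print(f"{name:14s} silhouette = {silhouette_score(data, labels):.3f}")
```

PCA and t-SNE typically rank differently here because they preserve different things: PCA keeps global variance, while t-SNE emphasizes local neighborhoods.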