
TSNE

t-distributed Stochastic Neighbor Embedding: a popular non-linear method for visualising high-dimensional data in 2-D/3-D.

The examples/manifold/tsne.cpp program embeds three 4-D Gaussian clusters into 2-D and renders them with SkigenPlot:

Figure: Three 4-D Gaussian clusters embedded into 2-D by exact Skigen::TSNE

Algorithm

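t-SNE models pairwise similarities as probabilities in both the input space and the embedding space, then minimises the KL divergence between the two distributions by gradient descent. The embedding uses a heavy-tailed Student-t kernel to mitigate the crowding problem. v1.1.0 implements the exact (non-Barnes-Hut) gradient, whose cost per iteration grows quadratically with the number of samples.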
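The objective described above can be sketched in plain C++. This is a minimal illustration of the textbook symmetric formulation, not code taken from Skigen: the helper names student_t_similarities and kl_divergence, and the row-major std::vector layout, are assumptions made for the example.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of Skigen): low-dimensional similarities
//   q_ij = (1 + |y_i - y_j|^2)^-1 / sum_{k != l} (1 + |y_k - y_l|^2)^-1
// Y is a row-major n x d embedding; the result is a row-major n x n matrix.
std::vector<double> student_t_similarities(const std::vector<double>& Y,
                                           std::size_t n, std::size_t d) {
    assert(n >= 2 && Y.size() == n * d);
    std::vector<double> Q(n * n, 0.0);
    double total = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            if (i == j) continue;  // q_ii stays 0 by definition
            double dist2 = 0.0;
            for (std::size_t k = 0; k < d; ++k) {
                const double diff = Y[i * d + k] - Y[j * d + k];
                dist2 += diff * diff;
            }
            Q[i * n + j] = 1.0 / (1.0 + dist2);  // heavy-tailed Student-t kernel
            total += Q[i * n + j];
        }
    }
    for (double& q : Q) q /= total;  // normalise so all entries sum to 1
    return Q;
}

// Hypothetical helper: KL(P || Q) = sum p_ij * log(p_ij / q_ij).
// Entries where either probability is zero are skipped.
double kl_divergence(const std::vector<double>& P, const std::vector<double>& Q) {
    assert(P.size() == Q.size());
    double kl = 0.0;
    for (std::size_t idx = 0; idx < P.size(); ++idx) {
        if (P[idx] > 0.0 && Q[idx] > 0.0) {
            kl += P[idx] * std::log(P[idx] / Q[idx]);
        }
    }
    return kl;
}
```

Both loops visit every pair of points, which is why an exact implementation scales quadratically with the number of samples.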

Constructor

Skigen::TSNE<Scalar> model(int n_components = 2, Scalar perplexity = 30.0, Scalar learning_rate = 200.0, int max_iter = 1000, uint64_t random_state = 0);

Parameters

| Parameter | Default | Description |
| --- | --- | --- |
| n_components | 2 | Embedding dimensionality. |
| perplexity | 30.0 | Effective neighbourhood size. |
| learning_rate | 200.0 | Gradient-descent step size. |
| max_iter | 1000 | Optimisation iterations. |
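To make "effective neighbourhood size" concrete, the sketch below computes the perplexity of one point's Gaussian neighbour distribution and bisects on the kernel precision until a target perplexity is reached. This is the standard calibration from the t-SNE literature. Whether Skigen performs it in exactly this way is an assumption, and the function names perplexity_of, conditional_probabilities and find_beta are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Perplexity of a discrete distribution: 2^H(p), with H the entropy in bits.
// A uniform distribution over k neighbours has perplexity exactly k.
double perplexity_of(const std::vector<double>& p) {
    double entropy = 0.0;
    for (double pj : p) {
        if (pj > 0.0) entropy -= pj * std::log2(pj);
    }
    return std::exp2(entropy);
}

// Gaussian neighbour probabilities for one point, given its squared distances
// to the other points and the precision beta = 1 / (2 * sigma^2).
std::vector<double> conditional_probabilities(const std::vector<double>& dist2,
                                              double beta) {
    assert(!dist2.empty());
    // Shift by the smallest distance so the largest exponent is 0 (avoids underflow).
    const double dmin = *std::min_element(dist2.begin(), dist2.end());
    std::vector<double> p(dist2.size());
    double total = 0.0;
    for (std::size_t j = 0; j < dist2.size(); ++j) {
        p[j] = std::exp(-beta * (dist2[j] - dmin));
        total += p[j];
    }
    for (double& v : p) v /= total;
    return p;
}

// Bisection on beta: perplexity shrinks as beta grows (narrower kernel).
double find_beta(const std::vector<double>& dist2, double target,
                 int max_steps = 100, double tol = 1e-5) {
    double lo = 0.0;
    double hi = std::numeric_limits<double>::infinity();
    double beta = 1.0;
    for (int step = 0; step < max_steps; ++step) {
        const double perp = perplexity_of(conditional_probabilities(dist2, beta));
        if (std::fabs(perp - target) < tol) break;
        if (perp > target) {  // kernel too wide: increase beta
            lo = beta;
            beta = std::isinf(hi) ? beta * 2.0 : 0.5 * (beta + hi);
        } else {              // kernel too narrow: decrease beta
            hi = beta;
            beta = 0.5 * (beta + lo);
        }
    }
    return beta;
}
```

A target perplexity only makes sense when it lies between 1 and the number of other points, so the value passed to the constructor should be smaller than the sample count.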

Methods

| Method | Description |
| --- | --- |
| fit_transform(X) | Return the embedding. |

Fitted Attributes

| Accessor | Description |
| --- | --- |
| kl_divergence() | Final KL divergence. |
| n_iter() | Iterations run. |

Example

// Embed X into 2-D with perplexity 30; the remaining parameters keep their defaults.
Skigen::TSNE<double> tsne(2, 30.0);
auto Y = tsne.fit_transform(X);

Verified against scikit-learn

This estimator is checked by the parity suite. See the generator tests/parity/generate_manifold_reference.py and the reference fixtures in tests/parity/data/tsne/, exercised by tests/parity/parity_manifold.cpp.

API Reference

For full signatures see the TSNE API Reference.