
UMAP

#include <Skigen/Manifold>

template <typename Scalar = double>
class Skigen::UMAP(n_components=2, n_neighbors=15, min_dist=0.1, learning_rate=1, n_epochs=200, negative_sample_rate=5, random_state=std::nullopt)

Uniform Manifold Approximation and Projection (UMAP).

Non-linear dimensionality reduction that aims to preserve local neighborhood structure while retaining much of the global structure of the data. Constructs a weighted k-nearest-neighbor (KNN) graph in the high-dimensional space, then optimises a low-dimensional layout via stochastic gradient descent (SGD) on a cross-entropy objective.

Mirrors umap-learn.
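
For orientation, the cross-entropy objective used by umap-learn can be written as below. This is the formulation from the reference implementation, not something confirmed for Skigen's internals: w_ij denotes the membership strength of edge (i, j) in the high-dimensional graph, y_i the embedded points, and a, b are curve parameters that umap-learn derives from min_dist.

```latex
C = \sum_{i \neq j} \left[ w_{ij} \log \frac{w_{ij}}{v_{ij}}
    + (1 - w_{ij}) \log \frac{1 - w_{ij}}{1 - v_{ij}} \right],
\qquad
v_{ij} = \frac{1}{1 + a \, \lVert y_i - y_j \rVert^{2b}}
```

The first term pulls connected points together; the second pushes unconnected points apart and is approximated in practice by negative sampling.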


Parameters:

  • n_components : int, default=2 Dimension of the embedding.

  • n_neighbors : int, default=15 Number of nearest neighbors that define each point's local neighborhood.

  • min_dist : Scalar, default=0.1 Minimum distance between points in the embedding.

  • learning_rate : Scalar, default=1 Initial SGD learning rate.

  • n_epochs : int, default=200 Number of optimisation epochs.

  • negative_sample_rate : int, default=5 Number of negative samples drawn per positive edge.

  • random_state : std::optional< uint64_t >, default=std::nullopt Optional RNG seed.
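
To show how learning_rate, n_epochs, negative_sample_rate and random_state typically interact, here is a minimal standalone sketch of one SGD edge update in the style of umap-learn. It does not use Skigen, and every name in it (epoch_learning_rate, attract, repel, update_edge) is hypothetical. The gradient coefficients, the clipping to [-4, 4] and the linear learning-rate decay follow umap-learn; whether Skigen does the same is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

using Point = std::vector<double>;

// Clip a gradient component to [-4, 4], as umap-learn does.
inline double clip(double g) { return std::max(-4.0, std::min(4.0, g)); }

// Learning rate for a 0-based epoch, decayed linearly from learning_rate to 0.
inline double epoch_learning_rate(double learning_rate, int epoch, int n_epochs) {
    return learning_rate * (1.0 - static_cast<double>(epoch) / n_epochs);
}

// Squared Euclidean distance between two points of equal dimension.
inline double sq_dist(const Point& p, const Point& q) {
    assert(p.size() == q.size());
    double s = 0.0;
    for (std::size_t k = 0; k < p.size(); ++k) s += (p[k] - q[k]) * (p[k] - q[k]);
    return s;
}

// Attractive step along a positive edge: moves head and tail towards each other.
void attract(Point& head, Point& tail, double a, double b, double alpha) {
    const double d2 = sq_dist(head, tail);
    if (d2 <= 0.0) return;  // coincident points: nothing to pull
    const double coeff =
        -2.0 * a * b * std::pow(d2, b - 1.0) / (a * std::pow(d2, b) + 1.0);
    for (std::size_t k = 0; k < head.size(); ++k) {
        const double g = clip(coeff * (head[k] - tail[k]));
        head[k] += alpha * g;
        tail[k] -= alpha * g;
    }
}

// Repulsive step against a negative sample: moves head away from other.
void repel(Point& head, const Point& other, double a, double b, double alpha) {
    const double d2 = sq_dist(head, other);
    if (d2 <= 0.0) return;  // no direction to push along in this sketch
    const double coeff =
        2.0 * b / ((0.001 + d2) * (a * std::pow(d2, b) + 1.0));
    for (std::size_t k = 0; k < head.size(); ++k)
        head[k] += alpha * clip(coeff * (head[k] - other[k]));
}

// One update for positive edge (i, j): a single attractive step followed by
// negative_sample_rate repulsive steps against uniformly drawn points.
void update_edge(std::vector<Point>& Y, std::size_t i, std::size_t j,
                 int negative_sample_rate, double a, double b, double alpha,
                 std::mt19937_64& rng) {
    attract(Y[i], Y[j], a, b, alpha);
    std::uniform_int_distribution<std::size_t> pick(0, Y.size() - 1);
    for (int s = 0; s < negative_sample_rate; ++s) {
        const std::size_t k = pick(rng);
        if (k == i) continue;  // never repel a point from itself
        repel(Y[i], Y[k], a, b, alpha);
    }
}
```

For min_dist=0.1, umap-learn's curve fit gives roughly a ≈ 1.577 and b ≈ 0.895. A fixed seed for the generator plays the role of random_state: the same seed yields the same sequence of negative samples.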


Attributes:

  • embedding : MatrixType Low-dimensional embedding (n_samples x n_components).

Methods

fit()

Fit the UMAP model to training data X.

Builds the fuzzy KNN graph, computes membership strengths, then runs SGD to optimise the low-dimensional layout.

Parameters:

  • X Training data of shape (n_samples, n_features).

Returns:

  • result Reference to the fitted transformer (*this).
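
The graph-construction step described above can be sketched as follows. This is a standalone illustration of the membership-strength computation as done in umap-learn, not Skigen's actual code. LocalScale, smooth_knn, memberships and fuzzy_union are hypothetical names, and the input is assumed to be one point's sorted distances to its k >= 3 nearest neighbours, excluding the point itself.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <limits>
#include <vector>

// Per-point scale: rho is the distance to the nearest neighbour,
// sigma the bandwidth found by binary search.
struct LocalScale {
    double rho;
    double sigma;
};

// Find sigma such that sum_j exp(-max(0, d_j - rho) / sigma) == log2(k),
// where k = dists.size(). The sum grows with sigma, so bisection applies.
LocalScale smooth_knn(const std::vector<double>& dists, int n_iter = 64) {
    assert(dists.size() >= 3);  // smaller k leaves the target unreachable
    const double rho = dists.front();
    const double target = std::log2(static_cast<double>(dists.size()));
    double lo = 0.0;
    double hi = std::numeric_limits<double>::infinity();
    double mid = 1.0;
    for (int it = 0; it < n_iter; ++it) {
        double sum = 0.0;
        for (double d : dists) sum += std::exp(-std::max(0.0, d - rho) / mid);
        if (std::fabs(sum - target) < 1e-5) break;
        if (sum > target) {  // sigma too large: shrink the upper bound
            hi = mid;
            mid = 0.5 * (lo + hi);
        } else {             // sigma too small: raise the lower bound
            lo = mid;
            mid = std::isinf(hi) ? 2.0 * mid : 0.5 * (lo + hi);
        }
    }
    return {rho, mid};
}

// Directed membership strengths from one point to each of its neighbours.
std::vector<double> memberships(const std::vector<double>& dists,
                                const LocalScale& s) {
    std::vector<double> w;
    w.reserve(dists.size());
    for (double d : dists)
        w.push_back(std::exp(-std::max(0.0, d - s.rho) / s.sigma));
    return w;
}

// Symmetrise two directed strengths with the probabilistic fuzzy union.
inline double fuzzy_union(double w_ij, double w_ji) {
    return w_ij + w_ji - w_ij * w_ji;
}
```

The nearest neighbour always receives strength 1, and the fuzzy union makes the resulting graph undirected before the layout is optimised.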

transform()

Return the stored embedding for the training data.

Parameters:

  • X Data matrix of shape (n_samples, n_features).

Returns:

  • result : MatrixType Embedding of shape (n_samples, n_components).