StandardScaler

#include <Skigen/Preprocessing>

template <typename Scalar = double>
class Skigen::StandardScaler(with_mean=true, with_std=true)

Standardize features by removing the mean and scaling to unit variance.

The standard score of a sample x is calculated as:

$$z = \frac{x - \mu}{\sigma}$$

where $\mu$ is the mean of the training samples and $\sigma$ is their standard deviation. Centering and scaling happen independently on each feature by computing the relevant statistics on the samples in the training set.
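As a rough illustration of the formula above, the sketch below standardizes a single feature column with plain `std::vector`. The names `ColumnStats`, `column_stats`, and `standardize` are hypothetical and not part of Skigen. The sketch assumes the population variance (divide by n), which is what sklearn's StandardScaler uses; whether Skigen does the same is an assumption here.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helpers for illustration only; not part of Skigen.
struct ColumnStats {
    double mean;
    double stddev;
};

// Mean and population standard deviation (divide by n, an assumption).
ColumnStats column_stats(const std::vector<double>& x) {
    assert(!x.empty());
    const double n = static_cast<double>(x.size());
    double sum = 0.0;
    for (double v : x) sum += v;
    const double mean = sum / n;
    double sq = 0.0;
    for (double v : x) sq += (v - mean) * (v - mean);
    return {mean, std::sqrt(sq / n)};
}

// z = (x - mean) / stddev for every sample in the column.
std::vector<double> standardize(const std::vector<double>& x) {
    const ColumnStats s = column_stats(x);
    assert(s.stddev > 0.0);  // constant columns need special handling, omitted here
    std::vector<double> z;
    z.reserve(x.size());
    for (double v : x) z.push_back((v - s.mean) / s.stddev);
    return z;
}
```

For the column {1, 2, 3, 4, 5} this gives a mean of 3 and a standard deviation of sqrt(2), so the middle sample maps to 0.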

Mirrors sklearn.preprocessing.StandardScaler.


Parameters:

  • with_mean : bool, default=true If true, center the data before scaling.

  • with_std : bool, default=true If true, scale the data to unit variance.
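One way to picture how the two flags combine is the per-value sketch below. The function `standardize_value` is hypothetical and only models the documented behavior (center if with_mean, divide by the standard deviation if with_std); it is not Skigen's implementation.

```cpp
#include <cassert>

// Hypothetical helper for illustration only; not part of Skigen.
// Applies centering and scaling to one value, each step gated by its flag.
double standardize_value(double x, double mean, double stddev,
                         bool with_mean, bool with_std) {
    const double centered = with_mean ? x - mean : x;
    if (!with_std) return centered;
    assert(stddev > 0.0);  // zero-variance handling is omitted in this sketch
    return centered / stddev;
}
```

With both flags disabled the value passes through unchanged, and with only with_std enabled the data is rescaled but keeps its original offset.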


Attributes:

  • with_mean : bool Whether centering is enabled.

  • with_std : bool Whether scaling to unit variance is enabled.

  • mean : RowVectorType Per-feature mean (1 × n_features).

  • var : RowVectorType Per-feature variance (1 × n_features).

  • scale : RowVectorType Per-feature standard deviation (1 × n_features).

  • n_samples_seen : IndexType Number of samples processed during fit().


Methods

fit(X)

Compute the mean and std to be used for later scaling.

Parameters:

  • X : MatrixType Training data of shape (n_samples, n_features).

Returns:

  • result : StandardScaler Reference to the fitted transformer (*this).

partial_fit(X)

Update the mean and variance incrementally using Chan's parallel algorithm, a numerically stable generalization of Welford's method to batches.

partial_fit(X1).partial_fit(X2) produces the same fitted attributes (within floating-point tolerance) as fit([X1; X2]), matching sklearn's StandardScaler.partial_fit contract.

Parameters:

  • X : MatrixType Batch of training data, shape (n_samples_batch, n_features). The feature count must match the first batch.

Returns:

  • result : StandardScaler Reference to the fitted transformer (*this).

Throws:

  • std::invalid_argument — if the feature count differs from the first batch, or if X is empty.
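The batch-merging idea behind partial_fit can be sketched for a single feature as follows. This is a minimal illustration of the pairwise update from Chan et al., not Skigen's actual code; `RunningStats`, `summarize`, and `merge` are hypothetical names, and the population variance (divide by n) is an assumption.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical running state for one feature; Skigen's internals may differ.
struct RunningStats {
    std::size_t n = 0;  // samples seen so far
    double mean = 0.0;  // running mean
    double m2 = 0.0;    // sum of squared deviations from the mean
    double variance() const { return n > 0 ? m2 / static_cast<double>(n) : 0.0; }
};

// Summarize one batch with a straightforward two-pass computation.
RunningStats summarize(const std::vector<double>& batch) {
    assert(!batch.empty());
    RunningStats s;
    s.n = batch.size();
    for (double v : batch) s.mean += v;
    s.mean /= static_cast<double>(s.n);
    for (double v : batch) s.m2 += (v - s.mean) * (v - s.mean);
    return s;
}

// Pairwise update: combine two summaries without revisiting the raw data.
RunningStats merge(const RunningStats& a, const RunningStats& b) {
    if (a.n == 0) return b;
    if (b.n == 0) return a;
    RunningStats out;
    out.n = a.n + b.n;
    const double na = static_cast<double>(a.n);
    const double nb = static_cast<double>(b.n);
    const double delta = b.mean - a.mean;
    out.mean = a.mean + delta * nb / (na + nb);
    out.m2 = a.m2 + b.m2 + delta * delta * na * nb / (na + nb);
    return out;
}
```

Merging the summaries of {1, 2, 3} and {4, 5} yields the same count, mean, and sum of squared deviations as summarizing {1, 2, 3, 4, 5} directly, which is the equivalence the partial_fit contract describes.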

transform(X)

Perform standardization by centering and scaling.

Parameters:

  • X : MatrixType Data matrix of shape (n_samples, n_features).

Returns:

  • result : MatrixType Transformed data of same shape.

Throws:

  • std::runtime_error — if the model has not been fitted.

inverse_transform(X)

Scale back the data to the original representation.

Parameters:

  • X : MatrixType Transformed data of shape (n_samples, n_features).

Returns:

  • result : MatrixType Un-transformed data of same shape.

Throws:

  • std::runtime_error — if the model has not been fitted.
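The round trip between transform and inverse_transform amounts to x = z * scale + mean per feature. The sketch below shows this for one column, assuming both centering and scaling are enabled; `forward` and `inverse` are hypothetical free functions, not Skigen's API.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helpers for illustration only; not part of Skigen.
// Forward map: z = (x - mean) / scale.
std::vector<double> forward(const std::vector<double>& x, double mean, double scale) {
    assert(scale > 0.0);
    std::vector<double> z(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) z[i] = (x[i] - mean) / scale;
    return z;
}

// Inverse map: x = z * scale + mean, undoing the forward map.
std::vector<double> inverse(const std::vector<double>& z, double mean, double scale) {
    std::vector<double> x(z.size());
    for (std::size_t i = 0; i < z.size(); ++i) x[i] = z[i] * scale + mean;
    return x;
}
```

Applying `inverse` to the output of `forward` with the same mean and scale recovers the input up to floating-point rounding.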

transform_inplace(X)

Transform features in-place by standardizing.

Parameters:

  • X : Eigen::Ref< MatrixType > Data matrix of shape (n_samples, n_features), modified in place.

Throws:

  • std::runtime_error — if the model has not been fitted.

inverse_transform_inplace(X)

Inverse-transform features in-place.

Parameters:

  • X : Eigen::Ref< MatrixType > Standardized data matrix, modified in place to original scale.

Throws:

  • std::runtime_error — if the model has not been fitted.

Example

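#include <iostream>
#include <Eigen/Dense>
#include <Skigen/Preprocessing>

int main() {
    // Sample data: 3 samples x 2 features (values are illustrative)
    Eigen::MatrixXd X(3, 2);
    X << 1.0, 10.0, 2.0, 20.0, 3.0, 30.0;

    // Fit and transform
    Skigen::StandardScaler<double> scaler;
    Eigen::MatrixXd Z = scaler.fit_transform(X);

    std::cout << "Standardized:\n" << Z << "\n\n";
    std::cout << "Mean: " << scaler.mean() << "\n";
    std::cout << "Scale: " << scaler.scale() << "\n\n";

    // Round-trip: inverse_transform recovers the original data
    Eigen::MatrixXd X_back = scaler.inverse_transform(Z);
    std::cout << "Recovered:\n" << X_back << "\n\n";
}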