
RobustScaler

Centers and scales features using statistics that are robust to outliers: the median for centering and the interquartile range (IQR) for scaling.

Formula

$$Z_{ij} = \frac{X_{ij} - \text{median}_j}{\text{IQR}_j}$$

where:

  • $\text{median}_j$ is the median of feature $j$
  • $\text{IQR}_j = Q_{75}(X_j) - Q_{25}(X_j)$ is the interquartile range of feature $j$

Unlike the mean and standard deviation used by StandardScaler, the median and IQR are largely unaffected by extreme values, so the scaling remains stable when the data contains outliers.
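To make the formula concrete, the following is a minimal standalone sketch for a single feature column. It does not use Skigen; `percentile_sorted` and `robust_scale_column` are hypothetical helper names, and the linear-interpolation percentile rule is an assumption (it matches NumPy's default), since this page does not state which rule the library applies.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Percentile q (0..100) of a sorted, non-empty sample, using linear
// interpolation between neighbouring order statistics (assumed rule).
double percentile_sorted(const std::vector<double>& sorted, double q) {
    assert(!sorted.empty());
    const double pos = q * static_cast<double>(sorted.size() - 1) / 100.0;
    const std::size_t lo = static_cast<std::size_t>(std::floor(pos));
    const std::size_t hi = std::min(lo + 1, sorted.size() - 1);
    const double frac = pos - static_cast<double>(lo);
    return sorted[lo] + frac * (sorted[hi] - sorted[lo]);
}

// Apply z = (x - median) / IQR to one feature column.
std::vector<double> robust_scale_column(const std::vector<double>& x) {
    std::vector<double> sorted = x;
    std::sort(sorted.begin(), sorted.end());
    const double median = percentile_sorted(sorted, 50.0);
    const double iqr =
        percentile_sorted(sorted, 75.0) - percentile_sorted(sorted, 25.0);
    assert(iqr > 0.0);  // sketch only: constant columns are not handled here
    std::vector<double> z;
    z.reserve(x.size());
    for (double v : x) {
        z.push_back((v - median) / iqr);
    }
    return z;
}
```

For the column {1, 2, 3, 4, 100}, the median is 3 and the IQR is 2, so the four inliers map to -1, -0.5, 0 and 0.5, while the outlier maps to 48.5 without changing how the inliers are scaled.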

When to Use

  • Data with outliers: When the data contains extreme values that would distort mean/variance-based scaling.
  • Robust pipelines: As a preprocessing step before models sensitive to feature scales.
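  • Custom quantile ranges: The quantile_range parameter sets the pair of percentiles used to compute the scale (e.g., the 5th and 95th percentiles instead of the 25th and 75th). It does not affect centering, which always uses the median.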
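The points above can be checked numerically with a small standalone sketch that compares a mean/variance-based scale with a quantile-based one. Nothing here is part of Skigen: `quantile_width` and `std_dev` are hypothetical helpers, and the percentile rule (linear interpolation) is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Width Q(q_max) - Q(q_min) of a sample, with percentiles computed by
// linear interpolation between order statistics (assumed rule).
double quantile_width(std::vector<double> values, double q_min, double q_max) {
    assert(!values.empty() && q_min < q_max);
    std::sort(values.begin(), values.end());
    auto percentile = [&values](double q) {
        const double pos = q * static_cast<double>(values.size() - 1) / 100.0;
        const std::size_t lo = static_cast<std::size_t>(std::floor(pos));
        const std::size_t hi = std::min(lo + 1, values.size() - 1);
        return values[lo] +
               (pos - static_cast<double>(lo)) * (values[hi] - values[lo]);
    };
    return percentile(q_max) - percentile(q_min);
}

// Population standard deviation: the kind of scale a mean/variance-based
// scaler relies on.
double std_dev(const std::vector<double>& values) {
    assert(!values.empty());
    double mean = 0.0;
    for (double v : values) mean += v;
    mean /= static_cast<double>(values.size());
    double var = 0.0;
    for (double v : values) var += (v - mean) * (v - mean);
    return std::sqrt(var / static_cast<double>(values.size()));
}
```

On {1, 2, 3, 4, 5} the 25-75 width is 2 and the standard deviation is about 1.41. Appending a single value of 1000 moves the 25-75 width only to 2.5, while the standard deviation jumps to roughly 370. On the same six values a 5-95 range gives a width of about 750, so on small samples a wider quantile_range can let outliers back into the scale.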

Mirrors sklearn.preprocessing.RobustScaler.

Constructor

Skigen::RobustScaler<Scalar> scaler(bool with_centering = true,
bool with_scaling = true,
{q_min, q_max} = {25, 75});

| Parameter | Default | Description |
| --- | --- | --- |
| with_centering | true | Center by median |
| with_scaling | true | Scale by IQR |
| quantile_range | {25, 75} | Quantile range for scaling |

Methods

| Method | Description |
| --- | --- |
| fit(X) | Compute median and IQR |
| transform(X) | Scale using median and IQR |
| inverse_transform(Z) | Recover original scale |
| transform_inplace(X) | Scale in-place |
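How these methods relate can be sketched with a hypothetical single-column scaler. This is not Skigen's implementation: the class name `ColumnRobustScaler`, the use of `std::vector<double>`, the linear-interpolation percentile rule, and the replacement of a zero width by 1 (as scikit-learn does) are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical single-column scaler; illustrates fit/transform/inverse only.
class ColumnRobustScaler {
public:
    ColumnRobustScaler(bool with_centering = true, bool with_scaling = true,
                       double q_min = 25.0, double q_max = 75.0)
        : with_centering_(with_centering), with_scaling_(with_scaling),
          q_min_(q_min), q_max_(q_max) {}

    // Learn the median and the quantile width from a non-empty column.
    void fit(std::vector<double> x) {
        assert(!x.empty());
        std::sort(x.begin(), x.end());
        center_ = with_centering_ ? percentile(x, 50.0) : 0.0;
        const double width = percentile(x, q_max_) - percentile(x, q_min_);
        // Assumption: a zero width is replaced by 1 to avoid dividing by zero.
        scale_ = (with_scaling_ && width > 0.0) ? width : 1.0;
    }

    // z = (x - center) / scale
    std::vector<double> transform(const std::vector<double>& x) const {
        std::vector<double> z(x.size());
        std::transform(x.begin(), x.end(), z.begin(),
                       [this](double v) { return (v - center_) / scale_; });
        return z;
    }

    // x = z * scale + center, the exact inverse of transform
    std::vector<double> inverse_transform(const std::vector<double>& z) const {
        std::vector<double> x(z.size());
        std::transform(z.begin(), z.end(), x.begin(),
                       [this](double v) { return v * scale_ + center_; });
        return x;
    }

    double center() const { return center_; }
    double scale() const { return scale_; }

private:
    // Percentile of a sorted sample by linear interpolation (assumed rule).
    static double percentile(const std::vector<double>& sorted, double q) {
        const double pos = q * static_cast<double>(sorted.size() - 1) / 100.0;
        const std::size_t lo = static_cast<std::size_t>(std::floor(pos));
        const std::size_t hi = std::min(lo + 1, sorted.size() - 1);
        return sorted[lo] +
               (pos - static_cast<double>(lo)) * (sorted[hi] - sorted[lo]);
    }

    bool with_centering_;
    bool with_scaling_;
    double q_min_;
    double q_max_;
    double center_ = 0.0;
    double scale_ = 1.0;
};
```

In this sketch, disabling with_centering leaves the center at 0 and disabling with_scaling leaves the scale at 1, so inverse_transform(transform(x)) recovers x up to floating-point rounding in every configuration.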

Fitted Attributes

| Accessor | Type | Description |
| --- | --- | --- |
| center() | RowVectorType | Per-feature median |
| scale() | RowVectorType | Per-feature IQR |

Example

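#include <Skigen/Preprocessing>

// X is the data matrix to scale, with one column per feature.
Skigen::RobustScaler scaler;          // defaults: centering and scaling on, quantile_range {25, 75}
scaler.fit(X);                        // compute the per-feature median and IQR
auto X_scaled = scaler.transform(X);  // (X - median) / IQR for each feature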