
Normalizer

Normalizes each row (sample) independently to unit norm. The transformer is stateless, so fit is a no-op.

Formula

$$
\text{L2:}\quad \hat{x}_i = \frac{x_i}{\|x_i\|_2} \qquad \text{L1:}\quad \hat{x}_i = \frac{x_i}{\|x_i\|_1} \qquad \text{Max:}\quad \hat{x}_i = \frac{x_i}{\max_j |x_{ij}|}
$$

where $x_i$ is the $i$-th row (sample) of $X$.

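Note that Normalizer operates row-wise (per sample), unlike StandardScaler and other scalers, which operate column-wise (per feature). Each output row therefore depends only on that row's own values and not on any statistic computed across samples, which is why there is nothing to learn in fit.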
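The row-wise formulas above can be sketched in plain C++ without any library. This is only an illustration of the arithmetic, not Skigen's implementation: the names NormKind, row_norm, and normalize_rows are hypothetical, the matrix is a simple vector of rows, and leaving all-zero rows unchanged is an assumption made here to avoid dividing by zero.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Hypothetical names for illustration only; not part of the Skigen API.
enum class NormKind { L1, L2, Max };

using Row = std::vector<double>;
using Matrix = std::vector<Row>;

// Compute the selected norm of a single row.
double row_norm(const Row& row, NormKind kind) {
    double acc = 0.0;
    for (double v : row) {
        switch (kind) {
            case NormKind::L1:  acc += std::abs(v); break;               // sum of |x_ij|
            case NormKind::L2:  acc += v * v; break;                     // sum of squares
            case NormKind::Max: acc = std::max(acc, std::abs(v)); break; // max_j |x_ij|
        }
    }
    return kind == NormKind::L2 ? std::sqrt(acc) : acc;
}

// Divide every row by its own norm and return the normalized copy.
// Assumption: rows whose norm is zero are left unchanged.
Matrix normalize_rows(Matrix X, NormKind kind) {
    for (Row& row : X) {
        const double n = row_norm(row, kind);
        if (n == 0.0) continue;
        for (double& v : row) v /= n;
    }
    return X;
}
```

For the single row (3, 4), the divisor is 5 under L2, 7 under L1, and 4 under Max, giving (0.6, 0.8), (3/7, 4/7), and (0.75, 1) respectively. No other row enters the computation.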

When to Use

  • Text classification: After TF-IDF vectorization, L2 normalization ensures that documents of different lengths are comparable.
  • Cosine similarity: L2-normalized vectors have $\|x\|_2 = 1$, so their dot product equals their cosine similarity.
  • Kernel methods: Some kernels assume unit-norm inputs.
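The cosine-similarity point can be checked numerically. Cosine similarity is defined as the dot product divided by the product of the two L2 norms, so once both vectors have unit L2 norm the denominator is 1 and only the dot product remains. The sketch below uses plain std::vector and hypothetical helper names (dot, l2_norm, cosine_similarity, l2_normalized); it assumes non-zero input vectors of equal length.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;

// Dot product of two equal-length vectors.
double dot(const Vec& a, const Vec& b) {
    assert(a.size() == b.size());
    double s = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) s += a[i] * b[i];
    return s;
}

// Euclidean (L2) norm.
double l2_norm(const Vec& a) { return std::sqrt(dot(a, a)); }

// Cosine similarity from its definition; assumes neither vector is all zeros.
double cosine_similarity(const Vec& a, const Vec& b) {
    return dot(a, b) / (l2_norm(a) * l2_norm(b));
}

// Return a copy of `a` scaled to unit L2 norm; assumes `a` is not all zeros.
Vec l2_normalized(Vec a) {
    const double n = l2_norm(a);
    for (double& v : a) v /= n;
    return a;
}
```

With these helpers, dot(l2_normalized(a), l2_normalized(b)) and cosine_similarity(a, b) agree up to floating-point rounding, so after normalizing once, every pairwise similarity costs only a dot product.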

Mirrors sklearn.preprocessing.Normalizer.

Constructor

Skigen::Normalizer<Scalar> normalizer(Norm norm = Norm::L2);

| Parameter | Default | Description |
|-----------|---------|-------------|
| norm | L2 | Norm to use: L1, L2, or Max |

Methods

| Method | Description |
|--------|-------------|
| fit(X) | No-op (stateless) |
| transform(X) | Normalize each row |
| transform_inplace(X) | Normalize in-place |

Example

#include <Skigen/Preprocessing>

// X is an existing data matrix with one sample per row (defined elsewhere).
Skigen::Normalizer normalizer(Skigen::Normalizer<>::Norm::L2);
auto X_normed = normalizer.fit_transform(X);
// Each row of X_normed now has unit L2 norm.