
Pipeline

A compile-time pipeline that chains transformers and a final estimator into a single object. All step types are verified at compile time using C++ template metaprogramming — incompatible steps produce a compile error rather than a runtime failure.
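One way such a compile-time check could work is the C++17 detection idiom combined with `static_assert`. The sketch below is not Skigen's implementation: `Data`, `has_transform`, `has_predict`, `valid_pipeline`, `Doubler`, and `SumModel` are hypothetical names introduced only to illustrate how an ill-ordered step list can be rejected before the program runs.

```cpp
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical stand-in for a feature matrix; Skigen's real data type is not shown here.
using Data = std::vector<double>;

// Detection idiom: true when `step.transform(data)` is a well-formed expression.
template <typename T, typename = void>
struct has_transform : std::false_type {};

template <typename T>
struct has_transform<
    T, std::void_t<decltype(std::declval<T&>().transform(std::declval<const Data&>()))>>
    : std::true_type {};

// Same idea for the final estimator's `predict`.
template <typename T, typename = void>
struct has_predict : std::false_type {};

template <typename T>
struct has_predict<
    T, std::void_t<decltype(std::declval<T&>().predict(std::declval<const Data&>()))>>
    : std::true_type {};

// Toy steps used only to exercise the traits.
struct Doubler {
    Data transform(const Data& x) const {
        Data out(x);
        for (double& v : out) v *= 2.0;
        return out;
    }
};

struct SumModel {
    double predict(const Data& x) const {
        double s = 0.0;
        for (double v : x) s += v;
        return s;
    }
};

// A step list is valid when every step except the last can transform
// and the last step can predict.
template <typename... Steps>
struct valid_pipeline;

template <typename Last>
struct valid_pipeline<Last> : has_predict<Last> {};

template <typename First, typename Second, typename... Rest>
struct valid_pipeline<First, Second, Rest...>
    : std::conjunction<has_transform<First>, valid_pipeline<Second, Rest...>> {};

// Accepted: transformer followed by estimator.
static_assert(valid_pipeline<Doubler, SumModel>::value,
              "every non-final step needs transform(); the final step needs predict()");
// Rejected: an estimator placed before a transformer.
static_assert(!valid_pipeline<SumModel, Doubler>::value,
              "an estimator in a transformer position should be detected");
```

A pipeline class could place a `static_assert(valid_pipeline<Steps...>::value, ...)` in its body, so a misordered step list stops compilation with a readable message.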

Semantics

For a pipeline with transformers $T_1, T_2, \ldots, T_m$ and a final estimator $E$:

Fitting — each transformer is fit and transforms the data sequentially before the estimator is fit:

$$E.\text{fit}\!\left(T_m.\text{fit\_transform}\!\left(\cdots T_1.\text{fit\_transform}(X)\cdots\right),\; y\right)$$

Prediction — data passes through each transformer's transform, then the estimator's predict:

$$\hat{y} = E.\text{predict}\!\left(T_m.\text{transform}\!\left(\cdots T_1.\text{transform}(X)\cdots\right)\right)$$

This ensures that the same preprocessing applied during training is automatically applied during prediction, preventing train/test skew.

Mirrors sklearn.pipeline.Pipeline.
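The fit and predict semantics above can be sketched with a `std::tuple` of steps and a recursive `if constexpr` walk. This is a minimal illustration under stated assumptions, not Skigen's source: `Data`, `Center`, `SlopeModel`, `ToyPipeline`, `make_toy_pipeline`, and `demo_prediction` are hypothetical names, and a flat `std::vector<double>` stands in for a real feature matrix.

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>
#include <utility>
#include <vector>

using Data = std::vector<double>;  // hypothetical stand-in for a feature matrix

// Toy transformer: learns the mean during fit and subtracts it in transform.
struct Center {
    double mean = 0.0;
    void fit(const Data& x) {
        double s = 0.0;
        for (double v : x) s += v;
        mean = x.empty() ? 0.0 : s / static_cast<double>(x.size());
    }
    Data transform(const Data& x) const {
        Data out(x);
        for (double& v : out) v -= mean;
        return out;
    }
    Data fit_transform(const Data& x) { fit(x); return transform(x); }
};

// Toy estimator: least-squares slope through the origin, y ≈ slope * x.
struct SlopeModel {
    double slope = 0.0;
    void fit(const Data& x, const Data& y) {
        double num = 0.0, den = 0.0;
        for (std::size_t i = 0; i < x.size(); ++i) { num += x[i] * y[i]; den += x[i] * x[i]; }
        slope = (den == 0.0) ? 0.0 : num / den;
    }
    Data predict(const Data& x) const {
        Data out(x);
        for (double& v : out) v *= slope;
        return out;
    }
};

template <typename... Steps>
class ToyPipeline {
    static_assert(sizeof...(Steps) >= 1, "a pipeline needs at least a final estimator");
    static constexpr std::size_t N = sizeof...(Steps);
    std::tuple<Steps...> steps_;

    // Steps 0..N-2 call fit_transform; the last step is fit on the transformed data.
    template <std::size_t I>
    void fit_impl(const Data& x, const Data& y) {
        if constexpr (I + 1 == N) {
            std::get<I>(steps_).fit(x, y);
        } else {
            fit_impl<I + 1>(std::get<I>(steps_).fit_transform(x), y);
        }
    }

    // Steps 0..N-2 call transform only (no refitting); the last step predicts.
    template <std::size_t I>
    Data predict_impl(const Data& x) const {
        if constexpr (I + 1 == N) {
            return std::get<I>(steps_).predict(x);
        } else {
            return predict_impl<I + 1>(std::get<I>(steps_).transform(x));
        }
    }

public:
    explicit ToyPipeline(Steps... steps) : steps_(std::move(steps)...) {}
    void fit(const Data& x, const Data& y) { fit_impl<0>(x, y); }
    Data predict(const Data& x) const { return predict_impl<0>(x); }
    template <std::size_t I> auto& get() { return std::get<I>(steps_); }
};

template <typename... Steps>
ToyPipeline<Steps...> make_toy_pipeline(Steps... steps) {
    return ToyPipeline<Steps...>(std::move(steps)...);
}

// Fits on x = {1, 2, 3}, y = {-2, 0, 2} and predicts for x = 4.
inline double demo_prediction() {
    auto pipe = make_toy_pipeline(Center{}, SlopeModel{});
    pipe.fit({1.0, 2.0, 3.0}, {-2.0, 0.0, 2.0});
    assert(pipe.get<0>().mean == 2.0);  // mean learned from the training data
    return pipe.predict({4.0})[0];      // (4 - 2) * slope, using the training mean
}
```

Because `predict` reuses the mean stored during `fit`, the new sample is centered with training statistics, which is the property the pipeline is meant to guarantee.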

Construction

auto pipe = Skigen::make_pipeline(step1, step2, ..., estimator);

Or explicitly:

Skigen::Pipeline<StandardScaler<>, PCA<>, LinearRegression<>> pipe(scaler, pca, model);

Methods

| Method | Description |
| --- | --- |
| `fit(X, y)` | Fit all steps sequentially |
| `predict(X)` | Transform through all steps, then predict |
| `score(X, y)` | Transform through all steps, then score |
| `get<I>()` | Access the step at index `I` |

Example

#include <iostream>

#include <Skigen/Pipeline>
#include <Skigen/Preprocessing>
#include <Skigen/LinearModel>

// Chain a scaler and a linear model into one pipeline object
auto pipe = Skigen::make_pipeline(
    Skigen::StandardScaler<>(),
    Skigen::LinearRegression<>()
);

pipe.fit(X_train, y_train);
std::cout << "R²: " << pipe.score(X_test, y_test) << "\n";

// Access fitted scaler
auto& scaler = pipe.get<0>();
std::cout << "Mean: " << scaler.mean() << "\n";