SelectKBest

Keeps the top-k features ranked by a univariate score function (f_classif, f_regression, or chi2).

Algorithm

Each feature is scored independently against the target, and the k highest-scoring features are retained. The chi-squared score function is sparse-aware, which makes it suitable for text pipelines.
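The selection step described above can be sketched with the standard library alone. This is a minimal illustration, not Skigen's implementation: `top_k_mask` is a hypothetical helper, the scores are plain `std::vector<double>` values, and breaking ties in favour of the lower feature index is an assumption of this sketch.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper: mark the k highest-scoring features in a boolean mask.
// Ties go to the lower feature index (an assumption, not documented behaviour).
std::vector<bool> top_k_mask(const std::vector<double>& scores, std::size_t k) {
    const std::size_t n = scores.size();
    k = std::min(k, n);  // never request more features than exist

    // order[i] starts out as the feature index i.
    std::vector<std::size_t> order(n);
    std::iota(order.begin(), order.end(), std::size_t{0});

    // Move the k best indices to the front; the remainder is left unordered.
    std::partial_sort(order.begin(),
                      order.begin() + static_cast<std::ptrdiff_t>(k),
                      order.end(),
                      [&scores](std::size_t a, std::size_t b) {
                          if (scores[a] != scores[b]) return scores[a] > scores[b];
                          return a < b;
                      });

    std::vector<bool> mask(n, false);
    for (std::size_t i = 0; i < k; ++i) {
        mask[order[i]] = true;
    }
    return mask;
}
```

Using `std::partial_sort` avoids ordering the features that are discarded anyway, which matters when the feature count is large and k is small.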

Constructor

Skigen::SelectKBest<Scalar, ScoreFn> model(ScoreFn score, int k);

Parameters

Parameter	Default	Description
`score_func`	`—`	`FClassif`, `FRegression`, or `Chi2`.
`k`	`—`	Number of features to keep.

Methods

Method	Description
`fit(X, y)`	Score and rank the features.
`transform(X)`	Project onto the top-k features.
`get_support_mask()`	Boolean mask of selected features.
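To make the relationship between the mask and the projection concrete, the sketch below applies a boolean mask to a row-major matrix and keeps only the selected columns. `Matrix` and `select_columns` are hypothetical names used for illustration; Skigen's actual matrix and mask types may differ.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Illustrative row-major matrix type (an assumption of this sketch).
using Matrix = std::vector<std::vector<double>>;

// Hypothetical helper: keep only the columns whose mask entry is true,
// preserving the original column order.
Matrix select_columns(const Matrix& X, const std::vector<bool>& mask) {
    Matrix out;
    out.reserve(X.size());
    for (const auto& row : X) {
        std::vector<double> kept;
        for (std::size_t j = 0; j < row.size() && j < mask.size(); ++j) {
            if (mask[j]) {
                kept.push_back(row[j]);
            }
        }
        out.push_back(std::move(kept));
    }
    return out;
}
```

With a mask that has k true entries, every output row has exactly k values, which matches the shape one would expect from projecting onto the top-k features.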

Fitted Attributes

Accessor	Description
`scores()`	Per-feature scores.
`pvalues()`	Per-feature p-values.

Example

Skigen::SelectKBest<double, Skigen::feature_selection::FClassif<double>> sel({}, 5);
sel.fit(X, y);
auto X_top = sel.transform(X);

Verified against scikit-learn

This estimator is checked by the parity suite. See the generator tests/parity/generate_feature_selection_reference.py and the reference fixtures in tests/parity/data/f_classif/, exercised by tests/parity/parity_feature_selection.cpp.

API Reference

For full signatures, see the SelectKBest API Reference.
