VarianceThreshold
#include <Skigen/FeatureSelection>
template <typename Scalar = double>
class Skigen::VarianceThreshold(threshold=0)
Feature selector that removes all low-variance features.
This feature selection algorithm looks only at the features (X), not the desired outputs (y), and can thus be used for unsupervised learning.
Mirrors sklearn.feature_selection.VarianceThreshold.
Parameters:
- threshold : Scalar, default=0
Variance threshold. Features whose variance is less than or equal to this value are dropped; the features kept satisfy variance > threshold, matching sklearn semantics. With the default of 0, only zero-variance (constant) features are removed.
Attributes:
- threshold : Scalar
The configured variance threshold.
- variances : RowVectorType
Per-feature variance (1 × n_features).
- get_support_mask : BoolMaskType
Boolean support mask: true for features that are kept.
- get_support_indices : Eigen::VectorXi
Integer indices of the selected features.
Methods
get_support(indices)
Get the support of the selector.
Parameters:
- indices : bool
If true, return integer indices instead of a boolean mask.
Returns:
- result : BoolMaskType
Either a boolean mask of length n_features (default) or a vector of selected indices.
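The two support representations carry the same information. The sketch below converts between them using plain standard-library containers; mask_to_indices and indices_to_mask are hypothetical helper names chosen for illustration and are not part of Skigen.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not a Skigen function): boolean mask -> kept indices.
std::vector<int> mask_to_indices(const std::vector<bool>& mask) {
    std::vector<int> indices;
    for (std::size_t j = 0; j < mask.size(); ++j) {
        if (mask[j]) indices.push_back(static_cast<int>(j));
    }
    return indices;
}

// Hypothetical helper (not a Skigen function): kept indices -> boolean mask.
std::vector<bool> indices_to_mask(const std::vector<int>& indices,
                                  std::size_t n_features) {
    std::vector<bool> mask(n_features, false);
    for (int j : indices) {
        // Every index must refer to an existing feature.
        assert(j >= 0 && static_cast<std::size_t>(j) < n_features);
        mask[static_cast<std::size_t>(j)] = true;
    }
    return mask;
}
```

The mask always has length n_features, whereas the index vector has one entry per kept feature, so the index form is usually the more compact one when few features survive.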
fit(X)
Compute per-feature variance and the support mask.
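As a rough illustration of what fitting involves, the sketch below computes the biased (ddof=0) variance of each column and applies the variance > threshold rule. It uses std::vector rather than Eigen types, and the names Matrix, column_variances and support_mask are assumptions made for this example, not Skigen's actual implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;  // one inner vector per sample

// Biased (ddof = 0) variance of every column, computed in two passes.
std::vector<double> column_variances(const Matrix& X) {
    assert(!X.empty());
    const std::size_t n = X.size();
    const std::size_t d = X[0].size();
    std::vector<double> mean(d, 0.0), var(d, 0.0);
    for (const auto& row : X) {
        assert(row.size() == d);
        for (std::size_t j = 0; j < d; ++j) mean[j] += row[j];
    }
    for (auto& m : mean) m /= static_cast<double>(n);
    for (const auto& row : X) {
        for (std::size_t j = 0; j < d; ++j) {
            const double diff = row[j] - mean[j];
            var[j] += diff * diff;
        }
    }
    for (auto& v : var) v /= static_cast<double>(n);
    return var;
}

// A feature is kept only when its variance is strictly above the threshold.
std::vector<bool> support_mask(const std::vector<double>& variances,
                               double threshold = 0.0) {
    std::vector<bool> mask(variances.size());
    for (std::size_t j = 0; j < variances.size(); ++j) {
        mask[j] = variances[j] > threshold;
    }
    return mask;
}
```

With the default threshold of 0, a constant column has variance exactly 0 and is therefore dropped, while any column containing at least two distinct values is kept.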
transform(X)
Reduce X to selected features.
fit(X)
Fit on a sparse matrix without densifying.
Computes the per-column variance directly from the CSC/CSR representation. For column-major storage this is O(nnz), where nnz is the number of explicit nonzeros — implicit zeros contribute to the mean and variance but are never materialised.
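Variance formula (biased, ddof=0), with both sums taken over all n rows, including implicit zeros:
\[ \mathrm{Var}_j = \frac{1}{n} \sum_{i=1}^{n} (x_{ij} - \mu_j)^2, \qquad \mu_j = \frac{1}{n} \sum_{i=1}^{n} x_{ij} \]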
Matches sklearn's VarianceThreshold.fit behaviour on sparse input.
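One way to obtain the per-column variance in O(nnz) is to visit only the stored entries and account for the implicit zeros in closed form: each of them contributes mean² to the sum of squared deviations. The sketch below assumes a hand-rolled CSC struct (CscMatrix, a stand-in for illustration rather than Eigen's sparse type) and a hypothetical function name; it is not taken from Skigen's source.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Minimal CSC container used only for this sketch (not Eigen::SparseMatrix).
struct CscMatrix {
    std::size_t rows = 0;
    std::size_t cols = 0;
    std::vector<std::size_t> col_ptr;  // size cols + 1
    std::vector<std::size_t> row_idx;  // size nnz
    std::vector<double> values;        // size nnz
};

// Biased (ddof = 0) variance per column, touching stored entries only.
std::vector<double> csc_column_variances(const CscMatrix& X) {
    assert(X.rows > 0 && X.col_ptr.size() == X.cols + 1);
    const double n = static_cast<double>(X.rows);
    std::vector<double> var(X.cols, 0.0);
    for (std::size_t j = 0; j < X.cols; ++j) {
        const std::size_t begin = X.col_ptr[j];
        const std::size_t end = X.col_ptr[j + 1];
        double sum = 0.0;
        for (std::size_t k = begin; k < end; ++k) sum += X.values[k];
        const double mean = sum / n;  // implicit zeros count in the denominator
        double sq = 0.0;
        for (std::size_t k = begin; k < end; ++k) {
            const double diff = X.values[k] - mean;
            sq += diff * diff;
        }
        // Each implicit zero deviates from the mean by exactly -mean.
        const double implicit_zeros = n - static_cast<double>(end - begin);
        sq += implicit_zeros * mean * mean;
        var[j] = sq / n;
    }
    return var;
}
```

An entirely empty column yields a mean and variance of exactly 0, so it is removed under the default threshold without any of its rows being visited.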
transform(X)
Reduce a sparse X to selected features without densifying.
Returns a sparse matrix with the same number of rows and only the columns where support_mask_(j) is true.
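Column selection is cheap in CSC storage because each column occupies a contiguous slice of the value and row-index arrays. The sketch below copies the slices of the kept columns into a new matrix; CscMatrix and select_columns are illustrative names assumed for this example, not Skigen or Eigen types.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Minimal CSC container used only for this sketch (not Eigen::SparseMatrix).
struct CscMatrix {
    std::size_t rows = 0;
    std::size_t cols = 0;
    std::vector<std::size_t> col_ptr;  // size cols + 1
    std::vector<std::size_t> row_idx;  // size nnz
    std::vector<double> values;        // size nnz
};

// Keep only the columns j with mask[j] == true; the result stays sparse.
CscMatrix select_columns(const CscMatrix& X, const std::vector<bool>& mask) {
    assert(mask.size() == X.cols && X.col_ptr.size() == X.cols + 1);
    CscMatrix out;
    out.rows = X.rows;
    out.col_ptr.push_back(0);
    for (std::size_t j = 0; j < X.cols; ++j) {
        if (!mask[j]) continue;
        // Copy the contiguous slice of stored entries for column j.
        for (std::size_t k = X.col_ptr[j]; k < X.col_ptr[j + 1]; ++k) {
            out.row_idx.push_back(X.row_idx[k]);
            out.values.push_back(X.values[k]);
        }
        out.col_ptr.push_back(out.values.size());
        ++out.cols;
    }
    return out;
}
```

Row indices are copied unchanged because the number of rows is preserved; only the column pointers are rebuilt.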
inverse_transform(X)
Reverse the transformation by zero-padding removed features.
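Zero-padding can be pictured as scattering each reduced row back into a full-width row of zeros at the positions marked true in the support mask. The following is a minimal dense sketch with std::vector; Matrix and zero_pad_columns are hypothetical names for illustration, not Skigen's API.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

using Matrix = std::vector<std::vector<double>>;  // one inner vector per sample

// Scatter the reduced columns back to their original positions;
// columns that were removed are filled with zeros.
Matrix zero_pad_columns(const Matrix& X_reduced, const std::vector<bool>& mask) {
    Matrix out;
    out.reserve(X_reduced.size());
    for (const auto& row : X_reduced) {
        std::vector<double> full(mask.size(), 0.0);
        std::size_t k = 0;  // position within the reduced row
        for (std::size_t j = 0; j < mask.size(); ++j) {
            if (mask[j]) {
                assert(k < row.size());
                full[j] = row[k++];
            }
        }
        assert(k == row.size());  // reduced width must equal the kept count
        out.push_back(std::move(full));
    }
    return out;
}
```

The round trip is lossy for removed features: a dropped column that originally held a nonzero constant comes back as zeros, not as its original value.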