DecisionTreeClassifier
#include <Skigen/Tree>
template <typename Scalar = double>
class Skigen::DecisionTreeClassifier(max_depth=-1, min_samples_split=2, max_features_mode=0, max_features_value=0.0, random_state=std::nullopt)
A decision tree classifier.
A non-parametric supervised learning method used for classification. The model predicts the value of a target variable by learning simple decision rules inferred from the data features. Uses Gini impurity as the splitting criterion.
Mirrors sklearn.tree.DecisionTreeClassifier.
Attributes:
-
classes : const Eigen::VectorXi
-
n_classes : int
-
feature_importances : RowVectorType
Methods
fit(X, y)
Build a decision tree classifier from the training set.
fit(X, y)
Fit on a sparse design matrix.
The sparse input is converted to a dense MatrixType internally and forwarded to the regular split-finding path. The numerical results match a dense fit; a native sparse split finder (skip-zero column scans accumulating per-class counts only over explicit nonzeros) would avoid the densification but is not implemented.
predict(X)
Predict using a sparse design matrix (densifies internally).
fit_with_indices(X, y, sample_indices)
Fit using a specific row index subset (used by RandomForest bootstraps).
Parameters:
- sample_indices : const std::vector< Eigen::Index > & If empty, uses all rows of X. Otherwise, builds the tree from the rows specified (with possible repetitions).
predict(X)
Predict class labels for samples in X.
predict_proba(X)
Per-class probability estimates from leaf class distributions.
build_tree(X, y, indices, depth)
fit(X, y)
predict(X)
Example
// Best model
Skigen::DecisionTreeClassifier<double> best(5);
best.fit(split.X_train, split.y_train);
auto best_pred = best.predict(split.X_test);
std::cout << "\n=== Confusion Matrix (depth=5) ===\n";
auto cm = Skigen::Metrics::confusion_matrix(split.y_test, best_pred);
std::cout << cm << "\n";