GradientBoostingClassifier

#include <Skigen/Ensemble>

template <typename Scalar = double>
class Skigen::GradientBoostingClassifier(loss=Loss::LogLoss, learning_rate=0.1, n_estimators=100, subsample=1.0, criterion=CriterionGB::FriedmanMSE, min_samples_split=2, min_samples_leaf=1, min_weight_fraction_leaf=0, max_depth=3, min_impurity_decrease=0, random_state=std::nullopt, verbose=0, max_leaf_nodes=std::nullopt, warm_start=false, validation_fraction=0.1, n_iter_no_change=std::nullopt, tol=1e-4, ccp_alpha=0)

Gradient Boosting for binary classification.

Stage-wise additive log-odds model fit by gradient boosting on the negative log-likelihood (cross-entropy) of a binomial outcome:

$$F_M(x) = F_0 + \eta \sum_{m=1}^M h_m(x), \qquad P(y=1 \mid x) = \sigma(F_M(x)).$$

At each stage the pseudo-residual is the negative gradient of the loss: $r_i = y_i - \sigma(F_{m-1}(x_i))$, where $y_i \in \{0, 1\}$. The initial log-odds $F_0$ is the empirical prior log-ratio $\log(p_+ / p_-)$, where $p_+$ and $p_-$ are the fractions of positive and negative training samples. This matches sklearn's default init=None, which uses a DummyClassifier(strategy="prior") under the hood; sklearn's init="zero" would instead start from $F_0 = 0$.
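The initialization and pseudo-residual formulas above can be checked by hand with a small standalone sketch. It uses only the standard library and does not call any Skigen API; the helper names `sigmoid`, `prior_log_odds`, and `pseudo_residuals` are hypothetical and chosen for illustration only.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Logistic sigmoid: maps a log-odds value to a probability in (0, 1).
inline double sigmoid(double f) { return 1.0 / (1.0 + std::exp(-f)); }

// Prior log-odds F_0 = log(p_+ / p_-) for labels in {0, 1}.
// Assumes both classes are present; otherwise the ratio is degenerate.
inline double prior_log_odds(const std::vector<int>& y) {
    std::size_t n_pos = 0;
    for (int label : y) {
        if (label == 1) ++n_pos;
    }
    const std::size_t n_neg = y.size() - n_pos;
    assert(n_pos > 0 && n_neg > 0);
    // p_+ / p_- equals n_pos / n_neg because the sample count cancels.
    return std::log(static_cast<double>(n_pos) / static_cast<double>(n_neg));
}

// Pseudo-residuals r_i = y_i - sigmoid(F_{m-1}(x_i)) for the current raw scores.
inline std::vector<double> pseudo_residuals(const std::vector<int>& y,
                                            const std::vector<double>& raw_scores) {
    assert(y.size() == raw_scores.size());
    std::vector<double> r(y.size());
    for (std::size_t i = 0; i < y.size(); ++i) {
        r[i] = static_cast<double>(y[i]) - sigmoid(raw_scores[i]);
    }
    return r;
}
```

For labels {1, 1, 1, 0}, $F_0 = \log 3$ and $\sigma(F_0) = 0.75$, so the first-stage residuals are 0.25 for each positive sample and -0.75 for the negative one.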

Mirrors sklearn.ensemble.GradientBoostingClassifier for the binary case.

Attributes:

loss : Loss
learning_rate : Scalar
n_estimators : int
max_depth : int
init : Scalar
classes : const Eigen::VectorXi
n_classes : int
estimators : const std::vector< DecisionTreeRegressor< Scalar > > &
feature_importances : RowVectorType
train_score : VectorType

Methods

SKIGEN_PARAMS()

predict(X)

decision_function(X)

Raw additive score $F(x)$ (log-odds). Length n_samples for binary; for multiclass use decision_function_multi.
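As a rough illustration of what the raw score represents, the sketch below accumulates $F_M(x) = F_0 + \eta \sum_m h_m(x)$ for a single sample from precomputed per-stage tree outputs. It is a standalone sketch, not Skigen's implementation; `raw_score` and its parameters are hypothetical names.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Raw additive score for one sample: F_0 plus the learning-rate-scaled sum
// of the per-stage tree outputs h_m(x).
inline double raw_score(double f0, double learning_rate,
                        const std::vector<double>& stage_outputs) {
    assert(learning_rate > 0.0);
    const double sum =
        std::accumulate(stage_outputs.begin(), stage_outputs.end(), 0.0);
    return f0 + learning_rate * sum;
}
```

With $F_0 = 0.5$, $\eta = 0.1$, and stage outputs {1, -2, 4}, the score is $0.5 + 0.1 \cdot 3 = 0.8$.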

decision_function_multi(X)

Per-class raw scores, shape (n_samples, n_classes). For binary problems returns the two complementary log-odds columns.

predict_proba(X)

Probability estimates of shape (n_samples, n_classes). Column ordering matches classes().
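The relationship between the raw log-odds and the two probability columns can be sketched as follows. This is a standalone illustration that assumes the classes are ordered {0, 1}, so column 0 holds $1 - \sigma(F(x))$ and column 1 holds $\sigma(F(x))$; the function name `proba_from_log_odds` is hypothetical and not part of Skigen.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <vector>

// Convert binary log-odds F(x) into rows [P(y=0 | x), P(y=1 | x)].
// Assumption: class 0 is the first column and class 1 the second.
inline std::vector<std::array<double, 2>> proba_from_log_odds(
    const std::vector<double>& raw_scores) {
    std::vector<std::array<double, 2>> proba;
    proba.reserve(raw_scores.size());
    for (double f : raw_scores) {
        assert(!std::isnan(f));
        const double p1 = 1.0 / (1.0 + std::exp(-f));  // sigma(F(x))
        const std::array<double, 2> row = {1.0 - p1, p1};
        proba.push_back(row);
    }
    return proba;
}
```

A raw score of 0 gives the row (0.5, 0.5), and a raw score of $\log 3$ gives (0.25, 0.75); each row sums to 1 by construction.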
