GradientBoostingClassifier

#include <Skigen/Ensemble>

template <typename Scalar = double>
class Skigen::GradientBoostingClassifier(loss=Loss::LogLoss, learning_rate=0.1, n_estimators=100, subsample=1.0, criterion=CriterionGB::FriedmanMSE, min_samples_split=2, min_samples_leaf=1, min_weight_fraction_leaf=0, max_depth=3, min_impurity_decrease=0, random_state=std::nullopt, verbose=0, max_leaf_nodes=std::nullopt, warm_start=false, validation_fraction=0.1, n_iter_no_change=std::nullopt, tol=1e-4, ccp_alpha=0)

Gradient Boosting for binary classification.

Stage-wise additive log-odds model fit by gradient boosting on the negative log-likelihood (cross-entropy) of a binomial outcome:

$$F_M(x) = F_0 + \eta \sum_{m=1}^{M} h_m(x), \qquad P(y=1 \mid x) = \sigma(F_M(x)).$$
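The additive form above can be sketched in plain C++ without the library. This is a minimal illustration, not Skigen's implementation: the `Stage` alias and the functions `raw_score` and `positive_probability` are hypothetical names, and each stage is modeled as an arbitrary callable rather than a fitted regression tree.

```cpp
#include <cassert>
#include <cmath>
#include <functional>
#include <vector>

// Hypothetical stand-in for one fitted stage h_m: maps a feature vector to a real value.
using Stage = std::function<double(const std::vector<double>&)>;

// Raw score F_M(x) = F_0 + eta * sum_{m=1}^{M} h_m(x).
inline double raw_score(double f0, double eta,
                        const std::vector<Stage>& stages,
                        const std::vector<double>& x) {
    double sum = 0.0;
    for (const Stage& h : stages) {
        assert(h);  // every stage must hold a callable
        sum += h(x);
    }
    return f0 + eta * sum;
}

// P(y = 1 | x) = sigma(F_M(x)), with sigma the logistic sigmoid.
inline double positive_probability(double score) {
    return 1.0 / (1.0 + std::exp(-score));
}
```

With $F_0 = 0.5$, $\eta = 0.1$, and two stages returning 1 and 2 for some input, the raw score is $0.5 + 0.1 \cdot 3 = 0.8$.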

At each stage the pseudo-residual is the negative gradient of the loss with respect to the current score: $r_i = y_i - \sigma(F_{m-1}(x_i))$, where $y_i \in \{0, 1\}$. The initial log-odds $F_0$ is the empirical prior log-ratio $\log(p_+ / p_-)$, where $p_+$ and $p_-$ are the training fractions of positive and negative samples. This matches sklearn's default init=None, which uses a DummyClassifier(strategy="prior") under the hood (init="zero" would instead start from a raw score of zero).
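The initialization and residual formulas can be checked with a short standalone sketch. The helper names `sigmoid`, `initial_log_odds`, and `pseudo_residuals` are assumptions made for illustration and are not part of the documented Skigen API; labels are assumed to be encoded as 0 and 1 with both classes present.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Logistic sigmoid: maps a log-odds score to a probability in (0, 1).
inline double sigmoid(double f) { return 1.0 / (1.0 + std::exp(-f)); }

// Initial log-odds F_0 = log(p_+ / p_-) from labels in {0, 1}.
// The ratio of class fractions equals the ratio of class counts.
inline double initial_log_odds(const std::vector<int>& y) {
    std::size_t n_pos = 0;
    for (int label : y) n_pos += (label == 1);
    const std::size_t n_neg = y.size() - n_pos;
    assert(n_pos > 0 && n_neg > 0);  // undefined if a class is missing
    return std::log(static_cast<double>(n_pos) / static_cast<double>(n_neg));
}

// Pseudo-residuals r_i = y_i - sigmoid(F_{m-1}(x_i)) for the current raw scores.
inline std::vector<double> pseudo_residuals(const std::vector<int>& y,
                                            const std::vector<double>& raw_scores) {
    assert(y.size() == raw_scores.size());
    std::vector<double> r(y.size());
    for (std::size_t i = 0; i < y.size(); ++i) {
        r[i] = static_cast<double>(y[i]) - sigmoid(raw_scores[i]);
    }
    return r;
}
```

For labels {1, 1, 1, 0}, $F_0 = \log 3$ and $\sigma(F_0) = 0.75$, so the first-stage residuals are 0.25 for each positive sample and -0.75 for the negative one.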

Mirrors sklearn.ensemble.GradientBoostingClassifier for the binary case.



Attributes:

  • loss : Loss

  • learning_rate : Scalar

  • n_estimators : int

  • max_depth : int

  • init : Scalar

  • classes : const Eigen::VectorXi

  • n_classes : int

  • estimators : const std::vector< DecisionTreeRegressor< Scalar > > &

  • feature_importances : RowVectorType

  • train_score : VectorType


Methods

SKIGEN_PARAMS()


predict(X)


decision_function(X)

Raw additive score $F(x)$ (log-odds). Length n_samples.


predict_proba(X)

Probability estimates of shape (n_samples, 2). Column ordering matches classes().
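The relationship between the raw score and the two probability columns can be sketched as follows. This is an illustration under assumptions, not the library's code: `proba_from_score` and `label_from_score` are hypothetical helpers, the class order is assumed to be {0, 1}, and the tie-breaking rule at a score of exactly zero is a guess.

```cpp
#include <array>
#include <cassert>
#include <cmath>

// One probability row [P(y=0|x), P(y=1|x)] from a raw log-odds score.
// Assumes the class order {0, 1}; the real column order follows classes().
inline std::array<double, 2> proba_from_score(double raw_score) {
    assert(!std::isnan(raw_score));
    const double p1 = 1.0 / (1.0 + std::exp(-raw_score));
    return {1.0 - p1, p1};
}

// Assumed decision rule: class 1 when P(y=1|x) > 0.5, which is the same as score > 0.
inline int label_from_score(double raw_score) {
    return raw_score > 0.0 ? 1 : 0;
}
```

Under these assumptions, a raw score of 0 gives the row (0.5, 0.5), and the two columns always sum to 1.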