HistGradientBoostingRegressor
Histogram-based gradient boosting for regression: the fast gradient-boosting path for large datasets.
Algorithm
Features are quantile-binned into a compact uint8 representation, then squared-error boosting proceeds with a native gradient/hessian histogram split finder. Each tree is grown leaf-wise (best-first): the leaf with the highest second-order split gain is split next, bounded by max_leaf_nodes. Leaf values are the Newton step $-G / (H + \lambda)$, where $G$ and $H$ are the sums of the gradients and hessians of the samples in the leaf. The grower honours L2 regularisation ($\lambda$, set by l2_regularization), per-feature monotonic constraints, and holdout-based early stopping.
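The histogram split search described above can be sketched for a single binned feature as follows. This is a minimal illustration, not Skigen's implementation: the names `BinStats`, `SplitCandidate`, `build_histogram`, and `best_split` are hypothetical, and the gain expression is the common second-order form $G_L^2/(H_L+\lambda) + G_R^2/(H_R+\lambda) - G^2/(H+\lambda)$, which is assumed here rather than taken from the library. It expects `min_samples_leaf >= 1` so that both children are non-empty.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper types for illustration; not part of the Skigen API.
struct BinStats {
    double grad_sum = 0.0;
    double hess_sum = 0.0;
    int count = 0;
};

struct SplitCandidate {
    int bin_threshold = -1;  // samples with bin <= threshold go left
    double gain = 0.0;       // stays 0 when no valid split exists
};

// Accumulate per-bin gradient and hessian sums for one binned feature.
std::vector<BinStats> build_histogram(const std::vector<std::uint8_t>& bins,
                                      const std::vector<double>& grad,
                                      const std::vector<double>& hess,
                                      int n_bins) {
    std::vector<BinStats> hist(static_cast<std::size_t>(n_bins));
    for (std::size_t i = 0; i < bins.size(); ++i) {
        BinStats& b = hist[bins[i]];
        b.grad_sum += grad[i];
        b.hess_sum += hess[i];
        b.count += 1;
    }
    return hist;
}

// Scan bin boundaries left to right and keep the one with the highest gain.
SplitCandidate best_split(const std::vector<BinStats>& hist,
                          double l2, int min_samples_leaf) {
    double G = 0.0, H = 0.0;
    int N = 0;
    for (const BinStats& b : hist) {
        G += b.grad_sum;
        H += b.hess_sum;
        N += b.count;
    }

    auto score = [l2](double g, double h) { return g * g / (h + l2); };
    const double parent = score(G, H);

    SplitCandidate best;
    double GL = 0.0, HL = 0.0;
    int NL = 0;
    for (std::size_t t = 0; t + 1 < hist.size(); ++t) {
        GL += hist[t].grad_sum;
        HL += hist[t].hess_sum;
        NL += hist[t].count;
        const int NR = N - NL;
        if (NL < min_samples_leaf || NR < min_samples_leaf) continue;
        const double gain = score(GL, HL) + score(G - GL, H - HL) - parent;
        if (gain > best.gain) {
            best.gain = gain;
            best.bin_threshold = static_cast<int>(t);
        }
    }
    return best;
}
```

Because the scan runs over bins rather than sorted samples, its cost per feature depends on the number of bins (at most 255 here), not on the number of rows in the node.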
For squared error, written as $\tfrac{1}{2}(y_i - \hat{y}_i)^2$, the per-sample gradient is $g_i = \hat{y}_i - y_i$ and the hessian is $h_i = 1$.
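As a small worked sketch of these quantities, the code below computes per-sample gradients and hessians and the regularised Newton step for one leaf. It assumes the half squared error convention $\tfrac{1}{2}(y - \hat{y})^2$, so the hessian is 1; the names `GradHess`, `squared_error_grad_hess`, and `newton_leaf_value` are hypothetical and not part of the Skigen API.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical container for illustration; not part of the Skigen API.
struct GradHess {
    std::vector<double> grad;
    std::vector<double> hess;
};

// Gradient and hessian of 0.5 * (y - pred)^2 with respect to pred.
// The 0.5 factor is an assumed convention.
GradHess squared_error_grad_hess(const std::vector<double>& y,
                                 const std::vector<double>& pred) {
    GradHess out;
    out.grad.resize(y.size());
    out.hess.assign(y.size(), 1.0);
    for (std::size_t i = 0; i < y.size(); ++i) {
        out.grad[i] = pred[i] - y[i];
    }
    return out;
}

// Newton step for a leaf: -G / (H + l2), with G and H summed over the leaf.
double newton_leaf_value(const std::vector<double>& grad,
                         const std::vector<double>& hess,
                         double l2) {
    double G = 0.0, H = 0.0;
    for (std::size_t i = 0; i < grad.size(); ++i) {
        G += grad[i];
        H += hess[i];
    }
    return -G / (H + l2);
}
```

With `l2 = 0` the leaf value reduces to the mean residual of the samples in the leaf; a positive `l2` shrinks it towards zero.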
Constructor
Skigen::HistGradientBoostingRegressor<Scalar> model(
    Loss loss = Loss::SquaredError,
    Scalar learning_rate = 0.1,
    int max_iter = 100,
    std::optional<int> max_leaf_nodes = 31,
    std::optional<int> max_depth = std::nullopt,
    int min_samples_leaf = 20,
    Scalar l2_regularization = 0.0,
    int max_bins = 255,
    std::optional<std::vector<int>> categorical_features = std::nullopt,
    std::optional<std::vector<int>> monotonic_cst = std::nullopt,
    bool early_stopping = false,
    Scalar validation_fraction = 0.1,
    int n_iter_no_change = 10,
    Scalar tol = 1e-7,
    std::optional<uint64_t> random_state = std::nullopt);
Parameters
| Parameter | Default | Description |
|---|---|---|
| learning_rate | 0.1 | Shrinkage per iteration. |
| max_iter | 100 | Number of boosting iterations. |
| max_leaf_nodes | 31 | Leaf-wise growth bound (nullopt = unbounded). |
| max_depth | nullopt | Optional depth cap. |
| min_samples_leaf | 20 | Minimum samples per leaf. |
| l2_regularization | 0.0 | L2 penalty on the Newton step. |
| max_bins | 255 | Bin resolution (2–255). |
| monotonic_cst | nullopt | Per-feature +1 / -1 / 0 constraint. |
| early_stopping | false | Enable holdout-based stopping. |
| validation_fraction | 0.1 | Holdout size for early stopping. |
| n_iter_no_change | 10 | Patience before stopping. |
| tol | 1e-7 | Minimum validation improvement. |
| random_state | nullopt | Seed for the holdout split. |
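
How n_iter_no_change and tol interact can be illustrated with a patience rule of the kind commonly used for holdout-based early stopping. The exact rule Skigen applies is not stated here, so the function below is an assumption: `should_stop` is a hypothetical name, and it stops once none of the last `n_iter_no_change` validation losses improves on the loss recorded just before that window by more than `tol`.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical patience check for illustration; not part of the Skigen API.
// val_loss holds one validation loss per completed iteration (lower is better).
bool should_stop(const std::vector<double>& val_loss,
                 int n_iter_no_change, double tol) {
    const std::size_t window = static_cast<std::size_t>(n_iter_no_change);
    // Not enough history yet to compare against a reference iteration.
    if (val_loss.size() < window + 1) return false;

    const std::size_t ref_idx = val_loss.size() - window - 1;
    const double reference = val_loss[ref_idx] - tol;

    // Keep training if any recent iteration beat the reference by more than tol.
    for (std::size_t i = ref_idx + 1; i < val_loss.size(); ++i) {
        if (val_loss[i] < reference) return false;
    }
    return true;
}
```

Under this rule a larger `tol` or a smaller `n_iter_no_change` makes stopping more aggressive.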
Methods
| Method | Description |
|---|---|
| fit(X, y) | Bin features, then boost. |
| predict(X) | Boosted prediction. |
| score(X, y) | R². |
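
For reference, the R² reported by score is the coefficient of determination, $1 - SS_{res}/SS_{tot}$. The standalone sketch below computes it on plain vectors; `r2_score` is a hypothetical helper, and its handling of a constant target (returning 1 for a perfect fit and 0 otherwise) is a convention chosen for this example, not confirmed library behaviour.

```cpp
#include <cstddef>
#include <vector>

// Coefficient of determination: 1 - SS_res / SS_tot.
// Hypothetical helper for illustration; not part of the Skigen API.
double r2_score(const std::vector<double>& y_true,
                const std::vector<double>& y_pred) {
    double mean = 0.0;
    for (double v : y_true) mean += v;
    mean /= static_cast<double>(y_true.size());

    double ss_res = 0.0, ss_tot = 0.0;
    for (std::size_t i = 0; i < y_true.size(); ++i) {
        const double r = y_true[i] - y_pred[i];
        const double d = y_true[i] - mean;
        ss_res += r * r;
        ss_tot += d * d;
    }
    // Constant target: the ratio is undefined, so fall back to a convention.
    if (ss_tot == 0.0) return ss_res == 0.0 ? 1.0 : 0.0;
    return 1.0 - ss_res / ss_tot;
}
```

A model that always predicts the mean of `y_true` scores 0, and a perfect fit scores 1; the value can be negative for models worse than the mean.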
Fitted Attributes
| Accessor | Description |
|---|---|
| bin_edges() | Per-feature quantile bin edges. |
| train_score() | Per-iteration training MSE. |
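
To show how quantile bin edges relate to the uint8 bin codes, here is a minimal sketch for one feature column. The quantile rule (nearest rank on the sorted values) and the lookup convention (a value equal to an edge falls into the upper bin) are assumptions made for this example; `quantile_bin_edges` and `to_bin` are hypothetical names, and missing values are not handled.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Pick up to max_bins - 1 strictly increasing edges at evenly spaced ranks.
// Hypothetical helper for illustration; not part of the Skigen API.
std::vector<double> quantile_bin_edges(std::vector<double> values, int max_bins) {
    std::vector<double> edges;
    if (values.empty()) return edges;
    std::sort(values.begin(), values.end());
    const std::size_t n = values.size();
    for (int k = 1; k < max_bins; ++k) {
        // k < max_bins guarantees idx < n.
        const std::size_t idx =
            (static_cast<std::size_t>(k) * n) / static_cast<std::size_t>(max_bins);
        const double e = values[idx];
        // Skip duplicates and edges that would leave the first bin empty.
        if (e > values.front() && (edges.empty() || e > edges.back())) {
            edges.push_back(e);
        }
    }
    return edges;
}

// Bin code = number of edges that are <= x, found by binary search.
std::uint8_t to_bin(const std::vector<double>& edges, double x) {
    const auto it = std::upper_bound(edges.begin(), edges.end(), x);
    return static_cast<std::uint8_t>(it - edges.begin());
}
```

With `max_bins` capped at 255 there are at most 254 edges, so every bin code fits in a `std::uint8_t`.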
Example
Skigen::HistGradientBoostingRegressor<double> gb;
gb.fit(X, y);
auto preds = gb.predict(X_test);
This estimator is checked by the parity suite. See the generator tests/parity/generate_ensemble_reference.py and the reference fixtures in tests/parity/data/hist_gradient_boosting_regressor/, exercised by tests/parity/parity_ensemble.cpp.
For full signatures see the HistGradientBoostingRegressor API Reference.