# LogisticRegression

$\ell_2$-regularized logistic regression for binary and multiclass classification, solved via Iteratively Reweighted Least Squares (IRLS).

## Model

The predicted probability that sample $x$ belongs to the positive class is given by the logistic (sigmoid) function:

$$P(y = 1 \mid x) = \sigma(w^\top x + b) = \frac{1}{1 + e^{-(w^\top x + b)}}$$

Decision boundary: a sample is assigned to the positive class when $P(y = 1 \mid x) \ge 0.5$, i.e., when $w^\top x + b \ge 0$.
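
As a minimal sketch (not Skigen's internal code), the sigmoid and the resulting decision rule can be written as:

```cpp
#include <Eigen/Dense>
#include <cmath>

// Logistic (sigmoid) function applied to a linear score s = w^T x + b.
double sigmoid(double s) {
    return 1.0 / (1.0 + std::exp(-s));
}

// Decision rule: positive class iff the score is non-negative, which is
// equivalent to sigmoid(s) >= 0.5.
int predict_label(const Eigen::VectorXd& w, double b, const Eigen::VectorXd& x) {
    return (w.dot(x) + b >= 0.0) ? 1 : 0;
}
```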

## Objective Function

$$\min_{w}\; -\frac{1}{n} \sum_{i=1}^{n} \left[ y_i \log \hat{p}_i + (1 - y_i) \log(1 - \hat{p}_i) \right] + \frac{1}{2C} \|w\|_2^2$$

where $\hat{p}_i = \sigma(w^\top x_i + b)$ and $C > 0$ is the inverse regularization strength. Larger $C$ means less regularization. This formulation matches scikit-learn's LogisticRegression.
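
For reference, a direct (unoptimized) evaluation of this objective could look like the sketch below; `objective` is an illustrative helper, not part of the Skigen API, and probabilities are clamped to avoid $\log 0$:

```cpp
#include <Eigen/Dense>
#include <algorithm>
#include <cmath>

// Mean binary cross-entropy plus (1 / 2C) * ||w||^2; the intercept b is
// conventionally left unpenalized, as in scikit-learn.
double objective(const Eigen::MatrixXd& X, const Eigen::VectorXd& y,
                 const Eigen::VectorXd& w, double b, double C) {
    const Eigen::Index n = X.rows();
    double nll = 0.0;
    for (Eigen::Index i = 0; i < n; ++i) {
        double p = 1.0 / (1.0 + std::exp(-(X.row(i).dot(w) + b)));
        p = std::min(std::max(p, 1e-12), 1.0 - 1e-12);  // clamp away from {0, 1}
        nll -= y(i) * std::log(p) + (1.0 - y(i)) * std::log(1.0 - p);
    }
    return nll / static_cast<double>(n) + w.squaredNorm() / (2.0 * C);
}
```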

## IRLS Solver

Skigen uses Iteratively Reweighted Least Squares (a Newton-type method) to solve the logistic regression problem. At each iteration:

  1. Compute the predicted probabilities: $\hat{p}_i = \sigma(w^\top x_i + b)$
  2. Compute the diagonal weight matrix: $S_{ii} = \hat{p}_i (1 - \hat{p}_i)$
  3. Compute the working response: $z_i = w^\top x_i + (y_i - \hat{p}_i) / S_{ii}$
  4. Solve the weighted least squares problem: $w \leftarrow \left(X^\top S X + \tfrac{1}{C} I\right)^{-1} X^\top S z$

Because the weight matrix $S$ is diagonal, each iteration reduces to an ordinary weighted least-squares solve, and as a Newton-type method IRLS converges quadratically near the optimum. The sigmoid computation is numerically stabilized to avoid overflow for large inputs. A sketch of a single update is shown below.
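
As an illustration only (not Skigen's actual implementation), one IRLS update for the penalized objective above might look like this; `stable_sigmoid` and `irls_step` are hypothetical helper names, and the intercept is assumed to be absorbed into $X$ as a constant column:

```cpp
#include <Eigen/Dense>
#include <algorithm>
#include <cmath>

// Numerically stable sigmoid: never calls exp() on a large positive argument.
double stable_sigmoid(double s) {
    if (s >= 0.0) return 1.0 / (1.0 + std::exp(-s));
    double e = std::exp(s);
    return e / (1.0 + e);
}

// One IRLS / Newton update. Sketch only: the intercept is absorbed by
// appending a constant column to X beforehand (which also penalizes it,
// a simplification), and S_ii is floored to keep z finite.
Eigen::VectorXd irls_step(const Eigen::MatrixXd& X, const Eigen::VectorXd& y,
                          const Eigen::VectorXd& w, double C) {
    const Eigen::Index n = X.rows(), d = X.cols();
    Eigen::VectorXd eta = X * w;                        // linear scores
    Eigen::VectorXd p(n), s(n), z(n);
    for (Eigen::Index i = 0; i < n; ++i) {
        p(i) = stable_sigmoid(eta(i));
        s(i) = std::max(p(i) * (1.0 - p(i)), 1e-10);    // S_ii, floored
        z(i) = eta(i) + (y(i) - p(i)) / s(i);           // working response
    }
    Eigen::MatrixXd H = X.transpose() * s.asDiagonal() * X
                      + Eigen::MatrixXd::Identity(d, d) / C;
    return H.ldlt().solve(X.transpose() * (s.asDiagonal() * z)); // Newton system
}
```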

## Multiclass Classification

For more than two classes, Skigen uses the One-vs-Rest (OvR) strategy: a separate binary classifier is fitted for each class against all others. The class with the highest predicted probability is selected.
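
The class-selection step can be sketched as an argmax over per-class probabilities; `ovr_predict` below is illustrative only, where `probas` holds one column of positive-class probabilities per binary classifier:

```cpp
#include <Eigen/Dense>

// Pick, for each sample (row), the class (column) with the largest
// One-vs-Rest probability.
Eigen::VectorXi ovr_predict(const Eigen::MatrixXd& probas) {
    Eigen::VectorXi labels(probas.rows());
    for (Eigen::Index i = 0; i < probas.rows(); ++i) {
        Eigen::Index best;
        probas.row(i).maxCoeff(&best);   // argmax over classes
        labels(i) = static_cast<int>(best);
    }
    return labels;
}
```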

## Constructor

```cpp
Skigen::LogisticRegression<Scalar> model(Scalar C = 1, bool fit_intercept = true,
                                         int max_iter = 100, Scalar tol = 1e-4);
```

| Parameter | Default | Description |
| --- | --- | --- |
| `C` | `1` | Inverse regularization strength ($C > 0$) |
| `fit_intercept` | `true` | Whether to compute an intercept term |
| `max_iter` | `100` | Maximum IRLS iterations |
| `tol` | `1e-4` | Convergence tolerance on the log-likelihood |
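
For example, assuming `Scalar = double`, a model with non-default settings can be constructed positionally:

```cpp
#include <Skigen/LinearModel>

// Stronger regularization (smaller C), no intercept, tighter tolerance.
Skigen::LogisticRegression<double> model(/*C=*/0.1, /*fit_intercept=*/false,
                                         /*max_iter=*/200, /*tol=*/1e-6);
```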

## Methods

| Method | Description |
| --- | --- |
| `fit(X, y)` | Fit the classifier via IRLS |
| `predict(X)` | Predict class labels |
| `predict_proba(X)` | Predict class probabilities |
| `score(X, y)` | Return classification accuracy |

## Fitted Attributes

| Accessor | Type | Description |
| --- | --- | --- |
| `coef()` | `MatrixType` | Coefficient matrix (one row per class in OvR) |
| `intercept()` | `VectorType` | Intercept vector |
| `classes()` | `std::vector<int>` | Unique class labels |
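
A small illustrative helper (not part of the Skigen API) showing how the fitted attributes might be inspected after `fit`:

```cpp
#include <Skigen/LinearModel>
#include <iostream>

// Print the fitted parameters of an already-fitted model.
void print_fitted(const Skigen::LogisticRegression<double>& model) {
    std::cout << "coef:\n" << model.coef() << "\n";        // one row per class in OvR
    std::cout << "intercept: " << model.intercept().transpose() << "\n";
    for (int c : model.classes()) std::cout << "class " << c << "\n";
}
```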

## Example

```cpp
#include <Skigen/LinearModel>
#include <Eigen/Dense>
#include <iostream>

int main() {
    // Six 2-D samples: the first three belong to class 0, the last three to class 1.
    Eigen::MatrixXd X(6, 2);
    X << 1, 2,  2, 3,  3, 4,  5, 6,  6, 7,  7, 8;
    Eigen::VectorXi y(6);
    y << 0, 0, 0, 1, 1, 1;

    Skigen::LogisticRegression<double> model(/*C=*/1.0);
    model.fit(X, y);

    auto predictions = model.predict(X);
    std::cout << "Accuracy: " << model.score(X, y) << "\n";

    auto proba = model.predict_proba(X);
    std::cout << "Probabilities:\n" << proba << "\n";

    return 0;
}
```