LinearRegression

#include <Skigen/LinearModel>

template <typename Scalar = double>
class Skigen::LinearRegression(fit_intercept=true)

Ordinary least squares Linear Regression.

LinearRegression fits a linear model with coefficients $w = (w_1, \ldots, w_p)$ that minimizes the residual sum of squares between the observed targets in the dataset and the targets predicted by the linear approximation:

$$\hat{w} = \arg\min_w \|Xw - y\|_2^2$$

Solves via ColPivHouseholderQR decomposition. When fit_intercept is true, data is centered before solving.
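To make the centring step concrete, here is a minimal sketch for the single-feature case. It is not Skigen's implementation: it uses the closed-form slope instead of a QR decomposition, depends only on the standard library, and the names `Line` and `fit_centered_1d` are hypothetical. The point it illustrates is that the slope is solved on centred data and the intercept is recovered afterwards as $b = \bar{y} - w\,\bar{x}$.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper type for this sketch; not part of Skigen.
struct Line {
    double slope;
    double intercept;
};

// Fit y ≈ slope * x + intercept by centring x and y first.
// Closed-form single-feature OLS, used here in place of a QR solve.
Line fit_centered_1d(const std::vector<double>& x, const std::vector<double>& y) {
    assert(x.size() == y.size() && !x.empty());
    const double n = static_cast<double>(x.size());
    const double x_mean = std::accumulate(x.begin(), x.end(), 0.0) / n;
    const double y_mean = std::accumulate(y.begin(), y.end(), 0.0) / n;

    double sxy = 0.0;  // sum of (x_i - x_mean) * (y_i - y_mean)
    double sxx = 0.0;  // sum of (x_i - x_mean)^2
    for (std::size_t i = 0; i < x.size(); ++i) {
        const double xc = x[i] - x_mean;
        const double yc = y[i] - y_mean;
        sxy += xc * yc;
        sxx += xc * xc;
    }
    assert(sxx > 0.0);  // a constant feature has no unique slope

    const double slope = sxy / sxx;                    // solve the centred problem
    const double intercept = y_mean - slope * x_mean;  // recover the intercept
    return {slope, intercept};
}
```

For example, calling `fit_centered_1d` on x = (1, 2, 3, 4) and y = (3, 5, 7, 9) recovers the line y = 2x + 1.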

Mirrors sklearn.linear_model.LinearRegression.

Methods

SKIGEN_PARAMS()

Fit the model using Ordinary Least Squares.

Uses ColPivHouseholderQR decomposition. Centers data when fit_intercept is true.

Parameters:

X Design matrix of shape (n_samples, n_features).
y Target vector of shape (n_samples,). Will be cast to Scalar if necessary.

Returns:

result Reference to the fitted estimator (*this).

fit(X, y)

Fit OLS on a sparse design matrix without densifying X.

Solves the normal equations $(X_c^\top X_c) w = X_c^\top y_c$ (centred when fit_intercept=true) via the implicit-centring identities used by Ridge::fit(SparseMatrix, ...). Uses an $LDL^\top$ factorisation (a Cholesky variant that tolerates positive semi-definite matrices) so that a rank-deficient $X^\top X$ is handled gracefully. sklearn falls back to LSQR for sparse OLS, which is also robust to rank deficiency.

Mirrors sklearn's LinearRegression.fit behaviour on sparse input. The sample_weight, positive, and n_jobs options are not honoured.
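The source of Ridge::fit is not shown here, so the following is an assumption about what "implicit centring" refers to: the standard identities that express the centred normal equations in terms of the uncentred sparse products. With $n$ samples, column means $\bar{x} = \frac{1}{n} X^\top \mathbf{1}$, target mean $\bar{y} = \frac{1}{n} \mathbf{1}^\top y$, $X_c = X - \mathbf{1}\bar{x}^\top$ and $y_c = y - \bar{y}\,\mathbf{1}$:

$$X_c^\top X_c = X^\top X - n\,\bar{x}\,\bar{x}^\top, \qquad X_c^\top y_c = X^\top y - n\,\bar{x}\,\bar{y}$$

Both right-hand sides need only $X^\top X$, $X^\top y$, and the means, so the dense matrix $X_c$ is never formed and the sparsity of $X$ is preserved. The intercept then follows as $b = \bar{y} - \bar{x}^\top w$.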

fit_multi(X, Y)

Fit OLS with a multi-target response matrix.

Y has shape (n_samples, n_targets). For each target column the OLS solution is computed via the same QR factorisation, exploiting the shared design matrix. The resulting coefficient matrix has shape (n_targets, n_features); the intercept is a vector of length n_targets.

This overload is additive to the single-target API: the coef() / intercept() accessors continue to return the single-target row vector / scalar (extracted from the first column of Y when this overload was used). Multi-target callers use the new accessors coef_matrix() / intercept_vector() and the predict_multi() method.

Mirrors sklearn's LinearRegression.fit(X, Y) behaviour for multi-output regression. The sample_weight, positive, and n_jobs options are not implemented.
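The reason one factorisation can serve every target is that the multi-target objective separates by column. Writing $W$ for the coefficient matrix of shape (n_targets, n_features), $w_t^\top$ for its $t$-th row, and $y_t$ for the $t$-th column of $Y$ (this notation is introduced here for illustration and is not taken from the library):

$$\|X W^\top - Y\|_F^2 = \sum_{t=1}^{n_{\text{targets}}} \|X w_t - y_t\|_2^2$$

Each term depends on a single row $w_t$, so every target is an independent OLS problem with the same design matrix $X$. Under the same notation, multi-target predictions take the form $\hat{Y} = X W^\top + \mathbf{1} b^\top$, where $b$ is the intercept vector of length n_targets.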

predict_multi(X)

Predict multi-target outputs of shape (n_samples, n_targets).

predict(X)

Predict using the linear model.

Computes $\hat{y} = X w + b$ where $w$ and $b$ are the fitted coefficients and intercept.

Parameters:

X : MatrixType Sample matrix of shape (n_samples, n_features).

Returns:

result : VectorType Predicted values of shape (n_samples,).

Throws:

std::runtime_error — if the model has not been fitted.
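The contract described above can be sketched without Eigen or Skigen. `ToyLinearModel` below is a hypothetical stand-in whose layout (a `fitted` flag, a coefficient vector, a scalar intercept) is an assumption for illustration, not Skigen's actual class. It computes $\hat{y}_i = x_i^\top w + b$ row by row and throws std::runtime_error when called before fitting.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for a fitted linear model; not Skigen's class layout.
struct ToyLinearModel {
    std::vector<double> coef;  // w, one entry per feature
    double intercept = 0.0;    // b
    bool fitted = false;       // set once coefficients are available

    // X is a list of rows, each holding coef.size() feature values.
    std::vector<double> predict(const std::vector<std::vector<double>>& X) const {
        if (!fitted) {
            throw std::runtime_error("ToyLinearModel: predict called before fit");
        }
        std::vector<double> y_hat;
        y_hat.reserve(X.size());
        for (const auto& row : X) {
            assert(row.size() == coef.size());
            double value = intercept;  // start from b, then add x_i . w
            for (std::size_t j = 0; j < coef.size(); ++j) {
                value += row[j] * coef[j];
            }
            y_hat.push_back(value);
        }
        return y_hat;
    }
};
```

Checking the fitted state before touching the coefficients keeps the failure mode explicit: an unfitted model raises instead of silently returning values computed from an empty coefficient vector.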

score(X, y)

Return the $R^2$ coefficient of determination on test data.

$R^2 = 1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2}$. The best possible score is 1.0; the score can be negative, because a model can be arbitrarily worse than always predicting the mean $\bar{y}$.

Parameters:

X : MatrixType Test samples of shape (n_samples, n_features).
y : VectorType True values of shape (n_samples,).

Returns:

result : ScalarType $R^2$ score.

Throws:

std::runtime_error — if the model has not been fitted.
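The formula can be checked by hand with a small standalone function. This is a sketch using only the standard library; `r2_score` is a hypothetical free function rather than the Skigen method, and it assumes the true targets are not all equal so that the denominator is nonzero.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// R^2 = 1 - SS_res / SS_tot for paired true and predicted values.
// Assumes y_true is not constant, so SS_tot > 0.
double r2_score(const std::vector<double>& y_true, const std::vector<double>& y_pred) {
    assert(y_true.size() == y_pred.size() && !y_true.empty());
    const double mean = std::accumulate(y_true.begin(), y_true.end(), 0.0) /
                        static_cast<double>(y_true.size());
    double ss_res = 0.0;  // residual sum of squares
    double ss_tot = 0.0;  // total sum of squares around the mean
    for (std::size_t i = 0; i < y_true.size(); ++i) {
        const double residual = y_true[i] - y_pred[i];
        const double deviation = y_true[i] - mean;
        ss_res += residual * residual;
        ss_tot += deviation * deviation;
    }
    assert(ss_tot > 0.0);
    return 1.0 - ss_res / ss_tot;
}
```

With true values (1, 2, 3), perfect predictions score 1, predicting the mean (2, 2, 2) scores 0, and the reversed predictions (3, 2, 1) score -3, which shows how a poor model drops below zero.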

Example

// Simple 2-feature dataset: y = -1*x1 + 3*x2 + 4
Eigen::MatrixXd X(6, 2);
X << 1, 1,
     1, 2,
     2, 2,
     2, 3,
     3, 3,
     3, 4;
Eigen::VectorXd y(6);
y << 6, 9, 8, 11, 10, 13;

Skigen::LinearRegression<double> model;
model.fit(X, y);

std::cout << std::fixed << std::setprecision(4);
std::cout << "=== Linear Regression ===\n";
std::cout << "Coefficients: " << model.coef() << "\n";
std::cout << "Intercept:    " << model.intercept() << "\n";
std::cout << "R² (train):   " << model.score(X, y) << "\n\n";
