LinearRegression
#include <Skigen/LinearModel>
template <typename Scalar = double>
class Skigen::LinearRegression(fit_intercept=true)
Ordinary least squares Linear Regression.
LinearRegression fits a linear model with coefficients $w$ to minimize the residual sum of squares between the observed targets in the dataset and the targets predicted by the linear approximation: $\min_{w} \lVert Xw - y \rVert_2^2$.
Solves via ColPivHouseholderQR decomposition. When fit_intercept is true, data is centered before solving.
Mirrors sklearn.linear_model.LinearRegression.
Read more in the User Guide.
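The centering step mentioned above can be written out explicitly. The following is the standard derivation for least squares with an intercept, not a description of Skigen's internal code. The symbols $\bar{x}$ (column means of $X$), $\bar{y}$ (mean of $y$), $\mathbf{1}$ (all-ones vector), and $b$ (intercept) are introduced here for illustration.

```latex
\begin{aligned}
(\hat{w}, \hat{b}) &= \arg\min_{w,\,b} \; \lVert y - Xw - b\,\mathbf{1} \rVert_2^2 \\
X_c &= X - \mathbf{1}\,\bar{x}^\top, \qquad y_c = y - \bar{y}\,\mathbf{1} \\
\hat{w} &= \arg\min_{w} \; \lVert y_c - X_c w \rVert_2^2 \\
\hat{b} &= \bar{y} - \bar{x}^\top \hat{w}
\end{aligned}
```

Under this reading, the QR factorisation is applied to the centered matrix $X_c$, and the intercept is recovered afterwards from the means.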
Parameters:
- fit_intercept : bool, default=true
Whether to calculate the intercept. If false, no intercept will be used (data is expected to be centered).
Attributes:
- fit_intercept : bool Whether an intercept is fitted.
- coef : RowVectorType Parameter vector (1 × n_features).
- intercept : Scalar Independent term in the decision function.
- rank : IndexType Numerical rank of the design matrix X.
- coef_matrix : MatrixType Coefficient matrix of shape (n_targets, n_features).
- intercept_vector : VectorType Intercept vector of length n_targets.
- n_targets : int Number of targets seen during the most recent fit / fit_multi call (1 for the single-target overload).
Methods
SKIGEN_PARAMS()
fit(X, y)
Fit the model using Ordinary Least Squares.
Uses ColPivHouseholderQR decomposition. Centers data when fit_intercept is true.
Parameters:
- X : Design matrix of shape (n_samples, n_features).
- y : Target vector of shape (n_samples,). Will be cast to Scalar if necessary.
Returns:
- result : Reference to the fitted estimator (*this).
fit(X, y)
Fit OLS on a sparse design matrix without densifying X.
Solves the normal equations (centred when fit_intercept=true) via the implicit-centring identities used by Ridge::fit(SparseMatrix, ...). Uses an LDL^T factorisation (a Cholesky variant that tolerates positive semi-definite matrices) so that a rank-deficient X^T X is handled gracefully. For comparison, sklearn falls back to LSQR for sparse OLS, which is also robust to rank deficiency.
Mirrors sklearn's LinearRegression.fit behaviour on sparse input. The sample_weight, positive, and n_jobs options are not honoured.
fit_multi(X, Y)
Fit OLS with a multi-target response matrix.
Y has shape (n_samples, n_targets). For each target column the OLS solution is computed via the same QR factorisation, exploiting the shared design matrix. The resulting coefficient matrix has shape (n_targets, n_features); the intercept is a vector of length n_targets.
This overload is additive to the single-target API: the coef() / intercept() accessors continue to return the single-target row vector / scalar (extracted from the first column of Y when this overload was used). Multi-target callers use the new accessors coef_matrix() / intercept_vector() and the predict_multi() method.
Mirrors sklearn's LinearRegression.fit(X, Y) behaviour for multi-output regression. The sample_weight, positive, and n_jobs options are not implemented.
predict_multi(X)
Predict multi-target outputs of shape (n_samples, n_targets).
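The shapes stated for fit_multi pin down the form of the multi-target model. As a sketch, write $W$ for the coefficient matrix of shape (n_targets, n_features), $b$ for the intercept vector of length n_targets, and $y^{(k)}$ for the $k$-th column of $Y$. The symbols $W$, $b$, $w^{(k)}$, and $X_c$ (the centered design matrix) are notation introduced here, not Skigen identifiers.

```latex
\begin{aligned}
\hat{w}^{(k)} &= \arg\min_{w} \; \lVert y^{(k)}_c - X_c\, w \rVert_2^2, \qquad k = 1, \dots, n_{\text{targets}} \\
W &= \begin{bmatrix} \hat{w}^{(1)} & \cdots & \hat{w}^{(n_{\text{targets}})} \end{bmatrix}^\top \\
\hat{Y} &= X W^\top + \mathbf{1}\, b^\top
\end{aligned}
```

Because every column shares the same $X_c$, a single factorisation of $X_c$ can serve all targets, which is the sharing that fit_multi is described as exploiting.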
predict(X)
Predict using the linear model.
Computes $\hat{y} = Xw + b$, where $w$ and $b$ are the fitted coefficients and intercept.
Parameters:
- X : MatrixType Sample matrix of shape (n_samples, n_features).
Returns:
- result : VectorType Predicted values of shape (n_samples,).
Throws:
- std::runtime_error : if the model has not been fitted.
score(X, y)
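Return the coefficient of determination $R^2$ on test data.
$R^2 = 1 - \frac{SS_{res}}{SS_{tot}}$, where $SS_{res} = \sum_i (y_i - \hat{y}_i)^2$ and $SS_{tot} = \sum_i (y_i - \bar{y})^2$. Best possible score is 1.0. The score can be negative, because a model can be arbitrarily worse than always predicting the mean.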
Parameters:
- X : MatrixType Test samples of shape (n_samples, n_features).
- y : VectorType True values of shape (n_samples,).
Returns:
- result : ScalarType $R^2$ score.
Throws:
- std::runtime_error : if the model has not been fitted.
Example
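#include <iomanip>
#include <iostream>
#include <Eigen/Dense>
#include <Skigen/LinearModel>

int main() {
    // Simple 2-feature dataset generated exactly by y = -1*x1 + 3*x2 + 4
    Eigen::MatrixXd X(6, 2);
    X << 1, 1,
         1, 2,
         2, 2,
         2, 3,
         3, 3,
         3, 4;
    Eigen::VectorXd y(6);
    y << 6, 9, 8, 11, 10, 13;

    // Fit ordinary least squares with the default fit_intercept=true
    Skigen::LinearRegression<double> model;
    model.fit(X, y);

    // Report the fitted parameters and the training R² with 4 decimals
    std::cout << std::fixed << std::setprecision(4);
    std::cout << "=== Linear Regression ===\n";
    std::cout << "Coefficients: " << model.coef() << "\n";
    std::cout << "Intercept: " << model.intercept() << "\n";
    std::cout << "R² (train): " << model.score(X, y) << "\n\n";
    return 0;
}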