SGDClassifier
#include <Skigen/LinearModel>
template <typename Scalar = double>
class Skigen::SGDClassifier(loss=Loss::Hinge, alpha=1e-4, max_iter=1000, tol=1e-3, eta0=0.01, random_state=42)
Linear classifier fitted by minimizing a regularized empirical loss with SGD.
SGDClassifier implements a plain Stochastic Gradient Descent learning routine that supports hinge loss (linear SVM) and log loss (logistic regression). Binary classification uses a single weight vector; multiclass is handled via one-vs-rest.
Mirrors sklearn.linear_model.SGDClassifier.
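To make the two losses concrete, here is a minimal sketch of one per-sample SGD update for a binary problem with labels in {-1, +1}. It is not taken from Skigen's source: the names `LossKind` and `sgd_step`, the L2 form of the penalty, the fixed step size `eta`, and the unregularized intercept are all assumptions made for illustration.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical names for illustration; not part of the Skigen API.
enum class LossKind { Hinge, Log };

// One SGD update on a single sample (x, y) with y in {-1, +1}.
// Assumes an L2 penalty (alpha / 2) * ||w||^2 and a fixed step size eta.
void sgd_step(std::vector<double>& w, double& b,
              const std::vector<double>& x, int y,
              LossKind loss, double alpha, double eta) {
    // Decision value f(x) = w . x + b
    double f = b;
    for (std::size_t j = 0; j < w.size(); ++j) f += w[j] * x[j];

    // Derivative of the loss with respect to f
    const double margin = y * f;
    double dloss = 0.0;
    if (loss == LossKind::Hinge) {
        // hinge: max(0, 1 - y f); the subgradient is -y inside the margin
        if (margin < 1.0) dloss = -y;
    } else {
        // log: log(1 + exp(-y f)); the derivative is -y / (1 + exp(y f))
        dloss = -y / (1.0 + std::exp(margin));
    }

    // Gradient step: the penalty shrinks w, the loss term moves it along x
    for (std::size_t j = 0; j < w.size(); ++j)
        w[j] -= eta * (alpha * w[j] + dloss * x[j]);
    b -= eta * dloss;  // the intercept is left unregularized in this sketch
}
```

Applying `sgd_step` once to every sample in a pass over the training data corresponds to one epoch. With the hinge loss, samples that already satisfy the margin contribute only the shrinkage term, whereas the log loss always contributes a nonzero gradient.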
Parameters:
- loss : Loss, default=Loss::Hinge The loss function (Loss::Hinge or Loss::Log).
- alpha : Scalar, default=1e-4 Regularization constant.
- max_iter : int, default=1000 Maximum number of epochs.
- tol : Scalar, default=1e-3 Stopping tolerance.
- eta0 : Scalar, default=0.01 Initial learning rate.
- random_state : unsigned int, default=42 RNG seed.
Attributes:
- is_fitted : bool Whether the estimator has been fitted.
- coef : MatrixType Coefficient matrix (n_classes × n_features or 1 × n_features).
- intercept : VectorType Intercept (bias) vector of shape (n_classes,) or (1,).
Methods
fit(X, y)
Fit the linear model with SGD.
Discovers the unique classes in y, then trains with stochastic gradient descent using the chosen loss function: a single weight vector for two classes, or one binary classifier per class (one-vs-rest) otherwise.
Parameters:
- X : MatrixType Training matrix of shape (n_samples, n_features).
- y : IndexVector Target vector of shape (n_samples,) with integer class labels.
Returns:
- result : SGDClassifier Reference to the fitted estimator (*this).
Throws:
- std::invalid_argument if X and y have inconsistent lengths.
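The class discovery and one-vs-rest relabeling that fit performs can be sketched as follows. The helpers `unique_classes` and `ovr_targets` are hypothetical, and both the sorted class order and the +1/-1 target encoding are assumptions rather than documented Skigen behavior.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helpers for illustration; not part of the Skigen API.

// Sorted list of the distinct labels in y (y is taken by value so it can be sorted).
std::vector<int> unique_classes(std::vector<int> y) {
    std::sort(y.begin(), y.end());
    y.erase(std::unique(y.begin(), y.end()), y.end());
    return y;
}

// Binary targets for one one-vs-rest subproblem:
// +1 where the label equals positive_class, -1 elsewhere.
std::vector<int> ovr_targets(const std::vector<int>& y, int positive_class) {
    std::vector<int> t(y.size());
    for (std::size_t i = 0; i < y.size(); ++i)
        t[i] = (y[i] == positive_class) ? 1 : -1;
    return t;
}
```

In a one-vs-rest scheme, each class returned by `unique_classes` gets its own binary problem built with `ovr_targets`, and each problem yields one row of the coefficient matrix.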
predict(X)
Predict class labels for samples in X.
Parameters:
- X : MatrixType Sample matrix of shape (n_samples, n_features).
Returns:
- result : IndexVector Integer vector of predicted class labels (n_samples,).
Throws:
- std::runtime_error if the model has not been fitted.
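One way a linear classifier with the coefficient layout described under Attributes can turn decision values into labels is sketched below. The helper `predict_one`, the `classes` vector, and the convention that a positive score maps to the second class in the binary case are assumptions for illustration, not confirmed details of Skigen's predict.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper for illustration; not part of the Skigen API.
// coef holds one row per binary classifier, intercept one entry per row,
// and classes the labels in the order used during training.
int predict_one(const std::vector<std::vector<double>>& coef,
                const std::vector<double>& intercept,
                const std::vector<int>& classes,
                const std::vector<double>& x) {
    // Decision value of every row: coef[k] . x + intercept[k]
    std::vector<double> score(coef.size());
    for (std::size_t k = 0; k < coef.size(); ++k) {
        score[k] = intercept[k];
        for (std::size_t j = 0; j < x.size(); ++j) score[k] += coef[k][j] * x[j];
    }

    // Binary case: a single row, so the sign of the score picks the class.
    if (coef.size() == 1) return score[0] > 0.0 ? classes[1] : classes[0];

    // Multiclass case: the one-vs-rest classifier with the largest score wins.
    std::size_t best = 0;
    for (std::size_t k = 1; k < score.size(); ++k)
        if (score[k] > score[best]) best = k;
    return classes[best];
}
```

Applying `predict_one` to each row of X would give the integer label vector of length n_samples.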
score(X, y)
Return the mean accuracy on the given test data and labels.
Parameters:
- X : MatrixType Test samples of shape (n_samples, n_features).
- y : IndexVector True class labels of shape (n_samples,).
Returns:
- result : Scalar Mean accuracy (fraction of correctly classified samples).
Example
// Assumes `split` holds X_train, y_train, X_test, y_test from an earlier train/test split.
// SGD with hinge loss (SVM-like)
Skigen::SGDClassifier<double> svm(Skigen::SGDClassifier<double>::Loss::Hinge);
svm.fit(split.X_train, split.y_train);
auto svm_pred = svm.predict(split.X_test);
std::cout << "=== SGD Classifier (Hinge Loss) ===\n";
std::cout << "Accuracy: " << Skigen::Metrics::accuracy_score(split.y_test, svm_pred) << "\n\n";
// SGD with log loss (logistic regression-like)
Skigen::SGDClassifier<double> log_clf(Skigen::SGDClassifier<double>::Loss::Log);
log_clf.fit(split.X_train, split.y_train);
auto log_pred = log_clf.predict(split.X_test);
std::cout << "=== SGD Classifier (Log Loss) ===\n";
std::cout << "Accuracy: " << Skigen::Metrics::accuracy_score(split.y_test, log_pred) << "\n\n";