SGDClassifier

#include <Skigen/LinearModel>

template <typename Scalar = double>
class Skigen::SGDClassifier(loss=Loss::Hinge, alpha=1e-4, max_iter=1000, tol=1e-3, eta0=0.01, random_state=42)

Linear classifier fitted by minimizing a regularized empirical loss with SGD.

SGDClassifier implements a plain stochastic gradient descent (SGD) learning routine that supports hinge loss (giving a linear SVM) and log loss (giving logistic regression). Binary classification uses a single weight vector; multiclass classification is handled via one-vs-rest (OvR).

Mirrors sklearn.linear_model.SGDClassifier.
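
To make the two loss functions concrete, the following is a minimal standalone sketch of one SGD update on a single sample with labels in {-1, +1}. It is not the Skigen implementation: the names `loss_gradient` and `sgd_step`, the L2 penalty (alpha/2)·||w||², the unregularized intercept, and the fixed step size `eta` are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the library's loss enum (illustration only).
enum class Loss { Hinge, Log };

// Derivative of the loss with respect to the score s = w.x + b,
// for a label y in {-1, +1}.
double loss_gradient(Loss loss, double score, int y) {
    const double margin = y * score;
    if (loss == Loss::Hinge) {
        // hinge: max(0, 1 - y*s); the subgradient is -y while the margin is below 1
        return margin < 1.0 ? -static_cast<double>(y) : 0.0;
    }
    // log loss: log(1 + exp(-y*s)); the derivative is -y / (1 + exp(y*s))
    return -static_cast<double>(y) / (1.0 + std::exp(margin));
}

// One SGD step on a single sample, assuming an L2 penalty (alpha/2)*||w||^2.
void sgd_step(Loss loss, std::vector<double>& w, double& b,
              const std::vector<double>& x, int y,
              double alpha, double eta) {
    assert(w.size() == x.size());
    double score = b;
    for (std::size_t j = 0; j < w.size(); ++j) score += w[j] * x[j];
    const double g = loss_gradient(loss, score, y);
    for (std::size_t j = 0; j < w.size(); ++j) {
        w[j] -= eta * (g * x[j] + alpha * w[j]);
    }
    b -= eta * g;  // the intercept is left unregularized in this sketch
}
```

With hinge loss, a sample whose margin is already at least 1 contributes no loss gradient, so only the penalty term moves the weights; with log loss every sample contributes a nonzero gradient.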


Parameters:

  • loss : Loss, default=Loss::Hinge The loss function, either Loss::Hinge or Loss::Log.

  • alpha : Scalar, default=1e-4 Regularization constant.

  • max_iter : int, default=1000 Maximum number of epochs.

  • tol : Scalar, default=1e-3 Stopping tolerance.

  • eta0 : Scalar, default=0.01 Initial learning rate.

  • random_state : unsigned int, default=42 RNG seed.


Attributes:

  • is_fitted : bool Whether the estimator has been fitted.

  • coef : MatrixType Coefficient matrix of shape (n_classes, n_features) for multiclass problems, or (1, n_features) for binary problems.

  • intercept : VectorType Intercept (bias) vector of shape (n_classes,) for multiclass problems, or (1,) for binary problems.


Methods

fit(X, y)

Fit the linear model with SGD.

Discovers the unique classes in y, then trains one binary classifier per class (one-vs-rest, OvR) using stochastic gradient descent with the chosen loss function. When y contains only two classes, a single weight vector is trained instead.

Parameters:

  • X : MatrixType Training matrix of shape (n_samples, n_features).

  • y : IndexVector Target vector of shape (n_samples,) with integer class labels.

Returns:

  • result : SGDClassifier Reference to the fitted estimator (*this).

Throws:

  • std::invalid_argument — if X and y have inconsistent lengths.
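
The class discovery and one-vs-rest relabeling that fit performs can be sketched in standalone C++ as below. The helper names `unique_classes` and `ovr_targets`, the sorted class order, and the ±1 target encoding are assumptions for illustration and are not part of the documented Skigen API.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Sorted unique class labels found in y (hypothetical helper).
std::vector<int> unique_classes(const std::vector<int>& y) {
    assert(!y.empty() && "y must contain at least one label");
    const std::set<int> seen(y.begin(), y.end());
    return std::vector<int>(seen.begin(), seen.end());
}

// Binary targets for one OvR subproblem:
// +1 for samples of `positive_class`, -1 for all other samples.
std::vector<int> ovr_targets(const std::vector<int>& y, int positive_class) {
    std::vector<int> targets(y.size());
    for (std::size_t i = 0; i < y.size(); ++i) {
        targets[i] = (y[i] == positive_class) ? 1 : -1;
    }
    return targets;
}
```

Each class returned by `unique_classes` would get its own target vector from `ovr_targets` and its own weight vector, which lines up with the (n_classes, n_features) coefficient shape listed under Attributes.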

predict(X)

Predict class labels for samples in X.

Parameters:

  • X : MatrixType Sample matrix of shape (n_samples, n_features).

Returns:

  • result : IndexVector Integer vector of predicted class labels (n_samples,).

Throws:

  • std::runtime_error — if the model has not been fitted.
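
As a rough illustration of how a one-vs-rest model turns linear scores into a label, the sketch below computes one score per row of the coefficient matrix and returns the class with the highest score. The function `predict_one`, the `Matrix` alias, the `classes` vector, and the binary rule (a positive score selects the second class, following the scikit-learn convention) are assumptions; the source does not state how Skigen resolves these details.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;  // row-major, one row per classifier

// Predict the label of a single sample x (hypothetical helper).
// coef has one row per OvR classifier, or a single row in the binary case.
int predict_one(const Matrix& coef, const std::vector<double>& intercept,
                const std::vector<int>& classes, const std::vector<double>& x) {
    assert(!coef.empty() && coef.size() == intercept.size());

    // Linear score for each classifier: coef[k] . x + intercept[k]
    std::vector<double> scores(coef.size());
    for (std::size_t k = 0; k < coef.size(); ++k) {
        assert(coef[k].size() == x.size());
        scores[k] = intercept[k];
        for (std::size_t j = 0; j < x.size(); ++j) scores[k] += coef[k][j] * x[j];
    }

    // Binary case: a single weight vector; the sign of the score decides.
    if (coef.size() == 1) {
        assert(classes.size() == 2);
        return scores[0] > 0.0 ? classes[1] : classes[0];
    }

    // Multiclass case: pick the class whose classifier scores highest.
    assert(classes.size() == coef.size());
    std::size_t best = 0;
    for (std::size_t k = 1; k < scores.size(); ++k) {
        if (scores[k] > scores[best]) best = k;
    }
    return classes[best];
}
```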

score(X, y)

Return the mean accuracy on the given test data and labels.

Parameters:

  • X : MatrixType Test samples of shape (n_samples, n_features).

  • y : IndexVector True class labels of shape (n_samples,).

Returns:

  • result : Scalar Mean accuracy (fraction of correctly classified samples).
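
Mean accuracy is the number of matching labels divided by the number of samples. A minimal standalone sketch follows; the name `mean_accuracy` and the use of `std::vector<int>` in place of IndexVector are assumptions for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Fraction of positions where the predicted label equals the true label.
double mean_accuracy(const std::vector<int>& y_true, const std::vector<int>& y_pred) {
    assert(!y_true.empty() && y_true.size() == y_pred.size());
    std::size_t correct = 0;
    for (std::size_t i = 0; i < y_true.size(); ++i) {
        if (y_true[i] == y_pred[i]) ++correct;
    }
    return static_cast<double>(correct) / static_cast<double>(y_true.size());
}
```

For example, true labels {0, 1, 2, 1} against predictions {0, 1, 1, 1} give 3 correct out of 4, so the accuracy is 0.75.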

Example

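// Assumes `split` was produced earlier by a train/test split and exposes
// X_train, y_train, X_test, and y_test. Printing requires <iostream>.

// SGD with hinge loss (SVM-like)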
Skigen::SGDClassifier<double> svm(Skigen::SGDClassifier<double>::Loss::Hinge);
svm.fit(split.X_train, split.y_train);
auto svm_pred = svm.predict(split.X_test);

std::cout << "=== SGD Classifier (Hinge Loss) ===\n";
std::cout << "Accuracy: " << Skigen::Metrics::accuracy_score(split.y_test, svm_pred) << "\n\n";

// SGD with log loss (logistic regression-like)
Skigen::SGDClassifier<double> log_clf(Skigen::SGDClassifier<double>::Loss::Log);
log_clf.fit(split.X_train, split.y_train);
auto log_pred = log_clf.predict(split.X_test);

std::cout << "=== SGD Classifier (Log Loss) ===\n";
std::cout << "Accuracy: " << Skigen::Metrics::accuracy_score(split.y_test, log_pred) << "\n\n";