MultinomialNB

Multinomial Naive Bayes for count features — the classic bag-of-words text classifier. Sparse-aware.

Algorithm

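The per-class feature log-probabilities are the logarithms of the Laplace/Lidstone-smoothed relative frequencies of each feature within the class. Prediction computes X · feature_log_prob + class_log_prior for every class and returns the class with the highest score. The estimator accepts Eigen::SparseMatrix input directly and supports incremental training through partial_fit.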
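The smoothing and scoring steps can be sketched in plain C++ without Eigen. This is an illustrative sketch, not Skigen's implementation: the helper names `smoothed_feature_log_prob`, `joint_log_likelihood`, and `argmax` are hypothetical, and the [class][feature] layout is an assumption borrowed from scikit-learn.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;  // row-major: [row][col]

// Hypothetical helper: smoothed log P(feature j | class c) from raw counts.
// counts[c][j] is the total count of feature j over training samples of class c.
Matrix smoothed_feature_log_prob(const Matrix& counts, double alpha) {
    Matrix log_prob(counts.size());
    for (std::size_t c = 0; c < counts.size(); ++c) {
        double total = 0.0;
        for (double n : counts[c]) total += n;
        // Lidstone smoothing adds alpha to every feature count.
        const double denom = total + alpha * static_cast<double>(counts[c].size());
        log_prob[c].reserve(counts[c].size());
        for (double n : counts[c]) log_prob[c].push_back(std::log((n + alpha) / denom));
    }
    return log_prob;
}

// Hypothetical helper: per-class score for one sample x,
// score[c] = class_log_prior[c] + sum_j x[j] * feature_log_prob[c][j].
std::vector<double> joint_log_likelihood(const std::vector<double>& x,
                                         const Matrix& feature_log_prob,
                                         const std::vector<double>& class_log_prior) {
    std::vector<double> score(class_log_prior);
    for (std::size_t c = 0; c < feature_log_prob.size(); ++c)
        for (std::size_t j = 0; j < x.size(); ++j)
            score[c] += x[j] * feature_log_prob[c][j];
    return score;
}

// MAP decision: index of the class with the highest score.
std::size_t argmax(const std::vector<double>& score) {
    std::size_t best = 0;
    for (std::size_t c = 1; c < score.size(); ++c)
        if (score[c] > score[best]) best = c;
    return best;
}
```

With alpha = 1 a feature never seen in a class still gets a non-zero probability, so a single unseen word cannot drive a class score to negative infinity.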

Constructor

Skigen::MultinomialNB<Scalar> model(Scalar alpha = 1.0, bool fit_prior = true, VectorType class_prior = {});

Parameters

| Parameter | Default | Description |
|---|---|---|
| alpha | 1.0 | Additive (Laplace/Lidstone) smoothing. |
| fit_prior | true | Learn class priors from data. |
| class_prior | empty | Optional fixed priors. |
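How fit_prior and class_prior interact is not spelled out above. The sketch below assumes the scikit-learn convention: explicit priors win, otherwise empirical class frequencies are used when fit_prior is true, and a uniform prior otherwise. The function `class_log_prior_from` is a hypothetical helper, and Skigen's actual precedence rules may differ.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (assumed scikit-learn semantics, not Skigen's code).
// class_count[c] is the number of training samples seen for class c.
// class_prior, when non-empty, is assumed to have one entry per class.
std::vector<double> class_log_prior_from(const std::vector<double>& class_count,
                                         bool fit_prior,
                                         const std::vector<double>& class_prior) {
    const std::size_t k = class_count.size();
    std::vector<double> out(k);
    if (!class_prior.empty()) {
        // Fixed priors supplied by the caller take precedence.
        for (std::size_t c = 0; c < k; ++c) out[c] = std::log(class_prior[c]);
    } else if (fit_prior) {
        // Empirical priors: relative class frequencies in the training data.
        double total = 0.0;
        for (double n : class_count) total += n;
        for (std::size_t c = 0; c < k; ++c) out[c] = std::log(class_count[c] / total);
    } else {
        // Uniform prior over the k classes.
        for (std::size_t c = 0; c < k; ++c) out[c] = -std::log(static_cast<double>(k));
    }
    return out;
}
```

Under these assumptions, 30 samples of one class and 10 of another give priors of 0.75 and 0.25 with fit_prior = true, and 0.5 each with fit_prior = false.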

Methods

| Method | Description |
|---|---|
| fit(X, y) | Accumulate class/feature counts. |
| partial_fit(X, y, classes) | Incremental update. |
| predict(X) | MAP class labels. |
| predict_proba(X) | Class posteriors. |
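Class posteriors are usually obtained by normalizing the per-class log scores with the log-sum-exp trick, which stays stable when the scores are large negative numbers, as they are for long documents. The sketch below shows that normalization for a single sample. `posteriors_from_scores` is a hypothetical helper, and whether Skigen's predict_proba uses exactly this approach is an assumption.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Hypothetical helper: turn per-class joint log-likelihoods (non-empty)
// into posterior probabilities that sum to 1.
std::vector<double> posteriors_from_scores(const std::vector<double>& score) {
    // Subtracting the maximum keeps every exponent <= 0 and avoids underflow to all zeros.
    const double max_score = *std::max_element(score.begin(), score.end());
    double sum = 0.0;
    for (double s : score) sum += std::exp(s - max_score);
    const double log_norm = max_score + std::log(sum);  // log of the sum of exp(score)

    std::vector<double> proba;
    proba.reserve(score.size());
    for (double s : score) proba.push_back(std::exp(s - log_norm));
    return proba;
}
```

Exponentiating scores such as -1000 and -1001 directly would underflow to zero in double precision, while the shifted form still recovers their ratio.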

Fitted Attributes

| Accessor | Description |
|---|---|
| feature_log_prob() | Smoothed log P(feature \| class). |
| class_log_prior() | Log class priors. |
| feature_count() | Accumulated per-class feature counts. |
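Because the model's sufficient statistics are plain counts, incremental training amounts to adding each new batch to the running totals. The sketch below illustrates that idea. `CountState` is a hypothetical type, and the use of integer class indices in place of arbitrary labels is an assumption made to keep the example short.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical running statistics for a multinomial Naive Bayes model.
struct CountState {
    std::vector<std::vector<double>> feature_count;  // [class][feature]
    std::vector<double> class_count;                 // samples seen per class

    // The number of classes must be known up front so the tables can be sized
    // before the first batch, even if that batch lacks some classes.
    CountState(std::size_t n_classes, std::size_t n_features)
        : feature_count(n_classes, std::vector<double>(n_features, 0.0)),
          class_count(n_classes, 0.0) {}

    // Add one batch: y[i] is the class index of row X[i].
    void accumulate(const std::vector<std::vector<double>>& X,
                    const std::vector<std::size_t>& y) {
        for (std::size_t i = 0; i < X.size(); ++i) {
            class_count[y[i]] += 1.0;
            for (std::size_t j = 0; j < X[i].size(); ++j)
                feature_count[y[i]][j] += X[i][j];
        }
    }
};
```

Feeding the data in several batches leaves the same totals as a single pass over all of it, so the smoothed log-probabilities derived from the counts do not depend on how the data was split.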

Example

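// X_counts, y, and X_test are assumed to be prepared by the caller.
Skigen::MultinomialNB<double> nb;   // defaults: alpha = 1.0, fit_prior = true
nb.fit(X_counts, y);                // X_counts holds count features, e.g. an Eigen::SparseMatrix
auto preds = nb.predict(X_test);    // MAP class label for each test sample

Verified against scikit-learn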

This estimator is checked by the parity suite. See the generator tests/parity/generate_naive_bayes_reference.py and the reference fixtures in tests/parity/data/multinomial_nb/, exercised by tests/parity/parity_naive_bayes.cpp.

API Reference

For full signatures see the MultinomialNB API Reference.