Estimators
NMF Estimators
The espm.estimators module implements different NMF algorithms.
The class espm.estimators.NMFEstimator is the abstract base class for all NMF algorithms. It implements the fit and transform methods. The fit method is implemented in the abstract class and calls the _iteration method, which is implemented in the child classes. The transform method is implemented in the child classes.
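As a quick orientation, here is a minimal sketch of the workflow with the concrete subclass SmoothNMF (documented below), on synthetic numpy data:
>>> import numpy as np
>>> from espm.estimators import SmoothNMF
>>> X = np.random.rand(50, 100)  # synthetic non-negative data, n=50 channels, p=100 pixels
>>> est = SmoothNMF(n_components=3, max_iter=50)
>>> GW = est.fit_transform(X)
>>> GW.shape
(50, 3)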
NMF Estimator
- class espm.estimators.NMFEstimator(n_components=2, init=None, tol=0.0001, max_iter=200, random_state=None, verbose=1, debug=False, l2=False, G=None, shape_2d=None, normalize=False, log_shift=1e-14, eval_print=10, true_D=None, true_H=None, fixed_H=None, fixed_W=None, hspy_comp=False, no_stop_criterion=False)[source]
Abstract class for NMF algorithms.
This abstract class espm.estimators.NMFEstimator is used to implement the different NMF algorithms. It solves problems of the form:
\[\dot{W}, \dot{H} = \arg \min_{W \geq \epsilon, H \geq \epsilon} \frac{1}{2} L(X, GWH) + R(W, H)\]
where \(X\) is the data matrix, \(G\) is a matrix of known values, \(W\) and \(H\) are the matrices to be learned, \(R\) is a regularization term and \(L\) is a loss function that represents the generalized KL divergence. As a reminder, the generalized KL divergence is defined as:
\[D_{GKL}(X || Y) = \sum_{i,j} X_{ij} \log \frac{X_{ij}}{Y_{ij}} - X_{ij} + Y_{ij}\]
where \(Y = GWH\). Since the terms involving only \(X\) do not depend on \(W\) and \(H\), they can be dropped, and we obtain the loss function:
\[L(X, Y) = \sum_{i,j} - X_{ij} \log (GWH)_{ij} + (GWH)_{ij}\]
The generalized KL divergence has the advantage of being zero when \(X = Y\), which is not the case for our loss. Therefore, we shift the loss function by a constant \(C\) such that it equals the generalized KL divergence. This constant is stored in the attribute
espm.estimators.NMFEstimator.const_KL_.
The loss function can also be set to the Frobenius norm. In this case, the loss function is:
\[L(X, Y) = \frac{1}{2} \sum_{i,j} (X_{ij} - Y_{ij})^2\]
While the code will work, it is not recommended to use the Frobenius norm as a loss function. This code is optimized for the KL divergence.
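To make the shift constant concrete, here is a small numpy check (illustrative only; the actual implementation also guards the logarithms with log_shift) that the loss plus the X-only constant \(C = \sum_{ij} X_{ij} (\log X_{ij} - 1)\) equals the generalized KL divergence:
>>> import numpy as np
>>> rng = np.random.default_rng(0)
>>> X, Y = rng.uniform(0.1, 1, (4, 5)), rng.uniform(0.1, 1, (4, 5))
>>> gkl = np.sum(X * np.log(X / Y) - X + Y)   # generalized KL divergence
>>> loss = np.sum(-X * np.log(Y) + Y)         # the loss above
>>> C = np.sum(X * np.log(X) - X)             # X-only constant, cf. const_KL_
>>> bool(np.isclose(gkl, loss + C))
True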
The size of:
\(X\) is \((n, p)\),
\(W\) is \((m, k)\),
\(H\) is \((k, p)\),
\(G\) is \((n, m)\).
The columns of the matrices \(H\) and \(X\) are assumed to be images. This is typically used for the smoothness regularization. The parameter shape_2d defines the shape of the images, i.e. shape_2d[0]*shape_2d[1] = p.
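For instance, with a hypothetical 16×16 pixel grid, shape_2d = (16, 16) gives p = 256, and each row of \(H\) can be reshaped back into an image:
>>> import numpy as np
>>> shape_2d = (16, 16)             # hypothetical image shape
>>> p = shape_2d[0] * shape_2d[1]   # p = 256
>>> H = np.random.rand(3, p)        # k = 3 abundance maps
>>> H[0].reshape(shape_2d).shape    # one row of H viewed as an image
(16, 16)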
- Parameters:
- n_components : int, default=2
Number of components, i.e. dimensionality of the latent space.
- init : str, default=None
Method used to initialize the procedure. The method uses the initialization of sklearn.decomposition. It can be imported using:
>>> from sklearn.decomposition._nmf import _initialize_nmf
- tol : float, default=1e-4
Tolerance of the stopping condition.
- max_iter : int, default=200
Maximum number of iterations before timing out.
- random_state : int, RandomState instance or None, default=None
Used to make the initialization reproducible.
- verbose : int, default=1
The verbosity level.
- debug : bool, default=False
If True, the algorithm will log more and perform more checks.
- l2 : bool, default=False
If True, the algorithm will use the l2 norm instead of the KL divergence.
- G : np.array, function or None, default=None
If np.array, it is the known matrix of the data. If function, it is a function that takes the data matrix as input and returns the known matrix (np.array). If None, G is assumed to be the identity matrix. See the sketch after this parameter list.
- shape_2d : tuple or None, default=None
If not None, it is the image shape of the columns of the matrices \(X\) and \(H\).
- normalize : bool, default=False
If True, the algorithm will normalize the data matrix \(X\).
- log_shift : float, default=1e-14
Lower bound for W and H, i.e. \(\epsilon\).
- eval_print : int, default=10
Number of iterations between each evaluation of the loss function.
- true_D : np.array or None, default=None
Ground truth for the matrix \(GW\). Used for evaluation purposes.
- true_H : np.array or None, default=None
Ground truth for the matrix \(H\). Used for evaluation purposes.
- fixed_H : np.array or None, default=None
If not None, it fixes the non-zero values of the matrix \(H\).
- fixed_W : np.array or None, default=None
If not None, it fixes the non-zero values of the matrix \(W\).
- no_stop_criterion : bool, default=False
If True, the algorithm will not stop when the stopping criterion is reached and will continue until max_iter is reached.
- hspy_comp : bool, default=False
If True, the algorithm will use the format compatible with hyperspy. Use this option if you run the algorithm with the decomposition method in hyperspy. For example:
>>> est = SmoothNMF(n_components=3, hspy_comp=True)
>>> out = spim.decomposition(algorithm=est, return_info=True)
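To illustrate the G parameter, here is a hedged sketch of both accepted forms; the Gaussian-peak basis is purely illustrative, not the library's own G builder:
>>> import numpy as np
>>> from espm.estimators import SmoothNMF
>>> channels = np.arange(50)[:, None]             # n = 50 spectral channels
>>> centers = np.array([10.0, 20.0, 30.0, 40.0])  # m = 4 known peak positions
>>> G = np.exp(-0.5 * ((channels - centers) / 2.0) ** 2)  # illustrative (n, m) basis
>>> est = SmoothNMF(n_components=3, G=G)            # G as an np.array
>>> est = SmoothNMF(n_components=3, G=lambda X: G)  # or as a function of the data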
- fit(X, y=None, **params)[source]
Learn an NMF model for the data X.
- Parameters:
- X : {array-like, sparse matrix} of shape (n_samples, n_features)
Data matrix to be decomposed.
- y : Ignored
Not used, present here for API consistency by convention.
- params : dict
Parameters passed to the fit_transform method.
- Returns:
- self
The model.
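Since fit returns the model, construction and fitting chain naturally; the fitted attributes W_ and H_ below are assumed sklearn-style names, not confirmed by this page:
>>> import numpy as np
>>> from espm.estimators import SmoothNMF
>>> X = np.random.rand(50, 100)
>>> est = SmoothNMF(n_components=3, max_iter=50).fit(X)  # fit returns self
>>> est.W_.shape, est.H_.shape  # assumed fitted attributes; G=None so m = n
((50, 3), (3, 100))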
- fit_transform(X, y=None, W=None, H=None)[source]
Learn an NMF model for the data X and return the transformed data. This is more efficient than calling fit followed by transform.
The size of:
\(X\) is \((n, p)\),
\(W\) is \((m, k)\),
\(H\) is \((k, p)\),
\(G\) is \((n, m)\).
- Parameters:
- X : {array-like, sparse matrix} of shape (n, p)
Data matrix to be decomposed.
- y : Ignored
Not used, present here for API consistency by convention.
- W : array-like of shape (m, k)
If specified, it is used as an initial guess for the solution.
- H : array-like of shape (k, p)
If specified, it is used as an initial guess for the solution.
- Returns:
- GW : ndarray of shape (n, k)
Transformed data.
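Passing explicit initial guesses is sketched below; the SmoothNMF variant of this method (documented further down) states that the guesses are used when init='custom', so that is assumed here:
>>> import numpy as np
>>> from espm.estimators import SmoothNMF
>>> n, p, k = 50, 100, 3
>>> X = np.random.rand(n, p)
>>> W0, H0 = np.random.rand(n, k), np.random.rand(k, p)  # with G=None, m = n
>>> est = SmoothNMF(n_components=k, init='custom', max_iter=50)
>>> GW = est.fit_transform(X, W=W0, H=H0)
>>> GW.shape
(50, 3)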
- inverse_transform(W)[source]
Transform data back to its original space.
- Parameters:
- W : {ndarray, sparse matrix} of shape (n_samples, n_components)
Transformed data matrix.
- Returns:
- X : {ndarray, sparse matrix} of shape (n_samples, n_features)
Data matrix of original shape.
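A hedged round-trip sketch, assuming (with G=None) that the output of fit_transform, which has the documented (n_samples, n_components) shape, is the natural input here:
>>> import numpy as np
>>> from espm.estimators import SmoothNMF
>>> X = np.random.rand(50, 100)
>>> est = SmoothNMF(n_components=3, max_iter=50)
>>> GW = est.fit_transform(X)          # shape (n_samples, n_components)
>>> X_hat = est.inverse_transform(GW)  # reconstruction, approximately GW @ H
>>> X_hat.shape == X.shape
True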
- loss(W, H, average=True, X=None)[source]
Loss function.
Compute the loss function for the given matrices W and H.
- Parameters:
- W : np.array
Matrix of shape (m, k).
- H : np.array
Matrix of shape (k, p).
- average : bool, default=True
If True, the loss is averaged over the number of elements of the matrices.
- X : np.array or None, default=None
If not None, it is the data matrix. If None, the data matrix stored in self.X_ is used.
- Returns:
- loss_ : float
Value of the loss function.
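A monitoring sketch; the fitted factor attributes W_ and H_ are assumed sklearn-style names (hypothetical if your version differs), and self.X_ is used when X is omitted, as stated above:
>>> import numpy as np
>>> from espm.estimators import SmoothNMF
>>> X = np.random.rand(50, 100)
>>> est = SmoothNMF(n_components=3, max_iter=50).fit(X)
>>> val = est.loss(est.W_, est.H_)                   # averaged, uses self.X_
>>> total = est.loss(est.W_, est.H_, average=False)  # un-averaged value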
- set_inverse_transform_request(*, W: bool | None | str = '$UNCHANGED$') → NMFEstimator
Request metadata passed to the inverse_transform method.
Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see the User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to inverse_transform if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to inverse_transform.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
New in version 1.3.
Note: This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a sklearn.pipeline.Pipeline. Otherwise it has no effect.
- Parameters:
- W : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED
Metadata routing for W parameter in inverse_transform.
- Returns:
- self : object
The updated object.
- const_KL_ = None
- loss_names_ = ['KL_div_loss']
SmoothNMF
- class espm.estimators.SmoothNMF(lambda_L=0.0, linesearch=False, mu=0, epsilon_reg=1, algo='log_surrogate', force_simplex=True, dicotomy_tol=1e-05, gamma=None, **kwargs)[source]
SmoothNMF - NMF with a smooth regularization term
We encourage you to read the example available in the documentation: https://espm.readthedocs.io/en/latest/introduction/notebooks/toy-problem.html
The corresponding notebook is available on github: https://github.com/adriente/espm/blob/main/notebooks/toy-ML.ipynb
The class SmoothNMF implements the regularized NMF algorithm. It solves problems of the form:
\[\dot{W}, \dot{H} = \arg \min_{W \geq \epsilon, H \geq \epsilon} D_{GKL}(X || GWH) + \lambda_L tr(H \Delta H^\top) + \mu \sum_{ij} \log(H_{ij} + \epsilon_{reg})\]
where:
\(D_{GKL}\) is the generalized KL divergence loss function, defined as:
\[D_{GKL}(X || Y) = \sum_{i,j} X_{ij} \log \frac{X_{ij}}{Y_{ij}} - X_{ij} + Y_{ij}\]
See the documentation of the class espm.estimators.NMFEstimator for more details.
\(\Delta\) is the Laplacian operator (it can be created using the function create_laplacian_matrix from the utils module).
\(\epsilon_{reg}\) is the slope of the log regularization/sparsity at 0 (you probably want to leave this at 1).
\(\lambda_L\) is a regularization parameter, which encourages smoothness in the columns of \(H\).
\(\mu\) is a regularization parameter, which is similar to an L1 sparsity penalty.
The size of:
\(X\) is \((n, p)\),
\(W\) is \((m, k)\),
\(H\) is \((k, p)\),
\(G\) is \((n, m)\).
The columns of the matrices H and X are assumed to be images, typically for the smoothness regularization. The parameter shape_2d defines the shape of the images, i.e., shape_2d[0]*shape_2d[1] = p.
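To see what the smoothness term rewards, the sketch below evaluates \(tr(H \Delta H^\top)\) with a hand-built 1D chain Laplacian as an illustrative stand-in (in practice, create_laplacian_matrix from the utils module builds the operator for the shape_2d grid):
>>> import numpy as np
>>> p = 64
>>> L = 2 * np.eye(p) - np.eye(p, k=1) - np.eye(p, k=-1)  # illustrative 1D chain Laplacian
>>> L[0, 0] = L[-1, -1] = 1                               # boundary nodes have one neighbor
>>> H_smooth = np.tile(np.linspace(0, 1, p), (3, 1))      # smooth rows
>>> H_rough = np.random.rand(3, p)                        # rough rows
>>> float(np.trace(H_smooth @ L @ H_smooth.T)) < float(np.trace(H_rough @ L @ H_rough.T))
True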
- Parameters:
- lambda_L : float, default=0.0
Regularization parameter for the smooth regularization term.
- linesearch : bool, default=False
If True, use a line search to find the step size.
- mu : float, default=0
Regularization parameter for the log regularization/sparsity term.
- epsilon_reg : float, default=1
Slope of the log regularization/sparsity at 0.
- algo : str, default="log_surrogate"
Algorithm to use for the smooth regularization term. Can be "log_surrogate", "l2_surrogate", or "projected_gradient".
- force_simplex : bool, default=True
If True, force the solution to be in the simplex.
- dicotomy_tol : float, default=1e-5
Tolerance for the dichotomy algorithm.
- gamma : float, default=None
Initial value for the step size. If None, it is set to the Lipschitz constant of the gradient.
- **kwargs : dict
Additional parameters for the NMFEstimator class.
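Putting the parameters together, here is a hedged end-to-end sketch on synthetic data (the regularization values are illustrative, not recommended settings):
>>> import numpy as np
>>> from espm.estimators import SmoothNMF
>>> shape_2d = (16, 16)  # hypothetical pixel grid
>>> X = np.random.rand(50, shape_2d[0] * shape_2d[1])
>>> est = SmoothNMF(n_components=3, lambda_L=1.0, mu=0.1,
...                 algo="log_surrogate", shape_2d=shape_2d, max_iter=100)
>>> GW = est.fit_transform(X)
>>> GW.shape
(50, 3)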
- fit_transform(X, y=None, W=None, H=None)[source]
Fit the model to the data X and return the transformed data.
The size of:
\(X\) is \((n, p)\),
\(W\) is \((m, k)\),
\(H\) is \((k, p)\),
\(G\) is \((n, m)\).
- Parameters:
- X : array-like of shape (n, p)
Data matrix to be decomposed.
- y : Ignored
Not used, present here for API consistency by convention.
- W : array-like of shape (m, k)
If init='custom', it is used as an initial guess for the solution.
- H : array-like of shape (k, p)
If init='custom', it is used as an initial guess for the solution.
- Returns:
- GW : ndarray of shape (n, k)
Transformed data.
- set_inverse_transform_request(*, W: bool | None | str = '$UNCHANGED$') → SmoothNMF
Request metadata passed to the inverse_transform method.
Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see the User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to inverse_transform if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to inverse_transform.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
New in version 1.3.
Note: This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a sklearn.pipeline.Pipeline. Otherwise it has no effect.
- Parameters:
- W : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED
Metadata routing for W parameter in inverse_transform.
- Returns:
- self : object
The updated object.
- loss_names_ = ['KL_div_loss', 'log_reg_loss', 'Lapl_reg_loss', 'gamma']