Measures

The espm.measures module implements the measures used for the matrix factorisation problem. In particular, it contains the losses and regularizers used in the espm.estimator module, as well as metrics to evaluate the results.

espm.measures.Frobenius_loss(X, W, H, average=False)[source]

Squared Frobenius norm of the difference between X and WH.

Compute the squared Frobenius norm (sum of the squared entries of a matrix) given \(X,W,H\):

\[\| X - WH \|_F^2 = \sum_{ij} \left| X_{ij} - (W H)_{ij} \right|^2\]
Parameters:
  • X (np.array 2D) – n x m matrix

  • W (np.array 2D) – n x k matrix

  • H (np.array 2D) – k x m matrix

  • average (boolean) – replace the sum with a mean, i.e., divide the result by n*m (default False)

Returns:

the squared Frobenius norm of X - WH (its mean over all entries if average is True)

Examples

>>> import numpy as np
>>> from espm.measures import Frobenius_loss
>>> X = np.array([[1, 1, -1], [2, 4, 5]])
>>> W = np.array([[1], [1]])
>>> H = np.array([[1, 2, 3]])
>>> Frobenius_loss(X, W, H)
    26
espm.measures.KL(X, Y, log_shift=1e-14, average=False)[source]

Generalized KL (Kullback–Leibler) divergence for two matrices

\[D_{KL}(X || Y) = \sum_{ij} X_{ij} \log (X / Y)_{ij} + (Y - X)_{ij}\]
Parameters:
  • X (np.array 2D) – n x m matrix

  • Y (np.array 2D) – n x m matrix

  • log_shift (float) – small constant to ensure the KL divergence does not explode (default value set in module esppy.conf)

  • average (boolean) – replace the sum with a mean, i.e., divide the result by n*m (default False)

Returns:

the generalized KL divergence between X and Y

Return type:

float
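
No example is given for KL. The following is a minimal NumPy sketch of the formula above, not the library's implementation. The helper name generalized_kl and the way log_shift enters the logarithm are assumptions made for illustration.

```python
import numpy as np

def generalized_kl(X, Y, log_shift=1e-14, average=False):
    """Hypothetical helper: X * log(X / Y) + (Y - X), summed over all entries."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    # Assumption: the shift is added to both arguments of the log so that
    # zero entries do not produce log(0) or a division by zero.
    terms = X * np.log((X + log_shift) / (Y + log_shift)) + (Y - X)
    return terms.mean() if average else terms.sum()

X = np.array([[1, 1, 1], [2, 4, 5]])
Y = np.array([[1, 2, 3], [1, 2, 3]])  # the product WH used in the KLdiv example
value = generalized_kl(X, Y)
print(value)
```

With these inputs the sketch should agree, up to the effect of the shift, with the value shown in the KLdiv example of this module.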

espm.measures.KL_loss_surrogate(X, W, H, Ht, log_shift=1e-14, average=False)[source]

Surrogate loss for the KL divergence.

espm.measures.KLdiv(X, D, H, log_shift=1e-14, average=False)[source]

Generalized KL (Kullback–Leibler) divergence

Compute the generalized KL divergence given \(X,W,H\):

\[D_{KL}(X || WH) = \sum_{ij} X_{ij} \log \left( X / (W H) \right)_{ij} + (W H - X)_{ij}\]
Parameters:
  • X (np.array 2D) – n x m matrix

  • W (np.array 2D) – n x k matrix

  • H (np.array 2D) – k x m matrix

  • log_shift (float) – small constant to ensure the KL divergence does not explode (default value set in module esppy.conf)

  • average (boolean) – replace the sum with a mean, i.e., divide the result by n*m (default False)

Returns:

the generalized KL divergence between X and WH

Return type:

float

Examples

>>> import numpy as np
>>> from espm.measures import KLdiv
>>> X = np.array([[1, 1, 1], [2, 4, 5]])
>>> W = np.array([[1], [1]])
>>> H = np.array([[1, 2, 3]])
>>> KLdiv(X, W, H)
    2.921251732961556
espm.measures.KLdiv_loss(X, W, H, log_shift=1e-14, average=False)[source]

Generalized KL (Kullback–Leibler) divergence loss

Compute the loss based on the generalized KL divergence given \(X,W,H\):

\[\sum_{ij} (W H)_{ij} - X_{ij} \log (W H)_{ij}\]

This does not contain all the terms of the KL divergence, only the ones depending on W and H.

Parameters:
  • X (np.array 2D) – n x m matrix

  • W (np.array 2D) – n x k matrix

  • H (np.array 2D) – k x m matrix

  • log_shift (float) – small constant to ensure the KL divergence does not explode (default value set in module esppy.conf)

  • average (boolean) – replace the sum with a mean, i.e., divide the result by n*m (default False)

Returns:

the value of the loss

Return type:

float

Examples

>>> import numpy as np
>>> from espm.measures import KLdiv, KLdiv_loss
>>> X = np.array([[1, 1, 1], [2, 4, 5]])
>>> W = np.array([[1], [1]])
>>> H = np.array([[1, 2, 3]])
>>> KLdiv_loss(X, W, H)
    1.9425903651915402
>>> KLdiv(X, W, H)
    2.921251732961556
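
The two values in this example differ by the terms of the divergence that depend only on X, namely \(\sum_{ij} X_{ij} \log X_{ij} - X_{ij}\). The following NumPy check of that arithmetic is written directly from the formulas rather than calling the library. It ignores log_shift, an assumption that only matters when X or WH contain zeros.

```python
import numpy as np

X = np.array([[1, 1, 1], [2, 4, 5]], dtype=float)
W = np.array([[1], [1]], dtype=float)
H = np.array([[1, 2, 3]], dtype=float)
WH = W @ H

# Full generalized KL divergence: X log(X / WH) + (WH - X)
full = np.sum(X * np.log(X / WH) + WH - X)
# Part that depends on W and H: WH - X log(WH)
loss = np.sum(WH - X * np.log(WH))
# Remaining part, constant with respect to W and H: X log(X) - X
const = np.sum(X * np.log(X) - X)

print(full, loss, const)
```

Because the constant part does not change with W and H, minimizing the loss and minimizing the full divergence lead to the same W and H.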
espm.measures.find_min_MSE(true_maps, algo_maps, get_ind=False, unique=False)[source]

Compare all the mean squared errors between ground truth maps and NMF maps and find the minimum configuration.

Parameters:
  • true_maps (np.array 2D) – true maps with shape (number of phases, number of pixels)

  • algo_maps (np.array 2D) – NMF maps with shape (number of phases, number of pixels)

  • get_ind (boolean) – If True, the function will also return the indices of the NMF maps corresponding to the ground truth

  • unique (boolean) – If False it will find the global minimum but several maps can be associated to the same ground truth

Returns:

list of mse, (optionally) tuple of indices

Return type:

(list[float],list[int])

Warning:

The output is either a list or a tuple of lists, depending on get_ind, which is not ideal. This has to change.

espm.measures.find_min_angle(true_vectors, algo_vectors, get_ind=False, unique=False)[source]

Compare all the angles between ground truth spectra and NMF spectra and find the minimum configuration.

Parameters:
  • true_vectors (np.array 2D) – true spectra with shape (number of phases, number of energy channels)

  • algo_vectors (np.array 2D) – NMF spectra with shape (number of phases, number of energy channels)

  • get_ind (boolean) – If True, the function will also return the indices of the NMF spectra corresponding to the ground truth

  • unique (boolean) – If False it will find the global minimum but several spectra can be associated to the same ground truth

Returns:

list of angles, (optionally) tuple of indices

Return type:

(list[float],list[int])

Warning:

The output is either a list or a tuple of lists, depending on get_ind, which is not ideal. This has to change.
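
To illustrate the non-unique matching described for unique=False, here is a minimal NumPy sketch. It is not the library's code: the helper pairwise_angles is hypothetical, and reporting angles in degrees is an assumption based on the spectral_angle example in this module.

```python
import numpy as np

def pairwise_angles(true_vectors, algo_vectors):
    """Hypothetical helper: angle in degrees between every true/NMF spectrum pair."""
    t = true_vectors / np.linalg.norm(true_vectors, axis=1, keepdims=True)
    a = algo_vectors / np.linalg.norm(algo_vectors, axis=1, keepdims=True)
    cosines = np.clip(t @ a.T, -1.0, 1.0)  # guard against rounding outside [-1, 1]
    return np.degrees(np.arccos(cosines))

true_vectors = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
# Same spectra as the ground truth, but swapped and rescaled.
algo_vectors = np.array([[0.0, 2.0, 0.0], [3.0, 0.0, 0.0]])

angles = pairwise_angles(true_vectors, algo_vectors)
# Non-unique matching: each true spectrum independently takes its closest
# NMF spectrum, so two true spectra may end up with the same index.
indices = angles.argmin(axis=1)
min_angles = angles[np.arange(len(indices)), indices]
print(min_angles, indices)
```

A unique matching would instead require every NMF spectrum to be used at most once, which is the kind of search unique_min describes.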

espm.measures.find_min_config(true_maps, true_spectra, algo_maps, algo_spectra, angles=True)[source]

Determines the best match between the true and the NMF spectra and maps by finding the configuration that minimizes either the sum of angles or the sum of MSE. The function returns a warning (boolean) if the MSE and the angles disagree on the best configuration.

Parameters:
  • true_maps (np.array 2D) – true maps with shape (number of phases, number of pixels)

  • true_spectra (np.array 2D) – true spectra with shape (number of phases, number of energy channels)

  • algo_maps (np.array 2D) – NMF maps with shape (number of phases, number of pixels)

  • algo_spectra (np.array 2D) – NMF spectra with shape (number of phases, number of energy channels)

  • angles (boolean) – If True, the function will minimize the sum of angles between true and NMF spectra. If False, it will minimize the sum of MSE between true and NMF maps.

Returns:

list of angles, list of MSE, configuration of the best match, warning

espm.measures.global_min(matr)[source]
espm.measures.log_reg(H, mu, epsilon=1, average=False)[source]

Log regularisation

Compute the regularization loss:

\[R(\mu, H, \epsilon) = \sum_{ij} \mu_i \log \left( H_{ij} + \epsilon \right).\]
Parameters:
  • H (np.array 2D) – n x m matrix

  • mu (np.array 1D) – n vector

  • epsilon (float) – value of \(\epsilon\) (default 1)

  • average (boolean) – replace the sum with a mean, i.e., divide the result by n*m (default False)

Returns:

the value of the regularization term \(R(\mu, H, \epsilon)\)

Return type:

float
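
As an illustration of the formula above, a minimal NumPy sketch follows. The helper name log_reg_sketch is hypothetical and this is not the library's implementation. It only shows how the n-vector mu is broadcast over the rows of H.

```python
import numpy as np

def log_reg_sketch(H, mu, epsilon=1.0, average=False):
    """Hypothetical helper: sum over i, j of mu_i * log(H_ij + epsilon)."""
    H = np.asarray(H, dtype=float)
    mu = np.asarray(mu, dtype=float)
    # mu holds one weight per row of H, so it is broadcast along the columns.
    terms = mu[:, np.newaxis] * np.log(H + epsilon)
    return terms.mean() if average else terms.sum()

H = np.ones((2, 3))
mu = np.array([1.0, 2.0])
value = log_reg_sketch(H, mu)  # equals (1 + 2) * 3 * log(2)
print(value)
```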

espm.measures.log_surrogate(H, Ht, mu, epsilon, average=False)[source]

Surrogate loss for the log function.

espm.measures.mae(map1, map2)[source]

Mean absolute error

Calculate the mean absolute error between two 2D arrays of the same dimension.

Parameters:
  • map1 (np.array 2D) – first array

  • map2 (np.array 2D) – second array

Returns:

the mean absolute error between map1 and map2

Examples

>>> import numpy as np
>>> from espm.measures import mae
>>> map1 = np.array([[0, 1], [0, 1]])
>>> map2 = np.array([[1, 1], [1, 1]])
>>> mae(map1, map2)
    0.5
espm.measures.mse(map1, map2)[source]

Mean square error

Calculate the mean squared error between two 2D arrays of the same dimension.

Parameters:
  • map1 (np.array 2D) – first array

  • map2 (np.array 2D) – second array

Returns:

the mean squared error between map1 and map2

Examples

>>> import numpy as np
>>> from espm.measures import mse
>>> map1 = np.array([[0, 1], [0, 1]])
>>> map2 = np.array([[1, 1], [1, 1]])
>>> mse(map1, map2)
    0.5
espm.measures.ordered_angles(true_spectra, algo_spectra, input_inds)[source]

See ordered_mse.

espm.measures.ordered_mae(true_maps, algo_maps, input_inds)[source]

Input: p x Npx matrix of floats, p x Npx matrix of floats, list of integers.

Output: list of floats.

Takes the true maps of p phases and Npx pixels, the reconstructed maps of the same size, and the indices of the correspondence between true phases and reconstructed phases. Returns the mean absolute error of each phase, in truth order.

espm.measures.ordered_mse(true_maps, algo_maps, input_inds)[source]

Input: p x Npx matrix of floats, p x Npx matrix of floats, list of integers.

Output: list of floats.

Takes the true maps of p phases and Npx pixels, the reconstructed maps of the same size, and the indices of the correspondence between true phases and reconstructed phases. Returns the mean squared error of each phase, in truth order.

espm.measures.ordered_r2(true_maps, algo_maps, input_inds)[source]

Input: p x Npx matrix of floats, p x Npx matrix of floats, list of integers.

Output: list of floats.

Takes the true maps of p phases and Npx pixels, the reconstructed maps of the same size, and the indices of the correspondence between true phases and reconstructed phases. Returns the coefficient of determination of each phase, in truth order.

espm.measures.r2(map_true, map_pred)[source]

\(R^2\) - Coefficient of determination

Calculates the coefficient of determination between two 2D arrays of the same dimension. This is also called regression score function. See wikipedia.

This function is a wrapper for the function sklearn.metrics.r2_score of Scikit Learn.

Parameters:
  • map_true (np.array 2D) – ground truth array

  • map_pred (np.array 2D) – predicted array

Returns:

the coefficient of determination between map_true and map_pred

Return type:

float
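
For reference, the usual definition of the coefficient of determination can be sketched in plain NumPy. The helper r2_sketch is hypothetical. Flattening the 2D maps before computing the score is an assumption, and the wrapper around sklearn.metrics.r2_score may treat the array dimensions differently.

```python
import numpy as np

def r2_sketch(map_true, map_pred):
    """Hypothetical helper: 1 - SS_res / SS_tot computed over all entries."""
    y_true = np.asarray(map_true, dtype=float).ravel()
    y_pred = np.asarray(map_pred, dtype=float).ravel()
    ss_res = np.sum((y_true - y_pred) ** 2)         # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total sum of squares
    return 1.0 - ss_res / ss_tot

map_true = np.array([[0.0, 1.0], [2.0, 3.0]])
map_pred = np.array([[0.0, 1.0], [2.0, 2.0]])
score = r2_sketch(map_true, map_pred)
print(score)
```

A perfect prediction gives a score of 1, and the score decreases as the residuals grow relative to the spread of the true values.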

espm.measures.residuals(data, model)[source]
espm.measures.spectral_angle(v1, v2)[source]

Spectral angle

Calculate the angle between two spectra of the same dimension.

Parameters:
  • v1 (np.array 1D) – first spectrum

  • v2 (np.array 1D) – second spectrum

Returns:

the angle between v1 and v2

Return type:

float

Examples

>>> import numpy as np
>>> from espm.measures import spectral_angle
>>> v1 = np.array([0, 1, 0])
>>> v2 = np.array([1, 0, 1])
>>> spectral_angle(v1, v2)
    90.0
espm.measures.squared_distance(x, y=None)[source]

Squared distance between all column vectors of two matrices.

Calculate the squared L2 distance between all pairs of column vectors of two matrices. If only one matrix is given, the function uses each pair of vectors of this matrix.

Parameters:
  • x (np.array 2D) – n x m matrix of first column vectors

  • y (np.array 2D) – n x m matrix of second column vectors (optional)

Returns:

the matrix of squared distances between the column vectors

Return type:

np.array (m x m)

Examples

>>> import numpy as np
>>> from espm.measures import squared_distance
>>> x = np.arange(3)
>>> squared_distance(x, x)
    array([[ 0.,  1.,  2.],
           [ 1.,  0.,  1.],
           [ 2.,  1.,  0.]])
espm.measures.trace_xtLx(L, x, average=False)[source]

Trace of \(X^T L X\)

Compute the following expression \(\text{Tr} (X^T L X)\).

Parameters:
  • L (np.array 2D) – n x n matrix

  • x (np.array 2D) – n x k matrix of k n-sized vectors

  • average (boolean) – replace the sum with a mean, i.e., divide the result by n*m (default False)

Returns:

the value of \(\text{Tr} (X^T L X)\)

Return type:

float
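
The trace can be computed without forming the full k x k product, using the identity \(\text{Tr}(X^T L X) = \sum_{ij} X_{ij} (L X)_{ij}\). The short NumPy check below illustrates this identity on random data. It is a sketch and does not claim to reproduce how the library evaluates the expression.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 4, 3
L = rng.normal(size=(n, n))
x = rng.normal(size=(n, k))

# Direct definition: build the k x k matrix x^T L x, then take its trace.
direct = np.trace(x.T @ L @ x)
# Equivalent form: sum of the elementwise product of x and L x.
elementwise = np.sum(x * (L @ x))

print(direct, elementwise)
```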

espm.measures.unique_min(matrix)[source]

From a square matrix of float values, finds the combination of elements with different lines which minimises the sum of elements.

It is a brute force algorithm, so it is not recommended to input a matrix bigger than 20 x 20.

Parameters:

matrix (np.array 2D) – square matrix

Returns:

list of unique min values and corresponding indices in the same order

Return type:

(list, list[int])

Examples

>>> import numpy as np
>>> from espm.measures import unique_min
>>> matrix = np.array([[1.2, 1.3, 3.5],
...                    [4.9, 2.2, 6.5],
...                    [9.0, 4.1, 1.8]])
>>> unique_min(matrix)
    ([1.2, 2.2, 1.8], (0, 1, 2))
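
The brute-force search described above can be sketched with itertools.permutations. The helper unique_min_sketch is hypothetical and is not the library's code. It assumes that "different lines" means one element per row with all column indices distinct.

```python
import itertools
import numpy as np

def unique_min_sketch(matrix):
    """Hypothetical brute-force helper: one element per row, all columns distinct."""
    matrix = np.asarray(matrix, dtype=float)
    n = matrix.shape[0]
    rows = np.arange(n)
    best_cols, best_sum = None, np.inf
    # Try every assignment of columns to rows (n! candidates).
    for cols in itertools.permutations(range(n)):
        total = matrix[rows, list(cols)].sum()
        if total < best_sum:
            best_cols, best_sum = cols, total
    return [float(v) for v in matrix[rows, list(best_cols)]], best_cols

matrix = np.array([[1.2, 1.3, 3.5],
                   [4.9, 2.2, 6.5],
                   [9.0, 4.1, 1.8]])
values, indices = unique_min_sketch(matrix)
print(values, indices)
```

The number of candidates grows as n!, which is why this kind of search is only practical for small matrices.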