espm: The Electron Spectro-Microscopy Python Library
This library contains algorithms to perform non-negative matrix factorization with diverse regularisation (e.g. Laplacian or L1) and contraints (e.g. simplex).
It is specialized for Electron Microscopy applications. It contains code to create artificial Energy dispersive X-ray spectroscopy (EDXS) data and to perform hyperspectral unmixing on EDXS spectrum images.
Installation
You can install this package from PyPi using:
$ pip install espm
If you want to develop, please use the option:
$ git clone https://github.com/adriente/espm.git
$ cd espm
$ pip install cython
$ pip install -e ".[dev]"
Getting started
Try the api.ipynb notebook in the notebooks folder.
Documentation
The documentation is available at https://espm.readthedocs.io/en/latest/
You can get started with the following notebooks:
CITING
If you use this library, please cite the following paper:
@article{teurtrie2023espm,
title={espm: A Python library for the simulation of STEM-EDXS datasets},
author={Teurtrie, Adrien and Perraudin, Nathana{\"e}l and Holvoet, Thomas and Chen, Hui and Alexander, Duncan TL and Obozinski, Guillaume and H{\'e}bert, C{\'e}cile},
journal={Ultramicroscopy},
pages={113719},
year={2023},
publisher={Elsevier}
}
Documentation map
Introduction to the espm
This tutorial will show you the basic operations of the toolbox. After installing the package with pip, start by opening a python shell, e.g. a Jupyter notebook, and import espm. The package espm is built on top of the hyperspy and the scikit-learn packages.
The hyperspy package is a Python library for multidimensional data analysis. It provides the base framework to handles our data. The scikit-learn package is a Python library for machine learning. It provides the base framework to for the Non Negative Matrix Factorization (NMF) algorithms develeoped in this package.
>>> import numpy as np
>>> import matplotlib.pyplot as plt
>>> import hyperspy.api as hs
>>> import espm
Datasets
Let us start by creating some data to play with this tutorial. We can generating artificial datasets using the following lines:
>>> import espm.datasets as ds
>>> ds.generate_built_in_datasets(seeds_range=1)
>>> spim = ds.load_particules(sample = 0)
>>> spim.change_dtype('float64')
The command generate_built_in_datasets will save the generated datasets folder defined in the espm.conf.py file. Alternatively, you can also define the path where the data will be saved using the “base_path” argument.
>>> ds.generate_built_in_datasets(base_path="generated_samples", seeds_range=1)
Here the object spim is of the hyperspy._signals.signal1d.Signal1D
.
This object has different useful attributes and methods. For example,
@ADRIEN — summarize here some of them
Note
Please see the review article espm : a Python library for the simulation of STEM-EDXS datasets for an overview of the simulation methods this package leverages.
Factorization
Taking the non-negative matrix factorization (NMF) is done with the following:
>>> out = spim.decomposition(3)
>>> spim.plot_decomposition_loadings(3)
>>> spim.plot_decomposition_factors(3)


BIt will use the algorithms developped in a coming contribution.
These algorithms are an important part of this package. They are specialized to solve regularized Poisson NMF problems. Mathematically, they can be expressed as:
Here \(D_{GKL}\) is the fidelity term, i.e. the Generalized KL divergence
The loss is regularized using two terms: a Laplacian regularization on \(H\) and a log regularization on \(H\). \(\lambda\) and \(\mu\) are the regularization parameters. The Laplacian regularization is defined as:
where \(\Delta\) is the Laplacian operator (it can be created using the function espm.utils.create_laplacian_matrix
).
Note that the columns of the matrices :math:`H` and :math:`X` are assumed to be images.
The log regularization is defined as:
where \(\epsilon_{reg}\) is the slope of log regularization at 0. This term acts similarly to an L1 penalty but affects less larger values.
Finally, we assume \(W,H\geq \epsilon\) and that the lines of \(H\) sum to 1:
The size of:
\(X\) is (n, p)
\(W\) is (m, k)
\(H\) is (k, p)
\(G\) is (n, m)
The columns of the matrices \(H\) and \(X\) are assumed to be images, typically for the smoothness regularization. In terms of shape, we have \(n_x \cdot n_y = p\), where \(n_x\) and \(n_y\) are the number of pixels in the x and y directions.
A detailed example on the use these algorithms can be found in this notebook.
List of example notebooks
To go deeper, we invite you to consult the following notebooks.
Analysis of simulated EDX data
In this notebook we showcase the analysis of the built-in simulated dataset. We use the espm EDXS modelling to simulate the data. We first perform a KL-NMF decomposition of the data and then plot the results.
Imports
[1]:
%load_ext autoreload
%autoreload 2
%matplotlib inline
# Generic imports
import hyperspy.api as hs
import numpy as np
# espm imports
from espm.estimators import SmoothNMF
import espm.datasets as ds
Generating artificial datasets and loading them
If the datasets were already generated, they are not generated again
[2]:
# Generate the data
# Here `seeds_range` is the number of different samples to generate
ds.generate_built_in_datasets(seeds_range=1)
# Load the data
spim = ds.load_particules(sample = 0)
# We need to change the data to floats for the decomposition to run
spim.change_dtype('float64')
Building G
The information in the metadata of the spim object are used to build the G matrix. This matrix contains a model of the characteristic X-rays and the bremsstrahlung.
[3]:
spim.build_G("bremsstrahlung")
Problem solving
3 components for 3 phases
convergence criterions : tol and max_iter. tol is the minimum change of the loss function between two iterations. max_iter is the max number of iterations.
hspy_comp is a required parameter if you want to use the hyperspy api
[4]:
est = SmoothNMF( n_components = 3,tol=0.000001, max_iter = 50, G = spim.G,hspy_comp = True)
/! It should take a minute to execute in the case of the built-in dataset
[5]:
out = spim.decomposition(algorithm = est, return_info=True)
It 10 / 50: loss 2.047093e-01, 1.363 it/s
It 20 / 50: loss 2.045638e-01, 1.394 it/s
It 30 / 50: loss 2.045228e-01, 1.411 it/s
It 40 / 50: loss 2.045052e-01, 1.418 it/s
exits because max_iteration was reached
Stopped after 50 iterations in 0.0 minutes and 35.0 seconds.
Decomposition info:
normalize_poissonian_noise=False
algorithm=SmoothNMF()
output_dimension=None
centre=None
scikit-learn estimator:
SmoothNMF()
First cell : Printing the resulting concentrations.
Second cell : Ploting the resulting spectra
Thrid cell : Ploting the resulting abundances
Hyperspy is mainly designed to be used with the qt graphical backend of matplotlib. Thus two plots will appear if you are in inline mode.
[6]:
spim.print_concentration_report()
Concentrations report
p0 p1 p2
V : 0.0246 0.0079 0.0178
Rb : 0.4842 0.0863 0.5199
W : 0.0390 0.0092 0.0045
N : 0.0151 0.0851 0.0134
Yb : 0.0009 0.0826 0.0018
Pt : 0.0012 0.0150 0.0013
Al : 0.0005 0.1082 0.0010
Ti : 0.0002 0.0860 0.0007
La : 0.0004 0.2014 0.0012
[7]:
spim.plot_decomposition_loadings(3)
[7]:


[8]:
spim.plot_decomposition_factors(3)
[8]:


Generate simulated data
In this notebook we generate a simulated dataset based on experimental data. - The maps are directly extracted from the experimental data. - The phases (or spectra) are built using chemical composition extracted from the literature (citations to be included)
The output of this notebook is put in the generated_datasets
Imports
[1]:
%load_ext autoreload
%autoreload 2
%matplotlib inline
# espm imports
from espm.datasets.base import generate_dataset
from espm.weights.generate_weights import chemical_map_weights
from espm.models.generate_EDXS_phases import generate_modular_phases
from espm.models.EDXS_function import elts_list_from_dict_list
# Generic imports
from pathlib import Path
import matplotlib.pyplot as plt
import numpy as np
import hyperspy.api as hs
Generate weights
We use an experimental dataset to create realistic abundance maps. Theses maps are based from chemical mapping of the dataset.
[2]:
def get_repo_path():
"""Get the path to the git repository.
This is a bit of a hack, but it works.
"""
this_path = Path.cwd() / Path("generate_data.ipynb")
return this_path.resolve().parent.parent
# Path of the experimental dataset
path = get_repo_path() / Path("generated_datasets/71GPa_experimental_data.hspy")
# Creation of the weights
weights = chemical_map_weights( file = path, line_list = ["Fe_Ka","Ca_Ka"], conc_list = [0.5,0.5])
/home/docs/checkouts/readthedocs.org/user_builds/espm/envs/v0.2.1/lib/python3.7/site-packages/hyperspy/misc/utils.py:475: VisibleDeprecationWarning: Use of the `binned` attribute in metadata is going to be deprecated in v2.0. Set the `axis.is_binned` attribute instead.
VisibleDeprecationWarning,
/home/docs/checkouts/readthedocs.org/user_builds/espm/envs/v0.2.1/lib/python3.7/site-packages/hyperspy/io.py:575: VisibleDeprecationWarning: Loading old file version. The binned attribute has been moved from metadata.Signal to axis.is_binned. Setting this attribute for all signal axes instead.
'signal axes instead.', VisibleDeprecationWarning)
/home/docs/checkouts/readthedocs.org/user_builds/espm/envs/v0.2.1/lib/python3.7/site-packages/hyperspy/misc/utils.py:475: VisibleDeprecationWarning: Use of the `binned` attribute in metadata is going to be deprecated in v2.0. Set the `axis.is_binned` attribute instead.
VisibleDeprecationWarning,
/home/docs/checkouts/readthedocs.org/user_builds/espm/envs/v0.2.1/lib/python3.7/site-packages/hyperspy/io.py:575: VisibleDeprecationWarning: Loading old file version. The binned attribute has been moved from metadata.Signal to axis.is_binned. Setting this attribute for all signal axes instead.
'signal axes instead.', VisibleDeprecationWarning)
[3]:
fig,axs = plt.subplots(1,3,figsize = (20,20*3))
for j in range(axs.shape[0]) :
im = axs[j].imshow(weights[:,:,j],vmin = 0, vmax = 1.0)
axs[j].tick_params(axis = "both",width = 0,labelbottom = False,labelleft = False)
axs[j].set_title("Phase {}".format(j),fontsize = 22)
fig.subplots_adjust(right=0.84)
cbar_ax = fig.add_axes([0.85, 0.47, 0.01, 0.045])
fig.colorbar(im,cax=cbar_ax)
cbar_ax.tick_params(labelsize=22)

Generate phases
We generate phases based on values extracted from the literature. The phases we try to simulate here are Ferropericlase, Bridgmanite and Ca-perovskite.
We use dictionnaries to input the modelling parameters.
[4]:
# Elemental concetration of each phase
elts_dicts = [
# Pseudo ferropericlase
{
"Mg" : 0.522, "Fe" : 0.104, "O" : 0.374, "Cu" : 0.05
},
# Pseudo Ca-Perovskite
{
"Mg" : 0.020, "Fe" : 0.018, "Ca" : 0.188, "Si" : 0.173, "Al" : 0.010, "O" : 0.572, "Ti" : 0.004, "Cu" : 0.05, "Sm" : 0.007, "Lu" : 0.006, "Nd" : 0.006
},
# Pseudo Bridgmanite
{
"Mg" : 0.445, "Fe" : 0.035, "Ca" : 0.031, "Si" : 0.419, "Al" : 0.074, "O" : 1.136, "Cu" : 0.05, "Hf" : 0.01
}]
# Parameters of the bremsstrahlung
brstlg_pars = [
{"b0" : 0.0001629, "b1" : 0.0009812},
{"b0" : 0.0007853, "b1" : 0.0003658},
{"b0" : 0.0003458, "b1" : 0.0006268}
]
# Model parameters : energy scale, detector broadening, x-ray emission database, beam energy, absorption parameters, detector efficiency
model_params = {
"e_offset" : 0.3,
"e_size" : 1980,
"e_scale" : 0.01,
"width_slope" : 0.01,
"width_intercept" : 0.065,
"db_name" : "default_xrays.json",
"E0" : 200,
"params_dict" : {
"Abs" : {
"thickness" : 100.0e-7,
"toa" : 35,
"density" : 4.5,
"atomic_fraction" : False
},
"Det" : "SDD_efficiency.txt"
}
}
# miscellaneous paramaters : average detected number of X-rays per pixel, phases densities, output folder, model name, random seed
data_dict = {
"N" : 100,
"densities" : [1.2,1.0,0.8],
"data_folder" : "71GPa_synthetic_N100",
"model" : "EDXS",
"seed" : 42
}
[5]:
phases = generate_modular_phases(
elts_dicts=elts_dicts, brstlg_pars = brstlg_pars,
scales = [1, 1, 1],
model_params= model_params,
seed = 42
)
# scales : bremsstrahlung parameters modifiers
elements = elts_list_from_dict_list(elts_dicts)
[6]:
fig,axs = plt.subplots(3,1,figsize = (20,15))
# Build the energy scale
x = np.linspace(
model_params["e_offset"],
model_params["e_offset"]+model_params["e_scale"]*model_params["e_size"],
num=model_params["e_size"])
for j in range(axs.shape[0]) :
axs[j].plot(x,phases[j])
axs[j].set_title("Phase {}".format(j),fontsize = 22)
axs[-1].set_xlabel("Energy loss (eV)",fontsize = 16)
[6]:
Text(0.5, 0, 'Energy loss (eV)')

It will produce 1 spectrum images/sample in the target folder.
You can replace seed_range
by the number of samples you want to generate.
[7]:
generate_dataset( phases = phases,
weights = weights,
model_params = model_params,
misc_params = data_dict,
base_seed=data_dict["seed"],
sample_number=1,
elements = elements)
100%|██████████| 1/1 [00:04<00:00, 4.33s/it]
The previous command will save the data in the generated_datasets
folder defined in the espm.conf.py
file.
You can also define the path where the data will be saved using the “base_path” argument
Regularized Poisson NMF on a toy dataset
This notebook is part of the ESPM package. It is available on github
In this notebook, we show how to solve a regularized Poisson Non Negative Matrix Factorization (NMF) on a toy dataset. The general goal of this problem is to separate a matrix \(X\) into two matrices \(D\) and \(H\) such that \(X \approx DH\). We will assume that \(D\) is the product of two matrices \(G\) and \(W\) such that \(D = GW\). The matrix \(G\) is assumed to be known and therefore \(W\) is the matrix we want to find. In the field of electro-microscopy, the matrix \(GW\) represent the phases, the matrix \(H\) is the matrix of maps (or weights) and \(X\) is the data matrix. The maps \(H\) are 2D images. Furthermore, we will assume that \(W, H\) are non-negative or more generally greater than a small positive value \(\epsilon\).
The data measured data \(X\) is follow a Poisson distribution, i.e. \(X_{ij} \sim \text{Poisson}(n_p\dot{X}_{ij})\). In the context of electro-microscopy, \(n_p\) is the number of photons per pixel. \(\dot{X} = G \dot{W} \dot{H}\) would correspond a noiseless measurement.
The size of:
\(X\) is (n, p),
\(W\) is (m, k),
\(H\) is (k, p),
\(G\) is (n, m).
In general, m is assumed to be much smaller than n and \(G\) is assumed to be known. The parameter shape_2d
defines the shape of the images for the columns of \(H\) and \(X\), i.e. shape_2d[0]*shape_2d[1] = p
.
Mathematically. the problem can be formulated as:
Here \(D_{GKL}\) is the fidelity term, i.e. the Generalized KL divergence
The loss is regularized using two terms: a Laplacian regularization on \(H\) and a log regularization on \(H\). \(\lambda\) and \(\mu\) are the regularization parameters. The Laplacian regularization is defined as:
where \(\Delta\) is the Laplacian operator (it can be created using the function :mod:espm.utils.create_laplacian_matrix
). Note that the columns of the matrices :math:`H` and :math:`X` are assumed to be images.
The log regularization is defined as:
where \(\epsilon_{reg}\) is the slope of log regularization at 0. This term acts similarly to an L1 penalty but affects less larger values.
Finally, we assume \(W,H\geq \epsilon\) and that the lines of \(H\) sum to 1:
This is done by adding the parameter force_simplex=True
to the class espm.estimators.SmoothNMF
. This constraint is essential as it prevent the algorithm to find a solution where all the maps tend to 0 because of the other constraints.
In this notebook, we will use the class espm.estimators.SmoothNMF
to solve the problem.
Imports and function definition
Let’s start by importing the necessary libraries.
[1]:
# Some useful modules for notebooks
%load_ext autoreload
%autoreload 2
%matplotlib inline
[2]:
import numpy as np
import matplotlib.pyplot as plt
from espm.estimators import SmoothNMF
from espm.measures import find_min_angle, ordered_mse, ordered_mae, ordered_r2
from espm.models import ToyModel
from espm.weights import generate_weights as gw
from espm.datasets.base import generate_spim_sample
from espm.utils import process_losses
Now we define the parameters that will be used to generate the data.
[3]:
seed = 42 # for reproducibility
m = 15 # Number of components
n = 200 # Length of the phases
n_poisson = 300 # Average poisson number per pixel (this number will be splitted on the m dimension)
densities = np.random.uniform(0.1, 2.0, 3) # Random densities
[4]:
def get_toy_sample():
model_params = {"L": n, "C": m, "K": 3, "seed": seed}
misc_params = {"N": n_poisson, "seed": seed, 'densities' : densities, "model": "ToyModel"}
toy_model = ToyModel(**model_params)
toy_model.generate_phases()
phases = toy_model.phases.T
weights = gw.generate_weights("toy_problem", None)
sample = generate_spim_sample(phases, weights, model_params,misc_params, seed = seed)
return sample
def to_vec(X):
n = X.shape[2]
return X.transpose(2,0,1).reshape(n, -1)
sample = get_toy_sample()
GW = sample["GW"].T
G = sample["G"]
H = to_vec(sample["H"])
X = to_vec(sample["X"])
Xdot = to_vec(sample["Xdot"])
shape_2d = sample["shape_2d"]
Let us look at the dimension of our problem.
[5]:
print(f"""
- X: {X.shape}
- Xdot: {Xdot.shape}
- G: {G.shape}
- GW: {GW.shape}
- H: {H.shape}
- shape_2d: {shape_2d}
""")
- X: (200, 10000)
- Xdot: (200, 10000)
- G: (200, 15)
- GW: (200, 3)
- H: (3, 10000)
- shape_2d: (100, 100)
Here Xdot
contains the noisless data such that the ground truth H
and GW
satisfies:
[6]:
np.testing.assert_allclose(Xdot, GW @ H)
Let us plot the ground truth maps \(H\) (sometimes also called weights). Here the variable shape_2d
is used to reshape the columns of \(H\) into images.
[7]:
vmin, vmax = 0,1
cmap = plt.cm.gray_r
plt.figure(figsize=(10, 3))
for i, hdot in enumerate(H):
plt.subplot(1,3,i+1)
plt.imshow(H[i].reshape(shape_2d), cmap=cmap, vmin=vmin, vmax=vmax)
plt.axis("off")
plt.title(f"GT - Map {i+1}")

We can also plot the phases corresponding to these maps. The phases corresponds to the matrix \(GW\).
[8]:
e = np.linspace(0,1, GW.shape[0])
plt.plot(e, GW)
plt.title("GT - Phases")
plt.xlabel("Frequency [normalized]")
plt.legend([f"phase {i+1}" for i in range(3)]);

The matrix \(G\) is assumed to be known appriori. Here we plot the first 5 lines of \(G\).
[9]:
l = np.linspace(0, 1,n)
plt.plot(l, G[:,:5]);

Note that the function create_toy_sample
did not return \(W\) but only \(GW\).
Using the ground truth \(GW\), it can be computed as follows:
[10]:
W = np.linalg.lstsq(GW, G, rcond=None)[0].T
W.shape
[10]:
(15, 3)
Solving the problem
Let us define the different hyperparameters of the problem. Feel free to change them and see how the results change.
Let us first define our regularisation parameters.
[11]:
lambda_L = 2 # Smoothness of the maps
mu = 0.05 # Sparsity of the maps
We can additionally add a simplex constraint to the problem by setting force_simplex=True
. This will add the constraint that the line of \(H\) sum to 1. This constraint should be activated for the regularizers to work!. Practically, it will prevent the algorithm from simply minimizing \(W\) and increasing \(H\).
[12]:
force_simplex = True
Finally, we can decide to specify \(G\) or not. If the matrix \(G\) is not specified, the algorithm will directly estimate \(GW\) insead of \(W\).
[13]:
Gused = G
Note that wihtout regularisation, i.e. with the parameters
lambda_L = 0
mu = 0
Gused = None
force_simplex=False
we recover the classical Poisson/KL NMF problem. Our algorithm will apply the MU algorithm from Lee and Seung (2001).
Let us define the parameters for the algorithm. Here the class espm.estimator.SmoothNMF
heritates from sckit-learn’s NMF
class.
[14]:
K = len(H) # Number of components / let assume that we know it
params = {}
params["tol"]=1e-6 # Tolerance for the stopping criterion.
params["max_iter"] = 200 # Maximum number of iterations before timing out.
params["hspy_comp"] = False # If should be set to True if hspy data format is used.
params["verbose"] = 1 # Verbosity level.
params["eval_print"] = 10 # Print the evaluation every eval_print iterations.
params["epsilon_reg"] = 1 # Regularization parameter
params["linesearch"] = False # Use linesearch to accelerate convergence
params["shape_2d"] = shape_2d # Shape of the 2D maps
params["n_components"] = K # Number of components
params["normalize"] = True # Normalize the data. It helps to keep the range of the regularization parameters lambda_L and mu in a reasonable range.
estimator = SmoothNMF(mu=mu, lambda_L=lambda_L, G = Gused, force_simplex=force_simplex, **params)
The function fit_transform
will solve the problem and return the matrix \(GW\).
[15]:
GW_est = estimator.fit_transform(X)
It 10 / 200: loss 8.199383e-03, 4.182 it/s
It 20 / 200: loss 8.123480e-03, 4.250 it/s
It 30 / 200: loss 8.096584e-03, 4.276 it/s
It 40 / 200: loss 8.080755e-03, 4.291 it/s
It 50 / 200: loss 8.065004e-03, 4.299 it/s
It 60 / 200: loss 8.019184e-03, 4.304 it/s
It 70 / 200: loss 7.967115e-03, 4.308 it/s
It 80 / 200: loss 7.780344e-03, 4.312 it/s
It 90 / 200: loss 7.693353e-03, 4.314 it/s
It 100 / 200: loss 7.678082e-03, 4.316 it/s
It 110 / 200: loss 7.659617e-03, 4.317 it/s
It 120 / 200: loss 7.625445e-03, 4.316 it/s
It 130 / 200: loss 7.574880e-03, 4.316 it/s
It 140 / 200: loss 7.526745e-03, 4.314 it/s
It 150 / 200: loss 7.498068e-03, 4.314 it/s
It 160 / 200: loss 7.483342e-03, 4.315 it/s
It 170 / 200: loss 7.472914e-03, 4.316 it/s
It 180 / 200: loss 7.464122e-03, 4.315 it/s
It 190 / 200: loss 7.455899e-03, 4.315 it/s
exits because max_iteration was reached
Stopped after 200 iterations in 0.0 minutes and 46.0 seconds.
We can plot the different losses to see how the algorithm converges.
[16]:
losses = estimator.get_losses()
values, names = process_losses(losses)
plt.figure(figsize=(10, 6))
for name, value in zip(names[:4], values[:4]):
plt.plot(value, label=name)
plt.xlabel("iteration")
plt.yscale("log")
plt.legend()
[16]:
<matplotlib.legend.Legend at 0x7f2186eae490>

[17]:
losses[0]
[17]:
(0.00883246, 0.00810643, 0.00018791, 0.00053812, 8., 0.20168493, 0.38053404)
We can easily recover \(W\) and \(H\) from the estimator.
[18]:
H_est = estimator.H_
W_est = estimator.W_
Let us compute some metrics to evaluate the quality of the solution. We will use the metrics defined in the module espm.metrics
.
We first compute the angles between the phases GW
and the estimated phases GW_est
. The angles are computed using the function compute_angles
. This function also return the indices to match the estimations and the ground truths. Indeed, there is no reason for them to have the same order.
[19]:
angles, true_inds = find_min_angle(GW.T, GW_est.T, unique=True, get_ind=True)
print("angles : ", angles)
angles : [7.303671408311623, 25.736214105002425, 30.788294617407914]
Now, we can compute the Mean Squared Error (MSE) between the maps H
and the estimated maps H_est
.
[20]:
mse = ordered_mse(H, H_est, true_inds)
print("mse : ", mse)
mse : [0.37937082838545216, 0.11121192704478366, 0.15250586376963057]
Finally, let us plot the results.
[21]:
def plot_results(Ddot, D, Hdotflat, Hflat):
fontsize = 30
scale = 15
aspect_ratio = 1.4
marker_list = ["-o","-s","->","-<","-^","-v","-d"]
mark_space = 20
# cmap = plt.cm.hot_r
cmap = plt.cm.gray_r
vmax = 1
vmin = 0
K = len(H)
L = len(D)
angles, true_inds = find_min_angle(Ddot.T, D.T, unique=True, get_ind=True)
mse = ordered_mse(Hdotflat, Hflat, true_inds)
mae = ordered_mae(Hdotflat, Hflat, true_inds)
r2 = ordered_r2(Hdotflat, Hflat, true_inds)
fig, axes = plt.subplots(K,3,figsize = (scale/K * 3 * aspect_ratio,scale))
x = np.linspace(0,1, num = L)
for i in range(K):
axes[2,i].plot(x,Ddot.T[i,:],'g-',label='ground truth',linewidth=4)
axes[2,i].plot(x,D[:,true_inds[i]],'r--',label='reconstructed',linewidth=4)
axes[2,i].set_title("{:.2f} deg".format(angles[i]),fontsize = fontsize-2)
axes[2,i].set_xlim(0,1)
axes[1,i].imshow((Hflat[true_inds[i],:]).reshape(shape_2d),vmin = vmin, vmax = vmax , cmap=cmap)
axes[1,i].set_title("R2: {:.2f}".format(r2[true_inds[i]]),fontsize = fontsize-2)
# axes[i,1].set_ylim(0.0,1.0)
axes[1,i].tick_params(axis = "both",labelleft = False, labelbottom = False,left = False, bottom = False)
im = axes[0,i].imshow(Hdotflat[i].reshape(shape_2d),vmin = vmin, vmax = vmax, cmap=cmap)
axes[0,i].set_title("Phase {}".format(i),fontsize = fontsize)
axes[0,i].tick_params(axis = "both",labelleft = False, labelbottom = False,left = False, bottom = False)
axes[2,0].legend()
rows = ["True maps","Reconstructed maps","Spectra"]
for ax, row in zip(axes[:,0], rows):
ax.set_ylabel(row, rotation=90, fontsize=fontsize)
fig.subplots_adjust(right=0.84)
# put colorbar at desire position
cbar_ax = fig.add_axes([0.85, 0.5, 0.01, 0.3])
fig.colorbar(im,cax=cbar_ax)
# fig.tight_layout()
print("angles : ", angles)
print("mse : ", mse)
print("mae : ", mae)
print("r2 : ", r2)
return fig
fig = plot_results(GW, GW_est, H, H_est)
plt.show()
angles : [7.303671408311623, 25.736214105002425, 30.788294617407914]
mse : [0.37937082838545216, 0.11121192704478366, 0.15250586376963057]
mae : [0.5880329885151475, 0.23360594333565174, 0.35517588268316624]
r2 : [-7.756461302568054, -0.5126935634346239, -5.284422099392397]

API reference
Models
The espm.models
module implements the model objects that are used to simulate features of the data (e.g. in EDXS it corresponds to spectra).
The models are also used for the creation of the G matrix, which can be later used for the decomposition of data using NMF.
Note
For now espm supports only the modelling of EDS data but we aim at supporting the modelling of EELS data too.
Models base classes
The PhysicalModel is the abstract class from which the EDX model inherits. The ToyModel can be used for testing data analysis algorithms.
- class espm.models.base.PhysicalModel(e_offset, e_size, e_scale, params_dict, db_name='default_xrays.json', E0=200, **kwargs)[source]
Abstract class of the models
- build_energy_scale(e_offset, e_size, e_scale)[source]
Build an array corresponding to a linear energy scale.
- Parameters
- e_offset
- float
Offset of the energy scale
- e_size
- int
Number of channels of the detector
- e_scale
- float
Scale in keV/channel
- Returns
- energy scale
- np.array 1D
Linear energy scale based on the given parameters of shape (e_size,)
- class espm.models.base.ToyModel(L: int = 200, C: int = 15, K: int = 3, seed: Optional[int] = None)[source]
Toy model.
Simple model with a random G matrix and phases.
The G matrix is generated with a random number of gaussians per column. The phases are generated using a vector drawn from a Laplacian distribution and multiplied by the G matrix.
- Parameters
- Lint
Length of the phases.
- Cint
Number of possible components in the phases.
- Kint
Number of phases.
- seedint, optional
Seed for the random number generator.
Examples
>>> from espm.models import ToyModel >>> import matplotlib.pyplot as plt >>> model = ToyModel(L=200, C=15, K=3, seed=0) >>> model.generate_g_matr() >>> model.generate_phases() >>> print(model.G.shape) (200, 15) >>> print(model.phases.shape) (200, 3) >>> plt.plot(model.phases)
EDXS model
The espm.models.edxs
module implements the creation of the G matrix that contains a modelisation of the characteristic and continuous X-rays.
This module is a key component of the EDXS data simulation and the EDXS data analysis.
- class espm.models.edxs.EDXS(*args, width_slope=0.01, width_intercept=0.065, **kwargs)[source]
- generate_g_matr(g_type='bremsstrahlung', reference_elt={}, *, elements=[], **kwargs)[source]
Generate the G matrix. With a complete model the matrix is (e_size,n+2). The first n columns correspond to the sum of X-ray characteristic peaks associated to each shell of the elements. The last 2 columns correspond to a bremsstrahlung model.
The first n columns of G (noted \(\kappa\)), corresponding to the characteristic X-rays, can be calculated using the following formula:
\[\kappa_{k,Z} = \sum_{ij} x_{ij}(Z) \frac{1}{\Delta(\varepsilon_k) \sqrt{2\pi} } e{^{-\frac{1}{2} {\left( \frac{\varepsilon_k - \varepsilon^{ij}(Z)}{\Delta(\varepsilon_k)} \right)}^2}}\]where \(\varepsilon_k\) is the energy of the kth energy channel, \(\varepsilon^{ij}(Z)\) is the energy of the ijth line of the Zth element, \(\Delta(\varepsilon_k)\) is the FWHM of the detector at the energy \(\varepsilon_k\) and \(x_{ij}(Z)\) is the intensity ratio of the ijth line of the Zth element. The last term of the equation is a gaussian function centered at \(\varepsilon^{ij}(Z)\) and with a FWHM of \(\Delta(\varepsilon_k)\).
The last two columns of G (noted \(\beta\)), corresponding to the bremsstrahlung model, can be calculated using the following formula:
\[ \begin{align}\begin{aligned}\beta_{k,n+1} = \frac{\varepsilon_0 - \varepsilon_k}{\varepsilon_0 \varepsilon_k} \times \frac{1 - (\varepsilon_0 - \varepsilon_k)}{\varepsilon_0}\\\beta_{k,n+2} = \frac{(\varepsilon_0 - \varepsilon_k)^2}{\varepsilon^2_0 \varepsilon_k}\end{aligned}\end{align} \]Note that there is no parameter for the bremsstrahlung since it is supposed to be learned with the W matrix (see espm.estimators).
- Parameters
- g_type
- string
Options of the edxs model to include in the G matrix. The three possibilities are “identity”, “no_brstlg” and “bremsstrahlung”. G is going to be the identity matrix, only characteristic X-ray peaks or characteristic X-rays plus bremsstrahlung model, respectively.
- reference_elt
- dict
The keys are chemical elements (atomic number) and the values are cut-off energies. This argument is used to split some of the columns of G into 2 columns. The first column corresponds to characteristic X-rays before the cut-off and second one corresponds to characteristic X-rays before the cut-off. This feature is implemented to enable more accurate absorption correction.
- elements
- list
List of modeled chemical elements. The list can be populated either with atomic numbers or chemical symbols, e.g. “Fe” or 26.
- Returns
- g matrix
- np.array 2D
matrix of the edx model.
Notes
See our paper about the espm package [TPH+23] for more information about the equations of this model.
- generate_phases(phases_parameters)[source]
Generate a series of spectra from list of phase parameters.
- Parameters
- phases_parameters
- list
List of dicts containing the phase parameters.
- Returns
- phase_array
- np.array 2D
Array which lines correspond to the modelled phases in the input list.
Notes
The absorption correction is done using the average composition of the phases. The same correction is used for each phase.
- generate_spectrum(b0=0, b1=0, scale=1.0, abs_elts_dict={}, *, elements_dict={})[source]
Generate a spectrum from bremsstrahlung parameters and a chemical composition. The modelling is done using the same formula as for the generate_g_matr function.
- Parameters
- b0
- float
First bremsstrahlung parameter.
- b1
- float
Second bremsstrahlung parameter.
- scale
- float
Scale factor to apply to the bremsstrahlung. Feature to be removed, scaling should be done with b0 and b1.
- abs_elts_dict
- dict
Dictionnary of elements and associated concentrations. This dictionnary is used to calculate the contribution of the absorption in the spectrum.
- elements_dict
- dict
Dictionnary of elements and associated concentrations describing the chemical composition of the modeled sample.
- Returns
- ——-
- spectrum
- np.array 1D
Output edx spectrum corresponding to the input chemical composition.
Notes
Check EDXS_function for details about the bremsstrahlung model.
Examples
>>> import matplotlib.pyplot as plt >>> from espm.models.edxs import EDXS >>> from espm.conf import DEFAULT_EDXS_PARAMS >>> b0, b1 = 5.5367e-5, 0.00192181 >>> elts_dict = {"Si" : 1.0,"Ca" : 1.0,"O" : 3.0,"C" : 0.3} >>> model = EDXS(**DEFAULT_EDXS_PARAMS) >>> spectrum = model.generate_spectrum(b0,b1, elements_dict = elts_dict) >>> plt.plot(spectrum)
- espm.models.edxs.G_EDXS(model, g_params, part_W=None, G=None)[source]
Update the bremsstrahlung part of the G matrix. This function is used for the NMF decomposition so that the absorption correction is updated in between each step.
- Parameters
- model_params
- dict
Parameters of the edxs model. Check espm.conf.DEFAULT_EDXS_PARAMS for an example.
- g_params
- dict
Parameters specific to the creation of the G matrix.
- part_W
- np.array 2D
Concentrations determined during the NMF decomposition. It is used to determine the average composition that will be used to determine the absorption correction.
- G
- np.array 2D
Current G matrix. Setting it to None will initialize the G matrix.
- Returns
- updated G
- np.array 2D
Updated G matrix with a new absorption correction.
EDXS functions
The espm.models.EDXS_function
module implements misceallenous functions required for the :mod: espm.models.edxs.
This module include bremsstrahlung modelling, database reading and dicts and list manipulation functions.
- espm.models.EDXS_function.G_bremsstrahlung(x, E0, params_dict, *, elements_dict={})[source]
Computes the two-parts continuum X-rays for the G matrix. The two parts of the bremsstrahlung are constructed separately so that its parameters can fitted to data. Absorption and detection are multiplied to each part.
- Parameters
- x
- np.array 1D
Energy scale.
- params_dict
- dict
Dictionnary containing the absorption and detection parameters.
- elements_dict
- dict
Composition of the studied sample. It is required for absorption calculation.
- Returns
- continuum_xrays
- np.array 2D
Two parts continuum X-rays model with shape (energy scale size, 2).
- espm.models.EDXS_function.chapman_bremsstrahlung(x, b0, b1, b2)[source]
Calculate the bremsstrahlung as parametrized by chapman et al.
- Parameters
- x
- np.array 1D
Energy scale.
- b0
- float
First parameter, corresponding to the inverse of the energy.
- b1
- float
Second parameter, corresponding to the energy.
- b2
- float
Third parameter, corresponding to the square of the energy.
- Returns
- bremsstrahlung
- np.array 1D
Bremsstrahlung model
Notes
For details see [CNC84]
- espm.models.EDXS_function.continuum_xrays(x, params_dict={}, b0=0, b1=0, E0=200, *, elements_dict={'Si': 1.0})[source]
Computes the continuum X-rays, i.e. the bremsstrahlung multiplied by the absorption and the detection efficiency.
- Parameters
- x
- np.array 1D
Energy scale.
- params_dict
- dict
Dictionnary containing the absorption and detection parameters.
- b0
- float
First parameter.
- b1
- float
Second parameter.
- E0
- float
Energy of the incident beam in keV.
- elements_dict
- dict
Composition of the studied sample. It is required for absorption calculation.
- Returns
- continuum_xrays
- np.array 1D
Continuum X-rays model.
Notes
When an empty dict is provided. The continuum X-rays are set to 0. This is useful to perform calculations without the bremsstrahlung for example.
For an example structure of the params_dict parameter, check the DEFAULT_EDXS_PARAMS espm.conf.
For a custom detection efficiency, check the spectrum fit notebook.
- espm.models.EDXS_function.elts_dict_from_W(part_W, *, elements=[])[source]
Create a dictionnary of the elemental concentration from a fitted W. It useful to recompute absorption during the our custom NMF calculations.
- Parameters
- part_W
- np.array 2D
W matrix output from the NMF calculation. It only makes sense when the NMF decomposition if performed with G.
- elements
- list
List of elements as atomic number or symbol.
- Returns
- elements_dictionnary
- dict
Dictionnary containing the concentration associated to each chemical element of the problem.
Notes
This function should maybe move to another part of espm.
Examples
>>> import numpy as np >>> from espm.models.EDXS_function import elts_dict_from_W >>> W = np.ones((4,3)) >>> elts = ["Si","Ca","O","C"] >>> elts_dict_from_W(W,elements = elts) {"Si" : 0.25, "Ca" : 0.25, "O" : 0.25, "C" : 0.25}
- espm.models.EDXS_function.elts_dict_from_dict_list(dict_list)[source]
Create a single dictionnary of the elemental concentration from a list of dictionnary containing chemical compositions. The new dictionnary corresponds to the average chemical composition of the list.
- Parameters
- dict_list
- list [dict]
list of chemical composition dictionnary
- Returns
- elements_dictionnary
- dict
Dictionnary containing the concentration associated to each chemical elements contained in the input.
Notes
This function should maybe move to another part of espm.
Examples
>>> import numpy as np >>> from espm.models.EDXS_function import elts_dict_from_dict_list >>> dicts = [{"Si" : 1.0},{"Ca" : 1.0},{"O" : 1.0},{"C" : 1.0}] >>> elts_dict_from_dict_list(dicts) {"Si" : 0.25, "Ca" : 0.25, "O" : 0.25, "C" : 0.25}
- espm.models.EDXS_function.elts_list_from_dict_list(dict_list)[source]
Create a list of elements from a list of dictionnary containing chemical compositions. The list contains all the elements contained in the input.
- Parameters
- dict_list
- list [dict]
list of chemical composition dictionnary
- Returns
- elements_list
- list
List of elements as atomic number or symbol.
Examples
>>> import numpy as np >>> from espm.models.EDXS_function import elts_list_from_dict_list >>> dicts = [{"Si" : 1.0, "C" : 0.5, "O" : 0.5},{"Ca" : 1.0, "K" : 2.0},{"O" : 1.0, "Si" : 0.5},{"C" : 1.0}] >>> elts_list_from_dict_list(dicts) ["Si","Ca","O","C","K"]
- espm.models.EDXS_function.gaussian(x, mu, sigma)[source]
Calculate the gaussian function according to the following formula:
\[f(x) = \frac{1}{\sigma \sqrt{2 \pi}} e^{-\frac{(x-\mu)^2}{2 \sigma^2}}\]- Parameters
- xnp.array 1D
Energy scale.
- mufloat
Mean of the gaussian.
- sigmafloat
Standard deviation of the gaussian.
- Returns
- gaussiannp.array 1D
Gaussian distribution on x, with mean mu and standard deviation sigma.
- espm.models.EDXS_function.lifshin_bremsstrahlung(x, b0, b1, E0=200)[source]
Calculate the custom parametrized bremsstrahlung inspired by the model of L.Lifshin.
The model is designed to have the same definition set as the original model of L.Lifshin with \(b_0 > 0\) and \(b_1 > 0\).
The model is defined as follows:
\[f(\varepsilon) = b_0 \frac{\varepsilon_0 - \varepsilon}{\varepsilon_0 \varepsilon} \left(1 - \frac{\varepsilon_0 - \varepsilon}{\varepsilon_0}\right) + b_1 \frac{\left( \varepsilon_0 - \varepsilon \right) ^2}{\varepsilon_0^2 \varepsilon}\]where \(\varepsilon_0\) is the energy of the incident beam, \(\varepsilon\) is the energy of the emitted photon and \(b_0\) and \(b_1\) are the parameters of the model.
- Parameters
- x
- np.array 1D
Energy scale.
- b0
- float
First parameter.
- b1
- float
Second parameter.
- E0
- float
Energy of the incident beam in keV.
- Returns
- bremsstrahlung
- np.array 1D
Bremsstrahlung model
Notes
For details see L.Lifshin, Ottawa, Proc.9.Ann.Conf.Microbeam Analysis Soc. 53. (1974)
- espm.models.EDXS_function.lifshin_bremsstrahlung_b0(x, b0, E0=200)[source]
Calculate the first part of our bremsstrahlung model.
- espm.models.EDXS_function.lifshin_bremsstrahlung_b1(x, b1, E0=200)[source]
Calculate the second part of our bremsstrahlung model.
- espm.models.EDXS_function.read_compact_db(elt, db_dict)[source]
Read the energy and cross section of each line of a chemical element for compact databases.
- Parameters
- elt
- string
Atomic number.
- db_dict
- dict
Dictionnary extracted from the json database containing the emission lines and their energies.
- Returns
- energies
- list float
List of energies associated to each emission line of the given element
- cross-sections
- list float
List of emission cross-sections of each line of the given element
- espm.models.EDXS_function.read_lines_db(elt, db_dict)[source]
Read the energy and cross section of each line of a chemical element for explicit databases.
- Parameters
- elt
- string
Atomic number.
- db_dict
- dict
Dictionnary extracted from the json database containing the emission lines and their energies.
- Returns
- energies
- list float
List of energies associated to each emission line of the given element
- cross-sections
- list float
List of emission cross-sections of each line of the given element
- espm.models.EDXS_function.shelf(x, height, length)[source]
Calculate the shelf contribution of the EDXS Silicon Drift Detector (SDD) detector.
- Parameters
- x
- np.array 1D
Energy scale.
- height
- float
Height in intensity of shelf contribution of the detector.
- length
- float
Length in energy of shelf contribution of the detector.
- Returns
- shelf
- np.array 1D
SDD shelf model.
Notes
For details see [SP09]
- espm.models.EDXS_function.update_bremsstrahlung(G, part_W, model, elements_list)[source]
Update the continuum X-rays part of the G matrix with the current average chemical composition.
- Parameters
- G
- np.array 2D
EDXS modelling matrix.
- part_W
- np.array 2D
Learned chemical concentrations.
- model
- espm.models.EDXS
EDXS model.
- elements_list
- list
List of elements as atomic number or symbol.
- Returns
- new_G
- np.array 2D
New G matrix with updated continuum X-rays models.
Notes
This function should maybe move to another part of espm.
Absorption functions
The espm.models.absorption_edxs
module implements the functions to calculate the contribution of absorption in edx spectra.
- espm.models.absorption_edxs.absorption_coefficient(x, atomic_fraction=False, *, elements_dict={'Si': 1.0})[source]
Calculate the mass-absorption coefficent of a given composition at a certain energy or over a certain energy range.
- Parameters
- x
- np.array 1D
Energy value or energy scale over which the mass-absorption coefficient is calculated.
- atomic_fraction
- boolean
If True, the atomic concentrations of elements_dict are converted in atomic weight fractions.
- elements_dict
- dict
Dictionnary with chemical elements as keys and atomic weight fractions (or atomic concentrations) as values.
- Returns
- mass-absorption coefficient
- np.array 1D
Calculated value of the mass-absorption coefficient on all the energy range.
Notes
The mass-absorption coefficients are calculated using the database of hyperspy [dlPPF+22].
- espm.models.absorption_edxs.absorption_correction(x, thickness=1e-05, toa=90, density=None, atomic_fraction=False, *, elements_dict={'Si': 1.0}, **kwargs)[source]
Calculate the contribution of the absorption on the EDX spectrum for a thin slab of material with a given composition at a certain energy or over a certain energy range. The absorption correction is calculated using the following formula :
\[\frac{1 - \exp(-\chi)}{\chi} \]where \(\chi = \mu \rho t / \sin(\theta)\), \(\mu\) is the mass-absorption coefficient, \(\rho\) is the density of the material, \(t\) is the thickness of the sample and \(\theta\) is the average take-off angle of the x-rays travelling from the sample to the x-ray detectors.
- Parameters
- x
- np.array 1D
Energy value or energy scale over which the mass-absorption coefficient is calculated.
- thickness
- float
Thickness of the material slab in meter. If the thickness is set to 0.0, the function will return 1.0.
- toa
- float
Take-off angle in degrees of the x-rays travelling from the sample to the x-ray detectors.
- density
- float
Density of the material in g/m3 (to be checked). If None, an approximation of the density based on the atomic numbers of the elements of the materials is calculated.
- atomic_fraction
- boolean
If True, the atomic concentrations of elements_dict are converted in atomic weight fractions.
- elements_dict
- dict
Dictionnary with chemical elements as keys and atomic weight fractions (or atomic concentrations) as values.
- Returns
- absorption correction
- np.array 1D
Calculated value of the absorption correction on all the energy range.
- espm.models.absorption_edxs.absorption_mass_thickness(x, mass_thickness, toa=90, atomic_fraction=True, *, elements_dict={'Si': 1.0})[source]
Calculate the contribution of the absorption of a mass-thickness map.
- Parameters
- x
- np.array 1D
Energy value or energy scale over which the mass-absorption coefficient is calculated.
- mass_thickness:
- np.array 2D
Mass-thickness map. See the notes for details.
- toa
- float
Take-off angle in degrees of the x-rays travelling from the sample to the x-ray detectors.
- atomic_fraction
- boolean
If True, the atomic concentrations of elements_dict are converted in atomic weight fractions.
- elements_dict
- dict
Dictionnary with chemical elements as keys and atomic weight fractions (or atomic concentrations) as values.
- Returns
- absorption correction
- np.array 3D
Calculated value of the absorption correction on all the energy range for all pixels of the mass-thickness map.
Notes
The mass-thickness map can be extracted from an EDX spectrum image by calculating the concentration of an element for both K and L lines. Since there should be only one concentration for both lines, it is possible to determine the contribution of the absorption for each pixel and thus obtain the mass-thickness map.
- espm.models.absorption_edxs.det_efficiency(x, det_dict)[source]
Simulate the detection efficiency of an EDX detector. The absorption wihtin one layer of the detector is calculated according to the following equation, corresponding to the Beer-Lambert law:
\[d_k = \exp\left( - \mu(\varepsilon_k)\rho t \right)\]where \(d_k\) is the detection efficiency at the energy \(\varepsilon_k\), \(\mu(\varepsilon_k)\) is the mass-absorption coefficient at the energy \(\varepsilon_k\), \(\rho\) is the density of the material, and \(t\) is the thickness of the material.
In a detection layer of the detector, an absobed photon counts as a detected photon so that the detection efficiency is equal to 1 minus the absorption. The detection efficiency is calculated as the product of the absorption of each layer of the detector.
- Parameters
- x
- np.array 1D
Energy value or energy scale over which the mass-absorption coefficient is calculated.
- det_dict
- dict
See Examples for an example of the structure of the dict. One of the layer of the input dictionnary should be named “detection” to model a active layer of the detector.
- Returns
- efficiency
- np.array 1D
Calculated detection efficiency based on the input layers model.
Examples
>>> from espm.models.absorption_edxs import det_efficiency >>> import matplotlib.pyplot as plt >>> import numpy as np >>> dict = {"detection" : {"thickness" : 10e-3, "density" : 2.3, "atomic_fraction" : False, "elements_dict" : {"Si" : 1.0}}, >>> "dead_layer" : {"thickness" : 50e-7, "density" : 2.7, "atomic_fraction" : False, "elements_dict" : {"Al" : 1.0}}} >>> x = np.linspace(0.5,20,num = 1000) >>> efficiency = det_efficiency(x,dict) >>> plt.plot(x,efficiency)
- espm.models.absorption_edxs.det_efficiency_from_curve(x, filename, kind='cubic')[source]
Interpolate a detection efficiency curve that is stored in ~/espm/tables.
- Parameters
- x
- np.array 1D
Energy value or energy scale over which the mass-absorption coefficient is calculated.
- filename:
- string
Name of the file of the detection efficiency.
- kind
- string
Polynomial order of the interpolation.
- Returns
- Detection efficiency
- np.array 1D
Interpolated detection efficiency.
- espm.models.absorption_edxs.det_efficiency_layer(x, thickness=1e-05, density=None, atomic_fraction=False, *, elements_dict={'Si': 1.0})[source]
Calculate the proportion of absorbed X-rays for one virtual layer of the detector.
- Parameters
- x
- np.array 1D
Energy value or energy scale over which the mass-absorption coefficient is calculated.
- thickness
- float
Thickness of the material slab in meter. If the thickness is set to 0.0, the function will return 1.0.
- density
- float
Density of the material in g/m3 (to be checked). If None, an approximation of the density based on the atomic numbers of the elements of the materials is calculated.
- atomic_fraction
- boolean
If True, the atomic concentrations of elements_dict are converted in atomic weight fractions.
- elements_dict
- dict
Dictionnary with chemical elements as keys and atomic weight fractions (or atomic concentrations) as values.
- Returns
- absorption
- np.array 1D
Calculated absorption of one layer of the detector.
Notes
We assume that the X-rays arrive perpendicular to the detector.
- espm.models.generate_EDXS_phases.generate_brem_params(seed)[source]
Generate random parameters for the Bremsstrahlung model with a scaling factor that somewhat resembles the real data.
- Parameters
- seedint
Seed for the random number generator.
- Returns
- dict
Dictionary containing the parameters for the Bremsstrahlung model.
- espm.models.generate_EDXS_phases.generate_elts_dict(seed, nb_elements=3)[source]
Generate a random dictionary of elements and their relative abundance. The elements are limited to the range 6-82.
- Parameters
- seedint
Seed for the random number generator.
- nb_elementsint, optional
Number of elements to generate. The default is 3.
- Returns
- dict
Dictionary containing the elements and their relative abundance.
- espm.models.generate_EDXS_phases.generate_modular_phases(elts_dicts=3, brstlg_pars=None, scales=None, model_params=None, seed=0)[source]
Generate an array of phases of the EDXS model with set model parameters, elemental compositions and bremsstrahlung parameters.
- Parameters
- elts_dictsint or list, optional
Number of phases to generate or list of dictionnaries containing the chemical compositions. The default is 3.
- brstlg_parslist, optional
List of dictionnaries containing the bremsstrahlung parameters. If the parameters are not set, they are generated randomly.
- scaleslist, optional
List of scaling factors for the phases. If the scales are not set, they are set to 1.0.
- model_paramsdict, optional
Dictionary containing the model parameters. If the parameters are not set, they are set to the default values. See the config file for the default values.
- seedint, optional
Seed for the random number generator. The default is 0.
- Returns
- np.array
Array of phases of the EDXS model.
Examples
>>> from espm.models.generate_EDXS_phases import generate_modular_phases >>> import matplotlib.pyplot as plt >>> phases = generate_modular_phases(elts_dicts = [{"8": 0.5, "14": 0.2, "26": 0.3}, {"8" : 0.5, "23" : 0.3, "7" : 0.2}], ... brstlg_pars = [{"b0" : 0.03, "b1" : 0.01}, {"b0" : 0.002, "b1" : 0.0025}]) >>> plt.plot(phases[0]) >>> plt.plot(phases[1])
- espm.models.generate_EDXS_phases.generate_random_phases(n_phases=3, seed=0)[source]
Generate a list of phases of the EDXS model with random elemental compositions and Bremsstrahlung parameters. The model parameters are set to the default values.
- Parameters
- n_phasesint, optional
Number of phases to generate. The default is 3.
- seedint, optional
Seed for the random number generator. The default is 0.
- Returns
- np.array
Array of phases of the EDXS model.
- espm.models.generate_EDXS_phases.unique_elts(dict_list)[source]
Generate a list of unique chemical elements from a list of chemical elements dictionnaries.
- Parameters
- dict_listlist
List of dictionnaries containing the elements and their relative abundance.
- Returns
- list
List of unique chemical elements.
Examples
>>> unique_elts([{"elements_dict" : {"18" : 0.2, "8" : 0.5, "12" : 0.3}}, {"elements_dict" : {"8" : 0.5, "26" : 0.3, "14" : 0.2}}]) ['18', '8', '12', '26', '14']
Weights
Weights
This module contains functions to generate phase distribution maps. Generate simple maps using espm.weights.generate_weights
or more complex maps using espm.weights.abundance
.
Weights creation class
The espm.weights.abundance
module implements the class Abundance
which is used to create the weights of the phases. The weights are stored in the attribute weights
and are normalized to 1.
- class espm.weights.abundance.Abundance(shape_2d, n_phases)[source]
- add_chemical_map(file, element_line, conc_min, conc_max, sigma, phase_id, **kwargs)[source]
Function to add a chemical map extracted from a EDS spectrum image as a phase. A threshold and blurring are automatically applied to the map to suppress noise.
- Parameters
- filestr
Path to the EDS spectrum image.
- element_linestr
Element line to be used for the chemical map. The corresponding element has to be in the metadata of the spectrum image. For example, if the element is ‘Cu’ and the line is ‘K’, the element_line parameter should be ‘Cu_K’.
- conc_minfloat
Minimum concentration. It has to be between 0.0 and 1.0.
- conc_maxfloat
Maximum concentration. It has to be between 0.0 and 1.0.
- sigmafloat
Sigma parameter for the Gaussian filter. It has to be greater than 0.0. Note that the automatic threshold is applied to the blurred map.
- phase_idint
Index of the phase. It has to be between 1 and n_phases-1.
- **kwargs
Keyword arguments for the get_lines_intensity method of the EDS spectrum image.
- Returns
- None.
- add_gaussian_ripple(center, width, conc_max, phase_id)[source]
Function to define a gaussian ripple spanning over the whole length of the weights abundance of a defined phase. The concentration is max at the center and 0.0 on the edges.
- Parameters
- centerint
Position on the mean of the gaussian ripple in pixels.
- widthint
Full width at half maximum of the gaussian in pixels.
- conc_maxfloat
Maximum concentration of the gaussian ripple. It has to be between 0.0 and 1.0.
- phase_idint
Index of the phase. It has to be between 1 and n_phases-1.
- Returns
- None.
Examples
>>> from espm.weights.abundance import Abundance >>> from matplotlib import pyplot as plt >>> shape_2d = (100,100) >>> n_phases = 2 >>> gaussian_ripple = Abundance(shape_2d, n_phases) >>> gaussian_ripple.add_gaussian_ripple(50, 15, 1.0, 1) >>> plt.imshow(gaussian_ripple.weights[:,:,1])
- add_image(image, phase_id, conc_min, conc_max)[source]
Function to add a 2D numpy array as a phase.
- Parameters
- imagenumpy.ndarray
2D numpy array with the phase. It has to be the same size as the shape_2d parameter.
- phase_idint
Index of the phase. It has to be between 1 and n_phases-1.
- conc_minfloat
Minimum concentration. It has to be between 0.0 and 1.0.
- conc_maxfloat
Maximum concentration. It has to be between 0.0 and 1.0.
- Returns
- None.
Examples
>>> from espm.weights.abundance import Abundance >>> import hyperspy.api as hs >>> from matplotlib import pyplot as plt >>> shape_2d = (512,512) >>> n_phases = 2 >>> image = Abundance(shape_2d, n_phases) >>> data = hs.datasets.example_signals.object_hologram() >>> image.add_image(data.data, 1, 0.0, 1.0) >>> plt.imshow(image.weights[:,:,1])
- add_laplacian(seed, phase_id, conc_min, conc_max, size_x=50, size_y=50)[source]
Function to generate smooth noise. The characteristic length scale of the noise variations is defined by the scale_x and scale_y parameters.
- Parameters
- seedint
Seed for the random number generator.
- phase_idint
Index of the phase. It has to be between 1 and n_phases-1.
- conc_minfloat
Minimum concentration. It has to be between 0.0 and 1.0.
- conc_maxfloat
Maximum concentration. It has to be between 0.0 and 1.0.
- size_xint, optional
characteristic length scale of the noise variations in the x direction. The default is 50.
- size_yint, optional
characteristic length scale of the noise variations in the y direction. The default is 50.
- Returns
- None.
Examples
>>> from espm.weights.abundance import Abundance >>> from matplotlib import pyplot as plt >>> shape_2d = (100,100) >>> n_phases = 2 >>> laplacian = Abundance(shape_2d, n_phases) >>> laplacian.add_laplacian(1, 1, 0.0, 1.0, 10, 10) >>> plt.imshow(laplacian.weights[:,:,1])
- add_random(seed, phase_id, conc_min, conc_max)[source]
Function to generate random noise.
- Parameters
- seedint
Seed for the random number generator.
- phase_idint
Index of the phase. It has to be between 1 and n_phases-1.
- conc_minfloat
Minimum concentration. It has to be between 0.0 and 1.0.
- conc_maxfloat
Maximum concentration. It has to be between 0.0 and 1.0.
- Returns
- None.
Examples
>>> from espm.weights.abundance import Abundance >>> from matplotlib import pyplot as plt >>> shape_2d = (100,100) >>> n_phases = 2 >>> random = Abundance(shape_2d, n_phases) >>> random.add_random(1, 1, 0.0, 1.0) >>> plt.imshow(random.weights[:,:,1])
- add_sphere(ind_origin, radius, conc_max, phase_id, asym_x=1.0, asym_y=1.0)[source]
Function to define a sphere abundance of a defined phase. The concentration is max at centre and 0.0 on the edges.
- Parameters
- ind_origintuple of integers
Coordinates of the centre of the sphere.
- radiusfloat
Radius of the sphere in pixels.
- conc_maxfloat
Maximum concentration of the sphere.
- phase_idint
Index of the phase
- asym_xfloat, optional
Asymmetry of the sphere in the x direction. The default is 1.0.
- asym_yfloat, optional
Asymmetry of the sphere in the y direction. The default is 1.0.
- Returns
- None.
Examples
>>> from espm.weights.abundance import Abundance >>> from matplotlib import pyplot as plt >>> shape_2d = (100,100) >>> n_phases = 2 >>> sphere = Abundance(shape_2d, n_phases) >>> sphere.add_sphere((50,50), 30, 1.0, 1) >>> plt.imshow(sphere.weights[:,:,1])
- add_wedge(ind_origin, length, width, conc_min, conc_max, phase_id)[source]
Function to define a wedge abundance of a defined phase. The concentration is max at the top and min at the bottom.
- Parameters
- ind_origintuple of integers
Coordinates of the top left corner of the wedge. It has to be inside the current weights, i.e. ind_origin[0] + length <= self.shape_2d[0] and ind_origin[1] + width <= self.shape_2d[1].
- lengthint
Length of the wedge in pixels.
- widthint
Width of the wedge in pixels.
- conc_minfloat
Minimum concentration of the wedge. It has to be between 0.0 and 1.0.
- conc_maxfloat
Maximum concentration of the wedge. It has to be between 0.0 and 1.0.
- phase_idint
Index of the phase. It has to be between 1 and n_phases-1.
- Returns
- None.
Examples
>>> from espm.weights.abundance import Abundance >>> from matplotlib import pyplot as plt >>> shape_2d = (100,100) >>> n_phases = 2 >>> wedge = Abundance(shape_2d, n_phases) >>> wedge.add_wedge((0,0), 50, 50, 0.0, 1.0, 1) >>> plt.imshow(wedge.weights[:,:,1])
- check_add_weights(val, phase_id)[source]
Check if the sum of the weights is below 1. If it is, add the new abundance. If not, print a message.
- Parameters
- valarray
Array of the new abundance.
- phase_idint
Index of the phase. It has to be between 1 and n_phases-1.
- Returns
- None.
- scale_phase(values, conc_min, conc_max)[source]
Scale the values of a phase between conc_min and conc_max. If the values are all zero, the function returns the values without scaling.
- Parameters
- valuesarray
Array of the abundance to scale.
- conc_minfloat
Minimum concentration of the phase. It has to be between 0.0 and 1.0.
- conc_maxfloat
Maximum concentration of the phase. It has to be between 0.0 and 1.0.
- Returns
- scaled_valuesarray
Array of the scaled abundance.
- property weights
Predetermined weights
The :mod: espm.weights.generate_weights module implements functions to generate predetermined weights to quickly and easily generate spatial phase distributions. For a more advanced weights creation, see the :mod: espm.weights.abundance module.
- espm.weights.generate_weights.chemical_map_weights(file=None, line_list=[], conc_list=[], **kwargs)[source]
Generate a weight matrix based on an experimental EDS spectrum image. For each selected line a map is generated with the corresponding concentration.
- Parameters
- shape_2dtuple, optional
Shape of the weight matrix. The default is [80, 80].
- filestr, optional
Path to the EDS spectrum image. If None, the function does nothing.
- line_listlist, optional
List of lines to be used. The default is []. If line_list is empty, the function does nothing. The lines must be in the form of a string, e.g. “Fe_Ka”.
- conc_listlist, optional
List of concentrations for each line. The default is []. If conc_list is empty, the concentration of each line is set to 1/len(line_list).
- Returns
- weightsnumpy.ndarray
Weight matrix with chemical map distribution.
- espm.weights.generate_weights.gaussian_ripple_weights(shape_2d, width=1, seed=0, **kwargs)[source]
Generate a weight matrix with a gaussian ripple of the given width and randomly centered.
- Parameters
- shape_2dtuple
Shape of the weight matrix.
- widthint, optional
Width of the gaussian ripple. The default is 1.
- seedint, optional
Seed for the random number generator. The default is 0. If seed is 0, the gaussian ripple is centered at half the second dimension of the weight matrix.
- Returns
- weightsnumpy.ndarray
Weight matrix with gaussian ripple distribution.
Examples
>>> from espm.weights.generate_weights import gaussian_ripple_weights >>> import matplotlib.pyplot as plt >>> weights = gaussian_ripple_weights((100,100), 20, 0) >>> fig = plt.figure(figsize=(5,2)) >>> axs = fig.subplots(1,2) >>> for i in range(2): ... axs[i].imshow(weights[:,:,i], cmap=plt.cm.gray_r) ... axs[i].set_title(f"Map {i+1}") >>> fig.show()
- espm.weights.generate_weights.generate_weights(weight_type, shape_2d, n_phases=3, seed=0, **params)[source]
Generate a weight matrix with the given type. Additional parameters can be passed to the function as keyword arguments.
- Parameters
- weight_typestr
Type of weight matrix. Accepted types : random, laplacian, sphere, gaussian_ripple, wedge, toy_problem, chemical_map.
- shape_2dtuple
Shape of the weight matrix.
- n_phasesint, optional
Number of phases. The default is 3. The first phase is the complementary of the other phases.
- seedint, optional
Seed for the random number generator. The default is 0.
- **paramsdict
Additional parameters for the weight matrix generation. See the documentation of the corresponding function for more details.
- Returns
- weightsnumpy.ndarray
Weight matrix with the given type.
- espm.weights.generate_weights.laplacian_weights(shape_2d, n_phases=3, seed=0, size_x=10, size_y=10, **kwargs)[source]
Generates a random weight matrix with a laplacian distribution.
The random weight are then filtered with median filter 2 times to smooth the distribution. Eventually, the result is normalized to sum up to one.
- Parameters
- shape_2dtuple
Shape of the weight matrix.
- n_phasesint, optional
Number of phases. The default is 3.
- seedint, optional
Seed for the random number generator. The default is 0.
- Returns
- weightsnumpy.ndarray
Weight matrix with laplacian distribution.
Examples
>>> import numpy as np >>> import matplotlib.pyplot as plt >>> from espm.weights.generate_weights import laplacian_weights >>> weights = laplacian_weights((100,100), 3, 0) >>> fig = plt.figure(figsize=(10,3)) >>> axs = fig.subplots(1,3) >>> for i in range(3): ... axs[i].imshow(weights[:,:,i], cmap=plt.cm.gray_r) ... axs[i].set_title(f"Map {i+1}") >>> fig.show()
- espm.weights.generate_weights.random_weights(shape_2d, n_phases=3, seed=0)[source]
Generates a random weight matrix with a uniform distribution.
The random weights are then normalized to sum up to one.
- Parameters
- shape_2dtuple
Shape of the weight matrix.
- n_phasesint, optional
Number of phases. The default is 3.
- seedint, optional
Seed for the random number generator. The default is 0.
- Returns
- weightsnumpy.ndarray
Weight matrix with uniform distribution.
Examples
>>> from espm.weights.generate_weights import random_weights >>> import matplotlib.pyplot as plt >>> weights = random_weights((100,100)) >>> fig = plt.figure(figsize=(10,3)) >>> axs = fig.subplots(1,3) >>> for i in range(3): ... axs[i].imshow(weights[:,:,i], cmap=plt.cm.gray_r) ... axs[i].set_title(f"Map {i+1}") >>> fig.show()
- espm.weights.generate_weights.spheres_weights(shape_2d=[80, 80], n_phases=3, seed=0, radius=1, **kwargs)[source]
Generate a weight matrix with randomly placed spheres of the given radius.
- Parameters
- shape_2dtuple, optional
Shape of the weight matrix. The default is [80, 80].
- n_phasesint, optional
Number of phases. The default is 3. The first phase is the complementary of the other phases.
- seedint, optional
Seed for the random number generator. The default is 0.
- radiusint, optional
Radius of the spheres in pixels. The default is 1.
- Returns
- weightsnumpy.ndarray
Weight matrix with spheres distribution.
Examples
>>> from espm.weights.generate_weights import spheres_weights >>> import matplotlib.pyplot as plt >>> weights = spheres_weights((100,100), 3, 0, 20) >>> fig = plt.figure(figsize=(10,3)) >>> axs = fig.subplots(1,3) >>> for i in range(3): ... axs[i].imshow(weights[:,:,i], cmap=plt.cm.gray_r) ... axs[i].set_title(f"Map {i+1}") >>> fig.show()
- espm.weights.generate_weights.toy_weights(**kwargs)[source]
Load the toy problem weights.
- Returns
- Hdotnp.ndarray
The weights of the toy problem.
Examples
>>> from espm.weights.generate_weights import toy_weights >>> import matplotlib.pyplot as plt >>> weights = toy_weights() >>> fig = plt.figure(figsize=(10,3)) >>> axs = fig.subplots(1,3) >>> for i in range(3): ... axs[i].imshow(weights[:,:,i], cmap=plt.cm.gray_r) ... axs[i].set_title(f"Map {i+1}") >>> fig.show()
- espm.weights.generate_weights.wedge_weights(shape_2d=[80, 80])[source]
Generate a weight matrix with a wedge of phase 2 and complementary phase 1.
- Parameters
- shape_2dtuple, optional
Shape of the weight matrix. The default is [80, 80].
- Returns
- weightsnumpy.ndarray
Weight matrix with wedge distribution.
Examples
>>> from espm.weights.generate_weights import wedge_weights >>> import matplotlib.pyplot as plt >>> weights = wedge_weights((100,100)) >>> fig = plt.figure(figsize=(5,2)) >>> axs = fig.subplots(1,2) >>> for i in range(2): ... axs[i].imshow(weights[:,:,i], cmap=plt.cm.gray_r) ... axs[i].set_title(f"Map {i+1}") >>> fig.show()
Datasets
Sample creation
The espm.datasets
module implements the functions that combines the spatial distributions generated from espm.weights
and the spectra generated from espm.models
into a 3D dataset.
This part of the espm package manages the integration into the hyperspy framework :
- the datasets and their metadata are stored as hyperspy signals (.hspy).
- the espm.eds_spim
module implements the EDS_espm
class, which is a subclass of the hyperspy.signals.Signal1D
class.
Using the EDS_espm
class, the user can easily use most of the hyperspy functionalities (e.g. plotting, fitting, decomposition, etc.) as well as the espm functionalites on their experimental and simulated data.
Note
For now espm supports only the signals modeled as EDS data but we aim at implementing the signals corresponding to EELS data too.
The module espm.datasets.base
implements the functions that combines a spatial distribution and associated spectra into a 3D dataset. It also implements the functions to convert the dataset into hyperspy compatible objects.
- espm.datasets.base.generate_dataset(base_path=PosixPath('/home/docs/checkouts/readthedocs.org/user_builds/espm/envs/v0.2.1/lib/python3.7/site-packages/generated_datasets'), sample_number=10, base_seed=0, elements=[], *args, **kwargs)[source]
Generate a set of spectrum images files and save them in the generated dataset folder. Each spectrum image is saved in a separate file and was generated using a different seed.
- Parameters
- base_pathstr, optional
The path to the folder where the samples will be saved. The default is DATASETS_PATH.
- sample_numberint, optional
The number of samples to generate. The default is 10.
- base_seedint, optional
The seed used to generate the samples. The default is 0.
- Returns
- None.
- espm.datasets.base.generate_spim(phases, weights, densities, N, seed=0, continuous=False)[source]
Generate a noiseless spectrum image as tensor product of the phases and weights. Then, if asked for, a noisy spectrum image is generated by drawing from a Poisson distribution.
The noiseless spectrum image is defined as:
\[Y^{nl} = N D \otimes ( Diag(d) A )\]where \(D\) is the normalized phases, \(A\) is the weights, \(d\) is the density modifier and \(N\) is the number of counts per pixel.
To obtain the noisy spectrum image, the noiseless spectrum image is drawn from a Poisson distribution.
- Parameters
- phasesarray_like
The phases of the model. Shape (n, spectral_len).
- weightsarray_like
The weights of the model. Shape (shape_2d[0], shape_2d[1], n).
- densitiesarray_like
Density modifier of the phases. Shape (n,).
- Nint
The number of counts per pixel.
- seedint, optional
Seed for the random number generator. The default is 0.
- continuousbool, optional
If True, the function returns a noiseless spectrum image. The default is False.
- Returns
- numpy.ndarray
The spectrum image. Shape (shape_2d[0], shape_2d[1], spectral_len).
Notes
More details about the spectrum image generation can be found in the contribution: [TPH+23].
- espm.datasets.base.generate_spim_sample(phases, weights, model_params, misc_params, seed=0, g_params={})[source]
Generate a dictionary containing: the spectrum image (made with the weights and phases), the ground truth, the model parameters and the misc parameters.
- Parameters
- phasesarray_like
The phases of the model. Shape (n, spectral_len).
- weightsarray_like
The weights of the model. Shape (shape_2d[0], shape_2d[1], n). The weights should sum to one along axis 2.
- model_paramsdict
The parameters of the model. For examples see the default parameters in espm.conf.
- misc_paramsdict
The misc parameters of the model. For examples see the default parameters in espm.conf.
- seedint, optional
The seed for the random number generator. The default is 0.
- g_paramsdict, optional
The parameters for the g matrix. The default is {}. Note that for EDXS data the g matrix is not used during the creation of the data.
- Returns
- sampledict
A dictionary containing the spectrum image, the ground truth, the model parameters and the misc parameters.
- espm.datasets.base.sample_to_EDS_espm(sample, elements=[])[source]
Convert dataset to a custom hyperspy signal type called EDS_espm containing the noisy spectrum image as data, the ground truth as metadata and other useful information.
- Parameters
- sampledict
A dictionary containing the noisy spectrum image as data, the ground truth as metadata and other useful information. See
espm.datasets.base.generate_spim_sample()
for more details.- elementslist, optional
A list of the elements present in the sample. The default is [].
- Returns
- EDS_espm
The hyperspy compatible signal object of the
espm.eds_spim
module.
- espm.datasets.base.sample_to_Signal1D(sample)[source]
Same as
espm.datasets.base.sample_to_EDS_espm()
but for non-EDS data such as the toy dataset.
The module espm.datasets.built_in_EDXS_datasets
implements the functions that generate two built-in datasets:
- A dataset of 2 particles embedded in a matrix.
- A dataset with a linear local accumulation of Sr.
- espm.datasets.built_in_EDXS_datasets.generate_built_in_datasets(seeds_range=10)[source]
Generate the two built-in datasets if they are not already present in the datasets folder.
- Parameters
- seeds_rangeint
The number of seeds to use for the generation of the built-in datasets. The built-in datasets are generated with a base_seed, and then the base_seed + 1, base_seed + 2, etc. up to base_seed + seeds_range -1.
- Returns
- None
- espm.datasets.built_in_EDXS_datasets.load_grain_boundary(sample=0)[source]
Load the built-in dataset of a grain boundary.
- Parameters
- sampleint
The sample number to load.
- Returns
- spimhyperspy.signals.EDS_espm
The loaded dataset.
- espm.datasets.built_in_EDXS_datasets.load_particules(sample=0)[source]
Load the built-in dataset of particles.
- Parameters
- sampleint
The sample number to load.
- Returns
- spimhyperspy.signals.EDS_espm
The loaded dataset.
The module espm.eds_spim
implements the EDS_espm
class, which is a subclass of the hyperspy.signals.Signal1D
class.
The main purpose of this class is to provide an easy and clean interface between the hyperspy framework and the espm package:
- The metadata are organised to correspond as much as possible to the typical metadata that can be found in hyperspy EDS_TEM object.
- The machine learning algorithms of espm can be easily applied to the EDS_espm
object using the standard hyperspy decomposition method. See the notebooks for examples.
- The EDS_espm
provides a convinient way to:
get the results of
espm.estimators.NMFEstimator
access ground truth in case of simulated data
set fixed W and H for the
espm.estimators.NMFEstimator
decomposition
- class espm.datasets.eds_spim.EDS_espm(*args, **kwargs)[source]
- add_elements(*, elements=[])[source]
Add elements to the existing list of elements in the metadata.
- Parameters
- elementslist, optional
List of the elements to be added to the existing list of elements in the metadata. They have to be chemical symbols (e.g. [‘Si’,’Fe’, ‘O’]).
- build_G(problem_type='bremsstrahlung', reference_elt={})[source]
Build the G matrix of the
espm.models.EDXS
model corresponding to the metadata of theEDS_espm
object and stores it as an attribute.- Parameters
- problem_typestr, optional
- Determines the type of the G matrix to build. It can be “bremsstrahlung”, “no_brstlg” or “identity”. The parameters correspond to:
“bremsstrahlung” : the G matrix is a callable with both characteristic X-rays and a bremsstrahlung model.
“no_brstlg” : the G matrix is a matrix with only characteristic X-rays.
“identity” : the G matrix is None which is equivalent to an identity matrix for espm functions.
- reference_eltdict, optional
Dictionary containing atomic numbers and a corresponding cut-off energies. It is used to separate the characteristic X-rays of the given elements into two energies ranges and assign them each a column in the G matrix instead of having one column per element. For example reference_elt = {“26”,3.0} will separate the characteristic X-rays of the element Fe into two energies ranges and assign them each a column in the G matrix. This is useful to circumvent issues with the absorption.
- Returns
- GNone or numpy.ndarray or callable
The G matrix of the
espm.models.EDXS
model corresponding to the metadata of theEDS_espm
object.
- build_ground_truth(reshape=True)[source]
Get the ground truth stored in the metadata of the
EDS_espm
object, if available. The reshape arguments can be used to get the ground truth in a form easier to use for machine learning algorithms.- Parameters
- reshapebool, optional
If False, the ground truth is returned in the form of a 3D array of shape (shape_2d[0],shape_2d[1],n_phases) and a 2D array of shape (n_phases,n_features).
- Returns
- phasesnumpy.ndarray
The ground truth of the spectra of the phases.
- weightsnumpy.ndarray
The ground truth of the spatial distribution of the phases.
- print_concentration_report(abs=False)[source]
Print a report of the chemical concentrations from a fitted W.
- Parameters
- absbool
If True, print the absolute concentrations, if False, print the relative concentrations.
- Returns
- None
Notes
This function is only available if the learning results contain a decomposition algorithm that has been fitted.
The “absolute” concentrations correspond to some physical number. To retrieve the number of atoms per unit volume, you need to multiply by the correct pre-factors such as beam current, detector solid angle, etc…
- set_additional_parameters(thickness=2e-05, density=3.5, detector_type='SDD_efficiency.txt', width_slope=0.01, width_intercept=0.065, xray_db='default_xrays.json')[source]
Helper function to set the metadata that are specific to the
espm
package so that it does not overwrite experimental metadata. See the documentation of theset_analysis_parameters()
function for the meaning of the parameters.
- set_analysis_parameters(beam_energy=200, azimuth_angle=0.0, elevation_angle=22.0, tilt_stage=0.0, elements=[], thickness=2e-05, density=3.5, detector_type='SDD_efficiency.txt', width_slope=0.01, width_intercept=0.065, xray_db='default_xrays.json')[source]
Helper function to set the metadata of the
EDS_espm
object. Be careful, it will overwrite the metadata of the object.- Parameters
- beam_energyfloat, optional
The energy of the electron beam in keV.
- azimuth_anglefloat, optional
The azimuth angle of the EDS detector in degrees.
- elevation_anglefloat, optional
The elevation angle of the EDS detector in degrees.
- tilt_stagefloat, optional
The tilt angle of the sample stage in degrees (usually it correspond to alpha on FEI instruments).
- elementslist, optional
List of the elements to be used in the analysis.
- thicknessfloat, optional
The thickness of the sample in centimeters.
- densityfloat, optional
The density of the sample in g/cm^3.
- detector_typestr, optional
The type of the detector. It is either the name of a text file containing the efficiency of
- width_slopefloat, optional
The slope of the linear fit of the detector width as a function of the energy.
- width_interceptfloat, optional
The intercept of the linear fit of the detector width as a function of the energy.
- xray_dbstr, optional
The name of the X-ray emission cross-section database to be used. The default tables are avalaible in the espm/tables folder. Additional tables can be generated by emtables.
- set_fixed_W(phases_dict)[source]
Helper function to create a fixed_W matrix. The output matrix will have -1 entries except for the elements (and bremsstrahlung parameters) that are present in the phases_dict dictionary. In the output (fixed_W) matrix, the -1 entries will be ignored during the decomposition using
espm.estimator.NMFEstimator
are normally learned while the non-negative entries will be fixed to the values given in the phases_dict dictionary. Usually, the easiest is to fix some elements to 0.0 in some phases if you want to improve unmixing results. For example, if you have a phase with only Si and O, you can fix the Fe element to 0.0 in this phase.- Parameters
- phases_dictdict
Determines which elements of fixed_W are going to be non-negative. The dictionnary has typically the following structure : phases_dict = {“phase1_name” : {“Fe” : 0.0, “O” : 1.25e23}, “phase2_name” : {“Si” : 0.0, “b0” : 0.05}}.
- Returns
- ——-
- Wnumpy.ndarray
- set_microscope_parameters(beam_energy=200, azimuth_angle=0.0, elevation_angle=22.0, tilt_stage=0.0)[source]
Helper function to set the microscope parameters of the
EDS_espm
object. Be careful, it will overwrite the microscope parameters of the object. See the documentation of theset_analysis_parameters()
function for the meaning of the parameters.
- update_G(part_W=None, G=None)[source]
Update the absortion part of the bremsstrahlung of the G matrix.
- property X
The data in the form of a 2D array of shape (n_samples, n_features).
- property Xdot
The ground truth in the form of a 3D array of shape (shape_2d[0],shape_2d[1],n_features), if available.
- property maps
Ground truth of the spatial distribution of the phases in the form of a 3D array of shape (shape_2d[0],shape_2d[1],n_phases), if available.
- property maps_2d
Ground truth of the spatial distribution of the phases in the form of a 2D array of shape (shape_2d[0]*shape_2d[1],n_phases), if available.
- property phases
Ground truth of the spectra of the phases in the form of a 2D array of shape (n_phases,n_features), if available.
- property shape_2d
Shape of the data in the spatial dimension.
Estimators
NMF Estimators.
The espm.estimators
module implements different NMF algorithms.
The class espm.estimators
is an abstract class for all NMF algorithms. It implements the fit and transform methods.
The fit method is implemented in the abstract class and calls the _iteration method which is implemented in the child classes.
The transform method is implemented in the child classes.
NMF Estimator
- class espm.estimators.NMFEstimator(n_components=2, init=None, tol=0.0001, max_iter=200, random_state=None, verbose=1, debug=False, l2=False, G=None, shape_2d=None, normalize=False, log_shift=1e-14, eval_print=10, true_D=None, true_H=None, fixed_H=None, fixed_W=None, hspy_comp=False)[source]
Abstract class for NMF algorithms.
This abstract class espm.estimators.NMFEstimator is used to implement the different NMF algorithms. It solves problems of the form:
\[\dot{W}, \dot{H} = \arg \min_{W \geq \epsilon, H \geq \epsilon} \frac{1}{2} L(X, GWH) + R(W, H)\]where \(X\) is the data matrix, \(G\) is a matrix of known values, \(W\) and \(H\) are the matrices to be learned, \(R\) is a regularization term and \(L\) is a loss function. that represent the generalized KL divergence. As a reminder, the generalized KL divergence is defined as:
\[D_{GKL}(X || Y) = \sum_{i,j} X_{ij} \log \frac{X_{ij}}{Y_{ij}} - X_{ij} + Y_{ij}\]where \(Y = GWH\). Since \(X\) does not depend on \(W\) and \(H\), we obtain the loss function:
\[L(X, Y) = - \sum_{i,j} X_{ij} \log \frac{GWH_{ij}} + GWH_{ij}\]The Generalized KL divergence has the advantage of being zero when \(X = Y\), which is not the case for our loss. Therefore, we shift the loss function by a constant \(C\) such that it equals the Generalized KL divergence. This constant is stored in the attribute
espm.estimators.NMFEstimator.const_KL_
.The loss function can also be selected to be the Frobenius norm. In this case, the loss function is:
\[L(X, Y) = \frac{1}{2} \sum_{i,j} (X_{ij} - Y_{ij})^2\]While the code will work, it is not recommended to use the Frobenius norm as a loss function. This code is optimized for the KL divergence.
The size of:
\(X\) is \((n, p)\),
\(W\) is \((m, k)\),
\(H\) is \((k, p)\),
\(G\) is \((n, m)\).
The columns of the matrices \(H\) and \(X\) are assumed to be images. This is used typically for the smoothness regularization. The parameter shape_2d defines the shape of the images, i.e. shape_2d[0]*shape_2d[1] = p.
- Parameters
- n_componentsint, default=2
Number of components, i.e. dimensionality of the latent space.
- initstr
Method used to initialize the procedure. Default is None The method use the initialization of
sklearn.decomposition
. It can be imported using: .. code-block::python>>> from sklearn.decomposition._nmf import _initialize_nmf
- tolfloat, default=1e-4
Tolerance of the stopping condition.
- max_iterint, default=200
Maximum number of iterations before timing out.
- random_stateint, RandomState instance, default=None
- verboseint, default=1
The verbosity level.
- debugbool, default=False
If True, the algorithm will log more and perform more checks.
- l2bool, default=False
If True, the algorithm will use the l2 norm instead of the KL divergence.
- Gnp.array, function or None, default=None
If np.array, it is the known matrix of the data. If function, it is a function that takes as input the data matrix and returns the known matrix (np.array). If None, it is assumed that G is the identity matrix.
- shape_2dtuple or None, default=None
If not None, it is the image shape of the columns of the matrices \(X\) and \(H\).
- normalizebool, default=False
If True, the algorithm will normalize the data matrix \(X\).
- log_shiftfloat, default=1e-10
Lower bound for W and H, i.e. \(\epsilon\).
- eval_printint, default=10
Number of iterations between each evaluation of the loss function.
- true_Dnp.array or None, default=None
Ground truth for the matrix \(GW\). Used for evaluation purposes.
- true_Hnp.array or None, default=None
Ground truth for the matrix \(H\). Used for evaluation purposes.
- fixed_Hnp.array or None, default=None
If not None, it fixes the non-zero values of the matrix \(H\).
- fixed_Wnp.array or None, default=None
If not None, it fixes the non-zero values of the matrix \(W\).
- hspy_compbool, default=False
If True, the algorithm will use the format compatible with hyperspy. Use this option if you run the algorithm with the method decompositio in hyperspy. For example: .. code-block::python
>>> est = SmoothNMF( n_components = 3, hspy_comp = True) >>> out = spim.decomposition(algorithm = est, return_info=True)
- fit(X, y=None, **params)[source]
Learn a NMF model for the data X.
- Parameters
- X{array-like, sparse matrix} of shape (n_samples, n_features)
Data matrix to be decomposed
- yIgnored
- paramsdict
Parameters passed to the fit_transform method.
- Returns
- self
The model.
- fit_transform(X, y=None, W=None, H=None)[source]
Learn a NMF model for the data X and returns the transformed data. This is more efficient than calling fit followed by transform.
The size of:
\(X\) is \((n, p)\),
\(W\) is \((m, k)\),
\(H\) is \((k, p)\),
\(G\) is \((n, m)\).
- Parameters
- X{array-like, sparse matrix} of shape (n, p)
Data matrix to be decomposed
- yIgnored
Not used, present here for API consistency by convention.
- Warray-like of shape (m, k)
If specified, it is used as initial guess for the solution.
- Harray-like of shape (k, p)
If specified, it is used as initial guess for the solution.
- Returns
- GWndarrays
Transformed data.
- inverse_transform(W)[source]
Transform data back to its original space.
- Parameters
- W{ndarray, sparse matrix} of shape (n_samples, n_components)
Transformed data matrix.
- Returns
- X{ndarray, sparse matrix} of shape (n_samples, n_features)
Data matrix of original shape.
- loss(W, H, average=True, X=None)[source]
Loss function.
Compute the loss function for the given matrices W and H.
- Parameters
- Wnp.array
Matrix of shape (n, k)
- Hnp.array
Matrix of shape (k, p)
- averagebool, default=True
If True, the loss is averaged over the number of elements of the matrices.
- Xnp.array or None, default=None
If not None, it is the data matrix. If None, it is assumed that the data matrix in self.X_.
- Returns
- loss_float
Value of the loss function.
- const_KL_ = None
- loss_names_ = ['KL_div_loss']
SmoothNMF
- class espm.estimators.SmoothNMF(lambda_L=1.0, linesearch=False, mu=0, epsilon_reg=1, algo='log_surrogate', force_simplex=True, dicotomy_tol=1e-05, gamma=None, **kwargs)[source]
SmoothNMF - NMF with a smooth regularization term
We encourage to read the example available in the documentation: https://espm.readthedocs.io/en/latest/introduction/notebooks/toy-problem.html
The corresponding notebook is available on github: https://github.com/adriente/espm/blob/main/notebooks/toy-ML.ipynb
The class SmoothNMF implements the regularized NMF algorithm. It solves problems of the form:
\[\dot{W}, \dot{H} = \arg \min_{W \geq \epsilon, H \geq \epsilon} D_{GKL}(X || GWH) + \lambda_L tr(H \Delta H^\top) + \mu \sum_{ij} \log(H_{ij} + \epsilon_{reg})\]where:
D_{GKL} is the Generalized KL divergence loss function defined as:
\[D_{GKL}(X || Y) = \sum_{i,j} X_{ij} \log \frac{X_{ij}}{Y_{ij}} - X_{ij} + Y_{ij}\]See the documentation of the class
espm.estimators.NMFEstimator
for more details.Delta is the Laplacian operator (it can be created using the function create_laplacian_matrix from the utils module).
epsilon_{reg} is the slope of the log regularization/sparsity at 0 (you probably want to leave this to 1).
lambda_L is a regularization parameter, which encourages smoothness in the columns of H.
mu is a regularization parameter, which is similar to an L1 sparsity penalty.
The size of:
X is (n, p)
W is (m, k)
H is (k, p)
G is (n, m)
The columns of the matrices H and X are assumed to be images, typically for the smoothness regularization. The parameter shape_2d defines the shape of the images, i.e., shape_2d[0]*shape_2d[1] = p.
- Parameters
- lambda_Lfloat, default=1.0
Regularization parameter for the smooth regularization term.
- linesearchbool, default=False
If True, use a line search to find the step size.
- mufloat, default=0
Regularization parameter for the log regularization/sparsity term.
- epsilon_regfloat, default=1
Slope of the log regularization/sparsity at 0.
- algostr, default=”log_surrogate”
Algorithm to use for the smooth regularization term. Can be “log_surrogate”, “l2_surrogate”, or “projected_gradient”.
- force_simplexbool, default=True
If True, force the solution to be in the simplex.
- dicotomy_tolfloat, default=1e-3
Tolerance for the dichotomy algorithm.
- gammafloat, default=None
Initial value for the step size. If None, it is set to the Lipschitz constant of the gradient.
- **kwargsdict
Additional parameters for the NMFEstimator class.
- fit_transform(X, y=None, W=None, H=None)[source]
Fit the model to the data X and returns the transformed data.
The size of:
\(X\) is \((n, p)\),
\(W\) is \((m, k)\),
\(H\) is \((k, p)\),
\(G\) is \((n, m)\).
- Parameters
- Xarray-like, shape (n, p)
Data matrix to be decomposed
- yIgnored
Not used, present here for API consistency by convention.
- Warray-like, shape (m, k)
If init=’custom’, it is used as initial guess for the solution.
- Harray-like, shape (k, p)
If init=’custom’, it is used as initial guess for the solution.
- Returns
- GWndarrays
Transformed data.
- loss_names_ = ['KL_div_loss', 'log_reg_loss', 'Lapl_reg_loss', 'gamma']
Measures
The espm.measures
module implements different measures for the matrix factorisation problem.
In particular it contains the different losses and regularizers used in espm.estimator
module.
It also contains different metrics to evaluate the results.
- espm.measures.Frobenius_loss(X, W, H, average=False)[source]
Frobenius norm of the difference between X and WH.
Compute the Froebenius norm (elementwise L2 norm of a matrix) given \(X,W,H\):
\[\| X - WH \|_F = \sum_{ji} \left| X_{ij} - (W H)_{ij} \right|^2\]- Parameters
X (np.array 2D) – n x m matrix
W (np.array 2D) – n x k matrix
H (np.array 2D) – k x m matrix
average (boolean) – replace the sum with a mean, i.e., divide the result by n*m (default False)
- Returns
the answer
Examples
>>> import numpy as np >>> from espm.measures import Frobenius_loss >>> X = np.array([[1, 1, -1], [2, 4, 5]]) >>> W = np.array([[1], [1]]) >>> H = np.array([[1, 2, 3]]) >>> Frobenius_loss(X, W, H) 26
- espm.measures.KL(X, Y, log_shift=1e-14, average=False)[source]
Generalized KL (Kullback–Leibler) divergence for two matrices
\[D_KL(X || Y) = \sum_{ji} X_{ij} \log (X / Y)_{ij} + (Y - X)_{ij}\]- Parameters
X (np.array 2D) – n x m matrix
W (np.array 2D) – n x k matrix
H (np.array 2D) – k x m matrix
log_shift (float) – small constant to ensure the KL divergence does not explode (default value set in module
esppy.conf
)average (boolean) – replace the sum with a mean, i.e., divide the result by n*m (default False)
- Returns
the answer
- Return type
- espm.measures.KL_loss_surrogate(X, W, H, Ht, log_shift=1e-14, average=False)[source]
Surrogate loss for the KL divergence.
- espm.measures.KLdiv(X, D, H, log_shift=1e-14, average=False)[source]
Generalized KL (Kullback–Leibler) divergence
Compute the generalized KL divergence given \(X,W,H\):
\[D_KL(X || WH) = \sum_{ji} X_{ij} \log (X / D A)_{ij} + (D A - X)_{ij}\]- Parameters
X (np.array 2D) – n x m matrix
W (np.array 2D) – n x k matrix
H (np.array 2D) – k x m matrix
log_shift (float) – small constant to ensure the KL divergence does not explode (default value set in module
esppy.conf
)average (boolean) – replace the sum with a mean, i.e., divide the result by n*m (default False)
- Returns
the answer
- Return type
Examples
>>> import numpy as np >>> from espm.measures import KLdiv >>> X = np.array([[1, 1, 1], [2, 4, 5]]) >>> W = np.array([[1], [1]]) >>> H = np.array([[1, 2, 3]]) >>> KLdiv(X, W, H) 2.921251732961556
- espm.measures.KLdiv_loss(X, W, H, log_shift=1e-14, average=False)[source]
Generalized Generalized KL (Kullback–Leibler) divergence loss
Compute the loss based on the generalized KL divergence given :math: X,W,H:
\[\sum_{ji} X_{ij} \log (D W)_{ij} + (D W)_{ij}\]This does not contains all the term of the KL divergence, only the ones depending on W and H.
- Parameters
X (np.array 2D) – n x m matrix
Y (np.array 2D) – n x m matrix
log_shift (float) – small constant to ensure the KL divergence does not explode (default value set in module
esppy.conf
)average (boolean) – replace the sum with a mean, i.e., divide the result by n*m (default False)
- Returns
the answer
- Return type
Examples
>>> import numpy as np >>> from espm.measures import KLdiv, KLdiv_loss >>> X = np.array([[1, 1, 1], [2, 4, 5]]) >>> W = np.array([[1], [1]]) >>> H = np.array([[1, 2, 3]]) >>> KLdiv_loss(X, W, H) 1.9425903651915402ZSXDerzA >>> KLdiv(X, W, H) 2.921251732961556
- espm.measures.find_min_MSE(true_maps, algo_maps, get_ind=False, unique=False)[source]
Compare all the mean squared errors between ground truth spectra and NMF spectra and find the minimum configuration.
- Parameters
true_maps (np.array 2D) – true maps with shape (number of phases, number of pixels)
algo_maps (np.array 2D) – NMF maps with shape (number of phases, number of pixels)
get_ind (boolean) – If True, the function will also return the indices of the NMF maps corresponding to the ground truth
unique (boolean) – If False it will find the global minimum but several maps can be associated to the same ground truth
- Returns
list of mse, (optionally) tuple of indices
- Return type
- ..warning::
The output being either a list or a tuple of list isn’t a great idea. It has to change.
- espm.measures.find_min_angle(true_vectors, algo_vectors, get_ind=False, unique=False)[source]
Compare all the angles between ground truth spectra and NMF spectra and find the minimum configuration.
- Parameters
true_vectors (np.array 2D) – true spectra with shape (number of phases, number of energy channels)
algo_vectors (np.array 2D) – NMF spectra with shape (number of phases, number of energy channels)
get_ind (boolean) – If True, the function will also return the indices of the NMF spectra corresponding to the ground truth
unique (boolean) – If False it will find the global minimum but several spectra can be associated to the same ground truth
- Returns
list of angles, (optionally) tuple of indices
- Return type
- ..warning::
The output being either a list or a tuple of list isn’t a great idea. It has to change.
- espm.measures.find_min_config(true_maps, true_spectra, algo_maps, algo_spectra, angles=True)[source]
Determines the best match between the true and the NMF spectra and maps by finding the configuration that minimizes either the sum of angles or the sum of MSE. The function returns a warning (boolean) if the MSE and the angles disagree on the best configuration.
- Parameters
true_maps (np.array 2D) – true maps with shape (number of phases, number of pixels)
true_spectra (np.array 2D) – true spectra with shape (number of phases, number of energy channels)
algo_maps (np.array 2D) – NMF maps with shape (number of phases, number of pixels)
algo_spectra (np.array 2D) – NMF spectra with shape (number of phases, number of energy channels)
angles (boolean) – If True, the function will minimize the sum of angles between true and NMF spectra. If False, it will minimize the sum of MSE between true and NMF maps.
- Returns
list of angles, list of MSE, configuration of the best match, warning
- espm.measures.log_reg(H, mu, epsilon=1, average=False)[source]
Log regularisation
Compute the regularization loss:
\[R(\mu, H, \epsilon) = \sum_{ij} \mu_i \log \left( H_{ij} + \epsilon \right).\]
- espm.measures.log_surrogate(H, Ht, mu, epsilon, average=False)[source]
Surrogate loss for the log function.
- espm.measures.mae(map1, map2)[source]
Mean average error
Calculate the mean average error between two 2D arrays of the same dimension.
- Parameters
map1 (np.array 2D) – first array
map2 (np.array 2D) – second array
- Returns
the answer
Examples
>>> import numpy as np >>> from espm.measures import mae >>> map1 = np.array([[0, 1][0, 1]]) >>> map2 = np.array([[1, 1][1, 1]]) >>> mae(map1, map2) 0.5
- espm.measures.mse(map1, map2)[source]
Mean square error
Calculate the mean squared error between two 2D arrays of the same dimension.
- Parameters
map1 (np.array 2D) – first array
map2 (np.array 2D) – second array
- Returns
the answer
Examples
>>> import numpy as np >>> from espm.measures import mse >>> map1 = np.array([[0, 1][0, 1]]) >>> map2 = np.array([[1, 1][1, 1]]) >>> mse(map1, map2) 0.5
- espm.measures.ordered_mae(true_maps, algo_maps, input_inds)[source]
input : p x Npx matrix of floats, p x Npx matrix of floats, list of integers output : list of floats %————————-% Takes true maps of p phases and Npx pixels, reconstructed maps of the same size and indices of the correspondance between true phases and reconstructed phases returns the mean average errors of each phase in truth order.
- espm.measures.ordered_mse(true_maps, algo_maps, input_inds)[source]
input : p x Npx matrix of floats, p x Npx matrix of floats, list of integers output : list of floats %————————-% Takes true maps of p phases and Npx pixels, reconstructed maps of the same size and indices of the correspondance between true phases and reconstructed phases returns the mean squared errors of each phase in truth order.
- espm.measures.ordered_r2(true_maps, algo_maps, input_inds)[source]
input : p x Npx matrix of floats, p x Npx matrix of floats, list of integers output : list of floats %————————-% Takes true maps of p phases and Npx pixels, reconstructed maps of the same size and indices of the correspondance between true phases and reconstructed phases returns the coefficient of determination of each phase in truth order.
- espm.measures.r2(map_true, map_pred)[source]
\(R^2\) - Coefficient of determination
Calculates the coefficient of determination between two 2D arrays of the same dimension. This is also called regression score function. See wikipedia.
This function is a wrapper for the function
sklearn.metrics.r2_score
of Scikit Learn.- Parameters
map1 (np.array 2D) – first array
map2 (np.array 2D) – second array
- Returns
the answer
- Return type
- espm.measures.spectral_angle(v1, v2)[source]
Spectral angle
Calculate the angle between two spectra of the same dimension.
- Parameters
v1 (np.array 1D) – first spectrum
v2 (np.array 1D) – second spectrum
- Returns
the answer
- Return type
Examples
>>> import numpy as np >>> from espm.measures import spectral_angle >>> v1 = np.array([0, 1, 0]) >>> v2 = np.array([1, 0, 1]) >>> spectral_angle(v1, v2) 90.0
- espm.measures.squared_distance(x, y=None)[source]
Squared distance between two between all colon vectors matrices.
Calculate the squared L2 distance between all pairs of vectors of two matrices. If only one matrix is given, the function uses each pair of vector of this matrix.
- Parameters
x (np.array 2D) – n x m matrix of first colon vectors
y (np.array 2D) – n x m matrix of second colon vectors (optional)
- Returns
the answer
- Return type
np.array (m x m)
Examples
>>> import numpy as np >>> from espm.measures import square_distance >>> x = np.arange(3) >>> square_distance(x, x) array([[ 0., 1., 2.], [ 1., 0., 1.], [ 2., 1., 0.]])
- espm.measures.trace_xtLx(L, x, average=False)[source]
Trace of \(X^T L X\)
Compute the following expression \(\text{Tr} (X^T L X)\).
- Parameters
L (np.array 2D) – n x n matrix
mu (np.array 2D) – n x k martrix of k n-sized vector
average (boolean) – replace the sum with a mean, i.e., divide the result by n*m (default False)
- Returns
the answer
- Return type
- espm.measures.unique_min(matrix)[source]
From a square matrix of float values, finds the combination of elements with different lines which mimises the sum of elements.
It is a brute force algorithm, it is not recommended to input a matrix bigger than 20 x 20
- Parameters
matrix (np.array 2D) – square matrix
- Returns
list of unique min values and corresponding indices in the same order
- Return type
Examples
>>> import numpy as np >>> from espm.measures import unique_min >>> matrix = np.array([[1.2, 1.3, 3.5], [4.9, 2.2, 6.5], [9.0, 4.1, 1.8]]) >>> unique_min(v1, v2) ([1.2, 2.2, 1.8], (0, 1, 2))
Utils
Utils for the ESPM package
- espm.utils.approx_density(atomic_fraction=False, *, elements_dict={})[source]
Wrapper to the density_of_mixture function of hyperspy. Takes a dict of chemical composition expressed in atomic weight fractions. Returns an approximated density.
- espm.utils.arg_helper(params, d_params)[source]
Check if all parameter of d_params are in params. If not, they are added to params with the default value.
- Parameters
- paramsdict
Dictionary of parameters to be checked.
- d_paramsdict
Dictionary of default parameters.
- Returns
- paramsdict
Dictionary of parameters with the default parameters added if not present.
- espm.utils.atomic_to_weight_dict(*, elements_dict={})[source]
Wrapper to the atomic_to_weight function of hyperspy. Takes a dict of chemical composition expressed in atomic fractions. Returns a dict of chemical composition expressed in atomic weight fratiom.
- espm.utils.bin_spim(data, n, m)[source]
Take a 3D array of size (x,y,k) [px, py, e] Returns a 3D array of size (n,m,k) [new_px, new_py, e]
- espm.utils.check_keys(params, d_params, upperkeys='', toprint=True)[source]
Check if all parameter of d_params are in params. If not, they are added to params with the default value.
- Parameters
- paramsdict
Dictionary of parameters to be checked.
- d_paramsdict
Dictionary of default parameters.
- upperkeysstr
String of the upper keys.
- toprintbool
If True, print the warning.
- Returns
- paramsdict
Dictionary of parameters with the default parameters added if not present.
Examples
>>> params = {'a':1,'b':2} >>> d_params = {'a':1,'b':2,'c':3} >>> check_keys(params,d_params) >>> params {'a': 1, 'b': 2, 'c': 3}
- espm.utils.create_laplacian_matrix(nx, ny=None)[source]
Helper method to create the laplacian matrix for the laplacian regularization
- Parameters
- :param nx: height of the original image
- :param ny: width of the original image
- Returns
- rtype
scipy.sparse.csr_matrix ..
- :return:the n x n laplacian matrix, where n = nx*ny
- espm.utils.is_number(i)[source]
Return True if i is a number
- Parameters
i (any) – variable to check
- Returns
True if i is a number
- Return type
- espm.utils.is_symbol(i)[source]
Return True if i is a chemical symbol
- Parameters
i (any) – variable to check
- Returns
True if i is a chemical symbol
- Return type
- espm.utils.isdict(p)[source]
Return True if the variable a dictionary.
- Parameters
p (any) – variable to check
- Returns
True if p is a dictionary
- Return type
- espm.utils.number_to_symbol_dict(func)[source]
Decorator Takes a dict of elements (a.k.a chemical composition) with atomic numbers as keys (e.g. 26 for Fe) returns a dict of elements with symbols as keys (e.g. Fe for iron)
- espm.utils.number_to_symbol_list(func)[source]
Decorator Takes a dict of elements (a.k.a chemical composition) with symbols as keys (e.g. Fe for iron) returns a dict of elements with atomic numbers as keys (e.g. 26 for iron)
- espm.utils.process_losses(losses)[source]
Process the losses to be plotted
- Parameters
- losses: np.ndarray
Array of losses (output of espm.estimators.NMFEstimator.get_losses method)
- Returns
- values: np.ndarray
Array of values
- names: list
List of names
- espm.utils.rescaled_DH(D, H)[source]
Rescale the matrices D and H such that the columns of H sums approximately to one.
- Parameters
D (np.array 2D) – n x k matrix
H (np.array 2D) – k x m matrix
- Returns
D_rescale, H_rescale
- Return type
np.array 2D, np.array 2D
Table utils
The espm.table_utils
module implements some methods to manipulate the tables produces by the emtables package.
Using this module it is possible to load the tables and to import custom k-factors into the tables.
Note
The emtables package can be found here: https://github.com/adriente/emtables
- espm.tables_utils.get_k_factor(table, mdata, element, line, range=0.5, ref_elt='14', ref_line='KL3', ref_range=0.5)[source]
Obtain the k-factor of a line from an emtables, X-ray emission cross section table.
- Parameters
- tabledict
The table of the X-ray emission cross sections.
- mdatadict
The metadata of the table.
- elementint
The atomic number of the element to use.
- linestr
The regex of the line to use. It has to correspond to IUPAC notation.
- rangefloat
The energy range to use for the integration of the cross section of the line. For example, if range = 0.5, the integration will be done between the energy of the line - 0.5 and the energy of the line + 0.5. We do so that when you select the “KL3” line, it integrates around it and make it correspond to the K-alpha bunch of lines.
- ref_eltint
The atomic number of the element to use as a reference for the k-factor. The default reference line is Si “KL3” with an integration range of 0.5.
- ref_linestr
The regex of the line to use as a reference for the k-factor. It has to correspond to IUPAC notation.
- ref_rangefloat
The energy range to use for the integration of the cross section of the reference line.
- Returns
- k_factorfloat
The k-factor of the line. It does not take into account the absorption correction.
- espm.tables_utils.import_k_factors(table, mdata, k_factors_names, k_factors_values, ref_name)[source]
Modify the X-ray emission cross-sections of the input table using the k-factors input, i.e. imposing cross-sections ratios to correspond to the k-factors. The metadata are modified too to keep track of the modifications.
- Parameters
- tabledict
The table of the X-ray emission cross sections.
- mdatadict
The metadata of the table.
- k_factors_nameslist
The list of the names of the k-factors to import. It has to correspond to the nomenclature of the hyperspy X-ray lines.
- k_factors_valueslist
The list of the values of the k-factors to import. It has to have the same length and ordering as k_factors_names.
- ref_namestr
The name of the X-ray line to use as a reference for the k-factors. It has to correspond to the nomenclature of the hyperspy X-ray lines.
- Returns
- new_tabledict
The modified table of the X-ray emission cross sections.
- new_mdatadict
The modified metadata of the table.
- espm.tables_utils.load_table(db_name)[source]
Load the table and metadata of a json table generated by emtables.
- Parameters
- db_namestr
The file name of the table to load.
- Returns
- tabledict
The table of the cross sections.
- metadatadict
The metadata of the table.
Notes
Call espm.conf.DB_PATH to get the folder of the tables.
- espm.tables_utils.modify_table_lines(table, mdata, elements, line, coeff)[source]
Modify the cross section of the lines of the selected elements in the input table.
- Parameters
- tabledict
The table of the X-ray emission cross sections.
- mdatadict
The metadata of the table.
- elementslist
The list of the atomic numbers of the elements to modify.
- linestr
The regex of the line to modify. It has to correspond to IUPAC notation.
- coefffloat
The coefficient to multiply the cross section of the selected lines.
- Returns
- new_tabledict
The modified table of the X-ray emission cross sections.
- new_mdatadict
The modified metadata of the table.
Notes
X-ray line regex examples : input “L” will modify all the L lines, input “L3” will modifiy all the L3 lines, input “L3M2” will modify the “L3M2” line.
Changelog
All notable changes to this project will be documented in this file. The format is based on Keep a Changelog and this project adheres to Semantic Versioning.
0.1.4 (2023-22-02)
First inofficial release
First version of the package on PyPi
All main functionality tested
Basic documentation
0.2.0 (2023-20-04)
Second inofficial release
All important part of the documentation are present
This is a test before the first official release 1.0.0
Contributing
Contributions are welcome, and they are greatly appreciated! The development of this package takes place on GitHub. Issues, bugs, and feature requests should be reported there. Code and documentation can be improved by submitting a pull request. Please add documentation and tests for any new code.
The package can be set up (ideally in a fresh virtual environment) for local development with the following:
$ git clone https://github.com/adriente/espm.git
$ pip install -e ".[dev]"
You can improve or add functionality in the espm
folder, along with
corresponding unit tests in espm/tests/test_*.py
(with reasonable
coverage).
If you have a nice example to demonstrate the use of the introduced
functionality, please consider adding a notebook in doc/introduction/notebooks
.
In general, this can be done as a relative symbolic link to a notebook in the
notebooks
folder.
Update README.rst
and CHANGELOG.rst
if applicable.
After making any change, please run the tests, and build the documentation with the following
$ make test
$ make doc
To iterate faster, you can partially run the test suite, at various degrees of granularity, as follows:
$ pytest espm.test.test_utils.py
$ pytest espm.test.test_estimators.py
Making a release
Update the version number and release date in
setup.py
,espm/__init__.py
andCHANGELOG.rst
.Create a git tag with
git tag -a v0.2.0 -m "espm v0.2.0"
.Push the tag to GitHub with
git push --tag
. The tag should now appear in the releases and tags tab.Create a release on GitHub and select the created tag.
Build the distribution with
make dist
and check that thedist/espm-0.2.0.tar.gz
source archive contains all required files. The binary wheel should be found asdist/espm-0.2.0.py3-none-any.whl
.Test the upload and installation process:
$ twine upload --repository-url https://test.pypi.org/legacy/ dist/* $ pip install --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple espm
Build and upload the distribution to the real PyPI with
make release
.
Repository organization
LICENSE.txt Project license
*.rst Important documentation
Makefile Targets for make
setup.py Meta information about package (published on PyPI)
.gitignore Files ignored by the git revision control system
.travis.yml Defines testing on Travis continuous integration
espm/ Contains the modules (the actual toolbox implementation)
__init.py__ Load modules at package import
*.py One file per module
espm/tables/ Contains data tables for the EDXS and EDSS models
__init.py__ Load modules at package import
*.json data_files for the package
*.txt data_files for the package
espm/tests/ Contains the test suites (will be distributed to end user)
__init.py__ Load modules at package import
test_*.py One test suite per module
test_docstrings.py Test the examples in the docstrings (reference doc)
doc/ Package documentation
conf.py Sphinx configuration
index.rst Documentation entry page
*.rst Include doc files from root directory
doc/reference/ Reference documentation
index.rst Reference entry page
*.rst Only directives, the actual doc is alongside the code
doc/introduction/
index.rst Tutorials entry page
*.rst One file per tutorial
doc/introduction/notebooks/
*.ipynb One notebook per tutorial
References
- CNC84
JN Chapman, WAP Nicholson, and PA Crozier. Understanding thin film x-ray spectra. Journal of Microscopy, 136(2):179–191, 1984.
- dlPPF+22
Francisco de la Peña, Eric Prestat, Vidar Tonaas Fauske, Pierre Burdet, Jonas Lähnemann, Petras Jokubauskas, Tom Furnival, Magnus Nord, Tomas Ostasevicius, Katherine E. MacArthur, Duncan N. Johnstone, Mike Sarahan, Joshua Taillon, Thomas Aarholt, pquinn-dls, Vadim Migunov, Alberto Eljarrat, Jan Caron, Carter Francis, T. Nemoto, Timothy Poon, Stefano Mazzucco, actions-user, Nicolas Tappy, Niels Cautaerts, Suhas Somnath, Tom Slater, Michael Walls, Florian Winkler, and Håkon Wiik Ånes. Hyperspy/hyperspy: release v1.7.3. October 2022. URL: https://doi.org/10.5281/zenodo.7263263, doi:10.5281/zenodo.7263263.
- SP09
F Scholze and M Procop. Modelling the response function of energy dispersive x-ray spectrometers with silicon detectors. X-Ray Spectrometry: An International Journal, 38(4):312–321, 2009.
- TPH+23
Adrien Teurtrie, Nathanaël Perraudin, Thomas Holvoet, Hui Chen, Duncan TL Alexander, Guillaume Obozinski, and Cécile Hébert. Espm: a python library for the simulation of stem-edxs datasets. Ultramicroscopy, pages 113719, 2023.