{
 "cells": [
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Generate simulated data\n",
    "\n",
    "In this notebook we generate a simulated dataset based on experimental data. \n",
    "- The maps are directly extracted from the experimental data.\n",
    "- The phases (or spectra) are built using chemical composition extracted from the literature (citations to be included)\n",
    "\n",
    "The output of this notebook is put in the `generated_datasets` \n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Imports"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%load_ext autoreload\n",
    "%autoreload 2\n",
    "%matplotlib inline\n",
    "\n",
    "# espm imports\n",
    "from espm.datasets.base import generate_dataset\n",
    "from espm.weights.generate_weights import chemical_map_weights\n",
    "from espm.models.generate_EDXS_phases import generate_modular_phases\n",
    "from espm.models.EDXS_function import elts_list_from_dict_list\n",
    "\n",
    "# Generic imports \n",
    "from pathlib import Path\n",
    "import matplotlib.pyplot as plt\n",
    "import numpy as np\n",
    "import hyperspy.api as hs\n",
    "\n"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Generate weights\n",
    "\n",
    "We use an experimental dataset to create realistic abundance maps. Theses maps are based from chemical mapping of the dataset."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def get_repo_path():\n",
    "    \"\"\"Get the path to the git repository.\n",
    "    \n",
    "    This is a bit of a hack, but it works.\n",
    "    \"\"\"\n",
    "    this_path = Path.cwd() / Path(\"generate_data.ipynb\")\n",
    "    return this_path.resolve().parent.parent\n",
    "\n",
    "# Path of the experimental dataset\n",
    "path = get_repo_path() / Path(\"generated_datasets/71GPa_experimental_data.hspy\")\n",
    "\n",
    "# Creation of the weights\n",
    "weights = chemical_map_weights( file = path, line_list = [\"Fe_Ka\",\"Ca_Ka\"], conc_list = [0.5,0.5])"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Plot the results"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "tags": [
     "nbsphinx-thumbnail"
    ]
   },
   "outputs": [],
   "source": [
    "fig,axs = plt.subplots(1,3,figsize = (20,20*3))\n",
    "for j in range(axs.shape[0]) :\n",
    "    im = axs[j].imshow(weights[:,:,j],vmin = 0, vmax = 1.0)\n",
    "    axs[j].tick_params(axis = \"both\",width = 0,labelbottom = False,labelleft = False) \n",
    "    axs[j].set_title(\"Phase {}\".format(j),fontsize = 22)\n",
    "\n",
    "fig.subplots_adjust(right=0.84)\n",
    "cbar_ax = fig.add_axes([0.85, 0.47, 0.01, 0.045])\n",
    "fig.colorbar(im,cax=cbar_ax)\n",
    "cbar_ax.tick_params(labelsize=22)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Generate phases\n",
    "\n",
    "We generate phases based on values extracted from the literature. The phases we try to simulate here are Ferropericlase, Bridgmanite and Ca-perovskite.\n"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Model parameters\n",
    "\n",
    "We use dictionnaries to input the modelling parameters."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Elemental concetration of each phase\n",
    "elts_dicts = [\n",
    "    # Pseudo ferropericlase\n",
    "    {\n",
    "        \"Mg\" : 0.522, \"Fe\" : 0.104, \"O\" : 0.374, \"Cu\" : 0.05\n",
    "    },\n",
    "    # Pseudo Ca-Perovskite\n",
    "    {\n",
    "        \"Mg\" : 0.020, \"Fe\" : 0.018, \"Ca\" : 0.188, \"Si\" : 0.173, \"Al\" : 0.010, \"O\" : 0.572, \"Ti\" : 0.004, \"Cu\" : 0.05, \"Sm\" : 0.007, \"Lu\" : 0.006, \"Nd\" : 0.006 \n",
    "    },\n",
    "    # Pseudo Bridgmanite\n",
    "    {\n",
    "        \"Mg\" : 0.445, \"Fe\" : 0.035, \"Ca\" : 0.031, \"Si\" : 0.419, \"Al\" : 0.074, \"O\" : 1.136, \"Cu\" : 0.05, \"Hf\" : 0.01\n",
    "    }]\n",
    "\n",
    "# Parameters of the bremsstrahlung\n",
    "brstlg_pars = [\n",
    "    {\"b0\" : 0.0001629, \"b1\" : 0.0009812},\n",
    "    {\"b0\" : 0.0007853, \"b1\" : 0.0003658},\n",
    "    {\"b0\" : 0.0003458, \"b1\" : 0.0006268}\n",
    "]\n",
    "\n",
    "# Model parameters : energy scale, detector broadening, x-ray emission database, beam energy, absorption parameters, detector efficiency\n",
    "model_params = {\n",
    "        \"e_offset\" : 0.3,\n",
    "        \"e_size\" : 1980,\n",
    "        \"e_scale\" : 0.01,\n",
    "        \"width_slope\" : 0.01,\n",
    "        \"width_intercept\" : 0.065,\n",
    "        \"db_name\" : \"200keV_xrays.json\",\n",
    "        \"E0\" : 200,\n",
    "        \"params_dict\" : {\n",
    "            \"Abs\" : {\n",
    "                \"thickness\" : 100.0e-7,\n",
    "                \"toa\" : 35,\n",
    "                \"density\" : 4.5,\n",
    "                \"atomic_fraction\" : False\n",
    "            },\n",
    "            \"Det\" : \"SDD_efficiency.txt\"\n",
    "        }\n",
    "    }\n",
    "\n",
    "# miscellaneous paramaters : average detected number of X-rays per pixel, phases densities, output folder, model name, random seed\n",
    "data_dict = {\n",
    "    \"N\" : 100,\n",
    "    \"densities\" : [1.2,1.0,0.8],\n",
    "    \"data_folder\" : \"71GPa_synthetic_N100\",\n",
    "    \"model\" : \"EDXS\",\n",
    "    \"seed\" : 42\n",
    "}"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Generate phases"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "phases = generate_modular_phases(\n",
    "    elts_dicts=elts_dicts, brstlg_pars = brstlg_pars,\n",
    "    scales = [1, 1, 1],\n",
    "    model_params= model_params,\n",
    "    seed = 42\n",
    "    )\n",
    "# scales : bremsstrahlung parameters modifiers\n",
    "\n",
    "elements = elts_list_from_dict_list(elts_dicts)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Plot the results"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "fig,axs = plt.subplots(3,1,figsize = (20,15))\n",
    "\n",
    "# Build the energy scale\n",
    "x = np.linspace(\n",
    "    model_params[\"e_offset\"],\n",
    "    model_params[\"e_offset\"]+model_params[\"e_scale\"]*model_params[\"e_size\"],\n",
    "    num=model_params[\"e_size\"])\n",
    "\n",
    "for j in range(axs.shape[0]) :\n",
    "    axs[j].plot(x,phases[j])\n",
    "    axs[j].set_title(\"Phase {}\".format(j),fontsize = 22)\n",
    "\n",
    "axs[-1].set_xlabel(\"Energy loss (eV)\",fontsize = 16)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Generate the data\n",
    "\n",
    "It will produce 1 spectrum images/sample in the target folder.\n",
    "\n",
    "You can replace `seed_range` by the number of samples you want to generate."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "generate_dataset( phases = phases,\n",
    "                  weights = weights,\n",
    "                  model_params = model_params,\n",
    "                  misc_params = data_dict,\n",
    "                  base_seed=data_dict[\"seed\"],\n",
    "                  sample_number=1,\n",
    "                  elements = elements)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The previous command will save the data in the `generated_datasets` folder defined in the `espm.conf.py` file.\n",
    "\n",
    "You can also define the path where the data will be saved using the \"base_path\" argument"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "espm",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.9"
  },
  "vscode": {
   "interpreter": {
    "hash": "8d9f0750808bbb89d6238996d547b1a0ab25cae94b245e8c4572708d26b443fd"
   }
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}