{ "cells": [ { "cell_type": "markdown", "metadata": { "id": "UxDZW841tkSi" }, "source": [ "# **Weather type reconstruction with neural networks**" ] }, { "cell_type": "markdown", "metadata": { "id": "_Nha4NG_tkSl" }, "source": [ "created by: Lucas Pfister, 2024\n", "\n", "### ***Description:***\n", "\n", "This notebook contains the code used for weather type reconstruction in Pfister et al. (2024). For details see the description in this paper. As the original station observations are not all publicly available (yet), a dummy dataset is available for demonstration purposes. The classification method is designed for 9 weather types (similar to the CAP9 classification (Weusthoff, 2011) used in the paper).\n", "\n", "\n", "The notebook contains code to 1) read in 2) and preprocess the model input data, to 3) evaluate the model (independent validation) and to 4) create WT reconstructions. For this purpose, the numpy, pandas, matplotlib, tensorflow and sklearn libraries have to be installed.\n" ] }, { "cell_type": "markdown", "metadata": { "id": "Dit404ghtkSr" }, "source": [ "## **0) Load libraries**" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "id": "YbRY_dIutkSr", "scrolled": true }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "2024-04-04 11:50:08.693263: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: FMA\n", "To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.\n", "2024-04-04 11:50:10.404215: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory\n", "2024-04-04 11:50:10.404275: I tensorflow/compiler/xla/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.\n", "2024-04-04 11:50:15.358188: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory\n", "2024-04-04 11:50:15.358820: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory\n", "2024-04-04 11:50:15.358849: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.\n" ] } ], "source": [ "import numpy as np\n", "import pandas as pd\n", "import matplotlib.pyplot as plt\n", "\n", "import tensorflow as tf\n", "from tensorflow import keras\n", "\n", "import sklearn\n", "from sklearn.metrics import confusion_matrix\n", "from sklearn.linear_model import LinearRegression\n", "from sklearn.preprocessing import PolynomialFeatures\n", "\n", "from sklearn.model_selection import train_test_split, KFold" ] }, { "cell_type": "markdown", "metadata": { "id": "TE_HXLPjfRBt" }, "source": [ "## **1) Read data**\n", "\n", "For demonstration purposes, a dummy dataset is read with four pressure series (pp) and three temperature series (ta), as well as the weather types (WT) in the last column. 
Note that the dummy weather types (9 classes) loosely match the patterns in the pressure and temperature series, so model training is possible." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
AAA_ppBBB_ppCCC_ppDDD_ppEEE_ppAAA_taBBB_taCCC_taWT
1957-09-011020.91013.91019.11012.41014.810.915.312.93
1957-09-021011.11009.11019.11004.41006.314.820.618.92
1957-09-031026.11015.71024.01020.11010.810.317.615.93
1957-09-041012.51024.61029.11021.31021.712.916.312.95
1957-09-051017.11016.71022.31012.21011.88.815.912.04
..............................
2020-12-271006.91013.51020.01006.91015.04.25.45.72
2020-12-281010.21024.61029.31015.71019.4-1.33.72.85
2020-12-29989.5989.41015.0983.7983.20.35.65.97
2020-12-301014.71005.91023.61012.0998.3-1.02.67.23
2020-12-311026.61020.11031.61029.21015.2-4.6-0.7-7.55
\n", "

23133 rows × 9 columns

\n", "
" ], "text/plain": [ " AAA_pp BBB_pp CCC_pp DDD_pp EEE_pp AAA_ta BBB_ta CCC_ta WT\n", "1957-09-01 1020.9 1013.9 1019.1 1012.4 1014.8 10.9 15.3 12.9 3\n", "1957-09-02 1011.1 1009.1 1019.1 1004.4 1006.3 14.8 20.6 18.9 2\n", "1957-09-03 1026.1 1015.7 1024.0 1020.1 1010.8 10.3 17.6 15.9 3\n", "1957-09-04 1012.5 1024.6 1029.1 1021.3 1021.7 12.9 16.3 12.9 5\n", "1957-09-05 1017.1 1016.7 1022.3 1012.2 1011.8 8.8 15.9 12.0 4\n", "... ... ... ... ... ... ... ... ... ..\n", "2020-12-27 1006.9 1013.5 1020.0 1006.9 1015.0 4.2 5.4 5.7 2\n", "2020-12-28 1010.2 1024.6 1029.3 1015.7 1019.4 -1.3 3.7 2.8 5\n", "2020-12-29 989.5 989.4 1015.0 983.7 983.2 0.3 5.6 5.9 7\n", "2020-12-30 1014.7 1005.9 1023.6 1012.0 998.3 -1.0 2.6 7.2 3\n", "2020-12-31 1026.6 1020.1 1031.6 1029.2 1015.2 -4.6 -0.7 -7.5 5\n", "\n", "[23133 rows x 9 columns]" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "## read dataset\n", "training_data = pd.read_csv(\"WTrec_DummyTrainingData.csv\", index_col=0)\n", "training_data.index = pd.to_datetime(training_data.index)\n", "training_data" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "## separate WT series from station data\n", "WT_series = training_data.WT.copy()\n", "data = training_data.drop(\"WT\", axis=1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## **2) Preprocessing**\n", "\n", "Seasonality correction (fitting first two harmonics) and trend correction (3rd order polynomial) for temperature series. Standardization of model input data." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "seascor = True # whether to correct temperature seasonality\n", "detrend = True # whether to correct temperature trend" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### **2.1) Seasonality correction**" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "def calcseas(t_series):\n", " '''Takes temperature series (Pandas Series object) and returns date array, fitted values (average seasonality) and residuals (anomalies from average seasonality)'''\n", " \n", " # get day of year and number of days\n", " doy = t_series.index.dayofyear\n", " ndoy = t_series.index.year.map(lambda x: pd.Timestamp(x, 12, 31).dayofyear)\n", " \n", " # array with 1st & 2nd harmonics (transposed)\n", " x = np.array([np.cos(2*np.pi*doy/ndoy),np.sin(2*np.pi*doy/ndoy),np.cos(4*np.pi*doy/ndoy),np.sin(4*np.pi*doy/ndoy)]).T\n", "\n", " # get temperature data\n", " y=t_series.values\n", " \n", " # get rid of na values for fit\n", " nonan_idx = np.where(~np.isnan(y))\n", " x_=x[nonan_idx]\n", " y_=y[nonan_idx]\n", "\n", " if y_.size == 0:\n", " print(\"observation vector is empty\")\n", " ynew = res = np.zeros_like(y)*np.nan\n", " \n", " else:\n", " ## 2nd harmonics fit\n", " reg = LinearRegression().fit(x_, y_)\n", " \n", " # fitted values\n", " ynew = reg.predict(x)\n", " \n", " # residuals\n", " res = y-ynew\n", " \n", " return(t_series.index, ynew, res)" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "apply seasonality correction\n", "AAA_ta\n", "BBB_ta\n", "CCC_ta\n" ] } ], "source": [ "if seascor:\n", " print(\"apply seasonality correction\")\n", " for x in data.filter(regex=r'ta').columns:\n", " print(x)\n", " tseries = data[x]\n", " t, n, r = calcseas(tseries)\n", " data[x] = r" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ 
"### **2.2) Trend correction**" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [], "source": [ "def detr_poly(t_series, poly_degree = 3):\n", " '''Takes temperature series (Pandas Series object) and returns detrended time series'''\n", " XX = np.reshape(t_series.index, (len(t_series.index), 1))\n", " YY = t_series\n", " pf = PolynomialFeatures(degree=poly_degree)\n", " Xp = pf.fit_transform(XX)\n", " md2 = LinearRegression()\n", " md2.fit(Xp, YY)\n", " trendp = md2.predict(Xp)\n", " \n", " series_detr = YY-trendp\n", " return series_detr" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "detrend temperature data\n", "AAA_ta\n", "BBB_ta\n", "CCC_ta\n" ] } ], "source": [ "if detrend:\n", " print(\"detrend temperature data\")\n", " for x in data.filter(regex=r'ta').columns:\n", " print(x)\n", " tseries = data[x]\n", " tseries_detr = detr_poly(tseries)\n", " data[x] = tseries_detr" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### **2.3) Standardization**" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "standardize data\n" ] } ], "source": [ "print(\"standardize data\")\n", "## normalize station data (column-wise)\n", "statdata_norm = (data-data.mean())/data.std()" ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
AAA_ppBBB_ppCCC_ppDDD_ppEEE_ppAAA_taBBB_taCCC_ta
1957-09-010.861675-0.1181710.0658970.2809970.221356-0.903708-1.544809-0.721915
1957-09-02-0.164191-0.6381980.065897-0.582752-0.5843700.1211210.1819120.918599
1957-09-031.4060120.0768400.6931611.112355-0.157809-0.985966-0.7356940.133359
1957-09-04-0.0176391.0410561.3460271.2419170.875417-0.290144-1.110568-0.651333
1957-09-050.4638900.1851780.4755390.259403-0.063018-1.294703-1.197688-0.869345
...........................
2020-12-27-0.603848-0.1615060.181109-0.3128310.2403151.7550740.8858010.834880
2020-12-28-0.2584031.0410561.3716300.6372930.6573970.3772220.3559130.064347
2020-12-29-2.425284-2.772475-0.458957-2.817702-2.7740500.8012160.9723390.910779
2020-12-300.212658-0.9848820.6419550.237809-1.3427010.4884470.0260721.271371
2020-12-311.4583530.5535311.6660602.0948690.259273-0.408775-1.016832-2.682053
\n", "

23133 rows × 8 columns

\n", "
" ], "text/plain": [ " AAA_pp BBB_pp CCC_pp DDD_pp EEE_pp AAA_ta \\\n", "1957-09-01 0.861675 -0.118171 0.065897 0.280997 0.221356 -0.903708 \n", "1957-09-02 -0.164191 -0.638198 0.065897 -0.582752 -0.584370 0.121121 \n", "1957-09-03 1.406012 0.076840 0.693161 1.112355 -0.157809 -0.985966 \n", "1957-09-04 -0.017639 1.041056 1.346027 1.241917 0.875417 -0.290144 \n", "1957-09-05 0.463890 0.185178 0.475539 0.259403 -0.063018 -1.294703 \n", "... ... ... ... ... ... ... \n", "2020-12-27 -0.603848 -0.161506 0.181109 -0.312831 0.240315 1.755074 \n", "2020-12-28 -0.258403 1.041056 1.371630 0.637293 0.657397 0.377222 \n", "2020-12-29 -2.425284 -2.772475 -0.458957 -2.817702 -2.774050 0.801216 \n", "2020-12-30 0.212658 -0.984882 0.641955 0.237809 -1.342701 0.488447 \n", "2020-12-31 1.458353 0.553531 1.666060 2.094869 0.259273 -0.408775 \n", "\n", " BBB_ta CCC_ta \n", "1957-09-01 -1.544809 -0.721915 \n", "1957-09-02 0.181912 0.918599 \n", "1957-09-03 -0.735694 0.133359 \n", "1957-09-04 -1.110568 -0.651333 \n", "1957-09-05 -1.197688 -0.869345 \n", "... ... ... \n", "2020-12-27 0.885801 0.834880 \n", "2020-12-28 0.355913 0.064347 \n", "2020-12-29 0.972339 0.910779 \n", "2020-12-30 0.026072 1.271371 \n", "2020-12-31 -1.016832 -2.682053 \n", "\n", "[23133 rows x 8 columns]" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "statdata_norm" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## **3) Model tuning**" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [], "source": [ "## define predictor and predictand data\n", "dates = statdata_norm.index\n", "x_data = statdata_norm.to_numpy()\n", "y_data = WT_series.to_numpy()\n", "\n", "n_data = x_data.shape[0]\n", "\n", "y_data_1_hot = np.zeros((n_data, y_data.max()))\n", "y_data_1_hot[np.arange(n_data),y_data-1] = 1\n", "\n", "n_input = x_data.shape[1]\n", "n_output = y_data_1_hot.shape[1]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### **3.1) Model setup**\n", "\n", "As the time series in our dummy dataset correspond to the 1738 station set (5 pressure and 4 temperature series), either this model can be loaded or a new one can be created (un/comment corresponding lines)" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [], "source": [ "### create NN model (from scratch)\n", "\n", "#model = keras.Sequential()\n", " \n", "## input layer\n", "#model.add(tf.keras.layers.Input(name='input', dtype=tf.float32, shape=[n_input]))\n", "\n", "## hidden layers\n", "#model.add(tf.keras.layers.Dense(units=256, activation='relu'))\n", "#model.add(tf.keras.layers.Dense(units=128, activation='relu'))\n", " \n", "## dropout layer \n", "#model.add(tf.keras.layers.Dropout(rate = 0.1))\n", " \n", "## output layer\n", "#model.add(tf.keras.layers.Dense(units=n_output, name='output', activation='softmax'))\n", "\n", "## compile model\n", "#model.compile(optimizer=keras.optimizers.Adam(learning_rate=0.001), loss=\"categorical_crossentropy\", metrics=['accuracy'])\n", "\n", "#model.summary()" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "2024-04-04 11:50:52.023935: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory\n", "2024-04-04 11:50:52.025388: W tensorflow/compiler/xla/stream_executor/cuda/cuda_driver.cc:265] 
failed call to cuInit: UNKNOWN ERROR (303)\n", "2024-04-04 11:50:52.025495: I tensorflow/compiler/xla/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (climcal4.giub.unibe.ch): /proc/driver/nvidia/version does not exist\n", "2024-04-04 11:50:52.027658: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: FMA\n", "To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Model: \"sequential\"\n", "_________________________________________________________________\n", " Layer (type) Output Shape Param # \n", "=================================================================\n", " dense (Dense) (None, 256) 2304 \n", " \n", " dense_1 (Dense) (None, 128) 32896 \n", " \n", " dropout (Dropout) (None, 128) 0 \n", " \n", " output (Dense) (None, 9) 1161 \n", " \n", "=================================================================\n", "Total params: 36,361\n", "Trainable params: 36,361\n", "Non-trainable params: 0\n", "_________________________________________________________________\n" ] } ], "source": [ "### read pre-trained model\n", "model = tf.keras.models.load_model('NN_models/NN_hypermodel_stat_1738_tot.keras')\n", "model.summary()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### **3.2) Model validation**\n", "\n", "k-fold cross-validation. Note that this code is merely a regular cross-validation for a single model and not a nested cross-validation for hyperparameter tuning purposes." ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "scrolled": true, "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "outer_fold = 1\n", "[ 2892 2893 2894 ... 23130 23131 23132] [ 0 1 2 ... 
2889 2890 2891]\n", "Epoch 1/40\n", "87/87 [==============================] - 2s 13ms/step - loss: 0.6697 - accuracy: 0.7288 - val_loss: 0.6126 - val_accuracy: 0.7448\n", "Epoch 2/40\n", "87/87 [==============================] - 1s 10ms/step - loss: 0.5768 - accuracy: 0.7603 - val_loss: 0.5650 - val_accuracy: 0.7697\n", "Epoch 3/40\n", "87/87 [==============================] - 1s 10ms/step - loss: 0.5559 - accuracy: 0.7697 - val_loss: 0.5535 - val_accuracy: 0.7690\n", "Epoch 4/40\n", "87/87 [==============================] - 1s 10ms/step - loss: 0.5498 - accuracy: 0.7704 - val_loss: 0.5641 - val_accuracy: 0.7580\n", "Epoch 5/40\n", "87/87 [==============================] - 1s 10ms/step - loss: 0.5401 - accuracy: 0.7753 - val_loss: 0.5466 - val_accuracy: 0.7714\n", "Epoch 6/40\n", "87/87 [==============================] - 1s 10ms/step - loss: 0.5352 - accuracy: 0.7770 - val_loss: 0.5556 - val_accuracy: 0.7659\n", "Epoch 7/40\n", "87/87 [==============================] - 1s 10ms/step - loss: 0.5343 - accuracy: 0.7772 - val_loss: 0.5513 - val_accuracy: 0.7690\n", "Epoch 8/40\n", "87/87 [==============================] - 1s 10ms/step - loss: 0.5290 - accuracy: 0.7783 - val_loss: 0.5391 - val_accuracy: 0.7714\n", "Epoch 9/40\n", "87/87 [==============================] - 1s 9ms/step - loss: 0.5325 - accuracy: 0.7762 - val_loss: 0.5399 - val_accuracy: 0.7701\n", "Epoch 10/40\n", "87/87 [==============================] - 1s 11ms/step - loss: 0.5250 - accuracy: 0.7815 - val_loss: 0.5474 - val_accuracy: 0.7697\n", "Epoch 11/40\n", "87/87 [==============================] - 1s 10ms/step - loss: 0.5222 - accuracy: 0.7814 - val_loss: 0.5444 - val_accuracy: 0.7697\n", "Epoch 12/40\n", "87/87 [==============================] - 1s 10ms/step - loss: 0.5223 - accuracy: 0.7821 - val_loss: 0.5448 - val_accuracy: 0.7732\n", "Epoch 13/40\n", "87/87 [==============================] - 1s 10ms/step - loss: 0.5186 - accuracy: 0.7844 - val_loss: 0.5402 - val_accuracy: 0.7749\n", "91/91 [==============================] - 0s 2ms/step\n", "outer_fold = 2\n", "[ 0 1 2 ... 23130 23131 23132] [2892 2893 2894 ... 
5781 5782 5783]\n", "Epoch 1/40\n", "87/87 [==============================] - 1s 10ms/step - loss: 0.5185 - accuracy: 0.7838 - val_loss: 0.5455 - val_accuracy: 0.7687\n", "Epoch 2/40\n", "87/87 [==============================] - 1s 10ms/step - loss: 0.5179 - accuracy: 0.7824 - val_loss: 0.5425 - val_accuracy: 0.7766\n", "Epoch 3/40\n", "87/87 [==============================] - 1s 10ms/step - loss: 0.5176 - accuracy: 0.7833 - val_loss: 0.5456 - val_accuracy: 0.7690\n", "Epoch 4/40\n", "87/87 [==============================] - 1s 10ms/step - loss: 0.5138 - accuracy: 0.7854 - val_loss: 0.5401 - val_accuracy: 0.7746\n", "Epoch 5/40\n", "87/87 [==============================] - 1s 10ms/step - loss: 0.5117 - accuracy: 0.7874 - val_loss: 0.5416 - val_accuracy: 0.7718\n", "Epoch 6/40\n", "87/87 [==============================] - 1s 10ms/step - loss: 0.5117 - accuracy: 0.7839 - val_loss: 0.5407 - val_accuracy: 0.7763\n", "Epoch 7/40\n", "87/87 [==============================] - 1s 10ms/step - loss: 0.5088 - accuracy: 0.7890 - val_loss: 0.5427 - val_accuracy: 0.7701\n", "Epoch 8/40\n", "87/87 [==============================] - 1s 10ms/step - loss: 0.5070 - accuracy: 0.7856 - val_loss: 0.5446 - val_accuracy: 0.7690\n", "Epoch 9/40\n", "87/87 [==============================] - 1s 10ms/step - loss: 0.5056 - accuracy: 0.7884 - val_loss: 0.5403 - val_accuracy: 0.7728\n", "91/91 [==============================] - 0s 3ms/step\n", "outer_fold = 3\n", "[ 0 1 2 ... 23130 23131 23132] [5784 5785 5786 ... 8673 8674 8675]\n", "Epoch 1/40\n", "87/87 [==============================] - 1s 11ms/step - loss: 0.5113 - accuracy: 0.7869 - val_loss: 0.5467 - val_accuracy: 0.7680\n", "Epoch 2/40\n", "87/87 [==============================] - 1s 10ms/step - loss: 0.5088 - accuracy: 0.7882 - val_loss: 0.5468 - val_accuracy: 0.7694\n", "Epoch 3/40\n", "87/87 [==============================] - 1s 10ms/step - loss: 0.5081 - accuracy: 0.7871 - val_loss: 0.5443 - val_accuracy: 0.7690\n", "Epoch 4/40\n", "87/87 [==============================] - 1s 10ms/step - loss: 0.5032 - accuracy: 0.7896 - val_loss: 0.5380 - val_accuracy: 0.7797\n", "Epoch 5/40\n", "87/87 [==============================] - 1s 10ms/step - loss: 0.5039 - accuracy: 0.7901 - val_loss: 0.5460 - val_accuracy: 0.7725\n", "Epoch 6/40\n", "87/87 [==============================] - 1s 10ms/step - loss: 0.5012 - accuracy: 0.7922 - val_loss: 0.5444 - val_accuracy: 0.7742\n", "Epoch 7/40\n", "87/87 [==============================] - 1s 10ms/step - loss: 0.4989 - accuracy: 0.7904 - val_loss: 0.5473 - val_accuracy: 0.7704\n", "Epoch 8/40\n", "87/87 [==============================] - 1s 10ms/step - loss: 0.4989 - accuracy: 0.7902 - val_loss: 0.5479 - val_accuracy: 0.7725\n", "Epoch 9/40\n", "87/87 [==============================] - 1s 10ms/step - loss: 0.4969 - accuracy: 0.7925 - val_loss: 0.5427 - val_accuracy: 0.7714\n", "91/91 [==============================] - 0s 3ms/step\n", "outer_fold = 4\n", "[ 0 1 2 ... 23130 23131 23132] [ 8676 8677 8678 ... 
11565 11566 11567]\n", "Epoch 1/40\n", "87/87 [==============================] - 1s 11ms/step - loss: 0.5016 - accuracy: 0.7927 - val_loss: 0.5505 - val_accuracy: 0.7656\n", "Epoch 2/40\n", "87/87 [==============================] - 1s 10ms/step - loss: 0.5004 - accuracy: 0.7904 - val_loss: 0.5451 - val_accuracy: 0.7714\n", "Epoch 3/40\n", "87/87 [==============================] - 1s 10ms/step - loss: 0.4985 - accuracy: 0.7904 - val_loss: 0.5484 - val_accuracy: 0.7707\n", "Epoch 4/40\n", "87/87 [==============================] - 1s 10ms/step - loss: 0.4966 - accuracy: 0.7909 - val_loss: 0.5584 - val_accuracy: 0.7683\n", "Epoch 5/40\n", "87/87 [==============================] - 1s 9ms/step - loss: 0.4950 - accuracy: 0.7905 - val_loss: 0.5601 - val_accuracy: 0.7669\n", "Epoch 6/40\n", "87/87 [==============================] - 1s 10ms/step - loss: 0.4934 - accuracy: 0.7927 - val_loss: 0.5487 - val_accuracy: 0.7697\n", "Epoch 7/40\n", "87/87 [==============================] - 1s 10ms/step - loss: 0.4914 - accuracy: 0.7953 - val_loss: 0.5525 - val_accuracy: 0.7739\n", "91/91 [==============================] - 0s 2ms/step\n", "outer_fold = 5\n", "[ 0 1 2 ... 23130 23131 23132] [11568 11569 11570 ... 14457 14458 14459]\n", "Epoch 1/40\n", "87/87 [==============================] - 1s 10ms/step - loss: 0.4966 - accuracy: 0.7952 - val_loss: 0.5494 - val_accuracy: 0.7690\n", "Epoch 2/40\n", "87/87 [==============================] - 1s 10ms/step - loss: 0.4884 - accuracy: 0.7976 - val_loss: 0.5589 - val_accuracy: 0.7697\n", "Epoch 3/40\n", "87/87 [==============================] - 1s 10ms/step - loss: 0.4888 - accuracy: 0.7964 - val_loss: 0.5468 - val_accuracy: 0.7746\n", "Epoch 4/40\n", "87/87 [==============================] - 1s 10ms/step - loss: 0.4887 - accuracy: 0.7956 - val_loss: 0.5534 - val_accuracy: 0.7694\n", "Epoch 5/40\n", "87/87 [==============================] - 1s 10ms/step - loss: 0.4863 - accuracy: 0.7968 - val_loss: 0.5526 - val_accuracy: 0.7746\n", "Epoch 6/40\n", "87/87 [==============================] - 1s 10ms/step - loss: 0.4873 - accuracy: 0.7949 - val_loss: 0.5614 - val_accuracy: 0.7687\n", "Epoch 7/40\n", "87/87 [==============================] - 1s 10ms/step - loss: 0.4857 - accuracy: 0.7982 - val_loss: 0.5559 - val_accuracy: 0.7732\n", "Epoch 8/40\n", "87/87 [==============================] - 1s 10ms/step - loss: 0.4861 - accuracy: 0.7987 - val_loss: 0.5599 - val_accuracy: 0.7701\n", "91/91 [==============================] - 0s 2ms/step\n", "outer_fold = 6\n", "[ 0 1 2 ... 23130 23131 23132] [14460 14461 14462 ... 
17348 17349 17350]\n", "Epoch 1/40\n", "87/87 [==============================] - 1s 10ms/step - loss: 0.4870 - accuracy: 0.7937 - val_loss: 0.5503 - val_accuracy: 0.7742\n", "Epoch 2/40\n", "87/87 [==============================] - 1s 10ms/step - loss: 0.4865 - accuracy: 0.7963 - val_loss: 0.5517 - val_accuracy: 0.7687\n", "Epoch 3/40\n", "87/87 [==============================] - 1s 10ms/step - loss: 0.4840 - accuracy: 0.7985 - val_loss: 0.5520 - val_accuracy: 0.7701\n", "Epoch 4/40\n", "87/87 [==============================] - 1s 10ms/step - loss: 0.4820 - accuracy: 0.7978 - val_loss: 0.5479 - val_accuracy: 0.7687\n", "Epoch 5/40\n", "87/87 [==============================] - 1s 10ms/step - loss: 0.4808 - accuracy: 0.7960 - val_loss: 0.5544 - val_accuracy: 0.7714\n", "Epoch 6/40\n", "87/87 [==============================] - 1s 9ms/step - loss: 0.4791 - accuracy: 0.7993 - val_loss: 0.5601 - val_accuracy: 0.7676\n", "Epoch 7/40\n", "87/87 [==============================] - 1s 10ms/step - loss: 0.4766 - accuracy: 0.8020 - val_loss: 0.5499 - val_accuracy: 0.7735\n", "Epoch 8/40\n", "87/87 [==============================] - 1s 10ms/step - loss: 0.4786 - accuracy: 0.8002 - val_loss: 0.5518 - val_accuracy: 0.7742\n", "Epoch 9/40\n", "87/87 [==============================] - 1s 10ms/step - loss: 0.4773 - accuracy: 0.8018 - val_loss: 0.5581 - val_accuracy: 0.7701\n", "91/91 [==============================] - 0s 3ms/step\n", "outer_fold = 7\n", "[ 0 1 2 ... 23130 23131 23132] [17351 17352 17353 ... 20239 20240 20241]\n", "Epoch 1/40\n", "87/87 [==============================] - 1s 11ms/step - loss: 0.4865 - accuracy: 0.7976 - val_loss: 0.5527 - val_accuracy: 0.7714\n", "Epoch 2/40\n", "87/87 [==============================] - 1s 10ms/step - loss: 0.4872 - accuracy: 0.7954 - val_loss: 0.5562 - val_accuracy: 0.7676\n", "Epoch 3/40\n", "87/87 [==============================] - 1s 10ms/step - loss: 0.4853 - accuracy: 0.7956 - val_loss: 0.5538 - val_accuracy: 0.7687\n", "Epoch 4/40\n", "87/87 [==============================] - 1s 10ms/step - loss: 0.4824 - accuracy: 0.7967 - val_loss: 0.5637 - val_accuracy: 0.7697\n", "Epoch 5/40\n", "87/87 [==============================] - 1s 10ms/step - loss: 0.4819 - accuracy: 0.7976 - val_loss: 0.5535 - val_accuracy: 0.7714\n", "Epoch 6/40\n", "87/87 [==============================] - 1s 10ms/step - loss: 0.4782 - accuracy: 0.7984 - val_loss: 0.5590 - val_accuracy: 0.7704\n", "91/91 [==============================] - 0s 3ms/step\n", "outer_fold = 8\n", "[ 0 1 2 ... 20239 20240 20241] [20242 20243 20244 ... 
23130 23131 23132]\n", "Epoch 1/40\n", "87/87 [==============================] - 1s 11ms/step - loss: 0.4768 - accuracy: 0.8000 - val_loss: 0.4616 - val_accuracy: 0.8050\n", "Epoch 2/40\n", "87/87 [==============================] - 1s 10ms/step - loss: 0.4811 - accuracy: 0.7951 - val_loss: 0.4626 - val_accuracy: 0.8046\n", "Epoch 3/40\n", "87/87 [==============================] - 1s 10ms/step - loss: 0.4771 - accuracy: 0.7986 - val_loss: 0.4614 - val_accuracy: 0.8053\n", "Epoch 4/40\n", "87/87 [==============================] - 1s 10ms/step - loss: 0.4771 - accuracy: 0.7982 - val_loss: 0.4743 - val_accuracy: 0.8001\n", "Epoch 5/40\n", "87/87 [==============================] - 1s 10ms/step - loss: 0.4774 - accuracy: 0.8012 - val_loss: 0.4664 - val_accuracy: 0.8029\n", "Epoch 6/40\n", "87/87 [==============================] - 1s 10ms/step - loss: 0.4746 - accuracy: 0.8003 - val_loss: 0.4705 - val_accuracy: 0.8012\n", "Epoch 7/40\n", "87/87 [==============================] - 1s 10ms/step - loss: 0.4739 - accuracy: 0.7986 - val_loss: 0.4653 - val_accuracy: 0.7960\n", "Epoch 8/40\n", "87/87 [==============================] - 1s 10ms/step - loss: 0.4762 - accuracy: 0.8006 - val_loss: 0.4682 - val_accuracy: 0.8012\n", "91/91 [==============================] - 0s 3ms/step\n" ] } ], "source": [ "## define input data\n", "Xdata = x_data\n", "ydata = y_data_1_hot\n", "\n", "## define the K-fold cross validator\n", "nfold_outer = 8\n", "cv_outer = KFold(n_splits=nfold_outer, shuffle=False)#, random_state=42)\n", "\n", "## define early stopping conditions for model tuning\n", "stop_early = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=5, mode=\"min\")\n", "\n", "## validation output table\n", "accuracy_outer_post = pd.DataFrame(index=np.arange(1,nfold_outer+1), columns=[\"ANN\",\"DJF\",\"MAM\",\"JJA\",\"SON\"])\n", "\n", "\n", "## loop over folds\n", "fold_outer = 1\n", "\n", "for train, test in cv_outer.split(Xdata, ydata):\n", " print(\"outer_fold = \"+str(fold_outer))\n", " print(train, test)\n", " \n", " ## define training and test data\n", " X_train = Xdata[train, :]\n", " y_train = ydata[train]\n", " dates_train = dates[train]\n", " \n", " X_test = Xdata[test, :]\n", " y_test = ydata[test]\n", " dates_test = dates[test]\n", " \n", " ## create model and tune it (with 1/7 of training data used for validation)\n", " NN_model = model\n", " NN_model.fit(X_train, y_train, validation_split=1/7, epochs=40, batch_size=200, callbacks=[stop_early])\n", " \n", " ## evaluate model\n", " y_pred = np.argmax(NN_model.predict(X_test), axis=1)+1\n", " y_tst = np.argmax(y_test, axis=1)+1\n", " \n", " ## overall accuracy\n", " acc_outer = sklearn.metrics.accuracy_score(y_tst, y_pred)\n", " accuracy_outer_post.loc[fold_outer, \"ANN\"] = acc_outer\n", " \n", " ## seasonal accuracy\n", " for i in [[12,1,2],[3,4,5],[6,7,8],[9,10,11]]:\n", " acc = sklearn.metrics.accuracy_score(y_tst[dates_test.month.isin(i)], y_pred[dates_test.month.isin(i)], normalize=True)\n", " if i == [12,1,2]:\n", " cc = \"DJF\"\n", " elif i == [3,4,5]:\n", " cc = \"MAM\"\n", " elif i == [6,7,8]:\n", " cc = \"JJA\"\n", " elif i == [9,10,11]:\n", " cc = \"SON\"\n", " accuracy_outer_post.loc[fold_outer, cc] = acc\n", " \n", " # Increase fold number\n", " fold_outer += 1\n", "\n", " " ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
ANNDJFMAMJJASON
10.7901110.7853190.7785330.8271950.770604
20.7842320.7936290.8029890.7507080.788462
30.787690.7839340.7989130.781870.785714
40.7859610.7950140.7895480.767030.792582
50.772130.8088640.7719550.7228260.785714
60.7813910.7714680.7914890.7798910.782967
70.8069870.8328530.8021830.792120.802198
80.7672090.7945010.739130.7771740.759615
\n", "
" ], "text/plain": [ " ANN DJF MAM JJA SON\n", "1 0.790111 0.785319 0.778533 0.827195 0.770604\n", "2 0.784232 0.793629 0.802989 0.750708 0.788462\n", "3 0.78769 0.783934 0.798913 0.78187 0.785714\n", "4 0.785961 0.795014 0.789548 0.76703 0.792582\n", "5 0.77213 0.808864 0.771955 0.722826 0.785714\n", "6 0.781391 0.771468 0.791489 0.779891 0.782967\n", "7 0.806987 0.832853 0.802183 0.79212 0.802198\n", "8 0.767209 0.794501 0.73913 0.777174 0.759615" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "## accuracy for all folds (overall and seasons)\n", "accuracy_outer_post" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "ANN 0.784464\n", "DJF 0.795698\n", "MAM 0.784343\n", "JJA 0.774852\n", "SON 0.783482\n", "dtype: float64\n" ] } ], "source": [ "## average accuracy (overall and seasons)\n", "print(accuracy_outer_post.mean(axis=0))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## **4) Weather type reconstructions**\n", "\n", "reconstruct WTs from our dummy station variables and evaluate the reconstructions." ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "723/723 [==============================] - 2s 3ms/step\n" ] } ], "source": [ "## create predictions with pre-trained model (section 3)\n", "preds = NN_model.predict(Xdata)\n", "preds_class = np.argmax(preds, axis=1)+1\n", "true_class = np.argmax(ydata, axis=1)+1" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
predictedtrue
1957-09-0113
1957-09-0222
1957-09-0333
1957-09-0455
1957-09-0534
.........
2020-12-2712
2020-12-2855
2020-12-2977
2020-12-3023
2020-12-3155
\n", "

23133 rows × 2 columns

\n", "
" ], "text/plain": [ " predicted true\n", "1957-09-01 1 3\n", "1957-09-02 2 2\n", "1957-09-03 3 3\n", "1957-09-04 5 5\n", "1957-09-05 3 4\n", "... ... ...\n", "2020-12-27 1 2\n", "2020-12-28 5 5\n", "2020-12-29 7 7\n", "2020-12-30 2 3\n", "2020-12-31 5 5\n", "\n", "[23133 rows x 2 columns]" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "## create data frame with predicted and true WT time series\n", "WT_rec = pd.DataFrame([preds_class, true_class], columns=dates, index=[\"predicted\",\"true\"]).T\n", "WT_rec" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.8026196342886786" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "## get overall accuracy\n", "sklearn.metrics.accuracy_score(WT_rec.true, WT_rec.predicted)" ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Confusion matrix (row-normalized, in %): true weather type (rows) vs. predicted weather type (columns)

           1          2          3          4          5          6          7          8          9
1  81.999106   3.331843   5.612701   6.127013   0.000000   2.929338   0.000000   0.000000   0.000000
2   4.678530  82.864524   3.932262   0.000000   0.000000   5.396096   3.128588   0.000000   0.000000
3  14.065450   6.010069  71.806167   5.286344   2.769037   0.000000   0.000000   0.062933   0.000000
4   6.715006   0.000000   3.247163  83.890290   6.116015   0.000000   0.000000   0.031526   0.000000
5   0.000000   0.038124   3.583683  13.381624  75.447960   0.000000   0.000000   7.548608   0.000000
6   8.706572  11.929678   0.083717   0.000000   0.000000  74.257011   4.730013   0.000000   0.293010
7   0.000000   8.613218   0.000000   0.000000   0.000000   3.737811  85.319610   0.000000   2.329361
8   0.000000   0.089445   0.089445   0.000000   5.187835   0.000000   0.000000  94.633274   0.000000
9   0.000000   0.000000   0.000000   0.000000   0.000000   1.762632  17.861340   0.000000  80.376028
\n" ], "text/plain": [ "" ] }, "execution_count": 22, "metadata": {}, "output_type": "execute_result" } ], "source": [ "## confusion matrix\n", "confmat = sklearn.metrics.confusion_matrix(WT_rec.true, WT_rec.predicted, labels = [1, 2, 3, 4, 5, 6, 7, 8, 9], normalize = \"true\")\n", "df = pd.DataFrame(confmat)*100\n", "df.index = df.columns = [1,2,3,4,5,6,7,8,9]\n", "df.style.background_gradient(cmap ='Blues').set_properties(**{'font-size': '20px'})" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "anaconda-cloud": {}, "colab": { "collapsed_sections": [ "Dit404ghtkSr", "K2DojogoQv4l", "rcuT4JYRtkS4", "YzpETsFotkTF", "agJ3gkg0tkTJ", "cgHlUGwDtkTW", "hh6I4wJLtkTa", "kcLUPxNwtkTd", "TE_HXLPjfRBt", "I3v3Qp2WfMoE" ], "name": "Tutorial_III_tf2_Fully_connected_NNs.ipynb", "provenance": [] }, "kernelspec": { "display_name": "Python [conda env:.conda-ML]", "language": "python", "name": "conda-env-.conda-ML-py" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.16" } }, "nbformat": 4, "nbformat_minor": 4 }