Performance analysis of the urban climate model MUKLIMO_3 for three extreme heatwave events in Bern

Extreme heatwaves represent a health hazard that is expected to increase in the future, and which particularly affects urban populations worldwide due to intensification by urban heat islands. To analyze the impact of such extreme heatwaves on urban areas, urban climate models are a valuable tool. This study examines the performance of the urban climate model MUKLIMO_3 in modelling spatial air temperature patterns in the greater urban area of Bern, Switzerland, a city in complex topography, during three distinct extreme heatwaves in 2018 and 2019 over a total of 23 days. The model is validated using low-cost air temperature data from 79 (2018) and 84 (2019) measurement sites. The intercomparison of the three extreme heatwaves shows that during the first extreme heatwave 2019 at lower elevation regions in the outskirts of the city, modelled air temperature was higher than observation, which was likely due to pronounced mesoscale cold air advection. During calm and dry days, the air temperature distribution was modelled realistically over all three extreme heatwaves investigated. During daytime, modelled air temperatures were lower across all evaluation sites and all extreme heatwaves when compared to the measured values, with highest median air temperature differences of (cid:0) 3.7 K to (cid:0) 4.8 K found in the late afternoon. At night, MUKLIMO_3 generally shows a slowed cooling, so that higher air temperatures were modelled when compared to measured values, with median air temperature biases of + 1.5 K to + 2.8 K at midnight. By sunrise, the model biases continuously decreased, so that the lowest air temperatures at 7 a.m. were modelled with a bias of + 0.2 K to + 0.7 K. Peak biases exceed 7 K both during day and night. In sum, our results show that MUKLIMO_3 allows to realistically model the urban air temperature distributions during the peaks of the heatwaves investigated with the highest day and night air temperatures, which may assist in the development of heat mitigation measures to reduce the impacts of heat extremes and improve public health in cities with complex topography.


Introduction
Changes in numerous extreme weather events have been observed globally since the mid-20th century due to anthropogenic greenhouse gas emissions, such as an increase and intensification of air temperature extremes [21].A growing number of studies show that the intensity, frequency, and duration of heatwaves are significantly associated with increases in human mortality [2,26,34].To illustrate, the extreme heatwaves in summer 2003 led to an excess mortality of 1000 fatalities in Switzerland [16], whereas the hot summers of 2018 and 2019 resulted in additional 200 and 500 fatalities, respectively [6,38].In urban areas, extreme air temperatures are intensified through changes in land surface and anthropogenic activities influencing the heat storage and fluxes [36].The increased air temperatures in urban areas compared to the rural surroundings are referred to as the urban heat island (UHI) effect and represent one of the most known and well-documented examples of anthropogenic modifications of the urban atmosphere [35].
In order to assess the risks of heatwaves in the particularly affected urban areas, knowledge about intra-urban air temperature variability is necessary at high spatial resolutions [20].While in-situ measurements are one way to measure the UHI effect, urban climate models (UCM) are another possible way to assess air temperature patterns in cities.One of these urban-scale models is MUKLIMO_3 [40,41].Numerous studies have already applied MUKLIMO_3 to model the air temperature field of heatwaves in urban areas.Often, only a single day or a few days at the peak of the respective heatwave were modelled (e.g., [3,11,13,19,24,28,45].While Geletič et al. [13] find a good agreement in particular of the spatial pattern of the air temperature during the day (errors below 1 • C), Holec et al. [19] find rather big differences of several degrees.Straka & Sodoudi [45] find a too slow cooling after sunset, leading to an overestimation of 2 m temperature and to a root mean squared error of 1-1.5 • C or even higher.These discrepancies might also be due to the fact that the model evaluation could only be carried out using a few individual air temperature measurement sites available across the urban area.However, since heatwaves can vary considerably in terms of duration, intensity, spatial variability, and atmospheric conditions, it is important not to limit the analysis to individual days of a single heatwave, but to compare multiple heatwaves over their entire duration and to evaluate it against high resolved in-situ data to perform fine-scale evaluations intra-urban air temperature distributions.
The aim of this study is to fill this gap by applying MUKLIMO_3 to model air temperature patterns throughout three distinct extreme heatwaves in 2018 and 2019 for the greater area of Bern, a city with complex topography in Switzerland.We take advantage of the existence of an air temperature measurement network within the study area consisting of more than 70 measurement sites [18].A specific objective is to investigate the extent to which the performance of the UCM differs throughout the individual extreme heatwaves both temporally (diurnal cycle and course of the heatwaves) and spatially (measurement network sites).
The paper is organized as follows.After a detailed description of heatwaves, study area and model domain, the modelling results and its spatiotemporal evaluation are discussed regarding meteorological conditions, model set-up and measurement uncertainties.

Study area, climatology, and extreme heatwaves during 2018 / 2019
The city of Bern is located at the boundary of the Bernese Pre-Alps and the Swiss Plateau in the western part of Switzerland (46 • 56 ′ N, 7 • 26 ′ E) at an average elevation of 550 m a.s.l.(Fig. 1A).Characterized by a heterogenous and complex topography, various hills around the city contribute to the city's regional and local climate ([4,18,49]; Fig. 1A).The river Aare flows through Bern and exerts its microclimatic influence on the city.With 134,000 inhabitants in the city and 400,000 in its greater metropolitan area, Bern is the fifth largest city of Switzerland [5].
Bern has a warm temperate, fully humid climate with warm summers [25].The rural WMO-certified weather station Bern/Zollikofen is located about 5 km north of the city center.The monthly average of all daily maximum (minimum) temperatures for the reference period 1981-2010 in the months June, July and August were 21.

Table 1
Measured values for the three extreme heatwaves under investigation in Bern/ Zollikofen.T avg shows the average daily air temperature, T max the average daily maximum air temperature, T min the average daily minimum air temperature [33].Fig. 2. Implemented irregular grid in horizontal direction with the LUCs described in Fig. 1.The black lines show the transition from one grid cell spacing to the next.6.3, the number of tropical nights (T min ≥ 20 • C) was 0 [31,32].The wind conditions on the Swiss Plateau, and thus also around Bern, are characterized by the gradient winds modified by the relief of the main Alpine ridge.Winds from the north to northeast and west to southwest dominate [27].A discussion of the wind situation during the three extreme heatwaves studied is available in Section 4.1 and is also shown in the appendix (Fig. A2).The summers of 2018 and 2019 were the fourth and third hottest in Switzerland since measurements began in 1864, with nationally averaged summer air temperature being +2.0 K and +2.1 K above the 1981-2010 norm, respectively.They were exceeded only by the summers of 2003 and 2015, for which, however, no high-resolution air temperature measurement data are available.Despite similar average air temperatures, the two summers of 2018 and 2019 showed significant differences in precipitation, with summer 2018 well below average and summer 2019 within the norm.Both summers were characterized by exceptionally long or intense heatwaves [29,32].
The Federal Office of Meteorology and Climatology in Switzerland (MeteoSwiss) describes heatwaves and extreme heatwaves as a period of extreme heat stress that can endanger human health.Using the definition of MeteoSwiss, three extreme heatwaves based on a heat index were identified in 2018 and 2019 for Bern/Zollikofen ( [30], Table 1).

MUKLIMO_3
To simulate diurnal cycles of atmospheric variables (temperature, moisture, wind field, radiation) on a three-dimensional model grid in dependence on prevailing weather conditions, land use, and topography, the non-hydrostatic microscale UCM MUKLIMO_3 (3-dimensionales mikroskaliges urbanes Klimamodell) developed by the German Meteorological Service in 1986 and continuously improved and extended until the most current available version of June 2020 (v200629) was used [41].
During a model run, two phases can be distinguished: First, the onedimensional model (1D) runs alone for a certain time in order to calculate the initial boundary conditions at 1 h intervals based on given input variables (e.g., initial vertical profile of air temperature and relative humidity, wind speed and wind direction as well as values for soil temperature and moisture, cloudiness and parameters for solar radiation) [41].Then, the three-dimensional simulation (3D) starts, with 1D continuing to run in parallel and providing the upper boundary values for the 3D.The lateral sides of 3D have free boundary conditions with horizontal advection terms being zero at the upstream domain boundaries [51].Both 1D and 3D include atmospheric balance equations for wind, air temperature and specific humidity, balance equations for temperature and volumetric water content in the earth's soil, and an interface model of the earth's surface that also takes vegetation into account and connects the atmosphere with the earth's soil [39,42,43].In addition, there are model equations for calculating the incoming shortwave and long-wave radiant flux densities on the surfaces.Unlike 3D, 1D does not include buildings, trees or water surfaces, but considers energy balance for low vegetation, sealed and free-soil surface.
In order to reduce the computational effort when investigating entire cities, the air flow in built-up area is parametrized through a porous medium approach for unresolved buildings, characterized by the building density, height and wall area within each grid cell [17].To carry out a simulation, datasets on topography, land use and description of land use classes (LUC) within the model area are required.Moreover, the meteorological situation at the start of the simulation is described by measured soil temperature in 1 m depth, as well as air temperature and relative humidity at up to five different altitudes.The water temperature, the temperature inside the buildings, and the degree of cloud cover Besides topography, MUKLIMO_3 requires detailed information about the land use of each grid cell.The definition of LUCs is primarily conducted using the dataset "Amtliche Vermessung vereinfacht", which contains the most accurate geodata available for the Canton of Bern [1]; Table A1).The dataset swissBUILDINGS 3D 2.0 [47] is used to subdivide the buildings according to their area, Imperviousness Density 2018 [9] to subdivide roads and squares according to the degree of sealing and Ginzler & Hobi [14] to subdivide garden areas according to the vegetation height.Using these datasets, 22 LUCs are defined, which are as optimally delineated from each other and as homogenous within themselves as possible (Fig. 1D).
To achieve the highest possible spatial resolution at ground level and in the central parts of the city while considering the influence of the surrounding topography, a three-dimensional irregular grid is implemented (Fig. 2).The core area of the city of Bern is modelled with the spatial resolution of 50 × 50 m.A higher resolution is theoretically possible, but the fraction of buildings per grid cell should not exceed 0.75, which is then no longer guaranteed [7].The grid cell spacing increases continuously up to 200 m towards the city's periphery, whereby neighboring grid cells differ by a maximum factor of 1.3 to avoid numerical instabilities in the model.In the vertical direction, the grid cell spacing is increased with increasing height from 10 m to 100 m in 56 and 51 levels for 1D and 3D, respectively, over the entire model area.This allows the layers close to the ground to be optimally resolved and the influence of the overlying atmosphere to be included using as few computing resources as possible.These properties lead to a 3D model area with 202 × 173 × 51 grid cells in northeast, northwest, and vertical direction, respectively.
The model area has a horizontal extent of 12.1 × 12.1 km and is rotated by 45 • against north, which allows relevant topographic features in the vicinity of Bern to be included.The vertical extent corresponds to 1611 m and 1151 m for 1D and 3D, respectively, with the model area being based on the lowest point of the area (480 m a.s.l.).
The start time of the MUKLIMO_3 simulations should be chosen so that the influence of the LUCs is as small as possible at the beginning.This is determined by the smallest average variance of the air temperature measurement data during each diurnal cycle of the study period.Therefore, 3D was started for each individual day at 07:00 CEST (Central European Summer Time, UTC+2) and modelled over 25 h.This allowed a complete diurnal cycle to be generated while removing the first hour (remaining data from 08:00 CEST to 07:00 CEST the following day), as this is used to create steady-state profiles of the atmospheric and ground variables.For this, 1D ran from 07:00 CEST the previous day to 07:00 CEST the day when 3D started to provide initial conditions for 3D.
Since MUKLIMO_3 cannot represent phase transitions of water vapor in the atmosphere, and thus neither cloud formation nor rain or snow [41], precipitation periods as well as the following hours of the respective model run were excluded from further investigations (Table 2).As precipitation during the study period often occurred in the form of small-scale thunderstorms that did not affect the entire model area, radar images were used to determine when a precipitation event occurred inside the model area [22], Appendix Fig. A3).Thereby, a total of 146/240 h (61 %) remain in HW18, 168/192 h (88 %) in HW19.1, and 88/96 h (92 %) in HW19.2 (Table 2).

High-resolution air temperature measurement network
To evaluate the modelled air temperatures, these are compared with measured air temperature data.From May to September 2018 and 2019, a low-cost air temperature measurement network was in operation to assess fine-scale spatiotemporal air temperature variability across the city of Bern and its surroundings using 79 (2018) and 84 (2019) thermistor-based HOBO Pendant® temperature loggers, respectively, whereas 64 loggers were placed at the same locations during both summers (Fig. 3).They were installed at about 3 m above ground to prevent vandalism, which is above standard height of 2 m but acceptable for urban canopy layer temperature measurements [50].
The temperature loggers recorded near-surface air temperatures at 10-minutes intervals, but due to the plastic coating of the sensor, the temperature loggers have a relatively long e-folding time of 10 min for changes in ambient temperatures at an airflow of 2 m s − 1 (manufacturer indications; see [18].Therefore, the hourly outputs of MUKLIMO_3 are compared with the measured air temperature values 10 min later (e.g., modelled air temperature at xx:00 is compared with the measured air temperature at xx:10).

Temporal evaluation of MUKLIMO_3
The temporal evaluation is based on six specific time steps during the course of the day for each heatwave (02:00, 06:00, 10:00, 14:00, 18:00, and 22:00 CEST).In each time step, all available loggers are included so that no spatial subdivision is applied to allow the spatial and temporal evaluation to be carried out separately.The fit for the linear regression of the modelled air temperature as response variable and measured air temperature as predictor variable using ordinary least squares is determined individually for the six time steps.In addition, the standard deviation, the root mean square error (RMSE), and the Pearson correlation are presented graphically using Taylor diagrams [48].The scatter of the bias between measured and modelled air temperatures as well as its median is presented using boxplots for all time steps over the entire diurnal cycle.To conduct an analysis of the spatial distribution of air temperature, maps of the mean UHI per heatwave are analyzed in a qualitative manner for 02:00 CEST and 18:00 CEST.The UHI is defined so that each grid cell reflects the difference to the grid cell with the measurement site Bern/Zollikofen.This allows to determine whether MUKLIMO_3 realistically represents the spatial distribution of air temperature over the course of the day.

Spatial evaluation of MUKLIMO_3
The measurement network covers a wide variety of city districts, neighboring agglomeration, and rural surroundings.The spatial evaluation is carried out in three different ways: First, the spatial patterns of the modelled air temperature at 3 m above ground are analyzed qualitatively.To represent the UHI, for each grid cell the air temperature difference to the grid cell containing the rural reference site Bern/Zollikofen is determined at 02:00 CEST and 18:00 CEST for each heatwave.In this way, both the daytime and nighttime situation can be analyzed and is also consistent with time steps used in the temporal evaluation.Second, the individual sites are mapped and evaluated based on their geographic location.We show the mean bias between measured and modelled air temperatures for each individual site at 02:00 CEST and 18:00 CEST for all three extreme heatwaves (Fig. 5).To analyze the larger deviations in HW19.1 and their spatial pattern in more detail, three individual days from this extreme heatwave are shown additionally for 02:00 CEST and 18:00 CEST.Third, all sites are classified into different local climate zones (LCZ) to evaluate not only the influence of the geographical localization of the measuring sites, but also the influence of the direct environment on the modelling and measurements.The LCZ concept was introduced by Stewart and Oke [44] and describes a distinct urban area with more or less homogeneous surface cover, structure, materials, and human activity.Each LCZ has a "characteristic screen-height temperature regime that is most apparent over dry surfaces, on calm, clear nights, and in areas of simple relief" ( [44], p. 1884).Using LCZ allows comparisons across studies.However, for Bern no gridded LCZ classification exists.Therefore, the classification of LCZs was done for each measurement site using satellite imagery and GIS data in a qualitative manner by Gubler et al. [18] and adopted for this study.The measurement sites are grouped according to the LCZs separated by heatwaves and statistically evaluated for all available hours over the entire diurnal cycle.For this purpose, the fit for the linear regression per LCZ of the modelled air temperature as response variable and measured air temperature as predictor variable using ordinary least squares is determined over all time steps.Using a Taylor diagram, each individual LCZ is examined separately according to heatwave [48].The scatter of the bias between measured and modelled air temperatures as well as its median is presented using boxplots per LCZ over all available time steps.

Temporal evaluation
The hourly air temperature fields at 3 m above ground modelled by MUKLIMO_3 (ΔT MUK ) are compared with the low-cost measurement network (ΔT AWS ) of Gubler et al. [18], including all available measurement sites over the entire diurnal cycle separated into respective hours of the day.The comparison of all three extreme heatwaves shows that the modelled median air temperature is in general lower during daytime and higher during nighttime (Fig. 4A to C).The largest negative deviations of modelled median air temperature can be found during the hottest time of the day (between 16:00 and 18:00 CEST) with − 3.7 K (HW18), − 4.1 K (HW19.1),and − 4.8 K (HW19.2),respectively.The largest positive bias of modelled median air temperature can be found between 00:00 and 02:00 CEST with 1.5 K (HW18), 2.8 K (HW19.1),and 1.5 K (HW19.2).A diurnal pattern is visible, whereby in the morning, the modelled air temperatures are consistently lower than the measured values.This negative model bias increases in the afternoon with rising air temperatures (around 13:00 to 18:00 CEST), whereas the sign of bias turns positive with decreasing air temperatures in the evening (around 21:00 to 23:00 CEST) indicating higher air temperatures in MUKLIMO_3 compared to the measured values.During night, the positive bias decreases towards 0 K, so that at sunrise (06:00 to 07:00 CEST), a median bias of 0.7 K (HW18), 0.2 K (HW19.1),and 0.7 K (HW19.2) can be determined for HW18, HW19.1, and HW19.2, respectively.
The comparison of modelled air temperatures with measurements of the official, ventilated weather stations in Bern/Zollikofen and Wankdorf reveals a similar pattern regarding the median model bias over the diurnal cycle, especially in HW19.1 and HW19.2.The largest positive median air temperature bias for MUKLIMO_3 can be found for both Bern/Zollikofen and Wankdorf at night (00:00 CEST until 04:00 CEST), whereas the largest negative median air temperature bias is found in the afternoon (around 16:00 CEST to 17:00 CEST).In HW18, the largest negative median model bias for Bern/Zollikofen and Wankdorf is found at 09:00 CEST and increases continuously until midnight.The intercomparison of the three heatwaves over the diurnal cycle reveals that HW19.1 shows a larger spread of air temperature bias over the entire diurnal cycle compared to HW18 and HW19.2.Further, HW19.1 also shows a higher RMSE and an equal standard deviation over the entire diurnal cycle compared to HW18 and HW19.2 (Fig. 4D).Conversely, HW19.2 differs from HW18 and HW19.1 when considering only the cooling phase between maximum and minimum temperatures with a lower Pearson correlation coefficient of − 0.03 compared to 0.44 (HW18) and 0.30 (HW19.1).

Spatial evaluation
The spatial pattern of modelled air temperature shows a pronounced UHI compared to the surrounding rural area for all extreme heatwaves investigated, both at night (02:00 CEST) and during the day (18:00 CEST; Fig. 5).At 02:00 CEST, this is particularly evident in HW18 and HW19.1 with UHI intensities of up to 1.6 K when compared to the rural site Bern/Zollikofen.In HW19.2, the excess heat in the city center is slightly lower with 1.2 K.The lowest air temperatures are in the hilly south and northeast of the model area in all three extreme heatwaves with deviations compared to Bern/Zollikofen of − 2.5 K, − 2.2 K, and − 2.6 K, respectively.As during nighttime, negative UHI intensities in the model domain at 18:00 CEST are in the higher elevation regions, although these deviations are markedly higher with − 4.2 K (HW18), − 3.8 K (HW19.1),and − 4.0 K (HW19.2),respectively.At 18:00 CEST, a distinct air temperature gradient can be seen along the topography during all three extreme heatwaves investigated and all built-up areas show a stronger warming than Bern/Zollikofen, especially the densely built-up inner city.The UHI intensity is larger than during the night when compared to Bern/Zollikofen with up to +2.0 K (HW18), +2.9 K (HW19.1),and +2.5 K (HW19.2),respectively.
The spatial pattern of mean deviations between modelled and measured air temperature per heatwave shows certain peculiarities (Fig. 6).The modelled air temperature is generally overestimated compared to the measured air temperature at nighttime (02:00 CEST) and underestimated at daytime (18:00 CEST).The overestimation at nighttime is particularly noticeable in HW19.1, whereby the sites in the outskirts of the city are more affected by higher (positive) deviations than the sites in the city center, which often show deviations of less than +1 K.However, especially the low-lying sites along the river show deviations of up to +5.6 K.This spatial pattern also applies to HW18 and HW19.2, but to a lesser extent with maximum deviations of 3.8 K and 3.2 K, respectively.At 18:00 CEST, a less pronounced spatial pattern of deviations can be seen across all extreme heatwaves.However, HW18 (− 4.9 K), as well as HW19.1 (− 5.7 K) and HW19.2 (− 5.7 K) show substantial maximum deviations of the modelled air temperature compared to the measured air temperature.The average deviation in HW19.2 (− 3.9 K) is slightly larger than in HW18 (− 3.3 K) and HW19.1 (− 3.3 K).
To understand the exceptionally high nighttime model biases in HW19.1, a more detailed analysis of the evolution of the extreme heatwave reveals a marked intensification of model deviations (Fig. 7).While at the beginning of HW19.1 on 26 June 2019 the deviations reach a maximum of +2.8 K at 02:00 CEST, they already rise to +6.8 K on 28 June 2019 at 02:00 CEST.On 29 June 2019 at 02:00 CEST deviations of up to +10.6 K can be observed.The spatial pattern with stronger deviations in the outskirts of the city than in the city center is particularly visible at the end of the extreme heatwave but is also present at the beginning of the extreme heatwave to a lesser extent.During daytime at 18:00 CEST, the largest deviations are reached at the beginning of HW19.1 on 26 June 2019 with up to − 7.2 K. On 28 June 2019 at 18:00 CEST, there is a maximum negative deviation of − 0.7 K, but a maximum positive deviation of up to +3.5 K. Towards the end of the extreme heatwave on 29 June 2019, a maximum negative deviation of − 4.0 K is observed.
We also analysed components of the energy budget, namely sensible heat flux and the outgoing longwave radiation (see Fig. A4).During the first two days, sensible heat flux averaged over the entire model domain was higher than later, while outgoing long wave radiation was lower than during the remaining days.
The comparison of model biases across different LCZs over the entire diurnal cycle shows that the median air temperature 3 m above ground in MUKLIMO_3 is generally lower than the measured for all LCZs except for LCZ A (HW18 and HW19.1;Fig. 8 A and B) and LCZ G (HW19.1; Fig. 8B).Further, the intercomparison of the three extreme heatwaves reveals differences regarding the extent and variability of biases across the LCZs.Whereas in HW18 LCZ A shows the lowest (0.3 K) and LCZ E the highest absolute median bias (− 1.6 K), in HW19.1 LCZ 8 shows the lowest (− 0.3 K) and LCZ 10 the highest absolute median bias (− 2.1 K).In HW19.2 LCZ A shows the lowest (− 0.9 K) and LCZ 10 the highest absolute median bias (− 3.3 K).In addition, HW19.1 shows a substantially higher variability compared to HW18 and HW19.2.This also results in HW19.1 having a lower Pearson correlation of 0.79 between modelled and measured air temperature and a RMSE of 3.4 K compared to HW18 and HW19.2 at 0.9 and 2.0 K over all LCZs, respectively (Fig. 8D).Contrastingly, the standard deviation is similar in all extreme heatwaves under investigation.Regarding differences across LCZs, it can be stated that in general, built-up LCZs (LCZ 1 to 10) have similar median biases, higher Pearson correlations, higher standard deviations and lower RMSEs compared to non-built LCZs (LCZ A to G).

Discussion
The analysis of modelled air temperature fields and its comparison with observational data from the measurement network shows that MUKLIMO_3 could model the basic spatiotemporal air temperature distribution patterns in the area of Bern during precipitation-free days in three distinct extreme heatwaves.In particular, the effect of the inner, densely built-up city is realistically represented across all extreme heatwaves investigated and at almost all times of day with highest UHI intensities.However, nights with strong influence of cold air advection, which are mostly prevalent in HW19.1, are an exception as it could influence the air temperature evolution in the city center (e.g., 29 June 2019; Fig. 7).Further, the outskirts of the city and areas along water bodies showed generally higher model biases, which might be due to the fact that these areas are potentially more prone to be affected by mesoscale atmospheric influences not being well represented in MUKLIMO_3 (Figs. 6 and 7).The better performance of MUKLIMO_3 for built-up LCZs (LCZ 1-9) as well as forests (LCZ A) compared to un-built LCZs (LCZ B-G) could at least partly be explained by the fact that the former are more often located in the inner city, which was generally better modelled (Fig. 8).These differences in model performance across the three extreme heatwaves require a discussion of potential influencing factors related to prevalent meteorological conditions, the model set-up, as well as the measurement data used for the model evaluation.

Meteorological conditions
To investigate the influence of mesoscale meteorological conditions, wind measurements on Bantiger at about 1100 m a.s.l. are used.HW18 and HW19.2 were dominated by the wind direction northeast during the entire diurnal cycle of all days modelled except for early mornings of HW19.2, when southwesterly winds occurred.Contrastingly, HW19.1 shows a distinct shift in wind direction from northeast at daytime to south to southwest during night and early morning between 22:00 and 09:00 CEST (Fig. A2).This change in wind direction after 22:00 CEST in HW19.1 coincides with the largest positive model biases and the highest RMSE between modelled and measured air temperatures.Besides differences in wind direction, the poorer model performance in HW19.1 can also be explained by pronounced mesoscale cold air advection, which occurred more frequently in HW19.1 compared to HW18 and HW19.2.To illustrate, temperature measurements at the high-alpine Jungfraujoch weather station (3582 m a.s.l.) show a drop of − 7 K within 48 h initiated on 27 June 2019 (MeteoSwiss, 2021), which indicates the advection of colder air masses in the evening.Given that the first three nights of HW19.1 from 24 to 26 June 2019 show only small deviations between modelled and measured air temperature, it seems thus likely that the biases of up to +10 K in the subsequent nights of 27-29 June 2019 result from the advection of this colder air masses towards the model area (Fig. 7).Because MUKLIMO_3 does not receive information regarding changes in air temperature outside the model area, this mesoscale cold air advection cannot be detected, which likely results in observed overestimations of air temperatures.This is supported by the heat fluxes, which initially change but remain unaffected by the advection.

Model set-up
The differences between measured and modelled air temperatures may be the result of multiple factors.While the negative model biases (underestimation) during the day are likely to be also influenced by inaccuracies of the temperature measurements (see Section 4.3), the deviations at night are rather due to model characteristics and parametrizations given that nocturnal measurement biases have been reported to be negligible [18].A prominent insufficiency of MUKLIMO_3 is that the cooling of near-ground air masses after sunset is often modelled too slowly [8,12,23,45].This would explain higher RMSE between modelled and measured air temperature especially in HW19.1 between 22:00 and 01:00 CEST (Fig. 4D).
The change in the sensible heat flux and outgoing longwave radiation during the first two days of HW19.2 might point to an initial model imbalance.However, as several processes, most importantly advection, are not represented in the model, the interpretation of energy balance components is difficult.
MUKLIMO_3 models a substantially smaller UHI magnitude than the measured magnitude at night (Figs. 5 and 6).A potential reason for this can be found in the modelled spatial extent of the UHI, which extends very far at night, especially to the north and southeast (Fig. 5).Thus, the rural measurement site Bern/Zollikofen seems to be under urban air temperature influence at night in MUKLIMO_3, especially in HW19.1 and HW19.2, so that rural air temperatures are no longer reflected there.A probable explanation for this phenomenon can be found in the more frequent wind from the south to southeast, also implemented accordingly in the model, which occurred mainly in these two extreme heatwaves.This leads to an urban plume [37] in the model from the city towards the north to north-east, which also influences the air temperatures in Bern/Zollikofen and thus reduces the air temperatures difference between Bern/Zollikofen and the urban areas in the city center.A proposed solution to this problem is to set up a more distant rural measurement site outside this plume or to use several rural measurement sites at different directions from the city center, whose air temperatures could be averaged and thus reduce the influence of onedirectional urban plumes.
The small-scale air temperature variability between the respective LCZs is smaller in the model than in the measurements (Fig. 8).On the one hand, this might be due to the horizontal spatial resolution per grid cell of 50 × 50 m, which means that small-scale air temperature fluctuations are lost due to the fact that the variability of land cover smaller than grid cell size is not included in the model.On the other hand, this might also be due to the unresolved buildings, which are divided into five different categories of grid cells that contain buildings to a different proportion.MUKLIMO_3 forms an ensemble for each grid cell with different realisations of the unresolved buildings, so that the arrangement of the buildings is different, but the proportion of buildings is hold constant.An average value for all meteorological parameters is then calculated from this ensemble [41].
The initial vertical temperature profile of each model run is determined by interpolation based on air temperature measurements in Bern/ Zollikofen and Bantiger.Yet this could only inadequately represent the vertical temperature distribution in the atmosphere, especially in the case of a non-adiabatic air temperature distribution, as occurs on nights with strong radiative emission and a developing air temperature inversion forming [15].
Many meteorological input factors are fixed at the start of the respective model run.Therefore, the conditions at 07:00 CEST have a great influence on the further course of the modelling.Changes in meteorological conditions during the model run can thus not be captured by the model.Hollósi et al. [20] mention that a positive or negative cloud cover bias highly influences the diurnal ranges of air temperature and relative humidity and could show that excluding simulations of cloud cover bias of more than 50 % improve the air temperature results by about 10 %.
Another meteorologically induced uncertainty is the soil moisture, which is indicated in MUKLIMO_3 via classes from 1 (very dry) to 6 (very moist).Since the model tends to overestimate evapotranspiration, small values of 1 to 2 are recommended [7].If rain fell on the previous day, a value of 2 was used, otherwise 1.However, since the rain did not fall uniformly over the entire model area and the soil was likely able to absorb different amounts of water depending on the previous dryness, the real soil moisture is difficult to estimate and is subject to some uncertainties.
The soil temperature must be given to the model at a depth of 1 m [7].However, measured values are only available up to a depth of 0.5 m in Bern/Zollikofen, so that a greater depth must be estimated and is correspondingly uncertain.Straka & Sodoudi [45] could show that deviations in soil temperature can lead to considerable changes in the output of MUKLIMO_3, so temperature as well as moisture measurements at a depth of 1 m would be important for improving the modelling.

Measurement uncertainties
Compared to official air temperature measurements by MeteoSwiss and by the city's office for environmental protection, the measurement devices of the high-resolution air temperature measurement network take place only passively ventilated.As a consequence, errors due to radiative heating can affect the daytime measurements, particularly during heatwaves when solar radiation is high and the passive ventilation low.Gubler et al. [18] reported air temperature biases of up to +2 K and more during heatwaves at daytime in HW18 when compared to actively ventilated air temperature measurements at the official reference stations (Fig. 3).The underestimation of the modelled air temperatures at daytime can thus be partly explained by positive measurement biases of the unventilated air temperature measurements.
Another factor being potentially relevant for the model performance is the number and location of measurement sites used for the evaluation.During HW19.1 and HW19.2 some additional measurement sites were installed especially along potential cold air transects (triangles in Fig. 3).Therefore, the better performance of HW18 could be partly explained by the fact that the distribution of measurement sites was less susceptible to cold air influences in 2018 than in 2019.

Conclusion
The aim of this study is to model the spatiotemporal air temperature distribution at 3 m above ground on an hourly basis for the greater area of Bern during the three distinct extreme heatwaves HW18, HW19.1, and HW19.2.In order to evaluate the spatiotemporal model performance, the high-resolution temperature measurement network of Gubler et al. [18] is used.
It could be shown that MUKLIMO_3 is able to represent the diurnal variation of air temperatures in the greater area of Bern for precipitation-free days during extreme heatwaves.Especially on calm, radiation-intensive days without thunderstorm formation and mesoscale cold air advection, the modelled air temperature shows a high degree of agreement with the measurement.While the underestimation of modelled air temperatures relative to measured air temperatures during the day is greater than the overestimation at night, the linear regression model fit is better for the maximum air temperatures in the afternoon than for the minimum air temperatures in the early morning.In part, the larger underestimation during daytime can be explained by a positive measurement bias of air temperatures by the unventilated measurement sites.HW19.1 performs worse than HW18 and HW19.2 with a larger median bias and a larger variability between modelled and measured air temperature and a larger RMSE both during daytime and nighttime.Potential explanations for this relate to a larger variability of wind directions, to a potentially more frequent influence of cold air advection, or to a more susceptible distribution of measurement stations for cold air advection compared to HW18.
From a spatial perspective, MUKLIMO_3 is capable of modelling higher air temperatures in the inner city, thereby forming a realistic UHI.However, the area of elevated air temperatures extends far beyond the city boundaries both during the day and at night, so that especially during nighttime (02:00 CEST) the air temperatures are overestimated by the model particularly in the surrounding area.In HW19.1, nocturnal cold air advection is not captured by MUKLIMO_3, so that the overestimation of air temperatures in the affected outskirts is particularly large.This is especially evident when considering the course of HW19.1, when at the beginning of the extreme heatwave there is almost no bias between modelled and measured air temperatures and this bias increased during the extreme heatwave due to more frequent cold air advection both during daytime and nighttime.
The study demonstrats the potential of using MUKLIMO_3 to obtain the spatiotemporal air temperature distribution during extreme heatwaves.Areas with increased air temperatures and thus greater heat stress during the day as well as at night can be identified and provide important indications for future urban development measures.In particular, the hottest locations in the inner city show the smallest bias A. Hürzeler et al. between modelled and measured air temperatures during the day as well as at night and are thus most realistically reproduced.In particular, days without cold air advection or precipitation and with little wind, which generally lead to the strongest heat stress during heatwaves, can be modelled realistically with MUKLIMO_3.However, days with changes in mesoscale background conditions during the simulation period lead to larger biases between modelled and measured air temperatures, which is also detected by Hollósi et al. [20].Compared to Bokwa et al. [3], this study found lower Pearson correlation values between measured and modelled air temperatures.However, Bokwa et al. [3] studied only the hottest day of the heatwave.Especially the days with mesoscale cold air advections, which are difficult to model with MUKLIMO_3, resulted in lower Pearson correlation values in this study.Furthermore, improvements in the model set-up would potentially result in modelling the full range of heatwave manifestations.This could contribute to urban climate adaptations, such as large urban development measures, and improve future public health during extreme heatwaves.
Results indicate a considerable daytime difference between model and observations, part of which might be due to errors in the latter.However, also the model needs to be evaluated further.Depending on the application, the bias can be more or less important.For instance, our regression analysis and the spatial pattern analyses show good results and hence for such applications, MUKLIMO_3 is a suitable tool.
Each LUC is described by 26 physical parameters, which contain information about building and vegetation characteristics as well as the degree of soil sealing (Table 3).All parameters are determined for a grid cell size of 50 × 50 m and averaged over all grid cells of the model area.The building parameters "building fraction of the first building type" (vg1), "wall area index of the first building type" (wai1) and "mean building height of the first building type" (h1) are derived using swiss-BUILDINGS 3D 2.0.For simplicity, all buildings are combined into one building type and a possible second building type (vg2, wai2, h2) is omitted."Fraction of impervious surface between buildings" (vs) refers to the proportion of the surface area that is unbuilt and is determined with Imperviousness Density 2018."Surface roughness of the non-builtup areas or areas below trees" (z0 in m) could not be determined with the available datasets, therefore, after consultation with the German Weather Service and the Central Institute for Meteorology and Geodynamics (ZAMG), a surface roughness of 0.2 is estimated for the nonbuilt-up areas of grid cells with buildings or trees and 0.1 for the nonbuilt-up areas of all other grid cells."Tree height" (hbm in m) and "stem height" (hst in m) is determined using Ginzler and Hobi [14], assuming half the tree height for the stem height and reducing the stem height by 2 m for large trees over 20 m. "Leaf area density in the tree top area" (bf0 in m − 1 ) and "leaf area density in the stem area" (bf1 in m − 1 )

Table A2
Physical parameters used for the individual LUCs.The fraction of impervious surface between buildings (vs) for water (LUC = 23) is set to − 1 by convention.Only the first 16 parameters are shown, as the other parameters are used for measures such as green roofs, changed albedo of walls, roofs or surfaces or a changed heat capacity.For this study, these were filled with model internal default values.has been set to 0.05 m − 1 and 0.8 m − 1 , respectively."Leaf area index of the canopy layer" (lai, dimensionless) is set to 1.5 for forests and close to 1 for all other LUCs."Vegetation height of the canopy layer" (hca in m) is determined for grid cells without tree cover by Ginzler and Hobi [14].In the case of tree cover, the vegetation height in Ginzler and Hobi [14] corresponds to the tree height, therefore the vegetation height of the canopy layer is estimated to 0.4-0.7 m. "Tree cover" (sigbm, dimensionless) is determined as the vertical projection of tree crowns to a horizontal earth's surface using Tree Cover Density 2018 [10]."Vegetational cover of the canopy layer" (sigma, dimensionless) is determined using "Amtliche Vermessung vereinfacht".All other parameters have not been adjusted and are therefore set with default values by MUKLIMO_3 [7,40].Buildings (vg1) and trees (sigbm) cannot occur in the same grid cell, so in each LUC the dominant factor is determined and the respective is ignored (see Fig. A5).

Fig. 1 .
Fig. 1.Information for model input data: A shows the altitude above sea level of the digital elevation model based on swissALTI 3D [46], B the fraction of impervious surfaces based on Imperviousness Density 2018 [9], C the building height based on swissBUILDINGS 3D 2.0 [47], and D the self-defined LUCs.Solid black lines show the city districts of Bern.

Fig. 3 .
Fig. 3. Measurement sites of the high-resolution air temperature measurement network in Bern classified according to operating years 2018 and 2019, and reference stations (operation in both years).① Indicating measurement site Bern/Zollikofen, ② measurement site Wankdorf, and ③ measurement site Bollwerk.Three rural stations are outside the model area and are therefore not considered for this study.Dark green are forests, light green are open green spaces, blue are water areas and light grey are sealed surfaces.

Fig. 4 .
Fig. 4. Boxplots of hourly air temperature difference of MUKLIMO_3 at 3 m above ground and the measured air temperatures separated into different hours of day for A HW18, B HW19.1, and C HW19.2.The box shows the interquartile range (IQR) with the 25th (bottom) and 75th (top) percentile.The black line in the box shows the median.The vertical lines refer to the minimum (bottom) and maximum (top) value of the data within ±1.5*IQR.Black dots denote outliers.The black triangles in A-C represent the median of the hourly air temperature difference of the actively ventilated measurement site Wankdorf, the black squares in A-C show the median of the hourly air temperature difference of the actively ventilated measurement site Bern/Zollikofen.The Taylor diagram in D shows hourly air temperatures at 3 m above ground for all available loggers separated into six groups of four hours each.HW18 is shown as dots, HW19.1 as triangles, and HW19.2 as rectangles.The Pearson correlation coefficient is related to the azimuthal angle, the RMSE is proportional to the distance from the black circle on the x-axis, and the standard deviation is proportional to the radial distance from the origin and labelled on the x-axis.

Fig. 5 .
Fig. 5. Maps of UHI showing the difference of the mean air temperature of MUKLIMO_3 of each grid cell compared to the grid cell of the measurement site Bern/ Zollikofen at 02:00 CEST and 18:00 CEST for each individual extreme heatwave.In black are all the buildings [47].

Fig. 6 .
Fig. 6.Maps showing the difference of the modelled air temperature of each individual grid cell containing a measurement site and the corresponding measured air temperature at 02:00 CEST and 18:00 CEST for each individual extreme heatwave.Positive values denote sites with higher modelled than measured air temperature, negative values denote sites with lower modelled than measured air temperature.Dark green indicate forests, light green open vegetation areas, light gray sealed surfaces and black buildings [47].

Fig. 7 .
Fig. 7. Maps showing the difference of the modelled air temperature of each individual grid cell containing a measurement site and the corresponding measured air temperature at 02:00 CEST and 18:00 CEST during HW19.1 for 26.06.19, 28.06.19, and 29.06.19.Positive values denote sites with higher modelled than measured air temperature, negative values denote sites with lower modelled than measured air temperature.Dark green are forests, light green are open green spaces, blue are water areas, light grey are sealed surfaces and black are buildings [47].

Fig. 8 .
Fig. 8. Boxplots of hourly air temperature difference of MUKLIMO_3 at 3 m above ground and the measured air temperatures separated into different LCZ for A HW18, B HW19.1, and C HW19.2.The box shows the interquartile range (IQR) with the 25th (bottom) and 75th (top) percentile.The black line in the box shows the median.The vertical lines refer to the minimum (bottom) and maximum (top) value of the data within ± 1.5*IQR.The black dots denote outliers.The Taylor diagram in D shows the hourly air temperatures at 3 m above ground over the entire diurnal cycle separated into LCZs.HW18 is shown as dots, HW19.1 as triangles, and HW19.2 as rectangles.The Pearson correlation coefficient is related to the azimuthal angle, the RMSE is proportional to the distance from the black circle on the xaxis, and the standard deviation is proportional to the radial distance from the origin and labelled on the x-axis.

Fig. A3 .
Fig. A3.Radar maps used to determine precipitation periods in the model area and exclude them from further analysis [22].

Fig. A4 .
Fig. A4.Maps of sensible heat flux from the ground (A) and outgoing longwave radiation (B) in MUKLIMO_3 on the second model day of each heat wave at 0:00 and 13:00.(C) Difference in sensible heat flux from the ground and outgoing longwave radiation at 13:00 between the third and first day of HW19.1.

Fig. A5 .
Fig. A5.Time series of sensible heat flux from the ground (black) and outgoing longwave radiation (red) in MUKLIMO_3 (average over the domain shown in Fig.A4) for all three heat waves.

Table 2
Periods excluded from further investigations due to precipitation.The times are given in CEST.

Table 3
Pearson correlation r of the air temperature difference of MUKLIMO_3 at 3 m above ground and the measured air temperatures between 02:00 CEST and 18:00 CEST the previous day (r 1802 ) as well as between the highest and lowest temperature of each measurement site (r minmax ) for HW18, HW19.1, and HW19.2, respectively.df and p represent the degrees of freedom and the p-value of the Pearson correlation, respectively.