Temperature trends in Europe: comparison of different data sources

Abstract

Temperature trends differ markedly not only region-to-region and between seasons but also depending on the selected dataset. Only a few studies have attempted to compare temperature trends between data sources of different types. Here, one station-based (ECA&D), two gridded (E-OBS; CRUTEM) and two reanalysis (ERA-40; NCEP/NCAR) datasets are used for long-term temperature change detection over Europe. The period from 1957 to 2002, when all the datasets overlap, is examined, and linear regression is used to calculate temperature trends in each season separately. Raster maps illustrating differences in trends between datasets are accompanied by mean temperature series showing the causes of these discrepancies. We demonstrate that trends in reanalyses deviate considerably from the other datasets mainly because the type and amount of data assimilated into them change in time. Interestingly, whilst ERA-40 shows lower trends due to an overestimation of the mean temperature prior to 1967, NCEP/NCAR reveals lower trends compared with other datasets owing to mean temperature underestimation at the end of the examined period. A noticeable anomaly in NCEP/NCAR data was detected in Eastern Europe in summer with temperature trends nearly twice as steep compared with other data sources. The study also reveals the weaknesses of gridded datasets, such as the unstable number of stations entering the interpolation over time. The lack of representativeness of some climate stations is the major drawback of the station data.

Introduction

Temperature increase is the most evident manifestation of current climate change. The rate of global mean surface temperature rise is 0.13 °C per decade in the period from 1956 to 2005 with stronger warming over the land than over the sea (Trenberth et al. 2007). Temperature trends differ markedly not only between land and sea but also between continents. van der Schrier et al. (2013) note that Europe is warming faster (0.41 °C per decade) than the global land average (0.27 °C per decade) over the period 1980–2010. These values point to conspicuously enhanced warming in Europe at the end of the twentieth century compared with other continents. In addition, temperature trends vary depending on season (e.g. Klein Tank et al. 2005; Brázdil et al. 2009; van Oldenborgh 2009; Mamara et al. 2016; Pokorná et al. 2018) and on the type of data source.

The dependence on the selected dataset is an essential, although often overlooked, factor in temperature trend estimation. Three major types of data sources are available for the detection of temperature trends. The first type is data measured at climate stations and collected into station databases, such as European Climate Assessment & Dataset (hereinafter ‘ECA&D’; Klein Tank et al. 2002), Global Historical Climatology Network (‘GHCN’; Peterson and Vose 1997) and historical instrumental climatological surface time series of the Greater Alpine Region (‘HISTALP’; Auer et al. 2007). These datasets are not independent of each other as many stations are included in all of these databases. The second type is data interpolated into a regular network of gridpoints, such as the European daily high-resolution gridded dataset of surface temperature and precipitation (‘E-OBS’; Haylock et al. 2008), Climatic Research Unit Temperature (‘CRUTEM’ and together with sea surface temperature from Hadley Centre ‘HadCRUT’; Osborn and Jones 2014) and Goddard Institute for Space Studies (‘GISS’; Hansen et al. 2010). Finally, reanalyses have also been used for trend detection. Widely used are the project of the National Centers for Environmental Prediction and the National Center for Atmospheric Research (‘NCEP/NCAR’; Kalnay et al. 1996), and the ERA-40 and ERA-Interim reanalysis products (superseded by ERA5 at the advanced stage of our writing) of the global atmosphere and surface conditions produced by the European Centre for Medium-Range Weather Forecasts (ECMWF; Uppala et al. (2005) and Dee et al. (2011a)).

The suitability of individual datasets for determining temperature trends is a matter of debate. Station data are sometimes used as a ‘true’ reference (e.g. Mooney et al. 2011; Marshall et al. 2018). However, even the station data cannot be considered perfect: their deficiencies can be seen in their potential inhomogeneity caused by changes in station location, environment, instrumentation and observing practices (Klein Tank et al. 2002; Aguilar et al. 2003; Walton and Hall 2018) and unrepresentative spatial coverage (Klein Tank et al. 2002). Disadvantages of gridpoint data can be found in inaccurate interpolation (Hofstra et al. 2009), in the time-varying number of stations used in interpolation, which may induce inhomogeneities, and in creating a relatively dense grid from a sparse network of stations (Kyselý and Plavcová 2010), which leads to insufficiently constrained gridpoint values (Hofstra et al. 2012). In the case of reanalyses, a change in the type and amount of assimilated data during the reanalysis run may cause a distortion of trends (Bengtsson et al. 2004). The main examples are the incorporation of satellite data from the late 1970s (Kalnay et al. 1996; Uppala et al. 2005) and upper-air data from the 1950s (Stickler et al. 2014). Thorne and Vose (2010) point out that these unphysical time-varying biases at best reduce the utility of reanalyses for long-term trend monitoring, which was thoroughly discussed by Dee et al. (2011b).

Only a few studies have attempted to compare temperature trends between data sources of different types. Simmons et al. (2004) compare CRUTEM temperature trends with ERA-40 and NCEP/NCAR reanalyses worldwide and conclude that the interpolated dataset exhibits larger temperature trends than the reanalyses. Vose et al. (2012) discuss differences in trend values between the US station network and recent atmospheric reanalyses, and Wang et al. (2013) do likewise for China. Furthermore, van der Schrier et al. (2013) find E-OBS temperature trends to be comparable with other interpolated datasets but substantially different from trends calculated from reanalyses. Jones et al. (2012) compare CRUTEM, ERA-40 and ERA-Interim temperature series and find excellent agreement between CRUTEM and ERA-Interim in the Northern Hemisphere but less correspondence between CRUTEM and ERA-40, with higher ERA-40 temperatures before the 1970s in all seasons, which reduces temperature trends.

The objective of this paper is to provide a detailed comparison of seasonal temperature trends over Europe between databases of all three types. Temperature trends and their statistical significance for each of these datasets in the period 1957 to 2002 are calculated and mapped. Moreover, the suitability of these databases to evaluate long-term temperature changes is assessed and their weaknesses that can lead to an overestimation or underestimation of temperature trends are discussed.

Data and methods

The data used in this study include three different types of data sources, viz., station data, data interpolated onto a regular grid, and reanalyses, each type of dataset being represented by one or two examples. Mean temperature is analyzed and temperature trends are calculated for each conventional climatological season separately in the period 1957–2002 in which all datasets overlap. The area of interest is Europe, which has a good availability of data sources and a relatively dense network of observations and thus offers conditions suitable for comparison.

Station data are taken from the ECA&D database (Klein Tank et al. 2002), which contains publicly available data from nearly 6600 European stations (at the time of writing) and is gradually expanding (van der Schrier et al. 2012). Spatially irregular meteorological observations interpolated to a regular grid are represented by the E-OBS and CRUTEM datasets. E-OBS (Haylock et al. 2008) is a European land-only daily high-resolution (0.25° × 0.25°; see Table 1) gridded dataset of maximum, mean, and minimum surface temperatures, precipitation amounts and sea level pressure. It covers the period from 1950 until present and is produced from a network of ECA&D stations. CRUTEM (Jones et al. 2012) is a gridded dataset of monthly surface air temperature anomalies over global land drawing data from a wide range of station networks including ECA&D, GHCN and HISTALP. Data are available on a 5° × 5° grid for each month from January 1850 to present. ERA-40 and NCEP/NCAR reanalyses are selected as representatives of global reanalyses because of their widespread use and sufficient temporal coverage. ERA-40 (Uppala et al. 2005) covers the period from September 1957 to August 2002 on a 2° × 2° grid, and surface air temperature is available at sub-daily resolution. NCEP/NCAR (Kalnay et al. 1996) is another global reanalysis product with a sub-daily resolution and temporal coverage from January 1948 to present, available on a 2.5° × 2.5° grid.

Table 1 Specification of regular datasets used in the study

From ECA&D, only homogeneous stations with complete series of daily mean temperatures in the analysis period are selected. Regardless of the method of calculation of daily temperature means and whether the series are blended or non-blended, the completeness criterion is met by 92 (Fig. 1a) out of more than 1000 publicly available meteorological stations that we pre-selected on the basis of their geographical location and sufficiently long series. The original spatial resolutions were utilized for CRUTEM, ERA-40 and NCEP/NCAR; for E-OBS, we chose the version averaged from the original 0.25° × 0.25° into 2° × 2° grid cells, publicly available from the KNMI Climate Explorer, which should provide a fair comparison.

Fig. 1

Spatial distribution of stations and gridpoints for the five datasets. The stations referenced in the text are circled in red and numbered

The resolutions and analysis domains of the four regular datasets are summarized in Table 1 and the spatial distribution of stations and gridpoints is displayed in Fig. 1.

Temperature trends are calculated for each conventional season separately; for the sake of brevity, however, only the results for winter and summer are displayed. Seasonal mean temperatures are calculated as 3-month averages, derived from daily values for ECA&D and E-OBS and from monthly mean values for CRUTEM, ERA-40 and NCEP/NCAR. Linear temperature trends are calculated by ordinary linear regression.
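As a minimal sketch of the trend calculation described above, the following Python snippet fits an ordinary least-squares line to a seasonal mean series and expresses the slope per decade. The function names and the synthetic series are illustrative assumptions, not the processing chain actually used in the study.

```python
from statistics import mean

def seasonal_mean(monthly_means):
    """Average the three monthly means of one season into a seasonal mean."""
    return mean(monthly_means)

def linear_trend_per_decade(years, temps):
    """Ordinary least-squares slope of temperature on time, in degC per decade."""
    y_bar = mean(years)
    t_bar = mean(temps)
    sxy = sum((y - y_bar) * (t - t_bar) for y, t in zip(years, temps))
    sxx = sum((y - y_bar) ** 2 for y in years)
    return 10.0 * sxy / sxx

# Synthetic illustration (assumption): a series warming by exactly 0.03 degC per year.
years = list(range(1958, 2003))
temps = [0.03 * (y - 1958) for y in years]
trend = linear_trend_per_decade(years, temps)  # close to 0.3 degC per decade
```

Multiplying the yearly slope by 10 gives the unit of °C per decade used throughout the text.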

Statistical significance of trends is evaluated by Student's t-test of correlation coefficients between temperature and time. The significance may be exaggerated in the presence of autocorrelation in the temperature time series; that is, a significant trend may be detected even in the absence of a real trend. The value above which the autocorrelation becomes significant at the 5% level under the two-sided hypothesis is 0.297. Autocorrelation values fall below this critical value, that is, can be considered zero, at all stations or gridpoints in all datasets in both seasons except the stations of Smolensk in winter and Heraklion and Isparta in summer. These stations were discarded from further analysis for the sake of simplicity.
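The critical value of 0.297 can be reproduced with a short calculation, sketched below in Python. It assumes that the autocorrelation is taken at lag 1 and that 45 seasonal values therefore yield 44 pairs (42 degrees of freedom); this is one reading consistent with the quoted number, not a statement of how the value was actually derived. The tabulated critical t value is hard-coded, and all function names are hypothetical.

```python
import math

# Tabulated two-sided 5% critical value of Student's t for 42 degrees of freedom.
T_CRIT_DF42 = 2.018

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def t_statistic(r, n):
    """Student's t statistic for a correlation r estimated from n pairs."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))

def critical_correlation(t_crit, n):
    """Smallest |r| that is significant for n pairs, given the critical t."""
    return t_crit / math.sqrt(t_crit ** 2 + (n - 2))

def lag1_autocorrelation(series):
    """Correlation between the series and itself shifted by one time step."""
    return pearson(series[:-1], series[1:])

# Assumption: 45 seasonal values give 44 lag-1 pairs, hence 42 degrees of freedom.
r_crit = critical_correlation(T_CRIT_DF42, 44)
print(round(r_crit, 3))  # 0.297
```

The same `t_statistic` function applies to the correlation between temperature and time, in which case `n` is the number of years in the series.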

In order to evaluate whether the locally detected trends could have occurred by mere chance (e.g. Wilks 2016), the global (field) significance is assessed by the Monte Carlo approach. To this end, mean temperature maps for 45 years are randomly shuffled 1000 times. The number of gridpoints (stations) with statistically significant trends is then calculated for each shuffled series of maps. If the real number of gridpoints (stations) with statistically significant trends exceeds the 95th percentile of values from the shuffled series, the trend map (pattern) is considered significant at the 5% level.
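The Monte Carlo procedure can be sketched in Python as follows. This is a minimal illustration under stated assumptions: whole yearly maps are permuted in time so that the spatial structure is preserved, local significance is judged by comparing the correlation with time against a fixed critical value (0.294 is used as an approximate 5% two-sided threshold for 45 values), and the data are synthetic. Function names and parameters are hypothetical.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def count_significant(maps, r_crit):
    """Count gridpoints whose correlation with time exceeds r_crit in magnitude.

    maps is a list of yearly maps; each map is a list of gridpoint values.
    """
    n_years = len(maps)
    time = list(range(n_years))
    count = 0
    for p in range(len(maps[0])):
        series = [maps[t][p] for t in range(n_years)]
        if abs(pearson(time, series)) > r_crit:
            count += 1
    return count

def field_significance(maps, r_crit, n_shuffles=1000, seed=0):
    """Compare the observed count with the 95th percentile of shuffled counts."""
    rng = random.Random(seed)
    observed = count_significant(maps, r_crit)
    shuffled = list(maps)
    null_counts = []
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)  # reorder whole maps, keeping spatial structure
        null_counts.append(count_significant(shuffled, r_crit))
    null_counts.sort()
    threshold = null_counts[math.ceil(0.95 * n_shuffles) - 1]
    return observed, threshold, observed > threshold

# Synthetic demo: 45 yearly maps of 20 gridpoints, all warming by 0.05 degC per year.
rng = random.Random(1)
maps = [[0.05 * t + rng.gauss(0.0, 0.3) for _ in range(20)] for t in range(45)]
observed, threshold, significant = field_significance(maps, 0.294, n_shuffles=200)
```

Shuffling complete maps rather than individual gridpoint series keeps neighbouring gridpoints correlated in the null distribution, which is the reason a field test is needed in the first place.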

Thanks to the large number of stations and gridpoints, it was possible to plot maps revealing the spatial distribution of temperature trends in Europe for each dataset and season. To depict differences between observation-based data sources and reanalyses, temperature trends at ERA-40 and NCEP/NCAR gridpoints are subtracted from trends at the nearest E-OBS and CRUTEM gridpoints if their distance is less than 50 km. These differences are then mapped. Statistical significance of these differences is calculated using the Fisher test for the equality of two correlation coefficients (between temperature and time) for the two compared (subtracted) gridpoints. In the next step, temperature anomalies from the 1971–1990 period are plotted for each dataset for domains where substantial differences in trends between datasets occur, in order to disclose the causes of the differences. The reference period (1971–1990) is the same for all displayed temperature anomaly series in this paper and was chosen as a 20-year period in the middle of the analyzed period.
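The gridpoint matching and the test for the equality of two correlation coefficients might look as follows in Python. The sketch assumes great-circle (haversine) distances on a spherical Earth and the standard Fisher z-transformation test for two independent correlations; whether the study used exactly these formulations is not stated, and the function names are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def nearest_within(point, candidates, max_km=50.0):
    """Return the candidate (lat, lon) closest to point, or None if it is too far."""
    best = min(candidates, key=lambda c: haversine_km(point[0], point[1], c[0], c[1]))
    distance = haversine_km(point[0], point[1], best[0], best[1])
    return best if distance < max_km else None

def fisher_z_test(r1, n1, r2, n2):
    """Two-sided test that two correlation coefficients are equal.

    Returns the z statistic and the p-value from the standard normal distribution.
    """
    z1, z2 = math.atanh(r1), math.atanh(r2)
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z1 - z2) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p

# Hypothetical example: correlations with time of 0.8 and 0.2 over 45 years each.
z, p = fisher_z_test(0.8, 45, 0.2, 45)
```

In this example the difference would be flagged as significant at the 5% level, whereas two equal correlations give a p-value of 1.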

Results

This section is organized as follows: first, we present average temperature trends in Europe as well as their spatial distribution as estimated from all datasets. Then, we assess the differences in temperature trends between datasets and discuss their possible causes.

European long-term temperature changes

Table 2 shows annual and seasonal trends averaged over all stations or gridpoints for all five datasets together with their field significance. In the period 1957–2002, most of the temperature trends are positive in all data sources and in all seasons except autumn, when trends are close to zero. Based on the Monte Carlo method, trends are globally statistically significant at the 5% level in all datasets and all seasons except autumn and ERA-40 in winter. The lack of warming in autumn in the second half of the twentieth century was noticed in large parts of Europe earlier, e.g. by Brázdil et al. (2009) and Pokorná et al. (2018). Averaged over the 92 selected ECA&D stations, the annual temperature trend in Europe is 0.22 °C per decade, with the most pronounced warming in winter (0.34 °C per decade). For the E-OBS interpolated dataset, averaging over 325 gridpoints yields a warming of 0.21 °C per decade, with the largest trend also observed in winter (0.35 °C per decade). CRUTEM, ERA-40 and NCEP/NCAR exhibit somewhat weaker temperature trends of 0.18 °C, 0.16 °C and 0.17 °C per decade, respectively. Whilst the different trends calculated from CRUTEM may be caused by its low resolution (5° × 5°), which may not reflect a potentially high spatial variability of real trends in a spatially heterogeneous area, a different type of data source is likely the reason for weaker trends in the ERA-40 and NCEP/NCAR reanalyses.

Table 2 Mean temperature trends averaged over all stations / gridpoints. The numbers in brackets demonstrate at what level the trend map (pattern) is significant for particular season and dataset

Similarly to our results, Hartmann et al. (2013) detect a higher temperature trend over global land in the GHCN station network (0.197 °C per decade in the period 1950–2010) than in the CRUTEM dataset (0.175 °C per decade). Compo et al. (2013) calculate the global land temperature trend to be only 0.13 °C per decade in ERA-40 in the same period.

Figure 2 displays the geographical distribution of temperature trends in winter. The trends exhibit a considerable spatial variability, with the strongest warming (more than 0.6 °C per decade) in the Baltic region, and substantial cooling (about 0.3 °C per decade) in Turkey.

Fig. 2

Winter temperature trends in Europe, 1957–2002, for (a) E-OBS with ECA&D as dots, (b) CRUTEM, (c) ERA-40, (d) NCEP/NCAR. Areas with statistically significant trends are denoted by hatching. Station symbols with statistically significant trends are filled with a black dot

All data sources more or less reveal this geographical distribution of temperature trends, but several differences are easy to see. ERA-40 shows only small positive or even near-zero trends in northern Scandinavia, whilst warming rates in the other data sources exceed 0.30 °C per decade there. Additionally, NCEP/NCAR reanalysis is the only dataset that does not detect positive trends in southwestern Europe, and it locates the boundary between near-zero trends in the south and considerable warming in the north farther north than other data sources. CRUTEM has smoother features than other datasets due to its lower spatial resolution. Interestingly, although CRUTEM trends in western and southwestern Europe are not as strong as in E-OBS and ERA-40, they are statistically significant there due to a smaller year-to-year variability. The most striking peculiarity of E-OBS is the strong trends in southern Scandinavia, which exceed 0.75 °C per decade, more than in any other dataset in any area. Furthermore, near-zero trends prevail in Turkey in ECA&D and CRUTEM data compared with considerably negative trends in both reanalyses and E-OBS.

A noticeable northeast-southwest gradient is apparent in trend values in summer (Fig. 3). Whilst warming rates are well above 0.15 °C and quite often above 0.30 °C per decade in southern and central Europe, near-zero trends prevail in northern and to a large extent also in Eastern Europe; this pattern holds for all datasets except NCEP/NCAR. The spatial distribution of temperature trends exhibits larger differences between datasets than in winter, with NCEP/NCAR reanalysis deviating most. The area with warming is much more extensive in NCEP/NCAR data, reaching up to Poland, Ukraine and Russia, where trends are even the strongest. Another notable peculiarity can be found in E-OBS data in southeastern Turkey and Syria, where the rate of warming exceeds that in the other data sources by up to 0.30 °C per decade.

Fig. 3

Same as Fig. 2, except for summer

Comparisons between datasets

As mentioned above, there are relatively large differences between temperature trend values estimated from the five data sources, not only in terms of the average value of warming but also in its geographical distribution. All the dataset types have specific features that lead to trend biases, both locally and in general; they are presented in more detail, e.g. in Klein Tank et al. (2002), Bengtsson et al. (2004), Hofstra et al. (2009), Kyselý and Plavcová (2010), Thorne and Vose (2010), Hofstra et al. (2012) and Director and Bornn (2015). In this section, we first briefly describe the deficiencies of the station data and interpolated datasets that are revealed when they are compared with each other. Subsequently, this group of ‘observation-based’ sources is compared with reanalyses in more detail.

One potential deficiency of station data is the lack of representativeness of climate stations for their wider surroundings (see also Liu et al. 2018). In most studies, the station network considered consists of stations with long homogeneous temperature series, and it is customary that such stations are located in or near cities. The amplifying effect of the urban heat island thus may lead to the overestimation of temperature trends calculated from station data compared with regular networks into which also the non-urban stations with frequently incomplete or short series are incorporated (Haylock et al. 2008). For instance, the Prague-Klementinum meteorological station, which is located in the very city centre, exhibits a winter temperature trend of 0.52 °C per decade whereas the closest gridpoints of E-OBS and CRUTEM show trends below 0.35 °C per decade. Similar features occur at high-altitude stations. The Sonnblick climate station, located on a mountain summit at an altitude of 3106 m, exhibits spring and summer temperature trends of 0.38 °C and 0.46 °C per decade, respectively, whereas the nearest gridpoints (in fact representing box averages) of E-OBS, ERA-40 and NCEP/NCAR show trends below 0.20 °C per decade and the CRUTEM trend is 0.30 °C per decade in both seasons. The explanation may be the altitude mismatch between the station and the gridpoints and the fact that the snow-albedo feedback modifies temperature trends at high-elevation stations (Scherrer et al. 2012), which results in a higher temperature trend at a high-elevation mountain station compared with its lower-lying surroundings.

Compared with previous studies on differences between stations and gridded data (Klein Tank et al. 2002; Jones et al. 2012; Xu et al. 2014), we newly found areas (thanks to the spatial analysis of trends, high resolution of datasets and a high number of climate stations) where a higher rate of warming is detected in the interpolated datasets than at the surrounding weather stations. In most of these cases, the difference appears to have been caused by an inhomogeneity of the interpolated dataset. Figure 4 shows such a case: winter temperature anomaly series for the Lyngor station (southeastern Norway) and nearby gridpoints. A marked inhomogeneity occurs in 1995: E-OBS temperature anomalies are higher than the station data and ERA-40 after 1995, whilst before 1995, the opposite tends to be the case. Since ERA-40 temperatures after 1995 are closer to the Lyngor temperatures, it can be concluded that the inhomogeneity occurs in E-OBS rather than in the station series. The possible reason for the inhomogeneity is a sudden jump in the number of stations used for interpolation in E-OBS in 1995 (Haylock et al. 2008), when several stations in southwestern Sweden (e.g. Kropefjall, Rangedala) were added into the database from which the E-OBS data are calculated. This is a likely cause of the markedly higher trends (by more than 0.2 °C per decade) in the E-OBS database. It confirms the statement of Kyselý and Plavcová (2010) that the time-varying number of stations entering the interpolation (adding stations with above or below average temperature) can induce an artificial trend. However, entering more stations into the interpolation procedure, even non-homogeneous ones (Cornes et al. 2018), is generally beneficial since it provides a more complete and regular coverage of the area, and hence a more accurate assessment of temperature interpolation (Haylock et al. 2008).

Fig. 4

Mean winter temperature anomalies at the 59.25° N × 10.25° E E-OBS gridpoint, 60° N × 10° E ERA-40 gridpoint and at the Lyngor climate station (58.60° N, 9.20° E)

Whereas the differences between the station and interpolated data are typically only local (though conspicuous), reanalyses differ more noticeably on average and typically on larger areas. To depict differences between observation-based sources and reanalyses, temperature trends at ERA-40 and NCEP/NCAR gridpoints were subtracted from the nearest E-OBS and CRUTEM gridpoint values, and the differences were mapped.

In winter, E-OBS and CRUTEM tend to exhibit a higher temperature trend than ERA-40, but the differences vary spatially (Fig. 5, top). The most substantial difference (although statistically significant only at several gridpoints) between them occurs in northern Scandinavia and Finland: north of 60° N, spatially averaged E-OBS and CRUTEM trends are higher than ERA-40 trends by 0.19 °C per decade and 0.16 °C per decade, respectively. This is in good agreement with van der Schrier et al. (2013) and Wang et al. (2013), who detected the greatest discrepancy between the E-OBS and CRUTEM datasets and the 20th Century Reanalysis also in Scandinavia and northeastern Europe, the reanalysis underestimating trends by 0.1 °C per decade. Similarly, Marshall et al. (2018) reveal a smaller magnitude of warming in four examined reanalyses in Arctic Fennoscandia than in observations. On the other hand, ERA-40 exhibits slightly higher trends around the Black Sea and Aegean Sea (statistically significant) compared with E-OBS and in Western Europe compared with CRUTEM.

Fig. 5

Differences in winter temperature trends between datasets. Areas with statistically significant differences are denoted by hatching

To uncover the causes of these discrepancies, we plot mean winter temperature anomalies for the individual datasets and their differences for the domains where substantial discrepancies in trends occur. First, the area-averaged temperature for northern Europe (north of 60° N) in ERA-40 is compared with E-OBS and CRUTEM (Fig. 6). The largest difference occurs before 1967, when ERA-40 is up to 0.5 °C warmer than E-OBS and CRUTEM. After 1990, ERA-40 becomes slightly cooler, especially in comparison with E-OBS.
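The anomaly and difference series used in this kind of comparison can be sketched as follows. The snippet is a minimal Python illustration with two hypothetical, synthetic series; it assumes that anomalies are taken relative to the 1971–1990 mean of each dataset, as stated in the methods, and that differences are formed year by year.

```python
from statistics import mean

def anomalies(years, temps, ref_start=1971, ref_end=1990):
    """Subtract the mean over the reference period from every value."""
    reference = [t for y, t in zip(years, temps) if ref_start <= y <= ref_end]
    baseline = mean(reference)
    return [t - baseline for t in temps]

def difference_series(anoms_a, anoms_b):
    """Year-by-year difference between two anomaly series."""
    return [a - b for a, b in zip(anoms_a, anoms_b)]

# Hypothetical series: dataset B is 0.5 degC warmer at the start but warms more slowly.
years = list(range(1958, 2003))
series_a = [0.02 * (y - 1958) for y in years]
series_b = [0.5 + 0.01 * (y - 1958) for y in years]
diff = difference_series(anomalies(years, series_a), anomalies(years, series_b))
```

Because each series is referenced to its own 1971–1990 mean, a constant offset between datasets drops out and only the time-varying part of the discrepancy remains, which is what distorts the trend.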

Fig. 6

Mean temperature anomalies in northern Europe (averaged over the area north of 60° N) in winter in the E-OBS, CRUTEM and ERA-40 datasets (top) and their differences (bottom)

At the turn of 1966 and 1967, the number of observations assimilated into ERA-40 increased sharply (Simmons et al. 2004), especially in large European countries such as Germany, France, Spain and Sweden. Together with a near-surface warm bias in the background forecast (Simmons et al. 2004), this is probably the reason for the overestimation of mean temperature before 1967 in ERA-40, which leads to smaller (i.e. less positive or more negative) temperature trends. This specific feature of ERA-40 appears to be common in all seasons for all the areas where trends are underestimated by ERA-40. However, an underestimation of trends by a reanalysis produced by ECMWF (whether ERA-40 or ERA-Interim), though less evident, is present in the European Arctic even for a more recent period (after 1967) (Lindsay et al. 2014; Marshall et al. 2018).

NCEP/NCAR reanalysis, similarly to ERA-40, shows lower temperature trends in winter than the observation-based sources (Fig. 5, bottom). However, the spatial distribution of differences is entirely different from ERA-40: NCEP/NCAR exhibits lower trends across almost all of Europe, except for its northeast, compared with E-OBS, and especially in southeastern and central Europe compared with CRUTEM (Fig. 5, bottom). In the Pyrenees, the southern part of Scandinavia, the Near East and the Pannonian basin, where the differences are statistically significant at most gridpoints (except southern Scandinavia), the temperature trends in NCEP/NCAR are even more than 0.30 °C per decade below the trends in E-OBS. A comparison with other data sources indicates a strong underestimation of trends by NCEP/NCAR in these areas, except for southern Scandinavia, where it probably reflects the inhomogeneity in E-OBS, as was demonstrated above in the comparison with station data in Fig. 4.

Figure 7 reveals a clear underestimation of mean temperatures by NCEP/NCAR after 1990 in the entire domain when compared with E-OBS and CRUTEM. For example, whilst in 2001 the mean winter European temperature anomaly is estimated to be 1.98, 2.02 and 1.86 °C in E-OBS, CRUTEM and ERA-40, respectively, the anomaly in NCEP/NCAR is only 1.24 °C. Since some reanalyses do not assimilate surface temperature over land (e.g. NCEP/NCAR; Kalnay et al. 1996) and therefore are not directly sensitive to near-surface properties, they may not capture well the substantial changes in land use and land cover in recent years leading to further temperature changes (Fall et al. 2009). A well-marked underestimation of mean temperature after 1990 is documented by van der Schrier et al. (2013) for the 20th Century Reanalysis, which only uses surface pressure data (Compo et al. 2013).

Fig. 7

As in Fig. 6, except for the entire domain and E-OBS, CRUTEM and NCEP/NCAR

In areas where NCEP/NCAR trends deviate most from the other datasets (Pannonian basin, Near East, the Pyrenees), the underestimation of temperatures after 1990 is accompanied by the overestimation of temperatures before 1967 (similar to ERA-40), resulting in considerably lower values of NCEP/NCAR trends. For example, temperatures in NCEP/NCAR are 0.2 to 1 °C higher in France and Spain than in the other datasets before 1967 (not shown). Where NCEP/NCAR reveals higher temperature trends than observation-based datasets (northeastern Europe), mean temperatures are underestimated before 1967. The explanation for this effect is not clear and we are not aware of any study in which it has been reported and discussed.

The underestimation of trends in ERA-40 as compared with E-OBS and CRUTEM is also obvious in summer (Fig. 8, top), especially in Western (see, e.g. the statistical significance of differences over the British Isles) and southwestern Europe. The anomaly in eastern Turkey is probably caused by the overestimation of E-OBS trends as the same effect appears in the comparison with NCEP/NCAR (Fig. 8, bottom).

Fig. 8

Differences in summer temperature trends between datasets. Areas with statistically significant differences are denoted by hatching

NCEP/NCAR behaves differently in summer than in winter. Whilst winter temperature trends are underestimated by NCEP/NCAR even more than by ERA-40, summer trends in NCEP/NCAR are noticeably higher compared with other data sources, e.g. by more than 0.2 °C per decade in Eastern Europe, where the difference is statistically significant at most gridpoints, especially in the southeast. As an example, we plot mean summer temperature anomalies for the Askania station (southern Ukraine) and for the closest gridpoint of each dataset (Fig. 9). NCEP/NCAR clearly underestimates mean temperatures in the 1960s by almost 2 °C there compared with the other datasets. Summer temperature trends in southern Ukraine are consequently about 0.5 °C per decade in NCEP/NCAR, but only 0.1 °C per decade in other data sources. This feature appears throughout Eastern Europe, most strongly in the belt from the Black Sea to the Baltic Sea, and is also present in spring, although less pronounced, but not in winter and autumn. It can be attributed to an NCEP/NCAR data input error in 1948–1967 (Reid et al. 2001; Quan et al. 2003) resulting in a negative bias of sea level pressure, observed especially in Asia (Ho et al. 2003).

Fig. 9

Mean summer temperature anomalies at the Askania station (46.5° N × 33.9° E) and at the 47.25° N × 32.25° E E-OBS gridpoint, 48° N × 32° E ERA-40 gridpoint, 47.5° N × 32.5° E CRUTEM gridpoint and 47.5° N × 32.5° E NCEP/NCAR gridpoint

A similar discrepancy between the NCEP/NCAR and ERA-40 reanalyses was found by Greatbatch and Rong (2006) in sea level pressure (SLP) trends. They detected higher summer SLP trends estimated by NCEP/NCAR than by ERA-40, especially in Asia with islands extending to the Black Sea region, due to the mean SLP underestimation by NCEP/NCAR in the 1960s.

Discussion and conclusions

This study is, to our knowledge, the first to compare temperature trends estimated in three types of datasets (station data, data interpolated onto a regular grid and reanalyses). It helps us to better distinguish which type of data underestimates or overestimates trends, compared with studies assessing only two dataset types (Simmons et al. 2004; Vose et al. 2012; Donat et al. 2014; Marshall et al. 2018). Additionally, we use two gridded networks and two reanalyses to determine if it is the type of data source that plays a key role in the trend bias. The comparison is performed by area-averaged trends and trend maps for all the five datasets, and by temperature anomaly series for domains where substantial differences in trends between datasets occur.

In 1957–2002, Europe warmed up by 0.17 to 0.22 °C per decade depending on dataset. The datasets agree in that temperature trends reveal a high seasonal variability with the strongest warming in winter and near-zero trends in autumn. The spatial variability of trends is noticeable as well (strongest warming in the Baltic region and substantial cooling in Turkey in winter and the northeast-southwest warming gradient in summer) but varies slightly between the datasets. We disclosed a weaker warming in Scandinavia in ERA-40 compared with other sources in winter, and a stronger ERA-40 warming in the Pyrenees in winter and in summer, similarly to Scherrer et al. (2006); trends closer to zero in ECA&D and CRUTEM compared with more negative trends in other data sources in Turkey in winter; a less pronounced northeast-southwest warming gradient in CRUTEM compared with other sources in summer, similarly to van Oldenborgh et al. (2009); and a strong warming in eastern Europe in NCEP/NCAR in summer, not observed in any other dataset and not noticed in any previous study.

The results of this study show that temperature trends are most distorted in reanalyses. Two previously known biases lead to an underestimation of temperature trends: the overestimation of temperature prior to 1967 by ERA-40, also reported by Simmons et al. (2004), and the underestimation of temperature by reanalyses after 1990, also reported by van der Schrier et al. (2013). In addition, we newly detected a strong underestimation of temperature by NCEP/NCAR in the 1960s in Eastern Europe in summer, which leads to exaggerated warming. Our results thus confirm the statements about the reduced utility of reanalyses for long-term trend monitoring (Thorne and Vose 2010), the speculations about the sensitivity of reanalyses to near-surface properties (Fall et al. 2009), and the discussions about how changes in the type and amount of data supplied to the reanalysis assimilation during its run generate fictitious trends (Bengtsson et al. 2004). The potential inhomogeneity in 1979 (introduction of satellite data), often highlighted by reanalysis producers (Kalnay et al. 1996; Uppala et al. 2005), is not detected in our results, probably because only near-surface temperature is investigated and because Europe is a data-rich region where satellite data do not add much information. Nevertheless, as shown by Cornes and Jones (2013) for temperature extremes, trends in reanalyses and observations are much more comparable if only the period starting in 1979 is considered. We concede that additional uncertainties might be introduced into our analysis by the fact that the datasets on regular grids are not compared on exactly the same domains or at the same resolution.
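The way a time-limited bias distorts a fitted trend can be illustrated with a synthetic series. The sketch below is only a toy example under assumed values: the "true" warming of 0.2 °C per decade, the bias magnitude of 0.5 °C, and the helper name `slope_per_decade` are all hypothetical and are not taken from any of the datasets. It shows that a warm bias confined to the start of the period lowers the fitted trend, whereas a cold bias confined to the 1960s raises it.

```python
def slope_per_decade(years, temps):
    """Ordinary least-squares slope of temps against years, in units per decade."""
    n = len(years)
    mean_year = sum(years) / n
    mean_temp = sum(temps) / n
    sxy = sum((y - mean_year) * (t - mean_temp) for y, t in zip(years, temps))
    sxx = sum((y - mean_year) ** 2 for y in years)
    return 10.0 * sxy / sxx


years = list(range(1957, 2003))

# Hypothetical bias-free series warming by 0.02 degC per year
truth = [0.02 * (y - 1957) for y in years]

# Hypothetical warm bias of 0.5 degC before 1967
warm_early = [t + 0.5 if y < 1967 else t for y, t in zip(years, truth)]

# Hypothetical cold bias of 0.5 degC during 1960-1969
cold_sixties = [t - 0.5 if 1960 <= y <= 1969 else t for y, t in zip(years, truth)]

true_trend = slope_per_decade(years, truth)
warm_trend = slope_per_decade(years, warm_early)
cold_trend = slope_per_decade(years, cold_sixties)

print(f"bias-free:      {true_trend:.3f} degC/decade")
print(f"warm bias <1967: {warm_trend:.3f} degC/decade")
print(f"cold bias 1960s: {cold_trend:.3f} degC/decade")
```

Because both biases sit in the first half of the record, they tilt the regression line in opposite directions even though the underlying warming is identical in all three series.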

If area-averaged trends are assessed, the observation-based sources agree more closely with each other than with the reanalyses. Nevertheless, our results show conspicuous local disparities between station and gridded data, caused by unrepresentative station locations and by the time-varying number of stations (often not homogeneous) entering the interpolation; we therefore recommend not overlooking them. We refer again to the jump in the number of stations interpolated into E-OBS in 1995 (Haylock et al. 2008), which probably caused the inhomogeneity in the E-OBS temperature series in southern Scandinavia. Potential disparities caused by comparing fully homogeneous ECA&D stations with gridpoints interpolated from non-homogeneous stations were not observed in our study.
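How a change in the contributing station network alone can create a step in a gridded series is shown by the toy example below. It is a deliberately simplified sketch, not the E-OBS interpolation scheme: the plain average in the hypothetical function `network_mean`, the two stations, and their temperature offsets are all assumptions made for illustration, and no climate signal is included.

```python
def network_mean(year, stations):
    """Plain average over the stations reporting in `year` (illustrative only)."""
    active = [s["offset"] for s in stations if s["start"] <= year]
    return sum(active) / len(active)


stations = [
    {"start": 1957, "offset": 0.0},   # hypothetical long-running station
    {"start": 1995, "offset": -2.0},  # hypothetical colder site added in 1995
]

# Each station is constant in time, so any change in the mean is an artefact
series = {year: network_mean(year, stations) for year in range(1957, 2003)}
jump = series[1995] - series[1994]
print(f"step introduced in 1995: {jump:.1f} degC")
```

Interpolation on anomalies rather than absolute values reduces this effect, but a residual step can remain when the added stations differ in homogeneity or representativeness from the existing ones.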

Our results indicate that temperature trends are not accurate in any type of data and differ considerably among datasets, especially when they are assessed spatially and at a higher resolution. Thus, we strongly recommend using multiple types of data sources when estimating trends in meteorological elements, and exercising particular caution when using reanalyses for such purposes.

References

  1. Aguilar E, Auer I, Brunet M, Peterson TC, Wieringa J (2003) Guidelines on Climate Metadata and Homogenization. World Climate Programme Data and Monitoring WCDMP 53, WMO-TD 1186, WMO. Geneva

  2. Auer I, Böhm R, Jurkovic A, Lipa W, Orlik A, Potzmann R, Schöner W, Ungersböck M, Matulla C, Briffa K, Jones PD, Efthymiadis D, Brunetti M, Nanni T, Maugeri M, Mercalli L, Mestre O, Moisellin J, Begert M, Müller-Westermeier G, Kveton V, Bohnicek O, Stastny P, Lapin M, Szalai S, Szentimrey T, Cegnar T, Dolinar M, Gajic-Capka M, Zaninovic K, Majstorovic Z, Nieplova E (2007) HISTALP – historical instrumental climatological surface time series of the Greater Alpine Region. Int J Climatol 27:17–46. https://doi.org/10.1002/joc.1377

  3. Bengtsson L, Hageman S, Hodges KI (2004) Can climate trends be calculated from reanalysis data? J Geophys Res 109:D11111. https://doi.org/10.1029/2004JD004536

  4. Brázdil R, Chromá K, Dobrovolný P, Tolasz R (2009) Climate fluctuations in the Czech Republic during the period 1961-2005. Int J Climatol 29:223–242. https://doi.org/10.1002/joc.1718

  5. Compo GP, Sardeshmukh PD, Whitaker JS, Brohan P, Jones PD, McColl C (2013) Independent confirmation of global land warming without the use of station temperatures. Geophys Res Lett 40:3170–3174. https://doi.org/10.1002/grl.50425

  6. Cornes RC, Jones PD (2013) How well does ERA-Interim replicate trends in extremes of surface temperature across Europe? J Geophys Res Atmos 118:10262–10276. https://doi.org/10.1002/jgrd.50799

  7. Cornes RC, van der Schrier G, van den Besselaar EJM, Jones PD (2018) An ensemble version of the E-OBS temperature and precipitation data sets. J Geophys Res Atmos 123:9391–9409. https://doi.org/10.1029/2017JD028200

  8. Dee DP, Uppala SM, Simmons AJ, Berrisford P, Poli P, Kobayashi S, Andrae U, Balmaseda MA, Balsamo G, Bauer P, Bechtold P, Beljaars ACM, van de Berg L, Bidlot J, Bormann N, Delsol C, Dragani R, Fuentes M, Geer AJ, Haimberger L, Healy SB, Hersbach H, Hólm EV, Isaksen L, Kallberg P, Köhler M, Matricardi M, McNally AP, Monge-Sanz BM, Morcrette J-J, Park B-K, Peubey C, de Rosnay P, Tavolato C, Thépaut J-N, Vitart F (2011a) The ERA-Interim reanalysis: configuration and performance of the data assimilation system. Q J R Meteorol Soc 137:553–597. https://doi.org/10.1002/qj.828

  9. Dee DP, Källén E, Simmons AJ, Haimberger L (2011b) Comments on “Reanalyses suitable for characterizing long-term trends”. Bull Amer Meteor Soc 92: 65-70. https://doi.org/10.1175/2010BAMS3070.1

  10. Director H, Bornn L (2015) Connecting point-level and gridded moments in the analysis of climate data. J Clim 28:3496–3510. https://doi.org/10.1175/jcli-d-14-00571.1

  11. Donat MG, Sillmann J, Wild S, Alexander IV, Lippmann T, Zwier FW (2014) Consistency of temperature and precipitation extremes across various global gridded in situ and reanalysis datasets. J Clim 27:5019–5035. https://doi.org/10.1175/JCLI-D-13-00405.1

  12. Fall S, Niyogi D, Gluhovsky A, Pielke RA, Kalnay E, Rochon G (2009) Impacts of land use land cover on temperature trends over the continental United States: assessment using the North American Regional Reanalysis. Int J Climatol 30:1980–1993. https://doi.org/10.1002/joc.1996

  13. Greatbatch RJ, Rong P (2006) Discrepancies between different Northern Hemisphere summer atmospheric data products. J Clim 19:1261–1273. https://doi.org/10.1175/JCLI3643.1

  14. Hansen J, Ruedy R, Sato M, Lo K (2010) Global surface temperature change. Rev Geophys 48:1–29. https://doi.org/10.1029/2010RG000345

  15. Hartmann DL, Klein Tank AMG, Rusticucci M, Alexander LV, Brönnimann S, Charabi Y, Dentener FJ, Dlugokencky EJ, Easterling DR, Kaplan A, Soden BJ, Thorne PW, Wild M, Zhai PM (2013) Observations: atmosphere and surface. In: Stocker TF (ed) Climate change 2013: the physical science basis. Contribution of Working Group I to the Fifth Assessment Report of the Intergovernmental Panel on Climate Change. Cambridge University Press, Cambridge, United Kingdom and New York, NY, USA, pp 160–254

  16. Haylock MR, Hofstra N, Klein Tank AMG, Klok EJ, Jones PD, New M (2008) A European daily high-resolution gridded data set of surface temperature and precipitation for 1950-2006. J Geophys Res 113:D20119. https://doi.org/10.1029/2008jd010201

  17. Ho C-H, Lee J-Y, Ahn M-H, Lee H-S (2003) A sudden change in summer rainfall characteristics in Korea during the late 1970s. Int J Climatol 23:117–128. https://doi.org/10.1002/joc.864

  18. Hofstra N, Haylock M, New MG, Jones PD (2009) Testing E-OBS European high-resolution gridded data set of daily precipitation and surface temperature. J Geophys Res 114:D21101. https://doi.org/10.1029/2009JD011799

  19. Hofstra N, New MG, McSweeney C (2012) The influence of interpolation and station network density on the distributions and trends of climate variables in gridded daily data. Clim Dyn 35:841–858. https://doi.org/10.1007/s00382-009-0698-1

  20. Jones PD, Lister DH, Osborn TJ, Harpham C, Salmon M, Morice CP (2012) Hemispheric and large-scale land-surface air temperature variations: an extension revision and update to 2010. J Geophys Res 117:D05127. https://doi.org/10.1029/2011JD017139

  21. Kalnay E, Kanamitsu M, Kistler R, Collins W, Deaven D, Gandin L, Iredell M, Saha S, White G, Woolen J, Zhu Y, Chelliah M, Ebisuzaki W, Higgins W, Janowiak J, Mo KC, Ropelewski C, Wang J, Leetma A, Reynolds R, Jenne R, Joseph D (1996) The NCEP/NCAR 40-year reanalysis project. Bull Am Meteorol Soc 77:437–471. https://doi.org/10.1175/1520-0477(1996)077<0437:TNYRP>2.0.CO;2

  22. Klein Tank AMG, Können GP, Selten FM (2005) Signals of anthropogenic influence on European warming as seen in the trend patterns of daily temperature variance. Int J Climatol 25:1–16. https://doi.org/10.1002/joc.1087

  23. Klein Tank AMG, Wijngaard JB, Können GP, Böhm R, Demarée G, Gocheva A, Mileta M, Pashiardis S, Hejkrlik L, Kern-Hansen C, Heino R, Bessemoulin P, Müller-Westermeier G, Tzanakou M, Szalai S, Pálsdóttir T, Fitzgerald D, Rubin S, Capaldo M, Maugeri M, Leitass A, Bukantis A, Aberfeld R, van Engelen AFV, Forland E, Mietus M, Coelho F, Mares C, Razuvaev V, Nieplova E, Cegnar T, Antonio López J, Dahlström B, Moberg A, Kirchhofer W, Ceylan A, Pachaliuk O, Alexander LV, Petrovic P (2002) Daily dataset of 20th-century surface air temperature and precipitation series for the European Climate Assessment. Int J Climatol 22:1441–1453. https://doi.org/10.1002/joc.773

  24. Kyselý J, Plavcová E (2010) A critical remark of the applicability of E-OBS European gridded temperature data set for validating control climate simulations. J Geophys Res 115:D23118. https://doi.org/10.1029/2010JD014123

  25. Lindsay R, Wensnahan M, Schweiger A, Zhang J (2014) Evaluation of seven different atmospheric reanalysis products in the Arctic. J Clim 27:2588–2606. https://doi.org/10.1175/JCLI-D-13-00014.1

  26. Liu S, Su H, Tian J, Wang W (2018) An analysis of spatial representativeness of air temperature monitoring stations. Theor Appl Climatol 132:857–865. https://doi.org/10.1007/s00704-017-2133-6

  27. Mamara A, Argiriou AA, Anadranistakis M (2016) Recent trend analysis of mean air temperature in Greece based on homogenized data. Theor Appl Climatol 126:543–573. https://doi.org/10.1007/s00704-015-1592-x

  28. Marshall GJ, Kivinen S, Jylhä K, Vignols RM, Rees WG (2018) The accuracy of climate variability and trends across Arctic Fennoscandia in four reanalyses. Int J Climatol 38:3878–3895. https://doi.org/10.1002/joc.5541

  29. Mooney PA, Mulligan FJ, Fealy R (2011) Comparison of ERA-40, ERA-Interim and NCEP/NCAR reanalysis data with observed surface air temperatures over Ireland. Int J Climatol 31:545–557. https://doi.org/10.1002/joc.2098

  30. Osborn TJ, Jones PD (2014) The CRUTEM4 land surface air temperature data set: construction, previous versions and dissemination via Google Earth. Earth Sci Syst Data 6:61–68. https://doi.org/10.5194/essd-6-61-2014

  31. Peterson TC, Vose RS (1997) An overview of the Global Historical Climatology Network. Bull Am Meteorol Soc 78:2837–2849. https://doi.org/10.1175/JTECH-D-11-00103.1

  32. Pokorná L, Kučerová M, Huth R (2018) Annual cycle of temperature trends in Europe, 1961–2000. Glob Planet Chang 170:146–162. https://doi.org/10.1016/j.gloplacha.2018.08.015

  33. Quan X-W, Diaz HF, Fu C-B (2003) Interdecadal Change in the Asia-Africa Summer Monsoon and Its Associated Changes in Global Atmospheric circulation. Glob Planet Chang 37:171–188. https://doi.org/10.1016/S0921-8181(02)00200-X

  34. Reid PA, Jones PD, Brown O, Goodess CM, Davies TD (2001) Assessments of the reliability of NCEP circulation data and relationships with surface climate by direct comparisons with station based data. Clim Res 17:247–261. https://doi.org/10.3354/cr017247

  35. Scherrer SC, Appenzeller C, Liniger MA (2006) Temperature trends in Switzerland and Europe: implications for climate normals. Int J Climatol 26:565–580. https://doi.org/10.1002/joc.1270

  36. Scherrer SC, Ceppi P, Croci-Maspoli M, Appenzeller C (2012) Snow-albedo feedback and Swiss spring temperature trends. Theor Appl Climatol 114:509–516. https://doi.org/10.1007/s00704-012-0712-0

  37. Stickler A, Brönnimann S, Jourdain S, Roucaute E, Sterin A, Nikolaev D, Valente MA, Wartenburger R, Hersbach H, Ramella-Pralungo L, Dee D (2014) Description of the ERA-CLIM upper-air data. Earth Syst Sci Data 6:29–48. https://doi.org/10.5194/essd-6-29-2014

  38. Simmons AJ, Jones PD, da Costa BV, Beljaars ACM, Kälberg PW, Saarinen S, Uppala SM, Viterbo P, Wedi N (2004) Comparison of trends and low-frequency variability in CRU, ERA-40 and NCEP/NCAR analyses of surface air temperature. J Geophys Res 109:D24115. https://doi.org/10.1029/2004JD005306

  39. Thorne PW, Vose RS (2010) Reanalysis suitable for characterising long-term trends. Are they really achievable? Bull Am Meteorol Soc 91:353–361. https://doi.org/10.1175/2009bams2858.1

  40. Trenberth KE, Jones PD, Ambenje P, Bojariu R, Easterling D, Klein Tank A, Parker D, Rahimzadeh, Renwick JA, Rusticucci M, Soden B, Zhai P (2007) Observations: Surface and Atmospheric Climate Change. In: Solomon S (ed) Climate Change 2007: The Physical Science Basis. Contribution of Working Group I to the Fourth Assessment Report of the Intergovernmental Panel on Climate Change. Cambridge University Press, Cambridge, United Kingdom and New York, NY, USA, pp 235-336

  41. Uppala SM, Kallberg PW, Simmons AJ, Andrae U, da Costa BV, Fiorino M, Gibson JK, Haseler J, Hernandez A, Kelly GA, Li X, Onogi K, Saarinen S, Sokka N, Allan RP, Andersson E, Arpe K, Balmaseda MA, Beljaars ACM, van de Berg L, Bidlot J, Bormann N, Caires S, Chevallier F, Dethof A, Dragosavac M, Fisher M, Fuentes M, Hagemann S, Hólm E, Hoskins BJ, Isaksen L, Janssen PAEM, Jenne R, Mcnally AP, Mahfouf J, Morcrette J, Rayner NA, Saunders RW, Simon P, Sterl A, Trenberth KE, Untch A, Vasiljevic D, Vitterbo P, Woolen J (2005) The Era-40 re-analysis. Q J R Meteorol Soc 131:2961–3012. https://doi.org/10.1256/qj.04.176

  42. Van der Schrier G, Horstink G, van den Besselaar E, Klein Tank AMG (2012) ECA&D: a high resolution dataset for monitoring climate change and effects on viticulture in Europe. Centre de Recherchers de Climatologie/Biogeosciences-Université de Bourgogne, Proceedings of the IXth Intern Terroir Congres

  43. Van der Schrier G, van den Besselaar EJM, Klein Tank AMG, Verver G (2013) Monitoring European average temperature based on E-OBS gridded data set. J Geophys Res Atmos 118:5120–5135. https://doi.org/10.1002/jgrd.50444

  44. Van Oldenborgh GJ, Drijfhout S, van Ulden A, Haarsma R, Sterl A, Severins C, Hazeleger W, Dijkstra H (2009) Western Europe is warming much faster than expected. Clim Past 5:1–12. https://doi.org/10.5194/cp-5-1-2009

  45. Vose RS, Applequist S, Menne MJ, Williams N Jr, Thorne P (2012) An intercomparison of temperature trends in the US Historical Climatology Network and recent atmospheric reanalyses. Geophys Res Lett 39:L10703. https://doi.org/10.1029/2012GL051387

  46. Walton D, Hall A (2018) An assessment of high-resolution gridded temperature datasets over California. J Clim 31:3789–3810. https://doi.org/10.1175/JCLI-D-17-0410.1

  47. Wang J, Yan Z, Jones PD, Xia J (2013) On “observation minus reanalysis” method: A view from multidecadal variability. J Geophys Res Atmos 118: 7450-7458. https://doi.org/10.1002/jgrd.50574

  48. Wilks DS (2016) “The stippling shows statistically significant gridpoints.” How research results are routinely overstated and overinterpreted, and what to do about it. Bull Amer Meteorol Soc 97:2263–2273. https://doi.org/10.1175/BAMS-D-15-00267.1

  49. Xu W, Li Q, Yang S, Xu Y (2014) Overview of global monthly surface temperature data in the past century and preliminary integration. Adv Clim Chang Res 5:111–117. https://doi.org/10.1016/j.accre.2014.11.003

Acknowledgments

The authors would like to thank all the data providers for their efforts and making their datasets publicly available.

Funding

This study was supported by the Czech Science Foundation, project 16-04676S. T.K. was also supported by the Grant Agency of the Charles University, project 558119.

Corresponding author

Correspondence to Tomáš Krauskopf.

Cite this article

Krauskopf, T., Huth, R. Temperature trends in Europe: comparison of different data sources. Theor Appl Climatol 139, 1305–1316 (2020). https://doi.org/10.1007/s00704-019-03038-w
