...
See section 3.4.
Known issues
Rogue values in Tmin-24h layers
Background
Several users have reported erroneous temperature values in the Tmin-24h variable, where selected grid cells reach unrealistic values of around 220 K (about -53 °C) in locations with otherwise high temperatures. Analysis of the spatial distribution demonstrated that the cells with erroneous values are often found in Western Australia but are not limited to that region and occur in other parts of the world as well (Figure 1).
Figure 1: Maps of AgERA5 Tmin-24h with rogue values (black cells) for several regions in the World
A further analysis of the occurrence of the rogue Tmin-24h values demonstrates that the problem occurs quite often. Figure 2 shows the number of files per year where such rogue values occur in the Tmin-24h variable. Note that files with rogue Tmin-24h values cannot be found by looking at the temperature extremes because the low Tmin-24h values are still within the valid range. For example, an erroneous value of 220 K in Western Australia in summer cannot be distinguished from a valid temperature value of 220 K that occurs in Eastern Siberia on the same day. Instead, figure 2 was generated by computing the first-order spatial differences and selecting on a threshold value.
Surprisingly there are large differences between the different time-periods: the problem hardly occurs with the 1979-1999 time period, quite regularly in the period 2000-2020 and often since 2021. These time periods coincide with the batches in which the AgERA5 archive has been processed. The origin of the differences is not entirely clear but could be related to different encodings of the original ERA5 input data.
Figure 2: Number of Tmin-24h files per year where the problem of rogue values occurs
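The detection approach mentioned above (first-order spatial differences combined with a threshold) could look roughly like the following minimal sketch. It is not the code used to produce Figure 2: the function name, the 40 K threshold and the small test grid are assumptions made for illustration, and plain Python lists are used only to keep the example self-contained.

```python
# Hypothetical threshold (K): direct neighbours rarely differ by this much.
JUMP_THRESHOLD_K = 40.0

def find_rogue_cells(grid, threshold=JUMP_THRESHOLD_K):
    """Return (row, col) of cells that are far colder than at least one direct neighbour."""
    n_rows, n_cols = len(grid), len(grid[0])
    rogue = []
    for r in range(n_rows):
        for c in range(n_cols):
            neighbours = [
                grid[rr][cc]
                for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                if 0 <= rr < n_rows and 0 <= cc < n_cols
            ]
            # First-order difference towards each neighbour: a rogue cell is
            # much colder than the cell next to it.
            if any(nb - grid[r][c] > threshold for nb in neighbours):
                rogue.append((r, c))
    return rogue

# Illustrative Tmin-24h grid (K) with one rogue value in the centre
tmin = [
    [300.1, 300.4, 300.9],
    [300.3, 219.2, 301.0],
    [300.6, 300.8, 301.2],
]
rogue_cells = find_rogue_cells(tmin)
```

A file would then be counted as affected when `rogue_cells` is not empty. The threshold has to be large enough not to flag genuine gradients such as coastlines or mountain ranges.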
Problem analysis
Finding the origin of the problem requires a closer look at the processing chain used for AgERA5 and at the structure of the ERA5 files used as input. First of all, an ERA5 input file does not contain the data for 00:00 to 24:00 UTC. For example, the ERA5 file containing air temperature data for 2024-01-01 contains data ranging from 2024-01-01T07:00:00 up to 2024-01-02T06:00:00. Therefore, the AgERA5 processing line first harmonizes all data files so they contain the time slices for the period 00:00 to 24:00 UTC. For example, to harmonize the data for 2024-01-02 the processing line takes the files for 2024-01-01 and 2024-01-02, opens them jointly with xarray and takes the slice out of the dataset covering 2024-01-02T00:00:00 <= time < 2024-01-03T00:00:00. Analysis of the AgERA5 processing line, looking specifically at what happens at those rogue Tmin-24h values, demonstrated that the problem is generated at the step where two ERA5 files are joined (see Figure 3).
Figure 3: Above: A timeseries of the original hourly ERA5 input data for variable MN2T (minimum temperature). Below: hourly ERA5 data clipped to the 0-24 UTC period. The grey area shows the process of taking 2 ERA5 files and combining them into one new file, during which the rogue Tmin-24h values are generated.
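The time-window logic of the harmonization step can be illustrated without any NetCDF files. The sketch below only mimics the time stamps described above; the real processing line uses xarray, and the function names here are hypothetical.

```python
from datetime import datetime, timedelta

def era5_file_times(day):
    """Hourly time stamps in the ERA5 file for `day`: 07:00 that day up to 06:00 the next day."""
    start = datetime(day.year, day.month, day.day, 7)
    return [start + timedelta(hours=h) for h in range(24)]

def harmonized_times(day):
    """Time stamps for `day` (00:00 <= t < 24:00 UTC), each paired with the date of its source file."""
    previous = day - timedelta(days=1)
    joined = [(t, previous.date()) for t in era5_file_times(previous)]
    joined += [(t, day.date()) for t in era5_file_times(day)]
    day_start = datetime(day.year, day.month, day.day)
    day_end = day_start + timedelta(days=1)
    return [(t, src) for t, src in joined if day_start <= t < day_end]

selected = harmonized_times(datetime(2024, 1, 2))
from_first_file = [t for t, src in selected if src.day == 1]   # 00:00 .. 06:00
from_second_file = [t for t, src in selected if src.day == 2]  # 07:00 .. 23:00
```

For 2024-01-02 the harmonized day therefore takes its first 7 time slices from the file for 2024-01-01 and the remaining 17 from the file for 2024-01-02, so every harmonized file mixes values that were encoded separately.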
A second point that should be understood is that figure 3 shows the ERA5 data as floating point values in kelvin. However, that is not how the data is stored on disk. The raw data coming from the ERA5 processing chain is not stored as floating point values (32-bit single-precision floats) but as a C short datatype (a signed 16-bit integer), with an offset and a scaling factor attached to the variable as attributes. This can be seen when inspecting the data in Panoply: the AgERA5 Tmin-24h is derived from the ERA5 variable "mn2t", and Panoply shows its scale_factor and offset values (figure 4).
Figure 4: Encoding of the variable mn2t in ERA5 input files.
Tools like Panoply and xarray handle this transparently in the background: they recognize the offset and scale_factor and convert back and forth. Moreover, the scale_factor and offset are highly optimized values: each ERA5 file has its own scale_factor and offset in order to maximize the precision for the given data range.
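The packing scheme can be sketched in a few lines. Unpacking follows the usual NetCDF convention (value = packed * scale_factor + offset). How ERA5 derives its per-file scale_factor and offset is not described here, so `optimal_packing` below is an assumption that simply spreads the data range over the signed 16-bit range; all function names are hypothetical.

```python
INT16_MIN, INT16_MAX = -32768, 32767

def optimal_packing(vmin, vmax):
    """Scale and offset that spread [vmin, vmax] over the signed 16-bit range.

    Assumption: one integer code is left unused, so the range maps to -32767 .. 32767.
    """
    scale_factor = (vmax - vmin) / (2**16 - 2)
    add_offset = (vmax + vmin) / 2
    return scale_factor, add_offset

def pack(value, scale_factor, add_offset):
    """Convert a physical value to its integer code (rounded to nearest)."""
    return round((value - add_offset) / scale_factor)

def unpack(packed, scale_factor, add_offset):
    """Convert an integer code back to a physical value."""
    return packed * scale_factor + add_offset

# Illustrative file whose temperatures range from 230 K to 310 K
scale_factor, add_offset = optimal_packing(230.0, 310.0)
packed = pack(287.65, scale_factor, add_offset)
restored = unpack(packed, scale_factor, add_offset)
```

Values inside the range used to derive the packing are restored to within half a scale_factor. Values outside that range no longer fit into 16 bits, which is what matters for the problem described here.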
The tricky part comes when the newly sliced dataset, which combines data from 2024-01-01 and 2024-01-02, has to be saved into a new NetCDF file (figure 3). Under the hood, xarray still knows that this data is represented by a C short with a scale_factor and offset. The question is which scale_factor and offset to apply: the one for 2024-01-01 or the one for 2024-01-02? Xarray applies the scale_factor and offset from the first file it opens, so 2024-01-01 in this case.
The location in time where things go wrong with the variable "mn2t" is marked with the diamond on the red curve in figure 5 below. It is the first time slice of the second input file (red line), and xarray applies the scale_factor and offset of the first input file (green line) when saving the new NetCDF file.
Figure 5: As the top panel of figure 3, but with the input value that turns rogue marked with a black diamond.
The temperature value that turns rogue is the first data point on the red curve (Figure 5), whose actual value is 318.871307 K. Converting it to a 16-bit integer by inverting the scale_factor and offset of the NetCDF file represented by the green curve (Figure 4) gives:
Code Block |
---|
>>> import math
>>> math.trunc((318.871307 - 268.68162880225003)/0.0015210688706274414)
32996 |
And this is what happens with the last data point on the green curve, whose value is 318.309113 K:
Code Block |
---|
>>> math.trunc((318.309113 - 268.68162880225003)/0.0015210688706274414)
32626 |
However, the maximum value that can be stored in a signed 16-bit integer is 32767. The first data point on the red line (marked with the diamond) is therefore too large to fit in the range of a 16-bit integer, because the scale_factor and offset of the first file do not cover the data range of the second file. The last data point on the green line just fits as it remains below 32767.
Thus, the rogue Tmin-24h values come from an integer overflow case. Unfortunately, integer overflow does not generate any errors as it just rolls over towards the negative side of the signed integer (starting from -32768) so it is hard to detect. Fixing it in terms of software is relatively easy: we just have to force xarray to write NetCDF files with single precision floats instead of 16-bit integers. This takes twice as much disk space but these are only temporary files so that won't matter. Fixing it in terms of data is more tricky: a complete reprocessing of AgERA5 will be required.
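The rollover can be reproduced with the numbers quoted above. This is a minimal sketch: `wrap_int16` is a hypothetical helper that emulates storing a value in a signed 16-bit integer without overflow checks, and it does not reproduce the actual write path in xarray.

```python
import math

# scale_factor and offset of the first (green) file, as quoted in the text
SCALE_FACTOR = 0.0015210688706274414
ADD_OFFSET = 268.68162880225003

def wrap_int16(value):
    """Emulate storing an integer in a signed 16-bit slot without overflow checks."""
    return (value + 32768) % 65536 - 32768

packed = math.trunc((318.871307 - ADD_OFFSET) / SCALE_FACTOR)  # 32996, above 32767
stored = wrap_int16(packed)                                    # wraps to a negative code
decoded = stored * SCALE_FACTOR + ADD_OFFSET                   # value read back (K)
```

The stored code becomes -32540, which decodes to roughly 219.2 K: a value of 318.87 K turns into one close to the 220 K reported by users.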
Since 10 March 2024, the processing line has been updated in order to avoid this problem. However, fixing the issue in the full AgERA5 dataset will require a reprocessing of the archive. We are currently investigating what the consequences are and if a full reprocessing is achievable.
The consequences of the erroneous values for the fitness for purpose of the AgERA5 dataset are small. The cells with erroneous values are mostly located in deserts and other extremely warm areas which are usually not used for agriculture. Nevertheless such errors are undesirable and should preferably be fixed.
Impact on AgERA5 variables
All the input variables that are taken from ERA5 are stored as 16-bit C short datatypes, and therefore the problem of integer overflow (and underflow!) could happen for any ERA5 input variable that is used for generating AgERA5. Nevertheless, the impact differs between AgERA5 variables. Below is an expert assessment of the impact on the different variables.
Temperature
- Temperature variables that take the min or max of a time slice can be affected directly, depending on whether the rogue value is within the selection window (24h or day/night time).
- Assuming a maximum difference of 90 K for a rogue temperature value, the temperature variables that are based on the mean are affected by a maximum of ~4 K (90 K / 24 timesteps = 3.75 K) for 24h mean values or 7.5 K (90 K / 12 timesteps = 7.5 K) for day-time or night-time mean values.
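The estimate for the mean-based variables can be checked with a small calculation. The sketch assumes that exactly one hourly value in the window is lowered by 90 K; the base temperature of 300 K and the function names are arbitrary choices for illustration.

```python
ROGUE_DROP_K = 90.0  # maximum deviation assumed in the text

def mean(values):
    return sum(values) / len(values)

def mean_shift(n_steps, drop=ROGUE_DROP_K, base=300.0):
    """Change in the mean of n_steps hourly values when one of them is lowered by `drop`."""
    clean = [base] * n_steps
    corrupted = [base - drop] + clean[1:]
    return mean(clean) - mean(corrupted)

shift_24h = mean_shift(24)  # 24h mean: 3.75 K
shift_12h = mean_shift(12)  # 12-step mean: 7.5 K
```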
Precipitation
- Precipitation is affected slightly because the sum of all 24h values is taken. However, an overflow will turn a precipitation value for a single 1h time slice into a near-zero precipitation. The impact of this will not be noticeable due to the variable and erratic nature of precipitation
Global radiation
- Global radiation is affected slightly. However, an overflow will turn a radiation value for a single 1h time slice into a near-zero radiation. The impact of this will not be noticeable due to the natural variability of radiation.
Windspeed:
- Windspeed is hardly affected because an overflow or underflow will generate a windspeed in the opposite direction but at similar magnitude. The windspeed in AgERA5 is computed as the square root of the sum of the squared windspeeds in u and v direction. Therefore an over- or underflow will not cause a large difference in daily mean windspeed.
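The sign flip can be illustrated with a hypothetical wind component. The packing range of -15 to +15 m/s, the wind values and the helper names below are assumptions chosen for illustration, not values taken from ERA5.

```python
import math

def wrap_int16(value):
    """Emulate storing an integer in a signed 16-bit slot without overflow checks."""
    return (value + 32768) % 65536 - 32768

# Hypothetical packing for a wind component covering -15 .. +15 m/s
SCALE_FACTOR = 30.0 / (2**16 - 2)
ADD_OFFSET = 0.0

def store_and_read(value):
    """Pack a value, let it wrap if it does not fit, and decode it again."""
    packed = math.trunc((value - ADD_OFFSET) / SCALE_FACTOR)
    return wrap_int16(packed) * SCALE_FACTOR + ADD_OFFSET

u_true, v = 15.1, 3.0             # u lies just outside the packed range
u_read = store_and_read(u_true)   # overflows: sign flips, magnitude stays similar
speed_true = math.hypot(u_true, v)
speed_read = math.hypot(u_read, v)
```

In this example the u component comes back as roughly -14.9 m/s instead of 15.1 m/s, while the wind speed changes by only about 0.2 m/s.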
Humidity and vapour pressure:
- Individual humidity values could be affected when they coincide with a particular slice that is affected by rogue temperature values. Given that humidity is constrained between 0 and 100 % the impact is limited.
- Vapour pressure is computed as the mean of 24 timesteps and therefore the impact is limited.
Snow thickness and LWE:
- Snow variables are calculated as the mean of 24 time slices and therefore the impact will be limited.
Precipitation type:
- Precipitation type is based on a count of the different time steps and is therefore affected only to a limited degree.
Striping in the AgERA5 relative humidity layers
Background
Users have reported stripes (discontinuities) in the relative humidity product layers which are part of AgERA5. Figure 1 shows the humidity at 18:00 local time for 2024-03-16; the discontinuities in the product are marked by the red arrows at the top of the image.
Figure 1: Relative humidity around 18:00 local time for 2024-03-16
In fact, these discontinuities correspond to the processing windows that the AgERA5 processing software uses to convert the hourly ERA5 input data into a daily product. For each window, the software uses a different 24-hour slice of the hourly data which best corresponds to the local time zone. This is illustrated in figure 2, which shows the solar elevation for three locations (Beijing, London and Los Angeles) with time in UTC on the x-axis. It demonstrates that the course of the solar elevation (the daylight cycle) shifts according to the location on Earth: in Beijing it moves to earlier UTC time (daylight starts around 21:00 UTC of 5 June), for London it is exactly centred at 12:00 UTC, and for Los Angeles it moves to later UTC time given its location to the west. The coloured bars at the top of figure 2 indicate the 24-hour slice that is selected from the hourly ERA5 input data. Within the AgERA5 processing chain, 8 windows are defined, each with a dedicated 24-hour slice, and moving from one window to the next shifts the slice by 3 hours.
However, this also means that at the edges of each window we may get effects due to the difference in the slice of ERA5 inputs. For most AgERA5 layers this effect is small and hardly visible in the resulting product, because an average, sum, maximum or minimum is computed over the 24 ERA5 layers (e.g. 24h maximum temperature or daily solar radiation sum). However, for the relative humidity products that provide an estimate of relative humidity at a particular local time, edge effects may occur because the time for which the value is computed shifts by three hours.
Figure 2: Solar elevation for three locations on Earth as a function of UTC time. The horizontal bars indicate the slice of the 24 hourly ERA5 inputs that are selected for generating the AgERA5 product for 2017-06-06.
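The windowing described above can be sketched as follows. The exact window boundaries used by AgERA5 are not given here, so the layout below (8 windows of 45 degrees of longitude, centred on multiples of 45 degrees) is an assumption, as are the function names and the approximate city longitudes. How the real chain treats the date line is not covered.

```python
import math
from datetime import datetime, timedelta

WINDOW_HOURS = 3                               # shift between neighbouring windows
WINDOW_WIDTH_DEG = 360 / (24 / WINDOW_HOURS)   # 8 windows of 45 degrees (assumed layout)

def window_offset_hours(longitude):
    """UTC offset (h) of the window containing `longitude`, in multiples of 3 hours."""
    return WINDOW_HOURS * math.floor(longitude / WINDOW_WIDTH_DEG + 0.5)

def utc_slice(day, longitude):
    """Start and end (UTC) of the 24-hour slice that approximates the local day `day`."""
    start = day - timedelta(hours=window_offset_hours(longitude))
    return start, start + timedelta(hours=24)

# Approximate longitudes of the three locations in Figure 2
cities = {"Beijing": 116.4, "London": -0.1, "Los Angeles": -118.2}
offsets = {name: window_offset_hours(lon) for name, lon in cities.items()}
beijing_slice = utc_slice(datetime(2017, 6, 6), cities["Beijing"])
```

Under this assumed layout, two neighbouring grid cells at longitudes 22.4 and 22.5 fall into different windows, so their 24-hour slices differ by 3 hours even though the cells are practically at the same location.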
The problem
Relative humidity is not a conservative property because it varies with temperature. For a given vapour pressure, the relative humidity decreases with increasing temperature simply because warm air can take up more moisture and therefore humidity decreases in relative terms.
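The temperature dependence can be made concrete with a common approximation for saturation vapour pressure. The sketch uses the Tetens formula in its FAO-56 form, which is an assumption: the formula used inside the AgERA5 processing chain is not stated here, and the function names are hypothetical.

```python
import math

def saturation_vapour_pressure_hpa(temp_c):
    """Tetens approximation (FAO-56 form) of saturation vapour pressure over water, in hPa."""
    return 6.108 * math.exp(17.27 * temp_c / (temp_c + 237.3))

def relative_humidity(vapour_pressure_hpa, temp_c):
    """Relative humidity (%) for a given vapour pressure (hPa) and air temperature (degrees C)."""
    return 100.0 * vapour_pressure_hpa / saturation_vapour_pressure_hpa(temp_c)

e = 21.5  # hPa, held constant
rh_cool = relative_humidity(e, 20.0)  # roughly 92 %
rh_warm = relative_humidity(e, 30.0)  # roughly 51 %
```

With the same amount of moisture in the air, a 10-degree rise in temperature almost halves the relative humidity, so a relative humidity value is only meaningful together with the time of day it refers to.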
For diagnosing problems in AgERA5 it is therefore better to look at the vapour pressure (figure 3). Here it is obvious that the discontinuities are gone. The vapour pressure is a smooth field indicating that the underlying data about atmospheric moisture is fine. The discontinuities are therefore part of the dataset and are not a problem with the processing chain.
Figure 3: Daily average vapour pressure at 2024-03-16.
Looking into the actual values
In figure 4 we zoom into an area in Southern Africa. The map shows the discontinuity that runs through the scene, which is again marked by the red arrow. We can now look at the values for vapour pressure and humidity for a point left (blue dot) and right (red dot) of the discontinuity. The vapour pressure values for both points are nearly identical: 21.54 and 21.45 hPa (Table 1). The humidity values for the corresponding local times (6:00, 9:00, 12:00, 15:00 and 18:00 local time) do differ (Table 1).
Figure 4: Relative humidity at 18:00 local time for a region in Southern Africa.
However, after looking closely it can be observed that the values are actually shifted by three hours: the 06h estimate for the blue point matches with the 09h estimate of the red point, while the 09h estimate matches with the 12h, etc. Figure 5 confirms the shift. This shift is caused by the impact of selecting a different slice of hourly temperature values from ERA5 between the red and the blue point.
Table 1: Values for vapour pressure and humidity at local time for the selected points.
Quantity | Blue dot | Red dot | Unit |
---|---|---|---|
Latitude | 1.14 | 1.14 | |
Longitude | 22.4 | 22.5 | |
Vapour pressure | 21.54 | 21.45 | hPa |
Relative humidity 06h | 88.05 | 89.1 | % |
Relative humidity 09h | 61.3 | 87.3 | % |
Relative humidity 12h | 43.8 | 60.3 | % |
Relative humidity 15h | 56.9 | 43.1 | % |
Relative humidity 18h | 69.7 | 54.7 | % |
Figure 5: Relative humidity values throughout the day for the two points on opposite sides of the discontinuity.
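The three-hour shift can also be verified numerically from the values in Table 1. The short sketch below compares the two points at the same local hour and with the red point shifted by three hours; the variable names are illustrative.

```python
# Relative humidity (%) per local hour, copied from Table 1
blue = {6: 88.05, 9: 61.3, 12: 43.8, 15: 56.9, 18: 69.7}
red = {6: 89.1, 9: 87.3, 12: 60.3, 15: 43.1, 18: 54.7}
SHIFT_HOURS = 3

# Absolute differences at the same local hour
same_hour = {h: abs(blue[h] - red[h]) for h in blue}
# Absolute differences when the blue value at hour h is paired with the red value at h + 3
shifted = {h: abs(blue[h] - red[h + SHIFT_HOURS]) for h in blue if h + SHIFT_HOURS in red}

max_same_hour = max(same_hour.values())
max_shifted = max(shifted.values())
```

At the same local hour the two points differ by up to about 26 percentage points, whereas after applying the shift the largest difference is about 2 percentage points.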
The solution
Within the current setup of the AgERA5 processing chain, the discontinuities in the relative humidity cannot be removed. Moreover, such discontinuities are occasionally visible in other AgERA5 products as well, although not as dramatically as in the relative humidity. The only solution is to create a more fine-grained temporal subsetting. Currently, the AgERA5 processing works in 8 windows with a 3-hourly shift. A more fine-grained subsetting could use 12 windows with a 2-hourly shift or even 24 windows that shift by one hour. The latter would make optimal use of the hourly ERA5 inputs.
However, one should also consider whether these discontinuities in the humidity layers of AgERA5 are actually problematic for applications. These humidity fields were added to AgERA5 for running pest and disease models. Fungal diseases in particular depend on the leaf wetness duration, for which relative humidity is a key parameter. If those humidity values are shifted by three hours, that will not matter much for those models. There may be other applications that are more critical, but currently we are not aware of any.
We are currently exploring if a reprocessing of AgERA5 could make use of hourly or 2-hourly windows, or whether this will be a recommendation for a future AgERA6.
Info | ||
---|---|---|
| ||
This document has been produced in the context of the Copernicus Climate Change Service (C3S). The activities leading to these results have been contracted by the European Centre for Medium-Range Weather Forecasts, operator of C3S on behalf of the European Union (Delegation Agreement signed on 11/11/2014 and Contribution Agreement signed on 22/07/2021). All information in this document is provided "as is" and no guarantee or warranty is given that the information is fit for any particular purpose. The users thereof use the information at their sole risk and liability. For the avoidance of all doubt, the European Commission and the European Centre for Medium-Range Weather Forecasts have no liability in respect of this document, which is merely representing the author's view. |
...