The evapoRe package developed as a complementary toolbox
to the pRecipe package (Vargas Godoy and Markonis
2023), available at [https://CRAN.R-project.org/package=pRecipe].
evapoRe facilitates the download, exploration,
visualization, and analysis of evapotranspiration (ET) data.
Additionally, evapoRe offers the functionality to calculate various
Potential EvapoTranspiration (PET) methods.
Like many other R packages, evapoRe has some system
requirements:
evapoRe database hosts 13 different ET data sets; three
satellite-based, five reanalysis, and five hydrological model products.
Their native specifications, as well as links to their providers, and
their respective references are detailed in the following subsections.
We have already homogenized, compacted to a single file, and stored them
in a Zenodo
repository under the following naming convention:
<data set>_<variable>_<units>_<coverage>_<start date>_<end date>_<resolution>_<time step>.nc
The evapoRe data collection was homogenized to these
specifications:
<variable> = evapotranspiration (e)<units> = millimeters (mm)<resolution> = 0.25°E.g., ERA5 (Hersbach et al. 2020) would be:
era5_e_mm_global_195901_202112_025_monthly.nc
| Data Set | Spatial Resolution | Global | Land | Ocean | Temporal Resolution | Record Length | Get Data | Reference |
|---|---|---|---|---|---|---|---|---|
| GLEAM V3.7b | 0.25° | x | Monthly | 1980/01-2021/12 | Download | Martens et al. (2017) | ||
| BESS V2.0 | 0.05° | x | Monthly | 1982/01-2019/12 | Download | B. Li et al. (2023) | ||
| ETMonitor | 1\(km\) | x | Daily | 2000/06-2019/12 | Download | Zheng et al. (2022) |
| Data Set | Spatial Resolution | Global | Land | Ocean | Temporal Resolution | Record Length | Get Data | Reference |
|---|---|---|---|---|---|---|---|---|
| ERA5-Land | 0.1° | x | Monthly | 1960/01-2022/12 | Download | Muñoz-Sabater et al. (2021) | ||
| ERA5 | 0.25° | x | Monthly | 1959/01-2021/12 | Download | Hersbach et al. (2020) | ||
| JRA-55 | 1.25° | x | Monthly | 1958/01-2021/12 | Download | Kobayashi et al. (2015) | ||
| MERRA-2 | 0.5° x 0.625° | x | Monthly | 1980/01-2023/01 | Download | Gelaro et al. (2017) | ||
| CAMELE | 0.25° | x | Monthly | 1980/01-2022/12 | Download | C. Li and al. (2023) |
| Data Set | Spatial Resolution | Global | Land | Ocean | Temporal Resolution | Record Length | Get Data | Reference |
|---|---|---|---|---|---|---|---|---|
| FLDAS | 0.1° | x | Monthly | 1982/01-2022/12 | Download | McNally et al. (2017) | ||
| GLDAS CLSM V2.1 | 1° | x | Monthly | 2000/01-2022/11 | Download | Rodell et al. (2004) | ||
| GLDAS NOAH V2.1 | 0.25° | x | Monthly | 2000/01-2022/11 | Download | Rodell et al. (2004) and Beaudoing and Rodell (2020) | ||
| GLDAS VIC V2.1 | 1° | x | Monthly | 2000/01-2022/11 | Download | Rodell et al. (2004) | ||
| TerraClimate | 4\(km\) | x | Monthly | 1958/01-2021/12 | Download | Abatzoglou et al. (2018) |
In this introductory demo we will first download the GLDAS-CLSM data set. We will then subset the downloaded data over Mediterranean region for the 2001-2010 period, and crop it to the national scale for Spain. In paralel, we will estimate potential evapotranspiration over the same domain and the same record length. In the next step, we will generate time series for our data sets and conclude with the visualization of our data.
Downloading the entire data collection or only a few data sets is
quite straightforward. You just call the download_data
function, which has four arguments data_name, path,
domain, and time_res.
Let’s download the GLDAS CLSM data set and inspect its content with
infoNC:
download_data(data_name = 'gldas-clsm', path = ".")
gldas_clsm_global <- raster::brick('gldas-clsm_e_mm_land_200001_202211_025_monthly.nc')
infoNC(gldas_clsm_global)[1] "class : RasterBrick "
[2] "dimensions : 720, 1440, 1036800, 275 (nrow, ncol, ncell, nlayers)"
[3] "resolution : 0.25, 0.25 (x, y)"
[4] "extent : -180, 180, -90, 90 (xmin, xmax, ymin, ymax)"
[5] "crs : +proj=longlat +datum=WGS84 "
[6] "source : gldas-clsm_e_mm_land_200001_202211_025_monthly.nc "
[7] "names : X2000.01.01, X2000.02.01, X2000.03.01, X2000.04.01, X2000.05.01, X2000.06.01, X2000.07.01, X2000.08.01, X2000.09.01, X2000.10.01, X2000.11.01, X2000.12.01, X2001.01.01, X2001.02.01, X2001.03.01, ... "
[8] "Date : 2000-01-01, 2022-11-01 (min, max)"
[9] "varname : e "
Once we have downloaded our database, we can start processing the data with:
crop_data to crop the data using a shapefile.fldmean to generate a time series by taking the area
weighted average over each timestep.remap to go from the native resolution (0.25°) to
coarser ones (e.g., 0.5°, 1°, 1.5°, …).subset_data to subset the data in time and/or
space.yearstat to aggregate the data from monthly into
annual.To subset our data to a desired region and period of interest, we use
the subset_data function, which has three arguments
x, box, and yrs.
Let’s subset the GLDAS CLSM data set over Mediterranean region
(-10,40,30,45) for the 2001-2010 period, and inspect its content with
infoNC:
gldas_clsm_subset <- subset_data(gldas_clsm_global,box = c(-10,40,30,45) ,yrs = c(2001, 2010))
infoNC(gldas_clsm_subset)[1] "class : RasterBrick "
[2] "dimensions : 60, 200, 12000, 120 (nrow, ncol, ncell, nlayers)"
[3] "resolution : 0.25, 0.25 (x, y)"
[4] "extent : -10, 40, 30, 45 (xmin, xmax, ymin, ymax)"
[5] "crs : +proj=longlat +datum=WGS84 +no_defs "
[6] "source : memory"
[7] "names : X2001.01.01, X2001.02.01, X2001.03.01, X2001.04.01, X2001.05.01, X2001.06.01, X2001.07.01, X2001.08.01, X2001.09.01, X2001.10.01, X2001.11.01, X2001.12.01, X2002.01.01, X2002.02.01, X2002.03.01, ... "
[8] "min values : 0.85979986, 1.62062681, 1.42477119, 0.76781327, 0.75662607, 0.34450921, 0.25072542, 0.15768366, 0.13057871, 0.05979802, 0.11780920, -0.69875073, 0.36552662, 0.70131457, 0.63548779, ... "
[9] "max values : 80.61240, 89.56071, 101.81876, 143.45859, 158.37830, 202.83186, 192.55907, 190.07066, 111.40405, 116.93645, 67.32398, 48.42713, 74.23843, 59.85103, 88.96181, ... "
[10] "time : 2001-01-01, 2010-12-01 (min, max)"
To further crop our data to a desired polygon other than a rectangle,
we use the crop_data function, which has two arguments
x, and y.
Let’s crop our GLDAS CLSM subset to cover only Spain with the
respective shape
file, and inspect its content with infoNC:
[1] "class : RasterBrick "
[2] "dimensions : 56, 58, 3248, 120 (nrow, ncol, ncell, nlayers)"
[3] "resolution : 0.25, 0.25 (x, y)"
[4] "extent : -10, 4.5, 30, 44 (xmin, xmax, ymin, ymax)"
[5] "crs : +proj=longlat +datum=WGS84 +no_defs "
[6] "source : memory"
[7] "names : X2001.01.01, X2001.02.01, X2001.03.01, X2001.04.01, X2001.05.01, X2001.06.01, X2001.07.01, X2001.08.01, X2001.09.01, X2001.10.01, X2001.11.01, X2001.12.01, X2002.01.01, X2002.02.01, X2002.03.01, ... "
[8] "min values : 7.216680, 18.606867, 32.398956, 37.939827, 39.484840, 29.796391, 15.073787, 17.676109, 15.789503, 26.564753, 13.147447, 9.846310, 8.794820, 14.355796, 26.288857, ... "
[9] "max values : 80.61240, 89.56071, 101.81876, 139.86717, 151.03282, 197.47284, 146.44232, 145.36212, 111.40405, 116.93645, 67.32398, 48.42713, 74.23843, 58.79382, 88.13857, ... "
[10] "time : 2001-01-01, 2010-12-01 (min, max)"
First we need to download temperature data, available at: Zenodo repository:
NOTE: Temperature data available at the moment is limited to monthly. The data sets are TerraClimate, MSWX, and CRU and for brevity We will only estimate PET over the 2001 to 2010 period using MSWX dataset.
we use the download_t_data function, which has five
arguments data_name, variable, path,
time_res, and domain.
t2m,tmin, and tmax stand for
average temperature, minimum temperature, and maximum temperature.This will download temperature data in following naming convention e.g.,
mswx_t2m_degC_land_197901_202308_025_monthly.nc
As stated above we will work only with the 2001-2010 period. Since
evapoRe makes all of pRecipe functions
available we can load and subset the data as follows:
t2m_global <- raster::brick("mswx_t2m_degC_land_197901_202308_025_monthly.nc") %>%
subset_data(yrs = c(2001, 2010))
infoNC(t2m_global)[1] "class : RasterBrick "
[2] "dimensions : 720, 1440, 1036800, 120 (nrow, ncol, ncell, nlayers)"
[3] "resolution : 0.25, 0.25 (x, y)"
[4] "extent : -180, 180, -90, 90 (xmin, xmax, ymin, ymax)"
[5] "crs : +proj=longlat +datum=WGS84 +no_defs "
[6] "source : memory"
[7] "names : X2001.01.01, X2001.02.01, X2001.03.01, X2001.04.01, X2001.05.01, X2001.06.01, X2001.07.01, X2001.08.01, X2001.09.01, X2001.10.01, X2001.11.01, X2001.12.01, X2002.01.01, X2002.02.01, X2002.03.01, ... "
[8] "min values : -49.867680, -45.244373, -42.529930, -33.551258, -22.829224, -15.122508, -13.402536, -14.673846, -18.428692, -27.443539, -35.674980, -44.693199, -46.778694, -47.197075, -42.277370, ... "
[9] "max values : 35.00875, 33.52000, 33.29000, 35.04688, 38.26308, 39.35505, 40.33314, 39.98806, 36.99562, 33.00374, 32.52186, 33.59248, 34.46944, 33.25062, 33.71313, ... "
[10] "time : 2001-01-01, 2010-12-01 (min, max)"
The pet function estimates PET using a method of choice
from the following available options:
The pet function has two arguments x and
method.
Let’s calculate PET using the Oudin formulation. Then, same as GLDAS
CLSM we can subset it over Mediterranean region and Spain, and inspect
its content with infoNC:
NOTE: pet output is [mm/day], in order
to get values in [mm] for a 1 to 1 comparison we use muldpm
function.
[1] "class : RasterBrick "
[2] "dimensions : 720, 1440, 1036800, 120 (nrow, ncol, ncell, nlayers)"
[3] "resolution : 0.25, 0.25 (x, y)"
[4] "extent : -180, 180, -90, 90 (xmin, xmax, ymin, ymax)"
[5] "crs : +proj=longlat +datum=WGS84 +no_defs "
[6] "source : memory"
[7] "names : layer.1, layer.2, layer.3, layer.4, layer.5, layer.6, layer.7, layer.8, layer.9, layer.10, layer.11, layer.12, layer.13, layer.14, layer.15, ... "
[8] "min values : 0.000000e+00, 0.000000e+00, 8.728488e-04, 8.322404e-04, 3.890790e-04, 0.000000e+00, 0.000000e+00, 2.140520e-04, 8.102036e-04, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 0.000000e+00, 1.431775e-04, ... "
[9] "max values : 219.5310, 176.8754, 184.4747, 189.0052, 222.4694, 224.2932, 233.6676, 220.5922, 187.6766, 182.0405, 189.9230, 206.4958, 214.1489, 176.4865, 184.7495, ... "
[10] "time : 2001-01-01, 2010-12-01 (min, max)"
[1] "class : RasterBrick "
[2] "dimensions : 64, 104, 6656, 120 (nrow, ncol, ncell, nlayers)"
[3] "resolution : 0.25, 0.25 (x, y)"
[4] "extent : -10, 40, 30, 45 (xmin, xmax, ymin, ymax)"
[5] "crs : +proj=longlat +datum=WGS84 +no_defs "
[6] "source : memory"
[7] "names : layer.1, layer.2, layer.3, layer.4, layer.5, layer.6, layer.7, layer.8, layer.9, layer.10, layer.11, layer.12, layer.13, layer.14, layer.15, ... "
[8] "min values : 0.099306993, 0.740073629, 13.313356668, 11.852477789, 47.733557582, 62.935860157, 79.710862637, 77.624826193, 33.284636736, 28.931746244, 4.929369092, 0.039311446, 0.030302684, 2.928590268, 8.578593999, ... "
[9] "max values : 54.77562, 62.65787, 115.40463, 131.55265, 177.77098, 204.70669, 222.70252, 200.23511, 165.45423, 117.08133, 69.63583, 55.89678, 58.23283, 65.31674, 103.43404, ... "
[10] "time : 2001-01-01, 2010-12-01 (min, max)"
[1] "class : RasterBrick "
[2] "dimensions : 56, 58, 3248, 120 (nrow, ncol, ncell, nlayers)"
[3] "resolution : 0.25, 0.25 (x, y)"
[4] "extent : -10, 4.5, 30, 44 (xmin, xmax, ymin, ymax)"
[5] "crs : +proj=longlat +datum=WGS84 +no_defs "
[6] "source : memory"
[7] "names : layer.1, layer.2, layer.3, layer.4, layer.5, layer.6, layer.7, layer.8, layer.9, layer.10, layer.11, layer.12, layer.13, layer.14, layer.15, ... "
[8] "min values : 3.5873966, 3.6652387, 23.3621691, 26.7197371, 54.5738134, 83.2697010, 93.0016851, 91.4133866, 49.0935838, 34.9640317, 5.8351220, 0.5116812, 4.7784175, 7.1974645, 19.2179436, ... "
[9] "max values : 40.66180, 48.18053, 80.11776, 100.17784, 125.64010, 160.57205, 165.80086, 155.79110, 111.13910, 80.95558, 44.86891, 37.12437, 40.82948, 48.49249, 74.67580, ... "
[10] "time : 2001-01-01, 2010-12-01 (min, max)"
To make a time series out of our data, we use the
fldmean function, which has one argument x.
Let’s generate the time series for our three different GLDAS CLSM data sets (Global, Mediterranean region, and Spain), and inspect its first 12 rows:
date value
1: 2000-01-01 42.63418
2: 2000-02-01 40.28064
3: 2000-03-01 46.65724
4: 2000-04-01 49.73078
5: 2000-05-01 61.78450
6: 2000-06-01 71.51643
7: 2000-07-01 78.34947
8: 2000-08-01 68.59857
9: 2000-09-01 52.40877
10: 2000-10-01 45.95624
11: 2000-11-01 40.95821
12: 2000-12-01 41.50710
date value
1: 2001-01-01 14.47589
2: 2001-02-01 19.65537
3: 2001-03-01 38.58488
4: 2001-04-01 45.47299
5: 2001-05-01 57.83225
6: 2001-06-01 63.57403
7: 2001-07-01 51.30824
8: 2001-08-01 41.88030
9: 2001-09-01 29.30722
10: 2001-10-01 24.02233
11: 2001-11-01 16.56476
12: 2001-12-01 12.67189
date value
1: 2001-01-01 17.99823
2: 2001-02-01 31.41443
3: 2001-03-01 57.23334
4: 2001-04-01 84.13048
5: 2001-05-01 95.06479
6: 2001-06-01 118.33516
7: 2001-07-01 87.58777
8: 2001-08-01 74.37666
9: 2001-09-01 45.09689
10: 2001-10-01 43.91893
11: 2001-11-01 25.11206
12: 2001-12-01 16.99089
Let’s generate the time series for our three different PET calculated by Oudin method (Global, Mediterranean region, and Spain), and inspect its first 12 rows:
date value
1: 2001-01-01 90.97581
2: 2001-02-01 90.72542
3: 2001-03-01 100.12134
4: 2001-04-01 96.08822
5: 2001-05-01 105.25369
6: 2001-06-01 110.88759
7: 2001-07-01 119.98619
8: 2001-08-01 112.29808
9: 2001-09-01 94.00018
10: 2001-10-01 89.70338
11: 2001-11-01 82.71571
12: 2001-12-01 90.02744
date value
1: 2001-01-01 28.41624
2: 2001-02-01 34.31941
3: 2001-03-01 70.77386
4: 2001-04-01 85.68093
5: 2001-05-01 119.92428
6: 2001-06-01 146.10311
7: 2001-07-01 161.36373
8: 2001-08-01 147.05941
9: 2001-09-01 105.36592
10: 2001-10-01 73.91439
11: 2001-11-01 37.36657
12: 2001-12-01 24.99642
date value
1: 2001-01-01 23.33118
2: 2001-02-01 28.85419
3: 2001-03-01 57.39954
4: 2001-04-01 71.95438
5: 2001-05-01 101.37500
6: 2001-06-01 134.48663
7: 2001-07-01 139.09158
8: 2001-08-01 131.25821
9: 2001-09-01 87.49866
10: 2001-10-01 59.00385
11: 2001-11-01 25.24541
12: 2001-12-01 16.16446
Either after we have processed our data as required or right after downloaded, we have six different options to visualize our data for more information refer to visualisation section of pRecipe:
To see a map of any data set raw or processed, we use
plot_map which takes only one layer of the RasterBrick as
input.
To draw a time series generated by fldmean, we use any
of the options below, which takes only a fldmean “.csv”
generated file.
p01 <- plot_line(gldas_clsm_global_ts, var = "Evapotranspiration")
p02 <- plot_line(pet_oudin_global_ts, var = "Potential Evapotranspiration")
ggpubr::ggarrange(p01, p02, ncol = 1)p01 <- plot_box(gldas_clsm_global_ts, var = "ET")
p02 <- plot_box(pet_oudin_global_ts, var = "PET")
ggpubr::ggarrange(p01, p02, ncol = 2)p01 <- plot_density(gldas_clsm_global_ts, var = "ET")
p02 <- plot_density(pet_oudin_global_ts, var = "PET")
ggpubr::ggarrange(p01, p02, ncol = 2)NOTE: For good aesthetics we recommend saving
plot_summary with
ggsave(<filename>, <plot>, width = 16.3, height = 15.03).
plot_summary(gldas_clsm_global_ts, var = "Evapotranspiration")
#plot_summary(gldas_clsm_subset_ts, var = "Evapotranspiration")
#plot_summary(gldas_clsm_esp_ts, var = "Evapotranspiration")
#plot_summary(pet_oudin_global_ts, var = "Potential Evapotranspiration")
#plot_summary(pet_oudin_subset_ts)
#plot_summary(pet_oudin_esp_ts)We will introduce significant enhancements to ET database and PET calculation methods. This expansion builds upon our existing temperature-based approach and incorporates a radiation-based PET calculation methods, along with an expanded range of temperature-based methods. Our aim is to provide users with a more comprehensive and accurate estimation of ET and PET, catering to a broader range of applications and requirements.