The function RRphylogeography is designed to find the
area of origin (AOO) and/or identify suitable area of contact between a
pair of target species, taking into account their evolutionary history
and environmental preferences. To run RRphylogeography, the
user must provide the presence datapoints and the habitat suitability
(HS) maps for both target species. The outputs of
RRphylogeography consists of relative probability of
occurrence (RPO) maps, which are calculated based on three elements: the
habitat suitability values in each grid cell, the kernel density
estimations derived from the occurrences, and the spatial distance
(inclusive of the path cost for moving) between the most suitable cells
of the two target species. The user can customize several parameters in
RRphylogeography to account for different phylogenetic
relationships between the species pair (i.e. whether budding
cladogenesis, branching cladogenesis or anagenesis are suspected to have
occurred) and to adjust the conductance matrix which affects the
difficulty of moving across the landscape.
In the following chunks, we provide a detailed walkthrough on how to
format data and run RRphylogeography to identify the
suitable area of contact between Ursus arctos and Ursus
maritimus (Mondanaro et al. 2025) at 35 kya.
Important: RRphylogeography is able to
work with the habitat suitability (HS) maps generated by any Species
Distribution Models (SDMs) algorithm, as long as they are in the 0-1
range. This flexibility allows users to calibrate their SDMs using
statistical/regression models, similarity/envelope models, and machine
learning methods, depending on their preferred approach. By
accommodating a wide range of modeling techniques,
RRphylogeography ensures broad applicability across
different ecological and evolutionary studies.
Other than the name of the target species (i.e., spec1
and spec2 arguments), RRphylogeography
requires two mandatory arguments:
pred: list of SpatRaster objects
containing the habitat suitability maps for both species.occs: list of sf::data.frame objects
containing the presence data for both species.Both lists must be named and ordered according to spec1
and spec2.
Note: RRphylogeography can also process
a list of stacked SpatRaster objects. This is particularly
useful when users have SDMs projected across different time frames as it
allows to input the habitat suitability maps all together.
library(RRgeo)
library(terra)
library(sf)
rast(system.file("exdata/U.arctos_suitability.tif", package="RRgeo"))->map1
rast(system.file("exdata/U.maritimus_suitability.tif", package="RRgeo"))->map2
load(system.file("exdata/Ursus_occurrences.RDa", package="RRgeo"))list(Ursus_arctos=map1,Ursus_maritimus=map2)->pred
list(Ursus_arctos=occs_arctos,Ursus_maritimus=occs_marit)->occs
head(occs$Ursus_arctos)
#> Simple feature collection with 6 features and 7 fields
#> Geometry type: POINT
#> Dimension: XY
#> Bounding box: xmin: -6268427 ymin: 4829599 xmax: 4561482 ymax: 7759100
#> Projected CRS: PROJCRS["unknown",
#> BASEGEOGCRS["unknown",
#> DATUM["D_Unknown_based_on_WGS84_ellipsoid",
#> ELLIPSOID["WGS 84",6378137,298.257223563,
#> LENGTHUNIT["metre",1],
#> ID["EPSG",7030]]],
#> PRIMEM["Greenwich",0,
#> ANGLEUNIT["Degree",0.0174532925199433]]],
#> CONVERSION["unnamed",
#> METHOD["Mollweide"],
#> PARAMETER["Longitude of natural origin",0,
#> ANGLEUNIT["Degree",0.0174532925199433],
#> ID["EPSG",8802]],
#> PARAMETER["False easting",0,
#> LENGTHUNIT["metre",1],
#> ID["EPSG",8806]],
#> PARAMETER["False northing",0,
#> LENGTHUNIT["metre",1],
#> ID["EPSG",8807]]],
#> CS[Cartesian,2],
#> AXIS["(E)",east,
#> ORDER[1],
#> LENGTHUNIT["metre",1,
#> ID["EPSG",9001]]],
#> AXIS["(N)",north,
#> ORDER[2],
#> LENGTHUNIT["metre",1,
#> ID["EPSG",9001]]]]
#> OBS TIME_factor bio4 bio5 bio15 bio18 bio19
#> 1 1 0.000 2231.449 15.892 44.463 1298.594 545.262
#> 2 1 0.000 569.361 25.074 47.080 1285.298 3269.817
#> 72 1 0.001 1834.911 10.961 42.652 1014.167 528.589
#> 108 1 0.002 1615.390 19.783 27.550 1226.768 1046.242
#> 109 1 0.002 1614.916 20.613 23.908 1143.378 1053.529
#> 110 1 0.002 1705.699 19.869 36.797 1459.522 1003.189
#> geometry
#> 1 POINT (4561482 7500426)
#> 2 POINT (-444307.6 4829599)
#> 72 POINT (-6268427 7759100)
#> 108 POINT (4152689 6174048)
#> 109 POINT (4151493 6136813)
#> 110 POINT (4365559 6321109)The standard format for individual elements of occs must
include the column containing species occurrence data in binary format
(“OBS”) and the “geometry” column. Additionally, if available, a column
representing the time intervals associated with each species presence
may be added (“TIME_factor”). All other columns in the
data.frame are ignored.
RRphylogeography also accepts:
th: threshold value for defining the most suitable
cells for both target species. RRphylogeography function
considers all the habitat suitability values greater than 0 which exceed
the th quantile as the suitable cells to consider. This
step is crucial because the subsequent distance calculations are
restricted to these most suitable cells only. The latter procedure
relies on the R package * leastcostpath (Lewis, 2023).mask_for_pred: a SpatRaster object used to
define the spatial extent of RRphylogeography outputs.resistance_map: a SpatRaster object to
determine the difficulty of moving across the landscapetime_col: the name of the column containing time
intervals associated to species presence records. If
time_col is indicated, the function incorporates time
intervals as weights when calculating the kernel density.kde_inversion: TRUE/FALSE, this sets how
time intervals should be used as weights when calculating the kernel
density. This depends on the mode of speciation thought to link
spec1 to spec2 and the research objective. If
the study assumes an anagenetic process linking the target species, the
AOO of the descendant species becomes the focus. In this case, setting
kde_inversion = TRUE is appropriate as it gives higher weights to
younger occurrences of the ancestor species. On the contrary, if the
goal is to identify the AOO or the past contact of two coeval species,
temporal weights should be considered similar, making
kde_inversion unnecessary.standardize: if TRUE, the output maps are
rescaled between 0 and 1.yourdir<-"YOUR_DIRECTORY"
setwd(yourdir)
RRphylogeography(spec1="Ursus_arctos",
spec2="Ursus_maritimus",
pred=pred,
occs=occs,
aggr=2,
time_col="TIME_factor",
kde_inversion=FALSE,
resistance_map=NULL,
clust=0.5,
plot=FALSE,
mask_for_pred=NULL,
standardize=TRUE,
output.dir=yourdir)Important: The computational time of
RRphylogeography is influenced by several factors,
including the number of suitable cells for each target species, the
th value used to define the most suitable cells, and the
number of habitat suitability layers. To speed up the procedure, we
strongly recommend aggregating the habitat suitability (HS) maps
provided as pred by setting the aggr argument.
Since the primary goal of RRphylogeography is to identify
the area of origin (rather than a single cell of origin) between two
target species, using very high spatial resolution maps may not be
necessary. The spatial aggregation can drastically reduce computational
time without compromising the analysis. Alternatively, users can
increase the th values or restrict the study area by using
the mask_for_pred argument.
RRphylogeography outputsRRphylogeography creates within output.dir
an output folder named by combining the names of the two target species.
Therein, RRphylogeography stores three
SpatRaster maps.
spec1spec2Both these files include:
The names of the three output files are a combination of species names and the names of the HS maps.
Here, we plot the RPO_combined map of U. arctos and U.
maritimus at 35 kya. The RPO_combined map returned by
RRphylogeography was enhanced to replicate the one as in
Mondanaro et al. (2025).