Calculating the average nLTT plot of multiple phylogenies is not a trivial tasks.
The function get_nltt_values collects the nLTT values of a collection of phylogenies as tidy data.
This allows for a good interplay with ggplot2.
Create two easy trees:
newick1 <- "((A:1,B:1):2,C:3);"
newick2 <- "((A:2,B:2):1,C:3);"
phylogeny1 <- ape::read.tree(text = newick1)
phylogeny2 <- ape::read.tree(text = newick2)
phylogenies <- c(phylogeny1, phylogeny2)There are very similar. phylogeny1 has short tips:
ape::plot.phylo(phylogeny1)
ape::add.scale.bar() #nolintThis can be observed in the nLTT plot:
nLTT::nltt_plot(phylogeny1, ylim = c(0, 1))As a collection of timepoints:
t <- nLTT::get_phylogeny_nltt_matrix(phylogeny1)
knitr::kable(t)| time | N |
|---|---|
| 0.0000000 | 0.3333333 |
| 0.6666667 | 0.6666667 |
| 1.0000000 | 1.0000000 |
Plotting those timepoints:
df <- as.data.frame(nLTT::get_phylogeny_nltt_matrix(phylogeny1))
ggplot2::qplot(
time, N, data = df, geom = "step", ylim = c(0, 1), direction = "vh",
main = "NLTT plot of phylogeny 1"
)phylogeny2 has longer tips:
ape::plot.phylo(phylogeny2)
ape::add.scale.bar() #nolintAlso this can be observed in the nLTT plot:
nLTT::nltt_plot(phylogeny2, ylim = c(0, 1))As a collection of timepoints:
t <- nLTT::get_phylogeny_nltt_matrix(phylogeny2)
knitr::kable(t)| time | N |
|---|---|
| 0.0000000 | 0.3333333 |
| 0.3333333 | 0.6666667 |
| 1.0000000 | 1.0000000 |
Plotting those timepoints:
df <- as.data.frame(nLTT::get_phylogeny_nltt_matrix(phylogeny2))
ggplot2::qplot(
time, N, data = df, geom = "step", ylim = c(0, 1), direction = "vh",
main = "NLTT plot of phylogeny 2"
)The average nLTT plot should be somewhere in the middle.
It is constructed from stretched nLTT matrices.
Here is the nLTT matrix of the first phylogeny:
t <- nLTT::stretch_nltt_matrix(
nLTT::get_phylogeny_nltt_matrix(phylogeny1), dt = 0.20, step_type = "upper"
)
knitr::kable(t)| 0.0 | 0.6666667 |
| 0.2 | 0.6666667 |
| 0.4 | 0.6666667 |
| 0.6 | 0.6666667 |
| 0.8 | 1.0000000 |
| 1.0 | 1.0000000 |
Here is the nLTT matrix of the second phylogeny:
t <- nLTT::stretch_nltt_matrix(
nLTT::get_phylogeny_nltt_matrix(phylogeny2), dt = 0.20, step_type = "upper"
)
knitr::kable(t)| 0.0 | 0.6666667 |
| 0.2 | 0.6666667 |
| 0.4 | 1.0000000 |
| 0.6 | 1.0000000 |
| 0.8 | 1.0000000 |
| 1.0 | 1.0000000 |
Here is the average nLTT matrix of both phylogenies:
t <- nLTT::get_average_nltt_matrix(phylogenies, dt = 0.20)
knitr::kable(t)| 0.0 | 0.6666667 |
| 0.2 | 0.6666667 |
| 0.4 | 0.8333333 |
| 0.6 | 0.8333333 |
| 0.8 | 1.0000000 |
| 1.0 | 1.0000000 |
Observe how the numbers get averaged.
The same, now shown as a plot:
nLTT::nltts_plot(phylogenies, dt = 0.20, plot_nltts = TRUE)Here a demo how the new function works:
t <- nLTT::get_nltt_values(c(phylogeny1, phylogeny2), dt = 0.2)
knitr::kable(t)| id | t | nltt |
|---|---|---|
| 1 | 0.0 | 0.6666667 |
| 1 | 0.2 | 0.6666667 |
| 1 | 0.4 | 0.6666667 |
| 1 | 0.6 | 0.6666667 |
| 1 | 0.8 | 1.0000000 |
| 1 | 1.0 | 1.0000000 |
| 2 | 0.0 | 0.6666667 |
| 2 | 0.2 | 0.6666667 |
| 2 | 0.4 | 1.0000000 |
| 2 | 0.6 | 1.0000000 |
| 2 | 0.8 | 1.0000000 |
| 2 | 1.0 | 1.0000000 |
Plotting options, first create a data frame:
df <- nLTT::get_nltt_values(c(phylogeny1, phylogeny2), dt = 0.01)Here we see an averaged nLTT plot, where the original nLTT values are still visible:
ggplot2::qplot(
t, nltt, data = df, geom = "point", ylim = c(0, 1),
main = "Average nLTT plot of phylogenies", color = id, size = I(0.1)
) + ggplot2::stat_summary(
fun.data = "mean_cl_boot", color = "red", geom = "smooth"
)## Warning: Computation failed in `stat_summary()`:
## object 'mean_cl_boot' of mode 'function' was not found
Here we see an averaged nLTT plot, with the original nLTT values omitted:
ggplot2::qplot(t, nltt, data = df, geom = "blank", ylim = c(0, 1),
main = "Average nLTT plot of phylogenies"
) + ggplot2::stat_summary(
fun.data = "mean_cl_boot", color = "red", geom = "smooth"
)## Warning: Computation failed in `stat_summary()`:
## object 'mean_cl_boot' of mode 'function' was not found
Create two harder trees:
newick1 <- "((A:1,B:1):1,(C:1,D:1):1);"
newick2 <- paste0("((((XD:1,ZD:1):1,CE:2):1,(FE:2,EE:2):1):4,((AE:1,BE:1):1,",
"(WD:1,YD:1):1):5);"
)
phylogeny1 <- ape::read.tree(text = newick1)
phylogeny2 <- ape::read.tree(text = newick2)
phylogenies <- c(phylogeny1, phylogeny2)There are different. phylogeny1 is relatively simple, with two branching events happening at the same time:
ape::plot.phylo(phylogeny1)
ape::add.scale.bar() #nolintThis can be observed in the nLTT plot:
nLTT::nltt_plot(phylogeny1, ylim = c(0, 1))As a collection of timepoints:
t <- nLTT::get_phylogeny_nltt_matrix(phylogeny2)
knitr::kable(t)| time | N |
|---|---|
| 0.0000000 | 0.1111111 |
| 0.5714286 | 0.2222222 |
| 0.7142857 | 0.3333333 |
| 0.7142857 | 0.4444444 |
| 0.7142857 | 0.5555556 |
| 0.8571429 | 0.6666667 |
| 0.8571429 | 0.7777778 |
| 0.8571429 | 0.8888889 |
| 1.0000000 | 1.0000000 |
phylogeny2 is more elaborate:
ape::plot.phylo(phylogeny2)
ape::add.scale.bar() #nolintAlso this can be observed in the nLTT plot:
nLTT::nltt_plot(phylogeny2, ylim = c(0, 1))As a collection of timepoints:
t <- nLTT::get_phylogeny_nltt_matrix(phylogeny2)
knitr::kable(t)| time | N |
|---|---|
| 0.0000000 | 0.1111111 |
| 0.5714286 | 0.2222222 |
| 0.7142857 | 0.3333333 |
| 0.7142857 | 0.4444444 |
| 0.7142857 | 0.5555556 |
| 0.8571429 | 0.6666667 |
| 0.8571429 | 0.7777778 |
| 0.8571429 | 0.8888889 |
| 1.0000000 | 1.0000000 |
The average nLTT plot should be somewhere in the middle.
It is constructed from stretched nLTT matrices.
Here is the nLTT matrix of the first phylogeny:
t <- nLTT::stretch_nltt_matrix(
nLTT::get_phylogeny_nltt_matrix(phylogeny1), dt = 0.20, step_type = "upper"
)
knitr::kable(t)| 0.0 | 0.5 |
| 0.2 | 0.5 |
| 0.4 | 0.5 |
| 0.6 | 1.0 |
| 0.8 | 1.0 |
| 1.0 | 1.0 |
Here is the nLTT matrix of the second phylogeny:
t <- nLTT::stretch_nltt_matrix(
nLTT::get_phylogeny_nltt_matrix(phylogeny2), dt = 0.20, step_type = "upper"
)
knitr::kable(t)| 0.0 | 0.2222222 |
| 0.2 | 0.2222222 |
| 0.4 | 0.2222222 |
| 0.6 | 0.3333333 |
| 0.8 | 0.6666667 |
| 1.0 | 1.0000000 |
Here is the average nLTT matrix of both phylogenies:
t <- nLTT::get_average_nltt_matrix(phylogenies, dt = 0.20)
knitr::kable(t)| 0.0 | 0.3611111 |
| 0.2 | 0.3611111 |
| 0.4 | 0.3611111 |
| 0.6 | 0.6666667 |
| 0.8 | 0.8333333 |
| 1.0 | 1.0000000 |
Observe how the numbers get averaged.
Here a demo how the new function works:
t <- nLTT::get_nltt_values(c(phylogeny1, phylogeny2), dt = 0.2)
knitr::kable(t)| id | t | nltt |
|---|---|---|
| 1 | 0.0 | 0.5000000 |
| 1 | 0.2 | 0.5000000 |
| 1 | 0.4 | 0.5000000 |
| 1 | 0.6 | 1.0000000 |
| 1 | 0.8 | 1.0000000 |
| 1 | 1.0 | 1.0000000 |
| 2 | 0.0 | 0.2222222 |
| 2 | 0.2 | 0.2222222 |
| 2 | 0.4 | 0.2222222 |
| 2 | 0.6 | 0.3333333 |
| 2 | 0.8 | 0.6666667 |
| 2 | 1.0 | 1.0000000 |
Plotting options, first create a data frame:
df <- nLTT::get_nltt_values(c(phylogeny1, phylogeny2), dt = 0.01)Here we see an averaged nLTT plot, where the original nLTT values are still visible:
ggplot2::qplot(
t, nltt, data = df, geom = "point", ylim = c(0, 1),
main = "Average nLTT plot of phylogenies", color = id, size = I(0.1)
) + ggplot2::stat_summary(
fun.data = "mean_cl_boot", color = "red", geom = "smooth"
)## Warning: Computation failed in `stat_summary()`:
## object 'mean_cl_boot' of mode 'function' was not found
Here we see an averaged nLTT plot, with the original nLTT values omitted:
ggplot2::qplot(t, nltt, data = df, geom = "blank", ylim = c(0, 1),
main = "Average nLTT plot of phylogenies"
) + ggplot2::stat_summary(
fun.data = "mean_cl_boot", color = "red", geom = "smooth"
)## Warning: Computation failed in `stat_summary()`:
## object 'mean_cl_boot' of mode 'function' was not found
Create three random trees:
set.seed(42)
phylogeny1 <- ape::rcoal(10)
phylogeny2 <- ape::rcoal(20)
phylogeny3 <- ape::rcoal(30)
phylogeny4 <- ape::rcoal(40)
phylogeny5 <- ape::rcoal(50)
phylogeny6 <- ape::rcoal(60)
phylogeny7 <- ape::rcoal(70)
phylogenies <- c(
phylogeny1, phylogeny2, phylogeny3,
phylogeny4, phylogeny5, phylogeny6, phylogeny7
)Here a demo how the new function works:
t <- nLTT::get_nltt_values(phylogenies, dt = 0.2)
knitr::kable(t)| id | t | nltt |
|---|---|---|
| 1 | 0.0 | 0.2000000 |
| 1 | 0.2 | 0.2000000 |
| 1 | 0.4 | 0.2000000 |
| 1 | 0.6 | 0.2000000 |
| 1 | 0.8 | 0.3000000 |
| 1 | 1.0 | 1.0000000 |
| 2 | 0.0 | 0.1000000 |
| 2 | 0.2 | 0.1000000 |
| 2 | 0.4 | 0.1000000 |
| 2 | 0.6 | 0.1000000 |
| 2 | 0.8 | 0.2000000 |
| 2 | 1.0 | 1.0000000 |
| 3 | 0.0 | 0.0666667 |
| 3 | 0.2 | 0.0666667 |
| 3 | 0.4 | 0.1000000 |
| 3 | 0.6 | 0.1333333 |
| 3 | 0.8 | 0.2333333 |
| 3 | 1.0 | 1.0000000 |
| 4 | 0.0 | 0.0500000 |
| 4 | 0.2 | 0.0500000 |
| 4 | 0.4 | 0.0500000 |
| 4 | 0.6 | 0.1000000 |
| 4 | 0.8 | 0.2750000 |
| 4 | 1.0 | 1.0000000 |
| 5 | 0.0 | 0.0400000 |
| 5 | 0.2 | 0.0600000 |
| 5 | 0.4 | 0.0600000 |
| 5 | 0.6 | 0.0600000 |
| 5 | 0.8 | 0.1000000 |
| 5 | 1.0 | 1.0000000 |
| 6 | 0.0 | 0.0333333 |
| 6 | 0.2 | 0.0333333 |
| 6 | 0.4 | 0.0666667 |
| 6 | 0.6 | 0.0666667 |
| 6 | 0.8 | 0.0833333 |
| 6 | 1.0 | 1.0000000 |
| 7 | 0.0 | 0.0285714 |
| 7 | 0.2 | 0.0285714 |
| 7 | 0.4 | 0.0285714 |
| 7 | 0.6 | 0.0428571 |
| 7 | 0.8 | 0.1000000 |
| 7 | 1.0 | 1.0000000 |
Here we see an averaged nLTT plot, where the original nLTT values are still visible:
ggplot2::qplot(t, nltt, data = df, geom = "point", ylim = c(0, 1),
main = "Average nLTT plot of phylogenies", color = id, size = I(0.1)
) + ggplot2::stat_summary(
fun.data = "mean_cl_boot", color = "red", geom = "smooth"
)## Warning: Computation failed in `stat_summary()`:
## object 'mean_cl_boot' of mode 'function' was not found
Here we see an averaged nLTT plot, with the original nLTT values omitted:
ggplot2::qplot(t, nltt, data = df, geom = "blank", ylim = c(0, 1),
main = "Average nLTT plot of phylogenies"
) + ggplot2::stat_summary(
fun.data = "mean_cl_boot", color = "red", geom = "smooth"
)## Warning: Computation failed in `stat_summary()`:
## object 'mean_cl_boot' of mode 'function' was not found