Almer
functionThe Almer
function is a modification/hack of the lmer
function in the lme4
package to incorporate correlated effects in the random structure. The Almer
function can be used to fit phylogenetic mixed models (and other models with correlated random effects such as “animal models”).
To start, we need an ultrametric phylogeny of unit depth. We can construct this, for example, using the function rtree
of the ape
package.
# Only a very small sample size is used
# in the interest of computational speed:
set.seed(57)
n_species <- 50
tree <- ape::rtree(n = n_species)
tree <- ape::chronopl(tree, lambda = 1)
From this we can generate the phylogenetic relatedness matrix A.
The column names of A must be the species identifier.
From this we can simulate a Brownian motion process and add some residual noise.
y <- 5 + t(chol(A))%*%rnorm(n_species, 0, 2) + # BM process with mean = 5 and sd = 2
rnorm(n_species, 0, 1) # residual variation with sd = 1
For Almer
to work, the data must include the species identifier in addition to the species means.
Almer
can then be used to estimate the means and variances of the process.
mod <- Almer(y ~ 1 + (1|species), data = dt, A = list(species = A))
summary(mod)
#> Linear mixed model fit by REML ['lmerMod']
#> Formula: y ~ 1 + (1 | species)
#> Data: dt
#>
#> REML criterion at convergence: 209.4
#>
#> Scaled residuals:
#> Min 1Q Median 3Q Max
#> -1.57466 -0.59782 -0.02876 0.49887 1.75585
#>
#> Random effects:
#> Groups Name Variance Std.Dev.
#> species (Intercept) 1.862 1.364
#> Residual 2.443 1.563
#> Number of obs: 50, groups: species, 50
#>
#> Fixed effects:
#> Estimate Std. Error t value
#> (Intercept) 5.0252 0.4654 10.8
The Almer
function is flexible (it is based on lmer
), and can include additional fixed and random effects on top of the phylogenetic effects. Also, it is not restricted to phylogeny related problems, for example it can be used to estimate additive genetic variances and/or dominance variances (the argument A
can have several entries).
Almer_SE
functionThis function extends Almer
by allowing the inclusion of the uncertainty of the species means. To do this, we take advantage of how weights are included in the lmer
function: the diagonal of the residual co-variance matrix is the residual variance parameter \(\sigma^2\) times the vector of inverse weights. By using weights equal to \(1/(1+SE^2/\sigma^2)\), where \(SE\) is a vector of standard errors, the diagonal of the residual co-variance matrix is \(\sigma^2 + SE^2\). Because the weights include the residual variance parameter, the function uses an iterative approach.
To illustrate the approach, we add some arbitrary SE-values to the data
Almer_SE
can then be used to estimate the means and variances of the process taking the uncertainty into account.
mod_SE <- Almer_SE(y ~ 1 + (1|species), data = dt, SE = dt$SE, A = list(species = A))
summary(mod_SE)
#> Linear mixed model fit by REML ['lmerMod']
#> Formula: y ~ 1 + (1 | species)
#> Data: dt
#>
#> REML criterion at convergence: 209.4
#>
#> Scaled residuals:
#> Min 1Q Median 3Q Max
#> -1.57461 -0.59779 -0.02875 0.49885 1.75577
#>
#> Random effects:
#> Groups Name Variance Std.Dev.
#> species (Intercept) 1.862 1.365
#> Residual 2.443 1.563
#> Number of obs: 50, groups: species, 50
#>
#> Fixed effects:
#> Estimate Std. Error t value
#> (Intercept) 5.0252 0.4654 10.8
Note that the estimated residual variances represent the residual variance after correcting for the uncertainty in the means. Thus, this function can be useful for meta analyses.
The Almer_sim
function can be used to simulate the responses, in our case species means, of the fitted models of both Almer
and Almer_SE
. Note that the lme4::simulate.merMod
function did not seem to work properly when the number of random effects equal the number of observations, and could therefore not be used.
sim_y <- Almer_sim(mod, nsim = 3)
sim_y[1:3,]
#> 3 x 3 Matrix of class "dgeMatrix"
#> Sim_1 Sim_2 Sim_3
#> 1 5.171605 7.314536 3.453942
#> 2 1.901284 3.038564 6.394652
#> 3 6.275471 6.200653 9.409682
This can further be used to do bootstrapping, implemented in Almer_boot
.
# The number of bootstrap simulations is kept very low in the interest
# of computational speed. Often 1000 is used in real analyses.
Almer_boot_obj <- Almer_boot(mod, nsim = 10)
Almer_boot_obj$fixef
#> Mean Std. Err. 2.5% 97.5%
#> (Intercept) 4.984573 0.4847826 4.465023 5.788567
Almer_boot_obj$vcov
#> Mean Std. Err. 2.5% 97.5%
#> species 2.103980 1.820257 0.000000 4.758393
#> Residual 2.475891 1.672111 0.136031 5.004038
The phylH
function can be used to estimate the phylogenetic heritability of a object fitted by Almer
. The 95% confidence interval is estimated by parametric bootstrapping. Both the name of the numerator of the heritability and, unless the phylogenetic residual is estimated as residuals in the model fit, the name of the phylogenetic residuals need to be specified.