The metafor Package

Berkey et al. (1998)

The Methods and Data

Berkey et al. (1998) describe a meta-analytic multivariate model for the analysis of multiple correlated outcomes. The use of the model is illustrated with results from 5 trials comparing surgical and non-surgical treatments for medium-severity periodontal disease. Reported outcomes include the change in probing depth (PD) and attachment level (AL) one year after the treatment. The effect size measure used for this meta-analysis was the (raw) mean difference, calculated in such a way that positive values indicate that surgery was more effective than non-surgical treatment in decreasing the probing depth and increasing the attachment level. The data are provided in Table I in the article and are stored in the dataset dat.berkey1998 that comes with the metafor package:

library(metafor)
dat <- dat.berkey1998
dat

(I copy the dataset into dat, which is a bit shorter and therefore easier to type further below). The contents of the dataset are:

   trial           author year ni outcome    yi    v1i    v2i
1      1 Pihlstrom et al. 1983 14      PD  0.47 0.0075 0.0030
2      1 Pihlstrom et al. 1983 14      AL -0.32 0.0030 0.0077
3      2    Lindhe et al. 1982 15      PD  0.20 0.0057 0.0009
4      2    Lindhe et al. 1982 15      AL -0.60 0.0009 0.0008
5      3   Knowles et al. 1979 78      PD  0.40 0.0021 0.0007
6      3   Knowles et al. 1979 78      AL -0.12 0.0007 0.0014
7      4  Ramfjord et al. 1987 89      PD  0.26 0.0029 0.0009
8      4  Ramfjord et al. 1987 89      AL -0.31 0.0009 0.0015
9      5    Becker et al. 1988 16      PD  0.56 0.0148 0.0072
10     5    Becker et al. 1988 16      AL -0.39 0.0072 0.0304

So, the results from the various trials indicate that surgery is preferable for reducing the probing depth, while non-surgical treatment is preferable for increasing the attachment level.

Since each trial provides effect size estimates for both outcomes, the estimates are correlated. The v1i and v2i values are the variances and covariances of the observed effects. In particular, for each study, variables v1i and v2i form a 2×2 variance-covariance matrix of the observed effects, with the diagonal elements corresponding to the sampling variances of the mean differences (the first for probing depth, the second for attachment level) and the off-diagonal value corresponding to the covariance of the two mean differences.

Variance-Covariance Matrix

Before we can proceed with the model fitting, we need to construct the full (block-diagonal) variance-covariance for all studies from these two variables. We can do this using the vcalc() function in one line of code:

V <- vcalc(vi=1, cluster=author, rvars=c(v1i, v2i), data=dat)

The V matrix is then equal to:

        [,1]   [,2]   [,3]   [,4]   [,5]   [,6]   [,7]   [,8]   [,9]  [,10]
 [1,] 0.0075 0.0030 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
 [2,] 0.0030 0.0077 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
 [3,] 0.0000 0.0000 0.0057 0.0009 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
 [4,] 0.0000 0.0000 0.0009 0.0008 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
 [5,] 0.0000 0.0000 0.0000 0.0000 0.0021 0.0007 0.0000 0.0000 0.0000 0.0000
 [6,] 0.0000 0.0000 0.0000 0.0000 0.0007 0.0014 0.0000 0.0000 0.0000 0.0000
 [7,] 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0029 0.0009 0.0000 0.0000
 [8,] 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0009 0.0015 0.0000 0.0000
 [9,] 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0148 0.0072
[10,] 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0072 0.0304

Multivariate Random-Effects Model

A multivariate random-effects model can now be used to meta-analyze the two outcomes simultaneously. The model is given by $$\left[ \begin{array}{c} y_{PD,i}\\ y_{AL,i} \end{array} \right] = \left[ \begin{array}{c} \mu_{PD} \\ \mu_{AL} \end{array} \right] + \left[ \begin{array}{c} u_{PD,i} \\ u_{AL,i} \end{array} \right] + \left[ \begin{array}{c} \epsilon_{PD,i} \\ \epsilon_{AL,i} \end{array} \right],$$ where $$G = \mbox{Var} \left[ \begin{array}{c} u_{PD,i} \\ u_{AL,i} \end{array} \right] = \left[ \begin{array}{cc} \tau^2_{PD} & \rho \tau_{PD} \tau_{AL} \\ \rho \tau_{PD} \tau_{AL} & \tau^2_{AL} \end{array} \right]$$ and $$V_i = \mbox{Var} \left[ \begin{array}{c} \epsilon_{PD,i} \\ \epsilon_{AL,i} \end{array} \right] = \left[ \begin{array}{cc} \mbox{var}_{PD,i} & \mbox{cov}_{PD,i;AL,i} \\ \mbox{cov}_{PD,i;AL,i} & \mbox{var}_{AL,i} \end{array} \right],$$ where the elements in this second variance-covariance matrix are given by the $2 \times 2$ blocks along the diagonal in the V matrix. Therefore, $$V = \left[ \begin{array}{ccc} V_1 & & \\ & \ddots & \\ & & V_5 \end{array} \right].$$

We can fit this model with:

res <- rma.mv(yi, V, mods = ~ 0 + outcome,
              random = ~ outcome | trial, struct="UN",
              data=dat, method="ML")
print(res, digits=3)

Multivariate Meta-Analysis Model (k = 10; method: ML)
 
Variance Components:
 
outer factor: trial   (nlvls = 5)
inner factor: outcome (nlvls = 2)
 
           estim   sqrt  k.lvl  fixed  level
tau^2.1    0.026  0.162      5     no     AL
tau^2.2    0.007  0.084      5     no     PD
 
    rho.AL  rho.PD    AL  PD
AL       1   0.699     -  no
PD   0.699       1     5   -
 
Test for Residual Heterogeneity:
QE(df = 8) = 128.227, p-val < .001
 
Test of Moderators (coefficient(s) 1,2):
QM(df = 2) = 155.772, p-val < .001
 
Model Results:
 
           estimate     se    zval   pval   ci.lb   ci.ub
outcomeAL    -0.338  0.080  -4.237  <.001  -0.494  -0.182  ***
outcomePD     0.345  0.049   6.972  <.001   0.248   0.442  ***
 
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

This is what Berkey et al. (1998) call a multivariate maximum likelihood (MML) random-effects model.¹⁾ Note that rma.mv() uses REML estimation by default, so method="ML" must be explicitly requested. The random = ~ outcome | trial part adds random effects for each outcome within each trial to the model. With struct="UN", the random effects are allowed to have different variances for each outcome and are allowed to be correlated.

The results show that the amount of heterogeneity in the attachment level (AL) outcome (i.e., $\hat{\tau}_{AL}^2 = .026$) is larger than the amount of heterogeneity in the probing depth (PD) outcome (i.e., $\hat{\tau}_{PD}^2 = .007$). Furthermore, the true outcomes appear to correlate quite strongly (i.e., $\hat{\rho} = .70$). On average, surgery is estimated to lead to significantly greater decreases in probing depth (i.e., $\hat{\mu}_{PB} = .35$), but non-surgery is more effective in increasing the attachment level (i.e., $\hat{\mu}_{AL} = -.34$).

Multivariate Mixed-Effects Meta-Regression Model

The results given in Table II in the paper actually are based on a meta-regression model, using year of publication as a potential moderator. To replicate those analyses, we use:

res <- rma.mv(yi, V, mods = ~ 0 + outcome + outcome:I(year - 1983),
              random = ~ outcome | trial, struct="UN",
              data=dat, method="ML")
print(res, digits=3)

Multivariate Meta-Analysis Model (k = 10; method: ML)
 
Variance Components:
 
outer factor: trial   (nlvls = 5)
inner factor: outcome (nlvls = 2)
 
           estim   sqrt  k.lvl  fixed  level
tau^2.1    0.025  0.158      5     no     AL
tau^2.2    0.008  0.090      5     no     PD
 
    rho.AL  rho.PD    AL  PD
AL       1   0.659     -  no
PD   0.659       1     5   -
 
Test for Residual Heterogeneity:
QE(df = 6) = 125.756, p-val < .001
 
Test of Moderators (coefficient(s) 1,2,3,4):
QM(df = 4) = 143.439, p-val < .001
 
Model Results:
 
                          estimate     se    zval   pval   ci.lb   ci.ub
outcomeAL                   -0.335  0.079  -4.261  <.001  -0.489  -0.181  ***
outcomePD                    0.348  0.052   6.694  <.001   0.246   0.450  ***
outcomeAL:I(year - 1983)    -0.011  0.024  -0.445  0.656  -0.059   0.037
outcomePD:I(year - 1983)     0.001  0.015   0.063  0.950  -0.029   0.031
 
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Note that publication year was centered at 1983, as was done by the authors. These results correspond to those given in the rightmost column in Table II on page 2545 (column "multiple outcomes MML"). The output above directly provides the correlation among the true effects. We can obtain the covariance with:

round(res$G[1,2], digits=3)

[1] 0.009

To test whether the slope of publication year actually differs for the two outcomes, we can fit the same model with:

res <- rma.mv(yi, V, mods = ~ 0 + outcome*I(year - 1983),
              random = ~ outcome | trial, struct="UN",
              data=dat, method="ML")
print(res, digits=3)

The output is identical, except for the last part, which is now equal to:

Model Results:
 
                          estimate     se    zval   pval   ci.lb   ci.ub
outcomeAL                   -0.335  0.079  -4.261  <.001  -0.489  -0.181  ***
outcomePD                    0.348  0.052   6.694  <.001   0.246   0.450  ***
I(year - 1983)              -0.011  0.024  -0.445  0.656  -0.059   0.037
outcomePD:I(year - 1983)     0.012  0.020   0.593  0.553  -0.027   0.051
 
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Therefore, the slope is actually not significantly different for the two outcomes ($p = .553$). In fact, it does not appear as if publication year is at all related to the two outcomes.

Comparing Different Random Effects Structures

One could actually consider a simpler model for these data, which assumes a compound symmetry structure for the random effects (this would imply that the amount of heterogeneity is the same for the two outcomes). A formal comparison of the two models can be conducted using a likelihood ratio test:

res1 <- rma.mv(yi, V, mods = ~ 0 + outcome,
               random = ~ outcome | trial, struct="UN",
               data=dat, method="ML")
res0 <- rma.mv(yi, V, mods = ~ 0 + outcome,
               random = ~ outcome | trial, struct="CS",
               data=dat, method="ML")
anova(res0, res1)

        df     AIC     BIC    AICc logLik    LRT   pval       QE
Full     5 -1.6813 -0.1684 13.3187 5.8407               128.2267
Reduced  4 -2.4111 -1.2008  5.5889 5.2056 1.2702 0.2597 128.2267

Since model res1 has one more parameter than model res0, the test statistic (LRT) follows (approximately) a chi-square distribution with 1 degree of freedom. The p-value (0.2597) suggests that the more complex model (with struct="UN") does not actually yield a significantly better fit than the simpler model (with struct="CS"). However, with only 5 studies, this test is likely to have very low power.

References

Berkey, C. S., Hoaglin, D. C., Antczak-Bouckoms, A., Mosteller, F., & Colditz, G. A. (1998). Meta-analysis of multiple outcomes by regression with random effects. Statistics in Medicine, 17(22), 2537–2550.

¹⁾

Berkey et al. (1998) claim that this approach can only be used when all the trials in the meta-analysis provide results for all the outcomes considered. This does not apply to the way rma.mv() estimates this model. Therefore, if a particular study had only measured one of the two outcomes, it could still be included in the analysis. The variance-covariance matrix of that study would then simply be a $1 \times 1$ matrix, with the value of that matrix equal to the sampling variance of the observed outcome.

Table of Contents