Gleser & Olkin (2009)
Introduction
The chapter on stochastically dependent effect sizes by Gleser and Olkin (2009) in The handbook of research synthesis and meta-analysis (2nd ed.) describes how multiple-treatment and multiple-endpoint studies can be meta-analyzed. In particular, whenever there is at least some overlap in the individuals used to compute multiple effect size estimates within a particular study, the estimates from that study can no longer be treated as independent. The chapter describes how the correlation between such effect size estimates can be computed for a variety of outcome measures and how this information can be incorporated into the analysis.
Below, the analyses described in the chapter are recreated using R and the metafor package, so let's load the package first:
library(metafor)
Multiple-Treatment Studies
In a multiple-treatment study, more than one treatment (or variation of the treatment) is compared against a single control group. For each treatment group, we can compute an effect size estimate, contrasting the treatment group against the control group. Since the information from the control group is used repeatedly, the resulting effect size estimates are correlated.
Dichotomous Response Variable
When the response variable within the individual studies is dichotomous, commonly used outcome measures for the meta-analysis are the risk difference, odds ratio, risk ratio, and the arcsine transformed risk difference. The data in Table 19.1 are used to illustrate these measures. This dataset can be recreated with:
dat <- data.frame(study=c(1,1,2,3,3,3), trt=c(1,2,1,1,2,3),
                  ai=c( 40, 40, 10,150,150,150), n1i=c(1000,1000,200,2000,2000,2000),
                  ci=c(100,150, 15, 40, 80, 50), n2i=c(4000,4000,400,1000,1000,1000))
dat$pti <- with(dat, ci / n2i)
dat$pci <- with(dat, ai / n1i)
dat
  study trt  ai  n1i  ci  n2i    pti   pci
1     1   1  40 1000 100 4000 0.0250 0.040
2     1   2  40 1000 150 4000 0.0375 0.040
3     2   1  10  200  15  400 0.0375 0.050
4     3   1 150 2000  40 1000 0.0400 0.075
5     3   2 150 2000  80 1000 0.0800 0.075
6     3   3 150 2000  50 1000 0.0500 0.075
Note that the dataset is created in the "long format", with multiple rows for the multiple-treatment studies (i.e., studies 1 and 3). Also note that variables ai and n1i refer to the control group and that in studies 1 and 3, multiple treatment groups are contrasted with a common control group.
The risk differences and corresponding sampling variances can be added to the dataset with:
dat <- escalc(measure="RD", ai=ai, ci=ci, n1i=n1i, n2i=n2i, data=dat) dat
  study trt  ai  n1i  ci  n2i    pti   pci      yi     vi
1     1   1  40 1000 100 4000 0.0250 0.040  0.0150 0.0000
2     1   2  40 1000 150 4000 0.0375 0.040  0.0025 0.0000
3     2   1  10  200  15  400 0.0375 0.050  0.0125 0.0003
4     3   1 150 2000  40 1000 0.0400 0.075  0.0350 0.0001
5     3   2 150 2000  80 1000 0.0800 0.075 -0.0050 0.0001
6     3   3 150 2000  50 1000 0.0500 0.075  0.0250 0.0001
As in the chapter, the risk differences are computed with $p_C - p_T$ to avoid a large number of negative estimates.
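As a quick check (not part of the chapter), the estimates returned by escalc() should indeed equal these differences in proportions:

## yi should equal pC - pT as computed earlier
all.equal(as.numeric(dat$yi), with(dat, pci - pti))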
Multiple risk differences from the same study are correlated. We therefore need to construct the entire variance-covariance matrix for the 6 effect size estimates. This can be done with:
calc.v <- function(x) {
   v <- matrix(x$pci[1]*(1-x$pci[1])/x$n1i[1], nrow=nrow(x), ncol=nrow(x))
   diag(v) <- x$vi
   v
}

V <- bldiag(lapply(split(dat, dat$study), calc.v))
Since the numbers in V are quite small, we can use:
round(V * 10^6, 3)
to get a better impression of the structure of the V matrix:
       [,1]   [,2]    [,3]   [,4]    [,5]   [,6]
[1,] 44.494 38.400   0.000  0.000   0.000  0.000
[2,] 38.400 47.423   0.000  0.000   0.000  0.000
[3,]  0.000  0.000 327.734  0.000   0.000  0.000
[4,]  0.000  0.000   0.000 73.088  34.688 34.688
[5,]  0.000  0.000   0.000 34.688 108.287 34.688
[6,]  0.000  0.000   0.000 34.688  34.688 82.188
The values along the diagonal are the (scaled) sampling variances of the risk differences (i.e., scaled by $10^6$). The off-diagonal elements are the (scaled) covariances. Note the block-diagonal structure: Multiple risk differences from the same study are correlated, while risk differences from different studies are uncorrelated.
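As a spot check (not given in the chapter), the covariance implied by the shared control group of study 1 is $p_C (1 - p_C) / n_C$, which reproduces the off-diagonal value shown above:

## covariance due to the shared control group in study 1
with(dat[1,], pci * (1 - pci) / n1i) * 10^6   # = 38.4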
Now the model described in equation (19.5) can be fitted with:
res <- rma.mv(yi, V, mods = ~ 0 + factor(trt), data=dat) res
Multivariate Meta-Analysis Model (k = 6; method: FE)

Variance Components: none

Test for Residual Heterogeneity:
QE(df = 3) = 7.1907, p-val = 0.0661

Test of Moderators (coefficient(s) 1,2,3):
QM(df = 3) = 30.4173, p-val < .0001

Model Results:

              estimate      se    zval    pval    ci.lb   ci.ub
factor(trt)1    0.0200  0.0050  4.0347  <.0001   0.0103  0.0297  ***
factor(trt)2    0.0043  0.0053  0.8011  0.4231  -0.0062  0.0148
factor(trt)3    0.0211  0.0084  2.5304  0.0114   0.0048  0.0375    *

---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
The variance-covariance matrix (scaled by a factor of $10^6$) of the parameter estimates is equal to:
round(vcov(res) * 10^6, 3)
             factor(trt)1 factor(trt)2 factor(trt)3
factor(trt)1       24.612       19.954       13.323
factor(trt)2       19.954       28.538       13.255
factor(trt)3       13.323       13.255       69.806
These results match those reported on pages 361-362.
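Although it is not shown in the chapter, the correlation matrix of the parameter estimates can also be inspected by converting the variance-covariance matrix with cov2cor():

round(cov2cor(vcov(res)), 3)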
The analysis can also be conducted with (log) odds ratios as follows:
dat <- escalc(measure="OR", ai=ai, ci=ci, n1i=n1i, n2i=n2i, data=dat) calc.v <- function(x) { v <- matrix(1/(x$n1i[1]*x$pci[1]*(1-x$pci[1])), nrow=nrow(x), ncol=nrow(x)) diag(v) <- x$vi v } V <- bldiag(lapply(split(dat, dat$study), calc.v)) res <- rma.mv(yi, V, mods = ~ 0 + factor(trt), data=dat) res
Multivariate Meta-Analysis Model (k = 6; method: FE)

Variance Components: none

Test for Residual Heterogeneity:
QE(df = 3) = 2.0563, p-val = 0.5608

Test of Moderators (coefficient(s) 1,2,3):
QM(df = 3) = 31.1204, p-val < .0001

Model Results:

              estimate      se    zval    pval    ci.lb   ci.ub
factor(trt)1    0.5099  0.1188  4.2918  <.0001   0.2771  0.7428  ***
factor(trt)2    0.0044  0.1086  0.0403  0.9679  -0.2084  0.2171
factor(trt)3    0.4301  0.1644  2.6162  0.0089   0.1079  0.7523   **

---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
These results are given on pages 362-363.
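If desired (this step is not part of the chapter), the pooled log odds ratios can be back-transformed with exp(); given the coding used above (control group as the first group), the resulting values are the odds of the event in the control group relative to the treatment groups:

round(exp(coef(res)), 3)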
To analyze (log) risk ratios, we use:
dat <- escalc(measure="RR", ai=ai, ci=ci, n1i=n1i, n2i=n2i, data=dat) calc.v <- function(x) { v <- matrix((1-x$pci[1])/(x$n1i[1]*x$pci[1]), nrow=nrow(x), ncol=nrow(x)) diag(v) <- x$vi v } V <- bldiag(lapply(split(dat, dat$study), calc.v)) res <- rma.mv(yi, V, mods = ~ 0 + factor(trt), data=dat) res
Multivariate Meta-Analysis Model (k = 6; method: FE)

Variance Components: none

Test for Residual Heterogeneity:
QE(df = 3) = 1.8954, p-val = 0.5944

Test of Moderators (coefficient(s) 1,2,3):
QM(df = 3) = 30.9802, p-val < .0001

Model Results:

              estimate      se    zval    pval    ci.lb   ci.ub
factor(trt)1    0.4875  0.1135  4.2972  <.0001   0.2652  0.7099  ***
factor(trt)2    0.0006  0.1018  0.0055  0.9956  -0.1991  0.2002
factor(trt)3    0.4047  0.1554  2.6035  0.0092   0.1000  0.7094   **

---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Finally, the chapter illustrates how to analyze the difference of arcsine square-root transformed proportions (a variance stabilizing transformation for proportions). Since the approximate sampling variance of such a transformed proportion is $1/(4n)$ regardless of the value of the underlying proportion, the covariance induced by the shared control group is simply $1/(4n_C)$. These results can be replicated with:
dat <- escalc(measure="AS", ai=ai, ci=ci, n1i=n1i, n2i=n2i, data=dat) calc.v <- function(x) { v <- matrix(1/(4*x$n1i[1]), nrow=nrow(x), ncol=nrow(x)) diag(v) <- x$vi v } V <- bldiag(lapply(split(dat, dat$study), calc.v)) res <- rma.mv(yi, V, mods = ~ 0 + factor(trt), data=dat) res
Multivariate Meta-Analysis Model (k = 6; method: FE)

Variance Components: none

Test for Residual Heterogeneity:
QE(df = 3) = 4.2634, p-val = 0.2344

Test of Moderators (coefficient(s) 1,2,3):
QM(df = 3) = 31.8052, p-val < .0001

Model Results:

              estimate      se    zval    pval    ci.lb   ci.ub
factor(trt)1    0.0505  0.0120  4.1921  <.0001   0.0269  0.0741  ***
factor(trt)2    0.0051  0.0123  0.4124  0.6800  -0.0191  0.0292
factor(trt)3    0.0491  0.0185  2.6503  0.0080   0.0128  0.0854   **

---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
These results are given on pages 364-365.1)
Quantitative Response Variable
When quantitative response variables are measured in the individual studies, the standardized mean difference is often used as the effect size measure. The illustrative dataset from Table 19.3 can be recreated with:
dat <- data.frame(study=c(1,1,2,3,4,4), trt=c(1,2,1,1,1,2),
                  m1i=c(7.87, 4.35, 9.32, 8.08, 7.44, 5.34),
                  m2i=c(-1.36, -1.36, 0.98, 1.17, 0.45, 0.45),
                  sdpi=c(4.2593,4.2593,2.8831,3.1764,2.9344,2.9344),
                  n1i=c(25,22,38,50,30,30), n2i=c(25,25,40,50,30,30))
In addition, the total sample sizes of the individual studies are needed for further computations. This information can be added to the dataset with:
dat$Ni <- unlist(lapply(split(dat, dat$study), function(x) rep(sum(x$n1i) + x$n2i[1], each=nrow(x))))
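Since n2i is constant within each study, the same values could also be obtained (a hypothetical alternative, not shown in the chapter) with ave():

## alternative computation of the total sample sizes per study
dat$Ni <- with(dat, ave(n1i, study, FUN=sum) + n2i)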
Finally, the standardized mean differences and corresponding sampling variances can be computed with:
dat$yi <- with(dat, (m1i-m2i)/sdpi)
dat$vi <- with(dat, 1/n1i + 1/n2i + yi^2/(2*Ni))
dat
The dataset now contains:
  study trt  m1i   m2i   sdpi n1i n2i  Ni       yi         vi
1     1   1 7.87 -1.36 4.2593  25  25  72 2.167023 0.11261102
2     1   2 4.35 -1.36 4.2593  22  25  72 1.340596 0.09793508
3     2   1 9.32  0.98 2.8831  38  40  78 2.892720 0.10495571
4     3   1 8.08  1.17 3.1764  50  50 100 2.175419 0.06366223
5     4   1 7.44  0.45 2.9344  30  30  90 2.382088 0.09819080
6     4   2 5.34  0.45 2.9344  30  30  90 1.666439 0.08209456
Again, multiple standardized mean differences from the same study are correlated due to the repeated use of the information from the control group. The variance-covariance matrix of the effect size estimates can be constructed with:
calc.v <- function(x) {
   v <- matrix(1/x$n2i[1] + outer(x$yi, x$yi, "*")/(2*x$Ni[1]), nrow=nrow(x), ncol=nrow(x))
   diag(v) <- x$vi
   v
}

V <- bldiag(lapply(split(dat, dat$study), calc.v))
V
           [,1]       [,2]      [,3]       [,4]      [,5]       [,6]
[1,] 0.11261102 0.06017432 0.0000000 0.00000000 0.0000000 0.00000000
[2,] 0.06017432 0.09793508 0.0000000 0.00000000 0.0000000 0.00000000
[3,] 0.00000000 0.00000000 0.1049557 0.00000000 0.0000000 0.00000000
[4,] 0.00000000 0.00000000 0.0000000 0.06366223 0.0000000 0.00000000
[5,] 0.00000000 0.00000000 0.0000000 0.00000000 0.0981908 0.05538670
[6,] 0.00000000 0.00000000 0.0000000 0.00000000 0.0553867 0.08209456
Again, we note the block-diagonal structure.
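As a spot check (not part of the chapter), the first off-diagonal element can be reproduced directly from the formula used in calc.v(), up to the rounding of the yi values shown above:

## covariance between the two SMDs of study 1: 1/n_C + d1*d2 / (2*N)
1/25 + (2.167023 * 1.340596) / (2*72)   # should match V[1,2]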
The analysis can now be carried out with:
res <- rma.mv(yi, V, mods = ~ 0 + factor(trt), data=dat) print(res, digits=3)
Multivariate Meta-Analysis Model (k = 6; method: FE)

Variance Components: none

Test for Residual Heterogeneity:
QE(df = 4) = 3.945, p-val = 0.414

Test of Moderators (coefficient(s) 1,2):
QM(df = 2) = 252.165, p-val < .001

Model Results:

              estimate     se    zval   pval  ci.lb  ci.ub
factor(trt)1     2.374  0.150  15.804  <.001  2.080  2.669  ***
factor(trt)2     1.570  0.189   8.330  <.001  1.201  1.940  ***

---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
The variance-covariance matrix of the parameter estimates is equal to:
round(vcov(res), 5)
             factor(trt)1 factor(trt)2
factor(trt)1      0.02257      0.01244
factor(trt)2      0.01244      0.03554
The same results are given on page 367.
Multiple-Endpoint Studies
In a multiple-endpoint study, more than one response variable is measured in the same group of subjects. The response variables may represent different ways of measuring the same construct (e.g., two different depression scales), different constructs (e.g., depression and anxiety), or multiple measurements over time (e.g., immediately after the treatment and at a later follow-up assessment). In any case, the corresponding effect size estimates are then (likely to be) correlated.
Such a scenario is illustrated with the data in Table 19.5. The dataset can be recreated with:
dat <- data.frame(school=c(1,1,2,2,3,3,4,4,5,5,6,6,7,7),
                  outcome=rep(c("math", "reading"), times=7),
                  m1i=c(2.3,2.5,2.4,1.3,2.5,2.4,3.3,1.7,1.1,2.0,2.8,2.1,1.7,0.6),
                  m2i=c(10.3,6.6,9.7,3.1,8.7,3.7,7.5,8.5,2.2,2.1,3.8,1.4,1.8,3.9),
                  sdpi=c(8.2,7.3,8.3,8.9,8.5,8.3,7.7,9.8,9.1,10.4,9.6,7.9,9.2,10.2),
                  ri=rep(c(0.55,0.43,0.57,0.66,0.51,0.59,0.49), each=2),
                  n1i=rep(c(22,21,26,18,38,42,39), each=2),
                  n2i=rep(c(24,21,23,18,36,42,38), each=2))
The standardized mean differences, sampling variances, and covariances can be added to the dataset with:
dat$yi <- round(with(dat, (m2i-m1i)/sdpi), 3)
dat$vi <- round(with(dat, 1/n1i + 1/n2i + yi^2/(2*(n1i+n2i))), 4)
dat$covi <- round(with(dat, (1/n1i + 1/n2i) * ri +
                  (rep(sapply(split(dat$yi, dat$school), prod), each=2) / (2*(n1i+n2i))) * ri^2), 4)
The contents of the resulting dataset are:
   school outcome m1i  m2i sdpi   ri n1i n2i     yi     vi   covi
1       1    math 2.3 10.3  8.2 0.55  22  24  0.976 0.0975 0.0497
2       1 reading 2.5  6.6  7.3 0.55  22  24  0.562 0.0906 0.0497
3       2    math 2.4  9.7  8.3 0.43  21  21  0.880 0.1045 0.0413
4       2 reading 1.3  3.1  8.9 0.43  21  21  0.202 0.0957 0.0413
5       3    math 2.5  8.7  8.5 0.57  26  23  0.729 0.0874 0.0471
6       3 reading 2.4  3.7  8.3 0.57  26  23  0.157 0.0822 0.0471
7       4    math 3.3  7.5  7.7 0.66  18  18  0.545 0.1152 0.0756
8       4 reading 1.7  8.5  9.8 0.66  18  18  0.694 0.1178 0.0756
9       5    math 1.1  2.2  9.1 0.51  38  36  0.121 0.0542 0.0276
10      5 reading 2.0  2.1 10.4 0.51  38  36  0.010 0.0541 0.0276
11      6    math 2.8  3.8  9.6 0.59  42  42  0.104 0.0477 0.0281
12      6 reading 2.1  1.4  7.9 0.59  42  42 -0.089 0.0477 0.0281
13      7    math 1.7  1.8  9.2 0.49  39  38  0.011 0.0520 0.0255
14      7 reading 0.6  3.9 10.2 0.49  39  38  0.324 0.0526 0.0255
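To make the computation of covi concrete (this check is not part of the chapter), plugging the values for school 1 into the formula used above reproduces the first covariance:

## covariance for school 1: (1/n1 + 1/n2)*r + (d1*d2/(2*(n1+n2)))*r^2
(1/22 + 1/24) * 0.55 + (0.976 * 0.562 / (2*(22+24))) * 0.55^2   # = 0.0497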
The variance-covariance matrix for the entire dataset can be constructed with:
V <- bldiag(lapply(split(dat, dat$school), function(x)
            matrix(c(x$vi[1], x$covi[1], x$covi[2], x$vi[2]), nrow=2)))
V
The variance-covariance matrix again has a block-diagonal structure:
        [,1]   [,2]   [,3]   [,4]   [,5]   [,6]   [,7]   [,8]   [,9]  [,10]  [,11]  [,12]  [,13]  [,14]
 [1,] 0.0975 0.0497 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
 [2,] 0.0497 0.0906 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
 [3,] 0.0000 0.0000 0.1045 0.0413 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
 [4,] 0.0000 0.0000 0.0413 0.0957 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
 [5,] 0.0000 0.0000 0.0000 0.0000 0.0874 0.0471 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
 [6,] 0.0000 0.0000 0.0000 0.0000 0.0471 0.0822 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
 [7,] 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.1152 0.0756 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
 [8,] 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0756 0.1178 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
 [9,] 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0542 0.0276 0.0000 0.0000 0.0000 0.0000
[10,] 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0276 0.0541 0.0000 0.0000 0.0000 0.0000
[11,] 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0477 0.0281 0.0000 0.0000
[12,] 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0281 0.0477 0.0000 0.0000
[13,] 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0520 0.0255
[14,] 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0255 0.0526
Finally, we can fit a model allowing for a different treatment effect depending on the outcome with:
res <- rma.mv(yi, V, mods = ~ 0 + outcome, data=dat) print(res, digits=3)
Multivariate Meta-Analysis Model (k = 14; method: FE)

Variance Components: none

Test for Residual Heterogeneity:
QE(df = 12) = 19.626, p-val = 0.074

Test of Moderators (coefficient(s) 1,2):
QM(df = 2) = 13.005, p-val = 0.001

Model Results:

                estimate     se   zval   pval  ci.lb  ci.ub
outcomemath        0.362  0.100  3.603  <.001  0.165  0.558  ***
outcomereading     0.205  0.099  2.062  0.039  0.010  0.400    *

---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Categorical and/or quantitative moderator variables can also be included in such a model as described on pages 371-373.
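For example (this contrast is not shown in the chapter; the X argument requires a recent version of metafor), we can test whether the pooled effects for the two outcomes differ from each other:

## test of the contrast between the math and reading effects
anova(res, X=c(1,-1))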
Note that all of the models described above are fixed-effects models. Multivariate random/mixed-effects models can also be fitted with the rma.mv() function. For further examples, see Berkey et al. (1998) and van Houwelingen, Arends, and Stijnen (2002).
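As a minimal sketch (assuming the multiple-endpoint dataset dat and matrix V from above are still in the workspace), such a random-effects model could be fitted with:

## multivariate random-effects model; struct="UN" allows outcome-specific
## heterogeneity variances and a correlation between the true effects
res.re <- rma.mv(yi, V, mods = ~ 0 + outcome,
                 random = ~ outcome | school, struct="UN", data=dat)
print(res.re, digits=3)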
References
Gleser, L. J., & Olkin, I. (2009). Stochastically dependent effect sizes. In H. Cooper, L. V. Hedges, & J. C. Valentine (Eds.), The handbook of research synthesis and meta-analysis (2nd ed., pp. 357–376). New York: Russell Sage Foundation.
1) escalc() computes the arcsine transformed risk differences with the equation $\arcsin[\sqrt{p_C}] - \arcsin[\sqrt{p_T}]$, so the results differ by a multiplicative factor of 2 from those given in the chapter.