Gleser & Olkin (2009)
Introduction
The chapter on stochastically dependent effect sizes by Gleser and Olkin (2009) in The handbook of research synthesis and meta-analysis (2nd ed.) describes how multiple-treatment and multiple-endpoint studies can be meta-analyzed. In particular, whenever there is at least some overlap in the individuals used to compute multiple effect size estimates within a particular study, the estimates from that study can no longer be treated as independent. The chapter describes how the correlation between such effect size estimates can be computed for a variety of outcome measures and how this information can be incorporated into the analysis.
Below, the analyses described in the chapter are recreated using R and the metafor package, so let's load the package first:
library(metafor)
Multiple-Treatment Studies
In a multiple-treatment study, more than one treatment (or variation of the treatment) is compared against a single control group. For each treatment group, we can compute an effect size estimate, contrasting the treatment group against the control group. Since the information from the control group is used repeatedly, the resulting effect size estimates are correlated.
Dichotomous Response Variable
When the response variable within the individual studies is dichotomous, commonly used outcome measures for the meta-analysis are the risk difference, odds ratio, risk ratio, and the arcsine transformed risk difference. The data in Table 19.1 are used to illustrate these measures. This dataset can be recreated with:
dat <- data.frame(study=c(1,1,2,3,3,3), trt=c(1,2,1,1,2,3),
                  ai=c( 40, 40, 10,150,150,150), n1i=c(1000,1000,200,2000,2000,2000),
                  ci=c(100,150, 15, 40, 80, 50), n2i=c(4000,4000,400,1000,1000,1000))
dat$pti <- with(dat, ci / n2i)
dat$pci <- with(dat, ai / n1i)
dat
  study trt  ai  n1i  ci  n2i    pti   pci
1     1   1  40 1000 100 4000 0.0250 0.040
2     1   2  40 1000 150 4000 0.0375 0.040
3     2   1  10  200  15  400 0.0375 0.050
4     3   1 150 2000  40 1000 0.0400 0.075
5     3   2 150 2000  80 1000 0.0800 0.075
6     3   3 150 2000  50 1000 0.0500 0.075
Note that the dataset is created in the "long format", with multiple rows for the multiple-treatment studies (i.e., studies 1 and 3). Also note that variables ai and n1i refer to the control group and that in studies 1 and 3, multiple treatment groups are contrasted with a common control group.
The risk differences and corresponding sampling variances can be added to the dataset with:
dat <- escalc(measure="RD", ai=ai, ci=ci, n1i=n1i, n2i=n2i, data=dat) dat
  study trt  ai  n1i  ci  n2i    pti   pci      yi     vi
1     1   1  40 1000 100 4000 0.0250 0.040  0.0150 0.0000
2     1   2  40 1000 150 4000 0.0375 0.040  0.0025 0.0000
3     2   1  10  200  15  400 0.0375 0.050  0.0125 0.0003
4     3   1 150 2000  40 1000 0.0400 0.075  0.0350 0.0001
5     3   2 150 2000  80 1000 0.0800 0.075 -0.0050 0.0001
6     3   3 150 2000  50 1000 0.0500 0.075  0.0250 0.0001
As in the chapter, the risk differences are computed with $p_C - p_T$ to avoid a large number of negative estimates.
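As a quick check (not part of the chapter), the estimates returned by escalc() should indeed equal these differences in proportions:

## yi should equal pC - pT as computed earlier
all.equal(as.numeric(dat$yi), with(dat, pci - pti))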
Multiple risk differences from the same study are correlated. We therefore need to construct the entire variance-covariance matrix for the 6 effect size estimates. This can be done with:
calc.v <- function(x) {
   v <- matrix(x$pci[1]*(1-x$pci[1])/x$n1i[1], nrow=nrow(x), ncol=nrow(x))
   diag(v) <- x$vi
   v
}

V <- bldiag(lapply(split(dat, dat$study), calc.v))
Since the numbers in V are quite small, we can use:
round(V * 10^6, 3)
to get a better impression of the structure of the V matrix:
       [,1]   [,2]    [,3]   [,4]    [,5]   [,6]
[1,] 44.494 38.400   0.000  0.000   0.000  0.000
[2,] 38.400 47.423   0.000  0.000   0.000  0.000
[3,]  0.000  0.000 327.734  0.000   0.000  0.000
[4,]  0.000  0.000   0.000 73.088  34.688 34.688
[5,]  0.000  0.000   0.000 34.688 108.287 34.688
[6,]  0.000  0.000   0.000 34.688  34.688 82.188
The values along the diagonal are the (scaled) sampling variances of the risk differences (i.e., scaled by $10^6$). The off-diagonal elements are the (scaled) covariances. Note the block-diagonal structure: Multiple risk differences from the same study are correlated, while risk differences from different studies are uncorrelated.
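As a spot check (not given in the chapter), the covariance implied by the shared control group of study 1 is $p_C (1 - p_C) / n_C$, which reproduces the off-diagonal value shown above:

## covariance due to the shared control group in study 1
with(dat[1,], pci * (1 - pci) / n1i) * 10^6   # = 38.4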
Now the model described in equation (19.5) can be fitted with:
res <- rma.mv(yi, V, mods = ~ 0 + factor(trt), data=dat) res
Multivariate Meta-Analysis Model (k = 6; method: FE)

Variance Components: none

Test for Residual Heterogeneity:
QE(df = 3) = 7.1907, p-val = 0.0661

Test of Moderators (coefficient(s) 1,2,3):
QM(df = 3) = 30.4173, p-val < .0001

Model Results:

              estimate      se    zval    pval    ci.lb   ci.ub
factor(trt)1    0.0200  0.0050  4.0347  <.0001   0.0103  0.0297  ***
factor(trt)2    0.0043  0.0053  0.8011  0.4231  -0.0062  0.0148
factor(trt)3    0.0211  0.0084  2.5304  0.0114   0.0048  0.0375    *

---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
The variance-covariance matrix (scaled by a factor of $10^6$) of the parameter estimates is equal to:
round(vcov(res) * 10^6, 3)
             factor(trt)1 factor(trt)2 factor(trt)3
factor(trt)1       24.612       19.954       13.323
factor(trt)2       19.954       28.538       13.255
factor(trt)3       13.323       13.255       69.806
These results match those reported on pages 361-362.
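Although it is not shown in the chapter, the correlation matrix of the parameter estimates can also be inspected by converting the variance-covariance matrix with cov2cor():

round(cov2cor(vcov(res)), 3)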
The analysis can also be conducted with (log) odds ratios as follows:
dat <- escalc(measure="OR", ai=ai, ci=ci, n1i=n1i, n2i=n2i, data=dat) calc.v <- function(x) { v <- matrix(1/(x$n1i[1]*x$pci[1]*(1-x$pci[1])), nrow=nrow(x), ncol=nrow(x)) diag(v) <- x$vi v } V <- bldiag(lapply(split(dat, dat$study), calc.v)) res <- rma.mv(yi, V, mods = ~ 0 + factor(trt), data=dat) res
Multivariate Meta-Analysis Model (k = 6; method: FE)

Variance Components: none

Test for Residual Heterogeneity:
QE(df = 3) = 2.0563, p-val = 0.5608

Test of Moderators (coefficient(s) 1,2,3):
QM(df = 3) = 31.1204, p-val < .0001

Model Results:

              estimate      se    zval    pval    ci.lb   ci.ub
factor(trt)1    0.5099  0.1188  4.2918  <.0001   0.2771  0.7428  ***
factor(trt)2    0.0044  0.1086  0.0403  0.9679  -0.2084  0.2171
factor(trt)3    0.4301  0.1644  2.6162  0.0089   0.1079  0.7523   **

---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
These results are given on pages 362-363.
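If desired (this step is not part of the chapter), the pooled log odds ratios can be back-transformed with exp(); given the coding used above (control group as the first group), the resulting values are the odds of the event in the control group relative to the treatment groups:

round(exp(coef(res)), 3)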
To analyze (log) risk ratios, we use:
dat <- escalc(measure="RR", ai=ai, ci=ci, n1i=n1i, n2i=n2i, data=dat) calc.v <- function(x) { v <- matrix((1-x$pci[1])/(x$n1i[1]*x$pci[1]), nrow=nrow(x), ncol=nrow(x)) diag(v) <- x$vi v } V <- bldiag(lapply(split(dat, dat$study), calc.v)) res <- rma.mv(yi, V, mods = ~ 0 + factor(trt), data=dat) res
Multivariate Meta-Analysis Model (k = 6; method: FE)

Variance Components: none

Test for Residual Heterogeneity:
QE(df = 3) = 1.8954, p-val = 0.5944

Test of Moderators (coefficient(s) 1,2,3):
QM(df = 3) = 30.9802, p-val < .0001

Model Results:

              estimate      se    zval    pval    ci.lb   ci.ub
factor(trt)1    0.4875  0.1135  4.2972  <.0001   0.2652  0.7099  ***
factor(trt)2    0.0006  0.1018  0.0055  0.9956  -0.1991  0.2002
factor(trt)3    0.4047  0.1554  2.6035  0.0092   0.1000  0.7094   **

---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Finally, the chapter illustrates how to analyze the difference of arcsine square-root transformed proportions (a variance stabilizing transformation for proportions). Since the approximate sampling variance of such a transformed proportion is $1/(4n)$ regardless of the value of the underlying proportion, the covariance induced by the shared control group is simply $1/(4n_C)$. These results can be replicated with:
dat <- escalc(measure="AS", ai=ai, ci=ci, n1i=n1i, n2i=n2i, data=dat) calc.v <- function(x) { v <- matrix(1/(4*x$n1i[1]), nrow=nrow(x), ncol=nrow(x)) diag(v) <- x$vi v } V <- bldiag(lapply(split(dat, dat$study), calc.v)) res <- rma.mv(yi, V, mods = ~ 0 + factor(trt), data=dat) res
Multivariate Meta-Analysis Model (k = 6; method: FE)

Variance Components: none

Test for Residual Heterogeneity:
QE(df = 3) = 4.2634, p-val = 0.2344

Test of Moderators (coefficient(s) 1,2,3):
QM(df = 3) = 31.8052, p-val < .0001

Model Results:

              estimate      se    zval    pval    ci.lb   ci.ub
factor(trt)1    0.0505  0.0120  4.1921  <.0001   0.0269  0.0741  ***
factor(trt)2    0.0051  0.0123  0.4124  0.6800  -0.0191  0.0292
factor(trt)3    0.0491  0.0185  2.6503  0.0080   0.0128  0.0854   **

---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
These results are given on pages 364-365.1)
Quantitative Response Variable
When quantitative response variables are measured in the individual studies, the standardized mean difference is often used as the effect size measure. The illustrative dataset from Table 19.3 can be recreated with:
dat <- data.frame(study=c(1,1,2,3,4,4), trt=c(1,2,1,1,1,2),
                  m1i=c(7.87, 4.35, 9.32, 8.08, 7.44, 5.34),
                  m2i=c(-1.36, -1.36, 0.98, 1.17, 0.45, 0.45),
                  sdpi=c(4.2593,4.2593,2.8831,3.1764,2.9344,2.9344),
                  n1i=c(25,22,38,50,30,30), n2i=c(25,25,40,50,30,30))
In addition, the total sample sizes of the individual studies are needed for further computations. This information can be added to the dataset with:
dat$Ni <- unlist(lapply(split(dat, dat$study), function(x) rep(sum(x$n1i) + x$n2i[1], each=nrow(x))))
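Since n2i is constant within each study, the same values could also be obtained (a hypothetical alternative, not shown in the chapter) with ave():

## alternative computation of the total sample sizes per study
dat$Ni <- with(dat, ave(n1i, study, FUN=sum) + n2i)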
Finally, the standardized mean differences and corresponding sampling variances can be computed with:
dat$yi <- with(dat, (m1i-m2i)/sdpi)
dat$vi <- with(dat, 1/n1i + 1/n2i + yi^2/(2*Ni))
dat
The dataset now contains:
  study trt  m1i   m2i   sdpi n1i n2i  Ni       yi         vi
1     1   1 7.87 -1.36 4.2593  25  25  72 2.167023 0.11261102
2     1   2 4.35 -1.36 4.2593  22  25  72 1.340596 0.09793508
3     2   1 9.32  0.98 2.8831  38  40  78 2.892720 0.10495571
4     3   1 8.08  1.17 3.1764  50  50 100 2.175419 0.06366223
5     4   1 7.44  0.45 2.9344  30  30  90 2.382088 0.09819080
6     4   2 5.34  0.45 2.9344  30  30  90 1.666439 0.08209456
Again, multiple standardized mean differences from the same study are correlated due to the repeated use of the information from the control group. The variance-covariance matrix of the effect size estimates can be constructed with:
calc.v <- function(x) {
   v <- matrix(1/x$n2i[1] + outer(x$yi, x$yi, "*")/(2*x$Ni[1]), nrow=nrow(x), ncol=nrow(x))
   diag(v) <- x$vi
   v
}

V <- bldiag(lapply(split(dat, dat$study), calc.v))
V
           [,1]       [,2]      [,3]       [,4]      [,5]       [,6]
[1,] 0.11261102 0.06017432 0.0000000 0.00000000 0.0000000 0.00000000
[2,] 0.06017432 0.09793508 0.0000000 0.00000000 0.0000000 0.00000000
[3,] 0.00000000 0.00000000 0.1049557 0.00000000 0.0000000 0.00000000
[4,] 0.00000000 0.00000000 0.0000000 0.06366223 0.0000000 0.00000000
[5,] 0.00000000 0.00000000 0.0000000 0.00000000 0.0981908 0.05538670
[6,] 0.00000000 0.00000000 0.0000000 0.00000000 0.0553867 0.08209456
Again, we note the block-diagonal structure.
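As a spot check (not part of the chapter), the first off-diagonal element can be reproduced directly from the formula used in calc.v(), up to the rounding of the yi values shown above:

## covariance between the two SMDs of study 1: 1/n_C + d1*d2 / (2*N)
1/25 + (2.167023 * 1.340596) / (2*72)   # should match V[1,2]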
The analysis can now be carried out with:
res <- rma.mv(yi, V, mods = ~ 0 + factor(trt), data=dat) print(res, digits=3)
Multivariate Meta-Analysis Model (k = 6; method: FE)

Variance Components: none

Test for Residual Heterogeneity:
QE(df = 4) = 3.945, p-val = 0.414

Test of Moderators (coefficient(s) 1,2):
QM(df = 2) = 252.165, p-val < .001

Model Results:

              estimate     se    zval   pval  ci.lb  ci.ub
factor(trt)1     2.374  0.150  15.804  <.001  2.080  2.669  ***
factor(trt)2     1.570  0.189   8.330  <.001  1.201  1.940  ***

---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
The variance-covariance matrix of the parameter estimates is equal to:
round(vcov(res), 5)
             factor(trt)1 factor(trt)2
factor(trt)1      0.02257      0.01244
factor(trt)2      0.01244      0.03554
The same results are given on page 367.
Multiple-Endpoint Studies
In a multiple-endpoint study, more than one response variable is measured in the same group of subjects. The response variables may represent different ways of measuring the same construct (e.g., two different depression scales), different constructs (e.g., depression and anxiety), or multiple measurements over time (e.g., immediately after the treatment and at a later follow-up assessment). In any case, the corresponding effect size estimates are then (likely to be) correlated.
Such a scenario is illustrated with the data in Table 19.5. The dataset can be recreated with:
dat <- data.frame(school=c(1,1,2,2,3,3,4,4,5,5,6,6,7,7),
                  outcome=rep(c("math", "reading"), times=7),
                  m1i=c(2.3,2.5,2.4,1.3,2.5,2.4,3.3,1.7,1.1,2.0,2.8,2.1,1.7,0.6),
                  m2i=c(10.3,6.6,9.7,3.1,8.7,3.7,7.5,8.5,2.2,2.1,3.8,1.4,1.8,3.9),
                  sdpi=c(8.2,7.3,8.3,8.9,8.5,8.3,7.7,9.8,9.1,10.4,9.6,7.9,9.2,10.2),
                  ri=rep(c(0.55,0.43,0.57,0.66,0.51,0.59,0.49), each=2),
                  n1i=rep(c(22,21,26,18,38,42,39), each=2),
                  n2i=rep(c(24,21,23,18,36,42,38), each=2))
The standardized mean differences, sampling variances, and covariances can be added to the dataset with:
dat$yi <- round(with(dat, (m2i-m1i)/sdpi), 3)
dat$vi <- round(with(dat, 1/n1i + 1/n2i + yi^2/(2*(n1i+n2i))), 4)
dat$covi <- round(with(dat, (1/n1i + 1/n2i) * ri +
                  (rep(sapply(split(dat$yi, dat$school), prod), each=2) / (2*(n1i+n2i))) * ri^2), 4)
The contents of the resulting dataset are:
   school outcome m1i  m2i sdpi   ri n1i n2i     yi     vi   covi
1       1    math 2.3 10.3  8.2 0.55  22  24  0.976 0.0975 0.0497
2       1 reading 2.5  6.6  7.3 0.55  22  24  0.562 0.0906 0.0497
3       2    math 2.4  9.7  8.3 0.43  21  21  0.880 0.1045 0.0413
4       2 reading 1.3  3.1  8.9 0.43  21  21  0.202 0.0957 0.0413
5       3    math 2.5  8.7  8.5 0.57  26  23  0.729 0.0874 0.0471
6       3 reading 2.4  3.7  8.3 0.57  26  23  0.157 0.0822 0.0471
7       4    math 3.3  7.5  7.7 0.66  18  18  0.545 0.1152 0.0756
8       4 reading 1.7  8.5  9.8 0.66  18  18  0.694 0.1178 0.0756
9       5    math 1.1  2.2  9.1 0.51  38  36  0.121 0.0542 0.0276
10      5 reading 2.0  2.1 10.4 0.51  38  36  0.010 0.0541 0.0276
11      6    math 2.8  3.8  9.6 0.59  42  42  0.104 0.0477 0.0281
12      6 reading 2.1  1.4  7.9 0.59  42  42 -0.089 0.0477 0.0281
13      7    math 1.7  1.8  9.2 0.49  39  38  0.011 0.0520 0.0255
14      7 reading 0.6  3.9 10.2 0.49  39  38  0.324 0.0526 0.0255
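To make the computation of covi concrete (this check is not part of the chapter), plugging the values for school 1 into the formula used above reproduces the first covariance:

## covariance for school 1: (1/n1 + 1/n2)*r + (d1*d2/(2*(n1+n2)))*r^2
(1/22 + 1/24) * 0.55 + (0.976 * 0.562 / (2*(22+24))) * 0.55^2   # = 0.0497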
The variance-covariance matrix for the entire dataset can be constructed with:
V <- bldiag(lapply(split(dat, dat$school), function(x)
            matrix(c(x$vi[1], x$covi[1], x$covi[2], x$vi[2]), nrow=2)))
V
The variance-covariance matrix again has a block-diagonal structure:
        [,1]   [,2]   [,3]   [,4]   [,5]   [,6]   [,7]   [,8]   [,9]  [,10]  [,11]  [,12]  [,13]  [,14]
 [1,] 0.0975 0.0497 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
 [2,] 0.0497 0.0906 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
 [3,] 0.0000 0.0000 0.1045 0.0413 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
 [4,] 0.0000 0.0000 0.0413 0.0957 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
 [5,] 0.0000 0.0000 0.0000 0.0000 0.0874 0.0471 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
 [6,] 0.0000 0.0000 0.0000 0.0000 0.0471 0.0822 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
 [7,] 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.1152 0.0756 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
 [8,] 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0756 0.1178 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
 [9,] 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0542 0.0276 0.0000 0.0000 0.0000 0.0000
[10,] 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0276 0.0541 0.0000 0.0000 0.0000 0.0000
[11,] 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0477 0.0281 0.0000 0.0000
[12,] 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0281 0.0477 0.0000 0.0000
[13,] 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0520 0.0255
[14,] 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0255 0.0526
Finally, we can fit a model allowing for a different treatment effect depending on the outcome with:
res <- rma.mv(yi, V, mods = ~ 0 + outcome, data=dat) print(res, digits=3)
Multivariate Meta-Analysis Model (k = 14; method: FE)

Variance Components: none

Test for Residual Heterogeneity:
QE(df = 12) = 19.626, p-val = 0.074

Test of Moderators (coefficient(s) 1,2):
QM(df = 2) = 13.005, p-val = 0.001

Model Results:

                estimate     se   zval   pval  ci.lb  ci.ub
outcomemath        0.362  0.100  3.603  <.001  0.165  0.558  ***
outcomereading     0.205  0.099  2.062  0.039  0.010  0.400    *

---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Categorical and/or quantitative moderator variables can also be included in such a model as described on pages 371-373.
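For example (this contrast is not shown in the chapter; the X argument requires a recent version of metafor), we can test whether the pooled effects for the two outcomes differ from each other:

## test of the contrast between the math and reading effects
anova(res, X=c(1,-1))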
Note that all of the models described above are fixed-effects models. Multivariate random/mixed-effects models can also be fitted with the rma.mv() function. For further examples, see Berkey et al. (1998) and van Houwelingen, Arends, and Stijnen (2002).
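As a minimal sketch (assuming the multiple-endpoint dataset dat and matrix V from above are still in the workspace), such a random-effects model could be fitted with:

## multivariate random-effects model; struct="UN" allows outcome-specific
## heterogeneity variances and a correlation between the true effects
res.re <- rma.mv(yi, V, mods = ~ 0 + outcome,
                 random = ~ outcome | school, struct="UN", data=dat)
print(res.re, digits=3)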
References
Gleser, L. J., & Olkin, I. (2009). Stochastically dependent effect sizes. In H. Cooper, L. V. Hedges, & J. C. Valentine (Eds.), The handbook of research synthesis and meta-analysis (2nd ed., pp. 357–376). New York: Russell Sage Foundation.
1) escalc() computes the arcsine transformed risk differences with the equation $\arcsin[\sqrt{p_C}] - \arcsin[\sqrt{p_T}]$, so the results differ by a multiplicative factor of 2 from those given in the chapter.