Table of Contents
Normand (1999)
The Data (Part 1)
The article by Normand (1999) is part of the "Tutorial in Biostatistics" series in the journal Statistics in Medicine and provides a nice introduction to meta-analysis, covering equal- and random-effects models. Two datasets are analyzed in this article. The first is based on data by Hine et al. (1989) regarding the mortality risk due to prophylactic use of lidocaine after a heart attack. The second dataset provides the results from studies on the length of the hospital stay of stroke patients under specialized care and under conventional/routine (non-specialist) care.
We will start with the lidocaine dataset, which is provided in Table I in the article (p. 322). After loading the metafor package with library(metafor)
, the data can be examined with:
dat.hine1989
study source n1i n2i ai ci 1 1 Chopra et al. 39 43 2 1 2 2 Mogensen 44 44 4 4 3 3 Pitt et al. 107 110 6 4 4 4 Darby et al. 103 100 7 5 5 5 Bennett et al. 110 106 7 3 6 6 O'Brien et al. 154 146 11 4
As described under help(dat.hine1989)
, variables n1i
and n2i
are the number of patients in the lidocaine and control group, respectively, and ai
and ci
are the corresponding number of deaths in the two groups. Since these are 2×2 table data, a variety of different outcome measures could be used for the meta-analysis, including the risk difference, the risk ratio (relative risk), and the odds ratio (see Table III). Normand (1999) uses risk differences for the meta-analysis, so we will proceed accordingly.
We can calculate the risk differences and corresponding sampling variances with:
dat <- escalc(measure="RD", n1i=n1i, n2i=n2i, ai=ai, ci=ci, data=dat.hine1989) dat
study source n1i n2i ai ci yi vi 1 1 Chopra et al. 39 43 2 1 0.0280 0.0018 2 2 Mogensen 44 44 4 4 0.0000 0.0038 3 3 Pitt et al. 107 110 6 4 0.0197 0.0008 4 4 Darby et al. 103 100 7 5 0.0180 0.0011 5 5 Bennett et al. 110 106 7 3 0.0353 0.0008 6 6 O'Brien et al. 154 146 11 4 0.0440 0.0006
Note that the yi
values are the risk differences in terms of proportions. Since Normand (1999) provides the results in terms of percentages, we can make the results directly comparable by multiplying the risk differences by $100$ (and the sampling variances by $100^2$):
dat$yi <- dat$yi * 100 dat$vi <- dat$vi * 100^2
Equal-Effects Model
Next, an equal-effects model1) can be fitted to these data with:
rma(yi, vi, data=dat, method="EE")
Equal-Effects Model (k = 6) I^2 (total heterogeneity / total variability): 0.00% H^2 (total variability / sampling variability): 0.17 Test for Heterogeneity: Q(df = 5) = 0.86, p-val = 0.97 Model Results: estimate se zval pval ci.lb ci.ub 2.94 1.31 2.25 0.02 0.38 5.51 * --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
These are the same results as shown in Table VII (p. 349). In particular, patients in the group receiving lidocaine are estimated to have a 2.94 percentage points higher risk of mortality than patients in the control group (with 95% CI: 0.4% to 5.5%).
Random-Effects Model
Next, we can fit a random-effects model to these data with:
rma(yi, vi, data=dat)
When method
is not been specified, the rma()
function uses REML estimation by default. The results are then:
Random-Effects Model (k = 6; tau^2 estimator: REML) tau^2 (estimated amount of total heterogeneity): 0 (SE = 6.17) tau (square root of estimated tau^2 value): 0 I^2 (total heterogeneity / total variability): 0.00% H^2 (total variability / sampling variability): 1.00 Test for Heterogeneity: Q(df = 5) = 0.86, p-val = 0.97 Model Results: estimate se zval pval ci.lb ci.ub 2.94 1.31 2.25 0.02 0.38 5.51 * --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Since the estimate of heterogeneity is 0 (i.e., $\hat{\tau}^2 = 0$), the results are the same as for the equal-effects model. Finally, the random-effects model using the DerSimonian-Laird estimator for $\tau^2$ can be fitted with:
rma(yi, vi, data=dat, method="DL", digits=2)
Since $\hat{\tau}^2 = 0$ also for the DL estimator, the results are again the same (not shown).
Sidenote: The equation for the REML estimator given in the article (on page 336) is actually just an approximation. The exact equation is given, for example, in Viechtbauer (2005) and this is what the rma()
will use when fitting a random-effects model. The differences between the approximate and exact equation is usually small, but there really is no need to use the approximate formula. Also, Normand uses SAS to fit the random-effects model with REML estimation and SAS actually uses the exact formula.
The Data (Part 2)
The second dataset analyzed in the article provides the results from studies where specialist inpatient stroke care was compared to conventional non-specialist care. Table II in the article (p. 324) lists the findings from 9 studies regarding the average length of stay during the acute hospital admission and the corresponding standard deviation. The central hypothesis of interest is whether specialist stroke unit care will result in a shorter length of hospitalization compared to routine management.
The data can be examined with:
dat.normand1999
study source n1i m1i sd1i n2i m2i sd2i 1 1 Edinburgh 155 55 47 156 75 64 2 2 Orpington-Mild 31 27 7 32 29 4 3 3 Orpington-Moderate 75 64 17 71 119 29 4 4 Orpington-Severe 18 66 20 18 137 48 5 5 Montreal-Home 8 14 8 13 18 11 6 6 Montreal-Transfer 57 19 7 52 18 4 7 7 Newcastle 34 52 45 33 41 34 8 8 Umea 110 21 16 183 31 27 9 9 Uppsala 60 30 27 52 23 20
As described under help(dat.normand1999)
, variables n1i
, m1i
, and sd1i
are the sample sizes, means, and standard deviations for the groups receiving specialized care, while n2i
, m2i
, and sd2i
are the corresponding variables for the groups receiving routine care.
For the meta-analysis, Normand uses the (raw or unstandardized) mean difference as the outcome measure. We can compute the observed mean differences and corresponding sampling variances with:
dat <- escalc(m1i=m1i, sd1i=sdpi, n1i=n1i, m2i=m2i, sd2i=sdpi, n2i=n2i, measure="MD", vtype="HO", data=dat.normand1999, digits=2) dat
study source n1i m1i sd1i n2i m2i sd2i sdpi yi vi 1 1 Edinburgh 155 55 47 156 75 64 56.174313 -20.00 40.59 2 2 Orpington-Mild 31 27 7 32 29 4 5.677104 -2.00 2.05 3 3 Orpington-Moderate 75 64 17 71 119 29 23.607908 -55.00 15.28 4 4 Orpington-Severe 18 66 20 18 137 48 36.769553 -71.00 150.22 5 5 Montreal-Home 8 14 8 13 18 11 10.000000 -4.00 20.19 6 6 Montreal-Transfer 57 19 7 52 18 4 5.768104 1.00 1.22 7 7 Newcastle 34 52 45 33 41 34 39.964792 11.00 95.38 8 8 Umea 110 21 16 183 31 27 23.491023 -10.00 8.03 9 9 Uppsala 60 30 27 52 23 20 24.009657 7.00 20.69
These are the same values as shown in Table VIII (p. 351).
Note that the equation for the sampling variance of a mean difference provided on page 331 in the article assumes that the true variances are the same in the treatment and control groups (i.e., the equation assumes homoscedasticity). By default, escalc()
with measure="MD"
does not make that assumption, but we can do so by setting vtype="HO"
as was done above.
Equal-Effects Model
Next, an equal-effects model can be fitted to these data with:
rma(yi, vi, data=dat, method="EE", digits=2)
Equal-Effects Model (k = 9) I^2 (total heterogeneity / total variability): 96.68% H^2 (total variability / sampling variability): 30.13 Test for Heterogeneity: Q(df = 8) = 241.06, p-val < .01 Model Results: estimate se zval pval ci.lb ci.ub -3.49 0.78 -4.47 <.01 -5.03 -1.96 *** --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The results indicate that patients receiving specialized care spent, on average, three and a half days less in the hospital than those in the control groups ($\hat{\theta} = -3.5$). However, the results of the studies are clearly heterogeneous ($Q(df=8) = 241.06$, $p < .01$).
Random-Effects Model
A random-effects model using the DerSimonian-Laird estimator can be fitted to these data with:
rma(yi, vi, data=dat, method="DL", digits=2)
Random-Effects Model (k = 9; tau^2 estimator: DL) tau^2 (estimated amount of total heterogeneity): 218.72 (SE = 195.47) tau (square root of estimated tau^2 value): 14.79 I^2 (total heterogeneity / total variability): 96.68% H^2 (total variability / sampling variability): 30.13 Test for Heterogeneity: Q(df = 8) = 241.06, p-val < .01 Model Results: estimate se zval pval ci.lb ci.ub -14.10 5.28 -2.67 <.01 -24.45 -3.75 ** --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
As shown in Table X (p. 352), we obtain an estimate for the amount of heterogeneity equal to $\hat{\tau}^2 = 219$ and an estimate for the average mean difference equal to $\hat{\mu} = -14$ (with 95% CI: $-24$ to $-4$).
The random-effects model using REML estimation can be fitted with:
rma(yi, vi, data=dat, digits=2)
Random-Effects Model (k = 9; tau^2 estimator: REML) tau^2 (estimated amount of total heterogeneity): 685.20 (SE = 360.37) tau (square root of estimated tau^2 value): 26.18 I^2 (total heterogeneity / total variability): 98.92% H^2 (total variability / sampling variability): 92.26 Test for Heterogeneity: Q(df = 8) = 241.06, p-val < .01 Model Results: estimate se zval pval ci.lb ci.ub -15.12 8.95 -1.69 0.09 -32.67 2.43 . --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
The estimate of the amount of heterogeneity is now quite a bit larger (i.e., $\hat{\tau}^2 = 685$), but we still obtain a very similar value for the average mean difference (i.e., $\hat{\mu} = -15$). The 95% confidence interval for $\mu$ is now quite a bit wider (i.e., $-33$ to $2$) and in fact includes the value $0$ now (hence, we cannot reject $\mbox{H}_0{:}\; \mu = 0$ at $\alpha = .05$). Again, these results match what is reported in Table X (p. 352).2)
References
Hine, L. K., Laird, N., Hewitt, P., & Chalmers, T. C. (1989). Meta-analytic evidence against prophylactic use of lidocaine in acute myocardial infarction. Archives of Internal Medicine, 149(12), 2694–2698.
Normand, S. T. (1999). Meta-analysis: Formulating, evaluating, combining, and reporting. Statistics in Medicine, 18(3), 321–359.
Viechtbauer, W. (2005). Bias and efficiency of meta-analytic variance estimators in the random-effects model. Journal of Educational and Behavioral Statistics, 30(3), 261–293.
scoring=1000
in SAS (so that Fisher scoring is used), then SAS will also use the expected Hessian and the results agree.