# The metafor Package

A Meta-Analysis Package for R

### Sidebar

tips:meta_regression_with_log_rr

## Interpreting Coefficients in Meta-Regression Models with (Log) Risk Ratios

In this tutorial, we will examine ways of interpreting the coefficients in meta-regression models when the log risk ratio is used as the effect size measure.

### Data Preparation

For this illustration, I will use (once again) the BCG vaccine dataset. To start, let's compute the log risk ratios (and corresponding sampling variances) for each study. For this illustration, we will compute the log risk ratios in such a way that they reflect the difference in the (log-transformed) risk of a tuberculosis infection in those from the control (i.e., not vaccinated) group compared to those from the treatment (i.e., vaccinated) group:1)

library(metafor)
dat <- escalc(measure="RR", ai=cpos, bi=cneg, ci=tpos, di=tneg, data=dat.bcg)
dat
   trial               author year tpos  tneg cpos  cneg ablat      alloc      yi     vi
1      1              Aronson 1948    4   119   11   128    44     random  0.8893 0.3256
2      2     Ferguson & Simes 1949    6   300   29   274    55     random  1.5854 0.1946
3      3      Rosenthal et al 1960    3   228   11   209    42     random  1.3481 0.4154
4      4    Hart & Sutherland 1977   62 13536  248 12619    52     random  1.4416 0.0200
5      5 Frimodt-Moller et al 1973   33  5036   47  5761    13  alternate  0.2175 0.0512
6      6      Stein & Aronson 1953  180  1361  372  1079    44  alternate  0.7861 0.0069
7      7     Vandiviere et al 1973    8  2537   10   619    19     random  1.6209 0.2230
8      8           TPT Madras 1980  505 87886  499 87892    13     random -0.0120 0.0040
9      9     Coetzee & Berjak 1968   29  7470   45  7232    27     random  0.4694 0.0564
10    10      Rosenthal et al 1961   17  1699   65  1600    42 systematic  1.3713 0.0730
11    11       Comstock et al 1974  186 50448  141 27197    18 systematic  0.3394 0.0124
12    12   Comstock & Webster 1969    5  2493    3  2338    33 systematic -0.4459 0.5325
13    13       Comstock et al 1976   27 16886   29 17825    33 systematic  0.0173 0.0714

The yi values are the observed log risk ratios for the 13 trials, reflecting the (usually) higher risk of an infection in those who did NOT receive the BCG vaccine compared to those who did (while the vi values are the corresponding sampling variances).

### Meta-Regression with a Categorical Moderator

Now let's fit a meta-regression model to these outcomes with the three-level alloc factor as a moderator:

res <- rma(yi, vi, mods = ~ factor(alloc), data=dat)
res
Mixed-Effects Model (k = 13; tau^2 estimator: REML)

tau^2 (estimated amount of residual heterogeneity):     0.3615 (SE = 0.2111)
tau (square root of estimated tau^2 value):             0.6013
I^2 (residual heterogeneity / unaccounted variability): 88.77%
H^2 (unaccounted variability / sampling variability):   8.91
R^2 (amount of heterogeneity accounted for):            0.00%

Test for Residual Heterogeneity:
QE(df = 10) = 132.3676, p-val < .0001

Test of Moderators (coefficients 2:3):
QM(df = 2) = 1.7675, p-val = 0.4132

Model Results:

estimate      se     zval    pval    ci.lb   ci.ub
intrcpt                    0.5180  0.4412   1.1740  0.2404  -0.3468  1.3827
factor(alloc)random        0.4478  0.5158   0.8682  0.3853  -0.5632  1.4588
factor(alloc)systematic   -0.0890  0.5600  -0.1590  0.8737  -1.1867  1.0086

---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

While the factor as a whole is not significant (as we can see from the $Q_M$-test), we will ignore this for this illustration. So, how do we interpret the coefficients from this model? To start with, the intercept is the estimated average log risk ratio for alternate allocation (since we see coefficients for random and systematic in the model, we know that alternate was automatically chosen as the reference level). However, a log risk ratio is a bit difficult to interpret, so we can exponentiate this coefficient to obtain an estimate of the average risk ratio:2)

exp(coef(res)[[1]])
[1] 1.678593

To also obtain the corresponding confidence interval (CI), we can use the predict() function (and we might as well round the results to two decimals, because anything beyond that is probably just noise):

predict(res, newmods=c(0,0), transf=exp, digits=2)
 pred ci.lb ci.ub pi.lb pi.ub
1.68  0.71  3.99  0.39  7.24

Hence, the estimated average risk ratio (of a tuberculosis infection in those NOT receiving the BCG vaccine compared to those who did) in trials using random allocation is 1.68 (with 95% CI: 0.71 to 3.99). Another way of expressing the same result is to say that the risk was on average 1.68 times (or 68%) greater in those who did not receive the vaccine compared to those who did.

The other two coefficients of the model reflect the difference in the (average) log risk ratio when comparing random and systematic allocation with alternate allocation. Hence, if we want to know what the average risk ratio for random allocation was, we just add the second coefficient to the intercept and exponentiate the result:

exp(coef(res)[[1]] + coef(res)[[2]])
[1] 2.62682

Again, using the predict() function also provides us with the CI:

predict(res, newmods=c(1,0), transf=exp, digits=2)
 pred ci.lb ci.ub pi.lb pi.ub
2.63  1.56  4.44  0.72  9.54

Note that predict() (by default) automatically adds the intercept term, so using newmods=c(1,0) gives us the linear combination intercept + coefficient for random * 1 + coefficient for systematic * 0 which is then exponentiated to obtain the desired result. Hence, under this form of allocation, we estimate an average risk ratio of 2.63 (95% CI: 1.56 to 4.44) or a 2.63 times greater risk in non-vaccinated individuals.

The average risk ratio was therefore greater under random compared to alternate allocation. How much greater? $2.63 / 1.68 \approx 1.56$ times greater! This is a ratio of risk ratios (RRR), which we can also obtain directly by exponentiating the coefficient for random allocation:

exp(coef(res)[[2]])
[1] 1.564894

To get the CI for this value, we can again use the predict() function, but now we just want to pick out the coefficient for random allocation and not add the intercept, so we use:

predict(res, newmods=c(1,0), intercept=FALSE, transf=exp, digits=2)
 pred ci.lb ci.ub pi.lb pi.ub
1.56  0.57  4.30  0.33  7.39

Hence, the average risk ratio for random allocation is 1.56 times greater compared alternate allocation (95% CI: 0.57 to 4.30).

To illustrate that this is indeed a ratio of the two risk ratios for these two subgroups of trials, let's fit a random-effects model in the two subgroups:

res.a <- rma(yi, vi, data=dat, subset=alloc=="alternate", tau2=res$tau2) res.r <- rma(yi, vi, data=dat, subset=alloc=="random", tau2=res$tau2)

We need to set $\tau^2$ to the value from the meta-regression model (otherwise, due to different $\tau^2$ values in the two subgroups, the following will not give us the same result). Now we can obtain the estimated average risk ratio via exponentiation in each subgroup and take the ratio:

exp(coef(res.a)) / exp(coef(res.r))
[1] 1.564894

This is the exact same value we obtained above. Hence, we see that we are indeed getting an estimate of the ratio of two risk ratios when we exponentiate a coefficient from the meta-regression model.

### Meta-Regression with a Continuous Moderator

The same principle also applies when the model involves a continuous moderator. Let's use the absolute latitude (i.e., the distance from the equator) as such a moderator:

res <- rma(yi, vi, mods = ~ ablat, data=dat)
res
Mixed-Effects Model (k = 13; tau^2 estimator: REML)

tau^2 (estimated amount of residual heterogeneity):     0.0764 (SE = 0.0591)
tau (square root of estimated tau^2 value):             0.2763
I^2 (residual heterogeneity / unaccounted variability): 68.39%
H^2 (unaccounted variability / sampling variability):   3.16
R^2 (amount of heterogeneity accounted for):            75.62%

Test for Residual Heterogeneity:
QE(df = 11) = 30.7331, p-val = 0.0012

Test of Moderators (coefficient 2):
QM(df = 1) = 16.3571, p-val < .0001

Model Results:

estimate      se     zval    pval    ci.lb   ci.ub
intrcpt   -0.2515  0.2491  -1.0095  0.3127  -0.7397  0.2368
ablat      0.0291  0.0072   4.0444  <.0001   0.0150  0.0432  ***

---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Based on this model, we therefore estimate that the average risk ratio at the equator is:3)

predict(res, newmods=0, transf=exp, digits=2)
 pred ci.lb ci.ub pi.lb pi.ub
0.78  0.48  1.27  0.38  1.61

The coefficient for ablat reflects the change in the average log risk ratio for a one-unit (i.e., degree) increase in absolute latitude. Hence, for example at 30 degrees, the estimated average risk ratio is:

predict(res, newmods=30, transf=exp, digits=2)
 pred ci.lb ci.ub pi.lb pi.ub
1.86  1.51  2.30  1.04  3.33

This risk ratio is $1.86 / 0.78 \approx 2.39$ times greater than the one at the equator. This is the same as taking the coefficient, multiplying it by 30, and then exponentiating the result:

exp(coef(res)[[2]] * 30)
[1] 2.394202

With predict(), we can again obtain the corresponding CI:

predict(res, newmods=30, intercept=FALSE, transf=exp, digits=2)
 pred ci.lb ci.ub pi.lb pi.ub
2.39  1.57  3.66  1.20  4.76

So, once again, this value actually reflects the ratio of two average risk ratios (at 30 degrees absolute latitude versus the equator).

The absolute values of the moderator variable do not actually matter here. We could just as well compare 40 with 10 degrees (or any other 30 degree difference) to obtain the same result. For example:

predict(res, newmods=10, transf=exp, digits=2)
predict(res, newmods=40, transf=exp, digits=2)

yields

 pred ci.lb ci.ub pi.lb pi.ub
1.04  0.72  1.50  0.54  2.00

and

 pred ci.lb ci.ub pi.lb pi.ub
2.49  1.96  3.17  1.38  4.51

which again gives us a ratio of $2.49 / 1.04 \approx 2.39$.

### Other Effect Size Measures

The same principle applies to other log-transformed effect size measures, such as the log odds ratio, the log incidence ratio, and the log-transformed ratio of means (also called the log response ratio), except that we would then get a ratio of odds ratios, a ratio of incidence ratios, or a ratio of the ratio of two means (yes, that sounds quite convoluted).

In fact, in regular logistic regression modeling, this phenomenon is well known. Let's say we fit a logistic regression model with two predictors and also include their interaction in the model. When we then exponentiate the coefficient for the interaction term, we get a ratio of two odds ratios. See here for a discussion of this.

1)
Note that in many illustrations using the same dataset, the log risk ratios are typically computed the other way around (i.e., the log-transformed risk in the vaccinated versus the non-vaccinated group), but I will deviate from this in this illustration, because it makes the description of the results a bit simpler.
2)
Strictly speaking, since exponentiation is a non-linear transformation, we are not getting an estimate of the average risk ratio, but the median risk ratio. However, this is subtle and often ignored aspect that we will also ignore for the rest of this discussion.
3)
Note that the range of the ablat moderator is 13 to 55 degrees, so this is an extrapolation, which should be treated with great caution.