Below is a to-do list and a description of planned features that I would like to implement in the
metafor package at some point in the future. This list is dynamic, incomplete, somewhat disorganized, and I cannot provide any kind of timeline when I will get around to implementing some of those things. If you are interested in working on any of these topics – or you think there are other things that really should make it into the package – let me know and we can discuss this further.
At some point, I plan on adding aspects of what is commonly called the Hunter and Schmidt (H&S) method for meta-analysis. To find out more about this approach, you could take a look at the book:
While the principles underlying this method can be applied to various outcome measures used in meta-analyses, it is most commonly used when meta-analyzing correlation coefficients and is widely used in particular areas of research (especially in industrial and organization psychology). The crucial idea is that we want to correct the observed correlation coefficients for various artifacts (e.g., attenuation due to measurement error), so that the aggregated results reflect the direction and strength of the relationship between the two variables of interest if they could be measured under ideal circumstances (e.g., free of any artifacts such as measurement error).
There are various aspects to the H&S method that deviate a bit from the statistical/theoretical framework underlying the
rma() function (it would take some time to go into the specifics, so I might write up a more detailed description at some point in the future separately). So, this implies two approaches one can take here:
Option 1 would take a lot more work and would require separately maintaining that part of the package together with the rest. The advantage of course would be that users who really want to carry out the H&S method "by the letter" can switch to R.
Option 2 is what I am currently aiming for, because it is easier to implement and fits into the existing package framework. For example, to correct a correlation coefficient for measurement error, the user would need to specify, besides the coefficient and the sample size, also the reliability estimate(s) (for one or both variables). The rest is just the correction for attenuation. This option does not preclude that, at some point I (or anybody else) would write code to implement a line-by-line reimplementation of the H&S method. But that is currently not at the top of my to-do list.
Note that there is now a discussion of the H&S method under the tips section (see: Hunter and Schmidt Method).
I get this question quite a bit and the answer is: Yes, I definitely would like to implement methods for fitting meta-analytic models using Bayesian methods. In fact, I had developed code for this a long time ago (at the time, interfacing with WinBUGS), but it never quite made it into the package and then I got side-tracked with getting other functionality implemented first. I picked up that idea at some point, then started to develop code that would work with JAGS (I would in fact prefer a solution that will work across all platforms), and got side-tracked again. So much to do, so little time ...
There are a couple issues:
rma()function (i.e., 'normal-normal' models).
1/(ni*pi*(1-pi)). At least, that's the large-sample approximation to
piitself. So, one can either compute the sampling variances and leave them as they are or, during the Gibbs sampling, update the sampling variance estimate for each study based on the (back-transformed) predicted
logit(pi)value. This can have quite some impact on the results.
rma.glmm()function. That's going to take more time. Also, do I put this into the same function or write separate functions for that? Hmmm, choices, choices.
rma.mv()function. Multilevel, multivariate, and especially network meta-analyses are the 'hot new things' these days (and in some areas not even so hot anymore) and those types of models can be fitted with this function. But
rma.mv()is already a monster to begin with (it's quite flexible, but that flexibility frequently induces headaches when I stare at the code – and I was the one who wrote it!) and writing a Bayesian analogue to it makes me wake up at night drenched in sweat and still shaking from the terrors that lie beneath.
And I probably forgot some important issues ...
At the moment, the
rma.glmm() function only handles relatively standard random/mixed-effects (meta-regression) models – well, to be honest, most meta-analyses that could benefit from the underlying functionality implemented in that function still stick to simpler approximations, despite the fact that there are good reasons to switch. Anyway, it would be nice if
rma.glmm() could be extended to handle additional random effects (the way
rma.mv() extends the functionality of the
rma() function). While this certainly would be useful, most of the code of the
rma.glmm() function is just a bunch of data manipulation and pre/post-processing, so that other functions/packages (most notably,
glm() and the
glmer() function from the excellent lme4 package) can be used for the model fitting (the exception to this is the conditional logistic model, which results when you condition on the total number of events in each 2x2 table, so that the data in the tables can be modeled with non-central hypergeometric distributions – this I had to implement, in a rather brute-force manner, more or less from scratch). So, if you want to fit more complicated models based on the same underlying logic, it would probably just make sense to switch to
lme4 (or similar/related packages) directly, instead of waiting for me to extend the
rma.glmm() function (which may not happen any time soon, as I frankly do not see such a pressing need here – but this may change in the future).
Right now, all of the figures and graphs that can be created with the package are drawn using 'base graphics'. Of course, there are alternative graphing systems, most notably lattice (authored by Deepayan Sarkar) and ggplot2 (authored by Hadley Wickham) – and especially the latter has become extremely popular in recent years.
Fun fact: The grammar of graphics, which
ggplot2 is based on, was written/developed by Leland Wilkinson, who developed the SYSTAT statistical package (which I used for a bit during grad school) and worked for SPSS, which some may consider the complete antithesis of what R represents. And while we are at it, the Statistical Computing and Graphics sections of the American Statistical Association have an interesting video library showcasing some of the history of systems for dynamic data visualizations.
At any rate, it would be nice to provide the option to create all figures and graphs in
metafor using such alternative systems, especially given some of the limitations of the current approach (for a nice discussion of the underlying issue, see the article "What's in a Name: The Importance of Naming grid Grobs When Drawing Plots in R" by Paul Murrell in the R Journal, Volume 4/2, December 2012.
I am considering adding at least some version of a selection model to the package. But then again, there are the metasens and selectMeta packages, so this isn't all that pressing. But I have some ideas based on a recent model proposed by Martyna Citkowicz and Jack Vevea that I think would make for a nice addition, but I am waiting for the paper to appear in press to consider this is in more detail.
The synthesis of diagnostic studies is of course a major application of meta-analytic methodology. Measures such as sensitivity and specificity can of course be easily meta-analyzed with the
metafor package, but for things like the hierarchical summary receiver operating characteristic (HSROC) model and the binomial-normal version of the bivariate model (the normal-normal bivariate model can be handled with
rma.mv()), one currently has to turn to other packages (see the CRAN Task View: Meta-Analysis for some options). Then again, no need to duplicate existing functionality, so this is also not too high on my priority list.
Lots of other/smaller things:
anova()function should really automatically test appropriate subsets of coefficients for model objects returned by the
rma.mv()functions (like it does for other model fitting functions). I kind of messed up when I made some early design choices that now make implementing this functionality more tricky (if I don't want to break backwards compatibility or rewrite large chunks of code from scratch).
rma.glmmobjects in the first place.
rma()function may be extended to fit so-called 'location-scale' models at some point in the future. Just implementing this isn't really sufficient though – most people wouldn't know what that even means or how it can be useful, so I would need to write an entire article about this method and illustrate its use.
rma.mv()function should allow for one (or multiple?) continuous
random = ~ inner | outerfor modeling random slopes of moderator variables. This is especially relevant for meta-analyses examining 'dose-response' relationships (assuming there are studies examining the responses to more than two levels of a dose/exposure variable).
rma.mv(). Not that those SEs are really all that useful.
rma.mvobjects, to provide profile likelihood CIs.
ID(for a scaled identity matrix) and
DIAG(for a diagonal matrix) for the
rma.mv()? But setting
struct="HCS"does the same thing, so is this really necessary?
ranef()function (with alias
random.effects()as in the nlme package) to just extract the predicted random effects (i.e., free of the fixed effects).
rma.glmm()function currently uses ±1/2 coding for the random effects when
model="UM.FS"(i.e., an unconditional model with fixed study effects). Maybe provide an option that allows for switching to 0/1 coding?
escalc(). See also the recent article by Nakagawa et al. (2015).
rma.mv()(and while we are at it, also
rma.uni()) to also include the fixed effects (instead of profiling out the fixed effects). Then we can profile over the fixed effects to obtain profile likelihood CIs and LRTs for the model coefficients as well.
Largument in the
anova()function? (as in the car package).
rma.mv()and large datasets. See, for example, this comparison of R with Atlas, R with OpenBLAS, and Revolution R Open for an illustration of what I am talking about here. Most of the heavy lifting in
metaforin fact involves matrix algebra, so maybe some real gains could be obtained this way. But this is only relevant if you are dealing with relatively large datasets (in the meta-analytic context, anything beyond a couple hundred effect size estimates is huge!).
method="FE"as if we are fitting an equal-effects model (e.g., residuals, log likelihood, and so on), when in fact the fixed-effects model is more general. But making the appropriate changes would lead to backwards incompatibility, would just confuse most users, and it's not clear if all of the necessary adjustments can be easily implemented.