The metafor Package

A Meta-Analysis Package for R

===== Model Selection using the glmulti and MuMIn Packages =====
  
Information-theoretic approaches provide methods for model selection and (multi)model inference that differ quite a bit from more traditional methods based on null hypothesis testing (e.g., Anderson, 2008; Burnham & Anderson, 2002). These methods can also be used in the meta-analytic context when model fitting is based on likelihood methods. Below, I illustrate how to use the metafor package in combination with the [[https://cran.r-project.org/package=glmulti|glmulti]] and [[https://cran.r-project.org/package=MuMIn|MuMIn]] packages, which provide the necessary functionality for model selection and multimodel inference using an information-theoretic approach.
  
==== Data Preparation ====
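
The moderator names used below match the variables in the ''dat.bangertdrowns2004'' dataset that comes with metafor (from the meta-analysis by Bangert-Drowns et al., 2004, on school-based writing-to-learn interventions), so a minimal sketch of the setup could look like this:
<code rsplus>
library(metafor)

# the dataset already contains the standardized mean differences (yi)
# and the corresponding sampling variances (vi)
dat <- dat.bangertdrowns2004

# remove studies with missing values on any of the seven moderators, so that
# all candidate models are fitted to exactly the same set of studies
preds <- c("length", "wic", "feedback", "info", "pers", "imag", "meta")
dat <- dat[complete.cases(dat[preds]), ]
</code>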
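The model fitting itself can be sketched as follows (assuming the setup above): we give glmulti a small wrapper function around ''rma()'' so that all $2^7 = 128$ possible subsets of the seven moderators are fitted via maximum likelihood estimation (which is required when comparing models differing in their fixed effects via information criteria), and we can then plot the AICc profile of the candidate set:
<code rsplus>
library(glmulti)

# wrapper so that glmulti can fit meta-regression models via rma();
# ML estimation is used so that the models can be compared via the AICc
rma.glmulti <- function(formula, data, ...)
   rma(formula, vi, data=data, method="ML", ...)

# exhaustive screening of all models based on the seven moderators
res <- glmulti(yi ~ length + wic + feedback + info + pers + imag + meta,
               data=dat, level=1, fitfunction=rma.glmulti,
               crit="aicc", confsetsize=128)

# plot the AICc values of the models in the candidate set
plot(res, type="p")
</code>
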
{{ tips:model_selection_ic_values.png?nolink }}
  
The horizontal red line differentiates between models whose AICc value is less versus more than 2 units away from that of the "best" model (i.e., the model with the lowest AICc). The output above shows that there are 10 such models. Sometimes this is taken as a cutoff, so that models with values more than 2 units away are considered substantially less plausible than those with AICc values closer to that of the best model. However, we should not get too hung up on such (somewhat arbitrary) divisions (and there are critiques of this rule; e.g., Anderson, 2008).
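
This count can be verified directly from the ranked candidate set (a sketch, assuming ''res'' is the glmulti object fitted above):
<code rsplus>
# weightable() returns the candidate models (ranked from lowest to highest
# AICc) together with their AICc values and model weights
tab <- weightable(res)
sum(tab$aicc - min(tab$aicc) <= 2)
</code>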
  
But let's take a look at the top 10 models:
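Using the ranking returned by ''weightable()'' above, this is simply:
<code rsplus>
# the 10 models with the lowest AICc values, together with their model weights
tab[1:10, ]
</code>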
Now we can carry out the computations for multimodel inference with:
<code rsplus>
coef(res, varweighting="Johnson")
</code>
The output is not shown, because I don't find it very intuitive. But with a bit of extra code, we can make it more interpretable:
<code rsplus>
mmi <- as.data.frame(coef(res, varweighting="Johnson"))
mmi <- data.frame(Estimate=mmi$Est, SE=sqrt(mmi$Uncond), Importance=mmi$Importance, row.names=row.names(mmi))
# compute z-values, two-sided p-values, and 95% CI bounds for each coefficient
mmi$z <- mmi$Estimate / mmi$SE
mmi$p <- 2*pnorm(abs(mmi$z), lower.tail=FALSE)
mmi$ci.lb <- mmi$Estimate - qnorm(0.975) * mmi$SE
mmi$ci.ub <- mmi$Estimate + qnorm(0.975) * mmi$SE
# relabel and reorder the columns, sort by importance, and round the results
names(mmi) <- c("Estimate", "Std. Error", "Importance", "z value", "Pr(>|z|)", "ci.lb", "ci.ub")
mmi <- mmi[order(mmi$Importance, decreasing=TRUE), c(1,2,4,5,6,7,3)]
round(mmi, 4)
</code>
<code output>
         Estimate Std. Error z value Pr(>|z|)   ci.lb  ci.ub Importance
intrcpt    0.1084     0.1031  1.0514   0.2931 -0.0937 0.3105     1.0000
imag       0.3512     0.2016  1.7414   0.0816 -0.0441 0.7464     0.8478
meta       0.0512     0.0853  0.6003   0.5483 -0.1160 0.2184     0.4244
feedback   0.0366     0.0689  0.5311   0.5954 -0.0985 0.1717     0.3671
length     0.0023     0.0050  0.4527   0.6508 -0.0076 0.0121     0.3255
pers       0.0132     0.0688  0.1925   0.8473 -0.1216 0.1481     0.2913
wic       -0.0170     0.0545 -0.3122   0.7549 -0.1238 0.0897     0.2643
info      -0.0183     0.0799 -0.2286   0.8191 -0.1749 0.1384     0.2416
</code>
  
I rounded the results to 4 digits to make them easier to interpret. Note that the table again includes the importance values. In addition, we get unconditional estimates of the model coefficients (first column). These are model-averaged parameter estimates, that is, weighted averages of the model coefficients across the various models (with weights equal to the model probabilities). These values are called "unconditional" because they are not conditional on any one model (they are still conditional on the 128 models that we have fitted to these data, but not as conditional as fitting a single model and then making all inferences conditional on that one model). Moreover, we get estimates of the unconditional standard errors of these model-averaged values.((Above, we used ''varweighting="Johnson"'' so that equation 6.12 from Burnham and Anderson (2002) is used for computing the standard errors (instead of equation 4.9), as Anderson (2008) recommends.)) These standard errors take two sources of uncertainty into account: (1) uncertainty within a given model (i.e., the standard error of a particular model coefficient shown in the output when fitting that model; as an example, see the output from the "best" model shown earlier) and (2) uncertainty with respect to which model is actually the best approximation to reality (i.e., how much the size of a model coefficient varies across the set of candidate models). The model-averaged parameter estimates and the unconditional standard errors can be used for multimodel inference; that is, we can compute z-values, p-values, and confidence interval bounds for each coefficient in the usual manner.
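
In symbols: letting $\hat{b}_{j,m}$ denote the estimate of the $j$th coefficient in model $m$ (equal to 0 when the corresponding predictor is not part of model $m$) and $w_m$ the model probability of model $m$, the model-averaged estimate is $$\bar{b}_j = \sum_m w_m \hat{b}_{j,m}$$ and (with ''varweighting="Johnson"'', i.e., equation 6.12 from Burnham & Anderson, 2002) its unconditional standard error is computed as $$\mbox{SE}[\bar{b}_j] = \sqrt{\sum_m w_m \left[ \mbox{Var}[\hat{b}_{j,m}] + (\hat{b}_{j,m} - \bar{b}_j)^2 \right]},$$ where the second term inside the brackets reflects how much the coefficient varies across the candidate models.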
  
==== Multimodel Predictions ====
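The basic idea can be sketched as follows: since all of the models are linear in their coefficients, the model-averaged prediction for a particular set of moderator values is the same as the prediction computed from the model-averaged coefficients obtained above (the moderator values below are purely hypothetical):
<code rsplus>
# model-averaged predicted standardized mean difference for a (hypothetical)
# study with the following moderator values; the weighted average of the
# model predictions equals the prediction based on the model-averaged
# coefficients in 'mmi'
xnew <- c(intrcpt=1, imag=1, meta=1, feedback=0, length=15, pers=0, wic=0, info=0)
sum(xnew[rownames(mmi)] * mmi$Estimate)
</code>

One may also wonder what happens if we allow for interactions among the moderators. With glmulti, setting ''level=2'' adds all two-way interactions to the candidate set, and with ''method="d"'' we can ask glmulti to merely report the size of that candidate set without fitting any models (again a sketch under the same assumptions as above):
<code rsplus>
# only enumerate (do not fit) the candidate models when all two-way
# interactions among the seven moderators are also allowed
glmulti(yi ~ length + wic + feedback + info + pers + imag + meta,
        data=dat, level=2, fitfunction=rma.glmulti, crit="aicc", method="d")
</code>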
  
So, the candidate set would include over $2 \times 10^8$ possible models. Fitting all of these models would not only test our patience (and waste valuable CPU cycles), it would also be a pointless exercise (even fitting the 128 models above could be critiqued by some as a mindless hunting expedition -- although if one does not get too fixated on //the// best model, but considers all the models in the set as part of a multimodel inference approach, this critique loses some of its force). I therefore won't consider this any further in this example.
  
==== Using the MuMIn Package ====
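
The initial steps are analogous to before (a sketch, assuming the same data as above): metafor comes with some unexported helper methods that allow MuMIn to work with ''rma'' objects, which we first have to make available; we then fit the full model via ML estimation and let ''dredge()'' fit all 128 subsets:
<code rsplus>
library(MuMIn)

# make the helper methods for 'rma' objects available to MuMIn
eval(metafor:::.MuMIn)

# fit the full model (with ML estimation) and examine all possible subsets
full <- rma(yi ~ length + wic + feedback + info + pers + imag + meta, vi,
            data=dat, method="ML")
res <- dredge(full)
</code>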
Multimodel inference can be done with:
<code rsplus>
summary(model.avg(res))
</code>
<code output>
(full average)
          Estimate Std. Error z value Pr(>|z|)
intrcpt   0.108404   0.103105   1.051   0.2931
imag      0.351153   0.201648   1.741   0.0816
meta      0.051201   0.085290   0.600   0.5483
feedback  0.036604   0.068926   0.531   0.5954
length    0.002272   0.005019   0.453   0.6508
wic      -0.017004   0.054466   0.312   0.7549
pers      0.013244   0.068788   0.193   0.8473
info     -0.018272   0.079911   0.229   0.8191
</code>
I have removed some of the output, since this is the part we are most interested in. These are the same results as in object ''mmi'' shown earlier.
  
Finally, the relative importance values for the predictors can be obtained with:
<code rsplus>
sw(res)
</code>
<code output>
                     imag meta feedback length pers wic  info
Sum of weights:      0.85 0.42 0.37     0.33   0.29 0.26 0.24
N containing models:   64   64   64       64     64   64   64
</code>
These are again the same values we obtained earlier.

==== Other Model Types ====

The same principle can of course be applied when fitting other types of models, such as those that can be fitted with the ''rma.mv()'' or ''rma.glmm()'' functions. One just has to write an appropriate ''rma.glmulti'' function when using the glmulti package (when using the MuMIn package, one simply applies the ''dredge()'' function to the full model). An example illustrating model selection with a fairly complex ''rma.mv()'' model can be found [[https://gist.github.com/wviechtb/891483eea79da21d057e60fd1e28856b|here]] (the example involves only a small number of predictors, but also considers their interaction and illustrates the use of parallel processing with the ''dredge()'' function).
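
For example, such a wrapper for a multilevel model might look like this (a sketch; the ''study/id'' nesting is hypothetical and would need to reflect the actual structure of the data):
<code rsplus>
# wrapper so that glmulti fits multilevel models with estimates (id) nested
# within studies; again, ML estimation is used so that the models can be
# compared via information criteria
rma.glmulti <- function(formula, data, ...)
   rma.mv(formula, vi, random = ~ 1 | study/id, data=data, method="ML", ...)
</code>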

For multivariate/multilevel models fitted with the ''rma.mv()'' function, one can also consider model selection with respect to the random effects structure. Making this work would require some additional effort. Time permitting, I might write up an example illustrating this at some point in the future.
  
==== References ====
  
Anderson, D. R. (2008). //Model based inference in the life sciences: A primer on evidence//. New York: Springer.
  
Bangert-Drowns, R. L., Hurley, M. M., & Wilkinson, B. (2004). The effects of school-based writing-to-learn interventions on academic achievement: A meta-analysis. //Review of Educational Research, 74//(1), 29–58.

Burnham, K. P., & Anderson, D. R. (2002). //Model selection and multimodel inference: A practical information-theoretic approach// (2nd ed.). New York: Springer.