A Guide To Conducting A Meta-Analysis With Non-Independent Effect Sizes

Neuropsychology Review (2019) 29:387–396
https://doi.org/10.1007/s11065-019-09415-6
REVIEW
Mike W.-L. Cheung 1
Received: 10 August 2018 / Accepted: 14 August 2019 / Published online: 24 August 2019
© The Author(s) 2019
Abstract
Conventional meta-analytic procedures assume that effect sizes are independent. When effect sizes are not independent, conclusions based on these conventional procedures can be misleading or even wrong. Traditional approaches, such as averaging the effect sizes and selecting one effect size per study, are usually used to avoid the dependence of the effect sizes. These ad-hoc approaches, however, may lead to missed opportunities to utilize all available data to address the relevant research questions. Both multivariate meta-analysis and three-level meta-analysis have been proposed to handle non-independent effect sizes. This paper gives a brief introduction to these new techniques for applied researchers. The first objective is to highlight the benefits of using these methods to address non-independent effect sizes. The second objective is to illustrate how to apply these techniques with real data in R and Mplus. Researchers may modify the sample R and Mplus code to fit their data.

Keywords: Meta-analysis · Multivariate meta-analysis · Three-level meta-analysis · Non-independent effect size

A single study rarely provides enough evidence to address research questions in a particular domain. Replications are generally the preferred approach for addressing critical scientific questions (e.g., Open Science Collaboration, 2012, 2015). Replications of studies are particularly important, given that many of the published findings are said to be non-replicable. When there is a large pool of empirical studies on a similar topic, a meta-analysis can be used to synthesize these research findings (Anderson & Maxwell, 2016). Meta-analysis is generally recognized as the method for synthesizing research findings in disciplines across the social, behavioral, and medical sciences (e.g., Gurevitch, Koricheva, Nakagawa, & Stewart, 2018; Hedges & Schauer, 2018; Hunt, 1997). A few psychological journals, such as Psychological Bulletin (Albarracín et al., 2018) and Neuropsychology Review (Loring & Bowden, 2016), are dedicated to publishing high-quality systematic reviews and meta-analyses.

Many books introducing how to conduct a systematic review and meta-analysis have already been published (e.g., Borenstein, Hedges, Higgins, & Rothstein, 2009; Card, 2012; Cheung, 2015a; Cooper, Hedges, & Valentine, 2009; Hedges & Olkin, 1985). Cheung and Vijayakumar (2016) recently gave a brief introduction to how neuropsychologists can conduct a meta-analysis. Their introduction assumes that the effect sizes are independent, which is a crucial assumption in a meta-analysis. It is rare for primary studies to report only one relevant effect size. Reported effect sizes are likely to be non-independent for various reasons. The sampling errors of the effect sizes may be correlated because the same participants are involved in calculating the effect sizes. For example, the same control group is used in calculating the treatment effects or there is more than one outcome effect size. Another reason for non-independent effect sizes is that the effect sizes of the independent samples are nested within a primary study. This nested structure will create dependence when a meta-analysis is conducted. Results based on conventional meta-analytic methods are inappropriate or even misleading. Many advances in how to handle non-independent effect sizes have been made in the past decade. Applied researchers, however, may not be familiar with these advanced meta-analytic techniques.

Electronic supplementary material: The online version of this article (https://doi.org/10.1007/s11065-019-09415-6) contains supplementary material, which is available to authorized users.

Corresponding author: Mike W.-L. Cheung, mikewlcheung@nus.edu.sg

1 Department of Psychology, Faculty of Arts and Social Sciences, National University of Singapore, Block AS4, Level 2, 9 Arts Link, Singapore 117570, Singapore

Therefore, the primary objective of this paper is to give an introduction on how to handle non-independent effect sizes in a meta-analysis. We will introduce the multivariate meta-analysis and three-level meta-analysis to handle two types of non-independence in a meta-analysis. The second objective is to illustrate how to apply these techniques with real data in the R statistical platform (R Development Core Team, 2019) and Mplus (Muthén & Muthén, 2017). Researchers may modify the sample R and Mplus code to fit their models. In the following sections, we first provide some background on the problems arising from non-independent effect sizes and how to address these problems with conventional versus preferred meta-analytic methods. Two real examples in published meta-analyses are used to illustrate how to analyze non-independent effect sizes.

What Are the Key Assumptions in a Meta-Analysis?

To facilitate the introduction, we first review a standard random-effects meta-analytic model (e.g., Borenstein et al., 2009; Hedges & Olkin, 1985). We use $y_i$ to represent a generic effect size in the $i$th study. The effect size can be either a standardized (or raw) mean difference, a correlation coefficient (or its Fisher’s z transformation), a log-odds ratio, or some other effect size (e.g., Cheung & Vijayakumar, 2016). The random-effects meta-analytic model is:

$$y_i = \beta_R + u_i + e_i, \tag{1}$$

where $\beta_R$ is the average population effect, $\mathrm{Var}(u_i) = \tau^2$ is the population heterogeneity variance that has to be estimated, and $\mathrm{Var}(e_i) = v_i$ is the known sampling variance in the $i$th study. The heterogeneity variance $\tau^2$ is an absolute index of heterogeneity that depends on the type of effect size. That is, we cannot compare the computed heterogeneity variances across different types of effect size. We may calculate a relative heterogeneity index $I^2$ to indicate what percentage of the total variation is due to between-study heterogeneity (Higgins & Thompson, 2002). It should be noted that the value of $I^2$ is affected by the “typical” within-study sampling variance, which is affected by the sample sizes in the primary studies (Borenstein, Higgins, Hedges, & Rothstein, 2017). Given the same value of $\tau^2$, $I^2$ may become larger when the “typical” within-study sampling variance becomes smaller.

When there is excess variation in the population effect sizes, researchers may want to explain the heterogeneity in terms of the characteristics of the study. The model can be extended to include moderators, say $x_i$, to explain the heterogeneity of the effect sizes:

$$y_i = \beta_0 + \beta_1 x_i + u_i + e_i, \tag{2}$$

where $\beta_0$ and $\beta_1$ are the intercept and regression coefficient, respectively. Multiple moderators may be included in the model. When there is a categorical moderator with more than two categories, dummy coded moderators may be used. In addition to testing the significance of the moderators, we may also calculate an $R^2$ index to quantify the percentage of the heterogeneity variance that can be explained by adding the moderators.

There are two critical assumptions in random- and mixed-effects meta-analyses. First, the sample effect size $y_i$ is conditionally distributed as a normal distribution with a known sampling variance $v_i$. Several factors may affect the appropriateness of this assumption. The first factor is the type of the effect size. A raw mean difference, for example, approaches a normal distribution much faster than would a correlation coefficient or an odds ratio. For a correlation coefficient and an odds ratio, we may apply transformations to “normalize” their sampling distributions. For example, a log transformation on the odds ratio and a Fisher’s z transformation on the correlation coefficient are usually applied before a meta-analysis is conducted. Another factor is the size of the sample. If the sample size is large enough, the sampling distribution of the effect size can be assumed to be approximately normal with a known sampling variance. Depending on the types of effect sizes, reasonably large sample sizes in primary studies are expected in a meta-analysis. Some (transformed) effect sizes, for example, the raw mean difference and the Fisher’s z transformed score, work well even for small sample sizes when the underlying populations are normally distributed.

The second critical assumption is that the effect sizes are independent. When the effect sizes in a meta-analysis are not independent, the estimated standard errors (SEs) on the average effect are generally under-estimated (López-López, Van den Noortgate, Tanner-Smith, Wilson, & Lipsey, 2017). Researchers may incorrectly conclude that the average effect is very precise. This problem is well known in the context of multilevel models (Goldstein, 2011; Hox, 2010; Raudenbush & Bryk, 2002). If we incorrectly treat the non-independent data as independent, the statistical inferences are likely to be wrong. Therefore, researchers should not treat non-independent effect sizes as if they were independent. How to handle non-independent effect sizes is the focus of this paper.

How Many Types of Non-Independent Effect Sizes Are There?

We may roughly classify non-independent effect sizes into multivariate and nested effect sizes. Other more sophisticated types of non-independence will be addressed in the Conclusion and Future Directions section. Table 1 shows a sample data structure of two multivariate effect sizes. $y_1$ and $y_2$ represent two different outcome measures, for example, physical and psychological improvements after a treatment.
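
As a side illustration of the random-effects model in Eq. (1) and of the I2 index discussed in the assumptions section, the following minimal Python sketch may help. It is not the estimation method used by metaSEM or metafor: it swaps in the DerSimonian-Laird moment estimator for the heterogeneity variance as a simple stand-in, and it computes I2 from the “typical” within-study variance of Higgins and Thompson (2002). The function names and the four effect sizes are hypothetical and serve only as an example.

```python
import math


def dersimonian_laird(y, v):
    """Moment-based estimates for Eq. (1): returns (beta_R, SE, tau^2).

    Assumption: DerSimonian-Laird is used here only as a simple stand-in;
    likelihood-based estimation would generally give different numbers.
    """
    k = len(y)
    w = [1.0 / vi for vi in v]                      # fixed-effect weights
    sw = sum(w)
    y_fixed = sum(wi * yi for wi, yi in zip(w, y)) / sw
    q = sum(wi * (yi - y_fixed) ** 2 for wi, yi in zip(w, y))  # Cochran's Q
    c = sw - sum(wi * wi for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c)              # truncated at zero
    w_star = [1.0 / (vi + tau2) for vi in v]        # random-effects weights
    beta_r = sum(wi * yi for wi, yi in zip(w_star, y)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return beta_r, se, tau2


def typical_within_variance(v):
    """'Typical' within-study sampling variance (Higgins & Thompson, 2002)."""
    w = [1.0 / vi for vi in v]
    k = len(w)
    sw = sum(w)
    sw2 = sum(wi * wi for wi in w)
    return (k - 1) * sw / (sw * sw - sw2)


def i_squared(tau2, v):
    """Share of total variation attributed to between-study heterogeneity."""
    return tau2 / (tau2 + typical_within_variance(v))


# Hypothetical effect sizes and known sampling variances (not from the paper).
y = [0.10, 0.30, 0.35, 0.65]
v = [0.02, 0.03, 0.02, 0.04]
beta_r, se, tau2 = dersimonian_laird(y, v)
print(f"beta_R = {beta_r:.3f}, SE = {se:.3f}, tau^2 = {tau2:.3f}")

# Same tau^2, smaller within-study variances: I^2 increases.
print(i_squared(0.04, [0.04] * 5), i_squared(0.04, [0.01] * 5))
```

With a fixed heterogeneity variance of 0.04, the last line gives an I2 of about 0.5 when every sampling variance is 0.04 and about 0.8 when every sampling variance is 0.01, which mirrors the point that I2 grows as the within-study variances shrink.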

Table 1 Sample data structure for a multivariate meta-analysis with two multivariate effect sizes

Study   y1    y2    V11   V21   V22
1       .35   .52   .02   .01   .02
2       .43   NA    .03   NA    NA
3       NA    .27   NA    NA    .01

Note. y1 and y2 are the multivariate effect sizes. V11, V21, and V22 are the known sampling variances and covariance of y1 and y2. NA represents not available.

Table 2 Sample data structure for a three-level meta-analysis

Cluster   y     v
1         .32   .02
1         .54   .02
1         .41   .01
2         .06   .03
2         .02   .03
3         .37   .05

Note. y is the effect size and v is the known sampling variance of y. Cluster indicates how the effect sizes are grouped.

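
To make the two data layouts concrete, the sketch below (plain Python, not the metaSEM or metafor API) does two things with the numbers in Table 1. First, it shows how a sampling covariance V21 follows from the two sampling variances and a within-study correlation r through V21 = r * sqrt(V11 * V22), which is the definition of a correlation; the correlation itself is usually an assumed value. Second, it rearranges the wide rows of Table 1 into a long, one-row-per-effect-size layout like that of Table 2, dropping the unavailable entries. The function names and the list of assumed correlations are hypothetical.

```python
import math


def sampling_cov(v11, v22, r):
    """Sampling covariance implied by variances v11, v22 and correlation r."""
    return r * math.sqrt(v11 * v22)


def implied_corr(v11, v21, v22):
    """Correlation implied by a reported sampling covariance v21."""
    return v21 / math.sqrt(v11 * v22)


# Wide layout of Table 1: (study, y1, y2, V11, V21, V22); None stands for NA.
table1 = [
    (1, 0.35, 0.52, 0.02, 0.01, 0.02),
    (2, 0.43, None, 0.03, None, None),
    (3, None, 0.27, None, None, 0.01),
]

# Study 1 reports both outcomes, so its covariance implies a correlation.
_, _, _, v11, v21, v22 = table1[0]
r_study1 = implied_corr(v11, v21, v22)
print("implied correlation in Study 1:", round(r_study1, 3))

# Sensitivity of V21 to the assumed correlation (illustrative values only).
for r in (0.0, 0.5, 0.8):
    print("r =", r, "-> V21 =", round(sampling_cov(v11, v22, r), 4))


def to_long(rows):
    """Rearrange wide rows into (cluster, outcome, y, v), skipping NA."""
    long_rows = []
    for study, y1, y2, var1, _cov, var2 in rows:
        if y1 is not None:
            long_rows.append((study, "y1", y1, var1))
        if y2 is not None:
            long_rows.append((study, "y2", y2, var2))
    return long_rows


for row in to_long(table1):
    print(row)
```

The long layout discards V21, which is exactly the information a three-level analysis does not use; whether that is acceptable depends on the assumptions discussed later in the paper.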
Both $y_1$ and $y_2$ are reported in Study 1, whereas only one of the two is reported in Studies 2 and 3.

With regard to multivariate effect sizes, the sampling errors of the effect sizes are usually conditionally correlated because the same participants are used when calculating the multiple effect sizes (e.g., Raudenbush, Becker, & Kalaian, 1988; Timm, 1999). In Table 1, $V_{21}$ represents the sampling covariance of $y_1$ and $y_2$, which is usually non-zero. For example, the common practice is to have several treatment groups with one control group in experimental or intervention studies. The effect sizes of the treatment groups are calculated against the same control group. The effect sizes in this setting are non-independent because the same control group is used to calculate the effect sizes. Studies that employ this approach are known as multiple-treatment studies (Gleser & Olkin, 2009).

A second example of multivariate effect sizes is the multiple-endpoint study (Gleser & Olkin, 2009). Abramovitch, Anholt, Raveh-Gottfried, Hamo, and Abramowitz (2018) investigated the effects of Obsessive Compulsive Disorder (OCD) on Intelligence Quotient (IQ). Since IQ scores may be assessed in terms of Full-Scale IQ, Verbal IQ, and Performance IQ, the effect can be conceptualized as three inter-related outcomes. The degree of dependence of the multivariate effect sizes, $V_{21}$ in Table 1, may be calculated from the summary statistics (Cheung, 2018; Gleser & Olkin, 2009).

A third example is drawn from the study of Weissberger et al. (2017), who were interested in examining the accuracy of neuropsychological assessments in detecting mild cognitive impairment (MCI) and Alzheimer’s dementia (AD). The accuracy of such assessments is usually quantified by the sensitivity and specificity of the tests that are used to make the assessment. The sampling errors of the sensitivity and specificity are conditionally independent because there is no overlap of participants in the groups with and without the condition (or disease; e.g., Li & Fine, 2011). The random effects, however, may still be correlated. Therefore, we should still treat the sensitivity and specificity as multivariate effect sizes in the analysis.

The second type of dependence is attributable to nested effect sizes, that is, the effect sizes that are nested within a unit, for example, a study. Table 2 displays a sample data structure of nested effect sizes. The label “Cluster” indicates how the independent effect sizes are grouped together. The sampling errors of the effect sizes $y$, with known sampling variances $v$, are conditionally independent. Thus, there is no sampling covariance $V_{21}$ in the data structure. Since the effect sizes within a cluster are likely to be more similar to each other than the effect sizes across clusters, the population effect sizes may not be independent. This situation is similar to the case of participants nested within a level-2 unit in a multilevel model. A study may report multiple effect sizes from multiple independent samples. These effect sizes are measuring the same construct relevant to our research questions; that is, it would be reasonable to combine these effect sizes into a single effect size. For example, Mauger et al. (2018) studied how Turner syndrome affected executive functions in children and adolescents. Since more than one effect size on the executive functions was reported in the primary studies, these authors treated the effect sizes as nested within the primary studies.

Another example is that we may conceptualize some higher-level units, for example, country or research groups, as our unit of analysis. The reported effect sizes (or studies) are nested within these higher-level units. There are several such examples in cross-cultural meta-analyses, where the studies are nested within the countries (Fischer & Boer, 2011; Fischer, Hanke, & Sibley, 2012). Another interesting example is the single-subject design. In single-subject designs, effect sizes are calculated for each subject. The effect sizes of the subjects are nested within studies. When researchers meta-analyze these effect sizes, they have to take the dependence of the effect sizes into account (Moeyaert, Ugille, Ferron, Beretvas, & Van den Noortgate, 2013).

What Are the Common Approaches to Handling Multivariate Effect Sizes?

We use the example of Abramovitch et al. (2018), which has been introduced, to start our discussion. These authors extracted 98 studies containing the IQ scores of OCD patients and non-psychiatric comparison groups. Since the primary studies reported some of the Full Scale IQ, Verbal IQ, and Performance IQ scores, three separate effect sizes might be
calculated in each study. One popular option for dealing with multivariate effect sizes is to analyze them independently. Abramovitch et al. (2018) conducted three separate meta-analyses on Full Scale IQ, Verbal IQ, and Performance IQ. There are several such examples in the literature (e.g., Belleville et al., 2017; Weissberger et al., 2017). This approach is appealing because no new technique needs to be used. However, the primary limitation of this approach is that it does not take advantage of the dependence among the effect sizes in the analyses.

A multivariate meta-analysis is generally recommended for handling multivariate effect sizes (e.g., Cheung, 2013; Hedges & Olkin, 1985; Nam, Mengersen, & Garthwaite, 2003; Raudenbush et al., 1988; Jackson, Riley, & White, 2011). In this approach, the idea is similar to extending an ANOVA to a MANOVA to handle more than one dependent variable. Now, let $\mathbf{y}_i$ be a $p \times 1$ vector of effect sizes ($p$ is the number of outcome effect sizes). The meta-analytic model in Eq. (1) can be extended to handle multivariate effect sizes as follows:

$$\mathbf{y}_i = \boldsymbol{\beta}_R + \mathbf{u}_i + \mathbf{e}_i, \tag{3}$$

where $\boldsymbol{\beta}_R$ is the vector of the average population effects, $\mathrm{Cov}(\mathbf{u}_i) = \mathbf{T}^2$ is the $p \times p$ population heterogeneity variance-covariance matrix that has to be estimated, and $\mathrm{Cov}(\mathbf{e}_i) = \mathbf{V}_i$ is the $p \times p$ known sampling variance-covariance matrix in the $i$th study that is computed from the summary statistics (Cheung, 2015a, Chapter 3). When the studies report different numbers of effect sizes, the incomplete effect sizes are filtered out before the analysis. This equation can easily be extended to a mixed-effects model, as we did in Eq. (2).

Apart from estimating the average population effects $\boldsymbol{\beta}_R$ and their heterogeneity variance-covariance matrix $\mathbf{T}^2$, several interesting research questions can be tested in a multivariate meta-analysis. Using the study of Abramovitch et al. (2018) as an example, we may treat the IQ domains (Full Scale IQ, Verbal IQ, and Performance IQ) as multiple outcomes and test whether the average effects of OCD on these IQ domains are the same. We may also verify whether the heterogeneity variances are the same across different IQ domains. By inspecting the means and heterogeneity variances, researchers may get a better idea of what the effect of OCD is. Moreover, we may study the correlation between the population random effects (IQ domains in our example). If the correlation is high, this indicates that studies with a higher population effect on one IQ domain are associated with studies with a higher population effect on another IQ domain.

When comparing the univariate and multivariate meta-analyses, Ishak, Platt, Joseph, and Hanley (2008) have argued that researchers may conduct univariate meta-analyses without introducing any bias or loss of precision in the fixed-effects estimates. Several authors (e.g., Demidenko, 2013; Riley, 2009) have shown that the estimated fixed effects in a multivariate meta-analysis usually have smaller SEs. In other words, we may get more precise estimates (smaller confidence intervals (CIs)) by using a multivariate meta-analysis.

Two factors may affect the usefulness of multivariate meta-analyses. The first factor is the correlation between the population effect sizes. The presence of a positive (or negative) association between the effect sizes reduces the uncertainty of the estimates of other effect sizes. This is similar to the case of the MANOVA (see Cheung, 2015a, Section 5.1.2 for a discussion). The second factor is the number of studies with complete effect sizes. If there are only a few studies with complete effect sizes, there would be little information to estimate the correlation among the population effect sizes. If all of the primary studies report only one of the Full Scale IQ, Verbal IQ, or Performance IQ, the estimated correlation between the population effect sizes would be zero. Should there be no correlation among the population effect sizes, the results of the univariate and multivariate meta-analyses would be the same. Researchers are encouraged to apply a multivariate meta-analysis whenever possible (Jackson et al., 2011) because of the benefits that can be obtained from the correlated effect sizes. In the worst-case scenario where the effect sizes are uncorrelated, the results of the multivariate meta-analysis would be similar to those obtained from running several univariate meta-analyses.

What Are the Common Approaches to Handling Nested Effect Sizes?

When the effect sizes are nested within some hierarchies, for example, studies, there is a clear consensus that we should not ignore the dependence and analyze the data as if they were independent. If we ignore the dependence, the SEs and the statistical inferences of the analyses would likely be incorrect.

A three-level meta-analysis was proposed to address the problems mentioned above (Cheung, 2014; Konstantopoulos, 2011). The standard meta-analytic model in Eq. (1) can be extended to handle nested effect sizes. We use $y_{ij}$ to represent the $i$th effect size in the $j$th study. The three-level meta-analysis is:

$$y_{ij} = \beta_R + u_{(2)ij} + u_{(3)j} + e_{ij}, \tag{4}$$

where $\beta_R$ and $e_{ij}$ are defined as in Equation (1), and $\mathrm{Var}(u_{(2)ij}) = \tau^2_{(2)}$ and $\mathrm{Var}(u_{(3)j}) = \tau^2_{(3)}$ are the level-2 and level-3 heterogeneity variances, respectively. This analysis can easily be extended to a mixed-effects model, as we did in Equation (2).

There are several advantages to applying this three-level meta-analysis on nested effect sizes. First and most important,
the level-2 heterogeneity variance $\tau^2_{(2)}$ takes the dependence into account in the analyses. Results based on a conventional meta-analysis and a three-level meta-analysis are identical only when the level-2 heterogeneity variance is zero. Second, researchers may study the level-2 and level-3 heterogeneity variances and their $I^2$ counterparts at level-2 and level-3. Third, researchers may also investigate how the level-2 and level-3 moderators explain the heterogeneity using $R^2$ at level-2 and level-3. These additional statistical analyses allow researchers to study the heterogeneity at different levels (see Cheung, 2014).

Before leaving this section, we have to mention another procedure that is used to handle dependent effect sizes. It is called the robust variance estimation (Hedges, Tipton, & Johnson, 2010; Tipton, 2015). Instead of estimating the dependence with the level-2 heterogeneity variance with Equation (4), this approach does not model the dependence explicitly; it calculates an adjusted SE instead. One advantage of the robust variance estimation is that it can be applied to both the multivariate and nested effect sizes (see the discussion in the next section). On the other hand, the three-level meta-analysis allows researchers to study the heterogeneity variances $\tau^2$ (and $R^2$) at different levels, whereas the robust variance estimation combines these effects into one single value.

What Are the Relationships between a Multivariate Meta-Analysis and a Three-Level Meta-Analysis?

It is important to clarify some key similarities and differences between a multivariate meta-analysis and a three-level meta-analysis. A multivariate meta-analysis is conducted when the sampling covariances are known. That is, the sampling errors are not independent because the same participants are used in calculating the effect sizes. For example, multiple-treatment and multiple-endpoint studies are typical applications of multivariate meta-analysis.

On the other hand, a three-level meta-analysis has another set of assumptions. The typical application of a three-level meta-analysis is the scenario where reported effect sizes are nested within studies. The participants only contribute to one effect size, that is, there are no repeated measures. Thus, the sampling errors in a three-level meta-analysis are conditionally independent. The non-independence is primarily introduced due to the nested structure of the effect sizes.

Technically speaking, multivariate effect sizes are also nested within studies. We may arrange the multivariate effect sizes in Table 1 to the nested structure in Table 2. The only uncertain part is how to handle $V_{21}$ because the sampling errors are assumed to be independent in the nested effect sizes in Table 2 (see Cheung, 2013, and Raudenbush et al., 1988 on how to transform correlated effect sizes into independent effect sizes). Mathematically, these two models are closely related. A three-level meta-analysis can be formulated as a special case of a multivariate meta-analysis, whereas a multivariate meta-analysis can be approximated by a three-level meta-analysis with some additional assumptions (see Cheung, 2015a, Section 6.4 for the details). Researchers should consider carefully which technique, a multivariate meta-analysis or a three-level meta-analysis, is the most appropriate for analyzing the data.

Multivariate effect sizes are probably more common than nested effect sizes in applications of meta-analysis. One main difficulty of applying a multivariate meta-analysis is the requirement to calculate the sampling covariances among the effect sizes. When the correlations needed for these covariances are not available, several options are available to deal with this situation (e.g., Riley, Thompson, & Abrams, 2008). One popular approach is to average the effect sizes within a study and use this figure in subsequent meta-analyses (Borenstein et al., 2009). Averaging the effect sizes within a study is easy. There are several such examples of this approach in the literature (e.g., Burmester, Leathem, & Merrick, 2016; Mewborn, Lindbergh, & Stephen Miller, 2017; Sherman, Mauser, Nuno, & Sherzai, 2017). However, it is less straightforward to calculate the sampling variances of the average effect sizes. In calculating the sampling variances of the average effect sizes, we need to know the correlations among the effect sizes. Published studies rarely provide information that can be used to estimate these correlations. Researchers usually use either 0 or 1 or some arbitrary values in the calculations. The assumed value of the correlation may affect the subsequent meta-analyses. Researchers may check the sensitivity of the results by using a range of possible correlations. The essential idea of a sensitivity analysis is to investigate whether the conclusions would be different if a different value of correlation is used in the calculations. If the conclusions are the same, the findings are robust to the values of the correlation and researchers do not need to worry about the stability of the findings. On the other hand, researchers have to interpret the results with caution when the conclusions vary a great deal and depend on the values of the correlation.

Another popular alternative is to select one effect size from the available effect sizes within a study. Several variations have been used in choosing the effect sizes for meta-analyses. Some researchers randomly choose one effect size per study, while others may provide reasons for selecting a particular scheme. For example, they may choose “popular” measures or measures with better psychometric properties. If the effect sizes are randomly chosen, the average effect is unbiased. However, the estimates are less efficient because some effect sizes have been dropped. If there is a selection scheme, for example, choosing the more popular measures, the results may be biased towards these measures. This is because studies
using the most popular measures may not represent the studies published in the literature.

There are several other limitations to these approaches. First, they do not utilize all the available data. It is generally difficult and expensive to extract effect sizes from the source literature for a meta-analysis. Averaging or selecting one effect per study means that much valuable information has to be removed from the analyses. Second, averaging the effect sizes or selecting one effect size within a study may remove valuable within-study variations stemming from potential moderators. For example, the effect sizes within a study may represent different types of measures and conditions. If we average these effect sizes into a single value or only select one effect size, it would not be possible to study whether the measures and conditions are moderating the effect.

Another option is to treat the multivariate effect sizes as the nested effect sizes in a three-level meta-analysis. Dummy codes are used to represent different effect sizes. For example, we may treat effect sizes on the Full Scale IQ, Verbal IQ, and Performance IQ in a multivariate meta-analysis as nested effect sizes in a three-level meta-analysis. We may then include dummy codes to represent the effect sizes. By making a few additional assumptions, we may analyze a multivariate meta-analysis as a three-level meta-analysis without knowing the sampling covariances of the multivariate effect sizes (see Cheung, 2015a, Section 6.4.2). Computer simulations (e.g., Moeyaert et al., 2017; Van den Noortgate, López-López, Marín-Martínez, & Sánchez-Meca, 2013) usually suggest that this approach works reasonably well under simulated conditions. Since it is quite likely that the correlations among the effect sizes are missing in the meta-analyses, many researchers prefer the three-level meta-analysis to the multivariate meta-analysis.

Alternatively, the robust variance estimation (Hedges et al., 2010; Tipton, 2015) can also be applied to effect sizes with correlated sampling errors where the sampling covariances are not available. Simulation studies have shown that both the three-level meta-analysis and the robust variance estimation work very well in simulated conditions (Moeyaert et al., 2017).

How to Conduct a Multivariate Meta-Analysis and a Three-Level Meta-Analysis?

The metaSEM (Cheung, 2015b) and metafor (Viechtbauer, 2010) packages implemented in the R statistical platform can be used to conduct multivariate and three-level meta-analyses. Mplus may also be used to perform these analyses (Cheung, 2015a, Chapter 9). In this paper, we will illustrate the analyses of the multivariate meta-analysis and three-level meta-analysis with the R statistical platform and Mplus. The data are available in the metaSEM package, whereas the complete R code and the output of the analyses are shown in the Supplementary Materials. Readers may easily reproduce and replicate the results. It should be noted that the analyses here are meant to illustrate the procedures of the multivariate and three-level meta-analyses. The data and results may be slightly different from the ones used in the original meta-analyses because the data were obtained from their published tables rather than directly from the authors of the meta-analyses. Readers interested in the substantive research questions may refer to the original meta-analyses.

Multivariate Meta-Analysis

The sample data were adopted from Table 1 of Nam et al. (2003), who studied the effects of environmental tobacco smoke, or passive smoking, on the health of children. The effect sizes used in the analyses were the log-odds ratios of the group with environmental tobacco smoke against a normal control group in the development of asthma and lower respiratory disease. Since the correlation between asthma and lower respiratory disease was not available in the paper, we used a correlation of 0.5 to calculate the sampling covariance between the effect sizes. A sensitivity analysis was also conducted by using correlations of 0 and .8.

There are a total of 59 studies in the data set “Nam03” in the metaSEM package. Eight of these studies include both asthma and lower respiratory disease, while the remaining studies only include one of these two effect sizes. If we conduct two separate meta-analyses, the average effects (and their SEs) on asthma and lower respiratory disease are 0.23 (0.05) and 0.30 (0.06), respectively. The estimated heterogeneity variances on asthma and lower respiratory disease are 0.04 and 0.05, respectively. The estimated $I^2$ on asthma and lower respiratory disease are 0.73 and 0.92, respectively.

The results of the multivariate meta-analysis on asthma and lower respiratory disease are 0.27 (0.05) and 0.31 (0.05), respectively. The estimated heterogeneity variances on asthma and lower respiratory disease are 0.07 and 0.05, respectively. The estimated $I^2$ on asthma and lower respiratory disease are .82 and .92, respectively. The results of the univariate and multivariate meta-analyses are comparable in this case. However, there is no guarantee that the estimated SEs will be similar. It all depends on the data.

We may take advantage of the multivariate meta-analysis by testing several additional research questions. First, the estimated correlation between the random effects is .96, which suggests that studies with a large effect on asthma tend to be associated with studies with a large effect on lower respiratory disease. Figure 1 shows the forest plots on asthma and lower respiratory disease and the 95% confidence ellipses on the average effects (red solid ellipse) and the studies (green dashed ellipse). Ninety-five percent of the studies likely fall into the green dashed ellipse. Because of the high correlation between the random effects (.96), we are more certain about the position of the studies. If we had only conducted two
Fig. 1 Plot of multivariate effect sizes and forest plots. Panels: forest plot of asthma (log-odds ratio; RE model 0.23 [0.13, 0.33]); forest plot of LRD (log-odds ratio; RE model 0.30 [0.20, 0.40]); plot of the LRD effect sizes (vertical axis) against the asthma effect sizes (horizontal axis).
separate univariate meta-analyses, we would not have known that the effects of asthma and lower respiratory disease are highly correlated.
In a multivariate meta-analysis, we may test whether the average effects on asthma and lower respiratory disease are the same and whether their heterogeneity variances are also the same. Comparing the models with and without these two constraints on the means and variances gives χ²(df = 2) = 2.78, p = .25. Therefore, there is no evidence to reject the null hypothesis that the effects are the same in asthma and lower respiratory disease.
We may further conduct a mixed-effects multivariate meta-analysis by using the mean age of the participants as a moderator. The estimated regression coefficients on asthma and lower respiratory disease and their SEs are −0.04 (0.02), p = .01 and −0.02 (0.01), p = .01, respectively. Their R² are .59 and .39, respectively. The effect of environmental tobacco smoke is weaker in studies with older participants. Similarly, we may also test whether the regression coefficients on asthma and lower respiratory disease are the same. Comparing the models with and without the constraint on the regression coefficients gives χ²(df = 1) = 0.64, p = .42. Therefore, there is no evidence to reject the null hypothesis that the moderating effect of the mean age of the participants is the same in asthma and lower respiratory disease.
In the above analyses, we used a correlation of .5 to calculate the sampling covariances between the effect sizes of asthma and lower respiratory disease. We conducted a sensitivity analysis using correlations of 0 and .8. The results were very similar. Therefore, our results are robust to the choice of the correlation in calculating the sampling covariances between the effect sizes.

Three-Level Meta-Analysis The second example was based on the data set from Stadler, Becker, Gödker, Leutner, and Greiff (2015, Table 1). These authors investigated the correlation
between complex problem solving and intelligence. The authors reported the effect sizes of 60 independent samples from 47 studies. Therefore, the effect sizes were nested within the studies. In their Table 1, however, they did not provide explicit information on how these independent samples were nested. Stadler et al. (2015) conducted their meta-analysis without taking the non-independence of the effect sizes into account. Based on the information on “Authors” and “Year,” we could only identify 44 clusters. As an illustration, we conducted the three-level meta-analysis with 60 effect sizes nested within 44 studies. The number of effect sizes per study varied from 1 to 4.
If we ignore the dependence and conduct the univariate meta-analysis, the average correlation (and its SE) is .42 (.03). The estimated heterogeneity variance and the I² are .04 and .96, respectively. Based on the three-level meta-analysis, the average correlation (and its SE) is .43 (.03). The estimated level-2 and level-3 heterogeneity variances are .02 and .02, respectively, while the estimated level-2 and level-3 I² are .45 and .51, respectively. The three-level meta-analysis provides more information on how the heterogeneity can be decomposed into the level-2 and level-3 components. The results suggest that the study level can account for more heterogeneity (51%) than the effect size level does (45%).
In the dataset, the effect sizes are based on two different intelligence measures (general intelligence, with 21 independent samples; and reasoning, with 39 independent samples). It is of interest to test whether the effects on these intelligence measures are the same. We include the intelligence measure as a moderator in the three-level meta-analysis. By comparing the models with and without the moderator, we find that the change in the chi-square statistic is χ²(df = 1) = 4.52, p = .03. The average correlation between complex problem solving and intelligence is stronger for studies with a reasoning measure, at .48 (SE = .04), than for those with a general intelligence measure, at .35 (SE = .05).

Conclusion and Future Directions

This paper introduced the problems and preferred solutions for handling non-independent effect sizes in a meta-analysis. Multivariate meta-analyses and three-level meta-analyses can handle different types of non-independent effect sizes. Besides providing valid statistical models to handle non-independent effect sizes, multivariate and three-level meta-analyses allow researchers to address new research questions that cannot be answered in a conventional meta-analysis. In a multivariate meta-analysis, we may compare the average effects or heterogeneity variances across different types of effect sizes. We may also study how the population effect sizes are correlated. In a three-level meta-analysis, we may investigate the heterogeneity variances and explained variances at different levels.
A multivariate meta-analysis is usually more challenging to implement because we need to know the correlation between the effect sizes. Many primary studies, however, may not include information on how to estimate this correlation. In contrast, it is easier to implement a three-level meta-analysis because the degree of dependence is estimated from the data. As we have illustrated in the above example, a three-level meta-analysis may also be used to handle different types of effect sizes, namely, the outcome measure in our illustration.
In this paper, we simplify the non-independence into either multivariate or nested effect sizes. Then a multivariate meta-analysis and a three-level meta-analysis are used to address the non-independence of the effect sizes. In applied research, the type of dependence is usually more complicated than in cases with either multivariate or nested effect sizes (see, e.g., Prado, Watt, & Crowe, 2018 for an example). It may involve both multivariate outcomes and nested structures (e.g., Cheung, 2018; Scammacca, Roberts, & Stuebing, 2014). The effect sizes can be cross-classified rather than nested (Fernández-Castilla et al., 2018). Researchers may need to decide on the best models to use in analyzing the data.
The effect sizes may still be non-independent even though each study contributes only one effect size. For example, Shin (2009) found that the effect sizes reported by the same research groups or authors tended to be more similar to each other than those reported by other research groups or authors. Moreover, the effect sizes of studies based on the same data sets are also more similar to each other. If this dependence is ignored, the estimated uncertainty (SE) may be biased. Ideally, we may want to model all types of dependence. However, it is sometimes challenging to do this. Further studies may clarify when it is acceptable to drop or combine the effect sizes to simplify the analyses.
Before closing this paper, it is important to discuss a few issues. First, the selection of effect sizes should be guided by the research questions. Researchers should not blindly include all effect sizes simply because the effect sizes are available. Researchers should carefully define the inclusion and exclusion criteria and use these criteria to determine whether or not the effect sizes should be included.
Another issue is the number of effect sizes needed to conduct a three-level meta-analysis. Similar to a standard meta-analysis and multilevel model, the fixed-effects estimates are usually quite stable whereas the stability of the estimated level-2 and level-3 variance components depends on the number of effect sizes for the level-2 and level-3 data. For example, López-López et al. (2017) showed that the estimated fixed effects worked very well with four effect sizes per study. Similar findings were also made in Moeyaert et al. (2017). Therefore, researchers may apply a three-level meta-analysis even when the number of level-2 effect sizes is small.
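As a minimal sketch of how level-2 and level-3 I² values relate to the estimated variance components, the Python code below divides each heterogeneity variance by the sum of the two heterogeneity variances and a typical within-study sampling variance. It assumes the Q-statistic-based typical sampling variance of Higgins and Thompson (2002); other definitions of the typical variance exist and give somewhat different values. The function names and input values are hypothetical illustrations. They are not part of metaSEM or metafor, and they are not taken from the Stadler et al. (2015) data.

```python
from typing import Sequence, Tuple


def typical_sampling_variance(variances: Sequence[float]) -> float:
    """Typical within-study sampling variance (assumed Higgins-Thompson form).

    Computed as (k - 1) * sum(w) / (sum(w)^2 - sum(w^2)) with w = 1 / v.
    """
    k = len(variances)
    if k < 2:
        raise ValueError("need at least two sampling variances")
    if any(v <= 0 for v in variances):
        raise ValueError("sampling variances must be positive")
    weights = [1.0 / v for v in variances]
    sum_w = sum(weights)
    sum_w2 = sum(w * w for w in weights)
    return (k - 1) * sum_w / (sum_w ** 2 - sum_w2)


def three_level_i2(
    tau2_level2: float, tau2_level3: float, v_typical: float
) -> Tuple[float, float]:
    """Return (level-2 I2, level-3 I2) as proportions of the total variance."""
    total = tau2_level2 + tau2_level3 + v_typical
    if total <= 0:
        raise ValueError("total variance must be positive")
    return tau2_level2 / total, tau2_level3 / total


# Hypothetical inputs: four effect sizes with equal sampling variances
# and equal heterogeneity variances at the two levels.
sampling_variances = [0.01, 0.01, 0.01, 0.01]
v_typ = typical_sampling_variance(sampling_variances)
i2_level2, i2_level3 = three_level_i2(0.02, 0.02, v_typ)
print(f"typical v = {v_typ:.4f}, I2(2) = {i2_level2:.2f}, I2(3) = {i2_level3:.2f}")
```

Because the two I² values are simple functions of the estimated variance components, any imprecision in those components carries over directly to the I² values.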
When the number of level-2 or level-3 effect sizes is small, however, researchers should be cautious in interpreting the estimated level-2 and level-3 variance components.
In conclusion, researchers have to properly incorporate the dependence in a meta-analysis. The recent development of multivariate and three-level meta-analyses provides a good starting point from which to analyze non-independent effect sizes.

Acknowledgments This research was supported by the Academic Research Fund Tier 1 (FY2017-FRC1-008) from the Ministry of Education, Singapore.

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

References

Abramovitch, A., Anholt, G., Raveh-Gottfried, S., Hamo, N., & Abramowitz, J. S. (2018). Meta-analysis of intelligence quotient (IQ) in obsessive-compulsive disorder. Neuropsychology Review, 28(1), 111–120. https://doi.org/10.1007/s11065-017-9358-0
Albarracín, D., Cuijpers, P., Eastwick, P. W., Johnson, B. T., Roisman, G. I., Sinatra, G. M., & Verhaeghen, P. (2018). Editorial. Psychological Bulletin, 144(3), 223–226. https://doi.org/10.1037/bul0000147
Anderson, S. F., & Maxwell, S. E. (2016). There’s more than one way to conduct a replication study: Beyond statistical significance. Psychological Methods, 21(1), 1–12. https://doi.org/10.1037/met0000051
Belleville, S., Fouquet, C., Hudon, C., Zomahoun, H. T. V., Croteau, J., & Consortium for the Early Identification of Alzheimer’s disease-Quebec. (2017). Neuropsychological measures that predict progression from mild cognitive impairment to Alzheimer’s type dementia in older adults: A systematic review and meta-analysis. Neuropsychology Review, 27(4), 328–353. https://doi.org/10.1007/s11065-017-9361-5
Borenstein, M., Hedges, L. V., Higgins, J. P. T., & Rothstein, H. R. (2009). Introduction to meta-analysis. Chichester, West Sussex, U.K.; Hoboken: John Wiley & Sons.
Borenstein, M., Higgins, J. P. T., Hedges, L. V., & Rothstein, H. R. (2017). Basics of meta-analysis: I² is not an absolute measure of heterogeneity. Research Synthesis Methods, 8(1), 5–18. https://doi.org/10.1002/jrsm.1230
Burmester, B., Leathem, J., & Merrick, P. (2016). Subjective cognitive complaints and objective cognitive function in aging: A systematic review and meta-analysis of recent cross-sectional findings. Neuropsychology Review, 26(4), 376–393. https://doi.org/10.1007/s11065-016-9332-2
Card, N. A. (2012). Applied meta-analysis for social science research. New York: The Guilford Press.
Cheung, M. W.-L. (2013). Multivariate meta-analysis as structural equation models. Structural Equation Modeling: A Multidisciplinary Journal, 20(3), 429–454. https://doi.org/10.1080/10705511.2013.797827
Cheung, M. W.-L. (2014). Modeling dependent effect sizes with three-level meta-analyses: A structural equation modeling approach. Psychological Methods, 19(2), 211–229. https://doi.org/10.1037/a0032968
Cheung, M. W.-L. (2015a). Meta-analysis: A structural equation modeling approach. Chichester, West Sussex: John Wiley & Sons, Inc.
Cheung, M. W.-L. (2015b). metaSEM: An R package for meta-analysis using structural equation modeling. Frontiers in Psychology, 5(1521). https://doi.org/10.3389/fpsyg.2014.01521
Cheung, M. W.-L. (2018). Computing multivariate effect sizes and their sampling covariance matrices with structural equation modeling: Theory, examples, and computer simulations. Frontiers in Psychology, 9(1387). https://doi.org/10.3389/fpsyg.2018.01387
Cheung, M. W.-L., & Vijayakumar, R. (2016). A guide to conducting a meta-analysis. Neuropsychology Review, 26(2), 121–128. https://doi.org/10.1007/s11065-016-9319-z
Cooper, H. M., Hedges, L. V., & Valentine, J. C. (2009). The handbook of research synthesis and meta-analysis (2nd ed.). New York: Russell Sage Foundation.
Demidenko, E. (2013). Mixed models: Theory and applications with R (2nd ed.). Hoboken, N.J: Wiley-Interscience.
Fernández-Castilla, B., Maes, M., Declercq, L., Jamshidi, L., Beretvas, S. N., Onghena, P., & den Noortgate, W. V. (2018). A demonstration and evaluation of the use of cross-classified random-effects models for meta-analysis. Behavior Research Methods, 1–19. https://doi.org/10.3758/s13428-018-1063-2
Fischer, R., & Boer, D. (2011). What is more important for national well-being: Money or autonomy? A meta-analysis of well-being, burnout, and anxiety across 63 societies. Journal of Personality and Social Psychology, 101(1), 164–184. https://doi.org/10.1037/a0023663
Fischer, R., Hanke, K., & Sibley, C. G. (2012). Cultural and institutional determinants of social dominance orientation: A cross-cultural meta-analysis of 27 societies. Political Psychology, 33(4), 437–467. https://doi.org/10.1111/j.1467-9221.2012.00884.x
Gleser, L. J., & Olkin, I. (2009). Stochastically dependent effect sizes. In H. Cooper, L. V. Hedges, & J. C. Valentine (Eds.), The handbook of research synthesis and meta-analysis (2nd ed., pp. 357–376). New York: Russell Sage Foundation.
Goldstein, H. (2011). Multilevel statistical models (4th ed.). Hoboken, N.J: Wiley.
Gurevitch, J., Koricheva, J., Nakagawa, S., & Stewart, G. (2018). Meta-analysis and the science of research synthesis. Nature, 555(7695), 175–182. https://doi.org/10.1038/nature25753
Hedges, L. V., & Olkin, I. (1985). Statistical methods for meta-analysis. Orlando, FL: Academic Press.
Hedges, L. V., & Schauer, J. M. (2018). Statistical analyses for studying replication: Meta-analytic perspectives. Psychological Methods. https://doi.org/10.1037/met0000189
Hedges, L. V., Tipton, E., & Johnson, M. C. (2010). Robust variance estimation in meta-regression with dependent effect size estimates. Research Synthesis Methods, 1(1), 39–65. https://doi.org/10.1002/jrsm.5
Higgins, J. P. T., & Thompson, S. G. (2002). Quantifying heterogeneity in a meta-analysis. Statistics in Medicine, 21(11), 1539–1558. https://doi.org/10.1002/sim.1186
Hox, J. J. (2010). Multilevel analysis: Techniques and applications (2nd ed.). New York: Routledge.
Hunt, M. (1997). How science takes stock: The story of meta-analysis. New York: Russell Sage Foundation.
Ishak, K. J., Platt, R. W., Joseph, L., & Hanley, J. A. (2008). Impact of approximating or ignoring within-study covariances in multivariate meta-analyses. Statistics in Medicine, 27(5), 670–686. https://doi.org/10.1002/sim.2913
Jackson, D., Riley, R., & White, I. R. (2011). Multivariate meta-analysis: Potential and promise. Statistics in Medicine, 30(20), 2481–2498. https://doi.org/10.1002/sim.4172
Konstantopoulos, S. (2011). Fixed effects and variance components estimation in three-level meta-analysis. Research Synthesis Methods, 2(1), 61–76. https://doi.org/10.1002/jrsm.35
Li, J., & Fine, J. P. (2011). Assessing the dependence of sensitivity and specificity on prevalence in meta-analysis. Biostatistics, 12(4), 710–722. https://doi.org/10.1093/biostatistics/kxr008
López-López, J. A., Van den Noortgate, W., Tanner-Smith, E. E., Wilson, S. J., & Lipsey, M. W. (2017). Assessing meta-regression methods for examining moderator relationships with dependent effect sizes: A Monte Carlo simulation. Research Synthesis Methods, 8(4), 435–450. https://doi.org/10.1002/jrsm.1245
Loring, D. W., & Bowden, S. C. (2016). Editorial. Neuropsychology Review, 26(1), 1–2. https://doi.org/10.1007/s11065-015-9314-9
Mauger, C., Lancelot, C., Roy, A., Coutant, R., Cantisano, N., & Gall, D. L. (2018). Executive functions in children and adolescents with Turner syndrome: A systematic review and meta-analysis. Neuropsychology Review, 28(2), 188–215. https://doi.org/10.1007/s11065-018-9372-x
Mewborn, C. M., Lindbergh, C. A., & Stephen Miller, L. (2017). Cognitive interventions for cognitively healthy, mildly impaired, and mixed samples of older adults: A systematic review and meta-analysis of randomized-controlled trials. Neuropsychology Review, 27(4), 403–439. https://doi.org/10.1007/s11065-017-9350-8
Moeyaert, M., Ugille, M., Beretvas, S. N., Ferron, J., Bunuan, R., & den Noortgate, W. V. (2017). Methods for dealing with multiple outcomes in meta-analysis: A comparison between averaging effect sizes, robust variance estimation and multilevel meta-analysis. International Journal of Social Research Methodology, 20(6), 559–572. https://doi.org/10.1080/13645579.2016.1252189
Moeyaert, M., Ugille, M., Ferron, J. M., Beretvas, S. N., & Van den Noortgate, W. (2013). The three-level synthesis of standardized single-subject experimental data: A Monte Carlo simulation study. Multivariate Behavioral Research, 48(5), 719–748. https://doi.org/10.1080/00273171.2013.816621
Muthén, B. O., & Muthén, L. K. (2017). Mplus user’s guide (8th ed.). Los Angeles, CA: Muthén & Muthén.
Nam, I.-S., Mengersen, K., & Garthwaite, P. (2003). Multivariate meta-analysis. Statistics in Medicine, 22(14), 2309–2333. https://doi.org/10.1002/sim.1410
Open Science Collaboration. (2012). An open, large-scale, collaborative effort to estimate the reproducibility of psychological science. Perspectives on Psychological Science, 7(6), 657–660. https://doi.org/10.1177/1745691612462588
Open Science Collaboration. (2015). Estimating the reproducibility of psychological science. Science, 349(6251), aac4716. https://doi.org/10.1126/science.aac4716
Prado, C. E., Watt, S., & Crowe, S. F. (2018). A meta-analysis of the effects of antidepressants on cognitive functioning in depressed and non-depressed samples. Neuropsychology Review, 28(1), 32–72. https://doi.org/10.1007/s11065-018-9369-5
R Development Core Team. (2019). R: A language and environment for statistical computing. Vienna, Austria. Retrieved from http://www.R-project.org/
Raudenbush, S. W., Becker, B. J., & Kalaian, H. (1988). Modeling multivariate effect sizes. Psychological Bulletin, 103(1), 111–120. https://doi.org/10.1037/0033-2909.103.1.111
Raudenbush, S. W., & Bryk, A. S. (2002). Hierarchical linear models: Applications and data analysis methods. Thousand Oaks: Sage Publications.
Riley, R. D. (2009). Multivariate meta-analysis: The effect of ignoring within-study correlation. Journal of the Royal Statistical Society: Series A (Statistics in Society), 172(4), 789–811. https://doi.org/10.1111/j.1467-985X.2008.00593.x
Riley, R. D., Thompson, J. R., & Abrams, K. R. (2008). An alternative model for bivariate random-effects meta-analysis when the within-study correlations are unknown. Biostatistics, 9(1), 172–186. https://doi.org/10.1093/biostatistics/kxm023
Scammacca, N., Roberts, G., & Stuebing, K. K. (2014). Meta-analysis with complex research designs dealing with dependence from multiple measures and multiple group comparisons. Review of Educational Research, 84(3), 328–364. https://doi.org/10.3102/0034654313500826
Sherman, D. S., Mauser, J., Nuno, M., & Sherzai, D. (2017). The efficacy of cognitive intervention in mild cognitive impairment (MCI): A meta-analysis of outcomes on neuropsychological measures. Neuropsychology Review, 27(4), 440–484. https://doi.org/10.1007/s11065-017-9363-3
Shin, I.-S. (2009). Same author and same data dependence in meta-analysis (Ph.D.). The Florida State University, United States – Florida.
Stadler, M., Becker, N., Gödker, M., Leutner, D., & Greiff, S. (2015). Complex problem solving and intelligence: A meta-analysis. Intelligence, 53, 92–101. https://doi.org/10.1016/j.intell.2015.09.005
Timm, N. H. (1999). A note on testing for multivariate effect sizes. Journal of Educational and Behavioral Statistics, 24(2), 132–145. https://doi.org/10.3102/10769986024002132
Tipton, E. (2015). Small sample adjustments for robust variance estimation with meta-regression. Psychological Methods, 20(3), 375–393. https://doi.org/10.1037/met0000011
Van den Noortgate, W., López-López, J. A., Marín-Martínez, F., & Sánchez-Meca, J. (2013). Three-level meta-analysis of dependent effect sizes. Behavior Research Methods, 45(2), 576–594. https://doi.org/10.3758/s13428-012-0261-6
Viechtbauer, W. (2010). Conducting meta-analyses in R with the metafor package. Journal of Statistical Software, 36(3), 1–48. https://doi.org/10.18637/jss.v036.i03
Weissberger, G. H., Strong, J. V., Stefanidis, K. B., Summers, M. J., Bondi, M. W., & Stricker, N. H. (2017). Diagnostic accuracy of memory measures in Alzheimer’s dementia and mild cognitive impairment: A systematic review and meta-analysis. Neuropsychology Review, 27(4), 354–388. https://doi.org/10.1007/s11065-017-9360-6

Publisher’s Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
