<p>Evidence synthesis frequently involves quantitative analyses of continuous outcomes. A cross-sectional study of Cochrane systematic reviews found that 6672 of 22,453 meta-analyses (29.7%) involved continuous outcomes [<span>1</span>]. The primary effect measures in meta-analyses of continuous outcomes are the mean difference (MD) and the standardized mean difference (SMD) [<span>2</span>]. The MD is appropriate when all included studies measure the outcome on the same scale (e.g., body weight in kilograms). In contrast, the SMD is used when studies measure the outcome on different scales (e.g., varied questionnaire scoring methods). Although alternative measures (e.g., the ratio of means) exist [<span>3</span>], they are used relatively infrequently.</p><p>Despite this conceptual clarity, the term “weighted mean difference” (WMD) appears frequently in the systematic review literature [<span>4</span>], which can lead to confusion about its relationship to the MD. In this article, we first clarify the distinction between MD and WMD, then describe the historical factors underlying the term's adoption and persistence, discuss why contemporary methods render it unnecessary, present examples of misuse, and conclude with practical recommendations for clearer reporting.</p><p>The MD is the difference between group means (e.g., intervention vs. control) for a continuous outcome. Although the true MD is an unknown population-level quantity, in practice it is estimated from the samples in individual studies. Meta-analysis synthesizes these study-level MD estimates into an overall summary effect.</p><p>The term WMD emerged historically to emphasize the weighted averaging process of meta-analyses, wherein each study contributes a sample MD weighted by its statistical precision (i.e., inverse variance) [<span>5</span>]. 
Typically, larger studies, which tend to have smaller variances and narrower confidence intervals, are assigned greater weights. Traditional meta-analytical methods, whether based on fixed-effect (also known as common-effect) or random-effects models, follow this inverse-variance weighting principle. Under fixed-effect models, each study's weight is the inverse of its within-study variance, whereas under random-effects models the weight is the inverse of the sum of the within-study variance and the between-study variance.</p><p>To contextualize the widespread adoption of WMD, we conducted a brief literature search using Google Scholar on June 12, 2025. For each calendar year from 1990 to 2024, we used exact-phrase queries in quotation marks to record the number of results for “weighted mean difference” AND “systematic review” and, separately, for “systematic review” alone, then calculated the yearly proportion of the former relative to the latter (Figure 1). Google Scholar indexes titles, abstracts, and, when available, full texts, so the counts reflect occurrences anywhere in the indexed record and are approximate.</p>
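<p>The inverse-variance weighting described above can be illustrated with a minimal sketch. It is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, and the between-study variance is estimated with the DerSimonian-Laird method, which is one common choice that the text does not specify.</p>

```python
from math import sqrt


def inverse_variance_pool(mds, variances):
    """Fixed-effect (common-effect) pooling of study-level mean differences.

    Each study's weight is the inverse of its within-study variance.
    Returns the pooled MD and its standard error.
    """
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, mds)) / total
    return pooled, sqrt(1.0 / total)


def dersimonian_laird_tau2(mds, variances):
    """Between-study variance via DerSimonian-Laird (an assumed estimator)."""
    if len(mds) < 2:
        return 0.0  # heterogeneity cannot be estimated from one study
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    pooled, _ = inverse_variance_pool(mds, variances)
    # Cochran's Q: weighted squared deviations from the fixed-effect estimate
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, mds))
    c = total - sum(w * w for w in weights) / total
    return max(0.0, (q - (len(mds) - 1)) / c)


def random_effects_pool(mds, variances):
    """Random-effects pooling: weights are 1 / (within + between variance).

    Returns the pooled MD, its standard error, and the estimated tau^2.
    """
    tau2 = dersimonian_laird_tau2(mds, variances)
    pooled, se = inverse_variance_pool(mds, [v + tau2 for v in variances])
    return pooled, se, tau2


if __name__ == "__main__":
    # Hypothetical study-level MDs and variances, for illustration only.
    example_mds = [1.0, 3.0]
    example_variances = [1.0, 1.0]
    print(inverse_variance_pool(example_mds, example_variances))
    print(random_effects_pool(example_mds, example_variances))
```

<p>In both functions the quantity being pooled is the MD itself. The weights only determine how much each study contributes to the summary estimate.</p>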
<p>Lifeng Lin, Xing Xing, Wenshan Han, Jiayi Tong. Retiring the Term “Weighted Mean Difference” in Contemporary Evidence Synthesis. Cochrane Evidence Synthesis and Methods 3(5). Published 2025-09-11. https://doi.org/10.1002/cesm.70051</p>