Editorial for the special collection “Towards neutral comparison studies in methodological research”

Biometrical Journal (IF 1.3; Zone 3, Biology; JCR Q4, Mathematical & Computational Biology) · Pub Date: 2024-02-17 · DOI: 10.1002/bimj.202400031
Anne-Laure Boulesteix, Mark Baillie, Dominic Edelmann, Leonhard Held, Tim P. Morris, Willi Sauerbrei
Volume 66, Issue 2. Full text: https://onlinelibrary.wiley.com/doi/10.1002/bimj.202400031

Abstract

Biomedical researchers are frequently faced with an array of methods they might use for the analysis and/or design of studies. It can be difficult to understand the absolute and relative merits of candidate methods beyond one's own particular interests and expertise. Choosing a method can be difficult even in simple settings, but growth in the volume of data collected, in computational power, and in the number of methods proposed in the literature makes the choice all the more difficult. In this context, it is crucial to provide researchers with evidence-supported guidance derived from appropriately designed studies comparing statistical methods in a neutral way, in particular through well-designed simulation studies.

While neutral comparison studies are an essential cornerstone for improving this situation, a number of challenges remain with regard to their methodology and acceptance. Numerous difficulties arise when designing, conducting, and reporting neutral comparison studies. Practical experience is still scarce, and literature on these issues is almost nonexistent. Furthermore, authors of neutral comparison studies often meet with incomprehension from a large part of the scientific community, which is more interested in the development of “new” approaches and evaluates the importance of research primarily on the novelty of the presented methods. Consequently, meaningful comparisons of competing approaches (especially reproducible studies including publicly available code and data) are rarely available and evidence-supported, state-of-the-art guidance is largely missing, often resulting in the use of suboptimal methods in practice.

The final special collection includes 11 contributions of the first type (neutral comparison studies) and 12 of the second (articles on the methodology of comparison studies), covering a wide range of methods and issues. Our expectations were fully met and even exceeded! We thank the authors for these outstanding contributions and the many reviewers for their very helpful comments.

The papers from the first category explore a wide range of highly relevant biostatistical methods. They present interesting implementations of various neutrality concepts and of methodologies that aim at greater reliability and transparency, such as study protocols.

The topics include methodology to analyze data from randomized trials, such as the use of baseline covariates to analyze small cluster-randomized trials with a rare binary outcome (Zhu et al.) and the characterization of treatment effect heterogeneity (Sun et al.). The special collection also presents comparison studies that explore a variety of modeling approaches in other contexts. These include the analysis of survival data with nonproportional hazards using propensity score–weighted methods (Handorf et al.), the impact of the matching algorithm on the treatment effect estimate in causal analyses based on the propensity score (Heinz et al.), statistical methods for analyzing longitudinally measured ordinal outcomes in rare diseases (Geroldinger et al.), and in vitro dose–response estimation under extreme observations (Fang and Zhou).

Three papers address variable selection and penalization in the context of regression models, each with a different focus. While Frommlet investigates the minimization of L0 penalties in a high-dimensional context, Hanke et al. compare various model selection strategies to the best subset approach, and Luijken et al. compare full model specification and backward elimination when estimating causal effects on binary outcomes. Finally, the collection also includes papers addressing prediction modeling: Lohmann et al. compare the prediction performance of various model selection methods in the context of logistic regression, while Graf et al. compare linear discriminant analysis to several machine learning algorithms.

Four papers from the special collection address the challenge of simulating complex data and conducting large simulation studies for the meaningful and efficient evaluation of statistical methods. Ruberg et al. present an extensive platform for evaluating subgroup identification methodologies, including the implementation of appropriate data-generating models. Wahab et al. propose a dedicated simulator for the evaluation of methods that aim at providing pertinent causal inference in the presence of intercurrent events in clinical trials. Kelter outlines a comprehensive framework for Bayesian simulation studies, including a structured skeleton for the planning, coding, conduct, analysis, and reporting of such studies. The open science framework developed by Kodalci and Thas, which focuses on two-sample tests, allows the comparison of new methods to all previously submitted methods using all previously submitted simulation designs.
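To make the idea of running every method under every simulation design concrete, the following is a minimal sketch in Python. It is not taken from any of the papers above: the two tests (a large-sample z-test and a Wilcoxon rank-sum test with normal approximation), the two data-generating designs, and all names such as `run_grid` are illustrative assumptions chosen so the example needs only the standard library.

```python
import random
from statistics import NormalDist, mean, variance

STD_NORMAL = NormalDist()

def z_test_pvalue(x, y):
    """Two-sided large-sample test for a difference in means (unequal variances)."""
    se = (variance(x) / len(x) + variance(y) / len(y)) ** 0.5
    z = (mean(x) - mean(y)) / se
    return 2 * (1 - STD_NORMAL.cdf(abs(z)))

def rank_sum_pvalue(x, y):
    """Two-sided Wilcoxon rank-sum test, normal approximation, assuming no ties."""
    n, m = len(x), len(y)
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    # Sum of the ranks (1-based) that belong to the first sample.
    w = sum(rank for rank, (_, group) in enumerate(pooled, start=1) if group == 0)
    mu = n * (n + m + 1) / 2
    sigma = (n * m * (n + m + 1) / 12) ** 0.5
    return 2 * (1 - STD_NORMAL.cdf(abs((w - mu) / sigma)))

def normal_design(rng, n, shift):
    """Hypothetical design 1: normal data, second group shifted."""
    return ([rng.gauss(0, 1) for _ in range(n)],
            [rng.gauss(shift, 1) for _ in range(n)])

def lognormal_design(rng, n, shift):
    """Hypothetical design 2: skewed data, second group shifted."""
    return ([rng.lognormvariate(0, 1) for _ in range(n)],
            [rng.lognormvariate(0, 1) + shift for _ in range(n)])

# Registries: adding a method or a design extends the whole grid.
METHODS = {"z_test": z_test_pvalue, "rank_sum": rank_sum_pvalue}
DESIGNS = {"normal": normal_design, "lognormal": lognormal_design}

def run_grid(shifts=(0.0, 1.0), n=30, reps=500, alpha=0.05, seed=1):
    """Return rejection rates keyed by (design, shift, method)."""
    rng = random.Random(seed)
    results = {}
    for design_name, design in DESIGNS.items():
        for shift in shifts:
            rejections = {name: 0 for name in METHODS}
            for _ in range(reps):
                x, y = design(rng, n, shift)  # every method sees the same data
                for name, method in METHODS.items():
                    if method(x, y) < alpha:
                        rejections[name] += 1
            for name in METHODS:
                results[(design_name, shift, name)] = rejections[name] / reps
    return results

if __name__ == "__main__":
    for key, rate in run_grid().items():
        print(key, rate)
```

Rejection rates at `shift=0.0` estimate the type I error and those at `shift=1.0` estimate power. Keeping methods and designs in separate registries is one simple way to ensure that no method is evaluated only on designs chosen in its favor.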

In contrast, Huang and Trinquart consider new ways to compare the performance of methods that have different type I error rates, a factor that complicates the interpretation of power. They propose a new approach by drawing an analogy to diagnostic accuracy comparisons, based on relative positive and negative likelihood ratios.
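The analogy can be illustrated with a short Python sketch. It assumes the usual diagnostic-accuracy mapping, in which power plays the role of sensitivity and the type I error rate plays the role of 1 − specificity; the exact definitions used by Huang and Trinquart may differ, and the class and function names, as well as the numbers, are purely illustrative.

```python
from dataclasses import dataclass

@dataclass
class OperatingCharacteristics:
    type_i_error: float  # rejection rate under the null hypothesis
    power: float         # rejection rate under the alternative

def positive_lr(p):
    """LR+ = sensitivity / (1 - specificity), here power / type I error."""
    if p.type_i_error <= 0:
        raise ValueError("type I error must be positive")
    return p.power / p.type_i_error

def negative_lr(p):
    """LR- = (1 - sensitivity) / specificity, here (1 - power) / (1 - type I error)."""
    if p.type_i_error >= 1:
        raise ValueError("type I error must be below 1")
    return (1 - p.power) / (1 - p.type_i_error)

def relative_lrs(a, b):
    """Relative positive and negative likelihood ratios of method a versus b."""
    return positive_lr(a) / positive_lr(b), negative_lr(a) / negative_lr(b)

if __name__ == "__main__":
    # Made-up simulation results: B has higher power but also a higher type I error.
    method_a = OperatingCharacteristics(type_i_error=0.05, power=0.80)
    method_b = OperatingCharacteristics(type_i_error=0.08, power=0.85)
    rel_pos, rel_neg = relative_lrs(method_a, method_b)
    print(f"relative LR+ = {rel_pos:.3f}, relative LR- = {rel_neg:.3f}")
```

Under this mapping, a relative LR+ above 1 suggests that a rejection by method A is stronger evidence for the alternative than a rejection by method B, while a relative LR− below 1 suggests that a non-rejection by A is stronger evidence for the null. In the made-up example, B's higher raw power does not translate into a better positive likelihood ratio, because its type I error is also higher.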

The special collection also includes various thought-provoking perspective articles discussing fundamental aspects of benchmarking methodology. Friedrich and Friede discuss the complementary roles of simulation-based and real data–based benchmarking. Heinze et al. propose a phases framework for methodological research, which considers how to make methods fit for use. Strobl and Leisch stress the need to give up the notion that one method can be broadly the “best” in comparison studies. Other articles address special aspects of the design of comparison studies. Pawel et al. discuss and demonstrate the impact of so-called “questionable research practices” in the context of simulation studies, while Nießl et al. use a cross-design validation experiment to explain why the performance of newly proposed methods tends to be evaluated optimistically. Oberman and Vink focus on aspects to consider in the design of simulation experiments that evaluate imputation methodology. In a letter to the editor related to this article, Morris et al. note some issues with fixing a single complete data set rather than repeatedly sampling the data in such simulations.
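The difference between the two simulation designs mentioned last can be sketched as follows. This is an illustrative toy example, not the setup of Oberman and Vink or of Morris et al.: it uses a complete-case mean as a stand-in for an imputation method, missingness completely at random, and hypothetical names and parameter values throughout.

```python
import random
from statistics import mean, variance

def draw_complete_data(rng, n):
    """Sample a complete data set from the (assumed) population N(0, 1)."""
    return [rng.gauss(0.0, 1.0) for _ in range(n)]

def complete_case_mean(data, rng, n_observed):
    """Delete values completely at random, then average the observed ones."""
    observed = rng.sample(data, n_observed)
    return mean(observed)

def simulate(fixed_complete_data, n=50, n_observed=35, reps=2000, seed=2024):
    """Run one design: a single fixed complete data set, or a fresh one per repetition."""
    rng = random.Random(seed)
    fixed = draw_complete_data(rng, n)
    estimates = []
    for _ in range(reps):
        data = fixed if fixed_complete_data else draw_complete_data(rng, n)
        estimates.append(complete_case_mean(data, rng, n_observed))
    return {
        "fixed_data_mean": mean(fixed),
        "mean_estimate": mean(estimates),
        "variance_of_estimates": variance(estimates),
    }

if __name__ == "__main__":
    print("fixed complete data:    ", simulate(fixed_complete_data=True))
    print("resampled complete data:", simulate(fixed_complete_data=False))
```

With a fixed complete data set, the estimates vary only through the missingness mechanism and center on that data set's own mean, not on the population mean of 0. With resampling, the spread also reflects sampling variability. Which target and which source of variability a simulation captures therefore depends on this design choice.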

Editing this Special Collection was extremely rewarding for us. Quite aside from the high quality of the submissions, we were heartened to see the biometrical community's interest in improving the quality of research comparing methods; it was of course a concern that we might receive no submissions! It is our hope that this Special Collection represents the start rather than the end of a conversation, and that readers find the articles as thought-provoking and practically useful as we have.
