Decisions about equivalence: A comparison of TOST, HDI-ROPE, and the Bayes factor.

Psychological Methods, 28(3), 740-755 · Pub Date: 2023-06-01 · DOI: 10.1037/met0000402
IF 7.6 · Zone 1 (Psychology) · JCR Q1 (Psychology, Multidisciplinary)
Maximilian Linde, Jorge N Tendeiro, Ravi Selker, Eric-Jan Wagenmakers, Don van Ravenzwaaij

Abstract

Some important research questions require the ability to find evidence for two conditions being practically equivalent. This is impossible to accomplish within the traditional frequentist null hypothesis significance testing framework; hence, other methodologies must be utilized. We explain and illustrate three approaches for finding evidence for equivalence: The frequentist two one-sided tests procedure, the Bayesian highest density interval region of practical equivalence procedure, and the Bayes factor interval null procedure. We compare the classification performances of these three approaches for various plausible scenarios. The results indicate that the Bayes factor interval null approach compares favorably to the other two approaches in terms of statistical power. Critically, compared with the Bayes factor interval null procedure, the two one-sided tests and the highest density interval region of practical equivalence procedures have limited discrimination capabilities when the sample size is relatively small: Specifically, in order to be practically useful, these two methods generally require over 250 cases within each condition when rather large equivalence margins of approximately .2 or .3 are used; for smaller equivalence margins even more cases are required. Because of these results, we recommend that researchers rely more on the Bayes factor interval null approach for quantifying evidence for equivalence, especially for studies that are constrained on sample size. (PsycInfo Database Record (c) 2023 APA, all rights reserved).
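
To make the three decision rules concrete, the following is a minimal sketch and not the authors' implementation. It assumes a large-sample normal approximation for the standardized mean difference and, purely for conjugacy, a Normal(0, prior_sd²) prior on the effect size; the abstract does not state which priors or test statistics the article uses. All function names (`standardized_difference`, `tost`, `normal_posterior`, `hdi_rope`, `bf_interval_null`) and the parameters `margin` and `prior_sd` are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, variance

STD_NORMAL = NormalDist()


def standardized_difference(x, y):
    """Return Cohen's d (pooled SD) and an approximate standard error."""
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    d = (mean(x) - mean(y)) / sqrt(pooled_var)
    se = sqrt(1 / nx + 1 / ny)  # large-sample approximation (assumption)
    return d, se


def tost(d, se, margin, alpha=0.05):
    """Two one-sided z-tests: equivalence is claimed only if both reject."""
    p_lower = 1 - STD_NORMAL.cdf((d + margin) / se)  # H0: delta <= -margin
    p_upper = STD_NORMAL.cdf((d - margin) / se)      # H0: delta >= +margin
    p = max(p_lower, p_upper)
    return p, p < alpha


def normal_posterior(d, se, prior_sd):
    """Conjugate update for the assumed prior delta ~ Normal(0, prior_sd**2)."""
    post_var = 1 / (1 / prior_sd**2 + 1 / se**2)
    post_mean = post_var * d / se**2
    return NormalDist(post_mean, sqrt(post_var))


def hdi_rope(posterior, margin, mass=0.95):
    """Equivalence is claimed if the HDI lies entirely inside the ROPE.

    For a normal posterior the HDI equals the central interval.
    """
    z = STD_NORMAL.inv_cdf(0.5 + mass / 2)
    lo = posterior.mean - z * posterior.stdev
    hi = posterior.mean + z * posterior.stdev
    return (lo, hi), (-margin < lo and hi < margin)


def bf_interval_null(posterior, prior_sd, margin):
    """Bayes factor for |delta| < margin versus |delta| > margin.

    Computed as posterior odds of the interval divided by its prior odds.
    """
    prior = NormalDist(0, prior_sd)

    def odds_inside(dist):
        inside = dist.cdf(margin) - dist.cdf(-margin)
        return inside / (1 - inside)

    return odds_inside(posterior) / odds_inside(prior)


if __name__ == "__main__":
    # Two identical toy samples (true difference 0), 20 cases per condition.
    x = [-1.0, 1.0] * 10
    y = [-1.0, 1.0] * 10
    d, se = standardized_difference(x, y)
    post = normal_posterior(d, se, prior_sd=1.0)
    print("TOST (p, equivalent):", tost(d, se, margin=0.2))
    print("HDI-ROPE (interval, equivalent):", hdi_rope(post, margin=0.2))
    print("Interval-null Bayes factor:", bf_interval_null(post, 1.0, margin=0.2))
```

With these toy inputs, the two dichotomous rules cannot declare equivalence at 20 cases per condition because the interval estimate is wider than the margin, whereas the Bayes factor still returns a graded quantity. This mirrors the abstract's qualitative point but is not a reproduction of the article's simulations.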
