{"title":"Power for tests of interaction: effect of raising the Type I error rate.","authors":"Stephen W Marshall","doi":"10.1186/1742-5573-4-4","DOIUrl":null,"url":null,"abstract":"<p><strong>Background: </strong>Power for assessing interactions during data analysis is often poor in epidemiologic studies. This is because epidemiologic studies are frequently powered primarily to assess main effects only. In light of this, some investigators raise the Type I error rate, thereby increasing power, when testing interactions. However, this is a poor analysis strategy if the study is chronically under-powered (e.g. in a small study) or already adequately powered (e.g. in a very large study). To demonstrate this point, this study quantified the gain in power for testing interactions when the Type I error rate is raised, for a variety of study sizes and types of interaction.</p><p><strong>Methods: </strong>Power was computed for the Wald test for interaction, the likelihood ratio test for interaction, and the Breslow-Day test for heterogeneity of the odds ratio. Ten types of interaction, ranging from sub-additive through to super-multiplicative, were investigated in the simple scenario of two binary risk factors. Case-control studies of various sizes were investigated (75 cases & 150 controls, 300 cases & 600 controls, and 1200 cases & 2400 controls).</p><p><strong>Results: </strong>The strategy of raising the Type I error rate from 5% to 20% resulted in a useful power gain (a gain of at least 10%, resulting in power of at least 70%) in only 7 of the 27 interaction type/study size scenarios studied (26%). In the other 20 scenarios, power was either already adequate (n = 8; 30%), or else so low that it was still weak (below 70%) even after raising the Type I error rate to 20% (n = 12; 44%).</p><p><strong>Conclusion: </strong>Relaxing the Type I error rate did not usefully improve the power for tests of interaction in many of the scenarios studied. 
In many studies, the small power gains obtained by raising the Type I error will be more than offset by the disadvantage of increased \"false positives\". I recommend investigators should not routinely raise the Type I error rate when assessing tests of interaction.</p>","PeriodicalId":87082,"journal":{"name":"Epidemiologic perspectives & innovations : EP+I","volume":"4 ","pages":"4"},"PeriodicalIF":0.0000,"publicationDate":"2007-06-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1186/1742-5573-4-4","citationCount":"407","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Epidemiologic perspectives & innovations : EP+I","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1186/1742-5573-4-4","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 407
Abstract
Background: Power for assessing interactions during data analysis is often poor in epidemiologic studies, because such studies are frequently powered primarily to detect main effects. In light of this, some investigators raise the Type I error rate when testing interactions, thereby increasing power. However, this is a poor analysis strategy if the study is chronically under-powered (e.g., a small study) or already adequately powered (e.g., a very large study). To demonstrate this point, this study quantified the gain in power for testing interactions when the Type I error rate is raised, across a variety of study sizes and types of interaction.
Methods: Power was computed for the Wald test for interaction, the likelihood ratio test for interaction, and the Breslow-Day test for heterogeneity of the odds ratio. Ten types of interaction, ranging from sub-additive through to super-multiplicative, were investigated in the simple scenario of two binary risk factors. Case-control studies of various sizes were investigated (75 cases & 150 controls, 300 cases & 600 controls, and 1200 cases & 2400 controls).
Results: The strategy of raising the Type I error rate from 5% to 20% resulted in a useful power gain (a gain of at least 10%, resulting in power of at least 70%) in only 7 of the 27 interaction type/study size scenarios studied (26%). In the other 20 scenarios, power was either already adequate (n = 8; 30%), or else so low that it was still weak (below 70%) even after raising the Type I error rate to 20% (n = 12; 44%).
Conclusion: Relaxing the Type I error rate did not usefully improve the power of tests of interaction in many of the scenarios studied. In many studies, the small power gain obtained by raising the Type I error rate will be more than offset by the increased rate of false positives. I recommend that investigators not routinely raise the Type I error rate when assessing tests of interaction.
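The paper's own power computations are not reproduced here, but the kind of comparison it describes can be sketched by Monte Carlo simulation: estimate the power of a Woolf-type Wald test for heterogeneity of the odds ratio (one of the simplest tests of multiplicative interaction, related to but not identical to the Breslow-Day test used in the paper) at a Type I error rate of 5% versus 20%. All parameter values below — exposure prevalences, odds ratios, and the interaction factor `psi` — are illustrative assumptions, not figures taken from the paper:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

def simulate_power(n_cases, n_controls, or_a, or_b, psi,
                   p_a=0.3, p_b=0.3, alphas=(0.05, 0.20), n_sim=2000):
    """Monte Carlo power of a Woolf-type Wald test for multiplicative
    interaction: heterogeneity of the OR for exposure A across strata of B.
    psi > 1 is super-multiplicative, psi < 1 sub-multiplicative, psi = 1 none.
    """
    # Joint exposure distribution among controls; A and B assumed independent.
    # Cell order: (A,B) = (0,0), (1,0), (0,1), (1,1).
    p_ctrl = np.array([(1 - p_a) * (1 - p_b), p_a * (1 - p_b),
                       (1 - p_a) * p_b, p_a * p_b])
    # Case distribution is the control distribution tilted by the cell ORs.
    ors = np.array([1.0, or_a, or_b, or_a * or_b * psi])
    p_case = p_ctrl * ors
    p_case /= p_case.sum()

    crits = {a: norm.ppf(1 - a / 2) for a in alphas}
    hits = {a: 0 for a in alphas}
    for _ in range(n_sim):
        case = rng.multinomial(n_cases, p_case)
        ctrl = rng.multinomial(n_controls, p_ctrl)
        # Haldane 0.5 correction guards against zero cells.
        c = case + 0.5
        k = ctrl + 0.5
        # Stratum B=0 uses cells 0,1; stratum B=1 uses cells 2,3.
        lor0 = np.log(c[1] * k[0] / (c[0] * k[1]))
        lor1 = np.log(c[3] * k[2] / (c[2] * k[3]))
        v0 = 1 / c[0] + 1 / c[1] + 1 / k[0] + 1 / k[1]
        v1 = 1 / c[2] + 1 / c[3] + 1 / k[2] + 1 / k[3]
        z = (lor1 - lor0) / np.sqrt(v0 + v1)
        for a, zc in crits.items():
            hits[a] += abs(z) > zc
    return {a: hits[a] / n_sim for a in alphas}

# E.g., a super-multiplicative interaction (psi = 2) in a mid-sized design.
print(simulate_power(300, 600, or_a=2.0, or_b=2.0, psi=2.0))
```

Because a |z| that exceeds the 5% critical value always exceeds the 20% one, the estimated power at α = 20% is never below the power at α = 5% on the same simulated data sets; the interesting question, which the paper addresses, is whether that gain is ever large enough to matter once the extra false positives under the null are taken into account.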