Authors: Leifeng Xiao, Kit-Tai Hau, Melissa Dan Wang
Journal: Educational Measurement: Issues and Practice, vol. 43, no. 2, pp. 74-81
Published: 2024-03-20
DOI: 10.1111/emip.12604
URL: https://onlinelibrary.wiley.com/doi/10.1111/emip.12604
Revisiting the Usage of Alpha in Scale Evaluation: Effects of Scale Length and Sample Size
Short scales are time-efficient for participants and cost-effective in research. However, researchers often mistakenly expect short scales to be as reliable as long ones, without considering the effect of scale length. We argue that applying a universal benchmark for alpha is problematic because low-quality items have a greater impact on shorter scales. In this study, we propose simple guidelines for item reduction using the “alpha-if-item-deleted” procedure in scale construction. An item can be removed if deleting it increases alpha or decreases alpha by less than .02, especially for short scales. Conversely, an item should be retained if alpha decreases by more than .04 upon its removal. For reliability benchmarks, .80 is relatively safe in most conditions, but higher benchmarks are recommended for longer scales and smaller sample sizes. Supplementary analyses, including item content, face validity, and content coverage, are critical to ensure scale quality.
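The “alpha-if-item-deleted” screening described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names (`cronbach_alpha`, `alpha_if_item_deleted`, `screen_items`) and the simulated data are hypothetical, and the handling of changes between .02 and .04 (which the abstract does not address) is an assumption labeled as such in the code. It also assumes that "increases or decreases by less than .02" means any increase in alpha, or a decrease smaller than .02.

```python
import numpy as np

def cronbach_alpha(scores):
    """Cronbach's alpha for a respondents-by-items score matrix."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]
    item_vars = scores.var(axis=0, ddof=1)       # variance of each item
    total_var = scores.sum(axis=1).var(ddof=1)   # variance of the sum score
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

def alpha_if_item_deleted(scores):
    """Alpha recomputed with each item left out in turn."""
    scores = np.asarray(scores, dtype=float)
    return np.array([cronbach_alpha(np.delete(scores, j, axis=1))
                     for j in range(scores.shape[1])])

def screen_items(scores, remove_below=0.02, retain_above=0.04):
    """Label each item using the .02 / .04 thresholds from the abstract."""
    full = cronbach_alpha(scores)
    report = []
    for j, a in enumerate(alpha_if_item_deleted(scores)):
        drop = full - a  # positive: alpha decreases when the item is removed
        if drop < remove_below:
            label = "removal candidate"   # alpha rises, or falls by < .02
        elif drop > retain_above:
            label = "retain"              # alpha falls by > .04
        else:
            label = "unspecified"         # assumption: the abstract gives no rule here
        report.append((j, float(a), label))
    return float(full), report

# Hypothetical data: four items driven by one latent trait plus one pure-noise item.
rng = np.random.default_rng(0)
latent = rng.normal(size=(1000, 1))
data = np.hstack([latent + 0.5 * rng.normal(size=(1000, 4)),
                  rng.normal(size=(1000, 1))])

full_alpha, report = screen_items(data)
print(f"alpha (all items) = {full_alpha:.3f}")
for j, a, label in report:
    print(f"item {j}: alpha if deleted = {a:.3f} -> {label}")
```

A label here is only a statistical flag. As the abstract stresses, item content, face validity, and content coverage should still be reviewed before an item is actually dropped.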