Factor Retention in Exploratory Multidimensional Item Response Theory.

IF 2.1 3区 心理学 Q2 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS Educational and Psychological Measurement Pub Date : 2025-01-04 DOI:10.1177/00131644241306680
Changsheng Chen, Robbe D'hondt, Celine Vens, Wim Van den Noortgate
{"title":"Factor Retention in Exploratory Multidimensional Item Response Theory.","authors":"Changsheng Chen, Robbe D'hondt, Celine Vens, Wim Van den Noortgate","doi":"10.1177/00131644241306680","DOIUrl":null,"url":null,"abstract":"<p><p>Multidimensional Item Response Theory (MIRT) is applied routinely in developing educational and psychological assessment tools, for instance, for exploring multidimensional structures of items using exploratory MIRT. A critical decision in exploratory MIRT analyses is the number of factors to retain. Unfortunately, the comparative properties of statistical methods and innovative Machine Learning (ML) methods for factor retention in exploratory MIRT analyses are still not clear. This study aims to fill this gap by comparing a selection of statistical and ML methods, including Kaiser Criterion (KC), Empirical Kaiser Criterion (EKC), Parallel Analysis (PA), scree plot (OC and AF), Very Simple Structure (VSS; C1 and C2), Minimum Average Partial (MAP), Exploratory Graph Analysis (EGA), Random Forest (RF), Histogram-based Gradient Boosted Decision Trees (HistGBDT), eXtreme Gradient Boosting (XGBoost), and Artificial Neural Network (ANN). The comparison was performed using 720,000 dichotomous response data sets simulated by the MIRT, for various between-item and within-item structures and considering characteristics of large-scale assessments. The results show that MAP, RF, HistGBDT, XGBoost, and ANN tremendously outperform other methods. Among them, HistGBDT generally performs better than other methods. Furthermore, including statistical methods' results as training features improves ML methods' performance. The methods' correct-factoring proportions decrease with an increase in missingness or a decrease in sample size. KC, PA, EKC, and scree plot (OC) are over-factoring, while EGA, scree plot (AF), and VSS (C1) are under-factoring. We recommend that practitioners use both MAP and HistGBDT to determine the number of factors when applying exploratory MIRT.</p>","PeriodicalId":11502,"journal":{"name":"Educational and Psychological Measurement","volume":" ","pages":"00131644241306680"},"PeriodicalIF":2.1000,"publicationDate":"2025-01-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11699551/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Educational and Psychological Measurement","FirstCategoryId":"102","ListUrlMain":"https://doi.org/10.1177/00131644241306680","RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"MATHEMATICS, INTERDISCIPLINARY APPLICATIONS","Score":null,"Total":0}
引用次数: 0

Abstract

Multidimensional Item Response Theory (MIRT) is applied routinely in developing educational and psychological assessment tools, for instance, for exploring multidimensional structures of items using exploratory MIRT. A critical decision in exploratory MIRT analyses is the number of factors to retain. Unfortunately, the comparative properties of statistical methods and innovative Machine Learning (ML) methods for factor retention in exploratory MIRT analyses are still not clear. This study aims to fill this gap by comparing a selection of statistical and ML methods, including Kaiser Criterion (KC), Empirical Kaiser Criterion (EKC), Parallel Analysis (PA), scree plot (OC and AF), Very Simple Structure (VSS; C1 and C2), Minimum Average Partial (MAP), Exploratory Graph Analysis (EGA), Random Forest (RF), Histogram-based Gradient Boosted Decision Trees (HistGBDT), eXtreme Gradient Boosting (XGBoost), and Artificial Neural Network (ANN). The comparison was performed using 720,000 dichotomous response data sets simulated by the MIRT, for various between-item and within-item structures and considering characteristics of large-scale assessments. The results show that MAP, RF, HistGBDT, XGBoost, and ANN tremendously outperform other methods. Among them, HistGBDT generally performs better than other methods. Furthermore, including statistical methods' results as training features improves ML methods' performance. The methods' correct-factoring proportions decrease with an increase in missingness or a decrease in sample size. KC, PA, EKC, and scree plot (OC) are over-factoring, while EGA, scree plot (AF), and VSS (C1) are under-factoring. We recommend that practitioners use both MAP and HistGBDT to determine the number of factors when applying exploratory MIRT.

查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
探索性多维项目反应理论中的因素保留。
多维项目反应理论(MIRT)在教育和心理评估工具的开发中得到了常规应用,例如,利用探索性MIRT来探索项目的多维结构。探索性MIRT分析的一个关键决定是保留的因素数量。不幸的是,在探索性MIRT分析中,统计方法和创新的机器学习(ML)方法在因子保留方面的比较特性仍然不清楚。本研究旨在通过比较统计和ML方法的选择来填补这一空白,包括Kaiser标准(KC),经验Kaiser标准(EKC),平行分析(PA),屏幕图(OC和AF),非常简单结构(VSS);C1和C2),最小平均偏(MAP),探索性图分析(EGA),随机森林(RF),基于直方图的梯度增强决策树(HistGBDT),极端梯度增强(XGBoost)和人工神经网络(ANN)。比较是使用由MIRT模拟的72万个二分反应数据集进行的,用于各种项目间和项目内结构,并考虑大规模评估的特点。结果表明,MAP、RF、HistGBDT、XGBoost和ANN大大优于其他方法。其中,HistGBDT的性能普遍优于其他方法。此外,将统计方法的结果作为训练特征可以提高机器学习方法的性能。这些方法的正确因子比例随着缺失量的增加或样本量的减少而降低。KC、PA、EKC和屏幕图(OC)是过度保理,而EGA、屏幕图(AF)和VSS (C1)是欠保理。我们建议从业者在应用探索性MIRT时同时使用MAP和HistGBDT来确定因素的数量。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
Educational and Psychological Measurement
Educational and Psychological Measurement 医学-数学跨学科应用
CiteScore
5.50
自引率
7.40%
发文量
49
审稿时长
6-12 weeks
期刊介绍: Educational and Psychological Measurement (EPM) publishes referred scholarly work from all academic disciplines interested in the study of measurement theory, problems, and issues. Theoretical articles address new developments and techniques, and applied articles deal with innovation applications.
期刊最新文献
"What If Applicants Fake Their Responses?": Modeling Faking and Response Styles in High-Stakes Assessments Using the Multidimensional Nominal Response Model. A Comparison of the Next Eigenvalue Sufficiency Test to Other Stopping Rules for the Number of Factors in Factor Analysis. An Omega-Hierarchical Extension Index for Second-Order Constructs With Hierarchical Measuring Instruments. The Impact of Missing Data on Parameter Estimation: Three Examples in Computerized Adaptive Testing. Item Classification by Difficulty Using Functional Principal Component Clustering and Neural Networks.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1