Factor Retention in Exploratory Multidimensional Item Response Theory.

Educational and Psychological Measurement (IF 2.3; Zone 3, Psychology; Q2, Mathematics, Interdisciplinary Applications). Pub Date: 2025-01-04. eCollection Date: 2025-08-01. DOI: 10.1177/00131644241306680

Changsheng Chen, Robbe D'hondt, Celine Vens, Wim Van den Noortgate

Pages: 672-695. Publication type: Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11699551/pdf/


Abstract

Multidimensional Item Response Theory (MIRT) is routinely applied in developing educational and psychological assessment tools, for instance to explore the multidimensional structure of items using exploratory MIRT. A critical decision in exploratory MIRT analyses is the number of factors to retain. However, the comparative properties of statistical methods and newer Machine Learning (ML) methods for factor retention in exploratory MIRT analyses remain unclear. This study aims to fill this gap by comparing a selection of statistical and ML methods: Kaiser Criterion (KC), Empirical Kaiser Criterion (EKC), Parallel Analysis (PA), scree plot (OC and AF), Very Simple Structure (VSS; C1 and C2), Minimum Average Partial (MAP), Exploratory Graph Analysis (EGA), Random Forest (RF), Histogram-based Gradient Boosted Decision Trees (HistGBDT), eXtreme Gradient Boosting (XGBoost), and Artificial Neural Network (ANN). The comparison used 720,000 dichotomous response data sets simulated under MIRT, covering various between-item and within-item structures and reflecting characteristics of large-scale assessments. The results show that MAP, RF, HistGBDT, XGBoost, and ANN substantially outperform the other methods, with HistGBDT generally performing best among them. Furthermore, including the statistical methods' results as training features improves the ML methods' performance. The methods' correct-factoring proportions decrease as missingness increases or as sample size decreases. KC, PA, EKC, and scree plot (OC) tend to over-factor, whereas EGA, scree plot (AF), and VSS (C1) tend to under-factor. We recommend that practitioners use both MAP and HistGBDT to determine the number of factors when applying exploratory MIRT.
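The abstract summarizes each method by how often it recovers the true number of factors and by whether its errors lean toward too many or too few factors. The following Python sketch shows one way such proportions could be tallied across simulated data sets. It is an illustration only and not the authors' code: the function name `factoring_proportions` and the toy values are hypothetical, and the study's exact scoring rules may differ.

```python
from collections import Counter


def factoring_proportions(true_k, estimated_k):
    """Return the share of data sets that are correctly, over-, and under-factored.

    true_k: true number of factors used to simulate each data set.
    estimated_k: number of factors a retention method suggested for each data set.
    """
    if len(true_k) != len(estimated_k):
        raise ValueError("true_k and estimated_k must have the same length")
    if not true_k:
        raise ValueError("at least one data set is required")

    counts = Counter()
    for k_true, k_hat in zip(true_k, estimated_k):
        if k_hat == k_true:
            counts["correct"] += 1   # method recovered the true dimensionality
        elif k_hat > k_true:
            counts["over"] += 1      # too many factors retained
        else:
            counts["under"] += 1     # too few factors retained

    n = len(true_k)
    return {label: counts[label] / n for label in ("correct", "over", "under")}


# Hypothetical toy values for illustration, not results from the study.
true_k = [2, 2, 3, 3, 4]
estimated_k = [2, 3, 3, 2, 4]
print(factoring_proportions(true_k, estimated_k))
# → {'correct': 0.6, 'over': 0.2, 'under': 0.2}
```

A method described as over-factoring would show an "over" share clearly larger than its "under" share, and the reverse for an under-factoring method.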




Source journal

Educational and Psychological Measurement (Medicine; Mathematics, Interdisciplinary Applications)

CiteScore: 5.50

Self-citation rate: 7.40%

Review time: 6-12 weeks

Journal description: Educational and Psychological Measurement (EPM) publishes refereed scholarly work from all academic disciplines interested in the study of measurement theory, problems, and issues. Theoretical articles address new developments and techniques, and applied articles deal with innovative applications.