
Educational and Psychological Measurement: Latest Publications

Overestimation of Internal Consistency by Coefficient Omega in Data Giving Rise to a Centroid-Like Factor Solution.
IF 2.1 CAS Tier 3 (Psychology) Q2 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS Pub Date: 2025-02-13 DOI: 10.1177/00131644241313447
Karl Schweizer, Tengfei Wang, Xuezhu Ren

Coefficient Omega measuring internal consistency is investigated for its deviations from expected outcomes when applied to correlational patterns that produce variable-focused factor solutions in confirmatory factor analysis. In these solutions, the factor loadings on the factor of the one-factor measurement model closely correspond to the correlations of one manifest variable with the other manifest variables, as in centroid solutions. It is demonstrated that in such a situation, a heterogeneous correlational pattern leads to an Omega estimate larger than those for similarly heterogeneous and uniform patterns. A simulation study reveals that these deviations are restricted to datasets including small numbers of manifest variables and that the degree of heterogeneity determines the degree of deviation. We propose a method for identifying variable-focused factor solutions and discuss how to deal with the deviations.
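
As a concrete point of reference for the quantity under study, the sketch below computes coefficient Omega from the loadings and residual variances of a one-factor model. The numeric values are invented for illustration and are not taken from the article; the snippet shows only the computation, not the article's overestimation result.

```python
# Minimal sketch: coefficient Omega for a one-factor (congeneric) model.
# Loadings and residual variances below are hypothetical illustrations.
import numpy as np

def coefficient_omega(loadings, residual_variances):
    """McDonald's Omega for a unidimensional measurement model."""
    loadings = np.asarray(loadings, dtype=float)
    residual_variances = np.asarray(residual_variances, dtype=float)
    common_variance = loadings.sum() ** 2
    return common_variance / (common_variance + residual_variances.sum())

# A heterogeneous loading pattern versus a uniform one (standardized items,
# so residual variance = 1 - loading**2).
heterogeneous = coefficient_omega([0.8, 0.4, 0.3, 0.3], [0.36, 0.84, 0.91, 0.91])
uniform = coefficient_omega([0.5, 0.5, 0.5, 0.5], [0.75, 0.75, 0.75, 0.75])
print(f"Omega (heterogeneous pattern): {heterogeneous:.3f}")
print(f"Omega (uniform pattern):       {uniform:.3f}")
```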

Citations: 0
Improving the Use of Parallel Analysis by Accounting for Sampling Variability of the Observed Correlation Matrix.
IF 2.3 CAS Tier 3 (Psychology) Q2 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS Pub Date: 2025-02-01 Epub Date: 2024-08-20 DOI: 10.1177/00131644241268073
Yan Xia, Xinchang Zhou

Parallel analysis has been considered one of the most accurate methods for determining the number of factors in factor analysis. One major advantage of parallel analysis over traditional factor retention methods (e.g., Kaiser's rule) is that it addresses the sampling variability of eigenvalues obtained from the identity matrix, representing the correlation matrix for a zero-factor model. This study argues that we should also address the sampling variability of eigenvalues obtained from the observed data, such that the results would inform practitioners of the variability of the number of factors across random samples. Thus, this study proposes to revise the parallel analysis to provide the proportion of random samples that suggest k factors (k = 0, 1, 2, . . .) rather than a single suggested number. Simulation results support the use of the proposed strategy, especially for research scenarios with limited sample sizes where sampling fluctuation is concerning.
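
The proposal lends itself to a short sketch. Under our own simplifying assumptions (bootstrap resampling of the observed rows, mean eigenvalues of random normal data as the reference, and a stop-at-first-failure counting rule), the function below reports the proportion of resamples suggesting each number of factors instead of a single value; it is an illustrative reading, not the authors' implementation.

```python
# Minimal sketch: parallel analysis that also reflects sampling variability
# of the observed correlation matrix via bootstrap resampling.
# Illustrative reading of the proposal, not the authors' code.
import numpy as np
from collections import Counter

def reference_eigenvalues(n, p, n_sims=200, rng=None):
    """Mean eigenvalues of correlation matrices of random normal data (zero-factor model)."""
    rng = np.random.default_rng(rng)
    eigs = np.empty((n_sims, p))
    for s in range(n_sims):
        x = rng.standard_normal((n, p))
        eigs[s] = np.sort(np.linalg.eigvalsh(np.corrcoef(x, rowvar=False)))[::-1]
    return eigs.mean(axis=0)

def suggested_factors(data, ref):
    """Count leading observed eigenvalues that exceed the reference eigenvalues."""
    obs = np.sort(np.linalg.eigvalsh(np.corrcoef(data, rowvar=False)))[::-1]
    k = 0
    while k < len(obs) and obs[k] > ref[k]:
        k += 1
    return k

def parallel_analysis_proportions(data, n_boot=500, seed=1):
    """Proportion of bootstrap resamples suggesting k factors, for each k."""
    rng = np.random.default_rng(seed)
    n, p = data.shape
    ref = reference_eigenvalues(n, p, rng=rng)
    counts = Counter()
    for _ in range(n_boot):
        boot = data[rng.integers(0, n, size=n)]  # resample rows with replacement
        counts[suggested_factors(boot, ref)] += 1
    return {k: counts[k] / n_boot for k in sorted(counts)}

# Hypothetical two-factor data, for demonstration only.
rng = np.random.default_rng(0)
factors = rng.standard_normal((150, 2))
loadings = np.array([[.7, 0], [.7, 0], [.6, 0], [0, .7], [0, .7], [0, .6]])
data = factors @ loadings.T + rng.standard_normal((150, 6)) * 0.6
print(parallel_analysis_proportions(data))  # e.g., {2: 0.9, 3: 0.1} for a clear structure
```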

Citations: 0
Obtaining a Bayesian Estimate of Coefficient Alpha Using a Posterior Normal Distribution.
IF 2.3 CAS Tier 3 (Psychology) Q2 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS Pub Date: 2025-01-31 eCollection Date: 2025-08-01 DOI: 10.1177/00131644241311877
John Mart V DelosReyes, Miguel A Padilla

A new alternative for obtaining a Bayesian estimate of coefficient alpha through a posterior normal distribution is proposed and assessed, in a simulation study, through percentile, normal-theory-based, and highest probability density credible intervals. The results indicate that the proposed Bayesian method to estimate coefficient alpha has acceptable coverage probability performance across the majority of investigated simulation conditions.
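
For orientation, the sketch below computes sample coefficient alpha and a normal-approximation interval from bootstrap resamples. It is a hedged stand-in showing the quantities involved (point estimate plus a normal-theory-style interval), not the posterior normal derivation proposed in the article.

```python
# Minimal sketch: coefficient alpha with a normal-approximation interval
# built from bootstrap resamples. NOT the article's posterior-normal
# Bayesian estimator; data and settings are illustrative.
import numpy as np
from scipy import stats

def coefficient_alpha(x):
    """Cronbach's alpha for an n-by-k matrix of item scores."""
    x = np.asarray(x, dtype=float)
    k = x.shape[1]
    item_variance_sum = x.var(axis=0, ddof=1).sum()
    total_variance = x.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_variance_sum / total_variance)

def alpha_normal_interval(x, level=0.95, n_boot=2000, seed=0):
    """Point estimate plus a normal-approximation interval for alpha."""
    rng = np.random.default_rng(seed)
    n = x.shape[0]
    boots = np.array([coefficient_alpha(x[rng.integers(0, n, size=n)])
                      for _ in range(n_boot)])
    z = stats.norm.ppf(0.5 + level / 2)
    center, se = boots.mean(), boots.std(ddof=1)
    return coefficient_alpha(x), (center - z * se, center + z * se)

# Hypothetical 5-item data.
rng = np.random.default_rng(42)
true_score = rng.standard_normal((200, 1))
items = 0.6 * true_score + 0.8 * rng.standard_normal((200, 5))
estimate, interval = alpha_normal_interval(items)
print(f"alpha = {estimate:.3f}, 95% interval = ({interval[0]:.3f}, {interval[1]:.3f})")
```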

Citations: 0
Examining the Instructional Sensitivity of Constructed-Response Achievement Test Item Scores.
IF 2.3 CAS Tier 3 (Psychology) Q2 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS Pub Date: 2025-01-30 eCollection Date: 2025-10-01 DOI: 10.1177/00131644241313212
Anne Traynor, Cheng-Hsien Li, Shuqi Zhou

Inferences about student learning from large-scale achievement test scores are fundamental in education. For achievement test scores to provide useful information about student learning progress, differences in the content of instruction (i.e., the implemented curriculum) should affect test-takers' item responses. Existing research has begun to identify patterns in the content of instructionally sensitive multiple-choice achievement test items. To inform future test design decisions, this study identified instructionally (in)sensitive constructed-response achievement items, then characterized features of those items and their corresponding scoring rubrics. First, we used simulation to evaluate an item step difficulty difference index for constructed-response test items, derived from the generalized partial credit model. The statistical performance of the index was adequate, so we then applied it to data from 32 constructed-response eighth-grade science test items. We found that the instructional sensitivity (IS) index values varied appreciably across the category boundaries within an item as well as across items. Content analysis by master science teachers allowed us to identify general features of item score categories that show high, or negligible, IS.
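
The abstract does not spell the index out, so the sketch below only shows the ingredients: generalized partial credit model (GPCM) category probabilities and, as an assumed illustration, per-step differences between step difficulties estimated separately for instructed and uninstructed groups. The numeric parameters and the difference rule are assumptions, not the article's exact formulation.

```python
# Minimal sketch: GPCM category probabilities plus a hypothetical per-step
# difficulty difference between two instruction groups. Parameters and the
# difference rule are illustrative assumptions.
import numpy as np

def gpcm_probs(theta, a, steps):
    """GPCM category probabilities for one item.

    theta : ability value
    a     : item discrimination
    steps : step difficulties b_1..b_m (category 0 carries no step term)
    """
    cumulative = np.concatenate(([0.0], np.cumsum(a * (theta - np.asarray(steps, float)))))
    expcum = np.exp(cumulative - cumulative.max())  # subtract max for numerical stability
    return expcum / expcum.sum()

# Hypothetical step difficulties for the same 4-category item, estimated in an
# instructed and an uninstructed group.
steps_instructed = np.array([-0.8, 0.1, 0.9])
steps_uninstructed = np.array([-0.3, 0.7, 1.8])
print("per-step difficulty difference:", steps_uninstructed - steps_instructed)
print("P(categories | theta=0, instructed):  ", gpcm_probs(0.0, 1.2, steps_instructed).round(3))
print("P(categories | theta=0, uninstructed):", gpcm_probs(0.0, 1.2, steps_uninstructed).round(3))
```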

Citations: 0
The Impact of Attentiveness Interventions on Survey Data.
IF 2.1 CAS Tier 3 (Psychology) Q2 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS Pub Date: 2025-01-29 DOI: 10.1177/00131644241311851
Christie M Fuller, Marcia J Simmering, Brian Waterwall, Elizabeth Ragland, Douglas P Twitchell, Alison Wall

Social and behavioral science researchers who use survey data are vigilant about data quality, with an increasing emphasis on avoiding common method variance (CMV) and insufficient effort responding (IER). Each of these errors can inflate or deflate substantive relationships, and there are both a priori and post hoc means to address them. Yet, little research has investigated how both IER and CMV are affected by the different procedural or statistical techniques used to address them. More specifically, if interventions to reduce IER are used, does this affect CMV in the data? In an experiment conducted both in and out of the laboratory, we investigate the impact of attentiveness interventions, such as a Factual Manipulation Check (FMC), on both IER and CMV in same-source survey data. In addition to typical IER measures, we also track whether respondents play the instructional video and their mouse movements. The results show that while interventions have some impact on the level of participant attentiveness, these interventions do not appear to lead to differing levels of CMV.
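
As one common operationalization of IER (not necessarily among the exact measures used in the article), the sketch below computes a longstring index, the longest run of identical consecutive responses per respondent, which can then be compared across intervention conditions.

```python
# Minimal sketch: longstring index, a common screen for insufficient effort
# responding (IER). Generic illustration; threshold and data are hypothetical.
import numpy as np

def longstring(responses):
    """Longest run of identical consecutive responses for one respondent."""
    responses = np.asarray(responses)
    best = current = 1
    for prev, cur in zip(responses[:-1], responses[1:]):
        current = current + 1 if cur == prev else 1
        best = max(best, current)
    return best

# Hypothetical 5-point Likert responses for three respondents.
data = np.array([
    [3, 3, 3, 3, 3, 3, 3, 3, 3, 3],  # one long run: likely careless responding
    [2, 4, 3, 5, 1, 2, 4, 3, 2, 5],
    [4, 4, 2, 2, 2, 5, 1, 3, 3, 3],
])
runs = np.array([longstring(row) for row in data])
print("longstring per respondent:", runs)
print("flagged (run >= 8):", runs >= 8)
```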

Citations: 0
"What If Applicants Fake Their Responses?": Modeling Faking and Response Styles in High-Stakes Assessments Using the Multidimensional Nominal Response Model. “如果求职者的回答是假的怎么办?”运用多维标称反应模型对高风险评估中的虚假和反应风格进行建模。
IF 2.3 CAS Tier 3 (Psychology) Q2 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS Pub Date: 2025-01-23 eCollection Date: 2025-08-01 DOI: 10.1177/00131644241307560
Timo Seitz, Maik Spengler, Thorsten Meiser

Self-report personality tests used in high-stakes assessments hold the risk that test-takers engage in faking. In this article, we demonstrate an extension of the multidimensional nominal response model (MNRM) to account for the response bias of faking. The MNRM is a flexible item response theory (IRT) model that allows modeling response biases whose effect patterns vary between items. In a simulation, we found good parameter recovery of the model accounting for faking under different conditions as well as good performance of model selection criteria. Also, we modeled responses from N = 3,046 job applicants taking a personality test under real high-stakes conditions. We thereby specified item-specific effect patterns of faking by setting scoring weights to appropriate values that we collected in a pilot study. Results indicated that modeling faking significantly increased model fit over and above response styles and improved divergent validity, while the faking dimension exhibited relations to several covariates. Additionally, applying the model to a sample of job incumbents taking the test under low-stakes conditions, we found evidence that the model can effectively capture faking and adjust estimates of substantive trait scores for the assumed influence of faking. We end the article with a discussion of implications for psychological measurement in high-stakes assessment contexts.
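
A minimal sketch of the model's core, under one common parameterization: each dimension d contributes slope × scoring weight × theta to a category's logit, so fixed scoring weights encode the effect pattern of the trait, a response style, or faking on each response category. The slopes, intercepts, and weights below are invented; in particular, the faking weights are not the item-specific weights collected in the article's pilot study.

```python
# Minimal sketch: category probabilities in a multidimensional nominal
# response model (MNRM) with fixed scoring weights per dimension.
# All parameter values are illustrative assumptions.
import numpy as np

def mnrm_probs(theta, slopes, scoring_weights, intercepts):
    """MNRM category probabilities for one item.

    theta           : (D,) person values on the D dimensions
    slopes          : (D,) item slopes
    scoring_weights : (D, K) scoring function per dimension and category
    intercepts      : (K,) category intercepts
    """
    theta = np.asarray(theta, float)
    logits = (np.asarray(slopes, float)[:, None]
              * np.asarray(scoring_weights, float)
              * theta[:, None]).sum(axis=0) + np.asarray(intercepts, float)
    logits -= logits.max()  # numerical stability
    expl = np.exp(logits)
    return expl / expl.sum()

# 5-point item; dimensions: substantive trait, extreme response style, faking.
scoring = np.array([
    [0, 1, 2, 3, 4],  # trait: linear category scores
    [1, 0, 0, 0, 1],  # extreme response style: endpoint categories
    [0, 0, 1, 2, 4],  # faking: hypothetical item-specific pattern
])
slopes = np.array([1.3, 0.5, 0.8])
intercepts = np.array([0.0, 0.4, 0.6, 0.4, 0.0])
print("low faking: ", mnrm_probs([0.5, 0.2, 0.0], slopes, scoring, intercepts).round(3))
print("high faking:", mnrm_probs([0.5, 0.2, 1.5], slopes, scoring, intercepts).round(3))
```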

Citations: 0
A Comparison of the Next Eigenvalue Sufficiency Test to Other Stopping Rules for the Number of Factors in Factor Analysis.
IF 2.3 CAS Tier 3 (Psychology) Q2 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS Pub Date: 2025-01-22 eCollection Date: 2025-08-01 DOI: 10.1177/00131644241308528
Pier-Olivier Caron

A plethora of techniques exist to determine the number of factors to retain in exploratory factor analysis. A recent and promising technique is the Next Eigenvalue Sufficiency Test (NEST), but it has not been systematically compared with well-established stopping rules. The present study proposes a simulation with synthetic factor structures to compare NEST, parallel analysis, the sequential χ² test, the Hull method, and the empirical Kaiser criterion. The structures were based on 24 variables containing one to eight factors, loadings ranged from .40 to .80, inter-factor correlations ranged from .00 to .30, and three sample sizes were used. In total, 360 scenarios were replicated 1,000 times. Performance was evaluated in terms of accuracy (correct identification of dimensionality) and bias (tendency to over- or underestimate dimensionality). Overall, NEST showed the best performance, especially in hard conditions where it had to detect small but meaningful factors. It had a tendency to underextract, but to a lesser extent than the other methods. The second-best method was parallel analysis, which was more liberal in harder cases. The three other stopping rules had pitfalls: the sequential χ² test and the Hull method showed pitfalls even in some easy conditions, and the empirical Kaiser criterion in hard conditions.
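
Several of the compared stopping rules can be stated compactly; as one example, here is a sketch of the empirical Kaiser criterion based on our reading of its reference-eigenvalue formula. Treat the exact formula as an assumption to verify against the original source before relying on it.

```python
# Minimal sketch of the empirical Kaiser criterion (EKC); the reference-value
# formula reflects our reading of the published method and should be checked
# against the original source.
import numpy as np

def empirical_kaiser_criterion(data):
    """Suggested number of factors for an n-by-p data matrix."""
    n, p = data.shape
    eigenvalues = np.sort(np.linalg.eigvalsh(np.corrcoef(data, rowvar=False)))[::-1]
    upper = (1 + np.sqrt(p / n)) ** 2  # asymptotic largest eigenvalue of random data
    k = 0
    for j in range(p):
        remaining = p - eigenvalues[:j].sum()
        reference = max(remaining / (p - j) * upper, 1.0)
        if eigenvalues[j] > reference:
            k += 1
        else:
            break
    return k

# Hypothetical one-factor data.
rng = np.random.default_rng(3)
general = rng.standard_normal((200, 1))
data = general @ np.full((1, 8), 0.6) + rng.standard_normal((200, 8)) * 0.8
print(empirical_kaiser_criterion(data))  # expected: 1 for this toy structure
```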

Citations: 0
An Omega-Hierarchical Extension Index for Second-Order Constructs With Hierarchical Measuring Instruments.
IF 2.1 CAS Tier 3 (Psychology) Q2 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS Pub Date: 2025-01-14 DOI: 10.1177/00131644241302284
Tenko Raykov, Christine DiStefano, Yusuf Ransome

An index extending the widely used omega-hierarchical coefficient is discussed, which can be used for evaluating the influence of a second-order factor on the interrelationships among the components of a hierarchical measuring instrument. The index represents a useful and informative complement to the traditional omega-hierarchical measure of explained overall scale score variance by that underlying construct. A point and interval estimation procedure is outlined for the described index, which is based on model reparameterization and is developed within the latent variable modeling framework. The method is readily applicable with popular software and is illustrated with examples.
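
As context for the index being extended, the sketch below computes the traditional omega-hierarchical coefficient for a second-order model via a Schmid-Leiman-style decomposition. The loading values are hypothetical, and the article's proposed extension index is not reproduced here.

```python
# Minimal sketch: omega-hierarchical for a second-order factor model,
# using a Schmid-Leiman-style decomposition. Hypothetical loadings.
import numpy as np

def omega_hierarchical(first_order, second_order, unique_variances):
    """Omega-hierarchical from a standardized second-order solution.

    first_order      : (items, factors) first-order loading matrix
    second_order     : (factors,) loadings of the first-order factors on the general factor
    unique_variances : (items,) residual variances
    """
    first_order = np.asarray(first_order, float)
    second_order = np.asarray(second_order, float)
    # Factor correlations implied by the second-order structure (standardized factors).
    phi = np.outer(second_order, second_order)
    np.fill_diagonal(phi, 1.0)
    implied = first_order @ phi @ first_order.T + np.diag(np.asarray(unique_variances, float))
    total_variance = implied.sum()  # variance of the unit-weighted sum score
    general_loadings = first_order @ second_order  # Schmid-Leiman general-factor loadings
    return general_loadings.sum() ** 2 / total_variance

# Hypothetical: 6 items, two first-order factors, one second-order factor.
loadings = np.array([[.7, 0], [.6, 0], [.6, 0], [0, .7], [0, .6], [0, .5]])
gamma = np.array([.8, .7])
uniquenesses = 1 - (loadings ** 2).sum(axis=1)
print(f"omega-hierarchical = {omega_hierarchical(loadings, gamma, uniquenesses):.3f}")
```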

Citations: 0
Factor Retention in Exploratory Multidimensional Item Response Theory.
IF 2.3 CAS Tier 3 (Psychology) Q2 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS Pub Date: 2025-01-04 eCollection Date: 2025-08-01 DOI: 10.1177/00131644241306680
Changsheng Chen, Robbe D'hondt, Celine Vens, Wim Van den Noortgate

Multidimensional Item Response Theory (MIRT) is applied routinely in developing educational and psychological assessment tools, for instance, for exploring multidimensional structures of items using exploratory MIRT. A critical decision in exploratory MIRT analyses is the number of factors to retain. Unfortunately, the comparative properties of statistical methods and innovative Machine Learning (ML) methods for factor retention in exploratory MIRT analyses are still not clear. This study aims to fill this gap by comparing a selection of statistical and ML methods, including Kaiser Criterion (KC), Empirical Kaiser Criterion (EKC), Parallel Analysis (PA), scree plot (OC and AF), Very Simple Structure (VSS; C1 and C2), Minimum Average Partial (MAP), Exploratory Graph Analysis (EGA), Random Forest (RF), Histogram-based Gradient Boosted Decision Trees (HistGBDT), eXtreme Gradient Boosting (XGBoost), and Artificial Neural Network (ANN). The comparison was performed using 720,000 dichotomous response data sets simulated by the MIRT, for various between-item and within-item structures and considering characteristics of large-scale assessments. The results show that MAP, RF, HistGBDT, XGBoost, and ANN tremendously outperform other methods. Among them, HistGBDT generally performs better than other methods. Furthermore, including statistical methods' results as training features improves ML methods' performance. The methods' correct-factoring proportions decrease with an increase in missingness or a decrease in sample size. KC, PA, EKC, and scree plot (OC) are over-factoring, while EGA, scree plot (AF), and VSS (C1) are under-factoring. We recommend that practitioners use both MAP and HistGBDT to determine the number of factors when applying exploratory MIRT.
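
A minimal sketch of the machine-learning side of such a comparison, under heavy simplifications of our own: simulate factor-model data, use the sorted eigenvalues of the correlation matrix as features, and train a histogram-based gradient-boosted classifier to predict the number of factors. The data generation and feature set here are far simpler than the study's design.

```python
# Minimal sketch: predicting the number of factors from eigenvalue features
# with a histogram-based gradient-boosted classifier. Greatly simplified
# relative to the study's simulation design; for illustration only.
import numpy as np
from sklearn.ensemble import HistGradientBoostingClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
P = 12  # number of items

def simulate_eigenvalues(k, n=300):
    """Eigenvalues of the correlation matrix of data from a k-factor model."""
    loadings = np.zeros((P, max(k, 1)))
    if k > 0:
        for j in range(k):  # simple structure with roughly equal-size item clusters
            idx = np.arange(j, P, k)
            loadings[idx, j] = rng.uniform(0.4, 0.8, size=idx.size)
    factors = rng.standard_normal((n, max(k, 1)))
    unique_sd = np.sqrt(np.clip(1 - (loadings ** 2).sum(axis=1), 0.05, 1.0))
    data = factors @ loadings.T + rng.standard_normal((n, P)) * unique_sd
    return np.sort(np.linalg.eigvalsh(np.corrcoef(data, rowvar=False)))[::-1]

# Labeled training set: features are eigenvalues, labels are the true k.
labels = rng.integers(0, 5, size=400)  # 0 to 4 factors
features = np.array([simulate_eigenvalues(k) for k in labels])

X_train, X_test, y_train, y_test = train_test_split(features, labels,
                                                    test_size=0.25, random_state=0)
clf = HistGradientBoostingClassifier().fit(X_train, y_train)
print("correct-factoring proportion on held-out data:", round(clf.score(X_test, y_test), 3))
```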

Citations: 0
Examination of ChatGPT's Performance as a Data Analysis Tool.
IF 2.3 CAS Tier 3 (Psychology) Q2 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS Pub Date: 2025-01-03 eCollection Date: 2025-08-01 DOI: 10.1177/00131644241302721
Duygu Koçak

This study examines the performance of ChatGPT, developed by OpenAI and widely used as an AI-based conversational tool, as a data analysis tool through exploratory factor analysis (EFA). To this end, simulated data were generated under various data conditions, including normal distribution, response category, sample size, test length, factor loading, and measurement models. The generated data were analyzed using ChatGPT-4o twice with a 1-week interval under the same prompt, and the results were compared with those obtained using R code. In data analysis, the Kaiser-Meyer-Olkin (KMO) value, total variance explained, and the number of factors estimated using the empirical Kaiser criterion, Hull method, and Kaiser-Guttman criterion, as well as factor loadings, were calculated. The findings obtained from ChatGPT at two different times were found to be consistent with those obtained using R. Overall, ChatGPT demonstrated good performance for steps that require only computational decisions without involving researcher judgment or theoretical evaluation (such as KMO, total variance explained, and factor loadings). However, for multidimensional structures, although the estimated number of factors was consistent across analyses, biases were observed, suggesting that researchers should exercise caution in such decisions.
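
For readers unfamiliar with the quantities being checked, the sketch below computes the overall KMO measure and the Kaiser-Guttman count (eigenvalues greater than 1) with generic formulas; it does not reproduce the article's R code or the prompts given to ChatGPT.

```python
# Minimal sketch: overall KMO measure and the Kaiser-Guttman rule, using
# generic formulas; not the article's R code or prompts.
import numpy as np

def kmo(data):
    """Overall Kaiser-Meyer-Olkin measure of sampling adequacy."""
    r = np.corrcoef(data, rowvar=False)
    inv = np.linalg.inv(r)
    scale = np.sqrt(np.outer(np.diag(inv), np.diag(inv)))
    partial = -inv / scale  # partial (anti-image) correlations
    off_diagonal = ~np.eye(r.shape[0], dtype=bool)
    r2 = (r[off_diagonal] ** 2).sum()
    q2 = (partial[off_diagonal] ** 2).sum()
    return r2 / (r2 + q2)

def kaiser_guttman(data):
    """Number of correlation-matrix eigenvalues greater than 1."""
    eigenvalues = np.linalg.eigvalsh(np.corrcoef(data, rowvar=False))
    return int((eigenvalues > 1).sum())

# Hypothetical two-factor data.
rng = np.random.default_rng(7)
factors = rng.standard_normal((250, 2))
loadings = np.array([[.7, 0], [.6, 0], [.6, 0], [0, .7], [0, .6], [0, .6]])
data = factors @ loadings.T + rng.standard_normal((250, 6)) * 0.7
print(f"KMO = {kmo(data):.3f}")
print(f"Kaiser-Guttman suggested factors: {kaiser_guttman(data)}")
```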

Citations: 0