
Applied Psychological Measurement: Latest Articles

Semi-Parametric Item Response Theory With O'Sullivan Splines for Item Responses and Response Time.
IF 1 Zone 4 (Psychology) Q4 PSYCHOLOGY, MATHEMATICAL Pub Date: 2025-02-02 DOI: 10.1177/01466216251316277
Chen-Wei Liu

Response time (RT) has been an essential resource for supplementing the estimation accuracy of latent traits and item parameters in educational testing. Most item response theory (IRT) approaches are based on parametric RT models. However, since test takers may alter their behaviors during a test due to motivation or strategy shifts, fatigue, or other causes, parametric IRT models are unlikely to capture such subtle and nonlinear information. In this work, we propose a novel semi-parametric IRT model with O'Sullivan splines to accommodate the flexible mean RT shapes and explore the underlying nonlinear relationships between latent traits and RT. A simulation study was conducted to demonstrate the substantial improvement in parameter estimation achieved by the new model, as well as the detriment of using parametric models in terms of biases and measurement errors. Using this model, a dataset of mathematics test scores and RT from the Programme for International Student Assessment was analyzed to demonstrate the evident nonlinearity and to compare the proposed model with existing models in terms of model fitting. The findings presented in this study indicate the promising nature of the new approach, suggesting its potential as an additional psychometric tool to enhance test reliability and reduce measurement errors.

Citations: 0
Compound Optimal Design for Online Item Calibration Under the Two-Parameter Logistic Model.
IF 1 Zone 4 (Psychology) Q4 PSYCHOLOGY, MATHEMATICAL Pub Date: 2025-01-28 DOI: 10.1177/01466216251316276
Lihong Song, Wenyi Wang

Under the theory of sequential design, a compound optimal design with two optimality criteria can be used to calibrate the item parameters of item response theory models efficiently. To calibrate item parameters efficiently in computerized testing, a compound optimal design is proposed for the simultaneous estimation of item difficulty and discrimination parameters under the two-parameter logistic model; the design adaptively focuses on optimizing the parameter that is difficult to estimate. Using the acceptance probability, the compound optimal design can provide ability design points that optimize the item difficulty and discrimination parameters, respectively. Simulation and real data analysis studies showed that the compound optimal design outperformed the D-optimal and random designs in terms of the recovery of both discrimination and difficulty parameters.
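
To make the terms in this abstract concrete, the following is a minimal Python sketch of the two-parameter logistic (2PL) response function and of a D-optimality criterion for one item's parameters. It is not the authors' compound design: the function names, the choice of the determinant of the Fisher information as the criterion, and the example ability values are assumptions made for illustration only.

```python
import math


def p_2pl(theta, a, b):
    """Probability of a correct response under the 2PL model
    with discrimination a and difficulty b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information(thetas, a, b):
    """Fisher information for (a, b), summed over examinee abilities.

    The logit is a * (theta - b), so its gradient with respect to
    (a, b) is (theta - b, -a). Each examinee contributes
    p * (1 - p) times the outer product of that gradient.
    Returns the three distinct entries (I_aa, I_ab, I_bb).
    """
    i_aa = i_ab = i_bb = 0.0
    for theta in thetas:
        p = p_2pl(theta, a, b)
        w = p * (1.0 - p)
        i_aa += w * (theta - b) ** 2
        i_ab += w * (theta - b) * (-a)
        i_bb += w * a ** 2
    return i_aa, i_ab, i_bb


def d_criterion(thetas, a, b):
    """Determinant of the 2x2 information matrix (larger is better
    under D-optimality)."""
    i_aa, i_ab, i_bb = item_information(thetas, a, b)
    return i_aa * i_bb - i_ab ** 2


if __name__ == "__main__":
    # Hypothetical item with a = 1, b = 0.
    print(d_criterion([0.0, 0.0], 1.0, 0.0))   # abilities all at b
    print(d_criterion([-1.0, 1.0], 1.0, 0.0))  # abilities spread around b
```

In this sketch, examinees located exactly at the item difficulty carry no information about discrimination, so the determinant is zero; spreading abilities around the difficulty makes both parameters estimable. That trade-off is the kind of tension a design combining two criteria has to manage.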

Citations: 0
Comparing Approaches to Estimating Person Parameters for the MUPP Model.
IF 1 Zone 4 (Psychology) Q4 PSYCHOLOGY, MATHEMATICAL Pub Date: 2025-01-27 DOI: 10.1177/01466216251316278
David M LaHuis, Caitlin E Blackmore, Gage M Ammons

This study compared maximum a posteriori (MAP), expected a posteriori (EAP), and Markov chain Monte Carlo (MCMC) approaches to computing person scores from the Multi-Unidimensional Pairwise Preference (MUPP) Model. The MCMC approach used the No-U-Turn Sampler (NUTS). Results suggested that the EAP with fully crossed quadrature and the NUTS outperformed the others when there were fewer dimensions. In addition, the NUTS produced the most accurate estimates in conditions with more dimensions. The number of items per dimension had the largest effect on person parameter recovery.
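
The difference between MAP and EAP scoring can be illustrated with a small sketch. The code below does not implement the MUPP model; as a simplifying assumption it uses a unidimensional 2PL likelihood, a standard normal prior, and an evenly spaced quadrature grid, and all function names are hypothetical.

```python
import math


def likelihood(theta, responses, items):
    """Likelihood of a 0/1 response pattern under a 2PL model.
    items is a list of (a, b) pairs."""
    value = 1.0
    for u, (a, b) in zip(responses, items):
        p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
        value *= p if u == 1 else 1.0 - p
    return value


def posterior_on_grid(responses, items, lo=-4.0, hi=4.0, n=161):
    """Normalized posterior weights on an evenly spaced grid,
    using an (unnormalized) standard normal prior."""
    grid = [lo + (hi - lo) * k / (n - 1) for k in range(n)]
    weights = [
        likelihood(t, responses, items) * math.exp(-0.5 * t * t)
        for t in grid
    ]
    total = sum(weights)
    return grid, [w / total for w in weights]


def eap(responses, items):
    """Expected a posteriori score: the posterior mean."""
    grid, weights = posterior_on_grid(responses, items)
    return sum(t * w for t, w in zip(grid, weights))


def map_estimate(responses, items):
    """Maximum a posteriori score: the grid point with the
    highest posterior weight."""
    grid, weights = posterior_on_grid(responses, items)
    best = max(range(len(grid)), key=lambda k: weights[k])
    return grid[best]


if __name__ == "__main__":
    items = [(1.0, -1.0), (1.0, 1.0)]  # hypothetical item parameters
    print(eap([1, 1], items), map_estimate([1, 1], items))
```

With several dimensions, a fully crossed grid needs the number of points per dimension raised to the number of dimensions, which is one reason sampling-based approaches become attractive as dimensionality grows.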

Citations: 0
Application of Bayesian Decision Theory in Detecting Test Fraud.
IF 1 Zone 4 (Psychology) Q4 PSYCHOLOGY, MATHEMATICAL Pub Date: 2025-01-27 DOI: 10.1177/01466216251316559
Sandip Sinharay, Matthew S Johnson

This article suggests a new approach based on Bayesian decision theory (e.g., Cronbach & Gleser, 1965; Ferguson, 1967) for detection of test fraud. The approach leads to a simple decision rule that involves the computation of the posterior probability that an examinee committed test fraud given the data. The suggested approach was applied to a real data set that involved actual test fraud.
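
A decision rule of the kind described here can be sketched in a few lines. The code below is a generic two-action Bayes rule, not the specific rule derived in the article: the prior, the two likelihood values, and the two loss values are hypothetical inputs chosen for illustration.

```python
def posterior_fraud(prior, lik_fraud, lik_honest):
    """Posterior probability of fraud given the data, by Bayes' rule.

    prior      : prior probability that an examinee committed fraud
    lik_fraud  : probability of the observed data if fraud occurred
    lik_honest : probability of the observed data if it did not
    """
    numerator = prior * lik_fraud
    return numerator / (numerator + (1.0 - prior) * lik_honest)


def flag_examinee(posterior, loss_false_positive, loss_false_negative):
    """Flag when flagging has the smaller expected loss.

    Expected loss of flagging     : (1 - posterior) * loss_false_positive
    Expected loss of not flagging : posterior * loss_false_negative
    Flagging is better exactly when the posterior exceeds
    loss_false_positive / (loss_false_positive + loss_false_negative).
    """
    threshold = loss_false_positive / (
        loss_false_positive + loss_false_negative
    )
    return posterior > threshold


if __name__ == "__main__":
    post = posterior_fraud(prior=0.01, lik_fraud=0.9, lik_honest=0.01)
    print(post)
    print(flag_examinee(post, loss_false_positive=10.0, loss_false_negative=1.0))
```

Making a false accusation more costly than a missed case raises the threshold, so the same posterior probability can lead to different decisions under different loss assumptions.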

Citations: 0
R Package for Calculating Estimators of the Proportion of Explained Variance and Standardized Regression Coefficients in Multiply Imputed Datasets.
IF 1 Zone 4 (Psychology) Q4 PSYCHOLOGY, MATHEMATICAL Pub Date: 2025-01-26 DOI: 10.1177/01466216251316275
Joost R van Ginkel, Julian D Karch
Citations: 0
An Experimental Design to Investigate Item Parameter Drift.
IF 1 Zone 4 (Psychology) Q4 PSYCHOLOGY, MATHEMATICAL Pub Date: 2025-01-24 DOI: 10.1177/01466216251316282
Peter Baldwin, Irina Grabovsky, Kimberly A Swygert, Thomas Fogle, Pilar Reid, Brian E Clauser

Methods for detecting item parameter drift may be inadequate when every exposed item is at risk for drift. To address this scenario, a strategy for detecting item parameter drift is proposed that uses only unexposed items deployed in a stratified random method within an experimental design. The proposed method is illustrated by investigating unexpected score increases on a high-stakes licensure exam. Results for this example were suggestive of item parameter drift but not significant at the .05 level.

Citations: 0
Adaptive Measurement of Change in the Context of Item Parameter Drift.
IF 1 Zone 4 (Psychology) Q4 PSYCHOLOGY, MATHEMATICAL Pub Date: 2024-12-30 DOI: 10.1177/01466216241310599
Allison W Cooperman, Ming Him Tai, Joseph N DeWeese, David J Weiss

Adaptive measurement of change (AMC) uses computerized adaptive testing (CAT) to measure and test the significance of intraindividual change on one or more latent traits. The extant AMC research has so far assumed that item parameter values are constant across testing occasions. Yet item parameters might change over time, a phenomenon termed item parameter drift (IPD). The current study examined AMC's performance in the context of IPD with unidimensional, dichotomous CATs across two testing occasions. A Monte Carlo simulation revealed that AMC false and true positive rates were primarily affected by changes in the difficulty parameter. False positive rates were related to the location of the drift items relative to the latent trait continuum, as the administration of more drift items spuriously increased the magnitude of estimated trait change. Moreover, true positive rates depended upon an interaction between the direction of difficulty parameter drift and the latent trait change trajectory. A follow-up simulation further showed that the number of items in the CAT with parameter drift impacted AMC false and true positive rates, with these relationships moderated by IPD characteristics and the latent trait change trajectory. It is recommended that test administrators confirm the absence of IPD prior to using AMC for measuring intraindividual change with educational and psychological tests.

Citations: 0
Inference of Correlations Among Testlet Effects: A Latent Variable Selection Method.
IF 1 Zone 4 (Psychology) Q4 PSYCHOLOGY, MATHEMATICAL Pub Date: 2024-12-26 DOI: 10.1177/01466216241310598
Xin Xu, Jinxin Guo, Tao Xin

In psychological and educational measurement, a testlet-based test is a common and popular format, especially in some large-scale assessments. In modeling testlet effects, a standard bifactor model, as a common strategy, assumes that the different testlet effects and the main effect are fully independently distributed. However, it is difficult to establish the perfectly independent clusters that this assumption requires. To address this issue, correlations among testlets could be taken into account in fitting data. Moreover, one may desire to maintain a good practical interpretation of the sparse loading matrix. In this paper, we propose data-driven learning of significant correlations in the covariance matrix through a latent variable selection method. Under the proposed method, regularization is applied to the weak correlations in the extended bifactor model. Further, a stochastic expectation maximization algorithm is employed for efficient computation. Results from simulation studies show the consistency of the proposed method in selecting significant correlations. Empirical data from the 2015 Program for International Student Assessment are analyzed using the proposed method as an example.

Citations: 0
An Information Manifold Perspective for Analyzing Test Data.
IF 1 Zone 4 (Psychology) Q4 PSYCHOLOGY, MATHEMATICAL Pub Date: 2024-12-20 DOI: 10.1177/01466216241310600
James O Ramsay, Juan Li, Joakim Wallmark, Marie Wiberg

Modifications of current psychometric models for analyzing test data are proposed that produce an additive scale measure of information. This information measure is a one-dimensional space curve or curved surface manifold that is invariant across varying manifold indexing systems. The arc length along a curve manifold is used as it is an additive metric having a defined zero and a version of the bit as a unit. This property, referred to here as the scope of the test or an item, facilitates the evaluation of graphs and numerical summaries. The measurement power of the test is defined by the length of the manifold, and the performance or experiential level of a person by a position along the curve. In this study, we also use all information from the items including the information from the distractors. Test data from a large-scale college admissions test are used to illustrate the test information manifold perspective and to compare it with the well-known item response theory nominal model. It is illustrated that the use of information theory opens a vista of new ways of assessing item performance and inter-item dependency, as well as test takers' knowledge.

Citations: 0
A Generalized Multi-Detector Combination Approach for Differential Item Functioning Detection.
IF 1 Zone 4 (Psychology) Q4 PSYCHOLOGY, MATHEMATICAL Pub Date: 2024-12-19 DOI: 10.1177/01466216241310602
Shan Huang, Hidetoki Ishii

Many studies on differential item functioning (DIF) detection rely on single detection methods (SDMs), each of which necessitates specific assumptions that may not always be validated. Using an inappropriate SDM can lead to diminished accuracy in DIF detection. To address this limitation, a novel multi-detector combination (MDC) approach is proposed. Unlike SDMs, MDC effectively evaluates the relevance of different SDMs under various test conditions and integrates them using supervised learning, thereby mitigating the risk associated with selecting a suboptimal SDM for DIF detection. This study aimed to validate the accuracy of the MDC approach by applying five types of SDMs and four distinct supervised learning methods in MDC modeling. Model performance was assessed using the area under the curve (AUC), which provided a comprehensive measure of the ability of the model to distinguish between classes across all threshold levels, with higher AUC values indicating higher accuracy. The MDC methods consistently achieved higher average AUC values compared to SDMs in both matched test sets (where test conditions align with the training set) and unmatched test sets. Furthermore, MDC outperformed all SDMs under each test condition. These findings indicated that MDC is highly accurate and robust across diverse test conditions, establishing it as a viable method for practical DIF detection.
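
For readers unfamiliar with the evaluation metric, the area under the ROC curve can be computed directly as the proportion of (DIF item, non-DIF item) pairs in which the DIF item receives the higher score, with ties counted as one half. The sketch below shows that pairwise definition only; it does not reproduce the MDC approach, and the scores and labels are hypothetical.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores : detector outputs, larger meaning "more likely DIF"
    labels : 1 for items that truly have DIF, 0 otherwise
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one item of each class")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # correctly ordered pair
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Hypothetical detector scores for four items, two with DIF.
    print(auc([0.9, 0.4, 0.6, 0.2], [1, 1, 0, 0]))
```

Because the value depends only on how items are ranked, it summarizes performance across all decision thresholds at once, which is the property the abstract relies on when comparing detectors.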

Citations: 0