
Psychometrika: Latest Articles

A Latent Markov Model for Noninvariant Measurements: An Application to Interaction Log Data From Computer-Interactive Assessments.
IF 3.1 | Zone 2 (Psychology) | Q1 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS | Pub Date: 2025-09-01 | Epub Date: 2025-08-26 | DOI: 10.1017/psy.2025.10029
Hyeon-Ah Kang

The latent Markov model (LMM) has been increasingly used to analyze log data from computer-interactive assessments. An important consideration in applying the LMM to assessment data is the measurement effects of items. In educational and psychological assessment, items exhibit distinct psychometric qualities and introduce systematic variance into assessment outcome data. Current LMM developments, however, assume that items have uniform effects and do not contribute to the variance of measurement outcomes. In this study, we propose a refinement of the LMM that relaxes the measurement invariance constraint and examine the empirical performance of the new framework through numerical experimentation. We modify the LMM for noninvariant measurements and refine the inferential scheme to accommodate event-specific measurement effects. Numerical experiments are conducted to validate the proposed inference methods and evaluate the performance of the new framework. Results suggest that the proposed inferential scheme performs adequately in recovering the model parameters and state profiles. The new LMM framework demonstrated reliable and stable performance in modeling latent processes while appropriately accounting for items' measurement effects. Compared with the traditional scheme, the refined framework demonstrated greater relevance to real assessment data and yielded more robust inference results when the model was misspecified. The findings from the empirical evaluations suggest that the new framework has potential for serving large-scale assessment data that exhibit distinct measurement effects.
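
To make the idea of item-specific (noninvariant) measurement effects concrete, the following is a minimal sketch of the forward recursion for a two-state latent Markov model in which each item has its own emission probability. It is not the model or the inferential scheme from the article: the logistic emission form, the function names (`success_prob`, `emissions`, `forward_likelihood`), and all parameter values are assumptions chosen for illustration.

```python
import math
from typing import List, Sequence

# Illustrative (assumed) parameter values, not taken from the article.
INIT = [0.5, 0.5]                    # P(first latent state = s)
TRANS = [[0.9, 0.1], [0.2, 0.8]]     # TRANS[r][s] = P(next state = s | current = r)
STATE_EFFECTS = [-1.0, 1.0]          # one effect per latent state
ITEM_INTERCEPTS = [0.5, -0.5, 0.0]   # one intercept per item (noninvariant items)


def success_prob(state_effect: float, item_intercept: float) -> float:
    """Hypothetical logistic emission: state effect plus item-specific intercept."""
    return 1.0 / (1.0 + math.exp(-(state_effect + item_intercept)))


def emissions(responses: Sequence[int],
              state_effects: Sequence[float],
              item_intercepts: Sequence[float]) -> List[List[float]]:
    """emit[t][s] = P(observed response at event t | latent state s)."""
    rows = []
    for y, b in zip(responses, item_intercepts):
        probs = [success_prob(a, b) for a in state_effects]
        rows.append([p if y == 1 else 1.0 - p for p in probs])
    return rows


def forward_likelihood(init: Sequence[float],
                       trans: Sequence[Sequence[float]],
                       emit: Sequence[Sequence[float]]) -> float:
    """Likelihood of one response sequence via the forward recursion."""
    n = len(init)
    alpha = [init[s] * emit[0][s] for s in range(n)]
    for t in range(1, len(emit)):
        alpha = [emit[t][s] * sum(alpha[r] * trans[r][s] for r in range(n))
                 for s in range(n)]
    return sum(alpha)


likelihood = forward_likelihood(
    INIT, TRANS, emissions([1, 0, 1], STATE_EFFECTS, ITEM_INTERCEPTS))
```

Setting every entry of `ITEM_INTERCEPTS` to the same value gives the invariant special case, in which items no longer contribute distinct measurement effects.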

Journal: Psychometrika | Pages: 1481-1505 | PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12660023/pdf/
Citations: 0
Bayesian Transition Diagnostic Classification Models with Polya-Gamma Augmentation.
IF 3.1 | Zone 2 (Psychology) | Q1 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS | Pub Date: 2025-09-01 | Epub Date: 2025-08-08 | DOI: 10.1017/psy.2025.10031
Joseph Resch, Samuel Baugh, Hao Duan, James Tang, Matthew J Madison, Michael Cotterell, Minjeong Jeon

Diagnostic classification models assume the existence of latent attribute profiles, the possession of which increases the probability of responding correctly to questions requiring the corresponding attributes. Through the use of longitudinally administered exams, the degree to which students are acquiring core attributes over time can be assessed. While past approaches to longitudinal diagnostic classification modeling perform inference on the overall probability of acquiring particular attributes, there is particular interest in the relationship between student progression and student covariates such as intervention effects. To address this need, we propose an integrated Bayesian model for student progression in a longitudinal diagnostic classification modeling framework. Using Pólya-gamma augmentation with two logistic link functions, we achieve computationally efficient posterior estimation with a conditionally Gibbs sampling procedure. We show that this approach achieves accurate parameter recovery when evaluated using simulated data. We also demonstrate the method on a real-world educational testing data set.
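
For readers unfamiliar with the augmentation mentioned above, the block below states the standard identity behind Pólya-gamma augmentation for a generic logistic likelihood. It is background only and not the specific transition model of the article; the symbols $\psi$, $\omega$, $\kappa$, $X$, $\beta$, $b_0$, and $B_0$ are notation assumed here.

```latex
% Standard Polya-gamma identity, with p(\omega) the density of PG(b, 0):
\[
  \frac{(e^{\psi})^{a}}{(1 + e^{\psi})^{b}}
  = 2^{-b} e^{\kappa \psi} \int_{0}^{\infty} e^{-\omega \psi^{2}/2}\, p(\omega)\, d\omega,
  \qquad \kappa = a - \tfrac{b}{2}.
\]
% For Bernoulli responses y_i with linear predictor \psi_i = x_i^\top \beta
% (so a = y_i, b = 1, \kappa_i = y_i - 1/2) and prior \beta \sim N(b_0, B_0),
% the two Gibbs conditionals are
\[
  \omega_i \mid \beta \sim \mathrm{PG}(1,\, x_i^\top \beta), \qquad
  \beta \mid \omega, y \sim N(m, V),
\]
\[
  V = \left( X^\top \Omega X + B_0^{-1} \right)^{-1}, \qquad
  m = V \left( X^\top \kappa + B_0^{-1} b_0 \right), \qquad
  \Omega = \mathrm{diag}(\omega_1, \dots, \omega_n).
\]
```

Because both conditionals are available in closed form, each logistic link can be updated by a Gibbs step without Metropolis tuning, which is the usual source of the computational efficiency attributed to this augmentation.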

Journal: Psychometrika | Pages: 1368-1399 | PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12660026/pdf/
Citations: 0
Joint Item Response Models for Manual and Automatic Scores on Open-Ended Test Items.
IF 3.1 | Zone 2 (Psychology) | Q1 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS | Pub Date: 2025-09-01 | Epub Date: 2025-06-16 | DOI: 10.1017/psy.2025.10018
Daniel Bengs, Ulf Brefeld, Ulf Kroehne, Fabian Zehner

Test items using open-ended response formats can increase an instrument's construct validity. However, traditionally, their application in educational testing requires human coders to score the responses. Manual scoring not only increases operational costs but also prohibits the use of evidence from open-ended items to inform routing decisions in adaptive designs. Using machine learning and natural language processing, automatic scoring provides classifiers that can instantly assign scores to text responses. Although optimized for agreement with manual scores, automatic scoring is not perfectly accurate and introduces an additional source of error into the response process, leading to a misspecification of the measurement model used with the manual score. We propose two joint models for manual and automatic scores of automatically scored open-ended items. Our models extend a given model from Item Response Theory for the manual scores by a component for the automatic scores, accounting for classification errors. The models were evaluated using data from the Programme for International Student Assessment (2012) and simulated data, demonstrating their capacity to mitigate the impact of classification errors on ability estimation compared to a baseline that disregards classification errors.
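
One simple way to picture how classification errors enter the response process is to pass the manual-score probability through a classifier's sensitivity and specificity. The sketch below does this for a Rasch-type item; it is an assumed illustration, not either of the two joint models proposed in the article, and the names `p_manual_correct` and `p_auto_correct` are hypothetical.

```python
import math


def p_manual_correct(theta: float, difficulty: float) -> float:
    """Rasch-type probability that the manual score is 1 (assumed base model)."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))


def p_auto_correct(theta: float, difficulty: float,
                   sensitivity: float, specificity: float) -> float:
    """Probability that the automatic score is 1.

    sensitivity = P(auto = 1 | manual = 1)
    specificity = P(auto = 0 | manual = 0)
    """
    p = p_manual_correct(theta, difficulty)
    return sensitivity * p + (1.0 - specificity) * (1.0 - p)


# With theta equal to the item difficulty the manual probability is 0.5,
# so the automatic-score probability is 0.9 * 0.5 + 0.2 * 0.5.
example = p_auto_correct(theta=0.0, difficulty=0.0,
                         sensitivity=0.9, specificity=0.8)
```

With a perfect classifier (sensitivity and specificity both 1) the automatic-score curve coincides with the manual one; otherwise it is flattened, which is why ignoring classification error can distort ability estimates.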

Journal: Psychometrika | Pages: 1346-1367 | PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12660020/pdf/
Citations: 0
Evidence Factors in Fuzzy Regression Discontinuity Designs with Sequential Treatment Assignments.
IF 3.1 | Zone 2 (Psychology) | Q1 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS | Pub Date: 2025-09-01 | Epub Date: 2025-08-08 | DOI: 10.1017/psy.2025.10033
Youjin Lee, Youmi Suk

Observational studies often involve multiple levels of treatment assignment. In particular, fuzzy regression discontinuity (RD) designs have sequential treatment assignment processes: first based on eligibility criteria, and second, on (non-)compliance rules. In such fuzzy RD designs, researchers typically use either an intent-to-treat approach or an instrumental variable-type approach, and each is subject to both overlapping and unique biases. This article proposes a new evidence factors (EFs) framework for fuzzy RD designs with sequential treatment assignments, which may be influenced by different levels of decision-makers. Each of the proposed EFs aims to test the same causal null hypothesis while potentially being subject to different types of biases. Our proposed framework uses local RD randomization and randomization-based inference. We evaluate the effectiveness of our proposed framework through simulation studies and two real datasets on pre-kindergarten programs and testing accommodations.
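
Randomization-based inference, mentioned above, can be sketched with a plain permutation test on units assumed to lie in a narrow window around the cutoff. This is a generic Fisher-style test under a sharp null, not the evidence factors construction of the article; the function name `randomization_p_value`, the difference-in-means statistic, and the toy data are assumptions.

```python
import random
from statistics import mean
from typing import Sequence


def randomization_p_value(outcomes: Sequence[float],
                          treated: Sequence[int],
                          n_draws: int = 2000,
                          seed: int = 0) -> float:
    """Two-sided Monte Carlo p-value for the sharp null of no effect.

    Assignments are re-drawn by shuffling the observed labels, which keeps
    the number of treated units fixed (an assumed assignment mechanism).
    """
    rng = random.Random(seed)

    def diff_in_means(assign: Sequence[int]) -> float:
        t = [y for y, z in zip(outcomes, assign) if z == 1]
        c = [y for y, z in zip(outcomes, assign) if z == 0]
        return mean(t) - mean(c)

    observed = abs(diff_in_means(treated))
    labels = list(treated)
    hits = 0
    for _ in range(n_draws):
        rng.shuffle(labels)
        if abs(diff_in_means(labels)) >= observed - 1e-12:
            hits += 1
    # The +1 terms count the observed assignment itself.
    return (hits + 1) / (n_draws + 1)


# Toy data for units near the cutoff (illustrative values only).
p_value = randomization_p_value(
    outcomes=[10, 11, 12, 13, 1, 2, 3, 4],
    treated=[1, 1, 1, 1, 0, 0, 0, 0],
)
```

An evidence factors analysis would combine several such tests that are vulnerable to different biases; the sketch shows only a single test.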

Journal: Psychometrika | Pages: 1400-1418 | PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12660022/pdf/
Citations: 0
Bayesian Structural Equation Envelope Model.
IF 3.1 | Zone 2 (Psychology) | Q1 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS | Pub Date: 2025-09-01 | Epub Date: 2025-08-08 | DOI: 10.1017/psy.2025.10027
Chuchu Wang, Rongqian Sun, Xiangnan Feng, Xinyuan Song

The envelope model has gained significant attention since its proposal, offering a fresh perspective on dimension reduction in multivariate regression models and improving estimation efficiency. One of its appealing features is its adaptability to diverse regression contexts. This article introduces the integration of envelope methods into the factor analysis model. In contrast to previous research primarily focused on the frequentist approach, the study proposes a Bayesian approach for estimation and envelope dimension selection. A Metropolis-within-Gibbs sampling algorithm is developed to draw posterior samples for Bayesian inference. A simulation study is conducted to illustrate the effectiveness of the proposed method. Additionally, the proposed methodology is applied to the ADNI dataset to explore the relationship between cognitive decline and the changes occurring in various brain regions. This empirical application further highlights the practical utility of the proposed model in real-world scenarios.

Journal: Psychometrika | Pages: 1236-1257 | PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12660024/pdf/
Citations: 0
A Beta Mixture Model for Careless Respondent Detection in Visual Analogue Scale Data.
IF 3.1 | Zone 2 (Psychology) | Q1 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS | Pub Date: 2025-09-01 | Epub Date: 2025-09-23 | DOI: 10.1017/psy.2025.10041
Lijin Zhang, Benjamin W Domingue, Leonie V D E Vogelsmeier, Esther Ulitzsch

Visual analogue scales (VASs) are increasingly popular in psychological, social, and medical research. However, VASs can also be more demanding for respondents, potentially leading to quicker disengagement and a higher risk of careless responding. Existing mixture modeling approaches for careless response detection have so far only been available for Likert-type and unbounded continuous data and have not been tailored to VAS data. This study introduces and evaluates a model-based approach specifically designed to detect and account for careless respondents in VAS data. We integrate existing measurement models for VASs with mixture item response theory models for identifying and modeling careless responding. Simulation results show that the proposed model effectively detects careless responding and recovers key parameters. We illustrate the model's potential for identifying and accounting for careless responding using real data from both VASs and Likert scales. First, we show how the model can be used to compare careless responding across different scale types, revealing a higher proportion of careless respondents in VAS data than in Likert scale data. Second, we demonstrate that item parameters from the proposed model exhibit improved psychometric properties compared to those from a model that ignores careless responding. These findings underscore the model's potential to enhance data quality by identifying and addressing careless responding.
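
The mixture logic behind careless-respondent detection can be illustrated with a two-class toy model for responses rescaled to the open interval (0, 1). The sketch assumes attentive responses follow a single Beta(a, b) distribution and careless responses are uniform; both choices, and the names `beta_pdf` and `careless_posterior`, are assumptions for illustration rather than the article's specification.

```python
import math
from typing import Sequence


def beta_pdf(x: float, a: float, b: float) -> float:
    """Density of Beta(a, b) at x, for 0 < x < 1."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1.0) * math.log(x)
                    + (b - 1.0) * math.log(1.0 - x))


def careless_posterior(responses: Sequence[float], a: float, b: float,
                       prior_careless: float) -> float:
    """Posterior probability that a respondent belongs to the careless class.

    Attentive class: independent Beta(a, b) responses (assumed).
    Careless class: independent Uniform(0, 1) responses, density 1 (assumed).
    """
    log_attentive = sum(math.log(beta_pdf(x, a, b)) for x in responses)
    careless_term = prior_careless  # uniform likelihood equals 1
    attentive_term = (1.0 - prior_careless) * math.exp(log_attentive)
    return careless_term / (careless_term + attentive_term)


# Responses clustered where Beta(8, 2) has most of its mass look attentive;
# scattered responses look careless under these assumed parameters.
consistent = careless_posterior([0.85, 0.80, 0.90], a=8.0, b=2.0,
                                prior_careless=0.1)
scattered = careless_posterior([0.10, 0.50, 0.30], a=8.0, b=2.0,
                               prior_careless=0.1)
```

In a full model the class-specific parameters and the mixing proportion would be estimated jointly from the data rather than fixed as they are here.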

Journal: Psychometrika | Pages: 1558-1581 | PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12672952/pdf/
Citations: 0
Teamwork Cognitive Diagnostic Modeling.
IF 3.1 | Zone 2 (Psychology) | Q1 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS | Pub Date: 2025-09-01 | Epub Date: 2025-08-08 | DOI: 10.1017/psy.2025.10036
Peida Zhan, Zhimou Wang, Gaohong Chu, Haixin Qiao

Teamwork relies on collaboration to achieve goals that exceed individual capabilities, with team cognition playing a key role by integrating individual expertise and shared understanding. Identifying the causes of inefficiencies or poor team performance is critical for implementing targeted interventions and fostering the development of team cognition. This study proposes a teamwork cognitive diagnostic modeling framework comprising 12 specific models, collectively referred to as Team-CDMs. The models are designed to capture the interdependence among team members through emergent team cognitions by jointly modeling individual cognitive attributes and a team-level construct, termed teamwork quality, which reflects the social dimension of collaboration. The models can be used to identify strengths and weaknesses in team cognition and determine whether poor performance arises from cognitive deficiencies or social issues. Two simulation studies were conducted to assess the psychometric properties of the models under diverse conditions, followed by a teamwork reasoning task to demonstrate their application. The results showed that Team-CDMs achieve robust parameter estimation, effectively diagnose individual attributes, and assess teamwork quality while pinpointing the causes of poor performance. These findings underscore the utility of Team-CDMs in understanding, diagnosing, and improving team cognition, offering a foundation for future research and practical applications in teamwork-based assessments.

Journal: Psychometrika | Pages: 1319-1345 | PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12659997/pdf/
Citations: 0
Analysis of Log Data From an International Online Educational Assessment System: A Multi-State Survival Modeling Approach to Reaction Time Between and Across Action Sequence.
IF 3.1 | Zone 2 (Psychology) | Q1 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS | Pub Date: 2025-09-01 | DOI: 10.1017/psy.2025.10043
Jina Park, Ick Hoon Jin, Minjeong Jeon

With computer-based and online assessments increasingly available, researchers have shown keen interest in analyzing log data to improve our understanding of test takers' problem-solving processes. In this article, we propose applying a multi-state survival model (MSM) to action sequence data from log files, focusing on modeling test takers' reaction times between actions, in order to investigate which factors influence test takers' transition speed between actions and how they do so. We specifically identify the key actions that differentiate correct and incorrect answers, compare transition probabilities between these groups, and analyze their distinct problem-solving patterns. Through simulation studies and sensitivity analyses, we evaluate the robustness of our proposed model. We demonstrate the proposed approach using problem-solving items from the Programme for the International Assessment of Adult Competencies (PIAAC).
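
As a rough picture of what a multi-state view of log data involves, the sketch below estimates transition rates between actions from one timestamped action sequence, assuming a constant hazard for each transition (rate = number of transitions divided by total time spent in the origin action). The constant-hazard assumption, the function name `transition_rates`, and the toy log are all illustrative; the article's model is richer than this.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def transition_rates(
    sequence: List[Tuple[str, float]]
) -> Dict[Tuple[str, str], float]:
    """Estimate constant transition hazards from (action, timestamp) pairs.

    The sequence must be sorted by timestamp. For each origin action, the
    time until the next action counts as exposure time in that state.
    """
    counts: Dict[Tuple[str, str], int] = defaultdict(int)
    exposure: Dict[str, float] = defaultdict(float)
    for (a0, t0), (a1, t1) in zip(sequence, sequence[1:]):
        counts[(a0, a1)] += 1
        exposure[a0] += t1 - t0
    return {
        pair: n / exposure[pair[0]]
        for pair, n in counts.items()
        if exposure[pair[0]] > 0
    }


# Toy log (seconds since item onset); action names are hypothetical.
log = [("start", 0.0), ("click", 2.0), ("start", 3.0),
       ("click", 7.0), ("submit", 8.0)]
rates = transition_rates(log)
```

Here the test taker spends 6 seconds in `start` before two `start` to `click` transitions, so that rate is 2/6 per second; comparing such rates between correct and incorrect responders is one simple way to look for differences in transition speed.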

Journal: Psychometrika | Pages: 1506-1535 | PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12660000/pdf/
Citations: 0
Bayesian Nonparametric Models for Multiple Raters: A General Statistical Framework.
IF 3.1 Zone 2 (Psychology) Q1 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS Pub Date : 2025-09-01 Epub Date: 2025-08-11 DOI: 10.1017/psy.2025.10035
Giuseppe Mignemi, Ioanna Manolopoulou

Rating procedure is crucial in many applied fields (e.g., educational, clinical, emergency). In these contexts, a rater (e.g., teacher, doctor) scores a subject (e.g., student, doctor) on a rating scale. Given raters' variability, several statistical methods have been proposed for assessing and improving the quality of ratings. The analysis and the estimate of the Intraclass Correlation Coefficient (ICC) are major concerns in such cases. As evidenced by the literature, ICC might differ across different subgroups of raters and might be affected by contextual factors and subject heterogeneity. Model estimation in the presence of heterogeneity has been one of the recent challenges in this research line. Consequently, several methods have been proposed to address this issue under a parametric multilevel modelling framework, in which strong distributional assumptions are made. We propose a more flexible model under the Bayesian nonparametric (BNP) framework, in which most of those assumptions are relaxed. By eliciting hierarchical discrete nonparametric priors, the model accommodates clusters among raters and subjects, naturally accounts for heterogeneity, and improves estimates' accuracy. We propose a general BNP heteroscedastic framework to analyze continuous and coarse rating data and possible latent differences among subjects and raters. The estimated densities are used to make inferences about the rating process and the quality of the ratings. By exploiting a stick-breaking representation of the discrete nonparametric priors, a general class of ICC indices might be derived for these models. Our method allows us to independently identify latent similarities between subjects and raters and can be applied in precise education to improve personalized teaching programs or interventions. Theoretical results about the ICC are provided together with computational strategies. Simulations and a real-world application are presented, and possible future directions are discussed.

Psychometrika, pp. 1445-1480. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12660027/pdf/
Citations: 0
A New Fit Assessment Framework for Common Factor Models Using Generalized Residuals.
IF 3.1 Zone 2 (Psychology) Q1 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS Pub Date : 2025-09-01 Epub Date: 2025-08-07 DOI: 10.1017/psy.2025.10037
Youjin Sung, Youngjin Han, Yang Liu

Assessing fit in common factor models solely through the lens of mean and covariance structures, as is commonly done with conventional goodness-of-fit (GOF) assessments, may overlook critical aspects of misfit, potentially leading to misleading conclusions. To achieve more flexible fit assessment, we extend the theory of generalized residuals (Haberman & Sinharay, 2013), originally developed for models with categorical data, to encompass more general measurement models. Within this extended framework, we propose several fit test statistics designed to evaluate various parametric assumptions involved in common factor models. The examples include assessing the distributional assumptions of latent variables and the functional form assumptions of individual manifest variables. The performance of the proposed statistics is examined through simulation studies and an empirical data analysis. Our findings suggest that generalized residuals are promising tools for detecting misfit in measurement models, often masked when assessed by conventional GOF testing methods.

Psychometrika, pp. 1419-1444. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12660002/pdf/
Citations: 0