Detecting Rater Bias in Mixed-Format Assessments
Stefanie A. Wind, Yuan Ge
Pub Date: 2024-02-20 | DOI: 10.1080/15366367.2023.2173468
Mixed-format assessments made up of multiple-choice (MC) items and constructed response (CR) items that are scored using rater judgments include unique psychometric considerations. When these item ...
{"title":"Detecting Rater Bias in Mixed-Format Assessments","authors":"Stefanie A. Wind, Yuan Ge","doi":"10.1080/15366367.2023.2173468","DOIUrl":"https://doi.org/10.1080/15366367.2023.2173468","url":null,"abstract":"Mixed-format assessments made up of multiple-choice (MC) items and constructed response (CR) items that are scored using rater judgments include unique psychometric considerations. When these item ...","PeriodicalId":46596,"journal":{"name":"Measurement-Interdisciplinary Research and Perspectives","volume":null,"pages":null},"PeriodicalIF":1.0,"publicationDate":"2024-02-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139910185","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Exploring Construct Measures Using Rasch Models and Discretization Methods to Analyze Existing Continuous Data
Chen Qiu, Michael R. Peabody, Kelly D. Bradley
Pub Date: 2024-02-20 | DOI: 10.1080/15366367.2023.2210358
It is meaningful to create a comprehensive score to extract information from mass continuous data when they measure the same latent concept. Therefore, this study adopts the logic of psychometrics ... (See the illustrative discretization sketch below.)
{"title":"Exploring Construct Measures Using Rasch Models and Discretization Methods to Analyze Existing Continuous Data","authors":"Chen Qiu, Michael R. Peabody, Kelly D. Bradley","doi":"10.1080/15366367.2023.2210358","DOIUrl":"https://doi.org/10.1080/15366367.2023.2210358","url":null,"abstract":"It is meaningful to create a comprehensive score to extract information from mass continuous data when they measure the same latent concept. Therefore, this study adopts the logic of psychometrics ...","PeriodicalId":46596,"journal":{"name":"Measurement-Interdisciplinary Research and Perspectives","volume":null,"pages":null},"PeriodicalIF":1.0,"publicationDate":"2024-02-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139910226","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
An Examination of the Linking Error Currently Used in PISA
Alexander Robitzsch, Oliver Lüdtke
Pub Date: 2024-02-20 | DOI: 10.1080/15366367.2023.2198915
Educational large-scale assessment (LSA) studies like the Programme for International Student Assessment (PISA) provide important information about trends in the performance of educational indicators...
{"title":"An Examination of the Linking Error Currently Used in PISA","authors":"Alexander Robitzsch, Oliver Lüdtke","doi":"10.1080/15366367.2023.2198915","DOIUrl":"https://doi.org/10.1080/15366367.2023.2198915","url":null,"abstract":"Educational large-scale assessment (LSA) studies like the program for international student assessment (PISA) provide important information about trends in the performance of educational indicators...","PeriodicalId":46596,"journal":{"name":"Measurement-Interdisciplinary Research and Perspectives","volume":null,"pages":null},"PeriodicalIF":1.0,"publicationDate":"2024-02-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139948159","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Choosing Between the Bi-Factor and Second-Order Factor Models: A Direct Test Using Latent Variable Modeling
Tenko Raykov, Lisa Calvocoressi, Randall E. Schumacker
Pub Date: 2024-02-20 | DOI: 10.1080/15366367.2023.2173547
This paper is concerned with the process of selecting between the increasingly popular bi-factor model and the second-order factor model in measurement research. It is indicated that in certain set...
{"title":"Choosing Between the Bi-Factor and Second-Order Factor Models: A Direct Test Using Latent Variable Modeling","authors":"Tenko Raykov, Lisa Calvocoressi, Randall E. Schumacker","doi":"10.1080/15366367.2023.2173547","DOIUrl":"https://doi.org/10.1080/15366367.2023.2173547","url":null,"abstract":"This paper is concerned with the process of selecting between the increasingly popular bi-factor model and the second-order factor model in measurement research. It is indicated that in certain set...","PeriodicalId":46596,"journal":{"name":"Measurement-Interdisciplinary Research and Perspectives","volume":null,"pages":null},"PeriodicalIF":1.0,"publicationDate":"2024-02-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139910283","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Evaluating Cronbach’s Coefficient Alpha and Testing Its Identity to Scale Reliability: A Direct Bayesian Confirmatory Factor Analysis Procedure
Tenko Raykov, George Marcoulides, James Anthony, Natalja Menold
Pub Date: 2024-02-20 | DOI: 10.1080/15366367.2023.2201963
A Bayesian statistics-based approach is discussed that can be used for direct evaluation of the popular Cronbach’s coefficient alpha as an internal consistency index for multiple-component measurin... (See the illustrative coefficient alpha sketch below.)
{"title":"Evaluating Cronbach’s Coefficient Alpha and Testing Its Identity to Scale Reliability: A Direct Bayesian Confirmatory Factor Analysis Procedure","authors":"Tenko Raykov, George Marcoulides, James Anthony, Natalja Menold","doi":"10.1080/15366367.2023.2201963","DOIUrl":"https://doi.org/10.1080/15366367.2023.2201963","url":null,"abstract":"A Bayesian statistics-based approach is discussed that can be used for direct evaluation of the popular Cronbach’s coefficient alpha as an internal consistency index for multiple-component measurin...","PeriodicalId":46596,"journal":{"name":"Measurement-Interdisciplinary Research and Perspectives","volume":null,"pages":null},"PeriodicalIF":1.0,"publicationDate":"2024-02-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139910380","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Closing Reporting Gaps: A Comparison of Methods for Estimating Unreported Subgroup Achievement on NAEP
David Bamat
Pub Date: 2024-02-20 | DOI: 10.1080/15366367.2023.2173467
The National Assessment of Educational Progress (NAEP) program only reports state-level subgroup results if it samples at least 62 students identifying with the subgroup. Since some subgroups const...
{"title":"Closing Reporting Gaps: A Comparison of Methods for Estimating Unreported Subgroup Achievement on NAEP","authors":"David Bamat","doi":"10.1080/15366367.2023.2173467","DOIUrl":"https://doi.org/10.1080/15366367.2023.2173467","url":null,"abstract":"The National Assessment of Educational Progress (NAEP) program only reports state-level subgroup results if it samples at least 62 students identifying with the subgroup. Since some subgroups const...","PeriodicalId":46596,"journal":{"name":"Measurement-Interdisciplinary Research and Perspectives","volume":null,"pages":null},"PeriodicalIF":1.0,"publicationDate":"2024-02-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139979674","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Integration of Historical Data for the Analysis of Multiple Assessment Studies
Katerina M. Marcoulides
Pub Date: 2023-07-03 | DOI: 10.1080/15366367.2022.2115250
Integrative data analyses have recently been shown to be an effective tool for researchers interested in synthesizing datasets from multiple studies in order to draw statistical or substantive conclusions. The actual process of integrating the different datasets depends on the availability of some common measures or items reflecting the same studied constructs. However, exactly how many common items are needed to effectively integrate multiple datasets has to date not been determined. This study evaluated the effect of using different numbers of common items in integrative data analysis applications. The study used simulations based on realistic data integration settings in which the number of common item sets was varied. The results provided insight concerning the optimal number of common item sets needed to safeguard estimation precision. The practical implications of these findings, in view of past research in the psychometric literature concerning the necessary number of common item sets, are also discussed. (See the illustrative common-item linking sketch below.)
{"title":"Integration of Historical Data for the Analysis of Multiple Assessment Studies","authors":"Katerina M. Marcoulides","doi":"10.1080/15366367.2022.2115250","DOIUrl":"https://doi.org/10.1080/15366367.2022.2115250","url":null,"abstract":"ABSTRACT Integrative data analyses have recently been shown to be an effective tool for researchers interested in synthesizing datasets from multiple studies in order to draw statistical or substantive conclusions. The actual process of integrating the different datasets depends on the availability of some common measures or items reflecting the same studied constructs. However, exactly how many common items are needed to effectively integrate multiple datasets has to date not been determined. This study evaluated the effect of using different numbers of common items in integrative data analysis applications. The study used simulations based on realistic data integration settings in which the number of common item sets was varied. The results provided insight concerning the optimal numbers of common items sets to safeguard estimation precision. The practical implications of these findings in view of past research in the psychometric literature concerning the necessary number of common item sets are also discussed.","PeriodicalId":46596,"journal":{"name":"Measurement-Interdisciplinary Research and Perspectives","volume":null,"pages":null},"PeriodicalIF":1.0,"publicationDate":"2023-07-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82907670","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Person Misfit and Person Reliability in Rating Scale Measures: The Role of Response Styles
Tongtong Zou, D. Bolt
Pub Date: 2023-07-03 | DOI: 10.1080/15366367.2022.2114243
Person misfit and person reliability indices in item response theory (IRT) can play an important role in evaluating the validity of a test or survey instrument at the respondent level. Prior empirical comparisons of these indices have been applied to binary item response data and suggest that the two types of indices return very similar results. In this paper, however, we demonstrate an important applied distinction between these methods when applied to polytomously scored rating scale items, namely their varying sensitivities to response style tendencies. Using several empirical datasets, we illustrate settings in which these indices are in one case highly correlated and in two other cases weakly correlated. In the datasets showing a weak correlation between indices, the primary distinction appears to be due to the effects of response style behavior: respondents whose response styles are less common (e.g., a disproportionate selection of the midpoint response) are found to misfit according to Drasgow et al.'s person misfit index but often show high levels of reliability from a person reliability perspective; just the opposite frequently occurs for respondents who over-select the rating scale extremes. It is suggested that person misfit reporting should be supplemented with an evaluation of person reliability to best understand the validity of measurement at the respondent level when using IRT models with rating scale measures. (See the illustrative person-fit sketch below.)
{"title":"Person Misfit and Person Reliability in Rating Scale Measures: The Role of Response Styles","authors":"Tongtong Zou, D. Bolt","doi":"10.1080/15366367.2022.2114243","DOIUrl":"https://doi.org/10.1080/15366367.2022.2114243","url":null,"abstract":"ABSTRACT Person misfit and person reliability indices in item response theory (IRT) can play an important role in evaluating the validity of a test or survey instrument at the respondent level. Prior empirical comparisons of these indices have been applied to binary item response data and suggest that the two types of indices return very similar results. In this paper, however, we demonstrate an important applied distinction between these methods when applied to polytomously-scored rating scale items, namely their varying sensitivities to response style tendencies. Using several empirical datasets, we illustrate settings in which these indices are in one case highly correlated and in two other cases weakly correlated. In the datasets showing a weak correlation between indices, the primary distinction appears due to the effects of response style behavior, whereby respondents whose response styles are less common (e.g. a disproportionate selection of the midpoint response) are found to misfit using Drasgow et al’s person misfit index, but often show high levels of reliability from a person reliability perspective; just the opposite frequently occurs for respondents that over-select the rating scale extremes. It is suggested that person misfit reporting should be supplemented with an evaluation of person reliability to best understand the validity of measurement at the respondent level when using IRT models with rating scale measures.","PeriodicalId":46596,"journal":{"name":"Measurement-Interdisciplinary Research and Perspectives","volume":null,"pages":null},"PeriodicalIF":1.0,"publicationDate":"2023-07-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73861512","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Functional Data Analysis and Person Response Functions
Kyle T. Turner, G. Engelhard
Pub Date: 2023-07-03 | DOI: 10.1080/15366367.2022.2054130
The purpose of this study is to illustrate the use of functional data analysis (FDA) as a general methodology for analyzing person response functions (PRFs). Applications of FDA to psychometrics have included the estimation of item response functions and latent distributions, as well as differential item functioning. Although FDA has been suggested for modeling PRFs, there has been relatively little research stressing this application. FDA offers an approach for diagnosing person responses that may be due to guessing and other sources of within-person multidimensionality. PRFs provide graphical displays that can be used to highlight unusual response patterns and to identify persons who are not responding as expected to a set of test items. In addition to examining individual PRFs, functional clustering techniques can be used to identify subgroups of persons that may be exhibiting categories of misfit such as guessing. A small simulation study is conducted to illustrate how FDA can be used to identify persons exhibiting different levels of guessing behavior (5%, 10%, 15%, and 20%). The methodology is also applied to real data from a 3rd grade science assessment used in a southeastern state. FDA offers a promising methodology for evaluating whether or not meaningful scores have been obtained for a person. Typical indices of psychometric quality, such as standard errors of measurement and person fit indices, are not sufficient for representing certain types of aberrance in person response patterns. Nonparametric graphical methods for estimating PRFs that are based on FDA provide a rich source of validity evidence regarding the meaning and usefulness of each person's score. (See the illustrative person response function sketch below.)
{"title":"Functional Data Analysis and Person Response Functions","authors":"Kyle T. Turner, G. Engelhard","doi":"10.1080/15366367.2022.2054130","DOIUrl":"https://doi.org/10.1080/15366367.2022.2054130","url":null,"abstract":"ABSTRACT The purpose of this study is to illustrate the use of functional data analysis (FDA) as a general methodology for analyzing person response functions (PRFs). Applications of FDA to psychometrics have included the estimation of item response functions and latent distributions, as well as differential item functioning. Although FDA has been suggested for modeling PRFs, there has been relatively little research stressing this application. FDA offers an approach for diagnosing person responses that may be due to guessing and other sources of within-person multidimensionality. PRFs provide graphical displays that can be used to highlight unusual response patterns, and to identify persons that are not responding as expected to a set of test items. In addition to examining individual PRFs, functional clustering techniques can be used to identify subgroups of persons that may be exhibiting categories of misfit such as guessing. A small simulation study is conducted to illustrate how FDA can be used to identify persons exhibiting different levels of guessing behavior (5%, 10%, 15% and 20%). The methodology is also applied to real data from a 3rd grade science assessment used in a southeastern state. FDA offers a promising methodology for evaluating whether or not meaningful scores have been obtained for a person. Typical indices of psychometric quality, such as standard errors of measurement and person fit indices, are not sufficient for representing certain types of aberrance in person response patterns. Nonparametric graphical methods for estimating PRFs that are based FDA provide a rich source of validity evidence regarding the meaning and usefulness of each person’s score.","PeriodicalId":46596,"journal":{"name":"Measurement-Interdisciplinary Research and Perspectives","volume":null,"pages":null},"PeriodicalIF":1.0,"publicationDate":"2023-07-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79466782","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Concerto Software for Computerized Adaptive Testing – Free Version
Yiling Cheng
Pub Date: 2023-07-03 | DOI: 10.1080/15366367.2023.2187274
Computerized adaptive testing (CAT) offers an efficient and highly accurate method for estimating examinees' abilities. This article reviews the free version of the Concerto software for CAT, organizing the evaluation into three sections: software implementation, the Item Response Theory (IRT) features of CAT, and user experience. Overall, Concerto is an excellent tool for researchers seeking to create computerized adaptive (or non-adaptive) tests, providing a robust platform with comprehensive IRT capabilities and a user-friendly interface. (See the illustrative item-selection sketch below.)
{"title":"Concerto Software for Computerized Adaptive Testing – Free Version","authors":"Yiling Cheng","doi":"10.1080/15366367.2023.2187274","DOIUrl":"https://doi.org/10.1080/15366367.2023.2187274","url":null,"abstract":"ABSTRACT Computerized adaptive testing (CAT) offers an efficient and highly accurate method for estimating examinees' abilities. In this article, the free version of Concerto Software for CAT was reviewed, dividing our evaluation into three sections: software implementation, the Item Response Theory (IRT) features of CAT, and user experience. Overall, Concerto is an excellent tool for researchers seeking to create computerized adaptive (or non-adaptive) tests, providing a robust platform with comprehensive IRT capabilities and a user-friendly interface.","PeriodicalId":46596,"journal":{"name":"Measurement-Interdisciplinary Research and Perspectives","volume":null,"pages":null},"PeriodicalIF":1.0,"publicationDate":"2023-07-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84024894","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}