
Journal of Educational and Behavioral Statistics: Latest Articles

Two Statistical Tests for the Detection of Item Compromise
IF 2.4, Zone 3 (Psychology), Q2 EDUCATION & EDUCATIONAL RESEARCH, Pub Date: 2022-05-11, DOI: 10.3102/10769986221094789
W. van der Linden
Two independent statistical tests of item compromise are presented, one based on the test takers’ responses and the other on their response times (RTs) on the same items. The tests can be used to monitor an item in real time during online continuous testing but are also applicable as part of post hoc forensic analysis. The two test statistics are simple, intuitive quantities: the sums of the responses and of the RTs observed for the test takers on the item. Common features of the tests are ease of interpretation and computational simplicity. Both tests are uniformly most powerful under the assumption of known ability and speed parameters for the test takers. Examples of power functions for items with realistic parameter values suggest maximum power for 20–30 test takers with item preknowledge for the response-based test and 10–20 test takers for the RT-based test.
Vol. 47, pp. 485–504. Citations: 1
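The abstract above describes a response-based statistic that is simply the sum of the test takers' responses on one item, evaluated with ability parameters treated as known. A minimal sketch of how such a sum could be referred to a null distribution follows. It assumes (this is not stated in the abstract) that, without compromise, each response is an independent Bernoulli trial whose success probability comes from some fitted item response model; the probabilities passed in are hypothetical inputs, and the function names are illustrative only. Under that assumption the sum follows a Poisson-binomial distribution, and an unusually large sum yields a small upper-tail p-value. This is not necessarily the procedure used in the article.

```python
from typing import List, Sequence


def poisson_binomial_pmf(probs: Sequence[float]) -> List[float]:
    """PMF of the number of successes among independent Bernoulli trials.

    probs[j] is the (assumed known) probability that test taker j answers
    the item correctly when the item is NOT compromised.
    """
    pmf = [1.0]  # distribution of the sum over zero test takers
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p)   # this test taker answers incorrectly
            nxt[k + 1] += mass * p       # this test taker answers correctly
        pmf = nxt
    return pmf


def upper_tail_p_value(probs: Sequence[float], observed_sum: int) -> float:
    """P(sum >= observed_sum) under the no-compromise assumption."""
    pmf = poisson_binomial_pmf(probs)
    return sum(pmf[observed_sum:])


if __name__ == "__main__":
    # Hypothetical success probabilities for five test takers on one item.
    null_probs = [0.35, 0.40, 0.30, 0.45, 0.50]
    print(upper_tail_p_value(null_probs, observed_sum=5))
```

A small p-value would flag the item for review; how to choose the threshold and how to handle estimated rather than known abilities are questions the sketch leaves open.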
A Critical View on the NEAT Equating Design: Statistical Modeling and Identifiability Problems
IF 2.4, Zone 3 (Psychology), Q2 EDUCATION & EDUCATIONAL RESEARCH, Pub Date: 2022-04-29, DOI: 10.3102/10769986221090609
Ernesto San Martín, Jorge González
The nonequivalent groups with anchor test (NEAT) design is widely used in test equating. Under this design, two groups of examinees are administered different test forms, with each test form containing a subset of common items. Because test takers from different groups are assigned only one test form, missing score data emerge by design, rendering some of the score distributions unavailable. The partially observed score data formally lead to an identifiability problem, which has not been recognized as such in the equating literature and has been considered from different perspectives, all of them making different assumptions in order to estimate the unidentified score distributions. In this article, we formally specify the statistical model underlying the NEAT design and unveil the lack of identifiability of the parameters of interest that compose the equating transformation. We use the theory of partial identification to show alternatives to traditional practices that have been proposed to identify the score distributions when conducting equating under the NEAT design.
Vol. 47, pp. 406–437. Citations: 1
Statistical Inference for G-indices of Agreement
IF 2.4, Zone 3 (Psychology), Q2 EDUCATION & EDUCATIONAL RESEARCH, Pub Date: 2022-04-29, DOI: 10.3102/10769986221088561
D. Bonett
The limitations of Cohen’s κ are reviewed and an alternative G-index is recommended for assessing nominal-scale agreement. Maximum likelihood estimates, standard errors, and confidence intervals for a two-rater G-index are derived for one-group and two-group designs. A new G-index of agreement for multirater designs is proposed. Statistical inference methods for some important special cases of the multirater design also are derived. G-index meta-analysis methods are proposed and can be used to combine and compare agreement across two or more populations. Closed-form sample-size formulas to achieve desired confidence interval precision are proposed for two-rater and multirater designs. R functions are given for all results.
Vol. 47, pp. 438–458. Citations: 0
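To make the contrast between Cohen's κ and a G-index in the abstract above concrete, here is a small sketch for two raters and k nominal categories. It assumes the commonly used chance-corrected form G = (k·p_o − 1)/(k − 1), where p_o is the observed proportion of agreement; the article's exact definitions, estimators, and confidence intervals may differ, and the function names and the example table are hypothetical.

```python
from typing import Sequence


def observed_agreement(table: Sequence[Sequence[float]]) -> float:
    """Proportion of cases on the diagonal of a square rater-by-rater table."""
    total = sum(sum(row) for row in table)
    return sum(table[i][i] for i in range(len(table))) / total


def g_index(table: Sequence[Sequence[float]]) -> float:
    """G = (k * p_o - 1) / (k - 1): chance agreement is fixed at 1/k."""
    k = len(table)
    return (k * observed_agreement(table) - 1.0) / (k - 1.0)


def cohen_kappa(table: Sequence[Sequence[float]]) -> float:
    """kappa = (p_o - p_e) / (1 - p_e), with p_e from the raters' marginals."""
    k = len(table)
    total = sum(sum(row) for row in table)
    row_marg = [sum(table[i]) / total for i in range(k)]
    col_marg = [sum(table[i][j] for i in range(k)) / total for j in range(k)]
    p_e = sum(r * c for r, c in zip(row_marg, col_marg))
    return (observed_agreement(table) - p_e) / (1.0 - p_e)


if __name__ == "__main__":
    # Hypothetical table with one very common category: the raters agree
    # on 90 of 100 cases, yet kappa comes out slightly negative.
    skewed = [[90, 5],
              [5, 0]]
    print("G-index:", g_index(skewed))
    print("kappa:  ", cohen_kappa(skewed))
```

The example table illustrates one frequently cited limitation of κ: with highly unbalanced marginals, the marginal-based chance term is so large that high raw agreement can still produce a κ near or below zero, whereas the G-index depends only on p_o and k.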
Latent Trait Item Response Models for Continuous Responses
IF 2.4, Zone 3 (Psychology), Q2 EDUCATION & EDUCATIONAL RESEARCH, Pub Date: 2022-04-08, DOI: 10.3102/10769986231184147
G. Tutz, Pascal Jordan
A general framework of latent trait item response models for continuous responses is given. In contrast to classical test theory (CTT) models, which traditionally distinguish between true scores and error scores, the responses are clearly linked to latent traits. It is shown that CTT models can be derived as special cases, but the model class is much wider. It provides, in particular, appropriate modeling of responses that are restricted in some way, for example, if responses are positive or are restricted to an interval. Restrictions of this sort are easily incorporated in the modeling framework. Restriction to an interval is typically ignored in common models yielding inappropriate models, for example, when modeling Likert-type data. The model also extends common response time models, which can be treated as special cases. The properties of the model class are derived and the role of the total score is investigated, which leads to a modified total score. Several applications illustrate the use of the model including an example, in which covariates that may modify the response are taken into account.
Citations: 0
Handling Missing Data in Cross-Classified Multilevel Analyses: An Evaluation of Different Multiple Imputation Approaches
IF 2.4, Zone 3 (Psychology), Q2 EDUCATION & EDUCATIONAL RESEARCH, Pub Date: 2022-02-18, DOI: 10.3102/10769986231151224
S. Grund, O. Lüdtke, A. Robitzsch
Multiple imputation (MI) is a popular method for handling missing data. In education research, it can be challenging to use MI because the data often have a clustered structure that needs to be accommodated during MI. Although much research has considered applications of MI in hierarchical data, little is known about its use in cross-classified data, in which observations are clustered in multiple higher-level units simultaneously (e.g., schools and neighborhoods, transitions from primary to secondary schools). In this article, we consider several approaches to MI for cross-classified data (CC-MI), including a novel fully conditional specification approach, a joint modeling approach, and other approaches that are based on single- and two-level MI. In this context, we clarify the conditions that CC-MI methods need to fulfill to provide a suitable treatment of missing data, and we compare the approaches both from a theoretical perspective and in a simulation study. Finally, we illustrate the use of CC-MI in real data and discuss the implications of our findings for research practice.
Vol. 48, pp. 454–489. Citations: 0
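Whatever imputation model is used, analyses of multiply imputed data usually end with a pooling step. The sketch below shows that generic step with Rubin's rules for a single scalar parameter; it does not implement any of the CC-MI imputation approaches compared in the article, and the function name and example numbers are hypothetical.

```python
from statistics import mean, variance
from typing import Sequence, Tuple


def pool_rubin(estimates: Sequence[float],
               variances: Sequence[float]) -> Tuple[float, float]:
    """Pool m completed-data results for one parameter with Rubin's rules.

    estimates[i] is the point estimate from imputed data set i and
    variances[i] its squared standard error. Returns the pooled estimate
    and its total variance.
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching inputs")
    q_bar = mean(estimates)            # pooled point estimate
    within = mean(variances)           # average within-imputation variance
    between = variance(estimates)      # between-imputation variance (m - 1 denominator)
    total = within + (1.0 + 1.0 / m) * between
    return q_bar, total


if __name__ == "__main__":
    # Hypothetical regression coefficient estimated in five imputed data sets.
    est = [0.42, 0.47, 0.39, 0.45, 0.44]
    var = [0.010, 0.011, 0.009, 0.010, 0.012]
    pooled, total_var = pool_rubin(est, var)
    print(pooled, total_var ** 0.5)  # pooled estimate and its standard error
```

The between-imputation term is where an unsuitable imputation model shows up: if the clustering structure is ignored during imputation, the completed data sets can understate or distort that variability, which is one reason the choice of imputation approach matters for clustered data.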
Analyzing Longitudinal Social Relations Model Data Using the Social Relations Structural Equation Model
IF 2.4, Zone 3 (Psychology), Q2 EDUCATION & EDUCATIONAL RESEARCH, Pub Date: 2021-12-14, DOI: 10.3102/10769986211056541
S. Nestler, O. Lüdtke, A. Robitzsch
The social relations model (SRM) is very often used in psychology to examine the components, determinants, and consequences of interpersonal judgments and behaviors that arise in social groups. The standard SRM was developed to analyze cross-sectional data. Based on a recently suggested integration of the SRM with structural equation models (SEM) framework, we show here how longitudinal SRM data can be analyzed using the SR-SEM. Two examples are presented to illustrate the model, and we also present the results of a small simulation study comparing the SR-SEM approach to a two-step approach. Altogether, the SR-SEM has a number of advantages compared to earlier suggestions for analyzing longitudinal SRM data, making it extremely useful for applied research.
Vol. 47, pp. 231–260. Citations: 4
Item Pool Quality Control in Educational Testing: Change Point Model, Compound Risk, and Sequential Detection
IF 2.4, Zone 3 (Psychology), Q2 EDUCATION & EDUCATIONAL RESEARCH, Pub Date: 2021-12-13, DOI: 10.3102/10769986211059085
Yunxiao Chen, Yi-Hsuan Lee, Xiaoou Li
In standardized educational testing, test items are reused in multiple test administrations. To ensure the validity of test scores, the psychometric properties of items should remain unchanged over time. In this article, we consider the sequential monitoring of test items, in particular, the detection of abrupt changes to their psychometric properties, where a change can be caused by, for example, leakage of the item or change of the corresponding curriculum. We propose a statistical framework for the detection of abrupt changes in individual items. This framework consists of (1) a multistream Bayesian change point model describing sequential changes in items, (2) a compound risk function quantifying the risk in sequential decisions, and (3) sequential decision rules that control the compound risk. Throughout the sequential decision process, the proposed decision rule balances the trade-off between two sources of errors, the false detection of prechange items, and the nondetection of postchange items. An item-specific monitoring statistic is proposed based on an item response theory model that eliminates the confounding from the examinee population which changes over time. Sequential decision rules and their theoretical properties are developed under two settings: the oracle setting where the Bayesian change point model is completely known and a more realistic setting where some parameters of the model are unknown. Simulation studies are conducted under settings that mimic real operational tests.
Vol. 47, pp. 322–352. Citations: 3
A New Multiprocess IRT Model With Ideal Points for Likert-Type Items
IF 2.4, Zone 3 (Psychology), Q2 EDUCATION & EDUCATIONAL RESEARCH, Pub Date: 2021-12-09, DOI: 10.3102/10769986211057160
K. Jin, Yi-Jhen Wu, Hui-Fang Chen
For surveys of complex issues that entail multiple steps, multiple reference points, and nongradient attributes (e.g., social inequality), this study proposes a new multiprocess model that integrates ideal-point and dominance approaches into a treelike structure (IDtree). In the IDtree, an ideal-point approach describes an individual’s attitude and then a dominance approach describes their tendency for using extreme response categories. Evaluation of IDtree performance via two empirical data sets showed that the IDtree fit these data better than other models. Furthermore, simulation studies showed a satisfactory parameter recovery of the IDtree. Thus, the IDtree model sheds light on the response processes of a multistage structure.
Vol. 47, pp. 297–321. Citations: 2
Acknowledgments
IF 2.4, Zone 3 (Psychology), Q2 EDUCATION & EDUCATIONAL RESEARCH, Pub Date: 2021-11-02, DOI: 10.3102/10769986211056337
Stephen R. Aichele, Michael Gottfried
Stephen Aichele, Colorado State University
Usama Ali, Educational Testing Service
María Álvarez Hernández, Centro Universitario de la Defensa
Eva Baker, University of California, Los Angeles
Michela Battauz, University of Udine
Daniel Bauer, University of North Carolina
Jorge Bazán, Universidade de São Paulo
William Belzak, University of North Carolina at Chapel Hill
Yoav Bergner, New York University
Howard Bloom, Manpower Demonstration Research Corporation (MDRC)
Ulf Bockenholt, Northwestern University
Maria Bolsinova, Tilburg University
Daniel Bolt, University of Wisconsin
Zach Branson, Carnegie Mellon University
Robert Brennan, University of Iowa
Tiago Calico, American Institutes for Research
Jodi Casabianca, Educational Testing Service
Katherine Castellano, Educational Testing Service
Mei-Hsiu Chen, State University of New York at Binghamton
Ping Chen, Beijing Normal University
Yinghan Chen, University of Nevada Reno
Yunxiao Chen, London School of Economics and Political Science
Michael Cheung, National University of Singapore
Chia-Yi Chiu, Rutgers, The State University of New Jersey
Kilchan Choi, University of California, Los Angeles
Karl Bang Christensen, University of Copenhagen
Brian Clauser, National Board of Medical Examiners (NBME)
Paul De Boeck, Ohio State University
Dries Debeer, University of Leuven (KU Leuven)
Ben Domingue, Stanford University
Nianbo Dong, University of North Carolina at Chapel Hill
Jeffrey Douglas, University of Illinois Urbana-Champaign
Han Du, University of California, Los Angeles
Georgios Fellouris, University of Illinois Urbana-Champaign
Leah Feuerstahler, Fordham University
William Finch, Ball State University
Jean-Paul Fox, University of Twente
Ken Fujimoto, Loyola University Chicago
Johann Gagnon-Bartsch, University of Michigan
Michael Garet, American Institutes for Research
Andrew Gelman, Columbia University
Flavio Gonçalves, Universidade Federal de Minas Gerais
Jorge González, Pontificia Universidad Católica de Chile
Maithreyi Gopalan, Penn State College of Education
Journal of Educational and Behavioral Statistics 2021, Vol. 46, No. 6, pp. 776–778. DOI: 10.3102/10769986211056337
Article reuse guidelines: sagepub.com/journals-permissions. © 2021 AERA. https://journals.sagepub.com/home/jeb
Stephen Aichele,科罗拉多州立大学Usama Ali,教育测试服务Marı́aÁlvarez Hernández,国防大学中心Eva Baker,加州大学洛杉矶分校Michela Battauz,乌迪内·丹尼尔·鲍尔大学,北卡罗来纳大学Jorge Bazán,圣保罗大学William Belzak,北卡罗来纳州教堂山大学Yoav Bergner,纽约大学Howard Bloom、人力资源示范研究公司(MDRC)Ulf Bockenholt、西北大学Maria Bolsinova、蒂尔堡大学Daniel Bolt、威斯康星大学Zach Branson、卡内基梅隆大学Robert Brennan、爱荷华大学Tiago Calico、美国研究院Jodi Casabianca、教育测试服务机构Katherine Castellano,教育测试服务中心陈美秀,纽约州立大学陈,北京师范大学陈映涵,内华达大学陈,伦敦政治经济学院张,新加坡国立大学邱嘉义,罗格斯,新泽西州立大学蔡,加州大学,洛杉矶Karl Bang Christensen、哥本哈根大学Brian Clauser、国家医学检查委员会(NBME)Paul De Boeck、俄亥俄州立大学Dries Debeer、鲁汶大学Ben Domingue、斯坦福大学Nianbo Dong、北卡罗来纳大学教堂山分校Jeffrey Douglas、伊利诺伊大学厄巴纳-香槟分校Han Du、加利福尼亚大学,洛杉矶Georgios Fellouris、伊利诺伊大学厄巴纳-香槟分校Leah Feuerstahler、福特汉姆大学William Finch、鲍尔州立大学Jean-Paul Fox、特文特大学Ken Fujimoto、芝加哥洛约拉大学Johann Gagnon Bartsch、密歇根大学Michael Garet、美国研究院Andrew Gelman、哥伦比亚大学Flavio Gonçalves,米纳斯吉拉斯联邦大学Jorge Gonzaléz,智利天主教大学Maithreyi Gopalan,宾夕法尼亚州立教育学院《2021年教育与行为统计杂志》,第46卷,第6期,第776–778页DOI:10.3102/107699862111056337文章重用指南:sagepub.com/journals-permissions©2021 AERA。https://journals.sagepub.com/home/jeb
{"title":"Acknowledgments","authors":"Stephen R. Aichele, Michael Gottfried","doi":"10.3102/10769986211056337","DOIUrl":"https://doi.org/10.3102/10769986211056337","url":null,"abstract":"Stephen Aichele, Colorado State University Usama Ali, Educational Testing Service Marı́a Álvarez Hernández, Centro Universitario de la Defensa Eva Baker, University of California, Los Angeles Michela Battauz, University of Udine Daniel Bauer, University of North Carolina Jorge Bazán, Universidade de São Paulo William Belzak, University of North Carolina at Chapel Hill Yoav Bergner, New York University Howard Bloom, Manpower Demonstration Research Corporation (MDRC) Ulf Bockenholt, Northwestern University Maria Bolsinova, Tilburg University Daniel Bolt, University of Wisconsin Zach Branson, Carnegie Mellon University Robert Brennan, University of Iowa Tiago Calico, American Institutes for Research Jodi Casabianca, Educational Testing Service Katherine Castellano, Educational Testing Service Mei-Hsiu Chen, State University of New York at Binghamton Ping Chen, Beijing Normal University Yinghan Chen, University of Nevada Reno Yunxiao Chen, London School of Economics and Political Science Michael Cheung, National University of Singapore Chia-Yi Chiu, Rutgers, The State University of New Jersey Kilchan Choi, University of California, Los Angeles Karl Bang Christensen, University of Copenhagen Brian Clauser, National Board of Medical Examiners (NBME) Paul De Boeck, Ohio State University Dries Debeer, University of Leuven (KU Leuven) Ben Domingue, Stanford University Nianbo Dong, University of North Carolina at Chapel Hill Jeffrey Douglas, University of Illinois Urbana-Champaign Han Du, University of California, Los Angeles Georgios Fellouris, University of Illinois Urbana-Champaign Leah Feuerstahler, Fordham University William Finch, Ball State University Jean-Paul Fox, University of Twente Ken Fujimoto, Loyola University Chicago Johann Gagnon-Bartsch, University of Michigan Michael 
Garet, American Institutes for Research Andrew Gelman, Columbia University Flavio Gonçalves, Universidade Federal de Minas Gerais Jorge Gonzaléz, Pontificia Universidad Católica de Chile Maithreyi Gopalan, Penn State College of Education Journal of Educational and Behavioral Statistics 2021, Vol. 46, No. 6, pp. 776–778 DOI: 10.3102/10769986211056337 Article reuse guidelines: sagepub.com/journals-permissions © 2021 AERA. https://journals.sagepub.com/home/jeb","PeriodicalId":48001,"journal":{"name":"Journal of Educational and Behavioral Statistics","volume":"46 1","pages":"776 - 778"},"PeriodicalIF":2.4,"publicationDate":"2021-11-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44572876","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
On the Generalized S − X²-Test of Item Fit: Some Variants, Residuals, and a Graphical Visualization
IF 2.4 Zone 3 Psychology Q2 EDUCATION & EDUCATIONAL RESEARCH Pub Date: 2021-10-25 DOI: 10.3102/10769986211050304
Jochen Ranger, Kay Brauer
The generalized S − X²-test is a test of item fit for items with a polytomous response format. The test is based on a comparison of the observed and expected numbers of responses in strata defined by the test score. In this article, we make four contributions. First, we demonstrate that the performance of the generalized S − X²-test depends on how sparse cells are pooled. Second, we propose alternative implementations of the test within the framework of limited-information testing. Third, we derive the distribution of the S − X²-residuals, which can be used for post hoc analyses. Fourth, we suggest a diagnostic plot that visualizes the form of the misfit. The performance of the alternative implementations is investigated in a simulation study, which suggests that they control the Type I error rate well and have high power. An empirical application concludes the article.
{"title":"On the Generalized S − X 2 –Test of Item Fit: Some Variants, Residuals, and a Graphical Visualization","authors":"Jochen Ranger, Kay Brauer","doi":"10.3102/10769986211050304","DOIUrl":"https://doi.org/10.3102/10769986211050304","url":null,"abstract":"The generalized S − X 2 –test is a test of item fit for items with polytomous responses format. The test is based on a comparison of the observed and expected number of responses in strata defined by the test score. In this article, we make four contributions. We demonstrate that the performance of the generalized S − X 2 –test depends on how sparse cells are pooled. We propose alternative implementations of the test within the framework of limited information testing. We derive the distribution of the S − X 2 –residuals that can be used for post hoc analyses. We suggest a diagnostic plot that visualizes the form of the misfit. The performance of the alternative implementations is investigated in a simulation study. The simulation study suggests that the alternative implementations are capable of controlling the Type-I error rate well and have high power. An empirical application concludes this article.","PeriodicalId":48001,"journal":{"name":"Journal of Educational and Behavioral Statistics","volume":"47 1","pages":"202 - 230"},"PeriodicalIF":2.4,"publicationDate":"2021-10-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41455826","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0