Journal of Educational and Behavioral Statistics: Latest Articles

Handling Missing Data in Cross-Classified Multilevel Analyses: An Evaluation of Different Multiple Imputation Approaches
IF 2.4, Zone 3 (Psychology), Q2 Education & Educational Research, Pub Date: 2022-02-18, DOI: 10.3102/10769986231151224
S. Grund, O. Lüdtke, A. Robitzsch
Multiple imputation (MI) is a popular method for handling missing data. In education research, it can be challenging to use MI because the data often have a clustered structure that needs to be accommodated during MI. Although much research has considered applications of MI in hierarchical data, little is known about its use in cross-classified data, in which observations are clustered in multiple higher-level units simultaneously (e.g., schools and neighborhoods, transitions from primary to secondary schools). In this article, we consider several approaches to MI for cross-classified data (CC-MI), including a novel fully conditional specification approach, a joint modeling approach, and other approaches that are based on single- and two-level MI. In this context, we clarify the conditions that CC-MI methods need to fulfill to provide a suitable treatment of missing data, and we compare the approaches both from a theoretical perspective and in a simulation study. Finally, we illustrate the use of CC-MI in real data and discuss the implications of our findings for research practice.
Journal of Educational and Behavioral Statistics, Vol. 48, pp. 454–489
Citations: 0
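The abstract above does not say how results from the imputed data sets are combined. As a general illustration only, the standard pooling step after multiple imputation (Rubin's rules) can be sketched as follows. The function name `pool_rubin` and the example numbers are hypothetical and are not taken from the article, and the sketch pools a single scalar parameter.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool one parameter across m imputed data sets using Rubin's rules.

    estimates: point estimates, one per imputed data set
    variances: squared standard errors, one per imputed data set
    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching lengths")
    q_bar = mean(estimates)    # pooled point estimate
    w = mean(variances)        # within-imputation variance
    b = variance(estimates)    # between-imputation variance (m - 1 denominator)
    t = w + (1 + 1 / m) * b    # total variance
    return q_bar, t


# Hypothetical estimates and squared standard errors from m = 4 imputations.
est, total_var = pool_rubin([1.0, 2.0, 3.0, 2.0], [0.5, 0.5, 0.5, 0.5])
```

The square root of the total variance serves as the pooled standard error. The imputation step itself (for example, fully conditional specification for cross-classified data) is not shown here.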
Analyzing Longitudinal Social Relations Model Data Using the Social Relations Structural Equation Model
IF 2.4, Zone 3 (Psychology), Q2 Education & Educational Research, Pub Date: 2021-12-14, DOI: 10.3102/10769986211056541
S. Nestler, O. Lüdtke, A. Robitzsch
The social relations model (SRM) is widely used in psychology to examine the components, determinants, and consequences of interpersonal judgments and behaviors that arise in social groups. The standard SRM was developed to analyze cross-sectional data. Based on a recently suggested integration of the SRM with the structural equation modeling (SEM) framework, we show here how longitudinal SRM data can be analyzed using the SR-SEM. Two examples are presented to illustrate the model, and we also present the results of a small simulation study comparing the SR-SEM approach to a two-step approach. Altogether, the SR-SEM has a number of advantages compared to earlier suggestions for analyzing longitudinal SRM data, making it extremely useful for applied research.
Journal of Educational and Behavioral Statistics, Vol. 47, pp. 231–260
Citations: 4
Item Pool Quality Control in Educational Testing: Change Point Model, Compound Risk, and Sequential Detection
IF 2.4, Zone 3 (Psychology), Q2 Education & Educational Research, Pub Date: 2021-12-13, DOI: 10.3102/10769986211059085
Yunxiao Chen, Yi-Hsuan Lee, Xiaoou Li
In standardized educational testing, test items are reused in multiple test administrations. To ensure the validity of test scores, the psychometric properties of items should remain unchanged over time. In this article, we consider the sequential monitoring of test items, in particular, the detection of abrupt changes to their psychometric properties, where a change can be caused by, for example, leakage of the item or a change in the corresponding curriculum. We propose a statistical framework for the detection of abrupt changes in individual items. This framework consists of (1) a multistream Bayesian change point model describing sequential changes in items, (2) a compound risk function quantifying the risk in sequential decisions, and (3) sequential decision rules that control the compound risk. Throughout the sequential decision process, the proposed decision rule balances the trade-off between two sources of error: the false detection of prechange items and the nondetection of postchange items. An item-specific monitoring statistic is proposed based on an item response theory model that eliminates the confounding from an examinee population that changes over time. Sequential decision rules and their theoretical properties are developed under two settings: the oracle setting, where the Bayesian change point model is completely known, and a more realistic setting, where some parameters of the model are unknown. Simulation studies are conducted under settings that mimic real operational tests.
Journal of Educational and Behavioral Statistics, Vol. 47, pp. 322–352
Citations: 3
A New Multiprocess IRT Model With Ideal Points for Likert-Type Items
IF 2.4, Zone 3 (Psychology), Q2 Education & Educational Research, Pub Date: 2021-12-09, DOI: 10.3102/10769986211057160
K. Jin, Yi-Jhen Wu, Hui-Fang Chen
For surveys of complex issues that entail multiple steps, multiple reference points, and nongradient attributes (e.g., social inequality), this study proposes a new multiprocess model that integrates ideal-point and dominance approaches into a treelike structure (IDtree). In the IDtree, an ideal-point approach describes an individual’s attitude and then a dominance approach describes their tendency for using extreme response categories. Evaluation of IDtree performance via two empirical data sets showed that the IDtree fit these data better than other models. Furthermore, simulation studies showed a satisfactory parameter recovery of the IDtree. Thus, the IDtree model sheds light on the response processes of a multistage structure.
Journal of Educational and Behavioral Statistics, Vol. 47, pp. 297–321
Citations: 2
Acknowledgments
IF 2.4, Zone 3 (Psychology), Q2 Education & Educational Research, Pub Date: 2021-11-02, DOI: 10.3102/10769986211056337
Stephen R. Aichele, Michael Gottfried
Stephen Aichele, Colorado State University; Usama Ali, Educational Testing Service; María Álvarez Hernández, Centro Universitario de la Defensa; Eva Baker, University of California, Los Angeles; Michela Battauz, University of Udine; Daniel Bauer, University of North Carolina; Jorge Bazán, Universidade de São Paulo; William Belzak, University of North Carolina at Chapel Hill; Yoav Bergner, New York University; Howard Bloom, Manpower Demonstration Research Corporation (MDRC); Ulf Bockenholt, Northwestern University; Maria Bolsinova, Tilburg University; Daniel Bolt, University of Wisconsin; Zach Branson, Carnegie Mellon University; Robert Brennan, University of Iowa; Tiago Calico, American Institutes for Research; Jodi Casabianca, Educational Testing Service; Katherine Castellano, Educational Testing Service; Mei-Hsiu Chen, State University of New York at Binghamton; Ping Chen, Beijing Normal University; Yinghan Chen, University of Nevada Reno; Yunxiao Chen, London School of Economics and Political Science; Michael Cheung, National University of Singapore; Chia-Yi Chiu, Rutgers, The State University of New Jersey; Kilchan Choi, University of California, Los Angeles; Karl Bang Christensen, University of Copenhagen; Brian Clauser, National Board of Medical Examiners (NBME); Paul De Boeck, Ohio State University; Dries Debeer, University of Leuven (KU Leuven); Ben Domingue, Stanford University; Nianbo Dong, University of North Carolina at Chapel Hill; Jeffrey Douglas, University of Illinois Urbana-Champaign; Han Du, University of California, Los Angeles; Georgios Fellouris, University of Illinois Urbana-Champaign; Leah Feuerstahler, Fordham University; William Finch, Ball State University; Jean-Paul Fox, University of Twente; Ken Fujimoto, Loyola University Chicago; Johann Gagnon-Bartsch, University of Michigan; Michael Garet, American Institutes for Research; Andrew Gelman, Columbia University; Flavio Gonçalves, Universidade Federal de Minas Gerais; Jorge Gonzaléz, Pontificia Universidad Católica de Chile; Maithreyi Gopalan, Penn State College of Education.
Journal of Educational and Behavioral Statistics 2021, Vol. 46, No. 6, pp. 776–778. DOI: 10.3102/10769986211056337. Article reuse guidelines: sagepub.com/journals-permissions. © 2021 AERA. https://journals.sagepub.com/home/jeb
Citations: 1
On the Generalized S − X² Test of Item Fit: Some Variants, Residuals, and a Graphical Visualization
IF 2.4, Zone 3 (Psychology), Q2 Education & Educational Research, Pub Date: 2021-10-25, DOI: 10.3102/10769986211050304
Jochen Ranger, Kay Brauer
The generalized S − X² test is a test of item fit for items with a polytomous response format. The test is based on a comparison of the observed and expected number of responses in strata defined by the test score. In this article, we make four contributions. We demonstrate that the performance of the generalized S − X² test depends on how sparse cells are pooled. We propose alternative implementations of the test within the framework of limited information testing. We derive the distribution of the S − X² residuals that can be used for post hoc analyses. We suggest a diagnostic plot that visualizes the form of the misfit. The performance of the alternative implementations is investigated in a simulation study. The simulation study suggests that the alternative implementations are capable of controlling the Type-I error rate well and have high power. An empirical application concludes this article.
Journal of Educational and Behavioral Statistics, Vol. 47, pp. 202–230
Citations: 0
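The comparison of observed and expected response counts across score strata described in the abstract above can be illustrated with a generic Pearson-type statistic. This is a minimal sketch under stated assumptions: the expected counts are taken as given (in practice they would come from a fitted IRT model), no pooling of sparse cells is performed, and the function name `pearson_item_fit` and the example counts are hypothetical. It is not the authors' implementation or any of their proposed variants.

```python
def pearson_item_fit(observed, expected):
    """Sum (O - E)^2 / E over score strata (rows) and response categories (columns).

    observed, expected: lists of rows holding counts; both must have the same shape.
    """
    stat = 0.0
    for obs_row, exp_row in zip(observed, expected):
        for o, e in zip(obs_row, exp_row):
            if e <= 0:
                raise ValueError("expected counts must be positive")
            stat += (o - e) ** 2 / e
    return stat


# Hypothetical counts: two score strata, two response categories.
observed = [[12, 8], [5, 15]]
expected = [[10.0, 10.0], [6.0, 14.0]]
x2 = pearson_item_fit(observed, expected)
```

Because the abstract reports that performance depends on how sparse cells are pooled, a practical version would need an explicit pooling rule before the sum is computed. That rule is deliberately left out of this sketch.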
Reporting Proficiency Levels for Examinees With Incomplete Data
IF 2.4, Zone 3 (Psychology), Q2 Education & Educational Research, Pub Date: 2021-10-24, DOI: 10.3102/10769986211051379
S. Sinharay
Takers of educational tests often receive proficiency levels instead of or in addition to scaled scores. For example, proficiency levels are reported for the Advanced Placement (AP®) and U.S. Medical Licensing examinations. Technical difficulties and other unforeseen events occasionally lead to missing item scores and hence to incomplete data on these tests. The reporting of proficiency levels to the examinees with incomplete data requires estimation of the performance of the examinees on the missing part and essentially involves imputation of missing data. In this article, six approaches from the literature on missing data analysis are brought to bear on the problem of reporting of proficiency levels to the examinees with incomplete data. Data from several large-scale educational tests are used to compare the performances of the six approaches to the approach that is operationally used for reporting proficiency levels for these tests. A multiple imputation approach based on chained equations is shown to lead to the most accurate reporting of proficiency levels for data that were missing at random or completely at random, while the model-based approach of Holman and Glas performed the best for data that are missing not at random. Several recommendations are made on the reporting of proficiency levels to the examinees with incomplete data.
Journal of Educational and Behavioral Statistics, Vol. 47, pp. 263–296
Citations: 6
Comparison of Within- and Between-Series Effect Estimates in the Meta-Analysis of Multiple Baseline Studies
IF 2.4, Zone 3 (Psychology), Q2 Education & Educational Research, Pub Date: 2021-08-05, DOI: 10.3102/10769986211035507
Seang-Hwane Joo, Yan Wang, J. Ferron, S. N. Beretvas, Mariola Moeyaert, W. Van den Noortgate
Multiple baseline (MB) designs are becoming more prevalent in educational and behavioral research, and as they do, there is growing interest in combining effect size estimates across studies. To further refine the meta-analytic methods of estimating the effect, this study developed and compared eight alternative methods of estimating intervention effects from a set of MB studies. The methods differed in the assumptions made and varied in whether they relied on within- or between-series comparisons, modeled raw data or effect sizes, and did or did not standardize. Small sample functioning was examined through two simulation studies, which showed that when data were consistent with assumptions the bias was consistently less than 5% of the effect size for each method, whereas root mean squared error varied substantially across methods. When assumptions were violated, substantial biases were found. Implications and limitations are discussed.
Journal of Educational and Behavioral Statistics, Vol. 47, pp. 131–166
Citations: 3
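The two performance measures named in the abstract above, bias relative to the effect size and root mean squared error, can be computed from simulation replications as in the following sketch. The function name `relative_bias_and_rmse` and the replication values are hypothetical, and the article's own simulation design is not reproduced here.

```python
from math import sqrt


def relative_bias_and_rmse(estimates, true_effect):
    """Return (bias / true effect, RMSE) for estimates from repeated simulations."""
    n = len(estimates)
    if n == 0 or true_effect == 0:
        raise ValueError("need at least one estimate and a nonzero true effect")
    bias = sum(estimates) / n - true_effect
    rmse = sqrt(sum((e - true_effect) ** 2 for e in estimates) / n)
    return bias / true_effect, rmse


# Hypothetical effect estimates from four replications with a true effect of 1.0.
rel_bias, rmse = relative_bias_and_rmse([0.9, 1.1, 1.2, 1.0], true_effect=1.0)
```

A relative bias below 0.05 in absolute value corresponds to the "less than 5% of the effect size" criterion mentioned in the abstract. RMSE additionally reflects sampling variability, which is why two methods with similar bias can still differ substantially in RMSE.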
Analyzing Cross-Sectionally Clustered Data Using Generalized Estimating Equations
IF 2.4, Zone 3 (Psychology), Q2 Education & Educational Research, Pub Date: 2021-06-04, DOI: 10.3102/10769986211017480
Francis L. Huang
Clustered data are common in the sociobehavioral sciences. One approach that specifically deals with clustered data but has seen little use in education is the generalized estimating equations (GEEs) approach. We provide background on GEEs, discuss why the approach is appropriate for the analysis of clustered data, and provide worked examples using both continuous and binary outcomes. Comparisons are made between GEEs, multilevel models, and ordinary least squares results to highlight similarities and differences between the approaches. Detailed walkthroughs are provided using both R and SPSS Version 26.
Journal of Educational and Behavioral Statistics, vol. 47, pp. 101-125.
Citations: 12
Using Sequence Mining Techniques for Understanding Incorrect Behavioral Patterns on Interactive Tasks
IF 2.4 Zone 3 (Psychology) Q2 EDUCATION & EDUCATIONAL RESEARCH Pub Date: 2021-05-03 DOI: 10.3102/10769986211010467
Esther Ulitzsch, Qiwei He, S. Pohl
Interactive tasks designed to elicit real-life problem-solving behavior are rapidly becoming more widely used in educational assessment. Incorrect responses to such tasks can occur for a variety of different reasons such as low proficiency levels, low metacognitive strategies, or motivational issues. We demonstrate how behavioral patterns associated with incorrect responses can, in part, be understood, supporting insights into the different sources of failure on a task. To this end, we make use of sequence mining techniques that leverage the information contained in time-stamped action sequences commonly logged in assessments with interactive tasks for (a) investigating what distinguishes incorrect behavioral patterns from correct ones and (b) identifying subgroups of examinees with similar incorrect behavioral patterns. Analyzing a task from the Programme for the International Assessment of Adult Competencies 2012 assessment, we find incorrect behavioral patterns to be more heterogeneous than correct ones. We identify multiple subgroups of incorrect behavioral patterns, which point toward different levels of effort and lack of different subskills needed for solving the task. Albeit focusing on a single task, meaningful patterns of major differences in how examinees approach a given task that generalize across multiple tasks are uncovered. Implications for the construction and analysis of interactive tasks as well as the design of interventions for complex problem-solving skills are derived.
Journal of Educational and Behavioral Statistics, vol. 47, pp. 3-35.
Citations: 15