Journal of Educational Measurement: Latest Articles

Online Calibration in Multidimensional Computerized Adaptive Testing with Polytomously Scored Items
IF 1.3; Zone 4 (Psychology); Q3 (Psychology, Applied); Pub Date: 2022-12-15; DOI: 10.1111/jedm.12353
Lu Yuan, Yingshi Huang, Shuhang Li, Ping Chen

Online calibration is a key technology for item calibration in computerized adaptive testing (CAT) and has been widely used in various forms of CAT, including unidimensional CAT, multidimensional CAT (MCAT), CAT with polytomously scored items, and cognitive diagnostic CAT. However, as multidimensional and polytomous assessment data become more common, only a few published reports focus on online calibration in MCAT with polytomously scored items (P-MCAT). Therefore, standing on the shoulders of the existing online calibration methods/designs, this study proposes four new P-MCAT online calibration methods and two new P-MCAT online calibration designs and conducts two simulation studies to evaluate their performance under varying conditions (i.e., different calibration sample sizes and correlations between dimensions). Results show that all of the newly proposed methods can accurately recover item parameters, and the adaptive designs outperform the random design in most cases. In the end, this paper provides practical guidance based on simulation results.

Journal of Educational Measurement, 60(3), 476-500.
Citations: 0
Measuring the Uncertainty of Imputed Scores
IF 1.3; Zone 4 (Psychology); Q3 (Psychology, Applied); Pub Date: 2022-12-14; DOI: 10.1111/jedm.12352
Sandip Sinharay

Technical difficulties and other unforeseen events occasionally lead to incomplete data on educational tests, which necessitates the reporting of imputed scores to some examinees. While there exist several approaches for reporting imputed scores, there is a lack of any guidance on the reporting of the uncertainty of imputed scores. In this paper, several approaches are suggested for quantifying the uncertainty of imputed scores using measures that are similar in spirit to estimates of reliability and standard error of measurement. A simulation study is performed to examine the properties of the approaches. The approaches are then applied to data from a state test on which some examinees' scores had to be imputed following computer problems. Several recommendations are made for practice.

Journal of Educational Measurement, 60(2), 351-375.
Citations: 1
An Exponentially Weighted Moving Average Procedure for Detecting Back Random Responding Behavior
IF 1.3; Zone 4 (Psychology); Q3 (Psychology, Applied); Pub Date: 2022-12-09; DOI: 10.1111/jedm.12351
Yinhong He

Back random responding (BRR) behavior is one of the commonly observed careless response behaviors. Accurately detecting BRR behavior can improve test validities. Yu and Cheng (2019) showed that the change point analysis (CPA) procedure based on weighted residual (CPA-WR) performed well in detecting BRR. Compared with the CPA procedure, the exponentially weighted moving average (EWMA) obtains more detailed information. This study equipped the weighted residual statistic with EWMA, and proposed the EWMA-WR method to detect BRR. To make the critical values adaptive to the ability levels, this study proposed the Monte Carlo simulation with ability stratification (MC-stratification) method for calculating critical values. Compared to the original Monte Carlo simulation (MC) method, the newly proposed MC-stratification method generated a larger number of satisfactory results. The performances of CPA-WR and EWMA-WR were evaluated under different conditions that varied in the test lengths, abnormal proportions, critical values and smoothing constants used in the EWMA-WR method. The results showed that EWMA-WR was more powerful than CPA-WR in detecting BRR. Moreover, an empirical study was conducted to illustrate the utility of EWMA-WR for detecting BRR.
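
As a rough illustration of the EWMA idea mentioned in this abstract, the sketch below smooths a sequence of per-item residuals and reports the first position at which the smoothed value drops below a lower limit. It is a generic EWMA chart, not the EWMA-WR statistic of the paper: the residual values, the smoothing constant of 0.3, the lower limit of -0.3, and the function names `ewma` and `first_flag` are all assumptions made for illustration.

```python
def ewma(values, smoothing=0.3, start=0.0):
    """EWMA sequence z_t = smoothing * x_t + (1 - smoothing) * z_(t-1), with z_0 = start."""
    if not 0.0 < smoothing <= 1.0:
        raise ValueError("smoothing must lie in (0, 1]")
    z = start
    series = []
    for x in values:
        z = smoothing * x + (1.0 - smoothing) * z
        series.append(z)
    return series


def first_flag(values, lower_limit, smoothing=0.3):
    """Index of the first position where the EWMA drops below lower_limit, else None."""
    for index, z in enumerate(ewma(values, smoothing)):
        if z < lower_limit:
            return index
    return None


# Hypothetical residuals (observed minus expected score per item):
# close to zero for the first six items, clearly negative afterwards.
residuals = [0.1, -0.1, 0.2, 0.0, -0.2, 0.1, -0.6, -0.7, -0.5, -0.8]
print(first_flag(residuals, lower_limit=-0.3))
```

A larger smoothing constant makes the chart react faster to a change but also makes it noisier. In practice the limit would presumably come from simulated critical values, such as those the abstract describes, rather than a fixed number.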

Journal of Educational Measurement, 60(2), 282-317.
Citations: 1
Multiple-Group Joint Modeling of Item Responses, Response Times, and Action Counts with the Conway-Maxwell-Poisson Distribution
IF 1.3; Zone 4 (Psychology); Q3 (Psychology, Applied); Pub Date: 2022-12-07; DOI: 10.1111/jedm.12349
Xin Qiao, Hong Jiao, Qiwei He

Multiple group modeling is one of the methods to address the measurement noninvariance issue. Traditional studies on multiple group modeling have mainly focused on item responses. In computer-based assessments, joint modeling of response times and action counts with item responses helps estimate the latent speed and action levels in addition to latent ability. These two new data sources can also be used to further address the measurement noninvariance issue. One challenge, however, is to correctly model action counts which can be underdispersed, overdispersed, or equidispersed in real data sets. To address this, we adopted the Conway-Maxwell-Poisson distribution that accounts for different types of dispersion in action counts and incorporated it in the multiple group joint modeling of item responses, response times, and action counts. A Bayesian Markov chain Monte Carlo method was used for model parameter estimation. To illustrate an application of the proposed model, an empirical data analysis was conducted using the Programme for International Student Assessment (PISA) 2015 collaborative problem-solving items where a potential measurement noninvariance issue existed between gender groups. Results indicated that the Conway-Maxwell-Poisson model yielded better model fit than alternative count data models such as negative binomial and Poisson models. In addition, response times and action counts provided further information on performance differences between groups.
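
To make the dispersion point concrete, the sketch below evaluates the standard Conway-Maxwell-Poisson probability mass function, P(X = x) proportional to rate^x / (x!)^nu, and compares mean and variance for three values of nu. It covers only the count distribution, not the joint multiple-group model of the paper. The rate of 3, the truncation of the normalizing series at 200 terms, and the function names are assumptions made for illustration.

```python
import math


def cmp_pmf_table(rate, nu, terms=200):
    """Probabilities P(X = 0..terms-1) under a Conway-Maxwell-Poisson distribution.

    The infinite normalizing series is truncated at `terms` (an assumption that
    is adequate for the small rates used below).
    """
    # Work on the log scale: log weight_k = k * log(rate) - nu * log(k!).
    log_w = [k * math.log(rate) - nu * math.lgamma(k + 1) for k in range(terms)]
    peak = max(log_w)
    log_z = peak + math.log(sum(math.exp(v - peak) for v in log_w))
    return [math.exp(v - log_z) for v in log_w]


def mean_and_variance(probs):
    """Mean and variance of a distribution given as a list of probabilities."""
    mean = sum(k * p for k, p in enumerate(probs))
    var = sum((k - mean) ** 2 * p for k, p in enumerate(probs))
    return mean, var


for nu in (0.5, 1.0, 2.0):
    mean, var = mean_and_variance(cmp_pmf_table(rate=3.0, nu=nu))
    print(f"nu={nu}: mean={mean:.3f}, variance={var:.3f}")
```

With nu = 1 the distribution reduces to the Poisson (variance equals mean). Values of nu below 1 give overdispersion and values above 1 give underdispersion, which is the flexibility the abstract refers to.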

Journal of Educational Measurement, 60(2), 255-281.
Citations: 1
NCME Presidential Address 2022: Turning the Page to the Next Chapter of Educational Measurement
IF 1.3; Zone 4 (Psychology); Q3 (Psychology, Applied); Pub Date: 2022-11-09; DOI: 10.1111/jedm.12350
Derek C. Briggs
Journal of Educational Measurement, 59(4), 398-417.
Citations: 0
A Unified Comparison of IRT-Based Effect Sizes for DIF Investigations
IF 1.3; Zone 4 (Psychology); Q3 (Psychology, Applied); Pub Date: 2022-11-07; DOI: 10.1111/jedm.12347
R. Philip Chalmers

Several marginal effect size (ES) statistics suitable for quantifying the magnitude of differential item functioning (DIF) have been proposed in the area of item response theory; for instance, the Differential Functioning of Items and Tests (DFIT) statistics, signed and unsigned item difference in the sample statistics (SIDS, UIDS, NSIDS, and NUIDS), the standardized indices of impact, and the differential response functioning (DRF) statistics. However, the relationship between these proposed statistics has not been fully discussed, particularly with respect to population parameter definitions and recovery performance across independent samples. To address these issues, this article provides a unified presentation of competing DIF ES definitions and estimators, and evaluates the recovery efficacy of these competing estimators using a set of Monte Carlo simulation experiments. Statistical and inferential properties of the estimators are discussed, as well as future areas of research in this model-based area of bias quantification.

Journal of Educational Measurement, 60(2), 318-350.
Citations: 3
A Statistical Test for the Detection of Item Compromise Combining Responses and Response Times
IF 1.3; Zone 4 (Psychology); Q3 (Psychology, Applied); Pub Date: 2022-10-28; DOI: 10.1111/jedm.12346
Wim J. van der Linden, Dmitry I. Belov

A test of item compromise is presented which combines the test takers' responses and response times (RTs) into a statistic defined as the number of correct responses on the item for test takers with RTs flagged as suspicious. The test has null and alternative distributions belonging to the well-known family of compound binomial distributions, is simple to calculate, and has results that are easy to interpret. It also demonstrated nearly perfect power for the detection of compromise with no more than 10 test takers with preknowledge of the more difficult and discriminating items in a set of empirical examples. For the easier and less discriminating items, the presence of some 20 test takers with preknowledge still sufficed. A test based on the reverse statistic of the total time by test takers with responses flagged as suspicious may seem a natural alternative but misses the property of a monotone likelihood ratio necessary to decide between a test that should be left or right sided.
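
The statistic described in this abstract can be sketched in a few lines. The code below counts correct responses among test takers whose response times were flagged, then computes an upper-tail probability from the distribution of a sum of independent Bernoulli trials with unequal success probabilities. Several things are assumptions rather than details taken from the paper: that "compound binomial" refers to this distribution, that the test is right-sided, and that null success probabilities for the flagged test takers are available (the values below are invented). The function names are hypothetical as well.

```python
def flagged_correct_count(responses, rt_flags):
    """Number of correct responses (coded 1) among test takers with flagged RTs."""
    return sum(1 for r, flagged in zip(responses, rt_flags) if flagged and r == 1)


def number_correct_distribution(success_probs):
    """Distribution of the number of successes over independent Bernoulli trials
    with unequal probabilities, built up one trial at a time."""
    dist = [1.0]  # before any trial, P(0 successes) = 1
    for p in success_probs:
        new = [0.0] * (len(dist) + 1)
        for k, mass in enumerate(dist):
            new[k] += mass * (1.0 - p)   # this trial fails
            new[k + 1] += mass * p       # this trial succeeds
        dist = new
    return dist


def upper_tail_p(observed, success_probs):
    """P(count >= observed) under the null success probabilities."""
    return sum(number_correct_distribution(success_probs)[observed:])


# Hypothetical data for one item: six test takers, four with flagged RTs.
responses = [1, 1, 0, 1, 1, 0]
rt_flags = [True, True, False, True, False, True]
null_probs = [0.4, 0.3, 0.5, 0.2]  # assumed null P(correct) for the flagged four

count = flagged_correct_count(responses, rt_flags)
print(count, upper_tail_p(count, null_probs))
```

A small upper-tail probability would suggest more correct answers among the flagged test takers than their abilities predict. How the response times are flagged and how the null probabilities are obtained is left to the paper.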

Journal of Educational Measurement, 60(2), 235-254.
PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1111/jedm.12346
Citations: 1
Fully Gibbs Sampling Algorithms for Bayesian Variable Selection in Latent Regression Models
IF 1.3; Zone 4 (Psychology); Q3 (Psychology, Applied); Pub Date: 2022-10-25; DOI: 10.1111/jedm.12348
Kazuhiro Yamaguchi, Jihong Zhang

This study proposed Gibbs sampling algorithms for variable selection in a latent regression model under a unidimensional two-parameter logistic item response theory model. Three types of shrinkage priors were employed to obtain shrinkage estimates: double-exponential (i.e., Laplace), horseshoe, and horseshoe+ priors. These shrinkage priors were compared to a uniform prior case in both simulation and real data analysis. The simulation study revealed that two types of horseshoe priors had smaller root mean square errors and shorter 95% credible interval lengths than double-exponential or uniform priors. In addition, the horseshoe+ prior was slightly more stable than the horseshoe prior. The real data example successfully proved the utility of horseshoe and horseshoe+ priors in selecting effective predictive covariates for math achievement.

Journal of Educational Measurement, 60(2), 202-234.
Citations: 0
A Factor Mixture Model for Item Responses and Certainty of Response Indices to Identify Student Knowledge Profiles
IF 1.3; Zone 4 (Psychology); Q3 (Psychology, Applied); Pub Date: 2022-10-10; DOI: 10.1111/jedm.12344
Chia-Wen Chen, Björn Andersson, Jinxin Zhu

The certainty of response index (CRI) measures respondents' confidence level when answering an item. In conjunction with the answers to the items, previous studies have used descriptive statistics and arbitrary thresholds to identify student knowledge profiles with the CRIs. Whereas this approach overlooked the measurement error of the observed item responses and indices, we address this by proposing a factor mixture model that integrates a latent class model to detect student subgroups and a measurement model to control for student ability and confidence level. Applying the model to 773 seventh graders' responses to an algebra test, where some items were related to new material that had not been taught in class, we found two subgroups: (1) students who had high confidence in answering items involving the new material; and (2) students who had low confidence in answering items involving the new material but higher general self-confidence than the first group. We regressed the posterior probability of the group membership on gender, prior achievement, and preview behavior and found preview behavior a significant factor associated with the membership. Finally, we discussed the implications of the current study for teaching practices and future research.

Journal of Educational Measurement, 60(1), 28-51.
PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1111/jedm.12344
Citations: 1
Betty Lanteigne, Christine Coombe, & James Dean Brown. 2021. Challenges in Language Testing around the World: Insights for language test users. Singapore: Springer, 2021, 129.99 € (hardcover), ISBN 978-981-33-4232-3 (eBook). xxiii + 553 pp. https://doi.org/10.1007/978-981-33-4232-3
IF 1.3; Zone 4 (Psychology); Q3 (Psychology, Applied); Pub Date: 2022-09-25; DOI: 10.1111/jedm.12343
Bahram Kazemian, Shafigeh Mohammadian
Journal of Educational Measurement, 59(4), 536-544.
Citations: 0