Pub Date: 2024-09-01; DOI: 10.1007/s11336-024-09999-w
Chun Wang
Title: Correction: A Diagnostic Facet Status Model (DFSM) for Extracting Instructionally Useful Information from Diagnostic Assessment.
Journal: Psychometrika
Pub Date: 2024-09-01; Epub Date: 2024-07-04; DOI: 10.1007/s11336-024-09980-7
Paul De Boeck, Michael L DeKay, Jolynn Pek
Wu and Browne (Psychometrika 80(3):571-600, 2015. https://doi.org/10.1007/s11336-015-9451-3; henceforth W&B) introduced the notion of adventitious error to explicitly take into account approximate goodness of fit of covariance structure models (CSMs). Adventitious error supposes that observed covariance matrices are not directly sampled from a theoretical population covariance matrix but from an operational population covariance matrix. This operational matrix is randomly distorted from the theoretical matrix due to differences in study implementations. W&B showed how adventitious error is linked to the root mean square error of approximation (RMSEA) and how the standard errors (SEs) of parameter estimates are augmented. Our contribution is to consider adventitious error as a general phenomenon and to illustrate its consequences. Using simulations, we illustrate that its impact on SEs can be generalized to pairwise relations between variables beyond the CSM context. Using derivations, we conjecture that heterogeneity of effect sizes across studies and overestimation of statistical power can both be interpreted as stemming from adventitious error. We also show that adventitious error, if it occurs, has an impact on the uncertainty of composite measurement outcomes such as factor scores and summed scores. The results of a simulation study show that the impact on measurement uncertainty is rather small although larger for factor scores than for summed scores. Adventitious error is an assumption about the data generating mechanism; the notion offers a statistical framework for understanding a broad range of phenomena, including approximate fit, varying research findings, heterogeneity of effects, and overestimates of power.
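The effect on standard errors described in this abstract can be illustrated with a small simulation. The sketch below is not the authors' procedure: it assumes a single pairwise correlation and distorts the theoretical correlation with additive normal noise per study (W&B work with whole covariance matrices), and the function name `simulate_r` and the parameter `tau` are hypothetical.

```python
import numpy as np

def simulate_r(rho, n, n_studies, tau, rng):
    """Sample correlations from n_studies studies of size n.

    tau is the SD of the study-level distortion of the theoretical
    correlation rho (an illustrative assumption); tau = 0 means no
    adventitious error.
    """
    rs = np.empty(n_studies)
    for j in range(n_studies):
        # Operational correlation for study j: theoretical value plus distortion.
        rho_j = np.clip(rho + tau * rng.standard_normal(), -0.99, 0.99)
        cov = np.array([[1.0, rho_j], [rho_j, 1.0]])
        x = rng.multivariate_normal(np.zeros(2), cov, size=n)
        rs[j] = np.corrcoef(x, rowvar=False)[0, 1]
    return rs

rng = np.random.default_rng(1)
rho, n = 0.3, 200
sd_plain = simulate_r(rho, n, 2000, 0.0, rng).std(ddof=1)  # sampling error only
sd_adv = simulate_r(rho, n, 2000, 0.1, rng).std(ddof=1)    # plus adventitious error
nominal_se = (1 - rho**2) / np.sqrt(n - 1)                 # large-sample SE of r
print(sd_plain, sd_adv, nominal_se)
```

Under these assumptions the spread of the sample correlations without distortion stays close to the nominal SE, while the spread with distortion is visibly larger, which is the sense in which a nominal SE understates uncertainty.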
Title: Adventitious Error and Its Implications for Testing Relations Between Variables and for Composite Measurement Outcomes.
Journal: Psychometrika
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11458726/pdf/
Pub Date: 2024-09-01; DOI: 10.1007/s11336-024-09974-5
Chen-Wei Liu, Björn Andersson, Anders Skrondal
Title: Erratum: A Constrained Metropolis-Hastings Robbins-Monro Algorithm for Q Matrix Estimation in DINA Models.
Journal: Psychometrika
Pub Date: 2024-09-01; Epub Date: 2024-06-22; DOI: 10.1007/s11336-024-09979-0
Garritt L Page, Ernesto San Martín, David Torres Irribarra, Sébastien Van Bellegem
We aim to estimate school value-added dynamically in time. Our principal motivation for doing so is to establish school effectiveness persistence while taking into account the temporal dependence that typically exists in school performance from one year to the next. We propose two methods of incorporating temporal dependence in value-added models. In the first, we model the random school effects that are commonly present in value-added models with an autoregressive process. In the second, we incorporate dependence in value-added estimators by modeling the performance of one cohort based on the previous cohort's performance. An identification analysis allows us to make explicit the meaning of the corresponding value-added indicators: based on these meanings, we show that each model is useful for monitoring specific aspects of school persistence. Furthermore, we carefully detail how value-added can be estimated over time. We show through simulations that ignoring temporal dependence when it exists results in diminished efficiency in value-added estimation, while incorporating it results in improved estimation (even when temporal dependence is weak). Finally, we illustrate the methodology by considering two cohorts from Chile's national standardized test in mathematics.
Title: Temporally Dynamic, Cohort-Varying Value-Added Models.
Journal: Psychometrika
Pub Date: 2024-09-01; Epub Date: 2024-03-01; DOI: 10.1007/s11336-024-09955-8
Chengyu Cui, Chun Wang, Gongjun Xu
Multidimensional item response theory (MIRT) models have generated increasing interest in the psychometrics literature. Efficient approaches for estimating MIRT models with dichotomous responses have been developed, but constructing an equally efficient and robust algorithm for polytomous models has received limited attention. To address this gap, this paper presents a novel Gaussian variational estimation algorithm for the multidimensional generalized partial credit model. The proposed algorithm demonstrates both fast and accurate performance, as illustrated through a series of simulation studies and two real data analyses.
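For readers unfamiliar with the model being estimated, the category probabilities of a multidimensional generalized partial credit item can be sketched as follows. The abstract does not state the parameterization, so the slope-and-step form used here, along with the names `gpcm_probs`, `a`, and `b`, is an assumption for illustration; the variational algorithm itself is not reproduced.

```python
import numpy as np

def gpcm_probs(theta, a, b):
    """Category probabilities for one item with categories 0..K.

    theta: latent traits (length D); a: item slopes (length D);
    b: step parameters (length K). Assumed parameterization:
    P(Y = k) is proportional to exp(sum_{v=1..k} (a . theta - b_v)),
    with an empty sum (value 0) for k = 0.
    """
    theta = np.asarray(theta, dtype=float)
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    steps = a @ theta - b                          # one term per step 1..K
    z = np.concatenate(([0.0], np.cumsum(steps)))  # cumulative sums, category 0 first
    z -= z.max()                                   # stabilize the exponentials
    p = np.exp(z)
    return p / p.sum()

probs = gpcm_probs(theta=[0.5, -0.2], a=[1.2, 0.8], b=[-0.5, 0.0, 0.7])
print(probs)
```

With all traits and step parameters at zero, every category is equally likely, which is a quick sanity check on the cumulative-sum construction.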
Title: Variational Estimation for Multidimensional Generalized Partial Credit Model.
Journal: Psychometrika
Pub Date: 2024-08-30; DOI: 10.1007/s11336-024-10000-x
Khadiga H A Sayed, Maarten J L F Cruyff, Peter G M van der Heijden
Randomized response is an interview technique for sensitive questions designed to eliminate evasive response bias. Since this elimination is only partially successful, two models have been proposed for modeling evasive response bias: the cheater detection model for a design with two sub-samples with different randomization probabilities and the self-protective no sayers model for a design with multiple sensitive questions. This paper shows the correspondence between these models, and introduces models for the new, hybrid "ever/last year" design that account for self-protective no saying and cheating. The model for one set of ever/last year questions has a degree of freedom that can be used for the inclusion of a response bias parameter. Models with multiple degrees of freedom are introduced for extensions of the design with a third randomized response question and a second set of ever/last year questions. The models are illustrated with two surveys on doping use. We conclude with a discussion of the pros and cons of the ever/last year design and its potential for future research.
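As background to the models compared in this abstract, the basic randomized response estimator without any evasive-response correction can be sketched as below. The forced-response design and the names `estimate_prevalence`, `p_truth`, and `p_forced_yes` are assumptions chosen for illustration; the abstract does not specify the randomization scheme, and the cheater detection and self-protective no-saying models add parameters beyond this sketch.

```python
def estimate_prevalence(prop_yes, p_truth, p_forced_yes):
    """Moment estimator of the sensitive-attribute prevalence.

    Assumed forced-response design: with probability p_truth the
    respondent answers truthfully, with probability p_forced_yes the
    randomizer forces a "yes", and otherwise it forces a "no". Then
    P(yes) = p_truth * prevalence + p_forced_yes, which is solved for
    the prevalence.
    """
    if not 0.0 < p_truth <= 1.0:
        raise ValueError("p_truth must be in (0, 1]")
    return (prop_yes - p_forced_yes) / p_truth

# Hypothetical numbers: 35% observed "yes", 75% truthful, 12.5% forced "yes".
pi_hat = estimate_prevalence(prop_yes=0.35, p_truth=0.75, p_forced_yes=0.125)
print(pi_hat)
```

If some respondents answer "no" regardless of the randomizer outcome, the observed proportion of "yes" drops and this estimator is biased downward, which is the problem the two evasive-response models address.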
Title: Modeling Evasive Response Bias in Randomized Response: Cheater Detection Versus Self-protective No-Saying.
Journal: Psychometrika
Pub Date: 2024-08-17; DOI: 10.1007/s11336-024-09997-y
Zhongtian Lin, Tao Jiang, Frank Rijmen, Paul Van Wamelen
A well-known person fit statistic in the item response theory (IRT) literature is the $l_z$ statistic (Drasgow et al. in Br J Math Stat Psychol 38(1):67-86, 1985). Snijders (Psychometrika 66(3):331-342, 2001) derived $l_z^*$, which is the asymptotically correct version of $l_z$ when the ability parameter is estimated. However, both statistics and other extensions later developed concern either only the unidimensional IRT models or multidimensional models that require a joint estimate of latent traits across all the dimensions. Considering a marginalized maximum likelihood ability estimator, this paper proposes $l_{zt}$ and $l_{zt}^*$, which are extensions of $l_z$ and $l_z^*$, respectively, for the Rasch testlet model. The computation of $l_{zt}^*$ relies on several extensions of the Lord-Wingersky algorithm (1984) that are additional contributions of this paper. Simulation results show that $l_{zt}^*$ has close-to-nominal Type I error rates and satisfactory power for detecting aberrant responses. For unidimensional models, $l_{zt}$ and $l_{zt}^*$ reduce to $l_z$ and $l_z^*$, respectively, and therefore allow for the evaluation of person fit with a wider range of IRT models. A real data application is presented to show the utility of the proposed statistics for a test with an underlying structure that consists of both the traditional unidimensional component and the Rasch testlet component.
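The baseline statistic that this paper extends, the standardized log-likelihood $l_z$ of Drasgow et al., can be sketched for the plain unidimensional Rasch model as below. This is only the classical statistic evaluated at a given ability value; it does not include the Snijders correction or the testlet extensions proposed in the paper, and the function name `lz_rasch` and the example values are hypothetical.

```python
import numpy as np

def lz_rasch(u, theta, b):
    """Standardized log-likelihood person fit statistic for the Rasch model.

    u: 0/1 responses; theta: ability value treated as known;
    b: item difficulties. Large negative values indicate misfit.
    """
    u = np.asarray(u, dtype=float)
    b = np.asarray(b, dtype=float)
    logit = theta - b
    p = 1.0 / (1.0 + np.exp(-logit))
    l0 = np.sum(u * np.log(p) + (1 - u) * np.log(1 - p))          # observed log-likelihood
    expected = np.sum(p * np.log(p) + (1 - p) * np.log(1 - p))    # its expectation
    variance = np.sum(p * (1 - p) * logit**2)                     # its variance
    return (l0 - expected) / np.sqrt(variance)

b = [-2.0, -1.0, 0.0, 1.0, 2.0]
lz_consistent = lz_rasch([1, 1, 1, 0, 0], theta=0.0, b=b)  # easy items right, hard items wrong
lz_aberrant = lz_rasch([0, 0, 1, 1, 1], theta=0.0, b=b)    # easy items wrong, hard items right
print(lz_consistent, lz_aberrant)
```

The pattern that misses the easy items but solves the hard ones receives a negative value, while the pattern consistent with the item ordering receives a positive one.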
Title: Asymptotically Correct Person Fit z-Statistics For the Rasch Testlet Model.
Journal: Psychometrika
Pub Date: 2024-08-14; DOI: 10.1007/s11336-024-09987-0
Michael C Edwards
Title: Book Review: Subscores: A Practical Guide to Their Production and Consumption by Shelby Haberman, Sandip Sinharay, Richard A. Feinberg, & Howard Wainer.
Journal: Psychometrika
Pub Date: 2024-08-10; DOI: 10.1007/s11336-024-09998-x
Na Shan, Ping-Feng Xu
In multidimensional tests, the identification of the latent traits measured by each item is crucial. In addition to the item-trait relationship, differential item functioning (DIF) is routinely evaluated to ensure valid comparison among different groups. The two problems are investigated separately in the literature. This paper uses a unified framework for detecting the item-trait relationship and DIF in multidimensional item response theory (MIRT) models. By incorporating DIF effects in MIRT models, these problems can be considered as variable selection for latent/observed variables and their interactions. A Bayesian adaptive Lasso procedure is developed for variable selection, in which the item-trait relationship and DIF effects can be obtained simultaneously. Simulation studies show the performance of our method for parameter estimation, the recovery of the item-trait relationship, and the detection of DIF effects. An application is presented using data from the Eysenck Personality Questionnaire.
Title: Bayesian Adaptive Lasso for Detecting Item-Trait Relationship and Differential Item Functioning in Multidimensional Item Response Theory Models.
Journal: Psychometrika
Pub Date: 2024-07-21; DOI: 10.1007/s11336-024-09982-5
Jules L Ellis, Klaas Sijtsma, Kristel de Groot, Patrick J F Groenen
In psychophysiology, an interesting question is how to estimate the reliability of event-related potentials collected by means of the Eriksen Flanker Task or similar tests. A special problem presents itself if the data represent neurological reactions that are associated with some responses (in the case of the Flanker Task, responding incorrectly on a trial) but not others (such as providing a correct response), inherently resulting in unequal numbers of observations per subject. The general trend in reliability research here is to use generalizability theory and Bayesian estimation. We show that a new approach based on classical test theory and frequentist estimation can do the job as well and in a simpler way, and even provides additional insight into matters that were unsolved in the generalizability approach. One of our contributions is the definition of a single, overall reliability coefficient for an entire group of subjects with unequal numbers of observations. The two methods have slightly different objectives. We argue in favor of the classical approach but without rejecting the generalizability approach.
Title: Reliability Theory for Measurements with Variable Test Length, Illustrated with ERN and Pe Collected in the Flanker Task.
Journal: Psychometrika